Implementation of paper "Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection"

mever-team/rine

Paper

This repository contains the implementation code for the paper accepted at ECCV 2024:

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection (available at arXiv:2402.19091)

Christos Koutlis, Symeon Papadopoulos

Figure 1. The RINE architecture. A batch of $b$ images is processed by CLIP's image encoder. The concatenation of the $n$ $d$-dimensional CLS tokens (one from each Transformer block) is first projected and then multiplied with the blocks' scores, estimated by the Trainable Importance Estimator (TIE) module. Summation across the second dimension results in one feature vector per image. Finally, after the second projection and the subsequent classification head modules, two loss functions are computed. Binary cross-entropy $\mathfrak{L}_{CE}$ directly optimizes SID, while the contrastive loss $\mathfrak{L}_{Cont.}$ assists the training by forming a dense feature vector cluster per class.
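The aggregation step in the caption (scale each block's projected CLS token by its score, then sum over the blocks) can be illustrated with a small pure-Python sketch for a single image. This is not the repository's implementation: the function names, the toy sizes, and in particular the assumption that scores are softmax-normalized across blocks separately for each feature are illustrative choices that the caption does not confirm.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_blocks(tokens, raw_scores):
    """Weighted sum over blocks for one image (hypothetical helper).

    tokens:     n lists of length d_proj, one projected CLS token per block.
    raw_scores: n lists of length d_proj, stand-ins for the TIE parameters.
    Assumption: scores are softmax-normalized across the n blocks,
    independently for each feature.
    """
    n, d_proj = len(tokens), len(tokens[0])
    feature = [0.0] * d_proj
    for k in range(d_proj):
        weights = softmax([raw_scores[l][k] for l in range(n)])
        feature[k] = sum(weights[l] * tokens[l][k] for l in range(n))
    return feature

# Toy example: n = 3 blocks, d_proj = 2 features.
tokens = [[1.0, 0.0], [2.0, 4.0], [3.0, 8.0]]
uniform = [[0.0, 0.0]] * 3  # equal scores reduce to a plain average over blocks
print(aggregate_blocks(tokens, uniform))
```

With equal scores every block contributes equally; a much larger score for one block makes the output approach that block's token.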

News

🎉 4/7/2024 Paper acceptance at ECCV 2024

29/2/2024 Pre-print release --> arXiv:2402.19091

💥 29/2/2024 Code and checkpoints release

Setup

Clone the repository:

git clone https://github.com/mever-team/rine

Create the environment:

conda create -n rine python=3.9
conda activate rine
conda install pytorch==2.1.1 torchvision==0.16.1 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
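After creating the environment, a quick check that the pinned versions were actually installed can save a failed run later. The sketch below uses only the standard library; the helper name is hypothetical, and it assumes the PyTorch package registers its metadata under the distribution name "torch".

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from the conda command above. "torch" is assumed to be the
# distribution name under which the pytorch package registers itself.
EXPECTED = {"torch": "2.1.1", "torchvision": "0.16.1"}

def version_mismatches(expected=EXPECTED):
    """Map package name -> (wanted, found) for every pin that is not met."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        # Builds may report e.g. "2.1.1+cu118", so compare the public part only.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = (wanted, found)
    return problems

print(version_mismatches() or "environment matches the pinned versions")
```

An empty result means both pins are satisfied; otherwise each entry shows the wanted version next to what was found (None if the package is missing).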

Store the datasets in data/:

The data/ directory should look like:

data
├── coco
├── latent_diffusion_trainingset
├── RAISEpng
├── synthbuster
├── train
│   ├── airplane
│   ├── bicycle
│   └── .
├── val
│   ├── airplane
│   ├── bicycle
│   └── .
└── test
    ├── progan
    ├── cyclegan
    ├── biggan
    ├── .
    └── diffusion_datasets
        ├── guided
        ├── ldm_200
        └── .
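Before launching the scripts, it may help to confirm that the top-level folders from the layout above are in place. The following is a minimal sketch using only the standard library; the helper name missing_dirs is hypothetical and not part of the repository, and it checks only the first level of data/.

```python
from pathlib import Path

# Top-level folders listed in the layout above.
EXPECTED_DIRS = [
    "coco",
    "latent_diffusion_trainingset",
    "RAISEpng",
    "synthbuster",
    "train",
    "val",
    "test",
]

def missing_dirs(root="data"):
    """Return the expected sub-folders that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]

if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing under data/:", ", ".join(missing))
    else:
        print("All expected top-level folders are present.")
```

Run it from the repository root so that the relative path data/ resolves correctly.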

Evaluation

To evaluate the 1-class, 2-class, and 4-class checkpoints, as well as the LDM-trained model, all provided in ckpt/, run python scripts/validation.py. The results are printed to the terminal.

To reproduce all reported results of the paper (figures and tables), run python scripts/results.py.

Re-run experiments

To reproduce the conducted experiments, re-run in the following order:

  1. the 1-epoch hyperparameter grid experiments with python scripts/experiments.py
  2. the ablation study with python scripts/ablations.py
  3. the training duration experiments with python scripts/epochs.py
  4. the training set size experiments with python scripts/dataset_size.py
  5. the perturbation experiments with python scripts/perturbations.py
  6. the LDM training experiments with python scripts/diffusion.py

Finally, to save the best 1-class, 2-class, and 4-class models (already stored in ckpt/), run python scripts/best.py, which re-trains the best configurations and stores the corresponding trainable model parts.

With this code snippet the whole project can be reproduced:

import subprocess
import sys

# Scripts in the order listed above; validation and results run last.
SCRIPTS = [
    "experiments.py",
    "ablations.py",
    "epochs.py",
    "dataset_size.py",
    "perturbations.py",
    "diffusion.py",
    "best.py",
    "validation.py",
    "results.py",
]

for script in SCRIPTS:
    # sys.executable keeps the active environment's interpreter;
    # check=True stops the pipeline as soon as one stage fails.
    subprocess.run([sys.executable, f"scripts/{script}"], check=True)

Demo

In demo/, we also provide code for inference on one real image and one fake image generated by the DALL-E model. To run the demo, execute python demo/demo.py.

Citation

@InProceedings{10.1007/978-3-031-73220-1_23,
author="Koutlis, Christos
and Papadopoulos, Symeon",
editor="Leonardis, Ale{\v{s}}
and Ricci, Elisa
and Roth, Stefan
and Russakovsky, Olga
and Sattler, Torsten
and Varol, G{\"u}l",
title="Leveraging Representations from Intermediate Encoder-Blocks for Synthetic Image Detection",
booktitle="Computer Vision -- ECCV 2024",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="394--411",
isbn="978-3-031-73220-1"
}

Contact

Christos Koutlis ([email protected])
