Config examples

Example configurations are available in the example-configs directory.

Note

This project requires a dataset that is private and contains sensitive data. To request access, contact the author. Access requires permission, and each request is considered on an individual basis.

Each configuration can be run using the following command:

python src/main.py -f config.yml

You can also pass the optional --verbose flag to enable more detailed logging:

python src/main.py -f config.yml --verbose
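
To run several configurations one after another, the command above can be wrapped in a small script. The following is a minimal sketch and not part of the project: the helper names build_command and run_all are hypothetical, and it assumes that the files in example-configs use the .yml extension and that the script is started from the repository root.

```python
import subprocess
import sys
from pathlib import Path


def build_command(config_path, verbose=False):
    """Build the argument list for one run of src/main.py."""
    command = [sys.executable, "src/main.py", "-f", str(config_path)]
    if verbose:
        command.append("--verbose")
    return command


def run_all(config_dir="example-configs", verbose=False):
    """Run every .yml file in config_dir, stopping at the first failure."""
    for config_path in sorted(Path(config_dir).glob("*.yml")):
        # check=True raises CalledProcessError if a run exits with a non-zero status.
        subprocess.run(build_command(config_path, verbose), check=True)
```

Calling run_all(verbose=True) would then run each configuration in alphabetical order with detailed logging enabled.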

Training generator model

To train a generator model, you can use the following configuration file train-generator.yml:

general:
    total_steps: 100_000
    image_size: [256,512]
    models:
        unet:
            dim: 64
            dim_mults: [1, 2, 4, 8]
            channels: 1

        diffusion:
            timesteps: 1000
        classifier:
            architecture: resnet34
            pretrained: False

data:
    validation_split: 0.1 / 0.9
    csv_file_path: <dataset-csv-file-path>
    split_seed: 42

experiment:
    loop:
        - name: train_generator
          diagnosis: reference
          batch_size: 32
          lr: 1e-4
          save_and_sample_every: 2000
          num_steps: 100_000
          gradient_accumulate_every: 4

output:
    results_dir: .results
    copy_results_to: <path-to-external-storage>
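
A few quantities follow from the loop settings in this configuration. The sketch below is illustrative only and rests on two assumptions that the configuration itself does not confirm: that gradient_accumulate_every is the number of batches whose gradients are accumulated before each optimizer step, and that a checkpoint and a set of samples are written every save_and_sample_every steps. The function name training_summary is hypothetical.

```python
def training_summary(batch_size, gradient_accumulate_every, num_steps, save_and_sample_every):
    """Derive summary numbers from the train_generator loop settings."""
    # Assumption: one optimizer step consumes `gradient_accumulate_every` batches.
    effective_batch_size = batch_size * gradient_accumulate_every
    # Assumption: one checkpoint is written every `save_and_sample_every` steps.
    checkpoints = num_steps // save_and_sample_every
    return {
        "effective_batch_size": effective_batch_size,
        "checkpoints": checkpoints,
    }


summary = training_summary(
    batch_size=32,
    gradient_accumulate_every=4,
    num_steps=100_000,
    save_and_sample_every=2000,
)
print(summary)
```

Under these assumptions, the example values give an effective batch size of 128 and 50 checkpoints over the full run.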

Generating images

To generate images using a trained generator model, you can use the following configuration file generate.yml:

general:
    image_size: [256,512]
    models:
        unet:
            dim: 64
            dim_mults: [1, 2, 4, 8]
            channels: 1

        diffusion:
            timesteps: 1000

data:
    validation_split: 0.1 / 0.9
    csv_file_path: <dataset-csv-file-path>
    split_seed: 42

experiment:
    loop:
        - name: generate_samples
          repeat: false
          relative_dataset_results_dir: <relative-dataset-results-directory>
          num_samples: 1000
          batch_size: 8
          base_on: reference
          model_version: latest
          wandb: false

output:
    results_dir: .results
    copy_results_to: <path-to-external-storage>
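
With num_samples: 1000 and batch_size: 8, the sample count divides evenly into 125 batches. How the project handles a sample count that is not a multiple of the batch size is not documented here. The sketch below shows one common approach, a smaller final batch, purely as an illustration. The function name batch_sizes is hypothetical.

```python
def batch_sizes(num_samples, batch_size):
    """Split num_samples into full batches plus one smaller final batch if needed."""
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        sizes.append(remainder)
    return sizes


sizes = batch_sizes(1000, 8)
print(len(sizes), sum(sizes))
```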

Validate with classification model

Note

The validation dataloader expects the real dataset to be split into train, val, and test directories, each containing one subdirectory per class. The structure should match the following:

ophthal_anonym
├── test
│   ├── benign
│   ├── fluid
│   ├── precancerous
│   └── reference
├── train
│   ├── benign
│   ├── fluid
│   ├── precancerous
│   └── reference
└── val
    ├── benign
    ├── fluid
    ├── precancerous
    └── reference
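
Before starting a run, it can help to confirm that the dataset directory matches this layout. The following is a minimal sketch and not part of the project: the function name missing_directories is hypothetical, and the split and class names are copied from the tree above.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
CLASSES = ("benign", "fluid", "precancerous", "reference")


def missing_directories(dataset_root):
    """Return the expected split/class directories that do not exist."""
    root = Path(dataset_root)
    return [
        root / split / class_name
        for split in SPLITS
        for class_name in CLASSES
        if not (root / split / class_name).is_dir()
    ]
```

For a complete dataset, missing_directories("ophthal_anonym") returns an empty list. Otherwise it lists every directory that still needs to be created.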

To validate the generated images using a trained classification model, you can use the following configuration file classification.yml:

general:
    image_size: [256,512]
    models:
        classifier:
            architecture: resnet34
            pretrained: False

data:
    validation_split: 0.1 / 0.9
    csv_file_path: <dataset-csv-file-path>
    split_seed: 42

experiment:
    loop:
        - name: validate
          classification:
              loss_fn: cross_entropy
              epochs: 15
              lr: 1e-4
              batch_size: 32
              train_data_type: real
              train_dataset_dir: <train-dataset-directory>
              val_dataset_dir: <validation-dataset-directory>
              test_dataset_dir: <test-dataset-directory>
              logger_experiment_name: <experiment-name>
              logger_tags:
                  - real
                  - resnet34

output:
    results_dir: .results