```shell
conda env create -f environment.yml
conda activate cs231n
pip install -r requirements.txt
pip install kaggle
```
Follow the instructions here to create an API token: https://github.com/Kaggle/kaggle-api#api-credentials
```shell
kaggle competitions download -c uw-madison-gi-tract-image-segmentation
unzip uw-madison-gi-tract-image-segmentation.zip
```
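If your shell lacks an extraction tool, the archive can also be unpacked from Python with the standard-library `zipfile` module. A small helper sketch (the function name is ours, not part of the repo):

```python
import zipfile


def extract_archive(archive_path, dest="."):
    """Extract a downloaded .zip archive into dest (hypothetical helper)."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
```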
[Optional] Rename the dataset folder to something more intuitive:

```python
import os

os.rename("train", "datasets")
```
```shell
python main.py train --<hyperparameter> value
python main.py test --checkpoint_path <path to checkpoint>
```
This repo is designed to speed up research iteration in the early stages of a project.
Some design principles we followed:
- Centralize the configuration logic
- Include only the necessary kick-starter pieces
- Abstract only the components and structure common across projects
- Expose 100% of the data loading, model architecture, and forward/backward logic in plain PyTorch
- Stay prepared for rapid, drastic changes
`main.py` defines all the experiment-level configuration (e.g., which model/optimizer to use, how to decay the learning rate, and when and where to save the model). We use Fire to automatically generate a CLI for functions like `train(...)` and `test(...)`. For most hyperparameter-search experiments, modifying `main.py` should be enough.
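The pattern looks roughly like this (a minimal sketch; the hyperparameter names and defaults are illustrative, not the repo's actual ones). Each keyword argument of `train(...)` becomes a CLI flag, so a hyperparameter sweep is just a change of flags:

```python
def train(lr=3e-4, batch_size=16, max_epochs=10, checkpoint_dir="checkpoints"):
    """Each keyword argument is exposed by Fire as a --flag on the CLI."""
    config = dict(lr=lr, batch_size=batch_size, max_epochs=max_epochs,
                  checkpoint_dir=checkpoint_dir)
    # ... build the model/dataloaders and run the training loop here ...
    return config


# In main.py, the entry point would be (requires `pip install fire`):
#   import fire
#   fire.Fire()  # `python main.py train --lr 1e-3` calls train(lr=1e-3)
```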
To modify the training loop logic further (for GANs, meta-learning, etc.), update the `train(...)` and `test(...)` functions. You can try all your crazy research ideas there!
`data/dataset.py` provides a basic example, but you will probably want to define your own dataset with on-the-fly transforms and augmentations. You can do so by implementing your dataset class and transform functions in the `data` module and using them in `train/valid/test_dataloader()` in `lightning/model.py`. If you have many datasets, you might also want to implement a `get_dataset(args)` helper to fetch the correct one.
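One common shape for such a helper is a small name-to-class registry. The sketch below uses stub classes and explicit arguments rather than a single `args` object; in the real repo the entries would subclass `torch.utils.data.Dataset`:

```python
class GITractDataset:
    """Stub standing in for a real torch.utils.data.Dataset (hypothetical)."""
    def __init__(self, root):
        self.root = root


class SyntheticDataset:
    """Another stub, e.g. for quick smoke tests (hypothetical)."""
    def __init__(self, root):
        self.root = root


# Registry mapping a dataset name to its constructor.
DATASETS = {
    "gi_tract": GITractDataset,
    "synthetic": SyntheticDataset,
}


def get_dataset(name, root="datasets"):
    """Look up `name` in the registry and instantiate the dataset."""
    try:
        return DATASETS[name](root)
    except KeyError:
        raise ValueError(f"Unknown dataset {name!r}; choices: {sorted(DATASETS)}")
```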
We include most of the established backbone models in `models/pretrained.py`, but you are welcome to implement your own, just as you would in plain PyTorch.
We suggest putting the implementations of optimizers, losses, evaluation metrics, loggers, and constants into `/util`.
Other project-specific code (such as pre-processing and data visualization) can go in `/custom`.
Maintainers: @Hao