# Source code

## Directory structure
- `./constants/`: Contains constants used in the code, such as paths to datasets, glossaries, and name mappings.
- `./evaluation/`: Contains code for evaluation, including generation of tables and plots, and computation of statistical tests.
- `./experiments/`: Contains experiments for each method:
  - `./experiments/rf.py`: Random Forest
  - `./experiments/bilstm.py`: BiLSTM
  - `./experiments/bert.py`: ClinicalBERT architectures
- `./extensions/`: Contains extensions to the Baal and Transformers Python libraries.
- `./features/`: Contains implementations of input features for the RF and BiLSTM methods, and input representations for ClinicalBERT and Paired ClinicalBERT.
- `./ml_models/`: Contains implementations of the machine learning (ML) methods.
- `./models/`: Contains data models for preprocessing and feature generation.
- `./preprocessing/`: Contains preprocessing scripts.
- `./re_datasets/`: Dataset factory for the BiLSTM and BERT models, which creates a Hugging Face (HF) Dataset.
- `./scripts/`: Bash scripts for running experiments on the GPU cluster.
- `./training/`: Contains trainers for each ML method and common training resources.
- `./config/`: Configuration for logging results to Neptune.ai.
- `./nlp_pipeline.py`: spaCy NLP pipeline.
- `./utils.py`: Helper functions used throughout the code.
- `./vocabulary.py`: Vocabulary module, representing the mapping between tokens and indices.

Most of the subdirectories contain a more detailed `README` file.
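
To illustrate the kind of token-to-index mapping that `./vocabulary.py` is described as providing, here is a minimal sketch. It is not the module's actual code: the class name `Vocabulary`, the methods `build`, `encode`, and `decode`, and the special tokens `<pad>` and `<unk>` are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


class Vocabulary:
    """Hypothetical token <-> index mapping (not the repository's class)."""

    def __init__(self, pad_token: str = "<pad>", unk_token: str = "<unk>") -> None:
        # Reserve index 0 for padding and index 1 for unknown tokens (assumed convention).
        self.unk_token = unk_token
        self.index_to_token: List[str] = [pad_token, unk_token]
        self.token_to_index: Dict[str, int] = {pad_token: 0, unk_token: 1}

    def build(self, sentences: Iterable[Iterable[str]], min_count: int = 1) -> None:
        """Add every token that occurs at least `min_count` times, most frequent first."""
        counts = Counter(token for sentence in sentences for token in sentence)
        for token, count in counts.most_common():
            if count >= min_count and token not in self.token_to_index:
                self.token_to_index[token] = len(self.index_to_token)
                self.index_to_token.append(token)

    def encode(self, tokens: Iterable[str]) -> List[int]:
        """Map tokens to indices, falling back to the unknown-token index."""
        unk_index = self.token_to_index[self.unk_token]
        return [self.token_to_index.get(token, unk_index) for token in tokens]

    def decode(self, indices: Iterable[int]) -> List[str]:
        """Map indices back to tokens."""
        return [self.index_to_token[index] for index in indices]


# Toy example sentences, invented for this sketch.
vocab = Vocabulary()
vocab.build([["aspirin", "treats", "pain"], ["aspirin", "causes", "nausea"]])
ids = vocab.encode(["aspirin", "treats", "fever"])
tokens = vocab.decode(ids)
```

In this sketch, a token that was never seen during `build` (here `fever`) is encoded with the unknown-token index, so decoding returns `<unk>` in its place.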