
a/README.md b/README.md
...
...
57
57
58
The input file format is described below, or you can refer to the reference data we provide (https://github.com/BioAI-kits/AttentionMOI/tree/master/AttentionMOI/example).
58
The input file format is described below, or you can refer to the reference data we provide (https://github.com/BioAI-kits/AttentionMOI/tree/master/AttentionMOI/example).
59
59
60
f | omic_file
60
f | omic_file
61
61
62
> REQUIRED: File path for omics files (should be matrix)
62
REQUIRED: File path for omics files (should be matrix)
63
63
64
**NOTE: The file must be in csv format, such as rna.csv. Of course, it can be compressed with gz, such as rna.csv.gz.** Example: The first line is the header, containing patient_id followed by the gene (feature) names.
64
**NOTE: The file must be in csv format, such as rna.csv. Of course, it can be compressed with gz, such as rna.csv.gz.** Example: The first line is the header, containing patient_id followed by the gene (feature) names.
65
65
66
>  patient_id,A1BG,A1CF,A2BP1,A2LD1,....
66
patient_id,A1BG,A1CF,A2BP1,A2LD1,....
67
>
67
68
>  TCGA.KL.8323,3.3491,0.0,0.0,5.8939,....
68
TCGA.KL.8323,3.3491,0.0,0.0,5.8939,....
69
>
69
70
>  TCGA.KL.8324,2.922,0.5557,0.5557,6.4226,....
70
TCGA.KL.8324,2.922,0.5557,0.5557,6.4226,....
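The omics matrix layout described above can be checked before running the tool. The following is a minimal sketch using only the Python standard library; `open_text` and `parse_omic_matrix` are hypothetical helper names and are not part of AttentionMOI. It also assumes the first header column is literally `patient_id`, as in the example rows.

```python
import csv
import gzip
import io


def open_text(path):
    """Open a plain or gzip-compressed CSV file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def parse_omic_matrix(handle):
    """Parse an omics matrix: one header row, then one row per patient.

    Returns (feature_names, {patient_id: [float, ...]}).
    """
    reader = csv.reader(handle)
    header = next(reader)
    # Assumption: the first column is named exactly 'patient_id'.
    if not header or header[0] != "patient_id":
        raise ValueError("first header column must be 'patient_id'")
    features = header[1:]
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                f"row {row[0]!r} has {len(row)} fields, expected {len(header)}"
            )
        matrix[row[0]] = [float(value) for value in row[1:]]
    return features, matrix


# In-memory sample built from the example rows (elided columns omitted).
sample = io.StringIO(
    "patient_id,A1BG,A1CF,A2BP1,A2LD1\n"
    "TCGA.KL.8323,3.3491,0.0,0.0,5.8939\n"
    "TCGA.KL.8324,2.922,0.5557,0.5557,6.4226\n"
)
features, matrix = parse_omic_matrix(sample)
print(features)
print(matrix["TCGA.KL.8324"])
```

For a file on disk, something like `parse_omic_matrix(open_text("rna.csv.gz"))` would apply the same check to a compressed matrix.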
71
71
72
n | omic_name
72
n | omic_name
73
73
74
> REQUIRED: Omic names for the omics files, in the same order as the omics files
74
#### REQUIRED: Omic names for the omics files, in the same order as the omics files
75
75
76
l | label_file
76
l | label_file
77
77
78
> REQUIRED: File path for label file
78
#### REQUIRED: File path for label file
79
79
80
**NOTE: The file must be in csv format, such as label.csv. Of course, it can be compressed with gz, such as label.csv.gz.** Example: The first line is the header; patient_id and label are the sample name and the sample classification label, respectively.
80
**NOTE: The file must be in csv format, such as label.csv. Of course, it can be compressed with gz, such as label.csv.gz.** Example: The first line is the header; patient_id and label are the sample name and the sample classification label, respectively.
81
81
82
> patient_id,label
82
 patient_id,label
83
>
83
84
> TCGA.KL.8328,0
84
TCGA.KL.8328,0
85
>
85
86
> TCGA.KL.8339,0
86
TCGA.KL.8339,0
87
>
87
88
> TCGA.KM.8439,1
88
TCGA.KM.8439,1
89
>
89
90
> TCGA.KM.8441,1
90
TCGA.KM.8441,1
91
>
91
92
> TCGA.KM.8442,1
92
TCGA.KM.8442,1
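The label file layout above can be sanity-checked in a similar way. This is a sketch under stated assumptions: `parse_labels` is a hypothetical helper, not an AttentionMOI function, and labels are assumed to be integers because the example uses 0 and 1.

```python
import csv
import io
from collections import Counter


def parse_labels(handle):
    """Parse a label file with the header 'patient_id,label'.

    Returns {patient_id: int_label}.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames != ["patient_id", "label"]:
        raise ValueError("header must be exactly 'patient_id,label'")
    labels = {}
    for row in reader:
        # Assumption: labels are integer class codes, as in the example.
        labels[row["patient_id"]] = int(row["label"])
    return labels


sample = io.StringIO(
    "patient_id,label\n"
    "TCGA.KL.8328,0\n"
    "TCGA.KL.8339,0\n"
    "TCGA.KM.8439,1\n"
    "TCGA.KM.8441,1\n"
    "TCGA.KM.8442,1\n"
)
labels = parse_labels(sample)
class_counts = Counter(labels.values())
print(class_counts)
```

Printing the class counts is a quick way to spot a badly imbalanced or mislabeled file before training.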
93
93
94
94
95
**2. Output**
95
**2. Output**
96
96
97
o | outdir
97
o | outdir
98
98
99
> OPTIONAL: Setting output file path, default=./output
99
OPTIONAL: Setting output file path, default=./output
100
100
101
101
102
**3. Feature selection**
102
**3. Feature selection**
103
103
104
method
104
method
105
105
106
> OPTIONAL: Method of feature selection, choosing from ANOVA, RFE, LASSO, PCA, default is no feature selection
106
OPTIONAL: Method of feature selection, choosing from ANOVA, RFE, LASSO, PCA, default is no feature selection
107
107
108
percentile
108
percentile
109
109
110
> OPTIONAL: Percent of features to keep for ANOVA (integer between 1-100), only used when using ANOVA, default=30
110
OPTIONAL: Percent of features to keep for ANOVA (integer between 1-100), only used when using ANOVA, default=30
111
111
112
num_pc
112
num_pc
113
113
114
> OPTIONAL: Number of PCs to keep for PCA (integer), only used when using PCA, default=50
114
OPTIONAL: Number of PCs to keep for PCA (integer), only used when using PCA, default=50
115
115
116
FSD
116
FSD
117
117
118
> OPTIONAL: Whether to use FSD to mitigate noise of omics. Default is not using FSD, and set --FSD to use FSD
118
OPTIONAL: Whether to use FSD to mitigate noise of omics. Default is not using FSD, and set --FSD to use FSD
119
119
120
i | iteration
120
i | iteration
121
121
122
> OPTIONAL: The number of FSD iterations (integer), default=10
122
OPTIONAL: The number of FSD iterations (integer), default=10
123
123
124
s | seed
124
s | seed
125
125
126
> OPTIONAL: Random seed for FSD (integer), default=0
126
OPTIONAL: Random seed for FSD (integer), default=0
127
127
128
threshold
128
threshold
129
129
130
> OPTIONAL: FSD threshold to select features (float), default=0.8 (select features that are selected in 80 percent FSD iterations)
130
OPTIONAL: FSD threshold to select features (float), default=0.8 (select features that are selected in 80 percent FSD iterations)
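To illustrate how `threshold` relates to `iteration`, here is a minimal sketch of the frequency rule described above: keep a feature if it was selected in at least the given fraction of iterations. It does not reproduce how FSD itself chooses features in each iteration; `stable_features` is a hypothetical name, and treating the threshold as inclusive (`>=`) is an assumption.

```python
from collections import Counter


def stable_features(selections, threshold=0.8):
    """Keep features selected in at least `threshold` of the iterations.

    `selections` is a list with one set of selected feature names per
    iteration.
    """
    n_iterations = len(selections)
    counts = Counter(name for selected in selections for name in set(selected))
    # Assumption: the threshold is inclusive.
    return sorted(
        name for name, count in counts.items()
        if count / n_iterations >= threshold
    )


# Ten made-up iterations: A1BG is selected 9 times, A1CF 8 times, A2BP1 5 times.
selections = (
    [{"A1BG", "A1CF", "A2BP1"}] * 5
    + [{"A1BG", "A1CF"}] * 3
    + [{"A1BG"}]
    + [set()]
)
kept = stable_features(selections, threshold=0.8)
print(kept)
```

With the default of 10 iterations and a threshold of 0.8, a feature needs to be selected in at least 8 iterations to be kept under this reading.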
131
131
132
132
133
**4. Building Model**
133
**4. Building Model**
134
134
135
m | model 
135
m | model 
136
136
137
> OPTIONAL: Model names, choosing from DNN, Net (Net for AttentionMOI), RF, XGboost, svm, mogonet, moanna, default=DNN.
137
 OPTIONAL: Model names, choosing from DNN, Net (Net for AttentionMOI), RF, XGboost, svm, mogonet, moanna, default=DNN.
138
138
139
t | test_size
139
t | test_size
140
140
141
> OPTIONAL: Testing dataset proportion when split train test dataset (float), default=0.3 (30 percent data for testing)
141
OPTIONAL: Testing dataset proportion when split train test dataset (float), default=0.3 (30 percent data for testing)
142
142
143
b | batch
143
b | batch
144
144
145
> OPTIONAL: Mini-batch number for model training (integer), default=32
145
OPTIONAL: Mini-batch number for model training (integer), default=32
146
146
147
e | epoch
147
e | epoch
148
148
149
> OPTIONAL: Epoch number for model training (integer), default=300
149
 OPTIONAL: Epoch number for model training (integer), default=300
150
150
151
r | lr
151
r | lr
152
152
153
> OPTIONAL: Learning rate for model training (float), default=0.0001
153
 OPTIONAL: Learning rate for model training (float), default=0.0001
154
154
155
w | weight_decay
155
w | weight_decay
156
156
157
> OPTIONAL: weight_decay parameter for model training (float), default=0.0001
157
OPTIONAL: weight_decay parameter for model training (float), default=0.0001
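As an illustration of what a `test_size` of 0.3 means, the sketch below holds out 30 percent of the patient IDs for testing. It is a plain random split written with the standard library; whether the tool stratifies by label or splits differently is not stated here, and `split_ids` and `split_seed` are hypothetical names.

```python
import random


def split_ids(patient_ids, test_size=0.3, split_seed=0):
    """Randomly split patient IDs into (train, test) lists."""
    ids = sorted(patient_ids)        # sort first so the split is reproducible
    rng = random.Random(split_seed)  # local generator, no global state
    rng.shuffle(ids)
    n_test = round(len(ids) * test_size)
    return ids[n_test:], ids[:n_test]


# Ten made-up patient IDs for the demonstration.
patients = [f"P{i:02d}" for i in range(10)]
train_ids, test_ids = split_ids(patients, test_size=0.3)
print(len(train_ids), len(test_ids))
```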
158
158
159
---
159
---
160
160
161
### Example
161
### Example
162
162