# MoGCN
## What is it?
MoGCN is a multi-omics integration method based on graph convolutional networks (GCN).<br>
![Image text](https://github.com/Lifoof/MoGCN/blob/master/data/Figs1.png)
As shown in the figure, the inputs to the model are multi-omics expression matrices, including but not limited to genomics, transcriptomics, and proteomics data. MoGCN uses a GCN to incorporate and extend two unsupervised multi-omics integration algorithms: the autoencoder (AE), which operates on the expression matrices, and similarity network fusion (SNF), which operates on patient similarity networks. No feature extraction is required before running AE and SNF. <br>
## Requirements 
MoGCN is a Python script tool. It requires the following Python environment:<br>
Python 3.6 or above <br>
PyTorch 1.4.0 or above <br>
snfpy 0.2.2 <br>

## Usage
The workflow consists of three steps: <br>
* Use AE to reduce the dimensionality of the multi-omics data and obtain a multi-omics feature matrix <br>
* Use SNF to construct a patient similarity network <br>
* Input the multi-omics feature matrix and the patient similarity network to the GCN <br>

The sample data is in the data folder and contains the CNV, mRNA and RPPA data of BRCA. <br>
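
To make the third step more concrete, here is a minimal sketch of how a single graph convolution could combine a feature matrix with a similarity network. It assumes the widely used propagation rule `ReLU(D^-1/2 (A + I) D^-1/2 X W)`; the layer that `GCN_run.py` actually implements may differ. The function names, the toy matrices, and the random weights are hypothetical stand-ins, not MoGCN code or data.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric, non-negative adjacency matrix."""
    adj_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = adj_hat.sum(axis=1)                   # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Toy stand-ins (hypothetical): 4 patients, 3 latent features, 2 output units.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))              # plays the role of the AE feature matrix
adj = np.array([[0.0, 0.8, 0.1, 0.0],
                [0.8, 0.0, 0.2, 0.0],
                [0.1, 0.2, 0.0, 0.9],
                [0.0, 0.0, 0.9, 0.0]])          # plays the role of the SNF similarity network
weight = rng.normal(size=(3, 2))                # would be learned during training

adj_norm = normalize_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weight)
print(hidden.shape)  # (4, 2)
```

Each row of `hidden` mixes a patient's own features with those of similar patients, which is why the GCN needs both inputs.
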
20
### Command Line Tool
```bash
python AE_run.py -p1 data/fpkm_data.csv -p2 data/gistic_data.csv -p3 data/rppa_data.csv -m 0 -s 0 -d cpu
python SNF.py -p data/fpkm_data.csv data/gistic_data.csv data/rppa_data.csv -m sqeuclidean
python GCN_run.py -fd result/latent_data.csv -ad result/SNF_fused_matrix.csv -ld data/sample_classes.csv -ts data/test_sample.csv -m 1 -d gpu -p 20
```
Run each script with -h/--help to see the meaning of its parameters. <br>
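
If you prefer to run the three commands above from one Python script, a small driver along these lines may help. It is only a sketch: the helper names `build_commands` and `run_pipeline` are hypothetical, the flags and paths are copied verbatim from the commands above, and it assumes (as those commands suggest) that the first two steps write `result/latent_data.csv` and `result/SNF_fused_matrix.csv`.

```python
import subprocess
import sys


def build_commands(python=sys.executable):
    """Return the three README commands as argument lists, in run order."""
    omics = ["data/fpkm_data.csv", "data/gistic_data.csv", "data/rppa_data.csv"]
    return [
        [python, "AE_run.py", "-p1", omics[0], "-p2", omics[1], "-p3", omics[2],
         "-m", "0", "-s", "0", "-d", "cpu"],
        [python, "SNF.py", "-p", *omics, "-m", "sqeuclidean"],
        [python, "GCN_run.py", "-fd", "result/latent_data.csv",
         "-ad", "result/SNF_fused_matrix.csv", "-ld", "data/sample_classes.csv",
         "-ts", "data/test_sample.csv", "-m", "1", "-d", "gpu", "-p", "20"],
    ]


def run_pipeline(commands):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_commands()
for cmd in commands:
    print(" ".join(cmd[1:]))                    # show each step without the interpreter path
# To execute the steps, call run_pipeline(commands) from the repository root.
```

Stopping at the first failure matters here because the GCN step reads files that the earlier steps are expected to produce.
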
### Data Format
* Each omics input must be a .csv file in which rows represent samples and columns represent features (genes). In each expression matrix, the first column must contain the sample names, and the remaining columns are features. The samples must be consistent across all omics files. AE and SNF are unsupervised models and do not require sample labels.<br>
* GCN is a semi-supervised classification model and requires a sample label file (.csv format) during training. The first column of the label file is the sample name and the second column is the digitized sample label; any remaining columns are optional. <br>
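
The format rules above can be checked before running the pipeline. The sketch below uses only the standard library and is not part of MoGCN; the function names are hypothetical. It assumes each file has one header row, that "digitized" labels are integers, and that every labeled sample also appears in the omics files. None of these assumptions is stated explicitly in this README.

```python
import csv
import io


def read_rows(handle):
    """Return the data rows of a CSV file, skipping one header row (assumed)."""
    reader = csv.reader(handle)
    next(reader, None)
    return [row for row in reader if row]


def check_omics(tables):
    """Raise ValueError unless every omics table lists the same samples, without duplicates."""
    reference = None
    for name, rows in tables.items():
        ids = [row[0] for row in rows]          # first column holds the sample names
        if len(ids) != len(set(ids)):
            raise ValueError(f"{name}: duplicate sample names")
        if reference is None:
            reference = set(ids)
        elif set(ids) != reference:
            raise ValueError(f"{name}: sample names differ from the first table")
    return reference


def check_labels(rows, samples):
    """Raise ValueError unless each label row names a known sample and an integer label."""
    for row in rows:
        if row[0] not in samples:
            raise ValueError(f"unknown sample in label file: {row[0]}")
        int(row[1])                             # raises ValueError for a non-integer label


# In-memory stand-ins for real files (hypothetical data).
mrna = io.StringIO("Sample,GeneA,GeneB\nS1,1.0,2.0\nS2,0.5,1.5\n")
cnv = io.StringIO("Sample,GeneA,GeneB\nS1,0,1\nS2,-1,0\n")
labels = io.StringIO("Sample,Class\nS1,0\nS2,1\n")

samples = check_omics({"mrna": read_rows(mrna), "cnv": read_rows(cnv)})
check_labels(read_rows(labels), samples)
print(sorted(samples))  # ['S1', 'S2']
```

For real data, replace the `io.StringIO` objects with files opened via `open(path, newline="")`.
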
## Contact
For any questions please contact Dr. Xiao Li (Email: lixiaoBioinfo@163.com).
## License
MIT License
## Citation
Li X, Ma J, Leng L, Han M, Li M, He F and Zhu Y (2022) MoGCN: A Multi-Omics Integration Method Based on Graph Convolutional Network for Cancer Subtype Analysis. Front. Genet. 13:806842. doi: 10.3389/fgene.2022.806842. <br>