a/README.md b/README.md
[![Example](https://img.shields.io/badge/Usage-example-green)](https://github.com/shenwanxiang/bidd-aggmap/tree/master/paper/example)
[![Documentation Status](https://readthedocs.org/projects/bidd-aggmap/badge/?version=latest)](https://bidd-aggmap.readthedocs.io/en/latest/?badge=latest)
[![Downloads](https://pepy.tech/badge/aggmap)](https://pepy.tech/project/aggmap)
[![PyPI version](https://badge.fury.io/py/aggmap.svg)](https://badge.fury.io/py/aggmap)
[![Paper](https://img.shields.io/badge/paper-Nucleic_Acids_Research-blue)](https://academic.oup.com/nar/article/50/8/e45/6517966?login=false)

<img src="https://raw.githubusercontent.com/shenwanxiang/bidd-aggmap/master/docs/images/logo.png" align="left" height="170" width="130" >


# Jigsaw-like AggMap

## A Robust and Explainable Omics Deep Learning Tool

----

### Installation (Linux only)

Install aggmap with:
```bash
# create an aggmap env
conda create -n aggmap python=3.8
conda activate aggmap
pip install --upgrade pip
pip install aggmap==1.2.1
```

----

### Usage

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from aggmap import AggMap, AggMapNet

# Data loading
data = load_breast_cancer()
dfx = pd.DataFrame(data.data, columns=data.feature_names)
dfy = pd.get_dummies(pd.Series(data.target))

# AggMap object definition, fitting, and saving
mp = AggMap(dfx, metric='correlation')
mp.fit(cluster_channels=5, emb_method='umap', verbose=0)
mp.save('agg.mp')

# AggMap visualizations: hierarchical tree, embedding scatter, and grid
mp.plot_tree()
mp.plot_scatter(enabled_data_labels=True, radius=5)
mp.plot_grid(enabled_data_labels=True)

# Transformation of 1D vectors into 3D Fmaps; output shape is (-1, w, h, c)
X = mp.batch_transform(dfx.values, n_jobs=4, scale_method='minmax')
y = dfy.values

# AggMapNet training, validation, early stopping, and saving
clf = AggMapNet.MultiClassEstimator(epochs=50, gpuid=0)
clf.fit(X, y, X_valid=None, y_valid=None)
clf.save_model('agg.model')

# Model explanation by simply-explainer: global, local
simp_explainer = AggMapNet.simply_explainer(clf, mp)
global_simp_importance = simp_explainer.global_explain(clf.X_, clf.y_)
local_simp_importance = simp_explainer.local_explain(clf.X_[[0]], clf.y_[[0]])

# Model explanation by shapley-explainer: global, local
shap_explainer = AggMapNet.shapley_explainer(clf, mp)
global_shap_importance = shap_explainer.global_explain(clf.X_)
local_shap_importance = shap_explainer.local_explain(clf.X_[[0]])
```
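
The `fit` call in the usage example passes `X_valid=None, y_valid=None`. If a held-out set is wanted for the validation and early stopping mentioned in the comment, one way to build it is a stratified split with scikit-learn. The sketch below is an illustration only: it uses the raw WDBC feature table as a stand-in for the Fmaps so that it runs without aggmap, and the 80/20 ratio and `random_state` are arbitrary assumptions.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data                                        # stand-in for the Fmaps array
y = pd.get_dummies(pd.Series(data.target)).values    # one-hot labels, two columns

# Stratify on the integer labels so both splits keep the class ratio.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, stratify=data.target, random_state=42
)

print(X_train.shape, X_valid.shape)

# With aggmap installed, the split arrays would be passed as:
# clf.fit(X_train, y_train, X_valid=X_valid, y_valid=y_valid)
```

The same split applies unchanged to the array returned by `mp.batch_transform`, since `train_test_split` only indexes along the first (sample) axis.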

### How It Works

- AggMap flowchart of feature mapping and agglomeration into ordered (spatially correlated) multi-channel feature maps (Fmaps)

![how-it-works](https://raw.githubusercontent.com/shenwanxiang/bidd-aggmap/master/docs/images/hiw.jpg)
**a**, AggMap flowchart of feature mapping and aggregation into ordered (spatially correlated) channel-split feature maps (Fmaps). **b**, CNN-based AggMapNet architecture for Fmap learning. **c**, Proof-of-concept illustration of AggMap restructuring unordered data (randomized MNIST) into clustered channel-split Fmaps (reconstructed MNIST) for CNN-based learning and important-feature analysis. **d**, Typical biomedical applications of AggMap in restructuring omics data into channel-split Fmaps for multi-channel CNN-based diagnosis and biomarker discovery (explanation `saliency-map` of important features).
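
The feature mapping and aggregation in panel **a** can be illustrated with a conceptual sketch. This is not the library's implementation: it assumes correlation distance between features, complete-linkage clustering for the channel split, classical MDS in place of UMAP for the 2D embedding, and a linear assignment to place each feature on a unique grid cell. All variable names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist, pdist, squareform
from sklearn.datasets import load_breast_cancer

X = load_breast_cancer().data                 # (n_samples, n_features)
n_samples, n_features = X.shape
n_channels = 5

# 1) Pairwise correlation distance between features (columns of X).
dist_vec = pdist(X.T, metric="correlation")   # condensed distance vector
dist = squareform(dist_vec)                   # square (n_features, n_features)

# 2) Hierarchical clustering of features into at most n_channels groups.
tree = linkage(dist_vec, method="complete")
channel = fcluster(tree, t=n_channels, criterion="maxclust") - 1

# 3) 2D embedding of features by classical MDS (stand-in for UMAP):
#    double-center the squared distances and keep the top two eigenpairs.
J = np.eye(n_features) - np.ones((n_features, n_features)) / n_features
B = -0.5 * J @ (dist ** 2) @ J
eigval, eigvec = np.linalg.eigh(B)            # eigenvalues in ascending order
emb = eigvec[:, -2:] * np.sqrt(np.maximum(eigval[-2:], 0))

# 4) Assign every feature to its own cell of a square grid by minimizing
#    the total squared distance between embedded points and cell centers.
side = int(np.ceil(np.sqrt(n_features)))
gx, gy = np.meshgrid(np.linspace(0, 1, side), np.linspace(0, 1, side))
cells = np.column_stack([gx.ravel(), gy.ravel()])
emb01 = (emb - emb.min(0)) / (emb.max(0) - emb.min(0))
feat_idx, cell_idx = linear_sum_assignment(cdist(emb01, cells, "sqeuclidean"))

# 5) Scatter min-max scaled samples into (n_samples, side, side, n_channels).
Xs = (X - X.min(0)) / (X.max(0) - X.min(0))
rows, cols = np.divmod(cell_idx, side)
fmaps = np.zeros((n_samples, side, side, n_channels))
fmaps[:, rows, cols, channel[feat_idx]] = Xs[:, feat_idx]

print(fmaps.shape)
```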

----
### Proof of Concept: Reconstruction Ability on the MNIST Dataset

<video width="320" height="240" controls>
  <source src="https://www.shenwx.com/files/Video_MNIST.mp4" type="video/mp4">
</video>

- AggMap can reconstruct the original images from completely randomly permuted (disrupted) MNIST data:

![reconstruction](https://raw.githubusercontent.com/shenwanxiang/bidd-aggmap/master/docs/images/reconstruction.png)

`Org1`: the original grayscale images (channel = 1). `OrgRP1`: the randomized images of `Org1` (channel = 1). `RPAgg1, 5`: the images reconstructed from `OrgRP1` by AggMap feature restructuring (channel = 1 and 5, respectively; each color represents the features of one channel). `RPAgg5-tkb`: the original images with the pixels divided into 5 groups according to the 5 channels of `RPAgg5` and colored in the same way as `RPAgg5`.
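
One property that makes restructuring of shuffled pixels plausible is that a fixed random permutation of pixel positions leaves the pairwise correlations between pixels unchanged; it only relabels them. The sketch below checks this on synthetic data (not MNIST); the toy "images" and all names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "images": 200 samples of 16 pixels where each pixel shares signal
# with its neighbor, so nearby pixels are correlated.
raw = rng.normal(size=(200, 16))
X = raw + np.roll(raw, 1, axis=1)

# One fixed random shuffle of pixel positions, applied to every sample.
perm = rng.permutation(X.shape[1])
X_shuffled = X[:, perm]

corr = np.corrcoef(X, rowvar=False)
corr_shuffled = np.corrcoef(X_shuffled, rowvar=False)

# Shuffling pixels only reorders rows and columns of the correlation matrix,
# so a correlation-based method sees the same feature relationships.
same = np.allclose(corr_shuffled, corr[np.ix_(perm, perm)])
print(same)
```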

----

### The effect of the number of channels on model performance

- Multi-channel Fmaps can boost the model performance notably:
![channel_effect](https://raw.githubusercontent.com/shenwanxiang/bidd-aggmap/master/docs/images/channel_effect.png)

The performance of AggMapNet with different numbers of channels on `TCGA-T` (a) and `COV-D` (b). For `TCGA-T`, the average performance over ten-fold cross-validation is reported. For `COV-D`, five-fold cross-validation was repeated for 5 rounds with different random seeds (25 training runs in total), and the average performance on the validation sets is reported.
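
The repeated five-fold protocol described for `COV-D` (5 folds, 5 rounds, 25 training runs) can be sketched with scikit-learn's `RepeatedStratifiedKFold`. This is an illustration of the evaluation scheme only: it assumes the WDBC data and a logistic regression as stand-ins for the `COV-D` data and AggMapNet, and accuracy as the reported metric.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# 5 folds x 5 repeats, each repeat with a different shuffle = 25 runs.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)

scores = []
for train_idx, valid_idx in cv.split(X, y):
    # A fresh model per run, so no state leaks between folds.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[valid_idx], y[valid_idx]))  # validation accuracy

# Number of training runs and their average validation accuracy.
print(len(scores), round(float(np.mean(scores)), 3))
```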

----

### Example for Restructured Fmaps
- An example on the WDBC dataset: click [here](https://github.com/shenwanxiang/bidd-aggmap/blob/master/paper/example/00_breast_cancer/00_WDBC_example_flow.ipynb) to find out more!
![Fmap](https://raw.githubusercontent.com/shenwanxiang/bidd-aggmap/master/docs/images/WDBC.png)

----

### Citation
Shen, Wan Xiang, et al. "AggMapNet: enhanced and explainable low-sample omics deep learning with feature-aggregated multi-channel networks." Nucleic Acids Research 50.8 (2022): e45-e45.

----