# BBBD: Blood Biomarkers for Brain Diseases

## Overview
This project analyzes blood-based biomarkers for brain diseases, with a particular focus on Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI). The analysis uses gene expression data from blood samples and cross-references it with tissue-specific expression data to identify brain-enriched genes.

## Datasets

### GEO AddNeuroMed Cohort
- **Batch 1 (GSE63060)**
  - 145 AD samples
  - 80 MCI samples
  - 104 healthy controls (CTL)

- **Batch 2 (GSE63061)**
  - 175 AD samples
  - 78 MCI samples
  - 135 healthy controls (CTL)

### GTEx Data
- Version: V8
- Scope: Bulk tissue expression
- Coverage: 56,200 genes across 49 tissues (including 18 brain tissues)
- Measurement: Gene TPM (Transcripts Per Million)

## Methodology

### Data Preparation

1. **Batch Normalization**
   - Cross-batch normalization for GEO datasets
   - Standardization of expression values

2. **Brain-Enriched Gene Filtering**
   Based on the NCBI definition: genes expressed at least 4x higher in brain than in other organs

   Two filtering approaches:
   - **Filtering 1**: mean(brain subtissues) > 4 * mean(other tissues)
   - **Filtering 2**: brain subtissue > 4 * mean(other tissues)

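The README does not say which normalization method is used across the two GEO batches. As a minimal sketch, assuming a per-gene z-score computed within each batch, the standardization step might look like the following. The function name, gene names, and values are hypothetical and serve only as illustration.

```python
from statistics import mean, pstdev


def standardize_within_batch(expr):
    """Z-score each gene across the samples of a single batch.

    expr maps gene -> list of expression values (one per sample).
    Genes with zero variance are mapped to all zeros.
    This is an assumed method, not necessarily the one used in the project.
    """
    out = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out


# Toy data: two batches measured on different scales (made-up values).
batch1 = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [5.0, 5.0, 5.0]}
batch2 = {"GENE_A": [10.0, 20.0, 30.0], "GENE_B": [7.0, 8.0, 9.0]}

z1 = standardize_within_batch(batch1)
z2 = standardize_within_batch(batch2)

# Once each batch is on a common scale, samples can be concatenated per gene.
combined = {gene: z1[gene] + z2[gene] for gene in z1}
```

Standardizing inside each batch removes batch-specific shifts in location and scale, at the cost of also removing any true difference in mean expression between the batches.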
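The two brain-enrichment rules can be sketched on per-gene TPM values as below. Filtering 2 is read here as "at least one brain subtissue exceeds 4x the mean of the other tissues", which is an assumption since the README does not say whether one or all subtissues must pass. Tissue names, gene names, and TPM values are placeholders.

```python
from statistics import mean

FOLD = 4  # fold threshold stated in the README


def is_enriched_mean(tpm, brain_tissues):
    """Filtering 1: mean TPM over brain subtissues > 4 * mean TPM over other tissues."""
    brain = [v for t, v in tpm.items() if t in brain_tissues]
    other = [v for t, v in tpm.items() if t not in brain_tissues]
    return mean(brain) > FOLD * mean(other)


def is_enriched_any(tpm, brain_tissues):
    """Filtering 2 (assumed reading): any brain subtissue > 4 * mean TPM over other tissues."""
    other = [v for t, v in tpm.items() if t not in brain_tissues]
    threshold = FOLD * mean(other)
    return any(v > threshold for t, v in tpm.items() if t in brain_tissues)


# Toy TPM table: gene -> {tissue: TPM}. All names and values are made up.
brain_tissues = {"Brain_Cortex", "Brain_Cerebellum"}
genes = {
    "GENE_A": {"Brain_Cortex": 50.0, "Brain_Cerebellum": 40.0, "Liver": 5.0, "Lung": 5.0},
    "GENE_B": {"Brain_Cortex": 30.0, "Brain_Cerebellum": 2.0, "Liver": 5.0, "Lung": 5.0},
    "GENE_C": {"Brain_Cortex": 6.0, "Brain_Cerebellum": 4.0, "Liver": 5.0, "Lung": 5.0},
}

filter1 = [g for g, tpm in genes.items() if is_enriched_mean(tpm, brain_tissues)]
filter2 = [g for g, tpm in genes.items() if is_enriched_any(tpm, brain_tissues)]
```

Under this reading Filtering 2 is the more permissive rule: `GENE_B` is high in only one subtissue, so it passes Filtering 2 but not Filtering 1.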
### Data Processing Pipeline

1. **Initial Filtering**
   - Row means filtering
   - Interquartile Range (IQR) filtering
   - Log Fold Change (LogFC) filtering

2. **Statistical Analysis**
   - P-value computation
   - P-value adjustment for multiple testing
   - Significance filtering (threshold = 0.01)

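The three initial filters can be sketched as a single per-gene predicate. The README gives no cutoffs, so every threshold below is hypothetical, and expression values are assumed to be on a log2 scale so that the log fold change is a difference of group means.

```python
from statistics import mean, quantiles


def passes_initial_filters(case, control,
                           min_mean=1.0, min_iqr=0.5, min_abs_logfc=0.5):
    """Apply row-mean, IQR, and log fold change filters to one gene.

    case / control are lists of log2 expression values (assumed scale).
    All three thresholds are hypothetical defaults, not the project's values.
    """
    values = case + control

    # 1. Row-mean filter: drop genes with low average expression.
    if mean(values) < min_mean:
        return False

    # 2. IQR filter: drop genes that barely vary across samples.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    if q3 - q1 < min_iqr:
        return False

    # 3. LogFC filter: on log2 data, fold change is a difference of means.
    log_fc = mean(case) - mean(control)
    return abs(log_fc) >= min_abs_logfc


# Toy data: gene -> (case samples, control samples). Values are made up.
data = {
    "GENE_A": ([8.0, 8.2, 8.4], [7.0, 7.1, 6.9]),    # passes all filters
    "GENE_B": ([0.2, 0.3, 0.1], [0.1, 0.2, 0.3]),    # low mean expression
    "GENE_C": ([5.0, 5.1, 4.9], [5.0, 5.05, 4.95]),  # low variability
    "GENE_D": ([4.0, 6.0, 8.0], [4.1, 6.0, 7.9]),    # variable, no group shift
}

kept = [g for g, (case, ctl) in data.items() if passes_initial_filters(case, ctl)]
```

Running the cheap row-mean and IQR checks first reduces the number of genes that reach the statistical tests, which in turn lightens the multiple-testing correction.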
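The README does not name the statistical test or the correction method. As a sketch, assuming the Benjamini-Hochberg procedure for the multiple-testing adjustment, the last two steps of the statistical analysis could look like this. The raw p-values are placeholders standing in for the output of whatever per-gene test is used.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


ALPHA = 0.01  # significance threshold stated in the README

# Placeholder raw p-values per gene (made-up numbers).
raw = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.5}

names = list(raw)
adj = dict(zip(names, benjamini_hochberg([raw[g] for g in names])))
significant = [g for g in names if adj[g] < ALPHA]
```

Here `GENE_C` has a raw p-value of 0.03 and an adjusted value of 0.04, so it is excluded at the 0.01 threshold, while `GENE_A` and `GENE_B` are kept.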
## Results

An enrichment analysis was conducted on the resulting genes using Enrichr and EnrichrKG. It found a strong (p-value: e-28) correlation between 2 out of 11 genes involved in mitochondrial ATP synthesis processes and many brain diseases. <br>
Recent articles, using different methods (LASSO, SVM), confirm that these two genes are candidates for predicting LO-AD and MCI. <br>
Further analysis will be conducted on other GWAS datasets such as ADNI.
Also, since partial inhibition of mitochondrial complex I has been exploited as a therapeutic target for AD, further analysis can be conducted on these 2-11 genes using MIENTURNET to evaluate potential RNA therapeutic approaches for AD. <br>

## Future Directions

### Planned Extensions
1. **ADNI Dataset Integration**
   - Additional validation of findings
   - Cross-cohort analysis

2. **GWAS Analysis**
   - Integration with genetic variant data
   - Investigation of genetic associations

## Data Access
- GEO datasets: GSE63060, GSE63061
- GTEx data: V8 release

## Author
Stefano Patalano (2024)