# IntelliGenes

This is the CLI implementation of _IntelliGenes_. A GUI version is available in the `intelligenes-gui` branch.

IntelliGenes is a portable, Python-based pipeline that addresses the challenge of interpreting the rapidly growing volume of genomics datasets. It serves as a comprehensive toolkit, fitting cutting-edge algorithms for disease-associated biomarker discovery and patient prediction to users’ own cohorts. IntelliGenes integrates demographics with genomics, so investigations can consider both kinds of variables simultaneously. With IntelliGenes, we introduce I-Genes Scores, our novel metric for understanding the relevance of biomarkers in disease prediction engines.

_IntelliGenes_ can be installed from our GitHub repository using the terminal. Follow these steps to install IntelliGenes and its dependencies:
```
# Clone IntelliGenes’ GitHub Repository
git clone https://github.com/drzeeshanahmed/intelligenes.git

# Navigate to IntelliGenes
cd intelligenes/

# Install IntelliGenes
pip install .
```

_IntelliGenes_ offers a robust selection of tools to help users understand their multi-genomics datasets. It has been designed as an easy-to-understand pipeline for users at all levels of computational experience. _IntelliGenes_ provides three commands:
```
# Discover Biomarkers
igenes_select -i data/cigt_file.csv -o results/

# Disease Prediction & I-Genes Scores
igenes_predict -i data/cigt_file.csv -f features_file.csv -o results/

# IntelliGenes (Discovering Biomarkers & Predicting Disease)
igenes -i data/cigt_file.csv -o results/
```

These are sample commands. We have provided an example CIGT file in `tests/`.

These commands accept various flags that allow users to tailor _IntelliGenes_ to their exact needs. Use `--help` to list them:
```
# IntelliGenes Selection Help
igenes_select --help

# IntelliGenes Prediction Help
igenes_predict --help

# IntelliGenes Help
igenes --help
```
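
For batch runs, the sample commands above can also be assembled and launched from a Python script. The following is a minimal sketch and not part of IntelliGenes itself: the helper names `build_command` and `run_command` are hypothetical, and it assumes only the `-i`, `-f`, and `-o` flags that appear in the sample commands (with `-f` used for `igenes_predict` only, as shown there).

```python
import shutil
import subprocess

# Command names taken from the sample commands in this README.
TOOLS = ("igenes_select", "igenes_predict", "igenes")


def build_command(tool, cigt_file, output_dir, features_file=None):
    """Assemble the argument list for one IntelliGenes command (hypothetical helper)."""
    if tool not in TOOLS:
        raise ValueError(f"unknown IntelliGenes command: {tool}")
    # Assumption: -f is only passed to igenes_predict, as in the sample commands.
    if features_file is not None and tool != "igenes_predict":
        raise ValueError("-f is only shown for igenes_predict")
    cmd = [tool, "-i", str(cigt_file)]
    if features_file is not None:
        cmd += ["-f", str(features_file)]
    cmd += ["-o", str(output_dir)]
    return cmd


def run_command(cmd):
    """Run a command built by build_command, failing early if it is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; is IntelliGenes installed?")
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, check=True)


# Print the three sample commands without running them.
for cmd in (
    build_command("igenes_select", "data/cigt_file.csv", "results/"),
    build_command("igenes_predict", "data/cigt_file.csv", "results/", "features_file.csv"),
    build_command("igenes", "data/cigt_file.csv", "results/"),
):
    print(" ".join(cmd))
```

`build_command` only assembles the argument list, so a command can be inspected or logged before anything runs; the actual call happens in `run_command`.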

_IntelliGenes_ requires a CIGT-formatted dataset as input. Examples of CIGT datasets can be found on our GitHub. A CIGT-formatted dataset integrates demographics and transcriptomics:
- Columns contain demographic or transcriptomic biomarkers, while rows contain identifiers for individual patients.
- Demographics such as ‘Age’, ‘Race’, and ‘Sex’ should be integers (use EHR standards). These demographics are not required, as IntelliGenes also works using only genomics/transcriptomics.
- There must be a ‘Type’ column, denoting a patient’s status as an integer (use 0 or 1).
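
The rules above can be checked before running the pipeline. The following is a minimal sketch using only the Python standard library, not the validation that IntelliGenes itself performs: the function name `check_cigt` is hypothetical, and the sample data (including the `ID` and `GENE1` column names) is made up for illustration.

```python
import csv
import io

# Optional demographic columns named in the list above; integers when present.
DEMOGRAPHIC_COLUMNS = ("Age", "Race", "Sex")


def _is_int(value):
    """Return True if value parses as an integer (e.g. "54" but not "5.4")."""
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        return False


def check_cigt(csv_source):
    """Return a list of problems found in a CIGT-style CSV (empty if none).

    csv_source is any iterable of text lines, such as an open file.
    """
    reader = csv.DictReader(csv_source)
    header = reader.fieldnames or []
    problems = []
    if "Type" not in header:
        problems.append("missing required 'Type' column")
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if "Type" in header and row["Type"] not in ("0", "1"):
            problems.append(f"line {line_no}: Type must be 0 or 1, got {row['Type']!r}")
        for col in DEMOGRAPHIC_COLUMNS:
            if col in header and not _is_int(row[col]):
                problems.append(f"line {line_no}: {col} must be an integer, got {row[col]!r}")
    return problems


# Made-up example: two patients, two demographics, one biomarker.
sample = "ID,Age,Sex,GENE1,Type\nP1,54,1,12.5,0\nP2,61,0,8.1,1\n"
print(check_cigt(io.StringIO(sample)))
```

To check a file on disk, pass an open file handle instead, for example `check_cigt(open("data/cigt_file.csv", newline=""))`.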

More information is available in **Supplementary Material 2: _IntelliGenes_: Installation, configuration, and user’s guidelines**.

If using _IntelliGenes_, please cite:

Degroat, W., Mendhe, D., Bhurasi, A., Abdelhalim, H., Saman, Z., & Ahmed, Z. (2023). IntelliGenes: A novel machine learning pipeline for biomarker discovery and predictive analysis using multi-genomic profiles. Bioinformatics, 39(12), btad755. PMID: 38096588. doi:10.1093/bioinformatics/btad755 (Oxford).