# AItrika

![AItrika](images/logo.png)

[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)
![GitHub forks](https://img.shields.io/github/forks/dSupertramp/AItrika)
![GitHub commit activity (branch)](https://img.shields.io/github/commit-activity/t/dSupertramp/AItrika/main)
![GitHub last commit (branch)](https://img.shields.io/github/last-commit/dSupertramp/AItrika/main)

![Static Badge](https://img.shields.io/badge/medical-content?logo=syringe&logoColor=cyan&color=cyan)

Enhance your knowledge in medical research.

AItrika (formerly **PubGPT**) is a tool that makes it easy to extract relevant information from medical papers:

- Abstract
- Full text (when available)
- Genes
- Diseases
- Mutations
- Associations between genes and diseases
- MeSH terms
- Other terms
- Results
- Bibliography

And so on!

## 🚀 Run the demo app

You can try AItrika with the Streamlit app by running:

```bash
streamlit run app.py
```

Or you can use it as a script by running:

```bash
python main.py
```

## 📦 Install

To install everything, you need `uv`.

First of all, install `uv` with the command:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

After that, create a virtual environment with the command:

```bash
uv venv venv_name
```

Activate the virtual environment:

```bash
source venv_name/bin/activate
```

And install the dependencies:

```bash
uv pip install -r requirements.in
```
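
If you want to confirm that the virtual environment is really the one in use before running anything else, a small check like the following can help. This is only a sketch: the helper name `in_virtualenv` is made up for illustration and is not part of AItrika.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python` you use for the project. If it reports `False`, the activation step has probably not been applied in the current shell.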

## 🔑 Set LLM API Keys

In order to set API keys, insert your keys into the `env.example` file and rename it to `.env`.
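
To verify that a key is actually visible to your script, you can use a small standard-library check like the one below. It is a sketch under assumptions: the helpers `load_env_file` and `require_key` are hypothetical (not part of AItrika), the parser handles only plain `KEY=VALUE` lines, and `GROQ_API_KEY` is the variable name used in the usage example of this page.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a dotenv-style file."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def require_key(name: str, env_file: str = ".env") -> str:
    """Return the key from the process environment or the env file, else raise."""
    value = os.environ.get(name) or load_env_file(env_file).get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to {env_file}")
    return value


# Example: api_key = require_key("GROQ_API_KEY")
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the LLM provider.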

## 🔍 Usage

You can easily get information about a paper by passing a PubMed ID:

```python
from aitrika.engine.aitrika import OnlineAItrika

pubmed_id = 12345678  # placeholder: replace with a real PubMed ID
aitrika_engine = OnlineAItrika(pubmed_id=pubmed_id)
title = aitrika_engine.get_title()
print(title)
```

Or you can parse a local PDF:

```python
from aitrika.engine.aitrika import LocalAItrika

pdf_path = "path/to/paper.pdf"  # placeholder: replace with your PDF file
aitrika_engine = LocalAItrika(pdf_path=pdf_path)
title = aitrika_engine.get_title()
print(title)
```

```
Breast cancer genes: beyond BRCA1 and BRCA2.
```

You can get other information, such as the associations between genes and diseases:

```python
associations = aitrika_engine.get_associations()
```

```
[
  {
    "gene": "BRIP1",
    "disease": "Breast Neoplasms"
  },
  {
    "gene": "PTEN",
    "disease": "Breast Neoplasms"
  },
  {
    "gene": "CHEK2",
    "disease": "Breast Neoplasms"
  },
  ...
]
```

Or you can get a nicely formatted DataFrame:

```python
associations = aitrika_engine.get_associations(dataframe=True)
```

```
      gene                          disease
0    BRIP1                 Breast Neoplasms
1     PTEN                 Breast Neoplasms
2    CHEK2                 Breast Neoplasms
...
```
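
Once you have the associations, a common next step is to group them. The sketch below assumes the non-DataFrame result is a list of dictionaries with `gene` and `disease` keys, shaped like the JSON-style output above (if you receive a JSON string instead, parse it with `json.loads` first). The helper `genes_by_disease` is hypothetical, and the sample records are copied from that output.

```python
from collections import defaultdict

# Sample records copied from the associations output shown earlier.
associations = [
    {"gene": "BRIP1", "disease": "Breast Neoplasms"},
    {"gene": "PTEN", "disease": "Breast Neoplasms"},
    {"gene": "CHEK2", "disease": "Breast Neoplasms"},
]


def genes_by_disease(records):
    """Map each disease name to the sorted list of genes linked to it."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record["disease"]].add(record["gene"])
    return {disease: sorted(genes) for disease, genes in grouped.items()}


print(genes_by_disease(associations))
# {'Breast Neoplasms': ['BRIP1', 'CHEK2', 'PTEN']}
```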

With the power of RAG, you can query your document:

```python
import os

# Import generate_documents and GroqLLM from the aitrika package first.
# Prepare the documents (abstract is the text extracted from the paper)
documents = generate_documents(content=abstract)

# Set the LLM
llm = GroqLLM(documents=documents, api_key=os.getenv("GROQ_API_KEY"))

# Query your document
query = "Is BRCA1 associated with breast cancer?"
print(llm.query(query=query))
```

```
The provided text suggests that BRCA1 is associated with breast cancer, as it is listed among the high-penetrance genes identified in family linkage studies as responsible for inherited syndromes of breast cancer.
```

Or you can extract other information:

```python
results = aitrika_engine.extract_results(llm=llm)
print(results)
```

```
** RESULTS **

- High-penetrance genes - BRCA1, BRCA2, PTEN, TP53 - responsible for inherited syndromes
- Moderate-penetrance genes - CHEK2, ATM, BRIP1, PALB2, RAD51C - associated with moderate BC risk
- Low-penetrance alleles - common alleles - associated with slightly increased or decreased risk of BC
- Current clinical practice - high-penetrance genes - widely used
- Future prospect - all familial breast cancer genes - to be included in genetic test
- Research need - clinical management - of moderate and low-risk variants
```
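
For readers new to RAG (retrieval-augmented generation), the toy sketch below illustrates the general idea behind the document query shown earlier: pick the text chunks most relevant to the question, then hand them to the LLM as context. It is not AItrika's implementation. The functions `tokenize`, `retrieve`, and `build_prompt` are hypothetical, the relevance score is a plain word overlap rather than embeddings, and the two chunks are simplified sentences based on the results above.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks sharing the most tokens with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_tokens & tokenize(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context and the question into one prompt."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "BRCA1 and BRCA2 are high-penetrance genes for inherited breast cancer.",
    "CHEK2 and ATM are moderate-penetrance genes.",
]
query = "Is BRCA1 associated with breast cancer?"
print(build_prompt(query, retrieve(query, chunks)))
```

In a real pipeline the overlap score is typically replaced by vector similarity over embeddings, and the prompt is sent to the LLM instead of being printed.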

## 🚀 Run the API

To run the AItrika API, follow these steps:

1. Ensure you have set up your environment and installed all dependencies as described in the Install section.

2. Run the API server using the following command:

```bash
python api.py
```

The API will start running on http://0.0.0.0:8000. You can now make requests to the various endpoints:

- `/associations`: Get associations from a PubMed article
- `/abstract`: Get abstract of a PubMed article
- `/query`: Query a PubMed article
- `/results`: Get results from a PubMed article
- `/participants`: Get number of participants from a PubMed article
- `/outcomes`: Get outcomes from a PubMed article

You can use tools like curl, Postman, or any HTTP client to interact with the API. For example:

```bash
curl -X POST "http://localhost:8000/abstract" -H "Content-Type: application/json" -d '{"pubmed_id": 12345678}'
```
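
The same request can be sent from Python with the standard library. This is a minimal sketch that mirrors the curl call above: the helpers `build_request` and `call_api` are hypothetical, and because the response format is not described here, the body is returned as raw text.

```python
import json
import urllib.request


def build_request(
    endpoint: str,
    pubmed_id: int,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build a JSON POST request for one of the API endpoints."""
    body = json.dumps({"pubmed_id": pubmed_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(endpoint: str, pubmed_id: int) -> str:
    """Send the request and return the raw response body as text."""
    request = build_request(endpoint, pubmed_id)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Example (requires the API server to be running):
# print(call_api("abstract", 12345678))
```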

The API documentation is automatically generated and saved to `docs/api-reference/openapi.json`.
You can use this file with tools like Swagger UI for a more interactive API exploration experience.
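
If you only want a quick overview of the endpoints without opening Swagger UI, you can read the generated file directly. The sketch below relies on the standard OpenAPI layout, where a top-level `paths` object maps each path to its operations. The helpers `list_operations` and `load_spec` are hypothetical, and the default file location is the one mentioned above.

```python
import json
from pathlib import Path

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs declared in an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items can also hold non-operation keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations)


def load_spec(path: str = "docs/api-reference/openapi.json") -> dict:
    """Read the generated OpenAPI file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Example (after the file has been generated):
# for method, path in list_operations(load_spec()):
#     print(method, path)
```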

## Support the Project

If you find this project useful, please consider supporting it:

- 🌟 Star the project on GitHub
- 🐛 Report bugs or suggest new features
- 🤝 Contribute with pull requests
- ☕️ [Buy me a coffee](https://www.buymeacoffee.com/dsupertramp) or consider sponsoring the project.

### Commercial / Business use

If you're using this project in a business or commercial context, please [contact me](mailto:salvatoredanilopalumbo@gmail.com).

I'm available for consulting, custom development, or commercial licensing.

Your support helps keep this project active and continuously improving. Thank you!

## License

AItrika is licensed under the Apache 2.0 License. See the LICENSE file for more details.

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=dSupertramp/AItrika&type=Date)](https://star-history.com/#dSupertramp/AItrika&Date)