# Templating for Clinical Trials Eligibility Criteria
Scripts, data, and clusters used to template clinical trial eligibility criteria and give them more structure.

## Scripts
Most scripts used for clustering and automatic templating are stored in /scripts/.

## Raw Data
Raw data for both the sample set and the full cancer set are stored in /Formatted Data/.

## Clusters

### Sample set
Clusters are stored at /clusters/final340numclusters2/.

### Full cancer set
Clusters are stored at /clusters/final9knumclusters/, ordered clusters at /clusters/ordered9kclusters/, clusters of cluster centers at /clusters/center9kclusters/, and megaclusters at /clusters/mega9kclust/.

### Other clusters
Other intermediate clusters and clustering schemes used along the way are also stored in /clusters/; the final clustering schemes are the ones noted above.

## Templates
Some program-generated templates for the full cancer set are stored at /templates/templates/ and /templates/fillins/. The thresholds used for those templates are noted in /templates/thresholds. All booleanized templates for the sample set are stored at /templates/booleanizedtemplates/.

## Parser
Code for the parser is stored at /templates/parser/.