The runtime and accuracy reported on this page were generated using
n2-standard-96 GCP instances, which have the following configuration:
GCP instance type: n2-standard-96
CPUs: 96-core (vCPU)
Memory: 384GiB
GPUs: 0
Runtime is on HG002/HG003/HG004 (all chromosomes).
Reported runtime is an average of 5 runs.
Stage | Wall time (minutes and seconds) |
---|---|
make_examples | 172m53.87s |
call_variants: HG002 | 269m26.55s |
call_variants: HG003 | 268m2.29s |
call_variants: HG004 | 270m22.72s |
postprocess_variants (parallel) | 34m12.36s; 35m4.75s; 35m8.14s |
vcf_stats_report (optional): HG002 | 6m36.58s |
vcf_stats_report (optional): HG003 | 6m39.92s |
vcf_stats_report (optional): HG004 | 6m40.64s |
total | 1028m3.08s (17h8m3.08s) |
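The wall times in the table use the `XmY.YYs` format. The sketch below is not part of the DeepTrio scripts; it only illustrates one way to average such values across several runs and to rewrite a minute total as hours. The function names `average_wall_time` and `to_hours` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for wall times written as XmY.YYs (for example 172m53.87s).

# Average any number of XmY.YYs values and print the mean in the same format.
average_wall_time() {
  printf '%s\n' "$@" | awk '
    { split($0, p, "m"); sub(/s$/, "", p[2]); total += p[1] * 60 + p[2] }
    END {
      avg = sprintf("%.2f", total / NR) + 0   # round first so seconds never print as 60.00
      m = int(avg / 60)
      printf "%dm%.2fs\n", m, avg - m * 60
    }'
}

# Rewrite a minute total as hours, minutes, and seconds.
to_hours() {
  awk -v t="$1" 'BEGIN { split(t, p, "m"); printf "%dh%dm%s\n", int(p[1] / 60), p[1] % 60, p[2] }'
}
```

For example, `to_hours 1028m3.08s` prints `17h8m3.08s`, the same duration as the total above.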
We report hap.py results on the HG002/HG003/HG004 trio for chr20, which was
held out during training, using NIST v4.2.1 truth.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 11208 | 48 | 13 | 0.995736 | 0.998884 | 0.997308 |
SNP | 71088 | 245 | 41 | 0.996565 | 0.999424 | 0.997993 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 10578 | 50 | 24 | 0.995295 | 0.99783 | 0.996561 |
SNP | 69977 | 189 | 64 | 0.997306 | 0.999087 | 0.998196 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 10949 | 51 | 23 | 0.995364 | 0.997993 | 0.996676 |
SNP | 71445 | 214 | 48 | 0.997014 | 0.999329 | 0.99817 |
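As a reading aid for the columns above, here is a minimal sketch of how two of the metrics relate to the counts. Recall follows from the truth-side counts as TP / (TP + FN), and F1 is the harmonic mean of precision and recall. Precision is not recomputed here because, to our understanding, hap.py derives it from query-side counts that these tables do not show. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; they only restate the standard metric definitions.

# Recall from the truth-side counts: TP / (TP + FN).
recall() {
  awk -v tp="$1" -v fn="$2" 'BEGIN { printf "%.6f\n", tp / (tp + fn) }'
}

# F1 as the harmonic mean of precision and recall.
f1_score() {
  awk -v p="$1" -v r="$2" 'BEGIN { printf "%.6f\n", 2 * p * r / (p + r) }'
}
```

For example, `recall 11208 48` prints `0.995736`, matching the INDEL row with TRUTH.TP 11208. Because the published precision and recall are already rounded, `f1_score` may differ from the table in the last digit for some rows.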
Read haplotagging in DeepTrio PacBio is on by default. You no longer
need to run DeepVariant->WhatsHap->DeepTrio, and can just run DeepTrio once.
Runtime is on HG002/HG003/HG004 (all chromosomes).
Reported runtime is an average of 5 runs.
Stage | Wall time (minutes and seconds) |
---|---|
make_examples | 16m48.88s+288m15.08s |
call_variants: HG002 | 279m5.76s |
call_variants: HG003 | 274m47.90s |
call_variants: HG004 | 283m37.89s |
postprocess_variants (parallel) | 44m12.28s; 51m39.02s; 51m52.66s |
vcf_stats_report (optional): HG002 | 6m49.94s |
vcf_stats_report (optional): HG003 | 6m53.24s |
vcf_stats_report (optional): HG004 | 7m19.57s |
total | 1206m35.85s (20h6m35.85s) |
We report hap.py results on the HG002/HG003/HG004 trio for chr20, which was
held out during training, using NIST v4.2.1 truth.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 11213 | 43 | 84 | 0.99618 | 0.992863 | 0.994519 |
SNP | 71305 | 28 | 21 | 0.999607 | 0.999706 | 0.999657 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 10577 | 51 | 77 | 0.995201 | 0.993089 | 0.994144 |
SNP | 70143 | 23 | 35 | 0.999672 | 0.999502 | 0.999587 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 10954 | 46 | 70 | 0.995818 | 0.993931 | 0.994874 |
SNP | 71617 | 42 | 22 | 0.999414 | 0.999693 | 0.999554 |
Runtime is on HG002/HG003/HG004 (all chromosomes).
Reported runtime is an average of 5 runs.
Stage | Wall time (minutes and seconds) |
---|---|
make_examples | 7m11.47s |
call_variants: HG002 | 3m49.25s |
call_variants: HG003 | 3m53.32s |
call_variants: HG004 | 3m52.68s |
postprocess_variants (parallel) | 0m40.52s; 0m42.09s; 0m42.30s |
vcf_stats_report (optional): HG002 | 0m5.65s |
vcf_stats_report (optional): HG003 | 0m5.69s |
vcf_stats_report (optional): HG004 | 0m7.15s |
total | 20m6.26s |
We report hap.py results on the HG002/HG003/HG004 trio for chr20, which was
held out during training, using NIST v4.2.1 truth.
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 34 | 0 | 0 | 1.0 | 1.0 | 1.0 |
SNP | 670 | 2 | 0 | 0.997024 | 1.0 | 0.99851 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 29 | 0 | 0 | 1.0 | 1.0 | 1.0 |
SNP | 683 | 2 | 1 | 0.99708 | 0.998538 | 0.997809 |
Type | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
---|---|---|---|---|---|---|
INDEL | 32 | 1 | 1 | 0.969697 | 0.969697 | 0.969697 |
SNP | 676 | 3 | 0 | 0.995582 | 1.0 | 0.997786 |
For simplicity and consistency, we report runtime with a CPU instance with 96 CPUs.
For bigger datasets (WGS and PACBIO), we used a bigger disk size (900G).
This is NOT the fastest or cheapest configuration.
Use gcloud compute ssh to log in to the newly created instance.
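The exact instance-creation command is not shown in this section. As a rough sketch only, the commands might look like the following, using the machine type and disk size stated above. The instance name and zone are placeholders we made up, and the OS image is left at the gcloud default because it is not specified here. The helper prints the commands instead of running them so they can be reviewed first.

```bash
#!/usr/bin/env bash
# Hypothetical values: adjust the instance name and zone for your own project.
INSTANCE="${INSTANCE:-deeptrio-cpu}"
ZONE="${ZONE:-us-central1-b}"
DISK_SIZE="${DISK_SIZE:-900GB}"   # 900G was used for the WGS and PACBIO runs

# Print the gcloud commands rather than executing them.
print_instance_commands() {
  cat <<EOF
gcloud compute instances create "${INSTANCE}" --machine-type=n2-standard-96 --boot-disk-size=${DISK_SIZE} --zone=${ZONE}
gcloud compute ssh "${INSTANCE}" --zone=${ZONE}
EOF
}
```

Calling `print_instance_commands` writes both commands to standard output; copy them into a shell once the values look right.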
Download and run any of the following case study scripts:
curl -O https://raw.githubusercontent.com/google/deepvariant/r1.8/scripts/inference_deeptrio.sh
# WGS
bash inference_deeptrio.sh --model_preset WGS
# WES
bash inference_deeptrio.sh --model_preset WES
# PacBio
bash inference_deeptrio.sh --model_preset PACBIO
Runtime metrics are taken from the resulting log after each stage of
DeepTrio. The runtime numbers reported above are the average of 5 runs each.
The accuracy metrics come from the hap.py summary.csv output file.
The runs are deterministic so all 5 runs produced the same output.
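To turn a hap.py summary.csv into a table like the ones on this page, something along these lines may help. It is a minimal sketch, not the script used to produce this page: it assumes the CSV header contains the column names shown in the tables above, and that a `Filter` column (if present) marks the rows of interest with `PASS`. The function name `summary_table` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print selected summary.csv columns as a pipe table.
# Columns are looked up by header name, so their order in the file does not matter.
summary_table() {
  awk -F',' '
    BEGIN {
      n = split("Type TRUTH.TP TRUTH.FN QUERY.FP METRIC.Recall METRIC.Precision METRIC.F1_Score", want, " ")
    }
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      for (j = 1; j <= n; j++) { head = head want[j] " | "; rule = rule "---|" }
      sub(/ $/, "", head); print head; print rule
      next
    }
    # Keep PASS rows when a Filter column exists; otherwise keep every row.
    !("Filter" in col) || $(col["Filter"]) == "PASS" {
      row = ""
      for (j = 1; j <= n; j++) row = row $(col[want[j]]) " | "
      sub(/ $/, "", row); print row
    }
  ' "$1"
}
```

Usage: `summary_table path/to/summary.csv`, where the path is whatever location your hap.py run wrote to.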