This is a short guide to using the show_examples tool to view the pileup images
used within DeepVariant and save them as PNG image files. This tool is
particularly useful when you want to understand how a candidate variant
of interest was represented when it was passed into the neural network.
This example was generated with the data from the
quick start guide and the example commands below.
For more information on the pileup images and how to read them, please see the
"Looking through DeepVariant's Eyes" blog post.
The show_examples tool was introduced in DeepVariant 1.0.0, so it is not
available in older versions, but it works with make_examples output files
from older versions of DeepVariant.
First, find the make_examples.tfrecord.gz files output by DeepVariant during the
make_examples (first) stage.
If you followed the quick start guide and the case studies that use the Docker
version, these files usually stay hidden inside the Docker container. You can
export them into the same output directory where the VCF file appears by adding
the following setting to the run_deepvariant command.
# Add the following to your run_deepvariant command.
--intermediate_results_dir=/output/
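Then the make_examples file should appear in the directory that Docker mounts as
/output/. For example, if you followed the quick-start documentation, it looks
like this: ${OUTPUT_DIR}/make_examples.tfrecord-00000-of-00001.gz.
If you set --intermediate_results_dir to a subdirectory such as
/output/intermediate_results_dir, as the show_examples command below assumes,
look inside that subdirectory instead.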
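To confirm that the export worked, a quick search of the output directory may help. The following is a minimal sketch and not part of DeepVariant: the helper name `list_make_examples` is made up for illustration, and it assumes the files follow the `make_examples.tfrecord-NNNNN-of-NNNNN.gz` naming shown above.

```shell
# Hypothetical helper (not part of DeepVariant): list every make_examples
# shard found anywhere under the given directory, sorted by path.
# The name pattern is an assumption based on the file name shown above.
list_make_examples() {
  search_dir="$1"
  find "${search_dir}" -type f -name 'make_examples.tfrecord-*-of-*.gz' | sort
}

# Example usage, assuming OUTPUT_DIR is set as in the quick start:
# list_make_examples "${OUTPUT_DIR}"
```

Because the search is recursive, it finds the files whether they were written directly to the output directory or to a subdirectory of it.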
Once you have a make_examples output tfrecord file, you can run show_examples
to see the pileup images inside:
# Continuing from the quick start linked above:
INPUT_DIR="${PWD}/quickstart-testdata"
OUTPUT_DIR="${PWD}/quickstart-output"
BIN_VERSION="1.6.1" # show_examples is available only in version 1.0.0 and later.
sudo docker run \
  -v "${INPUT_DIR}":"/input" \
  -v "${OUTPUT_DIR}":"/output" \
  google/deepvariant:"${BIN_VERSION}" /opt/deepvariant/bin/show_examples \
  --examples=/output/intermediate_results_dir/make_examples.tfrecord-00000-of-00001.gz \
  --example_info_json=/output/intermediate_results_dir/make_examples.tfrecord-00000-of-00001.gz.example_info.json \
  --output=/output/pileup \
  --num_records=20 \
  --curate
# And then your images are here:
ls "${OUTPUT_DIR}"/pileup*.png
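In the command above, the --example_info_json path is the --examples path with `.example_info.json` appended. Before starting the container, it can be worth checking on the host that both files exist. This is a sketch only: `check_show_examples_inputs` is a hypothetical helper, not a DeepVariant tool, and it assumes the JSON file always sits next to the shard under that name.

```shell
# Hypothetical pre-flight check (not part of DeepVariant): verify that an
# examples shard and its companion example_info JSON both exist.
# Prints the JSON path on success; returns non-zero if either is missing.
check_show_examples_inputs() {
  examples="$1"
  # Naming assumption taken from the show_examples command above.
  info="${examples}.example_info.json"
  if [ ! -f "${examples}" ]; then
    echo "missing examples file: ${examples}" >&2
    return 1
  fi
  if [ ! -f "${info}" ]; then
    echo "missing example info JSON: ${info}" >&2
    return 1
  fi
  echo "${info}"
}

# Example usage with the host-side path that Docker mounts as /output:
# check_show_examples_inputs \
#   "${OUTPUT_DIR}/intermediate_results_dir/make_examples.tfrecord-00000-of-00001.gz"
```

Note that the helper takes host paths, so `/output/...` inside the container corresponds to `${OUTPUT_DIR}/...` here.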
* --regions chr20:1-3000000 or paths to BED or [...]
* --vcf variants.vcf. This can be a piece [...]
* --num_records 10.
* --examples make_examples.tfrecord@64.gz
* --regions or --vcf [...]
* --curate to create a TSV file with concepts for each pileup. Then [...] --filter_by_tsv to e.g. get pileup images only for examples with low [...] grep would be an easy option (the [...]
* --write_tfrecords after applying any [...]
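A sharded spec such as `make_examples.tfrecord@64.gz` is shorthand for 64 files named `make_examples.tfrecord-00000-of-00064.gz` through `make_examples.tfrecord-00063-of-00064.gz`, the same naming pattern as the single-shard file used earlier. The sketch below prints the file names a spec stands for, which can be handy for checking that every shard is present. `expand_sharded_spec` is a hypothetical helper written for this guide, not a DeepVariant command.

```shell
# Hypothetical helper (not part of DeepVariant): print the file names that
# a sharded spec of the form PREFIX@COUNT[SUFFIX] stands for, one per line.
expand_sharded_spec() {
  spec="$1"
  prefix="${spec%%@*}"          # text before the "@"
  rest="${spec#*@}"             # e.g. "64.gz"
  count="${rest%%.*}"           # e.g. "64"
  suffix="${rest#"${count}"}"   # e.g. ".gz" (may be empty)
  i=0
  while [ "${i}" -lt "${count}" ]; do
    # Shard index and shard count are zero-padded to five digits.
    printf '%s-%05d-of-%05d%s\n' "${prefix}" "${i}" "${count}" "${suffix}"
    i=$((i + 1))
  done
}

# Example usage:
# expand_sharded_spec 'make_examples.tfrecord@64.gz'
```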