---
title: "Importing Spatial Data"
output:
html_document:
toc: true
toc_depth: 4
toc_float:
collapsed: false
smooth_scroll: false
---
```{css, echo=FALSE}
.watch-out {
color: black;
}
```
```{r setup, include=FALSE}
# use rmarkdown::render_site(envir = knitr::knit_global())
knitr::opts_chunk$set(highlight = TRUE, echo = TRUE)
```
# Importing Spatial Datasets
VoltRon is an end-to-end spatial omic analysis package that supports a large selection of spatial data resolutions. Currently, there exists a considerable amount of spatial omic technologies that generate datasets whose omic profiles are spatially resolved.
VoltRon objects are compatible with readouts of almost all of these technologies where we provide a selection of built-in functions to help users constructing VoltRon objects with ease. In this tutorial, we will review these spatial omic instruments and the functions available within the VoltRon package to import their readouts.
## Visium (10x Genomics)
Image Credit: The Visium Spatial Gene Expression Slide (https://www.10xgenomics.com/)
10x Genomics [Visium](https://www.10xgenomics.com/products/spatial-gene-expression) Spatial Gene Expression Platform incorporates in situ arrays (i.e. spots) to capture spatial localization of omic profiles where these spots are of 55 \mu m in diameter and constitute a grid that covers a significant portion of a tissue section placed on the slide of the instrument. We will use the readouts of **Visium CytAssist** platform that was derived from a single tissue section of a breast cancer sample. More information on the Visium CytAssist data and the study can be also be found on the [BioArxiv preprint](https://www.biorxiv.org/content/10.1101/2022.10.06.510405v1). You can download the data from the [10x Genomics website](https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast) (specifically, import the **Visium Spatial** data). We use the **importVisium** function to import the Visium readouts and create a VoltRon object. Here, we point to the folder of all the files with **dir.path** argument and also determine the name of this sample (**sample_name**). ```{r eval = FALSE, class.source="watch-out"} library(VoltRon) BiocManager::install("rhdf5") library(rhdf5) Vis_R1 <- importVisium(dir.path = "Visium/", sample_name = "VisiumR1") ``` ``` VoltRon Object VisiumR1: Layers: Section1 Assays: Visium(Main) ``` While importing the readouts, we can also determine the name of the assay as well as the name of the image. The **SampleMetadata** function summarizes the entire collection of assays, layers (sections) and samples (tissue blocks) within the R object. ```{r eval = FALSE, class.source="watch-out"} Vis_R1 <- importVisium(dir.path = "Visium/", sample_name = "VisiumR1", assay_name = "Visium_assay", image_name = "H&E_stain") SampleMetadata(Vis_R1) ``` ``` Assay Layer Sample Assay1 Visium_assay Section1 VisiumR1 ``` The current VoltRon object has only one assay associated with a single layer and a tissue block, and the image of this assay is currently the "H&E_stain". ```{r eval = FALSE, class.source="watch-out"} vrImageNames(Vis_R1) ``` ``` [1] "H&E_stain" ```https://www.10xgenomics.com/products/visium-hd-spatial-gene-expression
10x Genomics [VisiumHD](https://www.10xgenomics.com/products/visium-hd-spatial-gene-expression) Spatial Gene Expression Platform contains two 6.5 x 6.5 mm Capture Areas with a continuous lawn of oligonucleotides arrayed in millions of 2 x 2 µm barcoded squares without gaps, achieving single cell–scale spatial resolution. The data is output at 2 µm, as well as multiple bin sizes. The 8 x 8 µm bin is the recommended starting point for visualization and analysis. We will use the readouts of **Visium CytAssist** platform that was derived from a single tissue section of a breast cancer sample. More information on the Visium CytAssist data and the study can be also be found on the [BioArxiv preprint](https://www.biorxiv.org/content/10.1101/2022.10.06.510405v1). You can download the data from the [10x Genomics website](https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast) (specifically, import the **Visium Spatial** data). We use the **importVisium** function to import the Visium readouts and create a VoltRon object. Here, we point to the folder of all the files with **dir.path** argument and also determine the name of this sample (**sample_name**). ```{r eval = FALSE, class.source="watch-out"} library(VoltRon) install.packages("arrow") BiocManager::install("rhdf5") library(arrow) library(rhdf5) hddata <- importVisium(dir.path = "VisiumHD/outs/", sample_name = "VisiumHD") ```Image Credit: https://www.biorxiv.org/content/10.1101/2022.10.06.510405v2
The 10x Genomics [Xenium In Situ](https://www.10xgenomics.com/platforms/xenium) provides spatial localization of both (i) transcripts from a few hundred number of genes as well as (ii) the single cells with transcriptomic profiles. We will use the readouts of a single Xenium platform replicate that was derived from a single tissue section of a breast cancer sample. More information on the Xenium data and the study can be also be found on the [BioArxiv preprint](https://www.biorxiv.org/content/10.1101/2022.10.06.510405v1). You can download the data from the [10x Genomics website](https://www.10xgenomics.com/products/xenium-in-situ/preview-dataset-human-breast) (specifically, import the **In Situ Replicate 1** data). We use the **importXenium** function to import the Xenium readouts and create a VoltRon object. Here, we point to the folder of all the files with **dir.path** argument and also determine the name of this sample (**sample_name**). ```{r eval = FALSE, class.source="watch-out"} library(VoltRon) BiocManager::install("rhdf5") library(rhdf5) Xen_R1 <- importXenium("Xenium_R1/outs", sample_name = "XeniumR1") ``` ``` VoltRon Object XeniumR1: Layers: Section1 Assays: Xenium(Main) ``` You can use the **import_molecules** argument to import positions and features of the transcripts along with the single cell profiles. ```{r eval = FALSE, class.source="watch-out"} Xen_R1 <- importXenium("Xenium_R1/outs", sample_name = "XeniumR1", import_molecules = TRUE) Xen_R1 ``` ``` VoltRon Object XeniumR1: Layers: Section1 Assays: Xenium(Main) Xenium_mol ``` The **SampleMetadata** function summarizes the entire collection of assays, layers (sections) and samples (tissue blocks) within the R object. In this case, the function will generate two assays in a single layer where one is a "cell" assay and the other is a "molecule assay". ```{r eval = FALSE, class.source="watch-out"} SampleMetadata(Xen_R1) ``` ``` Assay Layer Sample Assay1 Xenium Section1 XeniumR1 Assay2 Xenium_mol Section1 XeniumR1 ``` The Xenium in situ platform provides multiple resolution of the same Xenium slide which can be parsed from the OME.TIFF image file of DAPI stained tissue section (e.g. morphology_mip.ome.tif). The **resolution_level** argument determines the resolution of the DAPI image generated from the OME.TIFF file. More information on resolution levels can be found [here](https://kb.10xgenomics.com/hc/en-us/articles/11636252598925-What-are-the-Xenium-image-scale-factors-). ```{r eval = FALSE, class.source="watch-out"} Xen_R1 <- importXenium("Xenium_R1/outs", sample_name = "XeniumR1", import_molecules = TRUE, resolution_level = 4, overwrite_resolution = TRUE) vrImages(Xen_R1, assay = "Xenium") ``` ``` # A tibble: 1 × 7 format width height colorspace matte filesize densityImage Credit: https://www.biochain.com/nanostring-geomx-digital-spatial-profiling/
The [Nanostring's GeoMx Digital Spatial Profiler](https://nanostring.com/products/geomx-digital-spatial-profiler/geomx-dsp-overview/) is a high-plex spatial profiling technology which produces segmentation-based protein and RNA assays. The instrument allows users to select regions of interest (ROIs) from fluorescent microscopy images that capture the morphological context of the tissue. These are ROIs are then used to generate transcriptomic or proteomic profiles. We will import the ROI profiles generated from the GeoMx scan area where COVID-19 lung tissues were fitted into. See [GSE190732](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190732) for more information on this study. Here is the usage of **importGeoMx** function and necessary files for this example:Argument | Description | Link |
---|---|---|
dcc.path | The path to DCC files directory | DDC files |
pkc.file | GeoMx™ DSP configuration file | Human RNA Whole Transcriptomic Atlas for NGS |
summarySegment | Segment summary table (.xls or .csv) | ROI Metadata file |
image | The Morphology Image of the scan area | Image file |
ome.tiff | The OME.TIFF Image of the scan area | OME.TIFF file |
The OME.TIFF Image XML file | OME.TIFF (XML) file |
Image Credit: https://www.biorxiv.org/content/10.1101/2021.11.03.467020v1.full
The [Nanostring's CosMx Spatial Molecular Imaging](https://nanostring.com/products/cosmx-spatial-molecular-imager/cosmx-smi-single-cell-imaging-de/) platform is a high-plex spatial multiomics technology that captures the spatial localization of both (i) transcripts from thousands of genes as well as (ii) the single cells with transcriptomic and proteomic profiles. We will use the readouts from two slides of a single CosMx experiment. You can download the data from the [Nanostring website](https://nanostring.com/products/cosmx-spatial-molecular-imager/ffpe-dataset/cosmx-smi-mouse-brain-ffpe-dataset/). We use the **importCosMx** function to import the CosMx readouts and create a VoltRon object. Here, we point to the folder of the [TileDB](https://tiledb.com/) array that stores feature matrix as well as the transcript metadata. ```{r eval = FALSE, class.source="watch-out"} CosMxR1 <- importCosMx(tiledbURI = "MuBrainDataRelease/") ``` ``` VoltRon Object Slide1: Layers: Section1 Slide2: Layers: Section1 Assays: CosMx(Main) ``` You can use the **import_molecules** argument to import positions and features of the transcripts along with the single cell profiles. ```{r eval = FALSE, class.source="watch-out"} CosMxR1 <- importCosMx(tiledbURI = "MuBrainDataRelease/", import_molecules = TRUE) ``` ## STOmics (MGI)Image Credit: https://bgi-australia.com.au/stomics
Before importing the STOmics data to VoltRon, we first convert STOmics readouts to an h5ad file using the **stereopy** Python module. For more information, visit [https://stereopy.readthedocs.io/](https://stereopy.readthedocs.io/en/latest/Tutorials/Format_Conversion.html). See [here](https://stereopy.readthedocs.io/en/latest/content/00_Installation.html) for instructions on how to install **stereopy**. ```{python class.source="watch-out", eval = FALSE} import stereo as st import warnings warnings.filterwarnings('ignore') # read the GEF file data_path = './SS200000135TL_D1.tissue.gef' data = st.io.read_gef(file_path=data_path, bin_size=50) data.tl.raw_checkpoint() # remember to set flavor as scanpy adata = st.io.stereo_to_anndata(data, flavor='scanpy', output='sample.h5ad') ```Image Credit: https://spatialgenomics.com/applications/
We use the **importGenePS** function to import the GenePS ([Spatial Genomics](https://spatialgenomics.com/product/)) readouts and create a VoltRon object. You can use the **import_molecules** argument to import positions and features of the transcripts along with the single cell profiles. The **resolution_level** argument determines the resolution of the DAPI image generated from the TIFF file. ```{r eval = FALSE, class.source="watch-out"} vrdata <- importGenePS(dir.path = "out/", import_molecules = TRUE, resolution_level = 7) ``` ``` VoltRon Object Sample1: Layers: Section1 Assays: GenePS(Main) GenePS_mol ``` ## PhenoCycler (Akoya)Image Credit: https://tep.cancer.illinois.edu/phenocycler-fusion-system/
We use the **importPhenoCycler** function to import the PhenoCycler ([Akoya Biosciences](https://www.akoyabio.com/)) readouts and create a VoltRon object. The function supports multiple readouts types depending on how the readouts were generated, this is controlled by the **type** arguement. For more information on all arguements of the function, see **help(importPhenoCycler)**. We used the Human FFPE tonsil tissue example with 83000 cells which could be found [here](https://akoya.app.box.com/s/lqaz1eyefni57sfynveh03e9sdy4aeuk). You have to download the contents and **dir.path** arguement should be set to the location of the **Example-dataset-for-MAV** folder. ```{r eval = FALSE, class.source="watch-out"} vr_pheno <- importPhenoCycler(dir.path = "Example-dataset-for-MAV/", type = "processor", sample_name = "Tonsil18AB") ``` ``` VoltRon Object Tonsil18AB: Layers: Section1 Assays: PhenoCycler(Main) ``` ## OpenSTImage Credit: https://www.cell.com/cell/fulltext/S0092-8674(24)00636-6
We use the **importOpenST** function to import the OpenST [https://rajewsky-lab.github.io/openst/latest/](https://rajewsky-lab.github.io/openst/latest/) readouts and create a VoltRon object. The function will parse each section from the output h5ad file, define it as an independent assay in a single layer where these layers are ordered in a single tissue block. We use the metastatic lymph node example that is deposited to NCBI/GEO ([GSE251926](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE251926)). Please use the **GSE251926_metastatic_lymph_node_3d.h5ad.gz** file and unzip it to use the importOpenST function below. ```{r eval = FALSE, class.source="watch-out"} vr_openst <- importOpenST(h5ad.path = "GSE251926_metastatic_lymph_node_3d.h5ad", sample_name = "MLN_3D") ``` ``` VoltRon Object MLN_3D: Layers: Section1 Section2 Section3 Section4 Section5 Section6 Section7 Section8 Section9 Section10 Section11 Section12 Section13 Section14 Section15 Section16 Section17 Section18 Section19 Assays: OpenST(Main) ``` ## DBIT-SeqImage Credit: https://www.cell.com/cell/fulltext/S0092-8674(20)31390-8
We use the **importDBITSeq** function to import the DBIT-Seq [https://www.cell.com/cell/fulltext/S0092-8674(20)31390-8](https://www.cell.com/cell/fulltext/S0092-8674(20)31390-8) readouts and create a VoltRon object. The default path to the rna count matrix is accompanied by the path to the protein count matrix, which is optional. The **size** parameter here determines the size of each square pixel on the DBIT-Seq slide (default is 10$\mu$m). We use the example with developing eye field in a E10 mouse embryo using 10-μm microfluidic (sample id: 0719cL for RNA, and 0719aL for Protein) channels that is deposited to NCBI/GEO ([GSE137986](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE137986)). ```{r eval = FALSE, class.source="watch-out"} vr_dbit <- importDBITSeq(path.rna = "GSM4189615_0719cL.tsv", path.prot = "GSM4202309_0719aL.tsv", size = 10, sample_name = "E10_Eye_2", image_name = "main") ``` ``` VoltRon Object E10_Eye_2: Layers: Section1 Assays: DBIT-Seq-RNA(Main) DBIT-Seq-Prot ``` ## ST dataImage Credit: Ståhl, et. al (2016). Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science, 353(6294), 78-82.
We demonstrate importing the original Spatial Transcriptomics (ST) datasets by formulating custom spot-level spatial transcriptomics datasets. We will use the **formVoltRon** function directly. We use the example provided by the [https://doi.org/10.5281/zenodo.4751624](https://doi.org/10.5281/zenodo.4751624). For more information you can also check the paper [here](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8516894/). We first import and manipulate count matrices and spot coordinates from the provided output files. Here we only demonstrate building a custom spot-level VoltRon object for the section "A1", however the remaining sections can be imported using the same method. ```{r eval = FALSE, class.source="watch-out"} # count matrix raw.data <- read.table("A1.tsv", header = TRUE, sep = "\t") spatialentities <- raw.data$X raw.data <- raw.data[,-1] raw.data <- t(as.matrix(raw.data)) # coords coords <- read.table("A1_selection.tsv", header = TRUE, sep = "\t") rownames(coords) <- paste(coords$x, coords$y, sep = "x") coords <- coords[entities,] coords <- coords[,c("pixel_x", "pixel_y")] colnames(coords) <- c("x", "y") ``` The image can be imported using the **magick** package. ```{r eval = FALSE, class.source="watch-out"} library(magick) img <- magick::image_read("HE/A1.jpg") img_info <- magick::image_info(img) ``` Before we form the VoltRon object, we should define the parameters for spot-level datasets. Here, we provide * the radius of a spot in the physical space (i.e. **spot.radius**) * the radius of a spot for visualization (i.e. **vis.spot.radius**) * The distance to the nearest spot, to be used for spatial neighborhood calculation (i.e. **nearestpost.distance**) The scaling parameter (scale_param) is required to overlay localization and distances of spots to the coordinate system of the imported image. Here, * the width of a ST slide is of 6200$\mu$m, * the diameter of a spot is 100$\mu$m (hence a radius of 50$\mu$m) and * the center-to-center distance between two spots is 200$\mu$m. * Each spot has at most 8 neighboring spots including vertically, horizontally and diagonally adjacent spots. ```{r eval = FALSE, class.source="watch-out"} scale_param <- img_info$width/6200 params <- list( spot.radius = 50*(scale_param), vis.spot.radius = 100*(scale_param), nearestpost.distance = (200*sqrt(2) + 50)*scale_param ) ``` Now we can combine all components into a VoltRon object. We should also flip the coordinates of spots vertically after creating the object. ```{r eval = FALSE, class.source="watch-out"} # make voltron object stdata <- formVoltRon(data = datax, image = img, coords = coords, assay.type = "spot", params = params, sample_name = "A1") stdata <- flipCoordinates(stdata) ``` We can now visualize spots and the adjacency between these spots simultaneously. ```{r eval = FALSE, class.source="watch-out"} stdata <- getSpatialNeighbors(stdata, method = "radius") vrSpatialPlot(stdata, graph.name = "radius", crop = T) ```![]() |
![]() |