Diff of /docs/nicheclustering.Rmd [000000] .. [413088]

Switch to unified view

a b/docs/nicheclustering.Rmd
1
---
2
title: "Niche Clustering"
3
output: 
4
  html_document:
5
    toc: true
6
    toc_float:
7
      collapsed: false
8
      smooth_scroll: false
9
---
10
11
<style>
12
.title{
13
  display: none;
14
}
15
body {
16
  text-align: justify
17
}
18
.center {
19
  display: block;
20
  margin-left: auto;
21
  margin-right: auto;
22
}
23
</style>
24
25
```{css, echo=FALSE}
26
.watch-out {
27
  color: black;
28
}
29
```
30
31
```{r setup, include=FALSE}
32
# use rmarkdown::render_site(envir = knitr::knit_global())
33
knitr::opts_chunk$set(highlight = TRUE, echo = TRUE)
34
```
35
36
<br>
37
38
# Spot-based Niche Clustering
39
40
Spot-based spatial transcriptomic assays capture spatially-resolved gene expression profiles that are somewhat closer to single cell resolution. However, each spot still includes a few number of cells that are likely originated from few number of cell types, hence transcriptomic profile of each spot would likely include markers from multiple cell types. Here, **RNA deconvolution** can be incorporated to estimate the percentage/abundance of cell types for each spot. We use a scRNAseq dataset as a reference to computationally estimate the relative abundance of cell types across across the spots. 
41
42
VoltRon includes wrapper commands for using popular spot-level RNA deconvolution methods such as [RCTD](https://www.nature.com/articles/s41587-021-00830-w) and return estimated abundances as additional feature sets within each layer. These estimated percentages of cell types for each spot could be incorporated to detect **niches** (i.e. small local microenvironments of cells) within the tissue. We can process cell type abundance assays and used them for clustering to detect these niches. 
43
44
<br>
45
46
## Import Visium Data
47
48
For this tutorial we will analyze spot-based transcriptomic assays from Mouse Brain generated by the [Visium](https://www.10xgenomics.com/products/spatial-gene-expression) instrument. 
49
50
51
You can find and download readouts of all four Visium sections [here](https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Cellanalysis/Visium/MouseBrainSerialSections.zip). The **Mouse Brain Serial Section 1/2** datasets can be also downloaded from [here](https://www.10xgenomics.com/resources/datasets?menu%5Bproducts.name%5D=Spatial%20Gene%20Expression&query=&page=1&configure%5BhitsPerPage%5D=50&configure%5BmaxValuesPerFacet%5D=1000) (specifically, please filter for **Species=Mouse**, **AnatomicalEntity=brain**, **Chemistry=v1** and **PipelineVersion=v1.1.0**).
52
53
We will now import each of four samples separately and merge them into one VoltRon object. There are four brain tissue sections in total given two serial anterior and serial posterior sections, hence we have **two tissue blocks each having two layers**. 
54
55
```{r eval = FALSE, class.source="watch-out"}
56
library(VoltRon)
57
Ant_Sec1 <- importVisium("Sagittal_Anterior/Section1/", sample_name = "Anterior1")
58
Ant_Sec2 <- importVisium("Sagittal_Anterior/Section2/", sample_name = "Anterior2")
59
Pos_Sec1 <- importVisium("Sagittal_Posterior/Section1/", sample_name = "Posterior1")
60
Pos_Sec2 <- importVisium("Sagittal_Posterior/Section2/", sample_name = "Posterior2")
61
62
# merge datasets
63
MBrain_Sec_list <- list(Ant_Sec1, Ant_Sec2, Pos_Sec1, Pos_Sec2)
64
MBrain_Sec <- merge(MBrain_Sec_list[[1]], MBrain_Sec_list[-1], 
65
                    samples = c("Anterior", "Anterior", "Posterior", "Posterior"))
66
MBrain_Sec
67
```
68
69
```
70
VoltRon Object 
71
Anterior: 
72
  Layers: Section1 Section2 
73
Posterior: 
74
  Layers: Section1 Section2 
75
Assays: Visium(Main) 
76
Features: RNA(Main) 
77
```
78
79
VoltRon maps metadata features on the spatial images, multiple features can be provided for all assays/layers associated with the main assay (Visium). 
80
81
```{r eval = FALSE, class.source="watch-out"}
82
vrSpatialFeaturePlot(MBrain_Sec, features = "Count", crop = TRUE, alpha = 1, ncol = 2)
83
```
84
85
<img width="80%" height="80%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_first_plot.png" class="center">
86
87
<br>
88
89
## Import scRNA data
90
91
We will now import the scRNA data for reference which can be downloaded from [here](https://www.dropbox.com/s/cuowvm4vrf65pvq/allen_cortex.rds?dl=1). Specifically, we will use a scRNA data of Mouse cortical adult brain with 14,000 cells, generated with the SMART-Seq2 protocol, from the Allen Institute. This scRNA data is also used by the Spatial Data Analysis tutorial in [Seurat](https://satijalab.org/seurat/articles/spatial_vignette.html#integration-with-single-cell-data) website.  
92
93
```{r eval = FALSE, class.source="watch-out"}
94
# install packages if necessary
95
if(!requireNamespace("Seurat"))
96
  install.packages("Seurat")
97
if(!requireNamespace("dplyr"))
98
  install.packages("dplyr")
99
100
# import scRNA data
101
library(Seurat)
102
allen_reference <- readRDS("allen_cortex.rds")
103
104
# process and reduce dimensionality
105
library(dplyr)
106
allen_reference <- SCTransform(allen_reference, ncells = 3000, verbose = FALSE) %>%
107
  RunPCA(verbose = FALSE) %>%
108
  RunUMAP(dims = 1:30)
109
```
110
111
Before deconvoluting Visium spots, we correct cell types labels and drop some cell types with extremely few number of cells (e.g. "CR").  
112
113
```{r eval = FALSE, class.source="watch-out"}
114
# update labels and subset
115
allen_reference$subclass <- gsub("L2/3 IT", "L23 IT", allen_reference$subclass)
116
allen_reference <- allen_reference[,colnames(allen_reference)[!allen_reference@meta.data$subclass %in% "CR"]]
117
118
# visualize
119
Idents(allen_reference) <- "subclass"
120
gsubclass <- DimPlot(allen_reference, reduction = "umap", label = T) + NoLegend()
121
Idents(allen_reference) <- "class"
122
gclass <- DimPlot(allen_reference, reduction = "umap", label = T) + NoLegend()
123
gsubclass | gclass
124
```
125
126
<img width="95%" height="95%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_singlecell.png" class="center">
127
128
<br>
129
130
## Spot Deconvolution with RCTD
131
132
In order to integrate the scRNA data and the spatial data sets within the VoltRon object and estimate relative cell type abundances for each Visium spot, we will use **RCTD** algorithm which is accessible with the [spacexr](https://github.com/dmcable/spacexr) package. 
133
134
```{r eval = FALSE, class.source="watch-out"}
135
if(!requireNamespace("spacexr"))
136
  devtools::install_github("dmcable/spacexr", build_vignettes = FALSE)
137
```
138
139
After running **getDeconvolution**, an additional feature set within the same Visium assay with name **Decon** will be created.
140
141
```{r eval = FALSE, class.source="watch-out"}
142
library(spacexr)
143
MBrain_Sec <- getDeconvolution(MBrain_Sec, sc.object = allen_reference, sc.cluster = "subclass", max_cores = 6)
144
MBrain_Sec
145
```
146
147
```
148
VoltRon Object 
149
Anterior: 
150
  Layers: Section1 Section2 
151
Posterior: 
152
  Layers: Section1 Section2 
153
Assays: Visium(Main) 
154
Features: RNA(Main) Decon 
155
```
156
157
We can now switch to the **Decon** feature type where features are cell types from the scRNA reference and the data values are cell types percentages in each spot. 
158
159
```{r eval = FALSE, class.source="watch-out"}
160
vrMainFeatureType(MBrain_Sec) <- "Decon"
161
vrFeatures(MBrain_Sec)
162
```
163
164
```
165
 [1] "Astro"      "Endo"       "L23 IT"     "L4"         "L5 IT"      "L5 PT"     
166
 [7] "L6 CT"      "L6 IT"      "L6b"        "Lamp5"      "Macrophage" "Meis2"     
167
[13] "NP"         "Oligo"      "Peri"       "Pvalb"      "Serpinf1"   "SMC"       
168
[19] "Sncg"       "Sst"        "Vip"        "VLMC"          
169
```
170
171
These features (i.e. cell type abundances) can be visualized like any other feature type. 
172
173
```{r eval = FALSE, class.source="watch-out"}
174
vrSpatialFeaturePlot(MBrain_Sec, features = c("L4", "L5 PT", "Oligo", "Vip"), 
175
                     crop = TRUE, ncol = 2, alpha = 1, keep.scale = "all")
176
```
177
178
<img width="90%" height="90%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_spatialfeature_plot.png" class="center">
179
180
<br>
181
182
## Clustering 
183
184
Relative cell type abundances that are learned by RCTD and stored within VoltRon can now be used to cluster spots. These groups or clusters of spots can often be referred to as **niches**. Here, as a definition, a niche is a region or a collection of regions within tissue that have a distinct cell type composition as opposed to the remaining parts of the tissue.   
185
186
The cell type abundances (which adds up to one for each spot) can be normalized and processed like transcriptomic and proteomic profiles prior to clustering (i.e. niche clustering). We treat cell type abundances as [compositional data](https://en.wikipedia.org/wiki/Compositional_data), hence we incorporate **centred log ratio (CLR)** transformation for normalizing them. 
187
188
```{r eval = FALSE, class.source="watch-out"}
189
vrMainFeatureType(MBrain_Sec) <- "Decon"
190
MBrain_Sec <- normalizeData(MBrain_Sec, method = "CLR")
191
```
192
193
The CLR normalized assay have only 25 features, each representing a cell type from the single cell reference data. Hence, we can **directly calculate UMAP reductions from this feature abundances** since we dont have much number of features which necessitates dimensionality reduction such as PCA. 
194
195
However, we may still need to reduce the dimensionality of this space with 25 features using UMAP for visualizing purposes. VoltRon is also capable of calculating the UMAP reduction from normalized data slots. Hence, we build a UMAP reduction from CLR data directly. However, UMAP will always be calculated from a PCA reduction by default (if a PCA embedding is found in the object). 
196
197
```{r eval = FALSE, class.source="watch-out"}
198
MBrain_Sec <- getUMAP(MBrain_Sec, data.type = "norm")
199
vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "Sample")
200
```
201
202
<img width="60%" height="60%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_embedding_sample.png" class="center">
203
204
<br>
205
206
Using normalized cell type abundances, we can now generate k-nearest neighbor graphs and cluster the graph using leiden method.  
207
208
```{r eval = FALSE, class.source="watch-out"}
209
MBrain_Sec <- getProfileNeighbors(MBrain_Sec, data.type = "norm", method = "SNN")
210
vrGraphNames(MBrain_Sec)
211
```
212
213
```
214
[1] "SNN"
215
```
216
217
```{r eval = FALSE, class.source="watch-out"}
218
MBrain_Sec <- getClusters(MBrain_Sec, resolution = 0.6, graph = "SNN")
219
```
220
221
<br>
222
223
## Visualization
224
225
VoltRon incorporates distinct plotting functions for, e.g. embeddings, coordinates, heatmap and even barplots. We can now map the clusters we have generated on UMAP embeddings. 
226
227
```{r eval = FALSE, class.source="watch-out"}
228
# visualize 
229
g1 <- vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "Sample")
230
g2 <- vrEmbeddingPlot(MBrain_Sec, embedding = "umap", group.by = "niche_clusters", label = TRUE)
231
g1 | g2
232
```
233
234
<img width="100%" height="100%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_embedding_clusters.png" class="center">
235
236
<br>
237
238
Mapping clusters on the spatial images and spots would show the niche structure across all four tissue sections.
239
240
```{r eval = FALSE, class.source="watch-out"}
241
vrSpatialPlot(MBrain_Sec, group.by = "niche_clusters", crop = TRUE, alpha = 1)
242
```
243
244
<img width="80%" height="80%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_spatial_clusters.png" class="center">
245
246
<br>
247
248
We use **vrHeatmapPlot** to investigate relative cell type abundances across these niche clusters. You will need to have **ComplexHeatmap** package in your namespace.
249
250
```{r eval = FALSE, class.source="watch-out"}
251
# install packages if necessary
252
if(!requireNamespace("ComplexHeatmap"))
253
  BiocManager::install("ComplexHeatmap")
254
255
# heatmap of niches
256
library(ComplexHeatmap)
257
vrHeatmapPlot(MBrain_Sec, features = vrFeatures(MBrain_Sec), group.by = "niche_clusters", 
258
              show_row_names = T, show_heatmap_legend = T)
259
```
260
261
<img width="90%" height="90%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_heatmap_clusters.png" class="center">
262
<br>
263
264
# Cell-based Niche Clustering
265
266
Similar to spot-based spatial omics assays, we can build and cluster niche associated to each cell for spatial transcriptomics datasets in single cell resolution. For this, we require building niche assays for the collections of cells where a niche of cell is defined as a region of sets of regions with distinct cell type population that each of these cells belong to. 
267
268
Here, we dont require any scRNA reference dataset but we may first need to cluster and annotate cells in the RNA/transcriptome level profiles, and determine cell types. Then, we first detect the mixture of cell types within a spatial neighborhood around all cells and use that as a profile to perform clustering where these clusters will be associated with niches. 
269
270
## Import Xenium Data
271
272
For this, the data has to be already clustered (and annotated if possible). We will use the cluster labels generated at the end of the Xenium analysis workflow from [Cell/Spot Analysis](spotanalysis.html). You can download the VoltRon object with clustered and annotated Xenium cells along with the Visium assay from [here](https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/SpatialDataAlignment/Xenium_vs_Visium/VRBlock_data_clustered.rds). 
273
274
```{r eval = FALSE, class.source="watch-out"}
275
Xen_data <- readRDS("VRBlock_data_clustered.rds")
276
```
277
278
We will use all these 18 cell types used for annotating Xenium cells for detecting niches with distinct cellular type mixtures.
279
280
```{r eval = FALSE, class.source="watch-out"}
281
vrMainSpatial(Xen_data, assay = "Assay1") <- "main"
282
vrMainSpatial(Xen_data, assay = "Assay3") <- "main"
283
vrSpatialPlot(Xen_data, group.by = "CellType", pt.size = 0.13, background.color = "black", 
284
              legend.loc = "top", n.tile = 500)
285
```
286
287
<img width="100%" height="100%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_xenium_clusters.png" class="center">
288
289
<br>
290
291
## Creating Niche Assay
292
293
For calculating niche profiles for each cell, we have to first build spatial neighborhoods around cells and capture the local cell type mixtures. Using **getSpatialNeighbors**, we build a spatial neighborhood graph to connect all cells to other cells within at most 15 distance apart. 
294
295
```{r eval = FALSE, class.source="watch-out"}
296
Xen_data <- getSpatialNeighbors(Xen_data, radius = 15, method = "radius")
297
vrGraphNames(Xen_data)
298
```
299
300
```
301
[1] "radius"
302
```
303
304
Now, we can build a niche assay for cells using the **getNicheAssay** function which will create an additional feature set for cells called **Niche**. Here, each cell type is a feature and the profile of a cell represents the relative abundance of cell types around each cell.
305
306
```{r eval = FALSE, class.source="watch-out"}
307
Xen_data <- getNicheAssay(Xen_data, label = "CellType", graph.type = "radius")
308
Xen_data
309
```
310
311
```
312
VoltRon Object 
313
10XBlock: 
314
  Layers: Section1 Section2 Section3 
315
Assays: Xenium(Main) Visium 
316
Features: RNA(Main) Niche
317
```
318
319
<br>
320
321
## Clustering
322
323
The Niche assay can be normalized similar to the spot-level niche analysis using **centred log ratio (CLR)** transformation. 
324
325
```{r eval = FALSE, class.source="watch-out"}
326
vrMainFeatureType(Xen_data) <- "Niche"
327
Xen_data <- normalizeData(Xen_data, method = "CLR")
328
```
329
330
Default clustering functions could be used to analyze the normalized niche profiles of cells to detect niches associated with each cell. However, we use K-means algorithm to perform the niche clustering. For this exercise, we pick an estimate of 7 clusters which will be the number of niche clusters we get.  
331
332
```{r eval = FALSE, class.source="watch-out"}
333
Xen_data <- getClusters(Xen_data, nclus = 7, method = "kmeans", label = "niche_clusters")
334
```
335
336
After the niche clustering, the metadata is updated and observed later like below.
337
338
```{r eval = FALSE, class.source="watch-out"}
339
head(Metadata(Xen_data))
340
```
341
342
<div><pre><code style="font-size: 10px;">               id Count assay_id  Assay    Layer   Sample CellType niche_clusters
343
1_Assay1 1_Assay1    28   Assay1 Xenium Section1 10XBlock   DCIS_1              2
344
2_Assay1 2_Assay1    94   Assay1 Xenium Section1 10XBlock   DCIS_2              2
345
3_Assay1 3_Assay1     9   Assay1 Xenium Section1 10XBlock   DCIS_1              2
346
4_Assay1 4_Assay1    11   Assay1 Xenium Section1 10XBlock   DCIS_1              2
347
5_Assay1 5_Assay1    48   Assay1 Xenium Section1 10XBlock   DCIS_2              2
348
6_Assay1 6_Assay1     7   Assay1 Xenium Section1 10XBlock   DCIS_1              2</code></pre></div>
349
350
<br>
351
352
## Visualization
353
354
After niche clustering, each cell in the Xenium assay will be assigned a niche which is initially a number which indicates the ID of each particular niche. It is up to the user to annotate, filter and visualize these niches moving forward. 
355
356
```{r eval = FALSE, class.source="watch-out"}
357
vrSpatialPlot(Xen_data, group.by = "niche_clusters", alpha = 1, legend.loc = "top")
358
```
359
360
We use **vrHeatmapPlot** to investigate the abundance of each cell type across the niche clusters. You will need to have **ComplexHeatmap** package in your namespace. We see that niche cluster 1 include all invasive tumor subtypes (IT 1-3). We see this for two subtypes of in situ ductal carcinoma (DCIS 1,2) subtypes as well other than a third DCIS subcluster being within proximity to myoepithelial cells. Niche cluster 6 also shows regions within the breast cancer tissue where T cells and B cells are found together abundantly. 
361
362
<img width="100%" height="100%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_xenium_nicheclusters.png" class="center">
363
<br>
364
365
```{r eval = FALSE, class.source="watch-out"}
366
# install packages if necessary
367
if(!requireNamespace("ComplexHeatmap"))
368
  BiocManager::install("ComplexHeatmap")
369
370
# heatmap of niches
371
library(ComplexHeatmap)
372
vrHeatmapPlot(Xen_data, features = vrFeatures(Xen_data), group.by = "niche_clusters")
373
```
374
375
<img width="100%" height="100%" src="https://bimsbstatic.mdc-berlin.de/landthaler/VoltRon/Package/images/decon_xenium_heatmapclusters.png" class="center">
376
<br>