a b/vignettes/netOmics.Rmd
1
---
2
title: "netOmics"
3
author: 
4
- name:  "Antoine Bodein"
5
  affiliation: "CHU de Québec Research Center, Université Laval, Molecular Medicine department, Québec, QC, Canada"
6
  email: "antoine.bodein.1@ulaval.ca"
7
- name:  "Marie-Pier Scott-Boyer"
8
  affiliation: "CHU de Québec Research Center, Université Laval, Molecular Medicine department, Québec, QC, Canada"
9
- name: "Olivier Perin"
10
  affiliation: "Digital Sciences Department, L’Oréal Advanced Research, Aulnay-sous-bois, France"
11
- name: "Kim-Anh Lê Cao"
12
  affiliation: "Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, Australia"
13
- name: "Arnaud Droit"
14
  affiliation: "CHU de Québec Research Center, Université Laval, Molecular Medicine department, Québec, QC, Canada"
15
package: netOmics  
16
output: 
17
  BiocStyle::html_document
18
  
19
vignette: >
20
  %\VignetteIndexEntry{netOmics}
21
  %\VignetteEngine{knitr::rmarkdown}
22
  %\VignetteEncoding{UTF-8}  
23
24
bibliography: ["mybib.bib"]
25
biblio-style: apalike
26
link-citations: true
27
---
28
29
```{r, echo=FALSE}
30
knitr::opts_chunk$set(fig.align = "center")
31
```
32
33
34
The emergence of multi-omics data enabled the development of 
35
integration methods.
36
37
With netOmics, we go beyond integration by introducing an interpretation tool.
38
netOmics is a package for the creation and exploration of multi-omics networks.
39
40
Depending on the provided dataset, it allows to create inference networks from 
41
expression data but also interaction networks from knowledge databases.
42
After merging the sub-networks to obtain a global multi-omics network, 
43
we propose network exploration methods using propoagation techniques to perform 
44
functional prediction or identification of molecular mechanisms.
45
46
Furthermore, the package has been developed for longitudinal multi-omics data 
47
and can be used in conjunction with our previously published package timeOmics.
48
49
![Overview](./img/netomics_overview.png)
50
51
For more informnation about the method, please check [@bodein2020interpretation]
52
53
In this vignette, we introduced a case study which depict the main steps to 
54
create and explore multi-omics networks from multi-omics time-course data.
55
56
# Requirements
57
58
```{r,eval=FALSE}
59
# install the package via BioConductor
60
if (!requireNamespace("BiocManager", quietly = TRUE))
61
    install.packages("BiocManager")
62
63
BiocManager::install("netOmics")
64
```
65
66
```{r,eval=FALSE}
67
# install the package via github
68
library(devtools)
69
install_github("abodein/netOmics")
70
```
71
72
73
```{r, eval=TRUE, message=FALSE}
74
# load the package
75
library(netOmics)
76
```
77
78
79
```{r, eval=TRUE, message=FALSE}
80
# usefull packages to build this vignette
81
library(timeOmics)
82
library(tidyverse)
83
library(igraph)
84
```
85
86
# Case Study: Human Microbiome Project T2D
87
88
The package will be illustrated on longitudinal MO dataset to study 
89
the seasonality of MO expression in patients with diabetes [@sailani2020deep].
90
91
The data used in this vignette is a subset of the data available at: 
92
https://github.com/aametwally/ipop_seasonal
93
94
We focused on a single individual with 7 timepoints.
95
6 different omics were sampled 
96
(RNA, proteins, cytokines, gut microbiome, metabolites and clinical variables).
97
98
```{r load_data}
99
# load data
100
data("hmp_T2D")
101
```
102
103
104
# (optional: *timeOmics* analysis)
105
106
The first step of the analysis is the preprocessing and longitudinal clustering.
107
This step is carried out with timeOmics and should be reserved for longitudinal 
108
data.
109
110
It ensures that the time profiles are classified into groups of similar profiles
111
so each MO molecule is labbeled with its cluster.
112
113
In addition, timeOmics can identify a multi-omics signature of the clusters.
114
These molecules can be, for example, the starting points of the propogation 
115
analysis.
116
117
For more informations about *timeOmics*, please 
118
check http://www.bioconductor.org/packages/release/bioc/html/timeOmics.html
119
120
As illustrated in the R chunk below the timeOmics step includes:
121
122
* omic-specific preprocessing and longitudinal fold-change filtering
123
* modelling of expression profiles
124
* clustering of MO expression profiles
125
* signature identification by cluster
126
127
```{r timeOmics_1, eval=FALSE}
128
# not evaluated in this vignette
129
130
#1 filter fold-change
131
remove.low.cv <- function(X, cutoff = 0.5){
132
    # var.coef
133
    cv <- unlist(lapply(as.data.frame(X), 
134
                    function(x) abs(sd(x, na.rm = TRUE)/mean(x, na.rm= TRUE))))
135
    return(X[,cv > cutoff])
136
}
137
fc.threshold <- list("RNA"= 1.5, "CLINICAL"=0.2, "GUT"=1.5, "METAB"=1.5,
138
                     "PROT" = 1.5, "CYTO" = 1)
139
140
# --> hmp_T2D$raw
141
data.filter <- imap(raw, ~{remove.low.cv(.x, cutoff = fc.threshold[[.y]])})
142
143
#2 scale
144
data <- lapply(data.filter, function(x) log(x+1))
145
# --> hmp_T2D$data
146
147
148
#3 modelling
149
lmms.func <- function(X){
150
    time <- rownames(X) %>% str_split("_") %>% 
151
      map_chr(~.x[[2]]) %>% as.numeric()
152
    lmms.output <- lmms::lmmSpline(data = X, time = time,
153
                                   sampleID = rownames(X), deri = FALSE,
154
                                   basis = "p-spline", numCores = 4, 
155
                                   keepModels = TRUE)
156
    return(lmms.output)
157
}
158
data.modelled <- lapply(data, function(x) lmms.func(x))
159
160
# 4 clustering
161
block.res <- block.pls(data.modelled, indY = 1, ncomp = 1)
162
getCluster.res <- getCluster(block.res)
163
# --> hmp_T2D$getCluster.res
164
165
166
# 5 signature
167
list.keepX <- list("CLINICAL" = 4, "CYTO" = 3, "GUT" = 10, "METAB" = 3, 
168
                   "PROT" = 2,"RNA" = 34)
169
sparse.block.res  <- block.spls(data.modelled, indY = 1, ncomp = 1, scale =TRUE, 
170
                                keepX =list.keepX)
171
getCluster.sparse.res <- getCluster(sparse.block.res)
172
# --> hmp_T2D$getCluster.sparse.res
173
```
174
175
timeOmics resulted in 2 clusters, labelled `1` and `-1`
176
177
```{r timeOmics_2}
178
# clustering results
179
cluster.info <- hmp_T2D$getCluster.res
180
```
181
182
# Network building
183
184
Each layer of the network is built sequentially and then assembled 
185
in a second section.
186
187
All the functions in the package can be used on one element or a list of 
188
elements.
189
In the longitudinal context of the data, kinetic cluster sub-networks are built 
190
plus a global network
191
(`1`, `-1` and `All`).
192
193
## Inference Network
194
195
Multi-omics network building starts with a first layer of gene.
196
Currently, the ARACNe algorithm handles the inference but we will include more 
197
algorithms in the future.
198
199
The function `get_grn` return a Gene Regulatory Network from gene expression 
200
data. 
201
Optionally, the user can provide a timeOmics clustering result (`?getCluster`) 
202
to get cluster specific sub-networks. In this case study, this will 
203
automatically build the networks (`1`, `-1` and `All`), as indicated previously.
204
205
The `get_graph_stats` function provides basic graph statistics such as the 
206
number of vertices and edges.
207
If the vertices have different attributes, it also includes a summary of these.
208
209
```{r graph.rna, warning=FALSE}
210
cluster.info.RNA <- timeOmics::getCluster(cluster.info, user.block = "RNA")
211
graph.rna <- get_grn(X = hmp_T2D$data$RNA, cluster = cluster.info.RNA)
212
213
# to get info about the network
214
get_graph_stats(graph.rna)
215
```
216
217
## Interaction from databases
218
219
As for the genes, the second layer is a protein layer (Protein-Protein 
220
Interaction).
221
This time, no inference is performed. Instead, known interactions are extracted
222
from a database of interaction (BIOGRID).
223
224
The function `get_interaction_from_database` will fetch the interactions from a 
225
database provided as a `data.frame` (with columns `from` and `to`) or a graph 
226
(`igraph` object).
227
In addition to the interactions between the indicated molecules, the first 
228
degree neighbors can also be collected (option `user.ego = TRUE`)
229
230
231
```{r PROT_graph, warning=FALSE}
232
# Utility function to get the molecules by cluster
233
get_list_mol_cluster <- function(cluster.info, user.block){
234
  require(timeOmics)
235
    tmp <- timeOmics::getCluster(cluster.info, user.block) 
236
    res <- tmp %>% split(.$cluster) %>% 
237
        lapply(function(x) x$molecule)
238
    res[["All"]] <- tmp$molecule
239
    return(res)
240
}
241
242
cluster.info.prot <- get_list_mol_cluster(cluster.info, user.block = 'PROT')
243
graph.prot <-  get_interaction_from_database(X = cluster.info.prot, 
244
                                             db = hmp_T2D$interaction.biogrid, 
245
                                             type = "PROT", user.ego = TRUE)
246
# get_graph_stats(graph.prot)
247
```
248
249
In this example, only a subset of the Biogrid database is used 
250
(matching elements).
251
252
## Other inference methods
253
254
Another way to compute networks from expression data is to use other inference
255
methods.
256
In the following chunk, we intend to illustrate the use of the SparCC algorithm
257
[@friedman2012inferring] on the gut data and how it can be integrate into the 
258
pipeline. 
259
(sparcc is not included in this package)
260
261
262
```{r GUT_graph, eval = FALSE}
263
# not evaluated in this vignette
264
library(SpiecEasi)
265
266
get_sparcc_graph <- function(X, threshold = 0.3){
267
    res.sparcc <- sparcc(data = X)
268
    sparcc.graph <- abs(res.sparcc$Cor) >= threshold
269
    colnames(sparcc.graph) <-  colnames(X)
270
    rownames(sparcc.graph) <-  colnames(X)
271
    res.graph <- graph_from_adjacency_matrix(sparcc.graph, 
272
                                             mode = "undirected") %>% simplify
273
    return(res.graph)
274
}
275
276
gut_list <- get_list_mol_cluster(cluster.info, user.block = 'GUT')
277
278
graph.gut <- list()
279
graph.gut[["All"]] <- get_sparcc_graph(hmp_T2D$raw$GUT, threshold = 0.3)
280
graph.gut[["1"]] <- get_sparcc_graph(hmp_T2D$raw$GUT %>% 
281
                                       dplyr::select(gut_list[["1"]]), 
282
                                     threshold = 0.3)
283
graph.gut[["-1"]] <- get_sparcc_graph(hmp_T2D$raw$GUT %>% 
284
                                        dplyr::select(gut_list[["-1"]]), 
285
                                      threshold = 0.3)
286
class(graph.gut) <- "list.igraph"
287
```
288
289
```{r GUT}
290
graph.gut <- hmp_T2D$graph.gut
291
# get_graph_stats(graph.gut)
292
```
293
294
## Other examples
295
296
For this case study, we complete this first step of network building with the 
297
missing layers.
298
299
```{r CYTO_graph, warning=FALSE}
300
# CYTO -> from database (biogrid)
301
cyto_list = get_list_mol_cluster(cluster.info = cluster.info, 
302
                                 user.block = "CYTO")
303
graph.cyto <-  get_interaction_from_database(X = cyto_list,
304
                                             db = hmp_T2D$interaction.biogrid, 
305
                                             type = "CYTO", user.ego = TRUE)
306
# get_graph_stats(graph.cyto)
307
308
# METAB -> inference
309
cluster.info.metab <-  timeOmics::getCluster(X = cluster.info, 
310
                                             user.block = "METAB")
311
graph.metab <-  get_grn(X = hmp_T2D$data$METAB, 
312
                        cluster = cluster.info.metab)
313
# get_graph_stats(graph.metab)
314
315
# CLINICAL -> inference
316
cluster.info.clinical <- timeOmics::getCluster(X = cluster.info, 
317
                                               user.block = 'CLINICAL')
318
graph.clinical <- get_grn(X = hmp_T2D$data$CLINICAL,
319
                          cluster = cluster.info.clinical)
320
# get_graph_stats(graph.clinical)
321
```
322
323
# Layer merging
324
325
We included 2 types of layer merging: 
326
327
* *merging with interactions* uses the shared elements between 2 graphs to build
328
a larger network.
329
* *merging with correlations* uses the spearman correlation from expression 
330
profiles between 2 layers when any interaction is known.
331
332
333
## Merging with interactions
334
335
The function `combine_layers` enables the fusion of different network layers.
336
It combines the network (or list of network) in `graph1` with the network
337
(or list of network) in `graph2`, based on the shared vertices between
338
the networks.
339
340
Additionally, the user can provide an interaction table `interaction.df`
341
(in the form of a data.frame or igraph object).
342
343
In the following chunk, we sequentially merge RNA, PROT and CYTO layers and uses
344
the TFome information (TF protein -> Target Gene) to connect these layers.
345
346
```{r, merged_0}
347
full.graph <- combine_layers(graph1 = graph.rna, graph2 = graph.prot)
348
full.graph <- combine_layers(graph1 = full.graph, graph2 = graph.cyto)
349
350
full.graph <- combine_layers(graph1 = full.graph,
351
                             graph2 = hmp_T2D$interaction.TF)
352
# get_graph_stats(full.graph)
353
```
354
355
## Merging with correlations
356
357
To connect omics layers for which no interaction information is available,
358
we propose to use a threshold on the correlation between the expression profiles
359
of two or more omics data.
360
361
The strategy is as follows: we isolate the omics from the data and calculate
362
the correlations between this omics and the other data.
363
364
```{r merged_1_gut, warning=FALSE}
365
all_data <- reduce(hmp_T2D$data, cbind)
366
367
# omic = gut
368
gut_list <- get_list_mol_cluster(cluster.info, user.block = "GUT")
369
omic_data <- lapply(gut_list, function(x)dplyr::select(hmp_T2D$data$GUT, x))
370
371
# other data = "RNA", "PROT", "CYTO"
372
other_data_list <- get_list_mol_cluster(cluster.info,
373
                                        user.block = c("RNA", "PROT", "CYTO"))
374
other_data <- lapply(other_data_list, function(x)dplyr::select(all_data, x))
375
376
# get interaction between gut data and other data
377
interaction_df_gut <- get_interaction_from_correlation(X = omic_data,
378
                                                       Y = other_data,
379
                                                       threshold = 0.99)
380
381
# and merge with full graph
382
full.graph <- combine_layers(graph1 = full.graph,
383
                             graph2 = hmp_T2D$graph.gut,
384
                             interaction.df = interaction_df_gut$All)
385
```
386
387
388
```{r, merged_2_clinical, warning=FALSE}
389
# omic =  Clinical
390
clinical_list <- get_list_mol_cluster(cluster.info, user.block = "CLINICAL")
391
omic_data <- lapply(clinical_list, 
392
                    function(x)dplyr::select(hmp_T2D$data$CLINICAL, x))
393
394
# other data = "RNA", "PROT", "CYTO", "GUT"
395
other_data_list <- get_list_mol_cluster(cluster.info,
396
                                        user.block = c("RNA", "PROT", 
397
                                                       "CYTO", "GUT"))
398
other_data <- lapply(other_data_list, function(x)dplyr::select(all_data, x))
399
400
401
# get interaction between gut data and other data
402
interaction_df_clinical <- get_interaction_from_correlation(X = omic_data
403
                                                            , Y = other_data,
404
                                                            threshold = 0.99)
405
406
# and merge with full graph
407
full.graph <- combine_layers(graph1 = full.graph,
408
                             graph2 = hmp_T2D$graph.clinical, 
409
                             interaction.df = interaction_df_clinical$All)
410
```
411
412
413
```{r, merged_3_metab, warning=FALSE}
414
# omic =  Metab
415
metab_list <- get_list_mol_cluster(cluster.info, user.block = "METAB")
416
omic_data <- lapply(metab_list, function(x)dplyr::select(hmp_T2D$data$METAB, x))
417
418
# other data = "RNA", "PROT", "CYTO", "GUT", "CLINICAL"
419
other_data_list <- get_list_mol_cluster(cluster.info,
420
                                        user.block = c("RNA", "PROT", "CYTO", 
421
                                                       "GUT", "CLINICAL"))
422
other_data <- lapply(other_data_list, function(x)dplyr::select(all_data, x))
423
424
# get interaction between gut data and other data
425
interaction_df_metab <- get_interaction_from_correlation(X = omic_data,
426
                                                         Y = other_data, 
427
                                                         threshold = 0.99)
428
429
# and merge with full graph
430
full.graph <- combine_layers(graph1 = full.graph, 
431
                             graph2 = graph.metab, 
432
                             interaction.df = interaction_df_metab$All)
433
```
434
435
# Addition of supplemental layers
436
437
For the interpretation of the MO integration results, the use of additional 
438
information layers or molecules can be useful to enrich the network.
439
440
## Over Representation Analysis
441
442
ORA is a common step to include knowledge.
443
The function `get_interaction_from_ORA` perform the ORA analysis from the 
444
desired molecules and return an interaction graph with the enriched terms and 
445
the corresponding molecules.
446
447
Then, the interaction graph with the new vertices can be linked to the network 
448
as illustrated in the previous step.
449
450
Here, ORA was performed with RNA, PROT, and CYTO against the Gene Ontology.
451
452
```{r}
453
# ORA by cluster/All
454
mol_ora <- get_list_mol_cluster(cluster.info, 
455
                                user.block = c("RNA", "PROT", "CYTO"))
456
457
# get ORA interaction graph by cluster
458
graph.go <- get_interaction_from_ORA(query = mol_ora,
459
                                     sources = "GO",
460
                                     organism = "hsapiens",
461
                                     signif.value = TRUE)
462
463
# merge
464
full.graph <- combine_layers(graph1 = full.graph, graph2 = graph.go)
465
```
466
467
## External knowledge
468
469
Additionally, knowledge from external sources can be included in the network.
470
471
In the following chunk, we performed disease-related gene enrichment analysis
472
from *medlineRanker* (http://cbdm-01.zdv.uni-mainz.de/~jfontain/cms/?page_id=4).
473
We converted the results into a data.frame (with the columns `from` and `to`)
474
and this acted as an interaction database.
475
476
```{r}
477
# medlineRanker -> database
478
medlineranker.res.df <- hmp_T2D$medlineranker.res.df %>% 
479
  dplyr::select(Disease, symbol) %>% 
480
  set_names(c("from", "to"))
481
  
482
mol_list <-  get_list_mol_cluster(cluster.info = cluster.info,
483
                                  user.block = c("RNA", "PROT", "CYTO"))
484
graph.medlineranker <-  get_interaction_from_database(X = mol_list,
485
                                                      db = medlineranker.res.df, 
486
                                                      type = "Disease",
487
                                                      user.ego = TRUE)
488
# get_graph_stats(graph.medlineranker)
489
490
# merging
491
full.graph <- combine_layers(graph1 = full.graph, graph2 = graph.medlineranker)
492
```
493
494
We complete the MO network preparation with attribute cleaning and addition of 
495
several attributes such as:
496
497
* mode = "core" if the vertex was originally present in the data; "extended"
498
otherwise
499
* sparse = TRUE if the vertex was present in kinetic cluster signature; FALSE
500
otherwise
501
* type = type of omics ("RNA","PROT","CLINICAL","CYTO","GUT","METAB","GO",
502
"Disease")
503
* cluster = '1', '-1' or 'NA' (for vertices not originally present in the
504
original data)
505
506
```{r}
507
# graph cleaning
508
graph_cleaning <- function(X, cluster.info){
509
    # no reusability
510
    X <- igraph::simplify(X)
511
    va <- vertex_attr(X)
512
    viewed_mol <- c()
513
    for(omic in unique(cluster.info$block)){
514
        mol <- intersect(cluster.info %>% dplyr::filter(.$block == omic) %>%
515
                           pull(molecule), V(X)$name)
516
        viewed_mol <- c(viewed_mol, mol)
517
        X <- set_vertex_attr(graph = X, 
518
                             name = "type", 
519
                             index = mol, 
520
                             value = omic)
521
        X <- set_vertex_attr(graph = X, 
522
                             name = "mode",
523
                             index = mol,
524
                             value = "core")
525
    }
526
    # add medline ranker and go
527
    mol <- intersect(map(graph.go, ~ as_data_frame(.x)$to) %>%
528
                       unlist %>% unique(), V(X)$name) # only GO terms
529
    viewed_mol <- c(viewed_mol, mol)
530
    X <- set_vertex_attr(graph = X, name = "type", index = mol, value = "GO")
531
    X <- set_vertex_attr(graph = X, name = "mode", 
532
                         index = mol, value = "extended")
533
    
534
    mol <- intersect(as.character(medlineranker.res.df$from), V(X)$name)
535
    viewed_mol <- c(viewed_mol, mol)
536
    X <- set_vertex_attr(graph = X, name = "type",
537
                         index = mol, value = "Disease")
538
    X <- set_vertex_attr(graph = X, name = "mode",
539
                         index = mol, value = "extended")
540
    
541
    other_mol <- setdiff(V(X), viewed_mol)
542
    if(!is_empty(other_mol)){
543
        X <- set_vertex_attr(graph = X, name = "mode",
544
                             index = other_mol, value = "extended")
545
    }
546
    X <- set_vertex_attr(graph = X, name = "mode", 
547
                         index = intersect(cluster.info$molecule, V(X)$name), 
548
                         value = "core")
549
    
550
    # signature
551
    mol <-  intersect(V(X)$name, hmp_T2D$getCluster.sparse.res$molecule)
552
    X <- set_vertex_attr(graph = X, name = "sparse", index = mol, value = TRUE)
553
    mol <-  setdiff(V(X)$name, hmp_T2D$getCluster.sparse.res$molecule)
554
    X <- set_vertex_attr(graph = X, name = "sparse", index = mol, value = FALSE)
555
    
556
    return(X)
557
}
558
```
559
560
561
```{r}
562
FULL <- lapply(full.graph, function(x) graph_cleaning(x, cluster.info))
563
get_graph_stats(FULL)
564
```
565
566
# Network exploration
567
568
## Basics network exploration
569
570
We can use basic graph statistics to explore the network such as degree 
571
distribution, modularity, and short path.
572
573
```{r, eval = FALSE}
574
# degree analysis
575
d <- degree(FULL$All)
576
hist(d)
577
d[max(d)]
578
579
# modularity # Warnings: can take several minutes
580
res.mod <- walktrap.community(FULL$All)
581
# ...
582
583
# modularity
584
sp <- shortest.paths(FULL$All)
585
```
586
587
## Random walk with restart
588
589
RWR is a powerful tool to explore the MO networks which simulates a particle 
590
that randomly walk on the network.
591
From a starting point (`seed`) it ranks the other vertices based on their
592
proximity with the seed and the network structure.
593
594
We use RWR for function prediction and molecular mechanism identification.
595
596
In the example below, the seeds were the GO terms vertices.
597
598
```{r}
599
# seeds = all vertices -> takes 5 minutes to run on regular computer
600
# seeds <- V(FULL$All)$name
601
# rwr_res <- random_walk_restart(FULL, seeds)
602
603
# seed = some GO terms
604
seeds <- head(V(FULL$All)$name[V(FULL$All)$type == "GO"])
605
rwr_res <- random_walk_restart(FULL, seeds)
606
```
607
608
### Find vertices with specific attributes
609
610
After the RWR analysis, we implemented several functions to extract valuable
611
information.
612
613
To identify MO molecular functions, the seed can be a GO term and we are 
614
interested to identify vertices with different omics type within the 
615
closest nodes.
616
617
The function `rwr_find_seeds_between_attributes` can identify which seeds were 
618
able to reach vertices with different attributes (ex: `type`) within the 
619
closest `k` (ex: `15`) vertices.
620
621
The function `summary_plot_rwr_attributes` displays the number of different 
622
values for a seed attribute as a bar graph.
623
624
```{r}
625
rwr_type_k15 <- rwr_find_seeds_between_attributes(X = rwr_res, 
626
                                                  attribute = "type", k = 15)
627
628
# a summary plot function
629
summary_plot_rwr_attributes(rwr_type_k15)
630
summary_plot_rwr_attributes(rwr_type_k15$All)
631
```
632
633
Alternatively, we can be interested to find functions or molecules which 
634
link different kinetic cluster (to find regulatory mechanisms).
635
636
```{r}
637
rwr_type_k15 <- rwr_find_seeds_between_attributes(X = rwr_res$All, 
638
                                                  attribute = "cluster", k = 15)
639
summary_plot_rwr_attributes(rwr_type_k15)
640
```
641
642
A RWR subnetworks can also be displayed with `plot_rwr_subnetwork` 
643
from a specific seed.
644
```{r}
645
sub_res <- rwr_type_k15$`GO:0005737`
646
sub <- plot_rwr_subnetwork(sub_res, legend = TRUE, plot = TRUE)
647
```
648
649
### Function prediction
650
651
Finally, RWR can also be used for function prediction.
652
From an annotated genes, the predicted function can be the closest vertex of the
653
type "GO".
654
655
We generalized this principle to identify, from a seed of interest, the closest
656
node (or `top` closest nodes) with specific attributes and value.
657
658
In the example below, the gene "ZNF263" is linked to the 5 closest nodes of 
659
type = 'GO' and type = 'Disease'.
660
661
```{r}
662
rwr_res <- random_walk_restart(FULL$All, seed = "ZNF263")
663
664
# closest GO term
665
rwr_find_closest_type(rwr_res, seed = "ZNF263", attribute = "type", 
666
                      value = "GO", top = 5)
667
668
# closest Disease
669
rwr_find_closest_type(rwr_res, seed = "ZNF263", attribute = "type", 
670
                      value = "Disease", top = 5)
671
672
# closest nodes with an attribute "cluster" and the value "-1"
673
rwr_find_closest_type(rwr_res, seed = "ZNF263", attribute = "cluster",
674
                      value = "-1", top = 5)
675
```
676
677
678
```{r, eval = FALSE}
679
seeds <- V(FULL$All)$name[V(FULL$All)$type %in% c("GO", "Disease")]
680
```
681
682
```{r}
683
sessionInfo()
684
```
685
686
# References