<!-- README.md is generated from README.Rmd. Please edit that file -->

# UCSCXenaTools <img src='man/figures/logo.png' align="right" height="200" alt="logo"/>

<!-- badges: start -->

[![CRAN
status](https://www.r-pkg.org/badges/version/UCSCXenaTools)](https://cran.r-project.org/package=UCSCXenaTools)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R-CMD-check](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml/badge.svg)](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml)
[![](https://cranlogs.r-pkg.org/badges/grand-total/UCSCXenaTools?color=orange)](https://cran.r-project.org/package=UCSCXenaTools)
[![rOpenSci](https://badges.ropensci.org/315_status.svg)](https://github.com/ropensci/software-review/issues/315)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01627/status.svg)](https://doi.org/10.21105/joss.01627)

<!-- badges: end -->

**UCSCXenaTools** is an R package for accessing genomics data from the
UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
Public omics data from UCSC Xena are supported through [**multiple
turn-key Xena Hubs**](https://xenabrowser.net/datapages/), which are a
collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET,
GTEx, CCLE, and others. Databases are normalized so they can be
combined, linked, filtered, explored and downloaded.

**Who is the target audience and what are the scientific applications
of this package?**

- Target Audience: cancer and clinical researchers, bioinformaticians
- Applications: genomic and clinical analyses

## Table of Contents

- [Installation](#installation)
- [Data Hub List](#data-hub-list)
- [Basic usage](#basic-usage)
- [More to read](#more-to-read)
- [Citation](#citation)
- [How to contribute](#how-to-contribute)
- [Acknowledgment](#acknowledgment)

## Installation

Install the stable release from r-universe or CRAN with:

``` r
install.packages('UCSCXenaTools', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
#install.packages("UCSCXenaTools")
```

You can also install the development version of **UCSCXenaTools** from
GitHub with:

``` r
# install.packages("remotes")
remotes::install_github("ropensci/UCSCXenaTools")
```

To build the vignettes locally, add two options:

``` r
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
```
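
After installing, it can help to confirm that R can find the package before moving on. The snippet below is a minimal sketch using only base R (`requireNamespace()` and `utils::packageVersion()`); the variable names are illustrative.

``` r
# Check whether UCSCXenaTools is available in the current library paths
# without attaching it.
pkg <- "UCSCXenaTools"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " ", as.character(utils::packageVersion(pkg)), " is installed.")
} else {
  message(pkg, " is not installed; run one of the install commands above.")
}
```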

## Data Hub List

All datasets are available at <https://xenabrowser.net/datapages/>.

Currently, **UCSCXenaTools** supports the following data hubs of UCSC
Xena.

- UCSC Public Hub: <https://ucscpublic.xenahubs.net/>
- TCGA Hub: <https://tcga.xenahubs.net/>
- GDC Xena Hub (new): <https://gdc.xenahubs.net/>
- GDC v18.0 Xena Hub (old): <https://gdcV18.xenahubs.net/>
- ICGC Xena Hub: <https://icgc.xenahubs.net/>
- Pan-Cancer Atlas Hub: <https://pancanatlas.xenahubs.net/>
- UCSC Toil RNAseq Recompute Compendium Hub:
  <https://toil.xenahubs.net/>
- PCAWG Xena Hub: <https://pcawg.xenahubs.net/>
- ATAC-seq Hub: <https://atacseq.xenahubs.net/>
- Single Cell Xena Hub: <https://singlecellnew.xenahubs.net/>
  (**Disabled by UCSCXena**)
- Kids First Xena Hub: <https://kidsfirst.xenahubs.net/>
- Treehouse Xena Hub: <https://xena.treehouse.gi.ucsc.edu:443/>

Users can manually update the dataset list to the newest version of
UCSC Xena with the `XenaDataUpdate()` function, then restart R and run
`library(UCSCXenaTools)`.

If the URL of a data hub changes or a new data hub comes online, please
let me know by emailing <w_shixiang@163.com> or [opening an issue on
GitHub](https://github.com/ropensci/UCSCXenaTools/issues).
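
To see which of the hubs above are recorded in the copy of `XenaData` bundled with your installed version, you can tabulate its host columns. This is a minimal sketch: the column names `XenaHostNames` and `XenaHosts` are taken from the `XenaData` preview later in this README, the object names `hub_counts` and `hub_urls` are illustrative, and the counts depend on the package version.

``` r
library(UCSCXenaTools)
data(XenaData)

# Number of datasets recorded for each hub (named by its short host name).
hub_counts <- table(XenaData$XenaHostNames)
hub_counts

# Distinct pairs of short host name and hub URL.
hub_urls <- unique(as.data.frame(XenaData[, c("XenaHostNames", "XenaHosts")]))
hub_urls
```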

## Basic usage

Downloading UCSC Xena datasets and loading them into R with
**UCSCXenaTools** is a five-step workflow: `generate`, `filter`,
`query`, `download` and `prepare`, implemented as the `XenaGenerate`,
`XenaFilter`, `XenaQuery`, `XenaDownload` and `XenaPrepare` functions,
respectively. They are easy to use and combine well with other packages
like `dplyr`.

To show the basic usage of **UCSCXenaTools**, we will download the
clinical data of LUNG, LUAD and LUSC from the TCGA (hg19 version) data
hub. Users can learn more about **UCSCXenaTools** by running
`browseVignettes("UCSCXenaTools")` to read the vignettes.

### XenaData data.frame

**UCSCXenaTools** uses `XenaData`, a `data.frame` object bundled with
the package that records information about all datasets of the UCSC
Xena Data Hubs, to generate an instance of the `XenaHub` class.

You can load `XenaData` after loading `UCSCXenaTools` into R.

``` r
library(UCSCXenaTools)
#> =========================================================================================
#> UCSCXenaTools version 1.6.0
#> Project URL: https://github.com/ropensci/UCSCXenaTools
#> Usages: https://cran.r-project.org/web/packages/UCSCXenaTools/vignettes/USCSXenaTools.html
#>
#> If you use it in published research, please cite:
#> Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
#>   from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
#>   Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
#> =========================================================================================
#>                               --Enjoy it--
data(XenaData)

head(XenaData)
#> # A tibble: 6 × 17
#>   XenaHosts XenaHostNames XenaCohorts XenaDatasets SampleCount DataSubtype Label
#>   <chr>     <chr>         <chr>       <chr>              <int> <chr>       <chr>
#> 1 https://… publicHub     Breast Can… ucsfNeve_pu…          51 gene expre… Neve…
#> 2 https://… publicHub     Breast Can… ucsfNeve_pu…          57 phenotype   Phen…
#> 3 https://… publicHub     Glioma (Ko… kotliarov20…         194 copy number Kotl…
#> 4 https://… publicHub     Glioma (Ko… kotliarov20…         194 phenotype   Phen…
#> 5 https://… publicHub     Lung Cance… weir2007_pu…         383 copy number CGH  
#> 6 https://… publicHub     Lung Cance… weir2007_pu…         383 phenotype   Phen…
#> # ℹ 10 more variables: Type <chr>, AnatomicalOrigin <chr>, SampleType <chr>,
#> #   Tags <chr>, ProbeMap <chr>, LongTitle <chr>, Citation <chr>, Version <chr>,
#> #   Unit <chr>, Platform <chr>
```
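
Because `XenaData` is an ordinary data frame, it can be explored with `dplyr` before building a `XenaHub` object. The sketch below assumes `dplyr` is installed and uses only columns visible in the output above; the name `lung_clinical` is illustrative, and the matching rows depend on your version of `XenaData`.

``` r
library(UCSCXenaTools)
library(dplyr)
data(XenaData)

# Keep TCGA hub datasets whose names look like lung clinical matrices,
# then show a few descriptive columns.
lung_clinical <- XenaData %>%
  dplyr::filter(
    XenaHostNames == "tcgaHub",
    grepl("clinical", XenaDatasets),
    grepl("LUAD|LUSC|LUNG", XenaDatasets)
  ) %>%
  dplyr::select(XenaCohorts, XenaDatasets, SampleCount)

lung_clinical
```

This only inspects the table. The workflow functions described in this README are still what turn a selection into a downloadable `XenaHub` object.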

### Workflow

Select datasets.

``` r
# The filter options of XenaFilter() accept regular expressions
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo

df_todo
#> class: XenaHub 
#> hosts():
#>   https://tcga.xenahubs.net
#> cohorts() (3 total):
#>   TCGA Lung Cancer (LUNG)
#>   TCGA Lung Adenocarcinoma (LUAD)
#>   TCGA Lung Squamous Cell Carcinoma (LUSC)
#> datasets() (3 total):
#>   TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#>   TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#>   TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
```
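
The components printed above can also be pulled out of the `XenaHub` object as plain character vectors, which is handy for checking a selection programmatically. This is a minimal sketch, assuming the accessor functions `hosts()`, `cohorts()` and `datasets()` exported by the package; it rebuilds `df_todo` so the block runs on its own.

``` r
library(UCSCXenaTools)

# Same selection as above, written with left assignment.
df_todo <- XenaGenerate(subset = XenaHostNames == "tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG")

hosts(df_todo)     # hub URL(s) in the selection
cohorts(df_todo)   # cohort names
datasets(df_todo)  # dataset identifiers in the selection
```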

Query and download.

``` r
XenaQuery(df_todo) %>%
  XenaDownload() -> xe_download
#> This will check url status, please be patient.
#> All downloaded files will under directory /tmp/RtmpYsoGw3.
#> The 'trans_slash' option is FALSE, keep same directory structure as Xena.
#> Creating directories for datasets...
#> Downloading TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
#> Downloading TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
#> Downloading TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
```
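
As the log above shows, files go to a temporary directory by default, and R removes that directory when the session ends. The sketch below passes a `destdir` argument to `XenaDownload()` to keep the files in a project folder instead. The folder name `xena_data` is hypothetical, the block needs network access, and `?XenaDownload` is the place to confirm which arguments your installed version accepts.

``` r
library(UCSCXenaTools)

# Hypothetical target folder; create it up front so the path exists.
dest <- file.path(getwd(), "xena_data")
dir.create(dest, showWarnings = FALSE, recursive = TRUE)

xe_download <- XenaGenerate(subset = XenaHostNames == "tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") %>%
  XenaQuery() %>%
  XenaDownload(destdir = dest)

# List what was written under the target folder.
list.files(dest, recursive = TRUE)
```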

Prepare data into R for analysis.

``` r
cli = XenaPrepare(xe_download)
class(cli)
#> [1] "list"
names(cli)
#> [1] "LUNG_clinicalMatrix" "LUAD_clinicalMatrix" "LUSC_clinicalMatrix"
```
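
Since `XenaPrepare()` returns a named list here, the usual list tools apply. The sketch below uses a small hand-made stand-in for `cli` so that it runs without downloading anything. Its column names and values are hypothetical, not real TCGA data; with the real object you would skip the first statement.

``` r
# Toy stand-in for `cli`: a named list with one data frame per cohort.
# Column names and values are made up for illustration only.
cli <- list(
  LUAD_clinicalMatrix = data.frame(sampleID = c("A1", "A2"), age = c(61, 70), stage = c("I", "II")),
  LUSC_clinicalMatrix = data.frame(sampleID = c("S1", "S2", "S3"), age = c(58, 66, 73))
)

# Rows and columns of each element (one matrix column per cohort).
sizes <- vapply(cli, dim, integer(2))

# Columns present in every cohort, then one stacked table with a cohort label.
shared <- Reduce(intersect, lapply(cli, names))
combined <- do.call(rbind, lapply(names(cli), function(nm) {
  cbind(cohort = nm, cli[[nm]][, shared, drop = FALSE])
}))
```

Restricting to the shared columns before `rbind()` avoids errors when the cohorts do not carry exactly the same clinical fields.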

## More to read

- [Introduction and basic usage of
  UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-intro/)
- [UCSCXenaTools: Retrieve Gene Expression and Clinical Information from
  UCSC Xena for Survival
  Analysis](https://shixiangwang.github.io/home/en/post/ucscxenatools-201908/)
- [Obtain RNAseq Values for a Specific Gene in Xena
  Database](https://shixiangwang.github.io/home/en/post/2020-07-22-ucscxenatools-single-gene/)
- [UCSC Xena Access APIs in
  UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-api/)

## Citation

Cite the package with the following paper.

    Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
      from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. 
      Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627

    # For BibTeX

    @article{Wang2019UCSCXenaTools,
        journal = {Journal of Open Source Software},
        doi = {10.21105/joss.01627},
        issn = {2475-9066},
        number = {40},
        publisher = {The Open Journal},
        title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
        url = {https://dx.doi.org/10.21105/joss.01627},
        volume = {4},
        author = {Wang, Shixiang and Liu, Xuesong},
        pages = {1627},
        date = {2019-08-05},
        year = {2019},
        month = {8},
        day = {5},
    }

Cite UCSC Xena with the following paper.

    Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data 
        visualization and interpretation." BioRxiv (2019): 326470.

## How to contribute

Anyone who wants to contribute should follow these steps:

- Clone the project from GitHub
- Open `UCSCXenaTools.Rproj` with RStudio
- Modify the source code
- Run `devtools::check()`, and fix all errors, warnings and notes
- Create a pull request

## Acknowledgment

This package is based on [XenaR](https://github.com/mtmorgan/XenaR),
thanks to [Martin Morgan](https://github.com/mtmorgan) for his work.

[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)