a/README.md b/README.md
1
1
---
2
<!-- README.md is generated from README.Rmd. Please edit that file -->
2
output: github_document
3
3
---
4
# UCSCXenaTools <img src='man/figures/logo.png' align="right" height="200" alt="logo"/>
4
5
5
<!-- README.md is generated from README.Rmd. Please edit that file -->
6
<!-- badges: start -->
6
7
7
```{r, echo = FALSE}
8
[![CRAN
8
knitr::opts_chunk$set(
9
status](https://www.r-pkg.org/badges/version/UCSCXenaTools)](https://cran.r-project.org/package=UCSCXenaTools)
9
  collapse = TRUE,
10
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
10
  comment = "#>",
11
[![R-CMD-check](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml/badge.svg)](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml)
11
  fig.path = "README-"
12
[![](https://cranlogs.r-pkg.org/badges/grand-total/UCSCXenaTools?color=orange)](https://cran.r-project.org/package=UCSCXenaTools)
12
)
13
[![rOpenSci](https://badges.ropensci.org/315_status.svg)](https://github.com/ropensci/software-review/issues/315)
13
```
14
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01627/status.svg)](https://doi.org/10.21105/joss.01627)
14
15
15
16
<!-- badges: end -->
16
<!-- badges: start -->
17
17
18
**UCSCXenaTools** is an R package for accessing genomics data from UCSC
18
[![CRAN status](https://www.r-pkg.org/badges/version/UCSCXenaTools)](https://cran.r-project.org/package=UCSCXenaTools)
19
Xena platform, from cancer multi-omics to single-cell RNA-seq. Public
19
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
20
omics data from UCSC Xena are supported through [**multiple turn-key
20
[![R-CMD-check](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml/badge.svg)](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml) 
21
Xena Hubs**](https://xenabrowser.net/datapages/), which are a collection
21
[![](https://cranlogs.r-pkg.org/badges/grand-total/UCSCXenaTools?color=orange)](https://cran.r-project.org/package=UCSCXenaTools)
22
of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE,
22
[![rOpenSci](https://badges.ropensci.org/315_status.svg)](https://github.com/ropensci/software-review/issues/315)
23
and others. Databases are normalized so they can be combined, linked,
23
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01627/status.svg)](https://doi.org/10.21105/joss.01627)
24
filtered, explored and downloaded.
24
25
25
<!-- badges: end -->
26
**Who is the target audience and what are scientific applications of
26
27
this package?**
27
**UCSCXenaTools** is an R package for accessing genomics data from the UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
28
28
Public omics data from UCSC Xena are supported through [**multiple turn-key Xena Hubs**](https://xenabrowser.net/datapages/), which are a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. Databases are normalized so they can be combined, linked, filtered, explored and downloaded.
29
- Target Audience: cancer and clinical researchers, bioinformaticians
29
30
- Applications: genomic and clinical analyses
30
**Who is the target audience and what are scientific applications of this package?**
31
31
32
## Table of Contents
32
* Target Audience: cancer and clinical researchers, bioinformaticians
33
33
* Applications: genomic and clinical analyses
34
- [Installation](#installation)
34
35
- [Data Hub List](#data-hub-list)
35
## Table of Contents
36
- [Basic usage](#basic-usage)
36
37
- [Citation](#citation)
37
* [Installation](#installation)
38
- [How to contribute](#how-to-contribute)
38
* [Data Hub List](#data-hub-list)
39
- [Acknowledgment](#acknowledgment)
39
* [Basic usage](#basic-usage)
40
40
* [Citation](#citation)
41
## Installation
41
* [How to contribute](#how-to-contribute)
42
42
* [Acknowledgment](#acknowledgment)
43
Install the stable release from r-universe/CRAN with:
43
44
44
## Installation
45
``` r
45
46
install.packages('UCSCXenaTools', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
46
Install the stable release from r-universe/CRAN with:
47
#install.packages("UCSCXenaTools")
47
48
```
48
```{r, eval=FALSE}
49
49
install.packages('UCSCXenaTools', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
50
You can also install the development version of **UCSCXenaTools** from GitHub
50
#install.packages("UCSCXenaTools")
51
with:
51
```
52
52
53
``` r
53
You can also install the development version of **UCSCXenaTools** from GitHub with:
54
# install.packages("remotes")
54
55
remotes::install_github("ropensci/UCSCXenaTools")
55
```{r gh-installation, eval = FALSE}
56
```
56
# install.packages("remotes")
57
57
remotes::install_github("ropensci/UCSCXenaTools")
58
To build the vignettes locally, add two options:
58
```
59
59
60
``` r
60
To build the vignettes locally, add two options:
61
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
61
62
```
62
```{r, eval=FALSE}
63
63
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
64
## Data Hub List
64
```
65
65
66
All datasets are available at <https://xenabrowser.net/datapages/>.
66
## Data Hub List
67
67
68
Currently, **UCSCXenaTools** supports the following data hubs of UCSC
68
All datasets are available at <https://xenabrowser.net/datapages/>.
69
Xena.
69
70
70
Currently, **UCSCXenaTools** supports the following data hubs of UCSC Xena.
71
- UCSC Public Hub: <https://ucscpublic.xenahubs.net/>
71
72
- TCGA Hub: <https://tcga.xenahubs.net/>
72
* UCSC Public Hub: <https://ucscpublic.xenahubs.net/>
73
- GDC Xena Hub (new): <https://gdc.xenahubs.net/>
73
* TCGA Hub: <https://tcga.xenahubs.net/>
74
- GDC v18.0 Xena Hub (old): <https://gdcV18.xenahubs.net/>
74
* GDC Xena Hub (new): <https://gdc.xenahubs.net/>
75
- ICGC Xena Hub: <https://icgc.xenahubs.net/>
75
* GDC v18.0 Xena Hub (old): <https://gdcV18.xenahubs.net/>
76
- Pan-Cancer Atlas Hub: <https://pancanatlas.xenahubs.net/>
76
* ICGC Xena Hub: <https://icgc.xenahubs.net/>
77
- UCSC Toil RNAseq Recompute Compendium Hub:
77
* Pan-Cancer Atlas Hub: <https://pancanatlas.xenahubs.net/>
78
  <https://toil.xenahubs.net/>
78
* UCSC Toil RNAseq Recompute Compendium Hub: <https://toil.xenahubs.net/>
79
- PCAWG Xena Hub: <https://pcawg.xenahubs.net/>
79
* PCAWG Xena Hub: <https://pcawg.xenahubs.net/>
80
- ATAC-seq Hub: <https://atacseq.xenahubs.net/>
80
* ATAC-seq Hub: <https://atacseq.xenahubs.net/>
81
- Single Cell Xena Hub: <https://singlecellnew.xenahubs.net/>
81
* Single Cell Xena Hub: <https://singlecellnew.xenahubs.net/> (**Disabled by UCSCXena**)
82
  (**Disabled by UCSCXena**)
82
* Kids First Xena Hub: <https://kidsfirst.xenahubs.net/>
83
- Kids First Xena Hub: <https://kidsfirst.xenahubs.net/>
83
* Treehouse Xena Hub: <https://xena.treehouse.gi.ucsc.edu:443/>
84
- Treehouse Xena Hub: <https://xena.treehouse.gi.ucsc.edu:443/>
84
85
85
Users can manually update the dataset list from the newest version of UCSC Xena with the `XenaDataUpdate()` function, followed
86
Users can update the dataset list from the newest version of UCSC Xena by
86
by restarting R and `library(UCSCXenaTools)`.
87
hand with the `XenaDataUpdate()` function, followed by restarting R and running
87
88
`library(UCSCXenaTools)`.
88
If the URL of any data hub changes or a new data hub comes online, please let me know by emailing <w_shixiang@163.com> or [opening an issue on GitHub](https://github.com/ropensci/UCSCXenaTools/issues).
89
89
90
If the URL of any data hub changes or a new data hub comes online, please
90
91
let me know by emailing <w_shixiang@163.com> or [opening an issue on
91
## Basic usage
92
GitHub](https://github.com/ropensci/UCSCXenaTools/issues).
92
93
93
Downloading UCSC Xena datasets and loading them into R with **UCSCXenaTools** is a five-step workflow: `generate`, `filter`, `query`, `download` and `prepare`, implemented as the `XenaGenerate`, `XenaFilter`, `XenaQuery`, `XenaDownload` and `XenaPrepare` functions, respectively. The functions are straightforward to use and combine well with other packages such as `dplyr`.
94
## Basic usage
94
95
95
To show the basic usage of **UCSCXenaTools**, we will download clinical data for LUNG, LUAD, and LUSC from the TCGA (hg19 version) data hub. Users can learn more about **UCSCXenaTools** by running `browseVignettes("UCSCXenaTools")` to read the vignettes.
96
Downloading UCSC Xena datasets and loading them into R with **UCSCXenaTools** is
96
97
a workflow with `generate`, `filter`, `query`, `download` and `prepare`
97
### XenaData data.frame
98
5 steps, which are implemented as `XenaGenerate`, `XenaFilter`,
98
99
`XenaQuery`, `XenaDownload` and `XenaPrepare` functions, respectively.
99
**UCSCXenaTools** uses `XenaData`, a `data.frame` object bundled with the package that records information on all datasets of the UCSC Xena Data Hubs, to generate an instance of the `XenaHub` class.
100
They are very clear and easy to use and combine with other packages like
100
101
`dplyr`.
101
You can load `XenaData` after loading `UCSCXenaTools` into R.
102
102
103
To show the basic usage of **UCSCXenaTools**, we will download clinical
103
```{r}
104
data for LUNG, LUAD, and LUSC from the TCGA (hg19 version) data hub. Users can
104
library(UCSCXenaTools)
105
learn more about **UCSCXenaTools** by running
105
data(XenaData)
106
`browseVignettes("UCSCXenaTools")` to read the vignettes.
106
107
107
head(XenaData)
108
### XenaData data.frame
108
```
109
109
110
**UCSCXenaTools** uses a `data.frame` object (built in package)
110
### Workflow
111
`XenaData` to generate an instance of `XenaHub` class, which records
111
112
information of all datasets of UCSC Xena Data Hubs.
112
Select datasets.
113
113
114
You can load `XenaData` after loading `UCSCXenaTools` into R.
114
```{r}
115
115
# The filter options of XenaFilter() accept regular expressions
116
``` r
116
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>% 
117
library(UCSCXenaTools)
117
  XenaFilter(filterDatasets = "clinical") %>% 
118
#> =========================================================================================
118
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo
119
#> UCSCXenaTools version 1.6.0
119
120
#> Project URL: https://github.com/ropensci/UCSCXenaTools
120
df_todo
121
#> Usages: https://cran.r-project.org/web/packages/UCSCXenaTools/vignettes/USCSXenaTools.html
121
```
122
#> 
122
123
#> If you use it in published research, please cite:
123
Query and download.
124
#> Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
124
125
#>   from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
125
```{r}
126
#>   Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
126
XenaQuery(df_todo) %>%
127
#> =========================================================================================
127
  XenaDownload() -> xe_download
128
#>                               --Enjoy it--
128
```
129
data(XenaData)
129
130
130
Prepare data into R for analysis.
131
head(XenaData)
131
132
#> # A tibble: 6 × 17
132
```{r}
133
#>   XenaHosts XenaHostNames XenaCohorts XenaDatasets SampleCount DataSubtype Label
133
cli = XenaPrepare(xe_download)
134
#>   <chr>     <chr>         <chr>       <chr>              <int> <chr>       <chr>
134
class(cli)
135
#> 1 https://… publicHub     Breast Can… ucsfNeve_pu…          51 gene expre… Neve…
135
names(cli)
136
#> 2 https://… publicHub     Breast Can… ucsfNeve_pu…          57 phenotype   Phen…
136
```
137
#> 3 https://… publicHub     Glioma (Ko… kotliarov20…         194 copy number Kotl…
137
138
#> 4 https://… publicHub     Glioma (Ko… kotliarov20…         194 phenotype   Phen…
138
## More to read
139
#> 5 https://… publicHub     Lung Cance… weir2007_pu…         383 copy number CGH  
139
140
#> 6 https://… publicHub     Lung Cance… weir2007_pu…         383 phenotype   Phen…
140
- [Introduction and basic usage of UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-intro/)
141
#> # ℹ 10 more variables: Type <chr>, AnatomicalOrigin <chr>, SampleType <chr>,
141
- [UCSCXenaTools: Retrieve Gene Expression and Clinical Information from UCSC Xena for Survival Analysis](https://shixiangwang.github.io/home/en/post/ucscxenatools-201908/)
142
#> #   Tags <chr>, ProbeMap <chr>, LongTitle <chr>, Citation <chr>, Version <chr>,
142
- [Obtain RNAseq Values for a Specific Gene in Xena Database](https://shixiangwang.github.io/home/en/post/2020-07-22-ucscxenatools-single-gene/)
143
#> #   Unit <chr>, Platform <chr>
143
- [UCSC Xena Access APIs in UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-api/)
144
```
144
145
145
## Citation
146
### Workflow
146
147
147
Cite me by the following paper.
148
Select datasets.
148
149
149
```
150
``` r
150
Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
151
# The filter options of XenaFilter() accept regular expressions
151
  from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. 
152
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>% 
152
  Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
153
  XenaFilter(filterDatasets = "clinical") %>% 
153
154
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo
154
# For BibTex
155
155
  
156
df_todo
156
@article{Wang2019UCSCXenaTools,
157
#> class: XenaHub 
157
    journal = {Journal of Open Source Software},
158
#> hosts():
158
    doi = {10.21105/joss.01627},
159
#>   https://tcga.xenahubs.net
159
    issn = {2475-9066},
160
#> cohorts() (3 total):
160
    number = {40},
161
#>   TCGA Lung Cancer (LUNG)
161
    publisher = {The Open Journal},
162
#>   TCGA Lung Adenocarcinoma (LUAD)
162
    title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
163
#>   TCGA Lung Squamous Cell Carcinoma (LUSC)
163
    url = {https://dx.doi.org/10.21105/joss.01627},
164
#> datasets() (3 total):
164
    volume = {4},
165
#>   TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
165
    author = {Wang, Shixiang and Liu, Xuesong},
166
#>   TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
166
    pages = {1627},
167
#>   TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
167
    date = {2019-08-05},
168
```
168
    year = {2019},
169
169
    month = {8},
170
Query and download.
170
    day = {5},
171
171
}
172
``` r
172
```
173
XenaQuery(df_todo) %>%
173
174
  XenaDownload() -> xe_download
174
Cite UCSC Xena by the following paper. 
175
#> This will check url status, please be patient.
175
176
#> All downloaded files will under directory /tmp/RtmpYsoGw3.
176
```
177
#> The 'trans_slash' option is FALSE, keep same directory structure as Xena.
177
Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data 
178
#> Creating directories for datasets...
178
    visualization and interpretation." BioRxiv (2019): 326470.
179
#> Downloading TCGA.LUNG.sampleMap/LUNG_clinicalMatrix
179
```
180
#> Downloading TCGA.LUAD.sampleMap/LUAD_clinicalMatrix
180
181
#> Downloading TCGA.LUSC.sampleMap/LUSC_clinicalMatrix
181
## How to contribute
182
```
182
183
183
To contribute, please follow these steps:
184
Prepare data into R for analysis.
184
185
185
* Clone project from GitHub
186
``` r
186
* Open `UCSCXenaTools.Rproj` with RStudio
187
cli = XenaPrepare(xe_download)
187
* Modify source code 
188
class(cli)
188
* Run `devtools::check()`, and fix all errors, warnings and notes
189
#> [1] "list"
189
* Create a pull request
190
names(cli)
190
191
#> [1] "LUNG_clinicalMatrix" "LUAD_clinicalMatrix" "LUSC_clinicalMatrix"
191
## Acknowledgment
192
```
192
193
193
This package is based on [XenaR](https://github.com/mtmorgan/XenaR). Thanks to [Martin Morgan](https://github.com/mtmorgan) for his work.
194
## More to read
194
195
195
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)
196
- [Introduction and basic usage of
197
  UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-intro/)
198
- [UCSCXenaTools: Retrieve Gene Expression and Clinical Information from
199
  UCSC Xena for Survival
200
  Analysis](https://shixiangwang.github.io/home/en/post/ucscxenatools-201908/)
201
- [Obtain RNAseq Values for a Specific Gene in Xena
202
  Database](https://shixiangwang.github.io/home/en/post/2020-07-22-ucscxenatools-single-gene/)
203
- [UCSC Xena Access APIs in
204
  UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-api/)
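
Before working through the longer posts listed above, the first two workflow steps from Basic usage can be tried without downloading anything. The following is a minimal sketch, assuming **UCSCXenaTools** and `dplyr` are installed; the object name `lung_summary` is made up for illustration, and the accessor functions `hosts()`, `cohorts()` and `datasets()` are assumed to match the labels printed when a `XenaHub` object is shown.

```r
library(UCSCXenaTools)
library(dplyr)

# Built-in table describing every dataset on the supported hubs.
data(XenaData)

# Plain dplyr on XenaData: count TCGA hub datasets per data subtype
# for cohorts whose name mentions LUAD, LUSC or LUNG.
# `lung_summary` is an illustrative name, not part of the package.
lung_summary <- XenaData %>%
  filter(XenaHostNames == "tcgaHub",
         grepl("LUAD|LUSC|LUNG", XenaCohorts)) %>%
  count(DataSubtype, sort = TRUE)
print(lung_summary)

# Same selection as in the Workflow section, written with `<-`.
df_todo <- XenaGenerate(subset = XenaHostNames == "tcgaHub") %>%
  XenaFilter(filterDatasets = "clinical") %>%
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG")

# Inspect what would be queried before calling XenaQuery()/XenaDownload().
hosts(df_todo)
cohorts(df_todo)
datasets(df_todo)
```

Checking `datasets(df_todo)` first is a cheap way to confirm that the regular expressions selected only the intended files before any network request is made.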
205
206
## Citation
207
208
Cite me by the following paper.
209
210
    Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
211
      from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. 
212
      Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627
213
214
    # For BibTex
215
      
216
    @article{Wang2019UCSCXenaTools,
217
        journal = {Journal of Open Source Software},
218
        doi = {10.21105/joss.01627},
219
        issn = {2475-9066},
220
        number = {40},
221
        publisher = {The Open Journal},
222
        title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
223
        url = {https://dx.doi.org/10.21105/joss.01627},
224
        volume = {4},
225
        author = {Wang, Shixiang and Liu, Xuesong},
226
        pages = {1627},
227
        date = {2019-08-05},
228
        year = {2019},
229
        month = {8},
230
        day = {5},
231
    }
232
233
Cite UCSC Xena by the following paper.
234
235
    Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data 
236
        visualization and interpretation." BioRxiv (2019): 326470.
237
238
## How to contribute
239
240
To contribute, please follow these steps:
241
242
- Clone project from GitHub
243
- Open `UCSCXenaTools.Rproj` with RStudio
244
- Modify source code
245
- Run `devtools::check()`, and fix all errors, warnings and notes
246
- Create a pull request
247
248
## Acknowledgment
249
250
This package is based on [XenaR](https://github.com/mtmorgan/XenaR).
251
Thanks to [Martin Morgan](https://github.com/mtmorgan) for his work.
252
253
[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)