.. docs/source/pasilla.rst

pasilla
=======

.. currentmodule:: inmoose.data.pasilla

This module is a port of the R Bioconductor `pasilla package
<https://bioconductor.org/packages/release/data/experiment/html/pasilla.html>`_,
version 1.31.0.

This module provides per-exon and per-gene read counts computed for selected
genes from RNA-seq data that were presented in [Brooks2011]_. The experiment
studied the effect of RNAi knockdown of Pasilla, the *Drosophila melanogaster*
ortholog of mammalian NOVA1 and NOVA2, on the transcriptome. The `R package
vignette
<https://bioconductor.org/packages/release/data/experiment/vignettes/pasilla/inst/doc/create_objects.html>`_
describes how the data provided here were derived from the RNA-seq read sequence
data that are provided by NCBI Gene Expression Omnibus under accession numbers
GSM461176 to GSM461181.

We describe below how to load the data to build an :class:`AnnData` object (NB:
the snippet below is wrapped in the :func:`pasilla` function for convenience)::

    import importlib.resources
    import pandas as pd
    import anndata as ad

    data_dir = importlib.resources.files("inmoose.data.pasilla")
    cts = pd.read_csv(data_dir.joinpath("pasilla_gene_counts.tsv"), sep='\t', index_col=0)
    anno = pd.read_csv(data_dir.joinpath("pasilla_sample_annotation.csv"), index_col=0)

    # The columns of `cts` and the rows of `anno` use different labels and are
    # not in the same order. We first need to harmonize them before building the
    # AnnData object.

    # first get rid of the "fb" suffix
    anno.index = [i[:-2] for i in anno.index]

    # second reorder the index
    anno = anno.reindex(cts.columns)

    # we are now ready to build the AnnData object
    adata = ad.AnnData(cts.T, anno)
    adata


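The harmonization step can be tried on toy data without the packaged files. The sketch below is illustrative only: the gene names, counts, and the ``condition`` column are made-up stand-ins, not values from the pasilla dataset. It also adds two defensive checks that the snippet above does not perform (stripping the suffix only where present, and failing early when a sample has no annotation); these are an assumption about what a cautious caller might want, not a description of what :func:`pasilla` does.

```python
import pandas as pd

# Toy stand-ins for the real tables (hypothetical values, for illustration only).
# Columns of `cts` are sample labels; rows of `anno` carry an extra "fb" suffix
# and appear in a different order.
cts = pd.DataFrame(
    {"untreated1": [0, 92], "treated1": [0, 140], "treated2": [1, 88]},
    index=["gene_a", "gene_b"],
)
anno = pd.DataFrame(
    {"condition": ["treated", "treated", "untreated"]},
    index=["treated1fb", "treated2fb", "untreated1fb"],
)

# Strip the suffix only where it is actually present.
anno.index = [i[:-2] if i.endswith("fb") else i for i in anno.index]

# Fail early if some samples lack annotation, instead of letting
# reindex silently fill the missing rows with NaN.
missing = set(cts.columns) - set(anno.index)
if missing:
    raise ValueError(f"samples without annotation: {sorted(missing)}")

# Reorder the annotation rows to follow the column order of the count table.
anno = anno.reindex(cts.columns)
aligned = list(anno.index) == list(cts.columns)
```

After this, ``aligned`` is ``True`` and the rows of ``anno`` follow the column order of ``cts``, which is the precondition for passing ``cts.T`` and ``anno`` together to ``ad.AnnData``.
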
Code documentation
------------------

.. autofunction:: pasilla

References
----------

.. [Brooks2011] A.N. Brooks, L. Yang, M.O. Duff, K.D. Hansen, J.W. Park, S.
   Dudoit, S.E. Brenner, B.R. Graveley. 2011. Conservation of an RNA regulatory
   map between Drosophila and mammals. *Genome Research*
   :doi:`10.1101/gr.108662.110`