======
edgepy
======

.. currentmodule:: inmoose.edgepy

This module is a partial Python port of the R Bioconductor `edgeR package
<https://bioconductor.org/packages/release/bioc/html/edgeR.html>`_.
Only the functionality required by :func:`inmoose.pycombat.pycombat_seq` and
by differential expression analysis has been ported so far.

Differential Expression Analysis Example
========================================

The example below uses :code:`edgepy` to perform a differential expression
analysis on the pasilla dataset.

.. repl::
   from inmoose.data.pasilla import pasilla
   from inmoose.edgepy import DGEList, glmLRT, topTags
   from patsy import dmatrix

   # load the pasilla dataset as an AnnData
   pas = pasilla()

   # extract the count matrix and the annotation dataframe from the AnnData object
   counts = pas.X.T
   anno = pas.obs
   # build the design matrix
   design = dmatrix("~condition", data=anno)

   # build a DGEList object
   dge_list = DGEList(counts=counts, samples=anno, group_col="condition", genes=pas.var)
   # estimate the dispersions
   dge_list.estimateGLMCommonDisp(design=design)

   # fit the GLM
   fit = dge_list.glmFit(design=design)

   # run a differential expression analysis based on LRT
   lrt = glmLRT(fit)

   # display the top-ranked genes
   topTags(lrt)


References
==========

.. [Chen2016] Y. Chen, A.T.L. Lun, G.K. Smyth. 2016. From reads to genes to
   pathways: differential expression analysis of RNA-Seq experiments using
   Rsubread and the edgeR quasi-likelihood pipeline. *F1000Research* 5, 1438.
   :doi:`10.12688/f1000research.8987.2`

.. [Gibbons1975] J.D. Gibbons, J.W. Pratt. 1975. P-values: interpretation and
   methodology. *The American Statistician* 29, 20-25.
   :doi:`10.1080/00031305.1975.10479106`

.. [Lun2016] A.T.L. Lun, Y. Chen, G.K. Smyth. 2016. It's DE-licious: a recipe
   for differential expression analyses of RNA-seq experiments using
   quasi-likelihood methods in edgeR. *Methods in Molecular Biology* 1418,
   391-416. :doi:`10.1007/978-1-4939-3578-9_19`

.. [Lund2012] S.P. Lund, D. Nettleton, D.J. McCarthy, G.K. Smyth. 2012.
   Detecting differential expression in RNA-sequence data using quasi-likelihood
   with shrunken dispersion estimates. *Statistical Applications in Genetics and
   Molecular Biology* 11(5), Article 8.
   :doi:`10.1515/1544-6115.1826`

.. [Lun2017] A.T.L. Lun, G.K. Smyth. 2017. No counts, no variance: allowing for
   loss of degrees of freedom when assessing biological variability from RNA-seq
   data. *Statistical Applications in Genetics and Molecular Biology* 16(2),
   83-93. :doi:`10.1515/sagmb-2017-0010`

.. [McCarthy2012] D.J. McCarthy, Y. Chen, G.K. Smyth. 2012. Differential
   expression analysis of multifactor RNA-Seq experiments with respect to
   biological variation. *Nucleic Acids Research* 40, 4288-4297.
   :doi:`10.1093/nar/gks042`

.. [Phipson2016] B. Phipson, S. Lee, I.J. Majewski, W.S. Alexander, G.K. Smyth.
   2016. Robust hyperparameter estimation protects against hypervariable genes
   and improves power to detect differential expression. *Annals of Applied
   Statistics* 10, 946-963. :doi:`10.1214/16-AOAS920`

.. [Robinson2008] M.D. Robinson, G.K. Smyth. 2008. Small-sample estimation of
   negative binomial dispersion, with applications to SAGE data.
   *Biostatistics* 9, 321-332. :doi:`10.1093/biostatistics/kxm030`


Code documentation
==================

.. autosummary::
   :toctree: generated/

   DGEList

   addPriorCount
   adjustedProfileLik
   aveLogCPM
   binomTest
   designAsFactor
   dispCoxReid
   dispCoxReidInterpolateTagwise
   estimateGLMCommonDisp
   estimateGLMTagwiseDisp
   exactTest
   exactTestBetaApprox
   exactTestByDeviance
   exactTestBySmallP
   exactTestDoubleTail
   glmFit
   glmLRT
   glmQLFit
   glmQLFTest
   mglmLevenberg
   mglmOneGroup
   mglmOneWay
   movingAverageByCol
   nbinomDeviance
   plotQLDisp
   predFC
   splitIntoGroups
   systematicSubset
   topTags
   validDGEList