|
a/README.md |
|
b/README.md |
1 |
|
1 |
|
2 |
<!-- README.md is generated from README.Rmd. Please edit that file --> |
2 |
|
3 |
|
3 |
# R/`biotmle` |
4 |
# R/`biotmle` |
4 |
|
5 |
|
5 |
[](https://github.com/nhejazi/biotmle/actions) |
6 |
[](https://github.com/nhejazi/biotmle/actions) |
6 |
[](https://codecov.io/github/nhejazi/biotmle?branch=master) |
7 |
[](http://www.repostatus.org/#active) |
8 |
Status](https://img.shields.io/codecov/c/github/nhejazi/biotmle/master.svg)](https://codecov.io/github/nhejazi/biotmle?branch=master) |
8 |
[](https://bioconductor.org/checkResults/release/bioc-LATEST/biotmle) |
9 |
[](https://bioconductor.org/packages/release/bioc/html/biotmle.html) |
10 |
state and is being actively |
10 |
[](https://bioconductor.org/packages/release/bioc/html/biotmle.html) |
11 |
developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) |
11 |
[](http://opensource.org/licenses/MIT) |
12 |
[](https://zenodo.org/badge/latestdoi/65854775) |
13 |
status](http://www.bioconductor.org/shields/build/release/bioc/biotmle.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/biotmle) |
13 |
[](http://joss.theoj.org/papers/02be843d9bab1b598187bfbb08ce3949) |
14 |
[](https://bioconductor.org/packages/release/bioc/html/biotmle.html) |
15 |
Targeted Learning with Moderated Statistics for Biomarker Discovery |
16 |
[](https://bioconductor.org/packages/release/bioc/html/biotmle.html) |
17 |
__Authors:__ [Nima Hejazi](https://nimahejazi.org), [Mark van der |
18 |
[, and [Alan |
19 |
license](http://img.shields.io/badge/license-MIT-brightgreen.svg)](http://opensource.org/licenses/MIT) |
19 |
Hubbard](https://hubbard.berkeley.edu) |
20 |
[](https://zenodo.org/badge/latestdoi/65854775) |
20 |
|
21 |
[](http://joss.theoj.org/papers/02be843d9bab1b598187bfbb08ce3949) |
22 |
|
23 |
|
23 |
## What's `biotmle`? |
24 |
> Targeted Learning with Moderated Statistics for Biomarker Discovery |
24 |
|
25 |
|
25 |
The `biotmle` R package facilitates biomarker discovery through a generalization |
26 |
**Authors:** [Nima Hejazi](https://nimahejazi.org), [Mark van der |
26 |
of the moderated t-statistic [@smyth2004linear] that extends the procedure to |
27 |
Laan](https://vanderlaan-lab.org/about), and [Alan |
27 |
locally efficient estimators of asymptotically linear target parameters |
28 |
Hubbard](https://hubbard.berkeley.edu) |
28 |
[@tsiatis2007semiparametric]. The set of methods implemented modifies targeted |
29 |
|
29 |
maximum likelihood (TML) estimators of statistical (or causal) target parameters |
30 |
----- |
30 |
(e.g., average treatment effect) to apply variance moderation to the standard |
31 |
|
31 |
variance estimator based on the efficient influence function (EIF) of the target |
32 |
## What’s `biotmle`? |
32 |
parameter [@vdl2011targeted; @vdl2018targeted]. By performing a moderated |
33 |
|
33 |
hypothesis test that pools the individual probe-specific EIF-based variance |
34 |
The `biotmle` R package facilitates biomarker discovery through a |
34 |
estimates, a robust variance estimator is constructed, which stabilizes the |
35 |
generalization of the moderated t-statistic (Smyth 2004) that extends |
35 |
standard error estimates and improves the performance of such estimators both in |
36 |
the procedure to locally efficient estimators of asymptotically linear |
36 |
smaller samples and in settings where the EIF is poorly estimated. The resultant |
37 |
target parameters (Tsiatis 2007). The set of methods implemented modifies |
37 |
procedure allows for the construction of conservative hypothesis tests that |
38 |
targeted maximum likelihood (TML) estimators of statistical (or causal) |
38 |
reduce the false discovery rate and/or the family-wise error rate |
39 |
target parameters (e.g., average treatment effect) to apply variance |
39 |
[@hejazi2021generalization]. Improvements upon prior TML-based approaches to |
40 |
moderation to the standard variance estimator based on the efficient |
40 |
biomarker discovery (e.g., @bembom2009biomarker) include both the moderated |
41 |
influence function (EIF) of the target parameter (van der Laan and Rose |
41 |
variance estimator and the use of conservative reference distributions |
42 |
2011, 2018). By performing a moderated hypothesis test that pools the |
42 |
for the corresponding moderated test statistics (e.g., logistic distribution), |
43 |
individual probe-specific EIF-based variance estimates, a robust |
43 |
inspired by tail bounds based on concentration |
44 |
variance estimator is constructed, which stabilizes the standard error |
44 |
inequalities [@rosenblum2009confidence]; the latter prove critical for obtaining |
45 |
estimates and improves the performance of such estimators both in |
45 |
robust inference when the finite-sample distribution of the estimator deviates |
46 |
smaller samples and in settings where the EIF is poorly estimated. The |
46 |
from normality. |
47 |
resultant procedure allows for the construction of conservative |
47 |
|
48 |
hypothesis tests that reduce the false discovery rate and/or the |
48 |
--- |
49 |
family-wise error rate (Hejazi, van der Laan, and Hubbard 2021). |
49 |
|
50 |
Improvements upon prior TML-based approaches to biomarker discovery |
50 |
## Installation |
51 |
(e.g., Bembom et al. (2009)) include both the moderated variance |
51 |
|
52 |
estimator and the use of conservative reference distributions for |
52 |
For standard use, install from |
53 |
the corresponding moderated test statistics (e.g., logistic |
53 |
[Bioconductor](https://bioconductor.org/packages/biotmle) using |
54 |
distribution), inspired by tail bounds based on concentration |
54 |
[`BiocManager`](https://CRAN.R-project.org/package=BiocManager): |
55 |
inequalities (Rosenblum and van der Laan 2009); the latter prove |
55 |
|
56 |
critical for obtaining robust inference when the finite-sample |
56 |
```{r bioc-installation, eval = FALSE} |
57 |
distribution of the estimator deviates from normality. |
57 |
if (!requireNamespace("BiocManager", quietly=TRUE)) { |
58 |
|
58 |
install.packages("BiocManager") |
59 |
----- |
59 |
} |
60 |
|
60 |
BiocManager::install("biotmle") |
61 |
## Installation |
61 |
``` |
62 |
|
62 |
|
63 |
For standard use, install from |
63 |
To contribute, install the bleeding-edge _development version_ from GitHub via |
64 |
[Bioconductor](https://bioconductor.org/packages/biotmle) using |
64 |
[`remotes`](https://CRAN.R-project.org/package=remotes): |
65 |
[`BiocManager`](https://CRAN.R-project.org/package=BiocManager): |
65 |
|
66 |
|
66 |
```{r gh-master-installation, eval = FALSE} |
67 |
``` r |
67 |
remotes::install_github("nhejazi/biotmle") |
68 |
if (!requireNamespace("BiocManager", quietly=TRUE)) { |
68 |
``` |
69 |
install.packages("BiocManager") |
69 |
|
70 |
} |
70 |
Current and prior [Bioconductor](https://bioconductor.org) releases are |
71 |
BiocManager::install("biotmle") |
71 |
available under branches with numbers prefixed by "RELEASE_". For example, to |
72 |
``` |
72 |
install the version of this package available via Bioconductor 3.6, use |
73 |
|
73 |
|
74 |
To contribute, install the bleeding-edge *development version* from |
74 |
```{r gh-develop-installation, eval = FALSE} |
75 |
GitHub via [`remotes`](https://CRAN.R-project.org/package=remotes): |
75 |
remotes::install_github("nhejazi/biotmle", ref = "RELEASE_3_6") |
76 |
|
76 |
``` |
77 |
``` r |
77 |
|
78 |
remotes::install_github("nhejazi/biotmle") |
78 |
--- |
79 |
``` |
79 |
|
80 |
|
80 |
## Example |
81 |
Current and prior [Bioconductor](https://bioconductor.org) releases are |
81 |
|
82 |
available under branches with numbers prefixed by “RELEASE\_”. For |
82 |
For details on how to best use the `biotmle` R package, please consult the most |
83 |
example, to install the version of this package available via |
83 |
recent [package |
84 |
Bioconductor 3.6, use |
84 |
vignette](https://bioconductor.org/packages/release/bioc/vignettes/biotmle/inst/doc/exposureBiomarkers.html) |
85 |
|
85 |
available through the [Bioconductor |
86 |
``` r |
86 |
project](https://bioconductor.org/packages/biotmle). |
87 |
remotes::install_github("nhejazi/biotmle", ref = "RELEASE_3_6") |
87 |
|
88 |
``` |
88 |
--- |
89 |
|
89 |
|
90 |
----- |
90 |
## Issues |
91 |
|
91 |
|
92 |
## Example |
92 |
If you encounter any bugs or have any specific feature requests, please [file an |
93 |
|
93 |
issue](https://github.com/nhejazi/biotmle/issues). |
94 |
For details on how to best use the `biotmle` R package, please consult |
94 |
|
95 |
the most recent [package |
95 |
--- |
96 |
vignette](https://bioconductor.org/packages/release/bioc/vignettes/biotmle/inst/doc/exposureBiomarkers.html) |
96 |
|
97 |
available through the [Bioconductor |
97 |
## Contributions |
98 |
project](https://bioconductor.org/packages/biotmle). |
98 |
|
99 |
|
99 |
Contributions are very welcome. Interested contributors should consult our |
100 |
----- |
100 |
[contribution |
101 |
|
101 |
guidelines](https://github.com/nhejazi/biotmle/blob/master/CONTRIBUTING.md) |
102 |
## Issues |
102 |
prior to submitting a pull request. |
103 |
|
103 |
|
104 |
If you encounter any bugs or have any specific feature requests, please |
104 |
--- |
105 |
[file an issue](https://github.com/nhejazi/biotmle/issues). |
105 |
|
106 |
|
106 |
## Citation |
107 |
----- |
107 |
|
108 |
|
108 |
After using the `biotmle` R package, please cite both of the following: |
109 |
## Contributions |
109 |
|
110 |
|
110 |
@article{hejazi2017biotmle, |
111 |
Contributions are very welcome. Interested contributors should consult |
111 |
author = {Hejazi, Nima S and Cai, Weixin and Hubbard, Alan E}, |
112 |
our [contribution |
112 |
title = {biotmle: Targeted Learning for Biomarker Discovery}, |
113 |
guidelines](https://github.com/nhejazi/biotmle/blob/master/CONTRIBUTING.md) |
113 |
journal = {The Journal of Open Source Software}, |
114 |
prior to submitting a pull request. |
114 |
volume = {2}, |
115 |
|
115 |
number = {15}, |
116 |
----- |
116 |
month = {July}, |
117 |
|
117 |
year = {2017}, |
118 |
## Citation |
118 |
publisher = {The Open Journal}, |
119 |
|
119 |
doi = {10.21105/joss.00295}, |
120 |
After using the `biotmle` R package, please cite both of the following: |
120 |
url = {https://doi.org/10.21105/joss.00295} |
121 |
|
121 |
} |
122 |
``` |
122 |
|
123 |
@article{hejazi2017biotmle, |
123 |
@article{hejazi2021generalization, |
124 |
author = {Hejazi, Nima S and Cai, Weixin and Hubbard, Alan E}, |
124 |
author = {Hejazi, Nima S and Boileau, Philippe and {van der Laan}, |
125 |
title = {biotmle: Targeted Learning for Biomarker Discovery}, |
125 |
Mark J and Hubbard, Alan E}, |
126 |
journal = {The Journal of Open Source Software}, |
126 |
title = {A generalization of moderated statistics to data adaptive |
127 |
volume = {2}, |
127 |
semiparametric estimation in high-dimensional biology}, |
128 |
number = {15}, |
128 |
journal={under review}, |
129 |
month = {July}, |
129 |
volume={}, |
130 |
year = {2017}, |
130 |
number={}, |
131 |
publisher = {The Open Journal}, |
131 |
pages={}, |
132 |
doi = {10.21105/joss.00295}, |
132 |
year = {2021+}, |
133 |
url = {https://doi.org/10.21105/joss.00295} |
133 |
publisher={}, |
134 |
} |
134 |
doi = {}, |
135 |
|
135 |
url = {https://arxiv.org/abs/1710.05451} |
136 |
@article{hejazi2021generalization, |
136 |
} |
137 |
author = {Hejazi, Nima S and Boileau, Philippe and {van der Laan}, |
137 |
|
138 |
Mark J and Hubbard, Alan E}, |
138 |
@manual{hejazi2019biotmlebioc, |
139 |
title = {A generalization of moderated statistics to data adaptive |
139 |
author = {Hejazi, Nima S and {van der Laan}, Mark J and Hubbard, Alan |
140 |
semiparametric estimation in high-dimensional biology}, |
140 |
E}, |
141 |
journal={under review}, |
141 |
title = {{biotmle}: {Targeted Learning} with moderated statistics for |
142 |
volume={}, |
142 |
biomarker discovery}, |
143 |
number={}, |
143 |
doi = {10.18129/B9.bioc.biotmle}, |
144 |
pages={}, |
144 |
url = {https://bioconductor.org/packages/biotmle}, |
145 |
year = {2021+}, |
145 |
note = {R package version 1.10.0} |
146 |
publisher={}, |
146 |
} |
147 |
doi = {}, |
147 |
|
148 |
url = {https://arxiv.org/abs/1710.05451} |
148 |
--- |
149 |
} |
149 |
|
150 |
|
150 |
## Related |
151 |
@manual{hejazi2019biotmlebioc, |
151 |
|
152 |
author = {Hejazi, Nima S and {van der Laan}, Mark J and Hubbard, Alan |
152 |
* [R/`biotmleData`](https://github.com/nhejazi/biotmleData) - R package with |
153 |
E}, |
153 |
example experimental data for use with this analysis package. |
154 |
title = {{biotmle}: {Targeted Learning} with moderated statistics for |
154 |
|
155 |
biomarker discovery}, |
155 |
--- |
156 |
doi = {10.18129/B9.bioc.biotmle}, |
156 |
|
157 |
url = {https://bioconductor.org/packages/biotmle}, |
157 |
## Funding |
158 |
note = {R package version 1.10.0} |
158 |
|
159 |
} |
159 |
The development of this software was supported in part through grants from the |
160 |
``` |
160 |
National Institutes of Health: [P42 ES004705-29](https://projectreporter.nih.gov/project_info_details.cfm?aid=9260357&map=y) and [R01 ES021369-05](https://projectreporter.nih.gov/project_info_description.cfm?aid=9210551&icde=37849782&ddparam=&ddvalue=&ddsub=&cr=1&csb=default&cs=ASC&pball=). |
161 |
|
161 |
|
162 |
----- |
162 |
--- |
163 |
|
163 |
|
164 |
## Related |
164 |
## License |
165 |
|
165 |
|
166 |
- [R/`biotmleData`](https://github.com/nhejazi/biotmleData) - R |
166 |
© 2016-2021 [Nima S. Hejazi](https://nimahejazi.org) |
167 |
package with example experimental data for use with this analysis |
167 |
|
168 |
package. |
168 |
The contents of this repository are distributed under the MIT license. See file |
169 |
|
169 |
`LICENSE` for details. |
170 |
----- |
170 |
|
171 |
|
171 |
---
|
172 |
## Funding |
|
|
173 |
|
|
|
174 |
The development of this software was supported in part through grants |
|
|
175 |
from the National Institutes of Health: [P42 |
|
|
176 |
ES004705-29](https://projectreporter.nih.gov/project_info_details.cfm?aid=9260357&map=y) |
|
|
177 |
and [R01 |
|
|
178 |
ES021369-05](https://projectreporter.nih.gov/project_info_description.cfm?aid=9210551&icde=37849782&ddparam=&ddvalue=&ddsub=&cr=1&csb=default&cs=ASC&pball=). |
|
|
179 |
|
|
|
180 |
----- |
|
|
181 |
|
|
|
182 |
## License |
|
|
183 |
|
|
|
184 |
© 2016-2021 [Nima S. Hejazi](https://nimahejazi.org) |
|
|
185 |
|
|
|
186 |
The contents of this repository are distributed under the MIT license. |
|
|
187 |
See file `LICENSE` for details. |
|
|
188 |
|
|
|
189 |
----- |
|
|
190 |
|
|
|
191 |
## References |
|
|
192 |
|
|
|
193 |
<div id="refs" class="references"> |
|
|
194 |
|
|
|
195 |
<div id="ref-bembom2009biomarker"> |
|
|
196 |
|
|
|
197 |
Bembom, Oliver, Maya L Petersen, Soo-Yon Rhee, W Jeffrey Fessel, Sandra |
|
|
198 |
E Sinisi, Robert W Shafer, and Mark J van der Laan. 2009. “Biomarker |
|
|
199 |
Discovery Using Targeted Maximum-Likelihood Estimation: Application to |
|
|
200 |
the Treatment of Antiretroviral-Resistant HIV Infection.” *Statistics in |
|
|
201 |
Medicine* 28 (1): 152–72. |
|
|
202 |
|
|
|
203 |
</div> |
|
|
204 |
|
|
|
205 |
<div id="ref-hejazi2021generalization"> |
|
|
206 |
|
|
|
207 |
Hejazi, Nima S, Mark J van der Laan, and Alan E Hubbard. 2021. “A |
|
|
208 |
Generalization of Moderated Statistics to Data Adaptive Semiparametric |
|
|
209 |
Estimation in High-Dimensional Biology.” *Under Review*. |
|
|
210 |
<https://arxiv.org/abs/1710.05451>. |
|
|
211 |
|
|
|
212 |
</div> |
|
|
213 |
|
|
|
214 |
<div id="ref-rosenblum2009confidence"> |
|
|
215 |
|
|
|
216 |
Rosenblum, Michael A, and Mark J van der Laan. 2009. “Confidence |
|
|
217 |
Intervals for the Population Mean Tailored to Small Sample Sizes, with |
|
|
218 |
Applications to Survey Sampling.” *The International Journal of |
|
|
219 |
Biostatistics* 5 (1). |
|
|
220 |
|
|
|
221 |
</div> |
|
|
222 |
|
|
|
223 |
<div id="ref-smyth2004linear"> |
|
|
224 |
|
|
|
225 |
Smyth, Gordon K. 2004. “Linear Models and Empirical Bayes Methods for |
|
|
226 |
Assessing Differential Expression in Microarray Experiments.” |
|
|
227 |
*Statistical Applications in Genetics and Molecular Biology* 3 (1): |
|
|
228 |
1–25. <https://doi.org/10.2202/1544-6115.1027>. |
|
|
229 |
|
|
|
230 |
</div> |
|
|
231 |
|
|
|
232 |
<div id="ref-tsiatis2007semiparametric"> |
|
|
233 |
|
|
|
234 |
Tsiatis, Anastasios. 2007. *Semiparametric Theory and Missing Data*. |
|
|
235 |
Springer Science & Business Media. |
|
|
236 |
|
|
|
237 |
</div> |
|
|
238 |
|
|
|
239 |
<div id="ref-vdl2011targeted"> |
|
|
240 |
|
|
|
241 |
van der Laan, Mark J, and Sherri Rose. 2011. *Targeted Learning: Causal |
|
|
242 |
Inference for Observational and Experimental Data*. Springer Science & |
|
|
243 |
Business Media. |
|
|
244 |
|
|
|
245 |
</div> |
|
|
246 |
|
|
|
247 |
<div id="ref-vdl2018targeted"> |
|
|
248 |
|
|
|
249 |
van der Laan, Mark J, and Sherri Rose. 2018. *Targeted Learning in Data |
|
|
250 |
Science: Causal Inference for Complex Longitudinal Studies*. Springer |
|
|
251 |
Science & Business Media. |
|
|
252 |
|
|
|
253 |
</div> |
|
|
254 |
|
|
|
255 |
</div> |
|
|