% b/man/predictionAccuracyByCv.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pRRophetic.R
\name{predictionAccuracyByCv}
\alias{predictionAccuracyByCv}
\title{Cross validation on training dataset}
\usage{
predictionAccuracyByCv(
  trainingExprData,
  trainingPtype,
  testExprData = -1,
  cvFold = -1,
  powerTransformPhenotype = TRUE,
  batchCorrect = "eb",
  removeLowVaryingGenes = 0.2,
  minNumSamples = 10,
  selection = 1
)
}
\arguments{
\item{trainingExprData}{The training data. A matrix of expression levels; rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "testExprData".}

\item{trainingPtype}{The known phenotype for "trainingExprData". A numeric vector which MUST be the same length as the number of columns of "trainingExprData".}

\item{testExprData}{The test data where the phenotype will be estimated. It is a matrix of expression levels; rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "trainingExprData".}

\item{cvFold}{Specify the "fold" required for cross validation. "-1" will do leave one out cross validation (LOOCV).}

\item{powerTransformPhenotype}{Should the phenotype be power transformed before we fit the regression model? Defaults to TRUE; set to FALSE if the phenotype is already known to be close to normally distributed.}

\item{batchCorrect}{How should training and test data matrices be homogenized? Choices are "eb" (default) for ComBat, "qn" for quantile normalization or "none" for no homogenization.}

\item{removeLowVaryingGenes}{What proportion of low varying genes should be removed? 20 percent by default.}

\item{minNumSamples}{How many training and test samples are required. Print an error if below this threshold.}

\item{selection}{How should duplicate gene ids be handled? The default is 1, which summarizes duplicates by their mean. Use 2 to discard all duplicates, or -1 to ask the user.}
}
\value{
An object of class "pRRopheticCv", which is a list with two members, "cvPtype" and "realPtype", which correspond to the cross validation predicted phenotype and the user provided measured phenotype respectively.
}
\description{
This function performs cross validation on a training set to estimate prediction accuracy.
If the actual test set is provided, the two datasets can be subset and homogenized before the
cross validation analysis is performed. This may improve the estimate of prediction accuracy.
}
\author{
Paul Geeleher, Nancy Cox, R. Stephanie Huang
}
\keyword{internal}
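\examples{
# A minimal sketch of how the arguments and return value described above
# fit together. It is not taken from the package's own examples.
# Assumptions: the simulated matrix "exprMat" and phenotype "ptype" are
# hypothetical stand-ins for real data, and the pRRophetic package is
# attached. Because this topic is marked internal, the function may need to
# be called as pRRophetic:::predictionAccuracyByCv if it is not exported.
\dontrun{
set.seed(1)
nGenes <- 200
nSamples <- 30

# Genes in rows, samples in columns, with rownames() set to gene ids.
exprMat <- matrix(
  rnorm(nGenes * nSamples),
  nrow = nGenes,
  dimnames = list(
    paste0("gene", seq_len(nGenes)),
    paste0("sample", seq_len(nSamples))
  )
)

# Hypothetical positive-valued phenotype, one value per column of exprMat,
# loosely driven by the first five genes.
ptype <- exp(colMeans(exprMat[1:5, ]) + rnorm(nSamples, sd = 0.1))

# 5-fold cross validation on the training data only (no test set supplied).
cvOut <- predictionAccuracyByCv(exprMat, ptype, cvFold = 5)

# Compare the cross validation predictions with the measured phenotype.
cor(cvOut$cvPtype, cvOut$realPtype)
}
}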