--- a
+++ b/partyMod/man/readingSkills.Rd
@@ -0,0 +1,49 @@
+\name{readingSkills}
+\alias{readingSkills}
+\docType{data}
+\title{ Reading Skills }
+\description{
+  A toy data set illustrating the spurious correlation
+  between reading skills and shoe size in schoolchildren.
+}
+\usage{data("readingSkills")}
+\format{
+  A data frame with 200 observations on the following 4 variables.
+  \describe{
+    \item{\code{nativeSpeaker}}{a factor with levels \code{no} and \code{yes},
+                                where \code{yes} indicates that the child
+                                is a native speaker of the language of the reading test.}
+    \item{\code{age}}{age of the child in years.}
+    \item{\code{shoeSize}}{shoe size of the child in cm.}
+    \item{\code{score}}{raw score on the reading test.}
+  }
+}
+\details{
+
+  In this artificial data set, which was generated by means of a linear model,
+  \code{age} and \code{nativeSpeaker} are actual predictors of the 
+  \code{score}, while the spurious correlation between \code{score} and 
+  \code{shoeSize} is merely caused by the fact that both depend on \code{age}.  
+
+  The true predictors can be identified, e.g., by means of partial correlations,
+  standardized beta coefficients in linear models, or the conditional random
+  forest variable importance, but not by means of the standard (unconditional)
+  random forest variable importance (see the examples).
+
+}
+\examples{
+
+   set.seed(290875)
+   readingSkills.cf <- cforest(score ~ ., data = readingSkills,
+       control = cforest_unbiased(mtry = 2, ntree = 50))
+
+   # standard importance
+   varimp(readingSkills.cf)
+   # the same modulo random variation
+   varimp(readingSkills.cf, pre1.0_0 = TRUE)
+
+   # conditional importance, may take a while...
+   varimp(readingSkills.cf, conditional = TRUE) 
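+
+   # a minimal sketch, assuming only base R: the partial correlation of
+   # score and shoeSize given age, obtained by correlating the residuals
+   # left after regressing each variable on age
+   # (res.score and res.shoe are illustrative names)
+   res.score <- resid(lm(score ~ age, data = readingSkills))
+   res.shoe <- resid(lm(shoeSize ~ age, data = readingSkills))
+   # marginal correlation, which also reflects the shared dependence on age
+   cor(readingSkills$score, readingSkills$shoeSize)
+   # partial correlation given age
+   cor(res.score, res.shoe)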
+
+}
+\keyword{datasets}