476 lines (374 with data), 14.9 kB
R Under development (unstable) (2014-06-29 r66051) -- "Unsuffered Consequences"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library("party")
Loading required package: grid
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Loading required package: sandwich
Loading required package: strucchange
Loading required package: modeltools
Loading required package: stats4
>
> data("BostonHousing", package = "mlbench")
> BostonHousing$lstat <- log(BostonHousing$lstat)
> BostonHousing$rm <- BostonHousing$rm^2
> BostonHousing$chas <- factor(BostonHousing$chas, levels = 0:1, labels = c("no", "yes"))
> BostonHousing$rad <- factor(BostonHousing$rad, ordered = TRUE)
> fmBH <- mob(medv ~ lstat + rm | zn + indus + chas + nox + age + dis + rad + tax + crim + b + ptratio,
+ control = mob_control(minsplit = 40, verbose = TRUE),
+ data = BostonHousing, model = linearModel)
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age
statistic 3.363356e+01 6.532322e+01 2.275635e+01 8.136281e+01 3.675850e+01
p.value 1.023987e-04 1.363602e-11 4.993053e-04 3.489797e-15 2.263798e-05
dis rad tax crim b
statistic 6.848533e+01 1.153641e+02 9.068440e+01 8.655065e+01 3.627629e+01
p.value 2.693904e-12 7.087680e-13 2.735524e-17 2.356348e-16 2.860686e-05
ptratio
statistic 7.221524e+01
p.value 3.953623e-13
Best splitting variable: tax
Perform split? yes
-------------------------------------------
Node properties:
tax <= 432; criterion = 1, statistic = 115.364
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age
statistic 27.785009791 21.3329346 8.0272421 23.774323202 11.9204284
p.value 0.001494064 0.0285193 0.4005192 0.009518732 0.7666366
dis rad tax crim b
statistic 24.268011081 50.481593270 3.523250e+01 3.276813e+01 9.0363245
p.value 0.007601532 0.003437763 4.275527e-05 1.404487e-04 0.9871502
ptratio
statistic 4.510680e+01
p.value 3.309747e-07
Best splitting variable: ptratio
Perform split? yes
-------------------------------------------
Node properties:
ptratio <= 15.2; criterion = 1, statistic = 50.482
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age
statistic 3.233350e+01 22.26864036 12.93407112 22.10510234 20.41295354
p.value 1.229678e-04 0.01504788 0.05259509 0.01622098 0.03499731
dis rad tax crim b
statistic 17.7204735 5.526565e+01 2.879128e+01 20.28503194 6.5549665
p.value 0.1091769 7.112214e-04 6.916307e-04 0.03706934 0.9999522
ptratio
statistic 4.789850e+01
p.value 4.738855e-08
Best splitting variable: ptratio
Perform split? yes
-------------------------------------------
Node properties:
ptratio <= 19.6; criterion = 1, statistic = 55.266
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age dis
statistic 14.971474 14.6477733 7.1172962 14.3455158 8.2176363 16.1112185
p.value 0.280361 0.3134649 0.5405005 0.3467974 0.9906672 0.1847818
rad tax crim b ptratio
statistic 43.17824350 3.447271e+01 9.340075 8.7773142 10.8469969
p.value 0.03281124 4.281939e-05 0.952996 0.9772696 0.8202694
Best splitting variable: tax
Perform split? yes
-------------------------------------------
Node properties:
tax <= 265; criterion = 1, statistic = 43.178
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age dis
statistic 11.998039 7.3971233 7.227770 9.2936189 14.3023962 8.9239826
p.value 0.574642 0.9931875 0.522447 0.9119621 0.2886603 0.9389895
rad tax crim b ptratio
statistic 33.1746444 16.6666129 11.7143758 9.9050903 11.5927528
p.value 0.3926249 0.1206412 0.6153455 0.8539893 0.6328381
Best splitting variable: tax
Perform split? no
-------------------------------------------
-------------------------------------------
Fluctuation tests of splitting variables:
zn indus chas nox age dis
statistic 10.9187926 9.0917078 2.754081e+01 17.39203006 4.6282349 11.9581600
p.value 0.7091039 0.9172303 4.987667e-05 0.08922543 0.9999992 0.5607267
rad tax crim b ptratio
statistic 0.2557803 10.9076165 3.711175 3.158329 9.8865054
p.value 1.0000000 0.7106612 1.000000 1.000000 0.8410064
Best splitting variable: chas
Perform split? yes
-------------------------------------------
Splitting factor variable, objective function:
no
Inf
No admissable split found in 'chas'
> fmBH
1) tax <= 432; criterion = 1, statistic = 115.364
2) ptratio <= 15.2; criterion = 1, statistic = 50.482
3)* weights = 72
Terminal node model
Linear model with coefficients:
(Intercept) lstat rm
9.2349 -4.9391 0.6859
2) ptratio > 15.2
4) ptratio <= 19.6; criterion = 1, statistic = 55.266
5) tax <= 265; criterion = 1, statistic = 43.178
6)* weights = 63
Terminal node model
Linear model with coefficients:
(Intercept) lstat rm
3.9637 -2.7663 0.6881
5) tax > 265
7)* weights = 162
Terminal node model
Linear model with coefficients:
(Intercept) lstat rm
-1.7984 -0.2677 0.6539
4) ptratio > 19.6
8)* weights = 56
Terminal node model
Linear model with coefficients:
(Intercept) lstat rm
17.5865 -4.6190 0.3387
1) tax > 432
9)* weights = 153
Terminal node model
Linear model with coefficients:
(Intercept) lstat rm
68.2971 -16.3540 -0.1478
> summary(fmBH)
$`3`
Call:
NULL
Weighted Residuals:
Min 1Q Median 3Q Max
-7.910 0.000 0.000 0.000 6.632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.23488 3.95128 2.337 0.0223 *
lstat -4.93910 0.88285 -5.595 4.14e-07 ***
rm 0.68591 0.05136 13.354 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.413 on 69 degrees of freedom
Multiple R-squared: 0.922, Adjusted R-squared: 0.9197
F-statistic: 407.8 on 2 and 69 DF, p-value: < 2.2e-16
$`6`
Call:
NULL
Weighted Residuals:
Min 1Q Median 3Q Max
-4.614 0.000 0.000 0.000 12.473
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.96372 5.00781 0.792 0.43177
lstat -2.76629 1.00406 -2.755 0.00776 **
rm 0.68813 0.07716 8.918 1.36e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.2 on 60 degrees of freedom
Multiple R-squared: 0.8176, Adjusted R-squared: 0.8115
F-statistic: 134.5 on 2 and 60 DF, p-value: < 2.2e-16
$`7`
Call:
NULL
Weighted Residuals:
Min 1Q Median 3Q Max
-9.092 0.000 0.000 0.000 10.236
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.79839 2.84702 -0.632 0.529
lstat -0.26771 0.69581 -0.385 0.701
rm 0.65389 0.03757 17.404 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.652 on 159 degrees of freedom
Multiple R-squared: 0.8173, Adjusted R-squared: 0.815
F-statistic: 355.6 on 2 and 159 DF, p-value: < 2.2e-16
$`8`
Call:
NULL
Weighted Residuals:
Min 1Q Median 3Q Max
-8.466 0.000 0.000 0.000 4.947
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 17.58649 4.21666 4.171 0.000113 ***
lstat -4.61897 0.84025 -5.497 1.13e-06 ***
rm 0.33867 0.07574 4.472 4.13e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.197 on 53 degrees of freedom
Multiple R-squared: 0.6446, Adjusted R-squared: 0.6312
F-statistic: 48.07 on 2 and 53 DF, p-value: 1.238e-12
$`9`
Call:
NULL
Weighted Residuals:
Min 1Q Median 3Q Max
-10.56 0.00 0.00 0.00 24.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.29709 3.83284 17.819 < 2e-16 ***
lstat -16.35401 0.96577 -16.934 < 2e-16 ***
rm -0.14779 0.05047 -2.928 0.00394 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.689 on 150 degrees of freedom
Multiple R-squared: 0.6649, Adjusted R-squared: 0.6604
F-statistic: 148.8 on 2 and 150 DF, p-value: < 2.2e-16
>
> ### check for one-node tree
> fmBH <- try(mob(medv ~ lstat + rm | zn, control = mob_control(minsplit = 4000, verbose = TRUE),
+ data = BostonHousing, model = linearModel))
> stopifnot(class(fmBH) != "try-error")
>
>
> data("PimaIndiansDiabetes", package = "mlbench")
> fmPID <- mob(diabetes ~ glucose | pregnant + pressure + triceps + insulin + mass + pedigree + age,
+ control = mob_control(verbose = TRUE),
+ data = PimaIndiansDiabetes, model = glinearModel, family = binomial())
-------------------------------------------
Fluctuation tests of splitting variables:
pregnant pressure triceps insulin mass pedigree
statistic 2.988542e+01 7.5024235 15.94095417 6.5969297 4.880982e+01 18.33476114
p.value 9.778517e-05 0.9104325 0.06660773 0.9701412 8.316815e-09 0.02275017
age
statistic 4.351412e+01
p.value 1.182811e-07
Best splitting variable: mass
Perform split? yes
-------------------------------------------
Node properties:
mass <= 26.3; criterion = 1, statistic = 48.81
-------------------------------------------
Fluctuation tests of splitting variables:
pregnant pressure triceps insulin mass pedigree age
statistic 10.3924070 4.353740 5.911229 3.7855726 10.4748907 3.6263026 6.0978662
p.value 0.4903221 0.999824 0.986895 0.9999888 0.4785454 0.9999958 0.9817742
Best splitting variable: mass
Perform split? no
-------------------------------------------
-------------------------------------------
Fluctuation tests of splitting variables:
pregnant pressure triceps insulin mass pedigree
statistic 2.673912e+01 6.1757583 7.346804 7.8963977 9.1545915 17.96438828
p.value 4.434356e-04 0.9845137 0.922646 0.8700398 0.7033477 0.02677105
age
statistic 3.498466e+01
p.value 8.098640e-06
Best splitting variable: age
Perform split? yes
-------------------------------------------
Node properties:
age <= 30; criterion = 1, statistic = 34.985
-------------------------------------------
Fluctuation tests of splitting variables:
pregnant pressure triceps insulin mass pedigree age
statistic 4.3749991 9.4006532 7.661457 9.0583568 5.4287861 5.640420 6.3088818
p.value 0.9998989 0.6656073 0.893893 0.7168659 0.9967316 0.994611 0.9804133
Best splitting variable: pressure
Perform split? no
-------------------------------------------
-------------------------------------------
Fluctuation tests of splitting variables:
pregnant pressure triceps insulin mass pedigree
statistic 7.7282903 1.935271 3.6078314 4.9703223 10.136944 11.9004129
p.value 0.8882324 1.000000 0.9999987 0.9991162 0.555382 0.3205095
age
statistic 10.1330698
p.value 0.5559631
Best splitting variable: pedigree
Perform split? no
-------------------------------------------
> fmPID
1) mass <= 26.3; criterion = 1, statistic = 48.81
2)* weights = 167
Terminal node model
Binomial GLM with coefficients:
(Intercept) glucose
-9.95151 0.05871
1) mass > 26.3
3) age <= 30; criterion = 1, statistic = 34.985
4)* weights = 304
Terminal node model
Binomial GLM with coefficients:
(Intercept) glucose
-6.70559 0.04684
3) age > 30
5)* weights = 297
Terminal node model
Binomial GLM with coefficients:
(Intercept) glucose
-2.77095 0.02354
> summary(fmPID)
$`2`
Call:
NULL
Deviance Residuals:
Min 1Q Median 3Q Max
-1.817 0.000 0.000 0.000 2.718
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -9.95151 1.74013 -5.719 1.07e-08 ***
glucose 0.05871 0.01211 4.846 1.26e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 96.202 on 166 degrees of freedom
Residual deviance: 60.502 on 165 degrees of freedom
AIC: 64.502
Number of Fisher Scoring iterations: 6
$`4`
Call:
NULL
Deviance Residuals:
Min 1Q Median 3Q Max
-1.9438 -0.3484 0.0000 0.0000 2.4893
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.705586 0.800193 -8.380 < 2e-16 ***
glucose 0.046837 0.006208 7.544 4.54e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 364.01 on 303 degrees of freedom
Residual deviance: 280.98 on 302 degrees of freedom
AIC: 284.98
Number of Fisher Scoring iterations: 5
$`5`
Call:
NULL
Deviance Residuals:
Min 1Q Median 3Q Max
-2.005 0.000 0.000 0.000 2.380
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.770954 0.548241 -5.054 4.32e-07 ***
glucose 0.023536 0.004202 5.601 2.13e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 407.11 on 296 degrees of freedom
Residual deviance: 369.43 on 295 degrees of freedom
AIC: 373.43
Number of Fisher Scoring iterations: 4
>
>
> proc.time()
user system elapsed
3.252 0.076 3.332