
a b/partyMod/tests/mob.Rout.save
R Under development (unstable) (2014-06-29 r66051) -- "Unsuffered Consequences"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library("party")
Loading required package: grid
Loading required package: zoo

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric

Loading required package: sandwich
Loading required package: strucchange
Loading required package: modeltools
Loading required package: stats4
> 
> data("BostonHousing", package = "mlbench")
> BostonHousing$lstat <- log(BostonHousing$lstat)
> BostonHousing$rm <- BostonHousing$rm^2
> BostonHousing$chas <- factor(BostonHousing$chas, levels = 0:1, labels = c("no", "yes"))
> BostonHousing$rad <- factor(BostonHousing$rad, ordered = TRUE)
> fmBH <- mob(medv ~ lstat + rm | zn + indus + chas + nox + age + dis + rad + tax + crim + b + ptratio,
+   control = mob_control(minsplit = 40, verbose = TRUE),
+   data = BostonHousing, model = linearModel)

-------------------------------------------
Fluctuation tests of splitting variables:
                    zn        indus         chas          nox          age
statistic 3.363356e+01 6.532322e+01 2.275635e+01 8.136281e+01 3.675850e+01
p.value   1.023987e-04 1.363602e-11 4.993053e-04 3.489797e-15 2.263798e-05
                   dis          rad          tax         crim            b
statistic 6.848533e+01 1.153641e+02 9.068440e+01 8.655065e+01 3.627629e+01
p.value   2.693904e-12 7.087680e-13 2.735524e-17 2.356348e-16 2.860686e-05
               ptratio
statistic 7.221524e+01
p.value   3.953623e-13

Best splitting variable: tax
Perform split? yes
-------------------------------------------

Node properties:
tax <= 432; criterion = 1, statistic = 115.364

-------------------------------------------
Fluctuation tests of splitting variables:
                    zn      indus      chas          nox        age
statistic 27.785009791 21.3329346 8.0272421 23.774323202 11.9204284
p.value    0.001494064  0.0285193 0.4005192  0.009518732  0.7666366
                   dis          rad          tax         crim         b
statistic 24.268011081 50.481593270 3.523250e+01 3.276813e+01 9.0363245
p.value    0.007601532  0.003437763 4.275527e-05 1.404487e-04 0.9871502
               ptratio
statistic 4.510680e+01
p.value   3.309747e-07

Best splitting variable: ptratio
Perform split? yes
-------------------------------------------

Node properties:
ptratio <= 15.2; criterion = 1, statistic = 50.482

-------------------------------------------
Fluctuation tests of splitting variables:
                    zn       indus        chas         nox         age
statistic 3.233350e+01 22.26864036 12.93407112 22.10510234 20.41295354
p.value   1.229678e-04  0.01504788  0.05259509  0.01622098  0.03499731
                 dis          rad          tax        crim         b
statistic 17.7204735 5.526565e+01 2.879128e+01 20.28503194 6.5549665
p.value    0.1091769 7.112214e-04 6.916307e-04  0.03706934 0.9999522
               ptratio
statistic 4.789850e+01
p.value   4.738855e-08

Best splitting variable: ptratio
Perform split? yes
-------------------------------------------

Node properties:
ptratio <= 19.6; criterion = 1, statistic = 55.266

-------------------------------------------
Fluctuation tests of splitting variables:
                 zn      indus      chas        nox       age        dis
statistic 14.971474 14.6477733 7.1172962 14.3455158 8.2176363 16.1112185
p.value    0.280361  0.3134649 0.5405005  0.3467974 0.9906672  0.1847818
                  rad          tax     crim         b    ptratio
statistic 43.17824350 3.447271e+01 9.340075 8.7773142 10.8469969
p.value    0.03281124 4.281939e-05 0.952996 0.9772696  0.8202694

Best splitting variable: tax
Perform split? yes
-------------------------------------------

Node properties:
tax <= 265; criterion = 1, statistic = 43.178

-------------------------------------------
Fluctuation tests of splitting variables:
                 zn     indus     chas       nox        age       dis
statistic 11.998039 7.3971233 7.227770 9.2936189 14.3023962 8.9239826
p.value    0.574642 0.9931875 0.522447 0.9119621  0.2886603 0.9389895
                 rad        tax       crim         b    ptratio
statistic 33.1746444 16.6666129 11.7143758 9.9050903 11.5927528
p.value    0.3926249  0.1206412  0.6153455 0.8539893  0.6328381

Best splitting variable: tax
Perform split? no
-------------------------------------------

-------------------------------------------
Fluctuation tests of splitting variables:
                  zn     indus         chas         nox       age        dis
statistic 10.9187926 9.0917078 2.754081e+01 17.39203006 4.6282349 11.9581600
p.value    0.7091039 0.9172303 4.987667e-05  0.08922543 0.9999992  0.5607267
                rad        tax     crim        b   ptratio
statistic 0.2557803 10.9076165 3.711175 3.158329 9.8865054
p.value   1.0000000  0.7106612 1.000000 1.000000 0.8410064

Best splitting variable: chas
Perform split? yes
-------------------------------------------

Splitting factor variable, objective function: 
 no 
Inf 

No admissable split found in 'chas'
> fmBH
1) tax <= 432; criterion = 1, statistic = 115.364
  2) ptratio <= 15.2; criterion = 1, statistic = 50.482
    3)*  weights = 72 
Terminal node model
Linear model with coefficients:
(Intercept)        lstat           rm  
     9.2349      -4.9391       0.6859  

  2) ptratio > 15.2
    4) ptratio <= 19.6; criterion = 1, statistic = 55.266
      5) tax <= 265; criterion = 1, statistic = 43.178
        6)*  weights = 63 
Terminal node model
Linear model with coefficients:
(Intercept)        lstat           rm  
     3.9637      -2.7663       0.6881  

      5) tax > 265
        7)*  weights = 162 
Terminal node model
Linear model with coefficients:
(Intercept)        lstat           rm  
    -1.7984      -0.2677       0.6539  

    4) ptratio > 19.6
      8)*  weights = 56 
Terminal node model
Linear model with coefficients:
(Intercept)        lstat           rm  
    17.5865      -4.6190       0.3387  

1) tax > 432
  9)*  weights = 153 
Terminal node model
Linear model with coefficients:
(Intercept)        lstat           rm  
    68.2971     -16.3540      -0.1478  

> summary(fmBH)
$`3`

Call:
NULL

Weighted Residuals:
   Min     1Q Median     3Q    Max 
-7.910  0.000  0.000  0.000  6.632 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  9.23488    3.95128   2.337   0.0223 *  
lstat       -4.93910    0.88285  -5.595 4.14e-07 ***
rm           0.68591    0.05136  13.354  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.413 on 69 degrees of freedom
Multiple R-squared:  0.922, Adjusted R-squared:  0.9197 
F-statistic: 407.8 on 2 and 69 DF,  p-value: < 2.2e-16


$`6`

Call:
NULL

Weighted Residuals:
   Min     1Q Median     3Q    Max 
-4.614  0.000  0.000  0.000 12.473 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.96372    5.00781   0.792  0.43177    
lstat       -2.76629    1.00406  -2.755  0.00776 ** 
rm           0.68813    0.07716   8.918 1.36e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.2 on 60 degrees of freedom
Multiple R-squared:  0.8176,    Adjusted R-squared:  0.8115 
F-statistic: 134.5 on 2 and 60 DF,  p-value: < 2.2e-16


$`7`

Call:
NULL

Weighted Residuals:
   Min     1Q Median     3Q    Max 
-9.092  0.000  0.000  0.000 10.236 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1.79839    2.84702  -0.632    0.529    
lstat       -0.26771    0.69581  -0.385    0.701    
rm           0.65389    0.03757  17.404   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.652 on 159 degrees of freedom
Multiple R-squared:  0.8173,    Adjusted R-squared:  0.815 
F-statistic: 355.6 on 2 and 159 DF,  p-value: < 2.2e-16


$`8`

Call:
NULL

Weighted Residuals:
   Min     1Q Median     3Q    Max 
-8.466  0.000  0.000  0.000  4.947 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 17.58649    4.21666   4.171 0.000113 ***
lstat       -4.61897    0.84025  -5.497 1.13e-06 ***
rm           0.33867    0.07574   4.472 4.13e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.197 on 53 degrees of freedom
Multiple R-squared:  0.6446,    Adjusted R-squared:  0.6312 
F-statistic: 48.07 on 2 and 53 DF,  p-value: 1.238e-12


$`9`

Call:
NULL

Weighted Residuals:
   Min     1Q Median     3Q    Max 
-10.56   0.00   0.00   0.00  24.28 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  68.29709    3.83284  17.819  < 2e-16 ***
lstat       -16.35401    0.96577 -16.934  < 2e-16 ***
rm           -0.14779    0.05047  -2.928  0.00394 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.689 on 150 degrees of freedom
Multiple R-squared:  0.6649,    Adjusted R-squared:  0.6604 
F-statistic: 148.8 on 2 and 150 DF,  p-value: < 2.2e-16


> 
> ### check for one-node tree
> fmBH <- try(mob(medv ~ lstat + rm | zn, control = mob_control(minsplit = 4000, verbose = TRUE),
+   data = BostonHousing, model = linearModel))
> stopifnot(class(fmBH) != "try-error")
> 
> 
> data("PimaIndiansDiabetes", package = "mlbench")
> fmPID <- mob(diabetes ~ glucose | pregnant + pressure + triceps + insulin + mass + pedigree + age,
+   control = mob_control(verbose = TRUE),
+   data = PimaIndiansDiabetes, model = glinearModel, family = binomial())

-------------------------------------------
Fluctuation tests of splitting variables:
              pregnant  pressure     triceps   insulin         mass    pedigree
statistic 2.988542e+01 7.5024235 15.94095417 6.5969297 4.880982e+01 18.33476114
p.value   9.778517e-05 0.9104325  0.06660773 0.9701412 8.316815e-09  0.02275017
                   age
statistic 4.351412e+01
p.value   1.182811e-07

Best splitting variable: mass
Perform split? yes
-------------------------------------------

Node properties:
mass <= 26.3; criterion = 1, statistic = 48.81

-------------------------------------------
Fluctuation tests of splitting variables:
            pregnant pressure  triceps   insulin       mass  pedigree       age
statistic 10.3924070 4.353740 5.911229 3.7855726 10.4748907 3.6263026 6.0978662
p.value    0.4903221 0.999824 0.986895 0.9999888  0.4785454 0.9999958 0.9817742

Best splitting variable: mass
Perform split? no
-------------------------------------------

-------------------------------------------
Fluctuation tests of splitting variables:
              pregnant  pressure  triceps   insulin      mass    pedigree
statistic 2.673912e+01 6.1757583 7.346804 7.8963977 9.1545915 17.96438828
p.value   4.434356e-04 0.9845137 0.922646 0.8700398 0.7033477  0.02677105
                   age
statistic 3.498466e+01
p.value   8.098640e-06

Best splitting variable: age
Perform split? yes
-------------------------------------------

Node properties:
age <= 30; criterion = 1, statistic = 34.985

-------------------------------------------
Fluctuation tests of splitting variables:
           pregnant  pressure  triceps   insulin      mass pedigree       age
statistic 4.3749991 9.4006532 7.661457 9.0583568 5.4287861 5.640420 6.3088818
p.value   0.9998989 0.6656073 0.893893 0.7168659 0.9967316 0.994611 0.9804133

Best splitting variable: pressure
Perform split? no
-------------------------------------------

-------------------------------------------
Fluctuation tests of splitting variables:
           pregnant pressure   triceps   insulin      mass   pedigree
statistic 7.7282903 1.935271 3.6078314 4.9703223 10.136944 11.9004129
p.value   0.8882324 1.000000 0.9999987 0.9991162  0.555382  0.3205095
                 age
statistic 10.1330698
p.value    0.5559631

Best splitting variable: pedigree
Perform split? no
-------------------------------------------
> fmPID
1) mass <= 26.3; criterion = 1, statistic = 48.81
  2)*  weights = 167 
Terminal node model
Binomial GLM with coefficients:
(Intercept)      glucose  
   -9.95151      0.05871  

1) mass > 26.3
  3) age <= 30; criterion = 1, statistic = 34.985
    4)*  weights = 304 
Terminal node model
Binomial GLM with coefficients:
(Intercept)      glucose  
   -6.70559      0.04684  

  3) age > 30
    5)*  weights = 297 
Terminal node model
Binomial GLM with coefficients:
(Intercept)      glucose  
   -2.77095      0.02354  

> summary(fmPID)
$`2`

Call:
NULL

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.817   0.000   0.000   0.000   2.718  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -9.95151    1.74013  -5.719 1.07e-08 ***
glucose      0.05871    0.01211   4.846 1.26e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 96.202  on 166  degrees of freedom
Residual deviance: 60.502  on 165  degrees of freedom
AIC: 64.502

Number of Fisher Scoring iterations: 6


$`4`

Call:
NULL

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.9438  -0.3484   0.0000   0.0000   2.4893  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -6.705586   0.800193  -8.380  < 2e-16 ***
glucose      0.046837   0.006208   7.544 4.54e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 364.01  on 303  degrees of freedom
Residual deviance: 280.98  on 302  degrees of freedom
AIC: 284.98

Number of Fisher Scoring iterations: 5


$`5`

Call:
NULL

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-2.005   0.000   0.000   0.000   2.380  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.770954   0.548241  -5.054 4.32e-07 ***
glucose      0.023536   0.004202   5.601 2.13e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 407.11  on 296  degrees of freedom
Residual deviance: 369.43  on 295  degrees of freedom
AIC: 373.43

Number of Fisher Scoring iterations: 4


> 
> 
> proc.time()
   user  system elapsed 
  3.252   0.076   3.332