a/README.md b/README.md
<div class="sc-cmRAlD dkqmWS"><div class="sc-UEtKG dGqiYy sc-flttKd cguEtd"><div class="sc-fqwslf gsqkEc"><div class="sc-cBQMlg kAHhUk"><h2 class="sc-dcKlJK sc-cVttbi gqEuPW ksnHgj">About Dataset</h2></div></div></div><div class="sc-jgvlka jFuPjz"><div class="sc-gzqKSP tNtjD"><div style="min-height: 80px;"><div class="sc-etVRix jqYJaa sc-bMmLMY ZURWJ"><pre class="uc-code-block">
## About Dataset

Data on recurrences of bladder cancer, widely used to demonstrate methodology for recurrent event modelling.

<table>
<thead>
<tr><th>Column</th><th>Description</th><th>Format</th></tr>
</thead>
<tbody>
<tr><td><strong>Bladder Dataset 1</strong></td><td></td><td></td></tr>
<tr><td>id</td><td>Patient ID</td><td></td></tr>
<tr><td>treatment</td><td>Treatment received</td><td>Placebo, pyridoxine (vitamin B6), or thiotepa</td></tr>
<tr><td>number</td><td>Initial number of tumors</td><td>8=8 or more</td></tr>
<tr><td>size</td><td>Size (cm) of largest initial tumor</td><td></td></tr>
<tr><td>recur</td><td>Number of recurrences</td><td></td></tr>
<tr><td>start</td><td>Start time of each interval</td><td></td></tr>
<tr><td>stop</td><td>End time of each interval</td><td></td></tr>
<tr><td>status</td><td>End of interval code</td><td>0=censored, 1=recurrence, 2=death from bladder disease, 3=death other/unknown cause</td></tr>
<tr><td>rtumor</td><td>Number of tumors found at recurrence</td><td></td></tr>
<tr><td>rsize</td><td>Size of largest tumor at recurrence</td><td></td></tr>
<tr><td>enum</td><td>Event number (observation number within patient)</td><td></td></tr>
<tr><td><strong>Bladder Dataset 0</strong></td><td></td><td></td></tr>
<tr><td>id</td><td>Patient ID</td><td></td></tr>
<tr><td>rx</td><td>Treatment received</td><td>1=placebo, 2=thiotepa</td></tr>
<tr><td>number</td><td>Initial number of tumors</td><td>8=8 or more</td></tr>
<tr><td>size</td><td>Size (cm) of largest initial tumor</td><td></td></tr>
<tr><td>stop</td><td>Recurrence or censoring time</td><td></td></tr>
<tr><td>enum</td><td>Which recurrence (up to 4)</td><td></td></tr>
<tr><td><strong>Bladder Dataset 2</strong></td><td></td><td></td></tr>
<tr><td>id</td><td>Patient ID</td><td></td></tr>
<tr><td>rx</td><td>Treatment received</td><td>1=placebo, 2=thiotepa</td></tr>
<tr><td>number</td><td>Initial number of tumors</td><td>8=8 or more</td></tr>
<tr><td>size</td><td>Size (cm) of largest initial tumor</td><td></td></tr>
<tr><td>start</td><td>Start of interval (0 or previous recurrence time)</td><td></td></tr>
<tr><td>stop</td><td>Recurrence or censoring time</td><td></td></tr>
<tr><td>enum</td><td>Which recurrence (up to 4)</td><td></td></tr>
</tbody>
</table>
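
The `status` codes listed for Bladder Dataset 1 in the table can be handled with a small lookup. The sketch below is only an illustration in plain Python: the names `STATUS_LABELS`, `describe_status`, and `is_recurrence` are hypothetical helpers, not part of the data set, and the example codes are made up.

```python
# Codes for the `status` column of Bladder Dataset 1, copied from the table.
STATUS_LABELS = {
    0: "censored",
    1: "recurrence",
    2: "death from bladder disease",
    3: "death other/unknown cause",
}


def describe_status(code: int) -> str:
    """Return the label for a status code; raise on anything outside 0-3."""
    try:
        return STATUS_LABELS[code]
    except KeyError:
        raise ValueError(f"unknown status code: {code!r}") from None


def is_recurrence(code: int) -> int:
    """Collapse the four codes to a binary indicator: 1 for recurrence, else 0."""
    describe_status(code)  # validates the code before collapsing it
    return 1 if code == 1 else 0


# Made-up example codes, not taken from the data.
example_codes = [1, 1, 0, 3]
print([describe_status(c) for c in example_codes])
print([is_recurrence(c) for c in example_codes])
```

Collapsing to a binary indicator treats death from any cause the same as censoring, which is a modelling choice rather than a property of the raw codes.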

Bladder is the data set that appears most commonly in the literature. It uses only the 85 subjects with nonzero follow-up who were assigned to either thiotepa or placebo, and only the first four recurrences for any patient. The status variable is 1 for recurrence and 0 for everything else (including death for any reason). The data set is laid out in the competing risks format of the paper by Wei, Lin, and Weissfeld (WLW).

Bladder1 is the full data set from the study. It contains all three treatment arms and all recurrences for 118 subjects; the maximum observed number of recurrences is 9.

Bladder2 uses the same subset of subjects as bladder, but formatted in the (start, stop] or Andersen-Gill (AG) style. A common programming mistake when converting the WLW layout to the AG layout adds extra follow-up time for 12 subjects: all those whose follow-up extends beyond their fourth recurrence. This "follow-up" is a side effect of throwing away all events after the fourth while retaining the last follow-up time variable from the original data. The bladder2 data set found here does not make this mistake, but some analyses in the literature have done so; the mistake adds a small amount of immortal time bias and shrinks the fitted coefficients towards zero.
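
The difference between the two layouts, and the pitfall just described, can be sketched on a toy subject. This is an illustration in plain Python, not the code used to build the data sets: the function names, the `event` field, the `keep_late_followup` flag, and the example times are all hypothetical.

```python
MAX_EVENTS = 4  # only the first four recurrences are kept


def wlw_rows(recurrences, followup, max_events=MAX_EVENTS):
    """WLW layout: one row per stratum 1..max_events, each timed from 0."""
    times = sorted(recurrences)
    rows = []
    for k in range(1, max_events + 1):
        if k <= len(times):
            rows.append({"enum": k, "stop": times[k - 1], "event": 1})
        else:
            # No k-th recurrence observed: censored at the last follow-up time.
            rows.append({"enum": k, "stop": followup, "event": 0})
    return rows


def ag_rows(recurrences, followup, max_events=MAX_EVENTS,
            keep_late_followup=False):
    """AG layout: consecutive (start, stop] intervals per subject.

    With keep_late_followup=True the function reproduces the mistake:
    events after the fourth are dropped, yet the original last follow-up
    time is still used to add a trailing censored interval.
    """
    kept = sorted(recurrences)[:max_events]
    rows, start = [], 0
    for k, t in enumerate(kept, start=1):
        rows.append({"enum": k, "start": start, "stop": t, "event": 1})
        start = t
    reached_cap = len(kept) == max_events
    if followup > start and (keep_late_followup or not reached_cap):
        rows.append({"enum": len(kept) + 1, "start": start,
                     "stop": followup, "event": 0})
    return rows


def total_time(rows):
    """Total time at risk contributed by a subject's AG rows."""
    return sum(r["stop"] - r["start"] for r in rows)


# Hypothetical subject: five recurrences, followed until time 40.
recurrences, followup = [3, 9, 15, 21, 30], 40
correct = ag_rows(recurrences, followup)
mistaken = ag_rows(recurrences, followup, keep_late_followup=True)
print(total_time(correct), total_time(mistaken))
```

In the mistaken version the subject contributes event-free time after the fourth recurrence even though later recurrences were discarded, which is the source of the bias towards zero.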
<pre class="uc-code-block">
<code>Bladder2 uses the same subset of subjects as bladder, but formatted in the (start, stop] or Andersen-Gill style. Note that in transforming from the WLW to the AG style data set there is a quite common programming mistake that leads to extra follow-up time for 12 subjects: all those with follow-up beyond their 4th recurrence. This "follow-up" is a side effect of throwing away all events after the fourth while retaining the last follow-up time variable from the original data.
The bladder2 data set found here does not make this mistake, but some analyses in the literature have done so; it results in the addition of a small amount of immortal time bias and shrinks the fitted coefficients towards zero.
</code><div class="uc-code-block-copy-button-wrapper">