|
a |
|
b/README.md |
|
|
1 |
<div class="sc-cmRAlD dkqmWS"><div class="sc-UEtKG dGqiYy sc-flttKd cguEtd"><div class="sc-fqwslf gsqkEc"><div class="sc-cBQMlg kAHhUk"><h2 class="sc-dcKlJK sc-cVttbi gqEuPW ksnHgj">About Dataset</h2></div></div></div><div class="sc-jgvlka jFuPjz"><div class="sc-gzqKSP tNtjD"><div style="min-height: 80px;"><div class="sc-etVRix jqYJaa sc-bMmLMY ZURWJ"><pre class="uc-code-block"><code>Data on recurrences of bladder cancer, widely used to demonstrate methodology for recurrent event modelling. |
|
|
2 |
</code></pre> |
|
|
3 |
<table> |
|
|
4 |
<thead> |
|
|
5 |
<tr> |
|
|
6 |
<th>Column</th> |
|
|
7 |
<th>Description</th> |
|
|
8 |
<th>Format</th> |
|
|
9 |
</tr> |
|
|
10 |
</thead> |
|
|
11 |
<tbody> |
|
|
12 |
<tr> |
|
|
13 |
<td><strong>Bladder Dataset 1</strong></td> |
|
|
14 |
<td></td> |
|
|
15 |
<td></td> |
|
|
16 |
</tr> |
|
|
17 |
<tr> |
|
|
18 |
<td>id</td> |
|
|
19 |
<td>Patient ID</td> |
|
|
20 |
<td></td> |
|
|
21 |
</tr> |
|
|
22 |
<tr> |
|
|
23 |
<td>treatment</td> |
|
|
24 |
<td>Treatment received</td> |
|
|
25 |
<td>Placebo, pyridoxine (vitamin B6), or thiotepa</td> |
|
|
26 |
</tr> |
|
|
27 |
<tr> |
|
|
28 |
<td>number</td> |
|
|
29 |
<td>Initial number of tumors</td> |
|
|
30 |
<td>8=8 or more</td> |
|
|
31 |
</tr> |
|
|
32 |
<tr> |
|
|
33 |
<td>size</td> |
|
|
34 |
<td>Size (cm) of largest initial tumor</td> |
|
|
35 |
<td></td> |
|
|
36 |
</tr> |
|
|
37 |
<tr> |
|
|
38 |
<td>recur</td> |
|
|
39 |
<td>Number of recurrences</td> |
|
|
40 |
<td></td> |
|
|
41 |
</tr> |
|
|
42 |
<tr> |
|
|
43 |
<td>start</td> |
|
|
44 |
<td>Start time of each interval</td> |
|
|
45 |
<td></td> |
|
|
46 |
</tr> |
|
|
47 |
<tr> |
|
|
48 |
<td>stop</td> |
|
|
49 |
<td>End time of each interval</td> |
|
|
50 |
<td></td> |
|
|
51 |
</tr> |
|
|
52 |
<tr> |
|
|
53 |
<td>status</td> |
|
|
54 |
<td>End of interval code</td> |
|
|
55 |
<td>0=censored, 1=recurrence, 2=death from bladder disease, 3=death other/unknown cause</td> |
|
|
56 |
</tr> |
|
|
57 |
<tr> |
|
|
58 |
<td>rtumor</td> |
|
|
59 |
<td>Number of tumors found at recurrence</td> |
|
|
60 |
<td></td> |
|
|
61 |
</tr> |
|
|
62 |
<tr> |
|
|
63 |
<td>rsize</td> |
|
|
64 |
<td>Size of largest tumor at recurrence</td> |
|
|
65 |
<td></td> |
|
|
66 |
</tr> |
|
|
67 |
<tr> |
|
|
68 |
<td>enum</td> |
|
|
69 |
<td>Event number (observation number within patient)</td> |
|
|
70 |
<td></td> |
|
|
71 |
</tr> |
|
|
72 |
<tr> |
|
|
73 |
<td><strong>Bladder Dataset 0</strong></td> |
|
|
74 |
<td></td> |
|
|
75 |
<td></td> |
|
|
76 |
</tr> |
|
|
77 |
<tr> |
|
|
78 |
<td>id</td> |
|
|
79 |
<td>Patient ID</td> |
|
|
80 |
<td></td> |
|
|
81 |
</tr> |
|
|
82 |
<tr> |
|
|
83 |
<td>rx</td> |
|
|
84 |
<td>Treatment received</td> |
|
|
85 |
<td>1=placebo, 2=thiotepa</td> |
|
|
86 |
</tr> |
|
|
87 |
<tr> |
|
|
88 |
<td>number</td> |
|
|
89 |
<td>Initial number of tumors</td> |
|
|
90 |
<td>8=8 or more</td> |
|
|
91 |
</tr> |
|
|
92 |
<tr> |
|
|
93 |
<td>size</td> |
|
|
94 |
<td>Size (cm) of largest initial tumor</td> |
|
|
95 |
<td></td> |
|
|
96 |
</tr> |
|
|
97 |
<tr> |
|
|
98 |
<td>stop</td> |
|
|
99 |
<td>Recurrence or censoring time</td> |
|
|
100 |
<td></td> |
|
|
101 |
</tr> |
|
|
102 |
<tr> |
|
|
103 |
<td>enum</td> |
|
|
104 |
<td>Which recurrence (up to 4)</td> |
|
|
105 |
<td></td> |
|
|
106 |
</tr> |
|
|
107 |
<tr> |
|
|
108 |
<td><strong>Bladder Dataset 2</strong></td> |
|
|
109 |
<td></td> |
|
|
110 |
<td></td> |
|
|
111 |
</tr> |
|
|
112 |
<tr> |
|
|
113 |
<td>id</td> |
|
|
114 |
<td>Patient ID</td> |
|
|
115 |
<td></td> |
|
|
116 |
</tr> |
|
|
117 |
<tr> |
|
|
118 |
<td>rx</td> |
|
|
119 |
<td>Treatment received</td> |
|
|
120 |
<td>1=placebo, 2=thiotepa</td> |
|
|
121 |
</tr> |
|
|
122 |
<tr> |
|
|
123 |
<td>number</td> |
|
|
124 |
<td>Initial number of tumors</td> |
|
|
125 |
<td>8=8 or more</td> |
|
|
126 |
</tr> |
|
|
127 |
<tr> |
|
|
128 |
<td>size</td> |
|
|
129 |
<td>Size (cm) of largest initial tumor</td> |
|
|
130 |
<td></td> |
|
|
131 |
</tr> |
|
|
132 |
<tr> |
|
|
133 |
<td>start</td> |
|
|
134 |
<td>Start of interval (0 or previous recurrence time)</td> |
|
|
135 |
<td></td> |
|
|
136 |
</tr> |
|
|
137 |
<tr> |
|
|
138 |
<td>stop</td> |
|
|
139 |
<td>Recurrence or censoring time</td> |
|
|
140 |
<td></td> |
|
|
141 |
</tr> |
|
|
142 |
<tr> |
|
|
143 |
<td>enum</td> |
|
|
144 |
<td>Which recurrence (up to 4)</td> |
|
|
145 |
<td></td> |
|
|
146 |
</tr> |
|
|
147 |
</tbody> |
|
|
148 |
</table> |
|
|
149 |
<pre class="uc-code-block"><code>Bladder is the data set that appears most commonly in the literature. It uses only the 85 subjects with nonzero follow-up who were assigned to either thiotepa or placebo, and only the first four recurrences for any patient. The status variable is 1 for recurrence and 0 for everything else (including death for any reason). The data set is laid out in the competing risks format of the paper by Wei, Lin, and Weissfeld. |
|
|
150 |
</code></pre> |
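The layout described for Bladder can be illustrated with a small sketch. It assumes the usual reading of the Wei, Lin, and Weissfeld format: each subject contributes one row per potential recurrence (1 to 4), with `stop` measured from study entry, and rows for recurrences that never happened carry the last follow-up time with event 0. The helper names `event_indicator` and `to_wlw`, and the example patient, are hypothetical and not part of the data set.

```python
MAX_EVENTS = 4  # the Bladder data set keeps only the first four recurrences


def event_indicator(status):
    """Collapse the 0-3 end-of-interval code to 1 = recurrence, 0 = anything else."""
    return 1 if status == 1 else 0


def to_wlw(recurrence_times, last_followup):
    """Return one (enum, stop, event) row per potential recurrence.

    recurrence_times: recurrence times measured from study entry.
    last_followup: last time the subject was observed.
    """
    times = sorted(recurrence_times)
    rows = []
    for k in range(1, MAX_EVENTS + 1):
        if k <= len(times):
            # The k-th recurrence was observed.
            rows.append((k, times[k - 1], 1))
        else:
            # The k-th recurrence never happened: censored at last follow-up.
            rows.append((k, last_followup, 0))
    return rows


# Hypothetical patient: recurrences at months 3 and 9, followed to month 14.
wlw_rows = to_wlw([3, 9], 14)
for row in wlw_rows:
    print(row)
```

Every subject yields exactly four rows under this layout, whether or not four recurrences were observed, and deaths (status 2 or 3) map to event 0 through `event_indicator`.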
|
|
151 |
<pre class="uc-code-block"><code>Bladder1 is the full data set from the study. It contains all three treatment arms and all recurrences for 118 subjects; the maximum observed number of recurrences is 9. |
|
|
152 |
</code></pre> |
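The Bladder2 description that follows mentions a common mistake when building the (start, stop] layout. The sketch below shows one way that mistake can arise, under the assumption that a correct conversion stops a subject's record at the fourth recurrence while the mistaken one appends a censored interval running to the original last follow-up time. The function `to_ag`, its `keep_tail` flag, and the example patient are hypothetical.

```python
MAX_EVENTS = 4  # only the first four recurrences are retained


def to_ag(recurrence_times, last_followup, keep_tail=False):
    """Return (start, stop, event, enum) rows in (start, stop] form.

    keep_tail=False: a subject with four kept recurrences ends at the fourth.
    keep_tail=True: reproduces the mistake by adding a censored interval from
    the fourth recurrence to the original last follow-up time.
    """
    kept = sorted(recurrence_times)[:MAX_EVENTS]
    rows = []
    start = 0
    for k, t in enumerate(kept, start=1):
        rows.append((start, t, 1, k))
        start = t
    # A trailing censored interval is legitimate only when fewer than four
    # recurrences were observed, so that the subject is still at risk.
    has_followup = last_followup > start
    if has_followup and (keep_tail or len(kept) < MAX_EVENTS):
        rows.append((start, last_followup, 0, len(kept) + 1))
    return rows


# Hypothetical patient: five recurrences, followed to month 20.
correct = to_ag([2, 5, 8, 11, 15], 20)
mistaken = to_ag([2, 5, 8, 11, 15], 20, keep_tail=True)
extra_time = mistaken[-1][1] - mistaken[-1][0]
print(correct)
print(mistaken[-1], extra_time)
```

In the mistaken version the final interval covers time in which later recurrences were discarded, so no event can ever be recorded there.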
|
|
153 |
<pre class="uc-code-block"><code>Bladder2 uses the same subset of subjects as bladder, but formatted in the (start, stop] or Andersen-Gill style. Note that in transforming from the WLW to the AG style data set there is a quite common programming mistake that leads to extra follow-up time for 12 subjects: all those with follow-up beyond their 4th recurrence. This "follow-up" is a side effect of throwing away all events after the fourth while retaining the last follow-up time variable from the original data. 
The bladder2 data set found here does not make this mistake, but some analyses in the literature have done so; it results in the addition of a small amount of immortal time bias and shrinks the fitted coefficients towards zero. |
|
|
154 |
</code><div class="uc-code-block-copy-button-wrapper"> |