About Dataset

The data set contains patient records from a 1984-1989 trial conducted by the German Breast Cancer Study Group (GBSG) of 720 patients with node-positive breast cancer; it retains the 686 patients with complete data for the prognostic variables.
The GBSG data and the Rotterdam data are both used in the paper by Royston and Altman (2013): the Rotterdam data is used to fit a model, and the GBSG data is used to validate that model. The paper gives references for the data sources.

Dataset Format

A data set with 686 observations and 11 variables.

Columns Description

pid: patient identifier
age: age, years
meno: menopausal status (0 = premenopausal, 1 = postmenopausal)
size: tumor size, mm
grade: tumor grade
nodes: number of positive lymph nodes
pgr: progesterone receptors (fmol/l)
er: estrogen receptors (fmol/l)
hormon: hormonal therapy (0 = no, 1 = yes)
rfstime: recurrence-free survival time; days to the first of recurrence, death, or last follow-up
status: 0 = alive without recurrence, 1 = recurrence or death
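
The columns above can be read and sanity-checked with a few lines of Python. The following is a minimal sketch, not an official loader: it assumes the data is available as CSV text whose header uses exactly the column names listed here, and it parses every value as a number. The helper names (`read_gbsg`, `kaplan_meier`) and the four rows in `TOY_CSV` are invented for illustration and are not taken from the real data. The second function shows how `rfstime` and `status` combine into a Kaplan-Meier estimate of recurrence-free survival.

```python
import csv
import io

# Column names as listed in the description above.
COLUMNS = ["pid", "age", "meno", "size", "grade", "nodes",
           "pgr", "er", "hormon", "rfstime", "status"]
# Columns documented as 0/1 indicators.
BINARY = ("meno", "hormon", "status")

# Invented toy rows, for illustration only (NOT real patient records).
TOY_CSV = """pid,age,meno,size,grade,nodes,pgr,er,hormon,rfstime,status
1,50,1,20,2,3,40,60,0,100,1
2,45,0,30,3,5,10,5,1,200,0
3,60,1,25,2,1,80,90,1,300,1
4,55,1,15,1,2,100,120,0,400,0
"""


def read_gbsg(handle):
    """Parse CSV text into a list of dicts and check the documented 0/1 coding."""
    reader = csv.DictReader(handle)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    rows = []
    for raw in reader:
        # Assumption: every column holds a plain numeric value.
        row = {name: float(raw[name]) for name in COLUMNS}
        for name in BINARY:
            if row[name] not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1, got {row[name]}")
        rows.append(row)
    return rows


def kaplan_meier(rows):
    """Return [(time, survival probability)] at each distinct event time."""
    event_times = sorted({r["rfstime"] for r in rows if r["status"] == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for r in rows if r["rfstime"] >= t)
        # Recurrences or deaths occurring exactly at time t.
        events = sum(1 for r in rows
                     if r["rfstime"] == t and r["status"] == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


rows = read_gbsg(io.StringIO(TOY_CSV))
print(kaplan_meier(rows))  # [(100.0, 0.75), (300.0, 0.375)]
```

Rows with `status` equal to 0 are treated as censored at `rfstime`: they count toward the number at risk up to that day but never contribute an event. For the real data, pass an open file handle to `read_gbsg` in place of the `io.StringIO` wrapper.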

References

Patrick Royston and Douglas Altman. External validation of a Cox prognostic model: principles and methods. BMC Medical Research Methodology 2013, 13:33.