# Pension Dataset Description

The `pension` dataset is used to analyze factors that influence retirement savings and financial assets. Its variables record demographic and financial attributes of individuals.

## Variables

### 1. **D**: Contribution to 401(k) Plan (`pension$p401`)
   - Binary indicator variable.
   - `1`: Individual contributes to a 401(k) retirement plan.
   - `0`: Individual does not contribute to a 401(k) retirement plan.

### 2. **Z**: Eligibility for 401(k) Plan (`pension$e401`)
   - Binary indicator variable.
   - `1`: Individual is eligible for a 401(k) retirement plan.
   - `0`: Individual is not eligible for a 401(k) retirement plan.

### 3. **Y**: Net Total Financial Assets (`pension$net_tfa`)
   - Continuous variable.
   - Represents the individual's total financial assets, adjusted for liabilities.

### 4. **X**: Covariates
   - A matrix of individual-level demographic and financial features. The variables included are:

| Variable            | Column    | Description                                                         |
|---------------------|-----------|---------------------------------------------------------------------|
| **Age**             | `age`     | Age of the individual.                                              |
| **Benefit pension** | `db`      | Binary indicator for a defined benefit pension.                     |
| **Education**       | `educ`    | Years of education completed.                                       |
| **Family size**     | `fsize`   | Number of family members.                                           |
| **Home owner**      | `hown`    | Binary indicator for home ownership.                                |
| **Income**          | `inc`     | Annual income (continuous variable).                                |
| **Male**            | `male`    | Binary indicator for gender (`1` = male).                           |
| **Married**         | `marr`    | Binary indicator for marital status (`1` = married).                |
| **IRA**             | `pira`    | Binary indicator for having an Individual Retirement Account (IRA). |
| **Two earners**     | `twoearn` | Binary indicator for dual-income households.                        |

## Data Structure
The dataset contains:
- Binary variables for 401(k) contributions and eligibility (`D` and `Z`).
- A continuous variable for financial assets (`Y`).
- A set of covariates (`X`) covering demographic and financial information.

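
The structure described above can be checked with a few lines of R. The sketch below is only illustrative: `toy` is a hypothetical, simulated stand-in that reuses the column names of `pension` so the checks run without the real data, and the helper `is_binary` is not part of any package. All simulated values are arbitrary.

```R
# Hypothetical stand-in for `pension`: same column names, simulated values.
set.seed(1)
n <- 50
toy <- data.frame(
  e401    = rbinom(n, 1, 0.4),
  net_tfa = rnorm(n, mean = 20000, sd = 15000),
  age     = sample(25:64, n, replace = TRUE),
  db      = rbinom(n, 1, 0.3),
  educ    = sample(8:18, n, replace = TRUE),
  fsize   = sample(1:6, n, replace = TRUE),
  hown    = rbinom(n, 1, 0.6),
  inc     = runif(n, 10000, 90000),
  male    = rbinom(n, 1, 0.5),
  marr    = rbinom(n, 1, 0.6),
  pira    = rbinom(n, 1, 0.25),
  twoearn = rbinom(n, 1, 0.4)
)
# Toy-data assumption: only eligible rows can contribute.
toy$p401 <- toy$e401 * rbinom(n, 1, 0.7)

covariates <- c("age", "db", "educ", "fsize", "hown",
                "inc", "male", "marr", "pira", "twoearn")

# TRUE when a vector contains only 0s and 1s.
is_binary <- function(v) all(v %in% c(0, 1))

# Covariate matrix without an intercept column.
X <- model.matrix(~ 0 + ., data = toy[covariates])

checks <- c(
  D_binary  = is_binary(toy$p401),
  Z_binary  = is_binary(toy$e401),
  Y_numeric = is.numeric(toy$net_tfa),
  X_columns = ncol(X) == length(covariates)
)
print(checks)
```

To run the same checks on the real data, replace `toy` with `pension` once it is loaded.
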
## Usage
The dataset can be used for:
- Analyzing the relationship between 401(k) eligibility/contribution and financial assets.
- Studying the effects of demographic factors on retirement savings behavior.
- Building models to predict financial asset accumulation based on demographic features.

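
As a rough illustration of the first use case, the sketch below compares net financial assets by eligibility, first as a raw difference in means and then with an ordinary least squares fit that adjusts for two covariates. It is a minimal sketch under stated assumptions, not a recommended analysis: `toy` is simulated, the built-in effect of 5000 is arbitrary, and only a subset of the covariates is used. Neither number should be read as a causal estimate.

```R
# Hypothetical simulated data using a few of the `pension` column names.
set.seed(42)
n <- 200
toy <- data.frame(
  e401 = rbinom(n, 1, 0.4),
  age  = sample(25:64, n, replace = TRUE),
  inc  = runif(n, 10000, 90000)
)
# Simulated outcome; the eligibility effect of 5000 is an arbitrary choice.
toy$net_tfa <- 5000 * toy$e401 + 0.2 * toy$inc + rnorm(n, sd = 8000)

# Raw difference in mean net financial assets between eligible and
# non-eligible individuals.
raw_gap <- with(toy, mean(net_tfa[e401 == 1]) - mean(net_tfa[e401 == 0]))

# OLS fit adjusting for age and income; the e401 coefficient is the
# adjusted difference.
fit <- lm(net_tfa ~ e401 + age + inc, data = toy)
adj_gap <- unname(coef(fit)["e401"])

cat("Raw gap:     ", round(raw_gap), "\n")
cat("Adjusted gap:", round(adj_gap), "\n")
```

With the real data, the same calls would take `data = pension` and could include the full covariate set listed under **X**.
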
## Example R Code
Here is an example of how to load and prepare the data:

```R
# Assumes a package that ships `pension` (e.g. hdm) is already attached.
data(pension)

D <- pension$p401     # 401(k) contribution indicator
Z <- pension$e401     # 401(k) eligibility indicator
Y <- pension$net_tfa  # net total financial assets
# Covariate matrix; `0 +` drops the intercept column.
X <- model.matrix(~ 0 + age + db + educ + fsize + hown + inc + male + marr + pira + twoearn, data = pension)
var_nm <- c("Age", "Benefit pension", "Education", "Family size", "Home owner", "Income", "Male", "Married", "IRA", "Two earners")
colnames(X) <- var_nm
```