Diff of /data/README.md [000000] .. [a29fce]

--- a
+++ b/data/README.md
@@ -0,0 +1,61 @@
+# Pension Dataset Description
+
+The `pension` dataset supports analysis of how 401(k) eligibility and contribution, together with demographic characteristics, relate to retirement savings and financial assets. Its variables record demographic and financial attributes of individuals.
+
+## Variables
+
+### 1. **D**: Contribution to 401(k) Plan (`pension$p401`)
+   - Binary indicator variable.
+   - `1`: Individual contributes to a 401(k) retirement plan.
+   - `0`: Individual does not contribute to a 401(k) retirement plan.
+
+### 2. **Z**: Eligibility for 401(k) Plan (`pension$e401`)
+   - Binary indicator variable.
+   - `1`: Individual is eligible for a 401(k) retirement plan.
+   - `0`: Individual is not eligible for a 401(k) retirement plan.
+
+### 3. **Y**: Net Total Financial Assets (`pension$net_tfa`)
+   - Continuous variable.
+   - Represents the individual's total financial assets, adjusted for liabilities.
+
+### 4. **X**: Covariates
+   - A matrix of individual-level demographic and financial features. The variables included are:
+
+| Variable            | Column    | Description                                                      |
+|---------------------|-----------|------------------------------------------------------------------|
+| **Age**             | `age`     | Age of the individual.                                           |
+| **Benefit pension** | `db`      | Binary indicator for having a defined benefit pension.           |
+| **Education**       | `educ`    | Years of education completed.                                    |
+| **Family size**     | `fsize`   | Number of family members.                                        |
+| **Home owner**      | `hown`    | Binary indicator for home ownership.                             |
+| **Income**          | `inc`     | Annual income (continuous variable).                             |
+| **Male**            | `male`    | Binary indicator for gender (`1` = male).                        |
+| **Married**         | `marr`    | Binary indicator for marital status (`1` = married).             |
+| **IRA**             | `pira`    | Binary indicator for having an Individual Retirement Account.    |
+| **Two earners**     | `twoearn` | Binary indicator for dual-income households.                     |
+
+## Data Structure
+The dataset contains:
+- Binary variables for 401(k) contributions and eligibility (`D` and `Z`).
+- A continuous variable for financial assets (`Y`).
+- A set of covariates (`X`) covering demographic and financial information.
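+
+As a rough sketch of how this structure could be checked after loading, the helper below verifies that `D` and `Z` are coded 0/1, that `Y` is numeric, and that `X` is a matrix with one row per individual. The name `check_pension_structure` is hypothetical (it is not part of the dataset or of any package), and the sketch assumes the objects were built as in the Example R Code section.
+
+```R
+# Hypothetical helper: validates the D / Z / Y / X layout described above.
+check_pension_structure <- function(D, Z, Y, X) {
+  stopifnot(
+    all(D %in% c(0, 1)),      # D is a binary indicator
+    all(Z %in% c(0, 1)),      # Z is a binary indicator
+    is.numeric(Y),            # Y is a continuous outcome
+    is.matrix(X),             # X is a covariate matrix
+    length(D) == length(Y),   # one value of D per individual
+    length(Z) == length(Y),   # one value of Z per individual
+    nrow(X) == length(Y)      # one row of X per individual
+  )
+  invisible(TRUE)
+}
+```
+
+Calling `check_pension_structure(D, Z, Y, X)` stops with an error at the first violated condition and otherwise returns `TRUE` invisibly.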
+
+## Usage
+The dataset can be used for:
+- Analyzing the relationship between 401(k) eligibility/contribution and financial assets.
+- Studying the effects of demographic factors on retirement savings behavior.
+- Building models to predict financial asset accumulation based on demographic features.
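+
+To illustrate the first use case, the sketch below compares mean net financial assets by eligibility and forms a simple Wald ratio that treats `Z` as an instrument for `D`. The function names `itt_effect` and `wald_estimate` are hypothetical, no covariates are adjusted for, and this is only one of several ways the relationship could be examined.
+
+```R
+# Difference in mean outcome between eligible (Z = 1) and ineligible (Z = 0) individuals.
+itt_effect <- function(Y, Z) {
+  mean(Y[Z == 1]) - mean(Y[Z == 0])
+}
+
+# Wald ratio: effect of Z on Y divided by effect of Z on D.
+wald_estimate <- function(Y, D, Z) {
+  first_stage <- mean(D[Z == 1]) - mean(D[Z == 0])
+  stopifnot(first_stage != 0)  # the ratio is undefined if Z does not shift D
+  itt_effect(Y, Z) / first_stage
+}
+
+# Usage, assuming D, Z and Y were created as in the Example R Code section:
+# itt_effect(Y, Z)
+# wald_estimate(Y, D, Z)
+```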
+
+## Example R Code
+Here is an example of how to load and prepare the data:
+
+```R
+data(pension)  # assumption: the package that provides `pension` (e.g. hdm) is already attached
+
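+D <- pension$p401      # 401(k) contribution indicator
+Z <- pension$e401      # 401(k) eligibility indicator
+Y <- pension$net_tfa   # net total financial assets
+X <- model.matrix(~ 0 + age + db + educ + fsize + hown + inc + male + marr + pira + twoearn, data = pension)  # covariates, no intercept column
+var_nm <- c("Age", "Benefit pension", "Education", "Family size", "Home owner", "Income", "Male", "Married", "IRA", "Two earners")  # same order as the formula terms
+colnames(X) <- var_nm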
+```