Overview of the data
The data is from the 1991 Survey of Income and Program Participation
(SIPP). You are provided with 7933 observations.
The sample contains household data in which the reference person is
aged 25-64. At least one person in each household is employed, and no one is
self-employed. The observation units correspond to the household
reference persons.
The data set contains a number of feature variables that you can
choose to predict total wealth. The outcome variable (total wealth) and
feature variables are described in the next slide.
Dataframe with the following variables
Variable to predict (outcome variable):
tw: total wealth (in US $).
Total wealth equals net financial assets, including Individual Retirement Account (IRA) and 401(k) assets,
plus housing equity plus the value of business,
property, and motor vehicles.
Variables related to retirement (features):
ira: individual retirement account (IRA) (in US $).
e401: 1 if eligible for 401(k), 0 otherwise
Financial variables (features):
nifa: non-401k financial assets (in US $).
inc: income (in US $).
Variables related to home ownership (features):
hmort: home mortgage (in US $).
hval: home value (in US $).
hequity: home value minus home mortgage.
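The definition of hequity can be checked with a few lines of Python. This is only a sketch: the rows below are made-up numbers, not SIPP observations, and the helper name home_equity is hypothetical.

```python
# Hypothetical rows; the values are invented for illustration, not taken from the SIPP data.
rows = [
    {"hval": 120000.0, "hmort": 80000.0},
    {"hval": 0.0, "hmort": 0.0},          # a household with no home value and no mortgage
    {"hval": 65000.0, "hmort": 70000.0},  # mortgage exceeds home value
]


def home_equity(hval, hmort):
    """Home equity as defined above: home value minus home mortgage."""
    return hval - hmort


for row in rows:
    row["hequity"] = home_equity(row["hval"], row["hmort"])

equities = [row["hequity"] for row in rows]
print(equities)
```

Because hequity is an exact linear combination of hval and hmort, putting all three into an OLS regression makes the design matrix rank deficient, so it may be worth dropping one of them for that method.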
Other covariates (features):
educ: education (in years).
male: 1 if male, 0 otherwise.
twoearn: 1 if two earners in the household, 0 otherwise.
nohs, hs, smcol, col: dummies for education: no high school, high school, some college, college.
age: age.
fsize: family size.
marr: 1 if married, 0 otherwise.
What are 401(k) and IRA?
Both 401(k) and IRA are tax-deferred savings options that aim to increase
individual saving for retirement
The 401(k) plan:
a company-sponsored retirement account to which employees can contribute
employers can match a certain % of an employee's contribution
401(k) plans are offered by employers; only employees in companies
offering such plans can participate
The feature variable e401 indicates whether the person is eligible for a 401(k)
IRA accounts:
Everyone can participate: you can go to a bank to open an IRA account
The feature variable ira contains the IRA account amount (in US $)
Collection of methods
We have already seen:
OLS
Ridge regressions
Stepwise selection methods
Lasso
Note:
1. In the project, you should select different methods from the list above and
compare their prediction performance and interpretability
2. For Ridge, stepwise selection, and Lasso, don't forget to use cross-validation
3. In addition to prediction performance, you might want to think about
whether the set of predictors used to predict total wealth makes intuitive
sense
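As an illustration of the cross-validation step in note 2, here is a minimal sketch of K-fold cross-validation used to pick the Ridge penalty. Everything in it is an assumption made for illustration: the data are synthetic stand-ins rather than the SIPP sample, the helper names (ridge_fit, ridge_predict, cv_mse) are hypothetical, and the penalty grid is arbitrary. It uses NumPy and the closed-form Ridge solution on standardized features with an unpenalized intercept.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge on standardized features; the intercept is not penalized."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd[sd == 0] = 1.0                      # guard against constant columns
    Z = (X - mu) / sd
    y_mean = y.mean()
    p = Z.shape[1]
    beta = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ (y - y_mean))
    return mu, sd, y_mean, beta


def ridge_predict(model, X):
    """Apply the training-set scaling, then the fitted coefficients."""
    mu, sd, y_mean, beta = model
    return y_mean + ((X - mu) / sd) @ beta


def cv_mse(X, y, lam, k=5, seed=0):
    """Average held-out mean squared error over k folds for one penalty value."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)         # same folds for every lam (fixed seed)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = ridge_fit(X[train], y[train], lam)
        pred = ridge_predict(model, X[test])
        errors.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errors))


# Synthetic stand-in for the feature matrix and the outcome (NOT the SIPP data).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]    # arbitrary illustrative grid
scores = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)
print(best_lam, scores[best_lam])
```

The same loop would work for the Lasso or for choosing the model size in stepwise selection by swapping the fitting function; only the Ridge case is written out here.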
Compare the prediction performance of different
methods: an example (this is just ONE EXAMPLE)
Say, you have applied the Ridge regression and the Lasso
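One possible way to set up such a comparison is sketched below, assuming that held-out mean squared error is the criterion (the text above does not say which criterion is used). The outcome values and both sets of predictions are invented numbers for illustration only, so the ranking they produce says nothing about the real data.

```python
def mse(y_true, y_pred):
    """Mean squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up held-out outcomes and predictions (NOT results from the SIPP data).
y_test = [10.0, 20.0, 30.0, 40.0]
pred_ridge = [12.0, 18.0, 33.0, 39.0]
pred_lasso = [11.0, 21.0, 29.0, 42.0]

results = {
    "ridge": mse(y_test, pred_ridge),
    "lasso": mse(y_test, pred_lasso),
}
best = min(results, key=results.get)   # method with the lowest held-out MSE
print(results, best)
```

Both methods should be evaluated on the same held-out observations, which were not used for fitting or for choosing the penalty.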
