ETF5952 Quantitative Methods for Risk Analysis
Semester 1, 2020
ASSIGNMENT 2
Deadline: 3PM, June 10, 2020

Important Instructions

This assignment comprises 25% of the assessment for ETF5952. This is an individual, NOT a syndicate, assignment. On the Assignment Cover Sheet, read the references to plagiarism and collusion from University Statute 4.1, Part III: Academic Misconduct.

Answer all questions, and start each question on a new page. Your assignment must be typed and you must submit a pdf file (A4 pages) with an Assignment Cover Sheet (from the ASSIGNMENTS section of Moodle).

Name your assignment Surname Initials AS.pdf and upload this file to Moodle as follows:
1. Go to the ASSIGNMENTS section.
2. Click on the ASSIGNMENT 2 link to upload.
3. The following message will appear momentarily: File uploaded successfully.
(To confirm later that your upload was successful, go to the ASSIGNMENTS section and click on the Assignment 2 uploading link. The uploaded file's name will be shown.)

If you have a valid reason for not meeting the deadline, you will be asked to submit what you have done by the due date and will receive your grade relative to opportunity. Without a valid reason, 10% of the assignment's allocated marks will be deducted for each day that it is late.

Submit one pdf file only. Do NOT submit or attach R scripts or output files. Do not submit your assignment in a folder.

You should summarize what you obtain to answer the questions, instead of providing all code and output. If you provide too much output relative to the questions, we will consider that you may not understand the questions and your answers will be subject to point deduction.

If you have questions regarding the materials, you are encouraged to use our consultation.
The course email should be used only for pointing out typos and personal matters.

Question 1 (25 points: 5+5+5+5+5)

To answer this question, use a mobility data set for Australia, move au.csv. The data are extracted from Google mobility data; see the Google site (https://www.google.com/covid19/mobility/) for more information. The data set contains 6 variables on mobility in 8 sub-regions of Australia from Feb 15 to May 7.

We consider a factor model for the jth variable $x_{i,t,j}$ for region i and time t, given by

$E[x_{i,t,j}] = \varphi_{j,1} v_{it,1} + \varphi_{j,2} v_{it,2} + \cdots + \varphi_{j,6} v_{it,6}.$

Here, since each variable can vary over time and regions, the latent factors depend on time and regions (but the analysis is similar).

1. To estimate the factor model, apply Principal Component Analysis (PCA). Use the scale option to standardize the 6 variables. Report the plot of the variances of the PCs and explain which component is dominant (no more than 30 words).
2. Report the estimated loadings in a table. From the loadings, explain the effect of the first factor on the 6 variables (no more than 30 words).
3. Using the estimated factors, report a boxplot of the 6 factors and explain whether the result is consistent with the one in Question 1.1.
4. Add the estimated first factor to the original data set as a new variable. Also, set date as a date variable by using the as.Date function. Report a scatter plot with date on the x-axis and the first factor on the y-axis. Draw a horizontal line at y = 0. Interpret the variation in the first factor over time (no more than 50 words).
5. Notice that the first factor, $v_{it,1}$, can vary across regions. To see regional variations, create a box plot of $v_{it,1}$ for each region (the boxplot function may not work well without adjustment; if so, I suggest using the ggplot2 package).
   According to the variations of the first factor in Victoria relative to those in the other regions, explain whether human mobility in Victoria decreased (no more than 50 words).

Question 2 (25 points: 5+5+5+10)

We will use a type of difference-in-differences estimation to estimate the effect of Napster on music sales. In this question, use cex basefile97 02.csv, which is extracted from several data sets and was downloaded from the Journal of Applied Econometrics. Before the analysis, you have to clean the data set. The data set is provided with the readme.sh.txt file. Check carefully what kinds of variables are in the data set. We do not use newid, intno and firmth. The data set provides cdall as the dependent variable and weight as a weight. The variables year and nint are key variables; consider the other variables as control variables. When you load the data set, notice that it has no variable names, so you have to use the option for no header (check ?read.csv). We do NOT use weight as a weight in any regression in this question. Consequential marks will not be provided, and you are strongly encouraged to read the readme file and set up your data carefully (it is easy to select variables by column number: DATA[,3] means the 3rd variable and DATA[,3:7] means the 3rd to 7th variables). When you use gamlr, you do not need to report any hypothesis testing results.

1. Let $y_{it}$ be music sales and let $d_{it}$ equal 1 if household i has internet access in year t and 0 otherwise. Napster started in 1999, so let $\mathrm{t.nap}_t$ equal 1 if $t \geq 1999$ and 0 otherwise. Set $\mathrm{d.nap}_{it} = d_{it} \times \mathrm{t.nap}_t$. Without internet access, people cannot use Napster. Thus, we consider the following model:

   $y_{it} = \alpha + \gamma_t + \beta d_{it} + \delta\, \mathrm{d.nap}_{it} + \varepsilon_{it},$

   where $\alpha$, $\beta$ and $\delta$ are parameters, $\gamma_t$ is a year effect for year t, and $\varepsilon_{it}$ is the error. The parameter $\delta$ measures the effect of Napster on music sales. Estimate this model using the data set. Report only the estimated effect of Napster and provide an interpretation of Napster's effect (20 words).
2. Estimate the model in Question 2.1 with all available control variables. Report only the estimated effect of Napster and provide an interpretation of Napster's effect (20 words).
3. Use lasso (gamlr) to estimate the model in Question 2.2 (single machine learning). Report only the estimated effect of Napster and provide an interpretation of Napster's effect (20 words).
4. Use double machine learning to estimate the effect of Napster. First, apply lasso (gamlr) to estimate

   $d_{it} = \alpha + \gamma_t + x_{it}'\theta + \varepsilon_{it},$

   where $x_{it}$ is a vector of control variables. Note that you also have to include the time effects $\gamma_t$ (time dummies). Let $\hat{d}_{it}$ be the fitted values from this estimation.
   Second, apply lasso (gamlr) to estimate

   $y_{it} = \alpha + \gamma_t + \beta d_{it} + \delta\, \mathrm{d.nap}_{it} + \rho\,(\hat{d}_{it} \times \mathrm{t.nap}_t) + x_{it}'\theta + \varepsilon_{it}.$

   Here, always keep the term $(\hat{d}_{it} \times \mathrm{t.nap}_t)$ in the model. Report only the estimated effect of Napster and provide an interpretation of Napster's effect (20 words).

Question 3 (25 points: 10+15)

In the lecture, the average treatment effect was introduced under a binary treatment status, but we often encounter randomized controlled trials with multiple treatments. Consider the case of three treatment statuses: no treatment, treatment 1 and treatment 2. Let $d_1$ be a dummy variable equal to 1 for treatment 1 and 0 otherwise, and let $d_2$ be a dummy variable equal to 1 for treatment 2 and 0 otherwise.

1. We consider the following regression:

   $y = \alpha + \beta d_1 + \gamma d_2 + \varepsilon,$

   where $\alpha$, $\beta$ and $\gamma$ are parameters and $\varepsilon$ is the error with $E[\varepsilon] = 0$. Using $\beta$ and $\gamma$ in this regression, explain what you can estimate. First, express only the final outcomes mathematically (no derivations) and second, explain each (no more than 10 words for each).
2. Suppose that we want to estimate the average difference in treatment effects between treatment 1 and treatment 2. To this end, write down a regression specification that uses only $y$, $d_1$ and $d_2$, and denote the key parameter by $\theta$. Express mathematically what $\theta$ measures.

Question 4 (25 points: 5+5+5+5+5)

Use Hitters from the ISLR package.

1. The original data set contains some missing values, denoted by NA. Drop the observations with missing values and then report the summary statistics of Salary only.
2. Report a histogram of Salary and explain salary inequality in Major League Baseball (no more than 15 words).
3. To understand the sources of the salary inequality, we use a regression tree for Salary with the remaining variables in the data set. Provide the estimation result and list all the characteristics of high-salary players (just make a list of the conditions: no explanation is required).
4. Friend A argues that the regression tree analysis based on Salary may be influenced by outliers. Explain whether the argument is correct or not (no more than 30 words).
5. Given Friend A's argument, we consider an alternative formulation by taking the log of Salary. Estimate a regression tree for log Salary with the rest of the variables as regressors. Provide the estimation result and explain the characteristics of high-salary players (just make a list of the conditions: no explanation is required).
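
As an illustration of the PCA steps described in Question 1, the following is a minimal R sketch on simulated data, not a solution. The data frame `sim`, its column names, and the single common factor that generates it are all invented for this example; only base R functions (`prcomp`, `screeplot`, `boxplot`) are used.

```r
# Simulated stand-in for a mobility data set (all names are hypothetical).
set.seed(1)
n <- 200
common <- rnorm(n)  # one latent factor shared by all six variables
sim <- data.frame(
  retail  = -2.0 * common + rnorm(n, sd = 0.5),
  grocery = -1.5 * common + rnorm(n, sd = 0.5),
  parks   = -1.0 * common + rnorm(n, sd = 0.5),
  transit = -2.0 * common + rnorm(n, sd = 0.5),
  work    = -1.8 * common + rnorm(n, sd = 0.5),
  home    =  1.2 * common + rnorm(n, sd = 0.5)
)

# PCA on standardized variables (scale. = TRUE rescales each column to unit variance).
pca <- prcomp(sim, scale. = TRUE)

# Share of total variance explained by each principal component.
var_share <- pca$sdev^2 / sum(pca$sdev^2)

# Loadings (rotations): one row per variable, one column per component.
loadings <- pca$rotation

# Estimated factors (scores): one row per observation, one column per component.
factors <- pca$x

# Plots are only drawn in an interactive session.
if (interactive()) {
  screeplot(pca, main = "Variances of PCs")
  boxplot(factors, main = "Estimated factors")
}
```

Because the simulated variables share one strong factor, `var_share[1]` is by far the largest entry, and the first column of `loadings` has the opposite sign for `home` compared with the other five variables. The overall sign of a principal component is arbitrary, so only relative signs are meaningful.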

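
The difference-in-differences model in Question 2.1 can be sketched in R as follows. This is an illustration on simulated data under stated assumptions, not a solution: the sample, the variable names other than `d.nap` and `t.nap`, and the "true" effect of -3 are invented so that the estimate can be compared with a known value.

```r
# Simulated household-year data (all numbers are hypothetical).
set.seed(2)
n <- 10000
year  <- sample(1997:2002, n, replace = TRUE)
d     <- rbinom(n, 1, 0.4)          # 1 if the household has internet access
t.nap <- as.numeric(year >= 1999)   # 1 from the year Napster started onward
d.nap <- d * t.nap                  # interaction: internet access in the Napster period

true_effect <- -3                   # invented effect used only to generate y
y <- 20 + 0.5 * (year - 1997) + 4 * d + true_effect * d.nap + rnorm(n, sd = 2)

# Year dummies (factor(year)) play the role of the year effects, so t.nap
# itself is not included: it would be collinear with the year dummies.
fit <- lm(y ~ factor(year) + d + d.nap)

# The coefficient on d.nap is the difference-in-differences estimate.
did_estimate <- unname(coef(fit)["d.nap"])
did_estimate
```

The coefficient on `d.nap` compares the change over time for households with internet access against the change for households without it, which is why the year effects and the main effect of `d` both stay in the regression.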
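
For the three-arm setting in Question 3, a small R simulation can show how the dummy-variable regression relates to group means. This is a hedged sketch: the arm labels, sample size and outcome values are invented, and the check relies only on the standard fact that a regression on a full set of group dummies reproduces the group means.

```r
# Simulated randomized trial with three arms (all values are hypothetical).
set.seed(3)
n <- 3000
arm <- sample(c("control", "treat1", "treat2"), n, replace = TRUE)
d1 <- as.numeric(arm == "treat1")   # 1 for treatment 1, 0 otherwise
d2 <- as.numeric(arm == "treat2")   # 1 for treatment 2, 0 otherwise
y  <- 1 + 2 * d1 + 5 * d2 + rnorm(n)

# Regression of the outcome on the two treatment dummies.
fit <- lm(y ~ d1 + d2)

# Sample mean of y within each arm.
group_means <- tapply(y, arm, mean)

# Compare each coefficient with the corresponding difference in sample means.
gap1 <- unname(coef(fit)["d1"] - (group_means["treat1"] - group_means["control"]))
gap2 <- unname(coef(fit)["d2"] - (group_means["treat2"] - group_means["control"]))
c(gap1, gap2)
```

Both gaps are zero up to floating-point error, because the untreated group is the omitted category and each dummy coefficient is measured relative to it.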