
STATS762 Regression for Data Science
Assignment 3
Due date: 10am, 1 June 2020

Instruction

Please submit both your R Markdown document and a pdf file containing the document it generates. To create a pdf you should start your R Markdown document with the following lines (having made the appropriate changes):

---
title: "STATS 762 Assignment 3"
author: "Your Name, ID 1234567"
date: "Due: 10am, 1 June 2020"
output: pdf_document
---

Call the set.seed function at the start of your R script so that the same output is obtained when the simulation is rerun.
All answers should be written with their corresponding question numbers.
Working must be shown.
Each answer should be written out explicitly; R code by itself does not constitute an answer.

For example, suppose the question asks for the average height of 6 trees: (1, 2, 1, 3, 1.5).

Good answer / Bad answer [examples missing from this copy]

If any of the above is not satisfied, a penalty may be applied.

1. The spreadsheet avocado2.csv contains 338 historical avocado sales records from various markets in California, US. The attributes are as follows:

Total.Volume – Total number of avocados sold
AveragePrice – Average price of a single avocado
type – Production type: organic or conventionally produced avocados

A researcher wants to investigate how the amount of sales relates to the average price and the production type (organic/conventional). Total.Volume is transformed to the log scale to fit a linear regression model with AveragePrice and type.

(a) Explain why log-transforming the total number of avocados sold is useful for modelling a quantile with a linear regression. [2 marks]

(b) Find a suitable linear regression model for the 0.2 quantile of log(Total.Volume) and express a typical 0.2 quantile of the total number of avocados sold for a given price and production type. [5 marks]

(c) Find a suitable linear regression model for the 0.8 quantile of log(Total.Volume) and express a typical 0.8 quantile of the total number of avocados sold for a given price and production type.
[5 marks]

(d) Using your model, predict the 0.2 quantile of the total sales for $1.2 conventional avocados and $1.8 organic avocados. [1 mark]

(e) At what price for conventional avocados would 80% of markets sell at most 5.4 million avocados? [3 marks]

2. The spreadsheets (banktrain.csv and banktest.csv) relate to the direct marketing campaigns of a bank. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (a bank term deposit) would be subscribed to or not. The interest is in predicting whether the client will subscribe to a term deposit (variable y).

The attributes are as follows:

gender – gender (categorical: male, female)
age – age (numeric)
marital – marital status (categorical: married, divorced, single)
education – education information of client (categorical: unknown, secondary, primary, tertiary)
default – credit account status (categorical: yes, no)
balance – average yearly balance, in euros (numeric)
housing – housing loan status (categorical: yes, no)
loan – personal loan status (categorical: yes, no)
contact – contact communication type (categorical: unknown, telephone, cellular)
duration – last contact duration, in seconds (numeric)
campaign – number of contacts performed during this campaign and for this client (numeric)
previous – number of contacts performed before this campaign and for this client (numeric)
poutcome – outcome of the previous marketing campaign (categorical: unknown, other, failure, success)
y – Has the client subscribed to a term deposit? (categorical: yes, no)

We use the train data (banktrain.csv) to find a model and the test data (banktest.csv) to examine the predictive performance of a model. Note that the number of cross-validation folds is 10.

The function in make.r reformats a data set so that each categorical variable is replaced by indicator variables, one for each of its levels.
It produces a list with two objects: the reformatted data (data) and a vector of group memberships (gpname).

(a) Using the train data, complete the following questions.

i. Using an appropriate penalty on the model complexity, find a model minimizing the cross-validation error. Show how you found the model and describe the model in terms of the client characteristics it includes. [4 marks]

ii. Using an appropriate penalty on the model complexity, find a parsimonious model. Show how you found the model and describe the model in terms of the client characteristics it includes. [4 marks]

(b) Estimate the predictive performance of each model using an appropriate measure, and compare the two. [3 marks]

(c) Using your parsimonious model, describe a type of client who is very likely to subscribe to a term deposit. [3 marks]

(d) If a marketing campaign focuses on a single client characteristic, which feature should it target for the campaign to succeed? [3 marks]
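
Because the 10 cross-validation folds in Question 2 are assigned at random, the set.seed instruction above matters most there. The following is a minimal base-R sketch of what reproducible fold assignment can look like, not a prescribed method: the seed value, the number of observations, and the helper name make_folds are all hypothetical placeholders that do not come from the assignment.

```r
# Hypothetical placeholders: neither value is taken from the assignment data.
n_obs   <- 100   # stand-in for the number of rows in the training data
n_folds <- 10    # number of cross-validation folds stated in Question 2

# make_folds is an illustrative helper, not a function supplied with the course.
make_folds <- function(seed) {
  set.seed(seed)                                     # fix the random number stream
  sample(rep(seq_len(n_folds), length.out = n_obs))  # shuffle balanced fold labels
}

folds_a <- make_folds(762)
folds_b <- make_folds(762)

identical(folds_a, folds_b)  # the same seed gives the same fold assignment
table(folds_a)               # number of observations in each fold
```

Calling set.seed once at the top of the R Markdown document has the same effect for the whole script, provided the code is run in the same order each time.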
