
Department of Statistics
STATS 782 Statistical Computing
Assignment 3 (2020.FC)
Total: 50 marks
Due: 2:00 pm NZST, Friday 29 May 2020

1. Please read these instructions carefully. Further instructions might be posted on the class webpage.
2. Upload your soft copy (assignment source) to Canvas: the file should end in .Rmd, or possibly .R or .Rnw. The marker may run or knit your R code, so include your name and ID in all files. The file names should contain your UPI. RMarkdown is strongly recommended.
3. Also upload your .pdf to Canvas too. Note the time difference between countries.
4. Coversheet: please make sure you do one of the following else your assignment will not be marked:
   (a) Sign the Cover Sheet and combine with your assignment document (pdf or Word) into a single file before submission, OR
   (b) Type or write the following at the beginning of your assignment: Your name (as it appears in Canvas), your UPI, and the following statement: I have read the declaration on the cover sheet and confirm my agreement with it.
5. Include everything in your report: R code (tidied up), outputs (including error/warning messages), and your explanations (if any). Please comment on almost all of your output, especially parts that need human interpretation, else marks will be deducted. That is, you need to convince the marker that you understand what the data or solution is saying.
6. Print some intermediate results to show how your code works step by step, if not obvious. Comment your code if appropriate, e.g., for functions, blocks of code, and key variables.
7. Type help.start() when you open R. You need to use the online help to find details and functions that may not be covered directly in the coursebook. This requires maturity; we cannot cover everything in class or the coursebook.
8. Your mark for this assignment will depend on getting the right answer, the elegance/efficiency of your approach, and the tidiness and documentation of your code/report.
   The R Tidyverse Style Guide or R Google Style Guide is recommended. Marks (up to 7) will be deducted for messy code, etc.
9. This PDF file may contain colour that is important to see.

1. [16 marks] The Ministry of Health of New Zealand provides daily updates of the status of COVID-19 cases in the country. The basic data consists of the date of report and the number of probable and confirmed COVID-19 cases reported that day. The data reported on April 19 is provided in the file covid19-apr19.csv.

(a) The ministry published the following plot on April 26 showing the total of reported cases per day (confirmed + probable):

Re-create the graphic using R as closely as sensible. Start with the same basic type of plot in R, then adjust colors, line widths and labels. Finally address the axes. If there are any visual differences, describe them and explain which version you think is better, yours or the original, and why. Note that there are small differences between the available data and the plot.¹ [4 marks]

(b) In addition to the plot above, the ministry also publishes a plot of all cases known up to a given date:

Re-create the graphic using R. Discuss any drawbacks of the rendition of the graphic. [4 marks]

(c) Change the graphic from (a) so that it distinguishes probable from confirmed cases. Explain your decisions and which comparisons can be directly performed visually in the plot. Give at least one example of a comparison which cannot be done using this plot.
[4 marks]

¹ The dataset file is more detailed in that it counts actual cases filed on the reported day, whereas the daily report plots count new cases known at a given time of that day, which may include cases filed earlier.

(d) We can modify the plot from (c) such that we can directly compare the relative proportion of confirmed cases to total each day, while keeping the modifications to a minimum, as follows:

[Figure: "Proportion of confirmed cases". x-axis: Date of report (ticks at Mar 01, Mar 15, Apr 01, Apr 15); y-axis: Proportion (in %).]

Re-create that plot type. Did you have to sacrifice information that was available in (c) but is no longer visible? If so, what was it? Interpret the resulting plot. [4 marks]

2. [11 marks] Consider the following plot illustrating an optical illusion:

The plot is composed of squares that are all aligned at the same y coordinate, although our eyes make us believe that the lines are not straight. Each row is shifted by 1/4 square relative to the adjacent rows, but the direction changes every two steps.

(a) Re-create the plot using R. [5 marks]

(b) Create a function taking n as a parameter which determines how many rows of squares there will be. Run it for values of 9, 11 and 15. [3 marks]

(c) Enhance the function from (b) by adding an argument cols which is a vector of the two colours to be used to fill the boxes. Call it with f(n=11, cols=c("red","yellow")) and show the resulting plot. Does the effect still work? [3 marks]

3. [23 marks] The dataset temp-cities.csv contains the daily low and high temperatures for seven cities in the world over the last 20 years.

(a) Read the dataset and restrict it to the following subset: city Auckland and records from the year 2019. Create one plot which shows both the lows and highs for every day of the year 2019 in Auckland. Use blue colour for the lows and red colour for the highs. [4 marks]

(b) Based on the 2019 Auckland subset, compute the weekly average for both lows and highs respectively.
For this purpose the first week is the first 7 days in 2019, the second week is the next 7 days, etc. Superimpose the averages over the plot obtained in (a). [4 marks]

(c) Create a matrix of plots such that each plot shows all the data for one city. Make sure that it is possible to compare values between the plots. Justify the layout you used. The purpose of this plot is exploratory data analysis, not presentation, so you do not need to worry about removing axes that are superfluous or labels (other than the city) at this point. Do you see any obvious issues in the data? [4 marks]

(d) Plot a matrix of scatterplots of highs vs lows for each city. Describe what you can learn from the plots. Do you see any technical issues with the data? [3 marks]

(e) Compute the average low and high temperature for each city and week of the year. This is similar to (b), but you want to average over the years as well, i.e., the average for the first week² will be computed from temperatures on 1-7 January of all the years 2000, ..., 2019. Do not worry about special handling of leap years. Plot the results. How can you interpret the resulting shapes? [4 marks]

(f) Take the plot from (e) and improve it by removing superfluous axes and margins. Use axes only along the outer edge (left, bottom and right) of the entire matrix, as illustrated in Figure 1. [4 marks]

Figure 1: Weekly average temperature lows and highs for 2000-2019 in 7 world cities.

² If you don't want to split years by hand (which you can), you may find as.POSIXlt(date)$yday useful.
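The hint about as.POSIXlt(date)$yday can be illustrated with a minimal sketch. It assumes that weeks are consecutive 7-day blocks starting on 1 January, as defined in 3(b). The vectors dates and temps are made-up values for illustration only; they are not taken from temp-cities.csv, whose column names are not given here.

```r
# Toy data, invented for illustration only (not from temp-cities.csv).
dates <- as.Date(c("2019-01-01", "2019-01-07", "2019-01-08",
                   "2019-12-31", "2020-12-31"))
temps <- c(10, 12, 14, 20, 22)

# $yday is the 0-based day of the year (1 January is 0).
yday <- as.POSIXlt(dates)$yday

# Integer-divide by 7 to get consecutive 7-day blocks, numbered from 1.
week <- yday %/% 7 + 1
print(week)

# Average the toy values within each week index.
weekly_mean <- tapply(temps, week, mean)
print(weekly_mean)
```

Under this scheme the last day of a 365-day year (and the last two days of a leap year) fall into a short week 53. How to treat that remainder is left open by the assignment text, so this is only one possible reading.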
