
TRINITY COLLEGE DUBLIN
School of Computer Science and Statistics

Mid-Term Assignment 2020-21
STU33009: Statistical Methods for Computer Science

Submitting Your Report

- Reports must be typed (no handwritten answers please) and submitted on Blackboard. As a guideline, reports should be about 5 pages in length including all plots (please don't go a lot over this).
- You will need to use MATLAB to calculate values, or alternatively write a short program in Python to do this. In either case give the code used as an appendix to the report (it doesn't count towards the page limit), but please keep the code short.
- In order to obtain full credit it is essential that you explain/justify how you obtained your results and, where appropriate, that you critically reflect upon them. Simply giving raw numbers as answers will receive few marks, as will saying "see code for details" and the like, even if the code contains explanatory comments.
- It is mandatory to complete the declaration that the work is entirely your own and that you have not collaborated with anyone. The declaration form is available on Blackboard.

Downloading Data

In this assignment you will analyse data on shopping behaviour. Start by downloading the following dataset: https://www.scss.tcd.ie/doug.leith/ST3009/midterm2021.php. Important: you must fetch your own copy of the dataset; do not use a dataset downloaded by someone else. Keep the dataset that you download, as I might request it to validate your results.

The data file consists of rows of data. Each row $i$ corresponds to one supermarket shopping basket and each column $j$ corresponds to one item for sale. The value $Z_{i,j}$ in row $i$, column $j$ gives how many of the $j$th item are in the $i$th shopping basket.

Assignment

1. (a) Plot a histogram showing the PMF of the number of items in a basket. Hint: Summing the values in a row gives the number of items in that shopping basket. [5 marks]

   (b) Estimate the probability $P(Z_{i,1} = 1)$ that the first column in the dataset takes value 1, i.e. that a shopping basket contains item 1. Briefly explain/discuss your calculation. Hint: Observe that the first column in the dataset only takes values 0 or 1, and recall that for an indicator RV $X$ we have $\mathrm{Prob}(X = 1) = E[X]$. [5 marks]

   (c) Derive a confidence interval for your estimate of $P(Z_{i,1} = 1)$ using the CLT and the Chebyshev Inequality. Explain/discuss your calculation. [5 marks]

   (d) Suppose we need to estimate the value of $P(Z_{i,1} = 1)$ to an accuracy of 1% with 95% confidence. How many shopping baskets would we need to collect data from? [5 marks]

2. Your task is to explore whether the presence of item 1 in a shopping basket can be predicted from the presence of other items in the basket. We start with whether item 2 in the basket is predictive of item 1 being in the basket. Since the first column in the dataset only takes values 0 or 1, its conditional expectation is $E[Z_{i,1} \mid Z_{i,2} = z] = P(Z_{i,1} = 1 \mid Z_{i,2} = z)$. It can be estimated by the sample mean $\frac{1}{N} \sum_{i : Z_{i,2} = z} Z_{i,1}$, where the sum is taken over the baskets with second column equal to $z$, and $N = |\{i : Z_{i,2} = z\}|$ is the size of this set. This sample mean concentrates on $E[Z_{i,1} \mid Z_{i,2} = z]$ as the number of shopping baskets observed grows.

   (a) Calculate the sample mean of $Z_{i,1}$ conditioned on the second column $Z_{i,2} = z$, for $z = 0, 1, \ldots$ being each of the different values that the second column takes. Report the values in a table. Briefly explain/discuss your calculation. [5 marks]

   (b) Derive confidence intervals for your estimate of $E[Z_{i,1} \mid Z_{i,2} = z]$ using the CLT and the Chebyshev Inequality. Explain your working and extend your table from (a) to include these intervals. [5 marks]

   (c) Using the MATLAB errorbar() function, or the Python equivalent, plot your estimates of $E[Z_{i,1} \mid Z_{i,2} = z]$ vs $z$ together with their confidence intervals, i.e. a plot with $z$ on the x-axis and the estimate of $E[Z_{i,1} \mid Z_{i,2} = z]$ on the y-axis, together with error bars indicating the confidence interval around this estimate. Discuss. [5 marks]

   (d) Compare your estimate of $E[Z_{i,1} \mid Z_{i,2} = z]$ with your estimate of $E[Z_{i,1}]$ from parts 1(b)-(c), bearing in mind their confidence intervals. Critically discuss whether the presence of item 2 in the basket is predictive of item 1 being in the basket. [5 marks]

3. (a) Repeat your analysis in 2(d) but now using only the first 100 rows from the dataset (it's enough to plot the data, no need to include a table of values). What is the impact on the confidence intervals of using less data, and why? How does that impact what conclusions you can draw from the data? [5 marks]

   (b) Now repeat 2(d) but for $E[Z_{i,1} \mid Z_{i,3} = z]$, i.e. conditioned on the third column $Z_{i,3} = z$. Compare and contrast the behaviour with that observed when conditioning on the second column, again bearing in mind the confidence intervals. [5 marks]
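
The conditional sample mean and the two kinds of confidence interval named in questions 1 and 2 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a model answer. The function names `mean_with_intervals` and `conditional_means` are hypothetical, the rows are synthetic random data rather than the assignment dataset, the variance uses the plug-in estimate p(1 - p) for an indicator RV, and the confidence level is fixed at 95%.

```python
import math
import random
from collections import defaultdict

Z_95 = 1.96                   # two-sided 95% quantile of the standard normal (CLT)
K_95 = math.sqrt(1.0 / 0.05)  # Chebyshev: P(|X - mu| >= k*sigma) <= 1/k^2 = 0.05


def mean_with_intervals(xs):
    """Return (sample mean, CLT half-width, Chebyshev half-width) for 0/1 values."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: plug-in variance of an indicator RV, p(1 - p) with p estimated.
    var = mean * (1.0 - mean)
    std_err = math.sqrt(var / n)  # standard deviation of the sample mean
    return mean, Z_95 * std_err, K_95 * std_err


def conditional_means(rows, target_col=0, cond_col=1):
    """Group rows by the value z in cond_col and summarise target_col per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[cond_col]].append(row[target_col])
    return {z: mean_with_intervals(xs) for z, xs in sorted(groups.items())}


# Synthetic stand-in data (NOT the assignment dataset): column 0 is a 0/1
# indicator whose probability is made to depend on the count in column 1.
random.seed(0)
rows = []
for _ in range(500):
    z2 = random.choice([0, 1, 2])
    z1 = 1 if random.random() < 0.2 + 0.2 * z2 else 0
    rows.append((z1, z2))

for z, (m, clt, cheb) in conditional_means(rows).items():
    print(f"z={z}: mean={m:.3f}  CLT +/-{clt:.3f}  Chebyshev +/-{cheb:.3f}")
```

The Chebyshev half-width is always wider than the CLT one here, because Chebyshev makes no distributional assumption while the CLT interval relies on the sample mean being approximately Gaussian. Half-widths in this form can be passed as the `yerr` argument of matplotlib's `errorbar` when plotting the estimates against z.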
