EECS 6127, Winter 2020-2021
Assignment 1
Due: 11:59:59 PM, February 6, 2021.

A full score on this assignment is 25 marks. However, there are 28 marks achievable on this assignment (up to 3 bonus marks). To gain full marks, explain your answers carefully and prove your claims.

1. Bayes classifier and Bayes risk

Recall that we model data generation as a probability distribution D over X × {0, 1}. D can be defined in terms of its marginal distribution over X, which we will denote by D_X, and the conditional labeling distribution, which is defined by the regression function

    η(x) = P_{(x,y)∼D}[y = 1 | x].

Let's consider a 2-dimensional Euclidean domain, that is X = R^2, and the following process of data generation: the marginal distribution over X is uniform over the union of the two square areas [1, 2] × [1, 2] and [3, 4] × [1.5, 2.5]. Points in the first square Q1 = [1, 2] × [1, 2] are labeled 0 (blue) and points in the second square Q2 = [3, 4] × [1.5, 2.5] are labeled 1 (red), as in the illustration below.

    [Illustration: the squares Q1 (blue, label 0) and Q2 (red, label 1) in the plane.]

(a) Describe the density function of D_X, and the regression function, Bayes predictor and Bayes risk of D.

(b) Consider the two distributions D^1 and D^2 that we obtain by projecting onto each of the axes. Formally, we are marginalizing out one of the features to obtain D^1 and D^2. Both are distributions over R × {0, 1}. Describe the density functions of D^1_X and D^2_X, and the regression functions, Bayes predictors and Bayes risks of D^1 and D^2.

(c) Consider the hypothesis classes H_init and H_decst from Question 2 on this assignment. Determine the approximation errors of H_init for D^1 and D^2, and the approximation error of H_decst for D.

(d) For the classes H_init and H_decst, consider their closures under function complements and again determine the approximation errors on D, D^1 and D^2.

3 + 3 + 3 + 3 marks

2. VC-dimension

We define the hypothesis class H_init of initial segments over domain X = R and the class H_decst of decision stumps over domain X = R^2 as follows:

    H_init = {h_a | a ∈ R}, where h_a(x) = 1[x ≤ a],

and

    H_decst = {h^i_a | a ∈ R, i ∈ {1, 2}}, where h^i_a((x_1, x_2)) = 1[x_i ≤ a].

Further, for a hypothesis class H ⊆ {0, 1}^X, we define the class of complements of H, denoted by H^c, as the class where we flip all the labels of the functions in H, that is

    H^c = {h^c | h ∈ H}, where h^c(x) = |h(x) - 1|.

Finally, we let H^cc = H ∪ H^c denote the closure of H under complements.

(a) Determine the VC-dimensions of H_init and H_decst.

(b) Show that for every hypothesis class H, we have VC(H) = VC(H^c).

(c) Determine the VC-dimensions of H^cc_init and H^cc_decst.

(d) Show that, for every k, there exists a hypothesis class H with VC-dimension k and VC(H^cc) = 2k + 1.
    Hint: consider the domain X = N and the classes

        H_k = {h ∈ {0, 1}^X | |h^(-1)(1)| ≤ k}.

    That is, H_k is the class of functions that map at most k natural numbers to 1 and the remaining ones to 0.

3 + 3 + 3 + 3 marks

3. Empirical and true risk

Recall Claim 2 from Lecture 2. We showed that, over an uncountable domain, there exists a learner A (the stubborn learner) and a distribution D such that, for every sample size m and all samples S from D^m:

    |L_S(A(S)) - L_D(A(S))| = 1.

This is not true for countable domains (as we will show later in the course). For this exercise, we will show a similar, but weaker, statement for countable domains. Without loss of generality, we can assume X = N. Prove that, for every sample size m and every ε > 0, there exists a learner A and a distribution D over X × {0, 1} such that for all samples S from D^m we have

    |L_S(A(S)) - L_D(A(S))| ≥ 1 - ε.
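For intuition only, the objects defined in Questions 1 and 2 are straightforward to prototype. The Python sketch below is an illustration, not part of the assignment and not required for any solution; the function names (h_init, h_decst, complement, sample_D, empirical_risk) and the example threshold are ad hoc choices. It implements the two hypothesis classes, the complement operation h^c, a sampler for the distribution D of Question 1, and the empirical risk L_S.

    import random


    def h_init(a):
        """Initial segment h_a over X = R: h_a(x) = 1[x <= a]."""
        return lambda x: 1 if x <= a else 0


    def h_decst(i, a):
        """Decision stump h^i_a over X = R^2: h^i_a((x1, x2)) = 1[x_i <= a], i in {1, 2}."""
        return lambda x: 1 if x[i - 1] <= a else 0


    def complement(h):
        """Complement h^c of a hypothesis: h^c(x) = |h(x) - 1|."""
        return lambda x: abs(h(x) - 1)


    def sample_D(m, rng=random):
        """Draw m labeled points from the distribution D of Question 1:
        uniform over Q1 = [1,2] x [1,2] (label 0) and Q2 = [3,4] x [1.5,2.5] (label 1);
        the squares have equal area, so each carries half of the probability mass."""
        S = []
        for _ in range(m):
            if rng.random() < 0.5:
                S.append(((rng.uniform(1, 2), rng.uniform(1, 2)), 0))
            else:
                S.append(((rng.uniform(3, 4), rng.uniform(1.5, 2.5)), 1))
        return S


    def empirical_risk(h, S):
        """L_S(h): fraction of sample points that h misclassifies."""
        return sum(1 for x, y in S if h(x) != y) / len(S)


    if __name__ == "__main__":
        S = sample_D(1000)
        stump = h_decst(1, 2.5)   # an arbitrary stump thresholding the first coordinate
        print("L_S(stump)      =", empirical_risk(stump, S))
        print("L_S(complement) =", empirical_risk(complement(stump), S))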