HOMEWORK ASSIGNMENT 3 (Heriot-Watt MSc Course B31XN)
STOCHASTIC OPTIMISATION FOR BINARY SVM
Dr Audrey Repetti

Guidelines

Assessment. The assignment will contribute 20% to the final mark for the course. For each task, marks will be given for correctly using Matlab to carry out the task and for your interpretation of, and comments on, the results. Each task has the same weight in the assignment. An additional 2 marks will be given for respecting the guidelines.

General style. The report should be a pdf file. There should be a header page that includes your name and matriculation number.

Length. Your assignment should be no more than four sides of A4, including title, graphs, tables and references, but not including the Appendix. Use a standard font that can be read when the report is printed (be also careful with the font of figure labels and captions).

Content. The main part of your assignment should give the results for each task that you carry out, together with your comments on and conclusions from these results. Figures and tables must be included in the main report.

Appendix. The report should not contain Matlab code in the main part; the code must be included in an Appendix. The code can be provided as pdf and/or snapshots, and must be properly commented.

Submission. The assignment should be submitted through Vision. It should not be submitted by email.

Collaboration. Students are encouraged to discuss the methods used with other students, but your submitted assignment must be all your own work. In particular, copying a section of your assignment, either plots or commentary, from another student is a serious disciplinary offence. It is also an offence to allow another student to copy your work, so your assignment files should not be made available to other students.

Deadline. The deadline for submission is 26 March 2021; assignments may be submitted early.

Late submissions. Any assignment that is submitted after the submission deadline but within 5 working days of the deadline will be penalised by a 30% reduction in the mark. Assignments that are submitted more than 5 working days past the deadline will not be marked.

Figure 1: Sample of the MNIST dataset (https://yann.lecun.com/exdb/mnist/).
Figure 2: Sample of the MNIST dataset showing handwritten digits 0 and 1.

Outline: Stochastic proximal algorithms, Machine learning, Binary classifier.

1 Project description

This project aims to illustrate stochastic proximal methods in the context of machine learning. You will need to implement a stochastic gradient forward-backward (FB) algorithm with Matlab, to train a binary support vector machine (SVM) classifier. To help with this task, you will also implement the standard FB algorithm.

Binary SVM

The objective in binary classification is to correctly predict a binary label (e.g. yes vs. no, or 0 vs. 1). Binary classification aims to pair an input x ∈ R^N with a binary output y ∈ {-1, +1}, via a model of the form

    y = d_w(x) = \mathrm{sign}(x^\top w),    (1)

where ^\top denotes the transpose operation, and w ∈ R^N is a vector of weights that needs to be learned. Precisely, the supervised learning task consists in finding w such that, for a given input, the output will be correctly predicted by the classifier. The classifier is trained on a training set of pairs

    S = \big\{ (x_\ell, y_\ell)_{1 \le \ell \le L} \;\big|\; (\forall \ell \in \{1, \dots, L\}) \; x_\ell \in \mathbb{R}^N, \; y_\ell \in \{-1, +1\} \big\}.    (2)
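As a point of reference, the decision rule (1) is a one-liner in Matlab. The sketch below is illustrative only; the variable names and the vectorisation of the 28 x 28 images are assumptions, not the code provided with the assignment:

    % Minimal sketch of the decision rule (1), assuming images are
    % vectorised so that x(:) and w have the same length N = 28*28.
    d = @(x, w) sign(x(:)' * w);   % returns +1 or -1 (0 only if x'*w == 0)

    % Hypothetical usage on one 28x28 image x_im with learned weights w:
    % y_pred = d(x_im, w);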
A sample of the first 21 images from the MNIST test set is shown in Figure 1. The full MNIST dataset can be found at https://yann.lecun.com/exdb/mnist/; it contains 60000 training images of size 28 x 28 and 10000 testing images of size 28 x 28, all representing handwritten digits between 0 and 9. A binary classifier aims to provide a binary answer to a question (e.g. yes vs. no, or 0 vs. 1). For this dataset, the classifier could be trained, e.g., to recognise the digit 0 among the other digits. A simplified version would be to only look at the 0 and 1 digits, and classify the digits depending on whether they are a 0 or a 1. A sub-sample of the MNIST test set, only showing digits 0 and 1, is given in Figure 2.

Minimisation problem

A classical approach for learning the vector of weights w is to define it as a solution to

    \underset{w \in \mathbb{R}^N}{\text{minimise}} \; g(w) + h(w),    (3)

where g ∈ Γ_0(R^N) is a regularisation function, and h ∈ Γ_0(R^N) is the loss function. In the context of sparse learning, g can be chosen to be an ℓ_1 norm:

    (\forall w \in \mathbb{R}^N) \quad g(w) = \lambda \|w\|_1,    (4)

where λ ≥ 0. Regarding h, a classical choice is the Huber loss. Let

    (\forall w \in \mathbb{R}^N) \quad h(w) = \frac{1}{L} \sum_{\ell=1}^{L} \varphi(y_\ell \, x_\ell^\top w),    (5)

where φ ∈ Γ_0(R) is the Huber function, defined for a parameter δ > 0 as

    (\forall u \in \mathbb{R}) \quad \varphi(u) = \begin{cases} \frac{1}{2} u^2 & \text{if } |u| \le \delta, \\ \delta |u| - \frac{\delta^2}{2} & \text{otherwise.} \end{cases}    (6)

Matlab toolbox

The file Main.m contains the main commands to implement the binary classifier.

Datasets. In this project, you will handle two datasets, both obtained from the MNIST dataset.
- dataset_subMNIST.mat: dataset that only contains digits 0 and 1.
- dataset_MNIST.mat: full MNIST dataset.

The variables X_test and X_train are tensors of dimension Nx x Ny x I and Nx x Ny x L, respectively, where Nx = Ny = 28 and I (resp. L) is the total number of images contained in the testing set (resp. training set). For instance, the i-th image in X_test is given by X_test(:,:,i). The variables Y_test and Y_train are vectors of dimension I and L, respectively, containing labels in {-1, +1}. The datasets have been preprocessed such that a label is +1 if the associated digit is a 0, and -1 if the associated digit is not a 0. For instance, for the dataset dataset_subMNIST.mat, the first digit in X_test (i.e. X_test(:,:,1)) is a 1, so the first coefficient of Y_test (i.e. Y_test(1)) is -1. The second digit X_test(:,:,2) is a 0, so the second coefficient Y_test(2) is +1.

Binary classifier model. You will aim to train a classifier to determine whether a handwritten digit is a 0 or not. You will use a linear model of the form (1); it is implemented in the provided Matlab code, where x is the image to be tested (to check whether it is the digit 0 or not) and w is the variable that needs to be trained. In order to test the full training set with the learned variable w, the provided Matlab function d_test will be used. To determine the percentage of errors made by the classifier, one can compute

    E(w) = 100 \times \frac{1}{I} \sum_{i=1}^{I} \frac{|d_w(x_i) - y_i|}{2},    (7)

where (x_i)_{1 ≤ i ≤ I} are the testing images, and (y_i)_{1 ≤ i ≤ I} are the associated labels (equal to +1 if the associated digit is a 0, and to -1 if it is not a 0).
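For concreteness, here is one possible vectorised way to evaluate (7). The variable names X_test, Y_test and w follow the dataset description above, but the exact implementation in the provided d_test may differ:

    % One possible vectorised evaluation of the error rate (7).
    % Assumes Y_test is an I x 1 column vector with entries in {-1,+1}.
    [Nx, Ny, I] = size(X_test);
    Xmat   = reshape(X_test, Nx*Ny, I);   % each column is a vectorised image
    y_pred = sign(Xmat' * w);             % d_w(x_i) for all test images at once
    E      = 100 * sum(abs(y_pred - Y_test)) / (2*I);   % percentage of errors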
Additional information. Further details on the provided Matlab functions are given below.

- op_norm.m: val = op_norm(A, At, im_size). Computes the squared spectral norm of the operator A.
  - A and At: the operator and its adjoint; both must be defined as functions.
  - im_size: size of the input, e.g. if A applies to a variable of size [Nx,Ny], then im_size = [Nx,Ny].
- huber.m: h = huber(x, delta). Computes the Huber function defined in (6).
  - x can be either a scalar or a vector.
  - delta: scalar value corresponding to the parameter δ in (6).
  - If x is a vector of size L, then the output of the function is a vector of size L as well.
- huber_grad.m: g = huber_grad(x, delta). Computes the gradient of the Huber function defined in (6).
  - x can be either a scalar or a vector.
  - delta: scalar value corresponding to the parameter δ in (6).
  - If x is a vector of size L, then the output of the function is a vector of size L as well.
  - This function must be updated.

TO BE COMPLETED: the parts of the code to be completed (or changed) are highlighted with comments in the provided files.
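To make the interface described above concrete, a possible vectorised implementation of huber.m, consistent with definition (6), is sketched below (the provided file may be written differently):

    function h = huber(x, delta)
    % Huber function (6), applied componentwise to x.
    % x: scalar or vector; delta: positive scalar parameter.
    h = zeros(size(x));
    quad = (abs(x) <= delta);                         % quadratic region |u| <= delta
    h(quad)  = 0.5 * x(quad).^2;
    h(~quad) = delta * abs(x(~quad)) - 0.5 * delta^2; % linear region |u| > delta
    end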
2 Tasks

1. Before designing a stochastic gradient forward-backward algorithm to train a classifier, you will first build a classical forward-backward algorithm to solve (3) (without stochastic gradient). The questions below will guide you through this task.

(a) [1 Mark] Let X = (x_\ell)_{1 ≤ \ell ≤ L} ∈ R^{N x L} be the matrix containing in its ℓ-th column the input x_ℓ. Let Y = (y_\ell)_{1 ≤ \ell ≤ L} ∈ R^L. Show that

    (\forall w \in \mathbb{R}^N) \quad h(w) = \frac{1}{L} \Phi\big( Y \odot (X^\top w) \big),    (8)

where ⊙ denotes the componentwise product and Φ : R^L → R is defined as

    (\forall u = (u_\ell)_{1 \le \ell \le L} \in \mathbb{R}^L) \quad \Phi(u) = \sum_{\ell=1}^{L} \varphi(u_\ell),    (9)

with φ defined in (6).

(b) [2 Marks] What is the gradient of the function h? What is the Lipschitz constant of its gradient?

(c) [1 Mark] What is the proximity operator of the function g?

(d) [1 Mark] Give the FB iterations to solve (3).

2. [2 Marks] You will now implement the FB algorithm with Matlab.
- In file Main.m, compute the Lipschitz constant of ∇h.
- In file FB.m, update the first lines corresponding to the different parameters, functions and operators involved to run the FB algorithm to solve (3):
  - gamma: step-size of the FB algorithm;
  - g and h: compute the functions g and h given in equations (4) and (5), respectively;
  - grad = @(w): gradient of the smooth function, computed at w. To compute this gradient, you also need to update the file huber_grad.m, with function g = huber_grad(x, delta);
  - prox = @(w, T): proximity operator of the proximable function, computed at w, with parameter T.
- In file FB.m, update the FB steps.

3. [2 Marks] Run the FB algorithm to solve problem (3) on both MNIST datasets. For each dataset, comment on the results, and give a figure showing the percentage of error of the classifier obtained during the training process, as a function of time.

4. For this task, you will aim to build a stochastic gradient forward-backward (SGFB) algorithm to solve (3).

(a) [1 Mark] For every k ∈ N, let 𝕃_k ⊂ {1, . . . , L}, and let L_k = card(𝕃_k). Let X_{𝕃_k} = (x_\ell)_{\ell ∈ 𝕃_k} ∈ R^{N x L_k} and Y_{𝕃_k} = (y_\ell)_{\ell ∈ 𝕃_k} ∈ R^{L_k}. In the context of SGFB, give the approximated gradient of h, only involving the functions and operators associated with the indices in 𝕃_k.

(b) [1 Mark] Give the SGFB iterations to solve problem (3), assuming that, at each iteration k ∈ N, a subset 𝕃_k ⊂ {1, . . . , L} of the L component functions is randomly selected.

(c) [1 Mark] State clearly the convergence conditions.

5. [2 Marks] You will now implement the SGFB algorithm with Matlab.
- In file FB_sto.m, update the first lines corresponding to the different parameters, functions and operators involved to run the SGFB algorithm to solve (3). They are all the same as for FB, apart from the computation of the approximated gradient, which should only involve the functions/operators associated with the selected indices Ind.
- In file FB_sto.m, update the FB steps.

Additional information: in file Main.m, you will need to choose the parameters p and theta:
- p: parameter with values in ]0, 1], corresponding to the proportion of indices ℓ ∈ {1, . . . , L} that are randomly selected at each iteration;
- theta: parameter controlling the convergence speed, which must appear in the choice of the step-size.

Remark: in this task, the main difficulty is to build the approximated gradient; finding a suitable Matlab syntax is part of the question.

6. [2 Marks] Run the SGFB algorithm on the MNIST dataset only containing 0 and 1 digits, considering the following parameters: p ∈ {1, 1%, 10%, 50%} and θ ∈ {0.6, 0.8, 1}.
Give tables (see below) with (i) the percentage error when applying the learned classifier to the test sets, (ii) the total time necessary to train the classifier, and (iii) the final objective value of h + g. You may want to run the algorithm multiple times and take an average of both the percentage error and the convergence time.
Remark: if you prefer, you can provide figures instead.

Percentage of error:
    theta \ p |  1  |  1%  |  10%  |  50%
    0.6       |     |      |       |
    0.8       |     |      |       |
    1         |     |      |       |

Convergence time:
    theta \ p |  1  |  1%  |  10%  |  50%
    0.6       |     |      |       |
    0.8       |     |      |       |
    1         |     |      |       |

Final objective value:
    theta \ p |  1  |  1%  |  10%  |  50%
    0.6       |     |      |       |
    0.8       |     |      |       |
    1         |     |      |       |

7. [2 Marks] Comment on the Matlab results obtained for the previous questions. Based on your comments, choose the best values for p and θ, and train the classifier with the SGFB algorithm on the full MNIST dataset. Comment on your results.
Remark: the problem being convex, the minimum objective value should be the same for all methods. You may nevertheless observe that this is not the case; this might be due to the considered stopping criterion. Keeping this in mind, you can use the results obtained with the classical FB algorithm as ground-truth results. Do not change the stopping criterion; simply acknowledge this fact if you observe it in your simulations.

[Additional 2 Marks] if the guidelines are respected.

Matlab help

To obtain average values in Task 6, one can use loops in Matlab to automatically change the parameters and run an algorithm multiple times; one possible pattern is sketched below.
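A minimal sketch of such an experiment loop is given below. The call FB_sto(p, theta) and the helper compute_error are hypothetical placeholders for your own code, not part of the provided toolbox:

    % Sketch of an experiment loop for Task 6 (illustrative only).
    % FB_sto(p, theta) and compute_error(w) are hypothetical placeholders.
    p_list     = [1, 0.01, 0.1, 0.5];
    theta_list = [0.6, 0.8, 1];
    n_runs     = 5;                       % number of repetitions to average over
    err_avg  = zeros(numel(theta_list), numel(p_list));
    time_avg = zeros(numel(theta_list), numel(p_list));
    for it = 1:numel(theta_list)
        for ip = 1:numel(p_list)
            for r = 1:n_runs
                tic;
                w = FB_sto(p_list(ip), theta_list(it));          % train the classifier
                time_avg(it,ip) = time_avg(it,ip) + toc/n_runs;
                err_avg(it,ip)  = err_avg(it,ip) + compute_error(w)/n_runs; % eq. (7)
            end
        end
    end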