
Coursework I: Learning, From Scratch
COMP0169 Team

The total for this coursework is 100 points.

We will be using two datasets that we will provide to you: Iris and MNIST. For each dataset you will be provided with two subsets: a training subset (XX_train_samples.npy and XX_train_labels.npy, with XX the name of the dataset) and a validation subset (XX_val_samples.npy and XX_val_labels.npy). We withhold the test subsets of these two datasets as well as a hidden training set, Hidden. We will test the correctness of your code on Hidden. An anonymised leaderboard with results on the test subsets and on Hidden will be published on Moodle.

The Iris dataset contains the following features, in order: sepal length, sepal width, petal length, petal width. The class names are: Iris Setosa for label 0, Iris Versicolour for label 1, and Iris Virginica for label 2.

Each exercise must be implemented from scratch. Libraries are allowed unless specified otherwise. We encourage you to test the correctness of your results using libraries.

1 Line Fitting (10 points)

a) Implement the normal equation solver function nsolve, which takes as input the matrix X and the target vector y and returns the optimized weights w. (5 points)

b) Implement lineFit(X, y), which should fit a linear function to the input data. Test your implementation on the following task: predict with linear fitting the petal length (cm) of the Iris dataset using the three remaining variables as inputs (sepal length (cm), sepal width (cm) and petal width (cm)). Report the L2 loss on the validation set and plot a graph showing the correlation between y and your prediction on the validation set. (2 points)

c) Implement polyFit(X, y), which should fit a 2nd-degree polynomial to the input data. Test your implementation on the following task: predict with the polynomial the petal width (cm) of the Iris dataset using the three remaining variables as inputs (sepal length (cm), sepal width (cm), petal length (cm)).
The 2nd-degree polynomial should consider all possible pairwise terms, i.e. w1*x^2 + w2*xy + w3*y^2 + w4*x + w5*y + w6 in the case of two input variables x and y. Report the L2 loss on the validation set and plot a graph showing the correlation between y and your prediction on the validation set. (3 points)

2 Clustering (14 points)

a) Implement a function pca(X, ndims) that performs PCA over the input data X and returns both the mean vector of X and the ndims top components. The top components are the eigenvectors linked to the top eigenvalues computed from the covariance matrix. Try your function on the MNIST dataset, which is composed of 10 digit classes. Display the top 10 components fitted on the training dataset as images and check that you can perfectly reconstruct an input digit from the validation set using all components. (7 points)

b) Perform independent research on the clustering algorithm k-means. Implement a function kmeans performing k-means on input data X. Propose the interface to that function (i.e., what are its inputs and outputs?) and explain in three sentences why. Apply your k-means implementation on the MNIST training set with k = 10 clusters and display the centroids as images. (5 points)

c) Describe the k-means algorithm, highlighting similarities to and differences from kNN. Compare the reconstruction loss on the validation set for both k-means and PCA. Write no more than a third of a page. (2 points)

3 Linear Classification (26 points)

a) Implement the normal equation-based binary linear classifier lclass(examplesA, examplesB, testExample), where the first two arguments are the sets of samples from class A and class B respectively and the third is the test sample. The function should return 0 if the test sample is in A and 1 otherwise. For simplicity, it should both train and test in one function call. (5 points)

b) Test this on all the samples in Iris (Setosa vs. non-Setosa, etc.) and propose a simple analysis (text, figure, table) of the results you find, but no longer than a third of a page.
(6 points)

c) Perform independent research on how to do multi-class classification. Implement lmclass(examples, class, testExample), which performs multi-class classification of the examples examples according to the vector of labels class of the same size and tests it with testExample by returning the vector of probabilities of being class i. (10 points)

d) Present your findings from applying multi-class classification on the Iris dataset with 3 classes. You can include figures and tables if needed. Write no longer than a third of a page. (5 points)

4 Non-linear Classification (25 points)

a) Implement classification based on logistic regression using gradient descent (GD) by implementing the gradient function deLogistic(preds, X, Y) and optimizing using GD. preds are the predictions from the model, X is the data and Y are the labels. (5 points)

b) Implement classification based on the hinge loss using GD by implementing the gradient function deHinge(preds, W, x, y) and optimizing using GD. preds are the predictions from the model, W describes the model parameters, x is the data and y represents the labels. (5 points)

c) Implement the kernel SVM function ksvm(kernel, x, y, xtest). The function takes as input a kernel, training data and a set of test points. The function returns the set of support vectors along with the predicted labels. You are allowed to use the scipy optimization library to solve the quadratic problem of the SVM. (10 points)

5 Neural Network (25 points)

a) Devise a three-layer neural network with n hidden states and sigmoid activations for classification. Explain in one sentence how many parameters it has. (2 points)

b) Provide the equation for the gradient, using the chain rule, for the network in point a). (8 points)

c) Implement the binary classifier nnclass(examplesA, examplesB, testExample) that is trained with your implementation of (stochastic) GD and your gradient function using the network. (10 points)

d) Do an analysis of how changes in n affect the accuracy, no longer than a third of a page.
A table and/or plot is welcome. Pay attention to table formatting and labels. (5 points)
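
For Exercise 1a/1b, a minimal NumPy sketch of the normal-equation solver and line fit might look like the following; treating the bias column as something lineFit appends (rather than the caller) is an assumption, not a requirement of the spec.

```python
import numpy as np

def nsolve(X, y):
    # Normal equation: w minimizing ||Xw - y||^2 satisfies (X^T X) w = X^T y.
    # np.linalg.solve is preferred over an explicit inverse for stability.
    return np.linalg.solve(X.T @ X, X.T @ y)

def lineFit(X, y):
    # Append a bias column of ones so the fitted line need not pass
    # through the origin; the last weight is then the intercept.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return nsolve(Xb, y)
```

A quick sanity check is to fit exactly linear data (e.g. y = 2x + 1) and verify the recovered weights.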
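
For Exercise 2a, the eigendecomposition-of-the-covariance recipe the spec describes can be sketched as below; returning (mean, components) with components as columns is one possible convention.

```python
import numpy as np

def pca(X, ndims):
    # Center the data, then take the eigenvectors of the covariance
    # matrix belonging to the largest eigenvalues.
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]        # re-sort descending
    components = vecs[:, order[:ndims]]   # top-ndims eigenvectors as columns
    return mean, components
```

Using all components, projecting and back-projecting reconstructs the input exactly, which matches the "reconstruct perfectly" check the exercise asks for.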
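
For Exercise 2b, one possible interface (the exercise asks you to propose and justify your own) takes the data X and the number of clusters k and returns the centroids plus per-sample assignments, so callers can both inspect clusters and compute a reconstruction loss. A Lloyd's-algorithm sketch under that assumed interface:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    # Lloyd's algorithm: alternate between assigning each sample to its
    # nearest centroid and moving each centroid to the mean of its samples.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        new = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                        else centroids[j]           # keep empty clusters put
                        for j in range(k)])
        if np.allclose(new, centroids):             # converged
            break
        centroids = new
    return centroids, assign
```

Random initialisation from the data points is a common but not unique choice; k-means++ seeding would be a drop-in improvement.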
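
For Exercise 4a, the gradient of the cross-entropy loss with a sigmoid model has the well-known compact form X^T (preds - Y) / m, which a deLogistic sketch can implement directly; the gd_step helper and the learning rate are illustrative additions, not part of the required interface.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deLogistic(preds, X, Y):
    # Gradient of the mean cross-entropy loss w.r.t. the weights, where
    # preds = sigmoid(X @ w); the sigmoid's derivative cancels into
    # the simple residual form (preds - Y).
    m = X.shape[0]
    return X.T @ (preds - Y) / m

def gd_step(w, X, Y, lr=0.1):
    # One plain gradient-descent update (lr is a free hyperparameter).
    preds = sigmoid(X @ w)
    return w - lr * deLogistic(preds, X, Y)
```

On linearly separable data, repeated gd_step calls drive the training accuracy to 100%, which is a convenient correctness check before moving to Iris.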
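
For Exercise 5b/5c, reading "three-layer" as input, one hidden layer of n sigmoid units, and a sigmoid output (one common convention; check your own counting for 5a), the chain-rule gradients can be sketched as follows. The forward/backward function names and shapes are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    # Hidden activations and output, both through a sigmoid.
    h = sigmoid(X @ W1 + b1)        # (m, n)
    out = sigmoid(h @ W2 + b2)      # (m, 1)
    return h, out

def backward(X, Y, h, out, W2):
    # Chain rule for the mean binary cross-entropy with a sigmoid output:
    # dL/d(output pre-activation) simplifies to (out - Y) / m.
    m = X.shape[0]
    d_out = (out - Y) / m                  # (m, 1)
    dW2 = h.T @ d_out                      # (n, 1)
    db2 = d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * h * (1 - h)     # back through the hidden sigmoid
    dW1 = X.T @ d_h                        # (d, n)
    db1 = d_h.sum(axis=0)
    return dW1, db1, dW2, db2
```

A finite-difference check (perturb one weight, compare the loss slope against the analytic gradient) is a reliable way to validate such a backward pass before training nnclass with it.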







