CSCE 474/874: Introduction to Data Mining
Spring 2021
Homework 3
March 02, 2021

Assignment

Implement the k-means algorithm to perform clustering and compare your results with the results from Weka. Assume that all the attributes are continuous variables.

- Your program must allow the number of clusters (k) to be specified as input.
- Your program must allow the epsilon (change in the sum of the distances from the cluster centers) to be specified as input.
- Your program must allow the number of iterations to be specified as input.

Your program should stop if either the number of iterations is reached or the change in the total sum of the squares of the distances (SSD) falls below epsilon.

Plot the runtime of the algorithm as a function of the number of clusters, the number of dimensions, and the size of the dataset (number of transactions).

Plot the goodness of clustering as a function of the number of clusters and determine the optimal number of clusters.

Compare the performance of your algorithm with that of Weka and summarize your results.

For this assignment you will work in teams. Use the dataset from the domain you will be working on for the project. If the data is not suitable, you may use one of the Weka datasets.

All code must be written by the members of your team. You may NOT use any code from ANY OTHER source, including other students and the Internet.

Due Date

The assignment is due on March 16 and is worth 100 points.

Handin

Hand in, on Canvas, a report along with the listing of your program and the output generated from the run of the test file. Make sure that you have uploaded a signed copy of the Contributions form. Prepare and submit two files as follows:

- Your report, named Lastname1_Lastname2.pdf, in PDF format. The signed Contributions form should be used as the cover page of your report.
- A zip file named Lastname1_Lastname2.zip that includes everything else (your program, the output generated from the run of the test file, etc.). You must include a README file that describes the usage of your program. Make sure your implementation can successfully execute on the CSE server.

Grading Guidelines

Implement the k-means algorithm to perform clustering in a dataset. (50 points)
- Your implementation will be tested on the cse.unl.edu server using the command you provided in the README file. (30 points)
- In the report, you should write a paragraph about your program design. (10 points)

Plot the runtime of the algorithm as a function of the number of clusters, the number of dimensions, and the size of the dataset (number of transactions). (20 points)
- In the report, you should write a paragraph to summarize the observation and elaborate on it.

Plot the goodness of clustering as a function of the number of clusters and determine the optimal number of clusters. (20 points)
- In the report, you should write a paragraph to summarize the observation and elaborate on it.

Compare the performance of your algorithm with that of Weka and summarize your results. (10 points)
- Summarize the differences (if there are any) and elaborate on them (why/how).
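The stopping rule described in the assignment (stop when the iteration limit is reached, or when the change in SSD between two consecutive iterations falls below epsilon) can be illustrated with a small sketch. This is only one possible reading of the requirements, not a reference solution: the function names, the `seed` parameter, the choice of random data points as initial centers, and the handling of empty clusters are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, epsilon, max_iterations, seed=0):
    """Illustrative k-means loop with the two stopping conditions.

    Assumptions (not from the assignment): initial centers are k distinct
    points sampled at random, and an empty cluster keeps its old center.
    Returns (centers, labels, ssd, iterations_run).
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = []
    ssd = None
    previous_ssd = None
    iterations = 0

    for iterations in range(1, max_iterations + 1):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]

        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

        # Total sum of squared distances (SSD) to the assigned centers.
        ssd = sum(
            squared_distance(p, centers[label])
            for p, label in zip(points, labels)
        )

        # Stop early once the change in SSD falls below epsilon.
        if previous_ssd is not None and abs(previous_ssd - ssd) < epsilon:
            break
        previous_ssd = ssd

    return centers, labels, ssd, iterations


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers, labels, ssd, iterations = kmeans(
        data, k=2, epsilon=1e-9, max_iterations=100
    )
    print(centers, labels, ssd, iterations)
```

Because the SSD change needs two consecutive values, this sketch always runs at least two iterations unless the iteration limit is lower. The returned SSD is also one candidate for the "goodness of clustering" measure plotted against k, though the assignment does not prescribe a specific measure.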