EE6435 Homework 3

Points: 80.
Out: Oct. 15, 2020 (Thursday)
Due: 11:59 PM, Oct. 28, 2020 (Wed.). No late homework will be accepted.

Hand-in method and requirement: name your notebook file (.ipynb) as yourlastname-firstname-studentID-hw3.ipynb. For example, if your name is Amy Zhang, the file should be named zhang-amy-5678910-hw3.ipynb. Also attach an HTML file (generated from the notebook file) with your notebook, using the naming rule yourlastname-firstname-studentID-hw3.html. (-10 points if these files are missing.)

You are allowed to form a group of two for this homework. In that case, both of you will get the same grade for this homework. If you choose to do it by yourself, +10 points (that means you could get 90).

======================================================================

Homework overview: Implement the Naive Bayes Classifier (NBC) for the given training data, apply it to the given testing data, and report the accuracy on both the training and testing data.

You need to use two methods to learn the conditional probabilities for continuous attributes. One is based on a parametric distribution such as the normal distribution. The other is discretization (you need to figure out how to do this and describe it in the report).

You need to implement NBC yourself. Calling any API or existing function that implements NBC will lead to 0 for this homework.

Data: Canvas, files/homework/hw3 (training sample.csv, testing sample.csv)
The data format and meanings are self-explanatory.

Requirement and grading:

1. (20 pts) Submit two programs, one for each of the two methods of computing conditional probabilities for continuous attributes. Each Python program must take two files as inputs. One is the training data and the other is the testing data. In practice, you can separate the training and testing. But in this homework, we will look at them together. Don't hardcode the input files, because we can change the contents of the test data (while keeping the same format).
   a. If you do not follow the naming and submission requirement (ipynb + html files), -10 pts.
   b. If you hardcode your input files, -10 pts.
   c. Please use the given file format directly. If you feel you must convert the format, you should wrap the conversion part in a function and generalize it to other files. If any manual intervention is needed, it counts as hardcoding and we will deduct 10 points (refer to b).

2. (15 pts) Required outputs:
   a. Part 1: output the accuracy on both the training data and the testing data using the following format:
      i. The accuracy on training data is ____. The accuracy on testing data is ____.
   b. Part 2: for the first five samples in the test file, output the results of P(class | attributes) for all classes using the following format:
      i. P(class 1 | sample 1)=____. P(class 2 | sample 1)=____ ...
      ii. P(class 1 | sample 2)=____. P(class 2 | sample 2)=____ ...
      iii. ...
   c. Part 3: output the learned conditional probabilities P(attribute | class) for each class and attribute. For continuous attributes, you should output the normal distribution parameters and the discretization results.
   d. The output should only include the required information. Don't include any debug information in the final program.
   e. For any new test files, your code should be able to generate the outputs for the new input. Unless you hardcode your input files, you should not have this issue.

3. (10 pts) Test your program using the given data. 10 pts for correct outputs.

4. (15 pts) Test your program using a different data file (same format, but different number of samples and values). 15 pts for correct outputs.

5. (20 pts) Submit one report in PDF format containing the following information:
   a. Describe your method of discretization of the continuous attributes.
   b. Compare the two methods of computing conditional probabilities of continuous attributes. Which one is better and why?
      i. You should not use Chinese in your report. English writing is part of the training of this course.
      ii. Be concise and accurate. Just include the required information (e.g. don't print the data you read or debugging information in the report).
      iii. 5 pts are reserved for clarity. If we need to read multiple times to understand your report, -5.

6. Note that items 4 and 5 depend on whether your programs can run correctly on our test files. If the code has bugs and we cannot generate the output, you cannot get credit for items 4 and 5.
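To make the parametric method in the overview concrete, here is a minimal sketch of how a normal distribution could supply P(attribute | class) and how the per-class scores could be normalized into P(class | sample). It is an illustration under stated assumptions, not a reference solution: the function names (fit_gaussian, gaussian_log_pdf, posteriors) are hypothetical, all attributes are assumed continuous, and the sample variance with a small floor is one choice among several.

```python
import math

def fit_gaussian(values):
    """Estimate (mean, variance) of one attribute within one class."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance (divide by n - 1); assumption: a tiny floor guards
    # against a zero variance when all values in the class are identical.
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return mean, max(var, 1e-9)

def gaussian_log_pdf(x, mean, var):
    """Log of the normal density, computed directly to avoid underflow."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def posteriors(sample, priors, params):
    """Return {label: P(label | sample)}.

    sample: list of continuous attribute values
    priors: {label: P(label)}
    params: {label: [(mean, var), ...]} with one pair per attribute
    """
    log_joint = {}
    for label, prior in priors.items():
        log_joint[label] = math.log(prior) + sum(
            gaussian_log_pdf(x, m, v) for x, (m, v) in zip(sample, params[label])
        )
    # Subtract the largest log score before exponentiating, then normalize.
    top = max(log_joint.values())
    unnorm = {label: math.exp(lj - top) for label, lj in log_joint.items()}
    total = sum(unnorm.values())
    return {label: u / total for label, u in unnorm.items()}
```

Working in log space keeps the product of many small densities from underflowing to zero. The (mean, variance) pairs returned by fit_gaussian are the kind of normal distribution parameters that Part 3 of the required outputs asks for.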
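The handout leaves the discretization method open, so the following is only one possible scheme, sketched for illustration: equal-width bins whose cut points come from the training values, with add-one (Laplace) smoothing so that an empty bin does not produce a zero probability. The helper names (make_bin_edges, to_bin, binned_conditionals), the bin count, and the smoothing constant alpha are all assumptions, not part of the assignment.

```python
from collections import Counter

def make_bin_edges(values, n_bins):
    """Interior cut points of n_bins equal-width bins over the training range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def to_bin(x, edges):
    """Bin index of x; values outside the training range fall into the
    first or last bin."""
    return sum(x >= e for e in edges)

def binned_conditionals(values, labels, n_bins, alpha=1.0):
    """Return (edges, {label: [P(bin 0 | label), ..., P(bin n_bins-1 | label)]})."""
    edges = make_bin_edges(values, n_bins)
    counts = {label: Counter() for label in set(labels)}
    for x, label in zip(values, labels):
        counts[label][to_bin(x, edges)] += 1
    probs = {}
    for label, counter in counts.items():
        # Laplace smoothing: add alpha to every bin count.
        total = sum(counter.values()) + alpha * n_bins
        probs[label] = [(counter[b] + alpha) / total for b in range(n_bins)]
    return edges, probs
```

Computing the cut points from the training data only, and reusing them for the testing data, keeps the test set out of the learning step. Other choices, such as equal-frequency bins, are equally valid and would need to be justified in the report.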