
COMP 2019 Assignment 2 Machine Learning

Please submit your solution via LEARNONLINE. Submission instructions are given at the end of this assignment.

This assessment is due on Sunday, 14 June 2019, 11:59 PM.
This assessment is worth 20% of the total marks.

In this assignment you will aim to identify which hand gesture is being performed based on recorded Electromyography (EMG) data. You will perform machine learning tasks, including training a classifier, assessing its output, and optimising its performance. You will document your findings in a written report. Write concise explanations; approximately one paragraph per task will be sufficient.

Download the data file for this assignment from the course website (file EMG.zip). The archive contains the data file in CSV format, and some Python code that you may use to visualise a decision tree model.

Before starting this assignment, ensure that you have worked through the three Machine Learning modules and Practicals 2-3. The tasks set in this assignment require understanding of the Python programming language, the Jupyter Python notebook environment, and an overall understanding of machine learning training and evaluation methods using the scikit-learn Python library.
You will need a working Python 3.x system with the Jupyter Notebook environment and the sklearn package installed. The Anaconda 3 Python distribution (https://www.anaconda.com/distribution/) is recommended, as it includes the packages and tools required for this assignment.

Documentation that you may find useful:
- Python: https://www.python.org/doc/
- Jupyter: https://jupyter-notebook.readthedocs.io/en/stable/
- Scikit-learn: https://scikit-learn.org/stable/
- Numpy: https://docs.scipy.org/doc/
- Pandas: https://pandas.pydata.org/ (optional, for reading the data file)

Preparation

Create a Jupyter notebook and set the random state based on your student ID.

    import numpy as np
    np.random.seed(1234)  # use your Student ID in place of 1234

Include this code as the preamble to your code in the Jupyter notebook.

Then, load the data. Use

    import numpy as np
    data = np.loadtxt("EMG.csv", skiprows=1, delimiter=",")

to load the data. Type this code into the notebook; copying and pasting from this document may produce a syntax error. Students familiar with the Pandas library may use that to load and explore the data instead.

Familiarise yourself with the data. There are 65 columns and 11678 rows. The first 64 columns represent the predictors, and the 65th column represents the target label. The 64 predictors are organised in 8 blocks, where each block corresponds to Electromyography (EMG) data obtained at the same time instant. There are 8 time instants, 0, ..., 7. In each block there are readings from 8 sensors (S1, ..., S8).
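The block layout of the predictors can be sketched in a few lines of Python. This is only an illustration: it assumes that the header names follow the pattern S<sensor>_<time> and that the columns are ordered block by block (all sensors at time 0, then all sensors at time 1, and so on). Check both assumptions against the header row of EMG.csv. The names predictor_names and index_of are hypothetical helpers, not part of the assignment files.

```python
# Hypothetical helper: build the 64 predictor names under the ASSUMED
# pattern S<sensor>_<time>, with blocks ordered by time instant 0..7
# and sensors numbered 1..8 inside each block.
N_SENSORS = 8
N_TIMES = 8


def predictor_names(n_sensors=N_SENSORS, n_times=N_TIMES):
    """Return column names block by block: S1_0..S8_0, S1_1..S8_1, ..."""
    return [
        f"S{sensor}_{time}"
        for time in range(n_times)
        for sensor in range(1, n_sensors + 1)
    ]


names = predictor_names()

# Map each assumed column name to its zero-based column index.
index_of = {name: i for i, name in enumerate(names)}

print(len(names))          # number of predictor columns
print(names[:8])           # first block: all sensors at time instant 0
print(index_of["S2_3"])    # sensor 2 at time index 3 (the fourth instant)
```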
Hence, the column titled S2_3 contains sensor readings taken from the second sensor, S2, at the fourth time instant.

The last column, titled Target, represents the gesture that was performed while taking the sensor readings. There are four gestures, each encoded as an integer in the range {0, ..., 3}.

Explore the distribution of data in each column.

Task 1: Report

Write a concise report showing your analysis for Questions 1-6 described below.

Demonstrate that you have followed appropriate training and evaluation procedures and justify your conclusions with relevant evidence from the evaluation output.

As part of the assignment you will need to decide and justify which training and evaluation procedures are appropriate for this data set and the given questions.

Where there are alternatives (e.g. measures, procedures, models, conclusions), demonstrate that you have considered all relevant alternatives and justify why the selected alternative is appropriate.

Ensure that the report is professionally presented and self-contained.

Do not include the Python code in your report; instead, select relevant output from your program for use in justifications and discussion. Do not copy and paste the entire output into the report.
The Jupyter notebook containing your code and complete output will be submitted as a separate deliverable.

Question 1: Evaluation Metric

Choose an appropriate measure to evaluate the classifier. Select among Accuracy, F1-measure, Precision, Recall, or ROC curve. Justify your selection.

Note that you will need to use the same measure for all tasks in this assignment.

Question 2: Baseline

Construct a classifier that always predicts the majority class (as seen in the training data) for each sample. What performance can we expect from this simple model when applied to new data? Use a confusion matrix and/or classification report to support your analysis.

Question 3: Nearest Neighbour

Train a k Nearest Neighbour classifier (KNeighborsClassifier) to predict Target. Use the Euclidean distance, 5 neighbours, and uniform weighting for the classifier. This should be the default offered by sklearn for this classifier.

Ensure that you follow correct training and evaluation procedures.
1. Assess how well the classifier performs on the prediction task.
2. What performance can we expect from the trained model if we applied it to new data?

Question 4: Decision Tree

Train a DecisionTreeClassifier to predict Target. Use the default parameter values for the classifier (that is, don't specify your own values).

Ensure that you follow correct training and evaluation procedures.
1. Assess how well the classifier performs on the prediction task.
2. What performance can we expect from the trained model if we applied it to new data?

If you wish to visualise the decision tree you can use the function print_dt provided in dtutils.py in the Assignment 2 zip archive:

    import dtutils
    dtutils.print_dt(tree, feature_names=flabels)

where tree refers to the trained decision tree model, and flabels is a list of feature names (columns) in the data. This function prints a hierarchical representation of the tree where nodes deeper in the tree are indented further. For internal nodes, the children are shown.
For leaf nodes, the class label associated with the node is shown, as well as the frequency of each class among the samples associated with the node (in square brackets).

Question 5: Diagnosis

Does the Decision Tree model suffer from overfitting or underfitting? Justify what problem exists, if any, and describe how you have arrived at your assessment.

If the model exhibits overfitting or underfitting, revise your training procedure to remedy the problem, and re-evaluate the improved model. The DecisionTreeClassifier has a number of parameters that you can consider for tuning the model:
- max_depth: maximum depth of the tree
- min_samples_split: minimum number of samples required to split an internal node in the tree
- max_leaf_nodes: maximum number of leaf nodes in the tree
- min_samples_leaf: minimum number of samples per leaf node

Question 6: Recommendation

Which of the models you trained should be selected for the prediction task? Ensure that you use the appropriate results for making a decision. Justify your recommendation.

Submission Instructions

Submit a single zip archive containing the following:
- emg.ipynb: the Jupyter Notebook file (in ipynb format).
- emg.html: the HTML version of emg.ipynb showing the notebook including all output. Create this by selecting File > Download as > HTML after having run all cells in the Jupyter notebook.
- emg.pdf: the report as specified in Task 1 (i.e. your answers to Questions 1-6) in PDF format.

Restart your Python kernel and run all cells from the top to ensure your code runs without errors prior to saving the notebook and its HTML version.

Please check that all files are in the appropriate format before submitting.

Marking Scheme

Q1: Metrics (10 marks)
- Appropriate measure selected and justified

Q2: Baseline (10 marks)
- Appropriate measure selected and justified

Q3: k Nearest Neighbour (15 marks)
- Correct training procedure applied
- Correct evaluation procedure applied
- Correct conclusion analysis

Q4: Decision Tree (15 marks)
- Correct training procedure applied
- Correct evaluation procedure applied
- Correct conclusion analysis

Q5: Diagnosis (25 marks)
- Correct diagnosis
- Correct revised training and evaluation procedure applied

Q6: Recommendation (15 marks)
- Correct recommendations
- Recommendations justified by evaluation results

Report format (10 marks)
- Well-structured report
- Professional presentation
- Free of grammar and spelling errors
- Describes the training process and assessment procedures used along with the findings
- Includes only relevant data with related discussion
- Does not include code

Jupyter notebook (deductions apply if criteria are not met)
- Random state set based on Student ID at the start of each question
- Executes correctly when using Run All from the top
- Contains only relevant code, no errors
- Uses only packages/code mentioned in this assignment
- Copy saved as HTML format submitted
- Matches the contents of the report

No marks will be awarded for a question if the code in the notebook and the section in the report are missing or don't align with each other. It is not sufficient to submit only a report or only code.
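As a rough illustration of the kind of procedure Questions 2 to 5 refer to, the sketch below trains a majority-class baseline, a k Nearest Neighbour classifier, and two decision trees, then compares training and held-out scores. It is a sketch under stated assumptions, not a model answer: it uses synthetic data from make_classification in place of EMG.csv, a single stratified hold-out split (cross-validation is an equally valid option), a placeholder seed of 1234, an arbitrary max_depth of 5, and the accuracy returned by score only because it is the sklearn default. Choosing and justifying the metric and the procedure remains part of the assignment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

SEED = 1234  # placeholder; the assignment asks for your Student ID
np.random.seed(SEED)

# ASSUMPTION: synthetic stand-in for EMG.csv (64 predictors, 4 classes).
X, y = make_classification(
    n_samples=1000,
    n_features=64,
    n_informative=16,
    n_classes=4,
    random_state=SEED,
)

# Hold out a stratified test set so class proportions match in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=SEED
)

models = {
    # Always predicts the most frequent class seen in the training data.
    "baseline": DummyClassifier(strategy="most_frequent"),
    # Settings named in Question 3, written out explicitly.
    "knn": KNeighborsClassifier(
        n_neighbors=5, weights="uniform", metric="euclidean"
    ),
    # Default tree, as in Question 4 (seed fixed for reproducibility only).
    "tree": DecisionTreeClassifier(random_state=SEED),
    # One possible restricted tree; the depth of 5 is an arbitrary example.
    "tree_depth5": DecisionTreeClassifier(max_depth=5, random_state=SEED),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)  # accuracy on training data
    test_acc = model.score(X_test, y_test)     # accuracy on held-out data
    scores[name] = (train_acc, test_acc)
    print(f"{name:12s} train={train_acc:.3f} test={test_acc:.3f}")
```

A large gap between the training and held-out scores of a model is one common sign of overfitting, which is the kind of evidence Question 5 asks you to discuss.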
