
CITS 2401 Computer Analysis and Visualisation
Assignment 1
Tweet Analysis

Worth: 5% of the unit
Submission: Answer the questions on the quiz server.
Deadline: 11 March 2021 5pm
Late submissions: Late submissions attract a 5% raw penalty per day for up to 7 days (i.e., until 18 March 2021 5pm). After that, the mark will be 0 (zero). Also, any plagiarised work will be marked zero.

1. Outline

Natural language processing (NLP) is a useful yet difficult task. Our UWA Cybersecurity Research Group has been focusing on rumour detection and generation in order to prevent rumours from causing harm to society. As a first step, we built an Automated Rumour Generation Hub (ARGH) that uses various machine learning (ML) and NLP techniques to generate rumours that are difficult for both humans and machines to identify. In particular, Twitter has been used as the source dataset, as we often observe different rumours circulating on this social media platform. However, Twitter doesn't provide analytical functions for us to summarise the data, so we have to do that ourselves.

In this assignment, you will carry out simple data analysis tasks using tweets, as outlined in the Tasks section below, mostly to test your basic Excel competency. More complex tasks will be carried out in other assignments (stay tuned!).

Note 1: This is an individual assignment, so please don't share your solution/code/files with others (only high-level discussion is allowed, e.g., the syntax of a formula, the use of array formulas with other examples, etc.). If your submission is found not to be your original work, you may be penalised.

Note 2: You may use intelligent formatting and colour combinations to display your worksheet in an understandable manner. However, don't over-decorate the worksheet.

Note 3: You can find ARGH here: https://github.com/argh-rumor-detection/ARGH-Rumor-Generation, where you can run ARGH yourself using Google Colab.

2. Tasks

Task 1

Import original.txt into Excel word by word.
Here, the term "word" refers to any sequence of letters separated by a space. Note that the text qualifier should be set to {none} when you import the text. This sheet should be named words_data. Finally, the whole data range should be named words. Figure 1 shows an example of what the output looks like if this task is done correctly.

Figure 1. words_data sheet snippet.

Task 2

Create a new sheet named uniques_data. Import the list of unique words from the provided uniques.txt file. The words should start from cell A1. The whole range should be named uniques.

Task 3

1. In Column B: Calculate the frequency of each unique word in the words_data sheet. You must use an array formula to do this. Name the cell range freq.
2. In Column C: Calculate the number of letters used by each word in words_data. This can be calculated by simply multiplying the number of letters in the word by its frequency count. Name the cell range letters.
3. In Column D: Calculate the rank based on the frequency values. You must use an array formula to do this. Name the cell range rank.

In addition, apply conditional formatting to rank so that the bottom 10 ranked values (i.e., the 10 smallest values) are formatted with light red fill and dark red text.

Task 4

Create a new sheet named stats. Add the following labels in cells A2 to A7:

1. Average
2. Max
3. Min
4. Median
5. Mode
6. SD

Note that SD stands for standard deviation. Also, Average and SD should be rounded to 2 decimal places.

Next, add labels as follows (Cell: Value):

1. B1: Frequency
2. C1: Letters
3. D1: Average
4. E1: Average

- The Frequency category (Column B) uses freq to calculate the statistics of the data.
- The Letters category (Column C) uses letters.
- The Average category (Column D) uses letters where values are greater than the average (i.e., the value in cell C2).
- The Average category (Column E) uses letters where values are smaller than the average.

Populate all the statistics fields (i.e., B2:C7).
You must not use any other supporting cells (i.e., you should calculate all those statistics directly with Excel formulas, using previously populated cells only).

Note: The values in the uniques_data tab (i.e., freq and letters) should be treated as the entire population.

Task 5

Create a new sheet named charts. In this sheet, create a histogram of letters. The bins should start from 0, and the gap between bins is 20. For this, you must use an array formula. Then, format the chart as follows:

1. The gap width is set to 0%.
2. The series outline is a solid black line.
3. The series data labels are set to Outside End.
4. The title is removed.

Task 6

In the charts sheet, create a scatter chart using freq on the x-axis and letters on the y-axis. Then, format the chart as follows:

1. Remove the title.
2. Add a linear trend line and display the R² value.
3. Add the x-axis label Frequency.
4. Add the y-axis label Letters.

A sample image of the charts sheet is provided in Figure 2, and your solution may look similar to it. However, the data shown in the image is sample data and is not the correct result (i.e., your figures may look different). The yellow blocks are added to the image to hide the sample data and avoid confusion.

Task 7

Insert a Treemap of uniques and letters into the charts sheet. Remove the title and the legend. Change the size to 15cm height and 25cm width.

Figure 2. Sample image of the charts sheet (treemap not shown)

3. Submission

You should answer the questions related to the tasks above on the quiz server by the due date, 11 March 2021 5pm (drop dead due date 18 March 2021, with a 5% raw penalty per day).

Submit your Excel workbook on the quiz server. You should name the file A1_[student id].xlsx. For example, if your student ID is 12345678, then your file name is A1_12345678.xlsx. Failure to follow this will result in a penalty of 50%.

4. Rubrics

Criteria: Excel functions (10 marks)
- Understand various Excel functions.
- Demonstrate the ability to carry out various Excel functionalities and tools.
Highly Satisfactory (D, HD): Demonstrated the ability to use Excel functions fluently:
- Correct use of Excel functions as appropriate.
Satisfactory (P, CR): Demonstrated the ability to use Excel functions:
- Some correct uses of Excel functions.
Unsatisfactory (N): Failed to demonstrate the ability to use Excel functions:
- Incorrect use of Excel functions.

Criteria: Excel formulas (20 marks)
- Understand the use of Excel formulas.
Highly Satisfactory (D, HD): Demonstrated the ability to utilise Excel formulas fluently:
- Correct use of Excel formulas most adequate for the problem.
- Comprehensive understanding of Excel formulas and their usage.
Satisfactory (P, CR): Demonstrated the ability to utilise Excel formulas:
- Correct use of Excel formulas for the problem.
- Understanding of Excel formulas and their usage.
Unsatisfactory (N): Failed to demonstrate the ability to utilise Excel formulas:
- Incorrect use of Excel formulas.
- Misunderstanding of Excel formulas and their usage.

Criteria: Excel visualisation (20 marks)
- Understand the use of Excel visualisation tools.
Highly Satisfactory (D, HD): Demonstrated the ability to visualise using Excel:
- The visualisation generated is accurate and comprehensive.
Satisfactory (P, CR): Demonstrated the ability to visualise using Excel:
- The visualisation generated is accurate.
Unsatisfactory (N): Demonstrated the ability to visualise using Excel:
- The visualisation generated is not accurate.

This assignment is worth a total of 50 marks.
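
The calculations described in Tasks 3 to 5 can also be sketched outside Excel as a way to cross-check a worksheet. The following is a minimal Python sketch, not part of the assignment and not the required Excel solution. The sample sentence, the variable names, and the bin_counts helper are hypothetical. The rank is assumed to be descending (rank 1 is the most frequent word, ties share a rank), and the binning is assumed to mirror the behaviour of Excel's FREQUENCY function.

```python
from bisect import bisect_left
from collections import Counter
from statistics import mean, pstdev

# Hypothetical sample standing in for the words and uniques named ranges.
words = "the cat saw the dog and the dog saw the cat run".split(" ")
uniques = ["the", "cat", "saw", "dog", "and", "run"]

counts = Counter(words)

# Column B (freq): how often each unique word occurs in words.
freq = [counts[u] for u in uniques]

# Column C (letters): letters in the word multiplied by its frequency.
letters = [len(u) * f for u, f in zip(uniques, freq)]

# Column D (rank): 1 + number of strictly larger frequencies, so ties share a rank.
# Assumption: descending order, where rank 1 is the most frequent word.
rank = [1 + sum(other > f for other in freq) for f in freq]

# Task 4 treats the data as the entire population, hence the population SD.
avg_letters = round(mean(letters), 2)
sd_letters = round(pstdev(letters), 2)


def bin_counts(values, edges):
    """Count values per bin: <= edges[0], then (edges[i-1], edges[i]], then > edges[-1].

    Assumption: this mirrors Excel's FREQUENCY, which returns one more
    count than there are bin edges.
    """
    result = [0] * (len(edges) + 1)
    for v in values:
        # bisect_left returns the index of the first edge that is >= v.
        result[bisect_left(edges, v)] += 1
    return result


# Task 5: bins start at 0 with a gap of 20 (only three edges shown here).
histogram = bin_counts(letters, [0, 20, 40])

print(freq, letters, rank, avg_letters, sd_letters, histogram)
```

Splitting on a single space follows the Task 1 definition of a word, and pstdev is used instead of stdev because Task 4 states that freq and letters are the entire population.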
