
FIT1043 Assignment 1: Description

Due date: Friday 4 Sept 2020 – 11:55 pm

Aim

This assignment aims to explore and visualise data using Python as a data science tool. It will test your ability to:
1. read data files in Python and extract related data from them
2. use various graphical and non-graphical tools for performing exploratory data analysis and visualisation
3. use basic tools for managing and processing data
4. communicate your findings in your report.

This is an individual assignment.

Data

COVID-19 is a respiratory illness caused by a new virus, and it has changed our lives significantly. We aim to explore two datasets that contain relevant information about the virus and see whether decisions and features such as applying a lockdown, or the GDP of a country, had any effect on the spread of the coronavirus.

To carry out this analysis, we need information about the new and total confirmed cases and deaths due to coronavirus, as well as the GDP and the lockdown date of each country.

We use the following two datasets in this assignment:
1. The Corona Virus dataset (Covid-data.csv) contains information about new cases, deaths and GDP of several countries. Although there are many countries in the world, we filtered the data to keep the assignment as simple as possible, so it covers only the following countries: Australia, China, France, Iran, Italy, Spain, United Kingdom, United States.
The data set has the following columns: date, location, total_cases, new_cases, total_deaths, new_deaths, gdp_per_capita, population
2. The lockdown dataset (CountryLockdowndates.csv) contains the lockdown date and the name of the country that applied the lockdown.

Hand-in Requirements

Please hand in three files: a PDF file containing your answers, a CSV file containing the cleansed data set, and a Jupyter notebook file (.ipynb) containing your Python code for all the questions. Please follow these requirements for your submission:
1. The PDF file should contain:
- Answers to the questions. Make sure to include screenshots/images of the graphs you generate in your report (you will need to use screen-capture functionality to create appropriate images). Also include your Python code as text, not as screenshots, to justify your answers to all the questions. Turnitin cannot generate a report from screenshots of code, and you will lose 20% of the assignment mark if you include screenshots of your code instead of writing or copying the code itself.
To generate the PDF report, you can write your report in Word, but you need to convert it to PDF before submission. Alternatively, an easier way is to generate a PDF version of your Jupyter notebook by pressing Ctrl+P in the Jupyter notebook. This PDF file is mandatory, as Monash University uses it for the Turnitin check.
2. The .ipynb file should contain:
- Your Python code for this assignment. Please use the template provided under Assignment 1 resources on Moodle (StudentID_FIT1043_Assignment1_Template.ipynb).
3. The CSV file should contain:
- The cleaned data that is exported at the end of Task 1, based on the requirements specified in Task 1.

You will need to submit three separate files. Zip, rar or any other file compression format is not acceptable and will incur a penalty of 10%.

You will be penalised by 5% of the assignment mark (5% out of 10 marks) for every day that you submit after the due date.
If you cannot submit your assignment before the due date, make sure to submit your files at most 7 days after the due date; assignments submitted after 11 September, 11:55 pm will not be marked.

Assignment Tasks:

There are two tasks that you need to complete for this assignment. You need to use Python to complete the tasks.

Task 1-Data wrangling

First, you need to extract the required information from the two data sources, Covid-data.csv and CountryLockdowndates.csv, based on the analysis requirements described in the previous section, Data. Then you need to clean the data and integrate the data sets. This process is called data wrangling. Note that you must not delete any row from the Covid-data.csv dataset during the data wrangling process.

To cleanse the Covid-data.csv data set, check all the columns one by one and make sure their values are correct. For example, we do not expect to see any value higher than 100% in a column that holds a percentage. Moreover, where values are missing, you should be able to find the correct values from the values of other columns. This is an important part of data science: check every column, detect its errors and fix them.

Note that the lockdown data contains different dates for different states/provinces of a country. Use the earliest (minimum) date as the lockdown date for a country.

At the end of the task, export the cleansed dataframe that results from it as a CSV file and submit it in Moodle with the other two files, as required.
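The "earliest date per country" rule and the requirement to keep every row of Covid-data.csv can be illustrated with a minimal sketch on toy data. It is not a solution to the task. The values are made up, and the lockdown column names (country, province, lockdown_date) are assumptions, since the brief does not list the columns of CountryLockdowndates.csv.

```python
import pandas as pd

# Toy stand-in for Covid-data.csv (a subset of the columns; values are made up).
covid = pd.DataFrame({
    "location": ["Australia", "Australia", "Italy", "Italy"],
    "date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"]),
    "new_cases": [1, 2, 3, 4],
})

# Toy stand-in for CountryLockdowndates.csv. The column names "country",
# "province" and "lockdown_date" are hypothetical, as are the dates.
lockdown = pd.DataFrame({
    "country": ["Australia", "Australia", "Italy"],
    "province": ["State X", "State Y", None],
    "lockdown_date": pd.to_datetime(["2020-02-10", "2020-02-03", "2020-02-05"]),
})

# Earliest (minimum) lockdown date per country, renamed to match the key
# column of the case data.
earliest = (
    lockdown.groupby("country", as_index=False)["lockdown_date"].min()
    .rename(columns={"country": "location"})
)

# A left merge keeps every row of the case data, so no rows are deleted.
merged = covid.merge(earliest, on="location", how="left")
print(merged)
```

Under these assumptions, a call such as merged.to_csv("student_ID_Task1DataSet.csv", index=False) would write the result without the index column. Cleaning the individual columns is a separate step and is not shown here.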
Please name the exported file as follows: student_ID_Task1DataSet.csv.

The following screenshot shows the columns of the dataframe that is the required output of this task (the order of the columns is important).

The required column names and their order for the CSV file are as follows: location, date, total_cases, new_cases, total_deaths, new_deaths, gdp_per_capita, population, lockdown_date

Task 2 Exploration

In this part, you need to explore the dataset that you generated in Task 1. Note that exploration is not just a visualisation with a brief explanation. You can watch the assignment explanation recording provided on Moodle for further clarification of how a good exploration can be performed on a dataset.

1. Create a line chart to show the trend of the daily number of new cases for each country and explore the result of the visualisation (create one line chart for each country).
2. Add a vertical line for the lockdown date to the line chart of each country that you created in the previous question and explore whether the lockdown affected the trend shown in the plot. Is the effect similar for all countries? Why do you think so?
Following is an example of the expected plot for this question.
Figure 1. Example of the output of question 2 of Task 2
3. Explore whether there is a relation between the daily new case/death rate and the GDP of a country. To this end, you need to calculate:
- The average GDP of the countries. Then divide the countries into two groups: one whose GDP is above the average GDP, and another whose GDP is below the average GDP. From now on, we call the former group AboveGDP and the latter BelowGDP.
- The daily new case rate (new cases divided by population) for each country
- The daily new death rate (new deaths divided by population) for each country
Then you need to create two line charts: one that shows the new case rate of the groups AboveGDP and BelowGDP, and another that shows the death rate of the two groups (AboveGDP and BelowGDP).
a) Which group (AboveGDP or BelowGDP) usually had higher values of the case rate?
b) Which group (AboveGDP or BelowGDP) usually had higher values of the death rate?
c) We would have expected the case rate and death rate of group AboveGDP to be lower than those of group BelowGDP. Does your visualisation match this expectation? If not, why do you think the expectation differs from reality?

GOOD LUCK!
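As an illustration of the calculations named in question 3 of Task 2, the sketch below works through them on made-up numbers for three hypothetical countries (A, B, C). It is not a solution to the task. Averaging the country rates within each group per day is an assumption; the brief does not say how the group lines should be aggregated.

```python
import pandas as pd

# Made-up values for three hypothetical countries over two days.
df = pd.DataFrame({
    "location": ["A", "A", "B", "B", "C", "C"],
    "date": pd.to_datetime(["2020-04-01", "2020-04-02"] * 3),
    "new_cases": [100, 200, 30, 60, 10, 20],
    "new_deaths": [10, 20, 3, 6, 1, 2],
    "gdp_per_capita": [50000.0, 50000.0, 20000.0, 20000.0, 5000.0, 5000.0],
    "population": [1_000_000] * 6,
})

# Average GDP over countries: take one value per country first, so that a
# country with more rows does not weigh more in the mean.
gdp_by_country = df.groupby("location")["gdp_per_capita"].first()
average_gdp = gdp_by_country.mean()

# Label every row with its country's group.
df["gdp_group"] = df["gdp_per_capita"].gt(average_gdp).map(
    {True: "AboveGDP", False: "BelowGDP"}
)

# Daily rates: new cases (or deaths) divided by population.
df["case_rate"] = df["new_cases"] / df["population"]
df["death_rate"] = df["new_deaths"] / df["population"]

# Assumed aggregation: mean of the country rates within each group per day.
# The result has one column per (rate, group) pair.
group_rates = (
    df.groupby(["date", "gdp_group"])[["case_rate", "death_rate"]]
    .mean()
    .unstack("gdp_group")
)
print(group_rates["case_rate"])
```

With matplotlib installed, group_rates["case_rate"].plot() would draw one line per group. Dividing each group's total cases by its total population is another defensible aggregation and can give different curves.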
