
CSCI – 4146 – The Process of Data Science – Fall 2020
Assignment 2

The submission must be done through Brightspace. The due date and time are shown on Brightspace under Assignments.

● To prepare your assignment solution, use the assignment template notebook available on Brightspace.
● The detailed requirements for your writing and code can be found in the evaluation rubric document on Brightspace.
● Questions will be marked individually with a letter grade. Their weights are shown in parentheses after each question.
● Assignments can be done by a pair of students or individually. If the submission is by a pair of students, only one of the students should submit the assignment on Brightspace.
● We will use plagiarism tools to detect any type of cheating and copying (in both your code and your PDF).
● Your submission is a single Jupyter notebook and a PDF (with the compiled results generated by your Jupyter notebook). File names should be:
  ○ A2-your_name1-your_name2.ipynb
  ○ A2-your_name1-your_name2.pdf
● Failing to submit both files results in a mark of 0 for both students.

Predictive maintenance (PdM) is gaining traction in industry. In PdM, components are replaced as they approach failure, not at prescribed intervals (preventative maintenance). For PdM, equipment is monitored by sensors, and machine learning models are used to predict the remaining useful life (RUL) (Fig. 1) of the equipment based on data streams generated by the sensors. The data is typically a time series of sensor measurements collected until failure.

Figure 1: Illustration of an RUL. [1]

As shown in Fig. 2, a machinery health prognostic program is generally composed of four technical processes: data acquisition, health indicator (HI) construction, health stage (HS) division, and RUL prediction. First, measured data, such as vibration signals, are acquired from sensors to monitor the health condition of machinery.
Then, from the measured data, HIs are constructed using signal processing techniques, artificial intelligence (AI) techniques, etc., to represent the health condition of machinery. After that, according to the varying degradation trends of the HIs, the whole lifetime of the machinery is divided into two or more different HSs. Finally, in the HS that presents an obvious degradation trend, the RUL is predicted by analyzing the degradation trend against a pre-specified failure threshold (FT). [2]

Figure 2: Four technical processes in a machinery health prognostic program. [2]

In this assignment, you will need to predict the RUL of bearings. For the specific data set from bearings (#4 of the datasets in the NASA Prognostics Center repository, https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/), the data consists of vibration measurements collected by accelerometers in the experimental setup (bearing test rig) described in Fig. 16 of this publication and reproduced below. Each accelerometer provides a single scalar measurement per sample. The sampling rate is 20 kHz (20,000 samples per second). The three time-series data sets are described in detail in this document. Each data set consists of individual files, each containing one second's worth of vibration signal measurements recorded at specific intervals. The file name indicates when the data was collected. Each row in a data file is a data point. Each row contains several measurements (channels), one from each accelerometer in the experimental setup.

Example of a bearing (there are several other types).

1. Data understanding and feature engineering (0.1)
   a. We will extract features from each channel of each of the data files of Test set 2. The features will be statistical time-domain features typically used in bearing monitoring. The six features to extract are RMS, variance, skewness, kurtosis, shape factor and crest factor (Table 1) [3]. Use =1 in the formulas for variance, skewness and kurtosis.
      Your dataset should consist of 7 features: the vibration signal plus the six time-domain features.
   b. Build the data quality report.
   c. Identify data quality issues and build the data quality plan.
   d. Analyze your data. Plot the six features as functions of time for each of the channels. Compute and plot the histograms of the vibration signals for each data file. Describe your observations. How similar are the plots of the different channels? Is there any evidence in the plots for which features are the most useful for the RUL prediction task? Is normalization of the data useful?
   e. Preprocess your data according to the data quality plan.

2. Build a baseline model to predict RUL (0.35). In Test set 2, there are four channels, with channel 1 corresponding to the bearing that failed (bearing 1).
   a. Explain what task you're solving (e.g., supervised vs. unsupervised; classification vs. regression vs. clustering or similarity matching, etc.).
   b. Use a feature selection method to select the features to build a model. Consider different feature choices: features from channel 1 only, features from all four channels, and different subsets of the six features.
   c. Select the evaluation metric. Justify your choice.
   d. Perform hyperparameter tuning if applicable.
   e. Train and evaluate your model on test data from Test set 1.
   f. How do you make sure not to overfit?
   g. Plot the learning curve.
   h. Analyze the results.

3. Build a NN model to predict RUL (0.35). Repeat question #2 above, but now use a neural network model to predict RUL. You can use a simple feedforward neural network or the 1D CNN from tutorial 6. Compare the model to your baseline model with a statistical significance test. Use a box plot to visualize your comparison.

4. Concept drift detection (0.2). Use concept drift methods to find out whether there is any drift in the data that can be detected. If so, what type of drift is it? Suggest specific actions to adapt your model to the new concept.

References:
[1] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, "A Data-Driven Failure Prognostics Method Based on Mixture of Gaussians Hidden Markov Models," IEEE Transactions on Reliability, vol. 61, no. 2, pp. 491-503, June 2012.
[2] Y. Lei, N. Li, L. Guo, N. Li, T. Yan, and J. Lin, "Machinery health prognostics: A systematic review from data acquisition to RUL prediction," 2018.
[3] W. Caesarendra and T. Tjahjowidodo, "A review of feature extraction methods in vibration-based condition monitoring and its application for degradation trend estimation of low-speed slew bearing," Machines, vol. 5, no. 4, 21, 2017.
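
As a rough illustration of the feature extraction in question 1a, the following sketch computes the six time-domain features for one channel of one data file. Table 1 is not reproduced in this text, so the formulas are assumptions based on commonly used definitions: population (divide-by-N) variance, standardized third and fourth moments for skewness and kurtosis, shape factor as RMS over mean absolute value, and crest factor as peak absolute value over RMS. The function name `time_domain_features` and the synthetic signal are hypothetical stand-ins; the formulas in Table 1 take precedence wherever they differ.

```python
import numpy as np


def time_domain_features(x):
    """Six time-domain features for one channel (assumed textbook definitions)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)   # population variance: an assumption
    std = np.sqrt(variance)
    rms = np.sqrt(np.mean(x ** 2))
    abs_mean = np.mean(np.abs(x))
    return {
        "rms": rms,
        "variance": variance,
        "skewness": np.mean(centered ** 3) / std ** 3,
        "kurtosis": np.mean(centered ** 4) / std ** 4,  # not excess kurtosis
        "shape_factor": rms / abs_mean,
        "crest_factor": np.max(np.abs(x)) / rms,
    }


# Synthetic stand-in for one data file: 20,000 rows (one second at 20 kHz)
# and 4 columns (one per accelerometer channel). Real files would be loaded
# from the data set instead.
rng = np.random.default_rng(0)
signal = rng.normal(size=(20_000, 4))

# One feature dictionary per channel for this file.
features_per_channel = [
    time_domain_features(signal[:, ch]) for ch in range(signal.shape[1])
]
```

Applying the function to every file in time order would give one row of features per file and channel, which is the form needed for the feature-versus-time plots in question 1d.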
