SDGB-7847
Final Exam

The data we are working with is in longitudinal format. Each column represents a patient, and each row represents the expression reading for one gene (genes 1-5913). Each patient's disease status is marked in the column header. The first 20 patients are marked with "meta", meaning these patients have a form of metastatic cancer (disease=1). The last 20 patients do not have the disease (disease=0).

You will need to transform this data into a model-ready format in order to predict metastatic disease from each patient's expression of each gene.
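As a rough illustration of what "model-ready" means here, the sketch below transposes a genes-by-patients table into one row per patient and derives the disease label from the column header. The exam expects R; this is Python only to show the logic. The header names (`meta_1`, `normal_1`), the toy values, and the helper `to_model_ready` are all hypothetical, since the real file layout is not shown.

```python
# Hypothetical toy input: rows are genes, columns are patients.
# The header names are assumptions; the real headers may differ.
header = ["meta_1", "meta_2", "normal_1", "normal_2"]
rows = [
    [5.1, 4.8, 2.0, 2.2],  # gene 1
    [0.3, 0.4, 0.9, 1.1],  # gene 2
    [7.7, 7.9, 7.5, 7.6],  # gene 3
]


def to_model_ready(header, rows):
    """Transpose genes-by-patients data into one feature row per patient."""
    # zip(*rows) turns each column (one patient) into a tuple of gene values.
    features = [list(column) for column in zip(*rows)]
    # Label is 1 when the header contains "meta", otherwise 0.
    labels = [1 if "meta" in name.lower() else 0 for name in header]
    return features, labels


features, labels = to_model_ready(header, rows)
print(features[0], labels)
```

After this step each patient is one observation, each gene is one predictor column, and `labels` holds the disease indicator.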

Set your R seed to 1234.

Once your data is ready to model, separate it into training and test sets.
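One possible way to do a reproducible split is sketched below, under stated assumptions: a 75/25 split and sampling each class separately, neither of which the exam specifies. It is written in Python for illustration. Python's seeded generator does not produce the same random stream as R's `set.seed(1234)`, so the selected patients would differ from an R solution. The helper `stratified_split` is hypothetical.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=1234):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Keep at least one test case per class.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 20 metastatic patients followed by 20 disease-free patients, as in the exam.
labels = [1] * 20 + [0] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With only 40 patients, splitting by class keeps both disease groups represented in the test set, which matters when sensitivity and specificity are computed later.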

Apply the following algorithms, training on your training data and testing on your test data, to predict disease based on gene expression. From your test-set predictions, report accuracy, sensitivity, and specificity.
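For reference, the three metrics can be computed from the test-set confusion counts as in the following minimal sketch. It treats disease=1 as the positive class. The function name `classification_metrics` and the example vectors are made up for illustration and are not results from the exam data.

```python
def classification_metrics(actual, predicted, positive=1):
    """Compute accuracy, sensitivity and specificity from two label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly diseased patients predicted as diseased.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of disease-free patients predicted as disease-free.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up test-set labels and predictions, for illustration only.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)
```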

RF (RF on the full dataset may take a long time to run due to the number of genes being used as predictor variables)

RF+PCA

KNN + PCA (Use iteration to find optimal value of K)
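The "use iteration" hint for KNN can be read as looping over candidate values of K and keeping the one that scores best. The sketch below is one way to do that in Python, assuming the inputs are already principal-component scores (the PCA step is not shown) and using leave-one-out accuracy on the training data as the selection criterion, which is an assumption rather than something the exam prescribes. The toy points and the helpers `knn_predict`, `loo_accuracy`, and `best_k` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(x, y, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(x)):
        rest_x = x[:i] + x[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_x, rest_y, x[i], k) == y[i]
    return hits / len(x)


def best_k(x, y, candidates):
    """Score every candidate K; ties go to the earliest candidate listed."""
    scores = {k: loo_accuracy(x, y, k) for k in candidates}
    return max(scores, key=scores.get), scores


# Made-up scores on two principal components, four patients per class.
x = [[-2.0, 0.1], [-1.8, -0.3], [-2.2, 0.4], [-1.5, 0.0],
     [1.9, 0.2], [2.1, -0.1], [1.7, 0.3], [2.4, -0.4]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
k, scores = best_k(x, y, candidates=[1, 3, 5, 7])
print(k, scores)
```

Odd values of K avoid tied votes in a two-class problem. Selecting K on the training data only keeps the test set untouched for the final accuracy, sensitivity, and specificity figures.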
In an external document, write a discussion on which algorithm you would choose and why. Discuss what the variable importance plot showed for RF and RF + PCA, the number of principal components you chose and what you chose as your optimal value of K.

Upload your code and your external explanation document by Thursday, April 30th at 8pm.

Thank you for a wonderful class and have a great summer! Stay in touch!
