MAT061课程程序 写作、 辅导Python,Java实验程序

MAT061课程作业 写作、 辅导Python,Java实验作业、 写作Java,Python编程设计作业
MA4601/MAT061 Stochastic Search and Optimisation
Assignment 2: Clustering with Cross-Entropy
Due 9:00am Tuesday 3 March, 2020
The task for assignment 2 is to use the cross-entropy method to do clustering. You will need
to submit two files: a programme file titled YOUR NAME programme.r (or .py, .jl, etc.) and a report
as a pdf file titled YOUR NAME report.pdf. Submission by email to joneso18@cardiff.ac.uk.
Reference: De Boer, Kroese, Mannor, Rubinstein (2005) A tutorial on the Cross-Entropy
method. Annals of Operations Research 134, pp. 1967.
Clustering
We are given a set of points X R
d and are told to find centroids c1, . . . , ck R
d and a
partition X = X1 Xk in order to minimise.
It can be shown that the centroids have to have the form,
so we can simplify the problem to finding the optimal partition. (If we use a distance other
than Euclidean then the centroids may not have this form.)
Solution Space
W.l.o.g. suppose that X = {x1, . . . , xn} and represent a partition into k subsets by a vector
i = (i1, . . . , in) {1, . . . , k}n, where ih = j when xh Xj . The problem is thus to minimise
S(i) = X.
Our family of probabilities on = {1, . . . , k}
n
is indexed by matrices P = (pij )i=i,…,n;j=1,…,k,
where
P(I = i) = Ynh=1
ph,ih.
P is constrained to be non-negative, with row-sums equal to 1.
1
Test Problem
Once you have your code working, apply it to the data set jain.txt, available on Learning
Central (taken from Jain Law (2005) Data Clustering: A Users Dilemma). Do it for k =
2, 3, 4, . . .
The report should describe the choices made in implementing the CE method and why they
were made. It should be presented as a stand-alone document that can be understood without
having to read your code. It should be no more than four pages long. When implementing the
CE method you should consider the following aspects:
The sample size.
The selection quantile.
The smoothing parameter.
Marks will be allocated on the following basis:
50% Code correctness (does it actually work).
20% Quality of solution (how good is the best solution found).
20% Reasoning behind choices for meta-parameters.
10% Clarity of report.”

添加老师微信回复‘’官网 辅导‘’获取专业老师帮助,或点击联系老师1对1在线指导