Chi-squared Goodness of Fit Test Project

Overview and Rationale
This assignment is designed to provide you with hands-on experience in generating random values and performing statistical analysis on those values.

Course Outcomes
This assignment is directly linked to the following key learning outcome from the course syllabus:
Use descriptive, heuristic, and prescriptive analysis to drive business strategies and actions.

Assignment Summary
Follow the instructions in this project document to generate a number of different random values in Excel using a random number generation algorithm, the Inverse Transform. Then apply the Chi-squared Goodness of Fit test to verify whether the generated values belong to a particular probability distribution. Finally, complete a report summarizing the results in your Excel workbook. Submit both the report and the Excel workbook.
The Excel workbook contains all statistical work. The report should explain the experiments and their respective conclusions, along with any additional information indicated in each problem. Be sure to include all your findings along with important statistical issues.

Format Guidelines
The report should follow this format:
(i) Introduction
(ii) Analysis
(iii) Conclusion
It should be 1000 to 1200 words in length and presented in APA format.

Project Instructions
The project consists of 4 problems and a summary set of questions. For each problem, hints and theoretical background are provided.
Complete each section in a separate worksheet of the same workbook (Excel file). Name your Excel workbook as follows:
ALY6050-Module 1 Project Your Last Name First Initial.xlsx
In the following set of problems, r is the standard uniform random value (a continuous random value between 0 and 1).

Problem 1
Generate 1000 random values r.
For each r generated, calculate the random value X by: $X = -\mathrm{Ln}(r)$, where Ln is the natural logarithm function.
Investigate the probability distribution of X by doing the following:
1. Create a relative frequency histogram of X.
2. Select a probability distribution that, in your judgement, is the best fit for X.
3. Support your assertion above by creating a probability plot for X.
4. Support your assertion above by performing a Chi-squared test of best fit with a 0.05 level of significance.
5. In the Word document, describe your methodologies and conclusions.
6. In the Word document, explain what you have learned from this experiment.

Hints and Theoretical Background
A popular method for generating random values according to a certain probability distribution is the inverse transform method. In this method, the cumulative function of the distribution, F(x), is used to generate the random values. More specifically, a standard uniform random value r is generated first. Most software environments can generate such a value; in Excel and R, the functions =RAND() and runif() do so, respectively. After r has been created, it replaces F(x) in the expression of the cumulative function, and the resulting equation is solved for the variable x.
For example, suppose we wish to generate a random value according to the exponential distribution with a certain mean (say $\mu$). The cumulative function for the exponential distribution is:
$F(x) = 1 - e^{-\frac{1}{\mu}x}$
(The quantity $1/\mu$ in the above description is called the rate of the exponential random variable and is denoted by $\lambda$.)
Therefore, to generate a random value x that belongs to the exponential distribution with a mean of $\mu$, proceed as follows.
We first generate a standard uniform value r, then replace F(x) by r in the above expression, and solve the resulting equation for the variable x:
$r = 1 - e^{-\frac{1}{\mu}x} \;\Rightarrow\; e^{-\frac{1}{\mu}x} = 1 - r \;\Rightarrow\; -\frac{1}{\mu}x = \mathrm{Ln}(1 - r) \;\Rightarrow\; x = -\mu\,\mathrm{Ln}(1 - r)$
The formula above means that if R is a standard uniform random variable, then the random variable X obtained by the expression $X = -\mu\,\mathrm{Ln}(1 - R)$ will belong to the exponential distribution with an average equal to $\mu$. This formula can be simplified as:
$X = -\mu\,\mathrm{Ln}(R)$
(Note that if R is a standard uniform random variable, then $(1 - R)$ is also standard uniform.)
A special case of the above formula is when $\mu = 1$. This means that a random variable x generated by the formula $x = -\mathrm{Ln}(r)$ is an exponential random variable with an average of 1 (or, rate = 1).

Problem 2
Generate three sets of standard uniform random values, $r_1$, $r_2$, and $r_3$, each consisting of 10,000 values. Next, calculate the random value x according to the following formula: $x = -\mathrm{Ln}(r_1 r_2 r_3)$.
Investigate the probability distribution of X by doing the following:
1. Create a relative frequency histogram of X.
2. Select a probability distribution that, in your judgement, is the best fit for X.
3. Support your assertion above by creating a probability plot for X.
4. Support your assertion above by performing a Chi-squared test of best fit with a 0.05 level of significance.
5. In the Word document, describe your methodologies and conclusions.
6. In the Word document, explain what you have learned from this experiment.

Hints and Theoretical Background
This problem is related to a theorem in probability theory. The theorem states that:
If $X_1, X_2, \ldots, X_n$ are n identical and independent exponential random variables, each with a mean of $\mu$, then the random variable obtained by their sum, that is $X_1 + X_2 + \cdots + X_n$, will have a $\mathrm{Gamma}(n, \beta)$ probability distribution, where n is the shape parameter of the Gamma distribution and $\beta = \mu$ is its scale parameter.
From the Hints and Theoretical Background of Problem 1, we know that if R is a standard uniform random variable, then $X = -\mathrm{Ln}(R)$ is an exponential random variable with an average of 1.
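The inverse transform recalled above can be sketched in Python as follows. This is only an illustration, since the assignment itself asks for Excel; the function names `exponential_inverse_transform` and `sample_mean`, the seed, and the sample size are assumptions made for the example. The sketch keeps $1 - r$ inside the logarithm because Python's `random.random()` may return exactly 0 but never 1, so the logarithm is always defined.

```python
import math
import random


def exponential_inverse_transform(mean, rng):
    """Draw one exponential value with the given mean via the inverse transform.

    Solves r = 1 - exp(-x / mean) for x, where r is a standard uniform value.
    """
    r = rng.random()                   # standard uniform value in [0, 1)
    return -mean * math.log(1.0 - r)   # 1 - r lies in (0, 1], so log is defined


def sample_mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


rng = random.Random(6050)  # fixed seed (hypothetical) so the run is repeatable
xs = [exponential_inverse_transform(1.0, rng) for _ in range(10_000)]

# With mean = 1 this is the special case X = -Ln(R); the sample mean
# should land close to 1.
print(f"sample mean of X: {sample_mean(xs):.3f}")
```

Passing a different `mean` argument gives the general case $X = -\mu\,\mathrm{Ln}(1 - R)$.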
Therefore, if $R_1$, $R_2$, and $R_3$ are three independent standard uniform random variables, then $X_1 = -\mathrm{Ln}(R_1)$, $X_2 = -\mathrm{Ln}(R_2)$, and $X_3 = -\mathrm{Ln}(R_3)$ are three independent and identical (each with a mean of 1) exponential random variables. Thus, according to the theorem above, the random variable formed by their sum, that is $(-\mathrm{Ln}(R_1)) + (-\mathrm{Ln}(R_2)) + (-\mathrm{Ln}(R_3))$, will belong to the $\mathrm{Gamma}(3, 1)$ probability distribution.
However, algebraically,
$-\mathrm{Ln}(R_1) - \mathrm{Ln}(R_2) - \mathrm{Ln}(R_3) = -\left(\mathrm{Ln}(R_1) + \mathrm{Ln}(R_2) + \mathrm{Ln}(R_3)\right) = -\mathrm{Ln}(R_1 R_2 R_3).$
Therefore, if $R_1$, $R_2$, and $R_3$ are three independent standard uniform random variables between 0 and 1, then the random variable X formed by the formula $X = -\mathrm{Ln}(R_1 R_2 R_3)$ will belong to the $\mathrm{Gamma}(3, 1)$ probability distribution.

Problem 3
Generate a set of 1000 pairs of standard uniform random values $r_1$ and $r_2$. Then perform the following algorithm for each of these 1000 pairs. Let the output of this algorithm be denoted by Y.
Step 1: Generate random values $X_1 = -\mathrm{Ln}(r_1)$ and $X_2 = -\mathrm{Ln}(r_2)$.
Step 2: Calculate $k = \frac{(X_1 - 1)^2}{2}$. If $X_2 \ge k$, then generate a random number r. If $r > 0.5$, accept $X_1$ as Y (that is, let $Y = X_1$); otherwise, if $r \le 0.5$, accept $-X_1$ as Y (that is, let $Y = -X_1$).
If $X_2 < k$, no result is obtained, and the algorithm returns to step 1. This means that the algorithm skips the pair $r_1$ and $r_2$ for which $X_2 < k$ without generating any result and moves to the next pair $r_1$ and $r_2$.
After repeating the above algorithm 1000 times, a number N of the Y values will be generated. Obviously $N \le 1000$, since there will be instances when a pair $r_1$ and $r_2$ does not generate any result, and consequently that pair is wasted.
Investigate the probability distribution of Y by doing the following:
1. Create a relative frequency histogram of Y.
2. Select a probability distribution that, in your judgement, is the best fit for Y.
3. Support your assertion above by creating a probability plot for Y.
4. Support your assertion above by performing a Chi-squared test of best fit with a 0.05 level of significance.
5. In the Word document, describe your methodologies and conclusions.
6. In the Word document, explain what you have learned from this experiment.

Hints and Theoretical Background
Other than the inverse transform method, a second applied method for generating random values according to a particular probability distribution is the Rejection algorithm. The details of this algorithm are explained below:
Suppose we wish to generate random values x according to a certain probability distribution with $f(x)$ as its probability density function (pdf). Also suppose that the following two conditions are satisfied:
(i) we are able to generate random values y that belong to a probability distribution whose probability density function is $g(y)$,
(ii) there exists a positive constant C such that $\frac{f(y)}{g(y)} \le C$ for all y values (this means that the ratio $\frac{f(y)}{g(y)}$ is always bounded and does not grow indefinitely. This condition is almost always satisfied for any two probability density functions $f$ and $g$).
The rejection algorithm can now be implemented as follows:
Step 1: Generate a random value y that belongs to the probability distribution with $g(y)$ as its pdf, and generate a standard uniform random value r.
Step 2: Evaluate $k = \frac{f(y)}{C\,g(y)}$. If $r \le k$, then accept y as the random variable x (that is, let $x = y$); otherwise return to Step 1 and try another pair of $(y, r)$ values.
A few remarks about the Rejection algorithm are worth noting:
1. The probability that the generated y value will be accepted as x is $\frac{f(y)}{C\,g(y)}$. This is the reason why the algorithm uses a standard uniform value r and accepts y as x if $r \le \frac{f(y)}{C\,g(y)}$.
2. Each iteration of the algorithm will independently result in an accepted value with a probability equal to: $P\left(R \le \frac{f(Y)}{C\,g(Y)}\right) = \frac{1}{C}$.
Therefore, the number of iterations needed to generate one accepted y value follows a geometric probability distribution with mean C.

Relevancy of Problem 3 to the Rejection Algorithm
In Problem 3, the random variable y, selected from an exponential probability distribution with rate = 1 and a pdf of $g(y) = e^{-y}$, is used to first generate the absolute value of a standard normal random variable x ($|x|$ has the pdf $f(x) = \frac{2}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$ for $x \ge 0$), and then to assign a positive or negative sign to this value (through a standard uniform variable r) in order to obtain a standard normal random value. It can be shown algebraically that $\frac{f(y)}{g(y)} = \sqrt{\frac{2e}{\pi}}\; e^{-\frac{(y-1)^2}{2}}$ for all y values (note that $e^{-\frac{(y-1)^2}{2}} \le 1$ for all y values). Therefore, the constant C in the assumptions of the algorithm can be chosen to be: $C = \sqrt{\frac{2e}{\pi}} \approx 1.32$. Therefore, $\frac{f(y)}{C\,g(y)} = e^{-\frac{(y-1)^2}{2}}$. Hence the following algorithm can be used to generate the absolute value of a standard normal random variable:
Step 1: Generate random variables Y and R, with Y being exponential with rate = 1 and R being uniform on $(0, 1)$.
Step 2: If $R \le e^{-\frac{(Y-1)^2}{2}}$, then accept Y as the random variable X (that is, set $X = Y$); otherwise return to Step 1 and try another pair of $(Y, R)$ values.
Note that in step 2 of the above algorithm, the condition $R \le e^{-\frac{(Y-1)^2}{2}}$ is mathematically equivalent to: $-\mathrm{Ln}(R) \ge \frac{(Y-1)^2}{2}$. However, we have already seen in the Hints and Theoretical Background of the earlier problems that if R is standard uniform, then $-\mathrm{Ln}(R)$ is exponential with rate = 1. Therefore, the algorithm for generating the absolute value of the standard normal random variable can be modified as follows:
Step 1: Generate independent exponential random variables $Y_1$ and $Y_2$, each with rate = 1.
Step 2: Evaluate $k = \frac{(Y_1 - 1)^2}{2}$. If $Y_2 \ge k$, then accept $Y_1$ as the random variable X (that is, set $X = Y_1$); otherwise return to Step 1 and try another pair of $(Y_1, Y_2)$ values.
In fact, it is the above version of the Rejection algorithm that is being implemented in Problem 3.
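The two-exponential version of the Rejection algorithm described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the assignment's Excel workflow: the function name `abs_standard_normal`, the seed, and the number of draws are chosen for the example. The comparison value $E|Z| = \sqrt{2/\pi}$ is the known mean of the absolute value of a standard normal variable; it is not stated in the text above.

```python
import math
import random


def abs_standard_normal(rng):
    """Rejection sampling for |Z| (Z standard normal) from exponential proposals.

    Returns (value, iterations), where iterations counts the (Y1, Y2) pairs
    that were tried before one was accepted.
    """
    iterations = 0
    while True:
        iterations += 1
        y1 = -math.log(1.0 - rng.random())  # exponential with rate 1
        y2 = -math.log(1.0 - rng.random())  # exponential with rate 1
        k = (y1 - 1.0) ** 2 / 2.0
        if y2 >= k:                          # accept Y1 as X
            return y1, iterations
        # otherwise reject the pair and return to Step 1


rng = random.Random(2024)  # fixed seed (hypothetical) so the run is repeatable
draws = [abs_standard_normal(rng) for _ in range(5_000)]
mean_value = sum(v for v, _ in draws) / len(draws)
mean_iterations = sum(n for _, n in draws) / len(draws)

print(f"mean accepted value: {mean_value:.3f}, "
      f"E|Z| = {math.sqrt(2 / math.pi):.3f}")
print(f"mean iterations per accepted value: {mean_iterations:.3f}, "
      f"C = {math.sqrt(2 * math.e / math.pi):.3f}")
```

The second printed line relates to the earlier remark that the number of iterations per accepted value is geometric with mean C.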
However, in order to obtain a standard normal random value (instead of its absolute value), step 2 of the above algorithm has been modified as follows:
Step 2: Evaluate $k = \frac{(Y_1 - 1)^2}{2}$. If $Y_2 \ge k$, then generate a standard uniform variable R. If $R > 0.5$, set $Z = Y_1$; otherwise set $Z = -Y_1$. If $Y_2 < k$, return to step 1 and try another pair of $(Y_1, Y_2)$ values.
Note: The standard normal random value generated by the Rejection algorithm can be used to generate any normal random value with a mean $\mu$ and a standard deviation $\sigma$. Once a standard normal variable Z has been generated, it suffices to evaluate $\mu + \sigma Z$ to generate the desired normal variable.

Problem 4
In the algorithm of Problem 3 above, there are instances when the generated random values do not satisfy the condition $X_2 \ge k$ needed to obtain an acceptable value for Y. In such cases, the algorithm returns to step 1 and generates another two values to check for acceptance. Let M be the number of iterations needed to generate N of the accepted Y values ($M \ge N$). Let $W = \frac{M}{N}$.
(For example, suppose that the algorithm has produced 700 Y values ($N = 700$) after 1000 iterations ($M = 1000$). Then $W = \frac{1000}{700} \approx 1.43$. This means that it takes the algorithm 1.43 iterations to produce one output. In fact, W itself is a random variable. Theoretically, $E(W)$, the expected value (i.e., average) of W of an algorithm, is a measure of the efficiency of that algorithm.)
Investigate W by the following sequence of exploratory data analytic methods:
1. Estimate the expected value and the standard deviation of W.
2. Select a probability distribution that, in your judgement, is the best fit for W.
3. Support your assertion above by performing a Chi-squared test of best fit with a 0.05 level of significance.
4. As the number of iterations M becomes larger, the W values will approach a certain limiting value. Investigate this limiting value of W by completing the following table and plotting W versus M. What value do you propose for the limiting value that W approaches?
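The Problem 3 algorithm and the ratio $W = M/N$ can be explored with a short Python sketch like the one below. It is an illustration only, since the assignment is to be completed in Excel: the function name `run_problem3`, the seed, and the particular values of M are assumptions made for the example and are not taken from the assignment's table.

```python
import math
import random


def run_problem3(iterations, rng):
    """Run the Problem 3 algorithm for a fixed number of iterations M.

    Returns the list of accepted Y values; its length is N.
    """
    accepted = []
    for _ in range(iterations):
        x1 = -math.log(1.0 - rng.random())  # X1 = -Ln(r1), exponential, rate 1
        x2 = -math.log(1.0 - rng.random())  # X2 = -Ln(r2), exponential, rate 1
        k = (x1 - 1.0) ** 2 / 2.0
        if x2 >= k:
            r = rng.random()                # decides the sign of Y
            accepted.append(x1 if r > 0.5 else -x1)
        # if x2 < k the pair is wasted and the loop moves to the next pair
    return accepted


rng = random.Random(4)  # fixed seed (hypothetical) so the run is repeatable
for m in (10, 100, 1_000, 10_000, 100_000):  # illustrative values of M
    ys = run_problem3(m, rng)
    n = len(ys)
    w = m / n if n else float("nan")
    print(f"M = {m:>6}  N = {n:>6}  W = M/N = {w:.4f}")

# For comparison, the constant C from the hints to Problem 3.
print(f"C = sqrt(2e/pi) = {math.sqrt(2 * math.e / math.pi):.4f}")
```

Plotting the printed W values against M gives the kind of curve item 4 asks about; the accepted Y values from a single run can also be used for the histogram and tests of Problem 3.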