UNIVERSITY OF WARWICK
LEVEL 7 Open Book Assessment [2 hours]
Department of Computer Science
CS4022: High Performance Computing

Instructions

1. Read all instructions carefully and read through the entire paper at least once before you start writing.

2. There are four questions. You should attempt two questions from Section A and the one question in Section B. You should not submit answers to more than the required number of questions.

3. All questions will carry the same number of marks unless otherwise stated.

4. You should handwrite your answers either with paper and pen or using an electronic device with a stylus (unless you have special arrangements for exams which allow the use of a computer). Start each question on a new page and clearly mark each page with the page number, your student id and the question number.
   Handwritten notes must be scanned or photographed and all individual solutions should (if you possibly can) be collated into a single PDF with pages in the correct order.
   You must upload two files to the AEP: your PDF of solutions and a completed cover sheet.
   You must click FINISH ASSESSMENT to complete the submission process. After you have done so you will not be able to upload anything further.

5. Please ensure that all your handwritten answers are written legibly, preferably in dark blue or black ink. If you use a pencil ensure that it is not too faint to be captured by a scan or photograph.

6. Please check the legibility of your final submission before uploading. It is your responsibility to ensure that your work can be read.

7. You are allowed to access module materials, notes, resources, references and the internet during the assessment.

8. You should not try to communicate with any other candidate during the assessment period or seek assistance from anyone else in completing your answers.
The Computer Science Department expects the conduct of all students taking this assessment to conform to the stated requirements. Measures will be in operation to check for possible misconduct. These will include the use of similarity detection tools and the right to require live interviews with selected students following the assessment.

9. By starting this assessment you are declaring yourself fit to undertake it. You are expected to make a reasonable attempt at the assessment by answering the questions in the paper.

Please note that:
- You must have completed and uploaded your assessment before the 24 hour assessment window closes.
- You have an additional 45 minutes beyond the stated length of the paper to allow for downloading and uploading the assessment, your files and technical delays.
- For further details you should refer to the AEP documentation.

Use the AEP to seek advice immediately if during the assessment period:
- you cannot access the online assessment;
- you believe you have been given access to the wrong online assessment.

Please note that technical support is only available between 9AM and 5PM (BST). Invigilator support will also be available (via the AEP) between 9AM and 5PM (BST).

Notify Dcs.exams@warwick.ac.uk as soon as possible if you cannot complete your assessment because:
- you lose your internet connection;
- your device fails;
- you become unwell and are unable to continue;
- you are affected by circumstances beyond your control (e.g. fire alarm).

Please note that this is for notification purposes, it is not a help line.

Your assessment starts below.

Section A

1. This question is about fundamental knowledge.

(a) What do we mean by the granularity of parallelism? Give four types of parallelism in order of granularity and provide an application example for each. [7]

(b) Discuss how superthreading and hyperthreading reduce the waste of pipeline slots in the pipeline mechanism. [8]

(c) Discuss the differences between scientific applications such as matrix multiplication and graph-based applications such as online-shopping recommendation. Focus your discussions on data structure, performance metric and key factors that affect the performance. [12]

(d) Analyse the following two for loops in Listing 1. Describe whether the iterations of these two loops can be parallelised automatically by compilers and explain how you reached your conclusions. [8]

    Loop 1:
    for (i = 1; i <= n; i++) {
        a[i] = b[i] + c[i];
        d[i] = a[i];
    }

    Loop 2:
    for (i = 2; i <= n; i++)
        a[i] = b[i] + a[i-1];

    Listing 1: Two loops for Question 1(d)

2. This question is about parallel programming models.

(a) The Synchronous mode is a communication mode in MPI. Explain why the Synchronous mode may incur higher communication overhead than the Standard mode. [7]

(b) Assume there are two MPI processes running on different machines: P0 and P1. In P0, MPI_Send is first called to send message A to P1 and then MPI_Recv is called to receive message B from P1. In P1, MPI_Send is first called to send message B to P0 and then MPI_Recv is called to receive message A from P0. What will happen if the sizes of both message A and B exceed the system buffers managed by MPI? Explain why. [8]

(c) A collective communication operation is performed by all relevant processes at the same time with the same set of parameters. However, the parameters may have different meanings to different processes. Describe, using illustrative examples if necessary, the operations of the following two MPI collective communication calls. Further, discuss what the parameters in these functions mean to different processes.

    i) MPI_Bcast(void *buf, int count, MPI_Datatype type, int root, MPI_Comm comm) [6]

    ii) MPI_Gather(void *sendbuf, int sendcnt, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm) [6]

(d) MPI_Type_create_indexed_block can be used to construct the user's own data types.
The format of the function is as follows:

    MPI_Type_create_indexed_block (int count,
                                   int blocklengths,
                                   int *array_of_displacements,
                                   MPI_Datatype oldtype,
                                   MPI_Datatype *newtype)

Let oldtype = {(MPI_INT, 0), (MPI_CHAR, 2)} with the extent of 3 bytes.
Let D = (2, 5, 10).
Give the memory layout of newtype after calling MPI_Type_create_indexed_block (3, 2, D, oldtype, newtype). [8]

3. This question is about high performance computing systems.

(a) Discuss the differences between multicore CPU and GPU in terms of architecture design and performance objective. [7]

(b) The topology of node interconnection plays an important role in the performance of a cluster system. Draw the topology of a 4-D hypercube. What are the values of node degree and bisection width of the topology? Discuss which aspect of network performance node degree and bisection width represent. [8]

(c) Discuss the difference between Cluster systems and Grid systems. [8]

(d) There are three potential methods to implement parallel I/O: 1) One process performs I/O operations for all other processes; 2) Each process reads or writes the data from or to a separate file; 3) Different processes access different parts of a common file. Discuss the advantages and disadvantages of each method. Which method of parallel I/O is most widely used nowadays? [12]

Section B

4. This question is about performance modelling.

(a) Consider a 3-D grid of equal-sized cells. Assume that the volume of the grid is V and the grid is a cube (i.e., the length of the grid in each dimension is V^(1/3)). Assume V = cn, where c is the number of cells allocated to each processor and n is the number of processors. Derive the surface-to-volume ratios under 1-D, 2-D and 3-D decomposition. Further, analyse under what circumstances 2-D decomposition is better than 1-D decomposition. [12]

(b) Discuss the drawbacks of using asymptotic analysis to evaluate the performance of an algorithm. Give an example for each drawback you list. [8]

(c) Modelling the execution time of an application is a good way of evaluating the performance of the application. Discuss how to model the execution time of an application. The discussion should cover the modelling of both computation time and communication time, and the discussion should revolve around the various parameters used to model the execution time. [10]