KIT308/408 Multicore Architecture and Programming
Assignment 1: Multithreading

Aims of the assignment

The purpose of this assignment is to give you experience at writing a multithreaded program for the CPU. This assignment will give you an opportunity to demonstrate your understanding of:
- creating and handling multiple threads;
- partitioning work;
- dynamic scheduling; and
- data-sharing between threads.

Due Date

11:55pm Friday 21st of August (Week 6 of semester)

Late assignments will only be accepted in exceptional circumstances and provided that the proper procedures have been followed (see the School Office or this link for details). Assignments which are submitted late without good reason will be subject to mark penalties if they are accepted at all (see the School office or this link for details on this as well).

Forms to request extensions of time to submit assignments are available from the School of Engineering and ICT office. Requests must be accompanied by suitable documentation and should be submitted before the assignment due date.

Due to the tight schedule of this unit (and the reliance of future assignments on a solution to this one), extensions cannot be granted and late assignments will not be accepted. If exceptional circumstances occur for an individual student, they must contact the lecturer-in-charge before the assignment due date to arrange alternative methods of assessment in this unit.

Assignment Submission

Your assignment is to be submitted electronically via MyLO and should contain:
- An assignment cover sheet;
- A .zip (or .rar) containing a Visual Studio Solution containing a project for each attempted stage of the assignment (in the format provided in the downloadable materials provided below).
- A document containing:
  - A table of timing information comparing the original single-threaded times against each of three stages on each scene file.
  - An analysis of the above timing data.

You do not need to (and shouldn't) submit executables, temporary object files, or images. In particular, you must delete the .vs directory before submission, as it just contains Visual Studio temporary files and is 100s of MBs. Do not, however, delete the Scenes folder or the Outputs folder (but do delete the images within this one).

Marking

This assignment will be marked out of 100. The following is the breakdown of marks:

Task/Topic and Marks

1. Basic Multithreaded CPU Implementation: 30%
- Correct (5%) and elegant (5%) thread management (i.e. the correct image for any number of threads (even 64 threads)): 10%
- Correct use of -thread command-line argument: 5%
- Successful use of -testMode and -colourise command-line arguments: 5%
- Correct partitioning of render for images with height equally divisible by thread count (i.e. the correct image, with exactly equal work area for each thread, and with the correct number of samples rendered for: Tests 1, 2, 4, 5, and 7 with 8 threads): 5%
- Correct partitioning of render for images with height NOT equally divisible by thread count (i.e. the correct image, with work area allocated evenly across threads, and with the correct number of samples rendered for: Tests 3 and 6 with 8 threads, and also Test 1 with 3, 5, 6, 7, and 513 threads): 5%

2. Dynamically Balanced Line-Based Multithreaded CPU Implementation: 20%
- Correct (5%) and elegant (5%) thread management: 10%
- Correct (5%) and elegant (5%) data-sharing between threads (i.e. no possibility of threads fighting over shared memory locations): 10%

3. Dynamically Balanced Square-Based Multithreaded CPU Implementation: 30%
- Correct and elegant thread management: 5%
- Correct and elegant data-sharing between threads: 5%
- Correct use of -blockSize command-line argument: 5%
- Correct partitioning of render into squares for images with size that is equally divisible by block size (i.e. the correct image, with exactly equal work area for each block, rendered for: Tests 1, 2, 5, and 7 with block size 8 and 64): 5%
- Correct partitioning of render into squares for images with size that is NOT equally divisible by block size (i.e. the correct image, with work area for each block either at block size or smaller at edges, rendered for: Tests 3, 4, and 6 with block size 8 and 64): 5%
- Correct partitioning of render with no extra samples generated (i.e. the correct image, with work area for each block either at block size or smaller at edges, and with the correct number of samples rendered for: Tests 3, 4, and 6 with block size 8 and 64): 5%

Documentation: 20%
- Outputs showing timing information for each stage on all applicable scene files with all thread counts (to get full marks for this part your code needs to pass all tests for all stages above): 10%
- Analysis of data with respect to CPU used (to get full marks for this part your code needs to pass all tests for all stages above): 10%

Penalties
- Failure to comply with submission instructions (e.g. no cover sheet, incorrect submission of files, submission of unnecessary temporary or image files, abnormal solution/project structure, etc.): -10%
- Poor programming style (e.g. insufficient / poor comments, poor variable names, poor indenting, obfuscated code without documentation, compiler warnings, etc.): up to -20%
- Lateness (-20% for up to 24 hours, -50% for up to 7 days, -100% after 7 days); late assignments will not be accepted: up to -100%

Programming Style

This assignment is not focussed on programming style, but you should endeavour to follow good programming practices.
You should, for example:
- comment your code;
- use sensible variable names;
- use correct and consistent indenting; and
- internally document (with comments) any notable design decisions.

[NOTE: any examples in the provided assignment materials that don't live up to the above criteria should be considered to be deliberate examples of what not to do and are provided to aid your learning ;P]

The Assignment Task

You are to implement a simple raytracer in a multithreaded fashion (for a quick definition of raytracing see the Wikipedia page). From the provided single-threaded raytracer implementation, you will create multiple subsequent versions as follows:
1. A basic multithreaded CPU implementation.
2. A dynamically balanced line-based multithreaded CPU implementation.
3. A dynamically balanced square-based multithreaded CPU implementation.

Implementation

1. Basic Multithreaded CPU Implementation

This stage involves splitting the render across multiple threads by dividing it into equal-sized chunks (or almost equal-sized chunks when the number of threads doesn't divide evenly into the image height), with each chunk being rendered entirely by an individual thread.

In order to complete this step you will need to:
- Write code to partition the rendering job into chunks (so that each thread generates a different part of the final image).
- Write code to create and manage multiple threads.

At the end of this stage the program should be able to handle any image size (any dimensions that are multiples of four, at least, up to the maximum of 2048×2048).

To be eligible for full marks, your assignment should also use the command-line option -threads to set the number of threads to any value that is less than or equal to the height of the image being rendered, and also respect the -colourise option to tint each section of the image with a set of colours (colour re-use is expected for a large number of threads).

2. Dynamically Balanced Line-Based Multithreaded CPU Implementation

This stage involves splitting up the render into lines which are dynamically allocated to threads from the thread pool as they request more work.

In order to complete this step you will need to:
- Start as many threads as required (specified from the command-line arguments) for the thread pool.
- Manage some shared data between the threads so that each does the correct work.

3. Dynamically Balanced Square-Based Multithreaded CPU Implementation

This stage involves splitting up the render into squares which are dynamically allocated to threads from the thread pool as they request more work.

In order to complete this step you will need to:
- Partition the render into squares (this requires more care when writing results to the image).

NOTE: this implementation is (slightly) harder than Stage 2, but might not result in an increase in performance.

Hints / Tips

The techniques required to complete each stage rely heavily on work done in tutorials 2 and 3; refer to them often.

Documentation

For each stage of the assignment you attempt you should provide:
- average (of 3 runs) timing information for each scene file for the total time taken for a render for particular thread counts. See the next section for an example format for this timing table that specifies which tests are required for which scenes; and
- an explanation of the results (e.g. why there's no difference between the performance of stages 1, 2, and 3 (NOTE: this is a made up example and isn't necessarily what to expect), or why a particular type of implementation works well (or poorly) on a particular scene, etc.). This explanation should be with respect to the CPU on the system on which you ran the tests, and you should discuss how the architectural features of the CPU explain the results.

Tests / Timing

The following table lists all the tests that your code needs to generate correctly at each stage.
It also shows the timing tests that need to be performed in order to fully complete the documentation section of the assignment.

In order to confirm your images match the images created by the base version of the assignment code, it's strongly recommended you use an image comparison tool. For part of the marking for this, Image Magick will be used:
- Image Magick: download for Windows/Mac/Linux.

e.g. running this from the command-line to test:

    magick.exe compare -metric mae Outputs\cornell.txt_1024x1024x1_RayTracerAss1.exe.bmp Outputs_REFERENCE\cornell.txt_1024x1024x4_RayTracerAss1.exe.bmp diff.bmp

This produces an image (diff.bmp) showing the differences between the two source images, and also a numeric summary of the difference (here these images are 0.13266% different):

    0.33827 (0.0013266)

(this example deliberately compares the 1×1 sampled to the 4×4 sampled image to show there is a difference).

Note: to fully complete the empty cells in the table below, around 180 total images will need to be calculated (each test needs 3 runs, and the first six tests need to be run with the base code, and with 8 threads for each stage of the assignment, and test 7 needs to be run for 1 to 8 threads for stage 1, stage 2, and stage 3 with two different block sizes), so plan your time accordingly.

Average Time (Milliseconds)

Test / Technique (Stage)                        Single-    Multithreaded, by number of threads
                                                Threaded   1  2  3  4  5  6  7  8

1. -input Scenes/cornell.txt -size 1024 1024 -samples 1
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

2. -input Scenes/cornell.txt -size 1024 1024 -samples 4
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

3. -input Scenes/cornell.txt -size 500 300 -samples 1
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

4. -input Scenes/allmaterials.txt -size 1000 1000 -samples 1
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

5. -input Scenes/cornell-199lights.txt -size 256 256 -samples 1
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

6. -input Scenes/5000spheres.txt -size 480 270 -samples 1
   Static Chunks (Stage 1)                                 X  X  X  X  X  X  X
   Dynamic Lines (Stage 2)                                 X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 8                  X  X  X  X  X  X  X
   Dynamic Squares (Stage 3), blocksize 64                 X  X  X  X  X  X  X

7. -input Scenes/dudes.txt -size 256 256 -samples 1
   Static Chunks (Stage 1)
   Dynamic Lines (Stage 2)
   Dynamic Squares (Stage 3), blocksize 8
   Dynamic Squares (Stage 3), blocksize 64

Provided Materials

The materials provided with this assignment contain:
- the source code of the base single-threaded version of the raytracer;
- a set of scene files to be supplied to the program;
- a set of reference output files created from the single-threaded version of the program;
- four batch files that will run all 7 timing tests for the base code, and each of the three stages; and
- a further set of five batch files that will run comparison tests (assuming Image Magick is installed in its default location) for each of the three stages.

Download the materials as a ZIP file.

Source Code

The provided code consists of 19 source files.

Raytracing logic:
- Raytrace.cpp: this file contains the main function, which reads the supplied scene file, begins the raytracing, and writes the output BMP file.
  The main render loop, ray trace function, and handling of reflection and refraction are also in this file.
- Intersection.h and Intersection.cpp: these files define a data structure for capturing relevant information at the point of intersection between a ray and a scene object, and functions for testing for individual ray-object collisions and ray-scene collisions.
- Lighting.h and Lighting.cpp: these files provide functions to apply a lighting calculation at a single intersection point.
- Texturing.h and Texturing.cpp: these files provide functions for reading points from 3D procedural textures.
- Constants.h: this header provides constant definitions used in the raytracing.

Basic types:
- Primitives.h: this header contains definitions for points, vectors, and rays. It also provides functions and overloaded operators for performing calculations with vectors and points.
- SceneObjects.h: this header file provides definitions for scene objects (i.e. materials, lights, spheres, and boxes).
- Colour.h: this header defines a data structure for representing colours (with each colour component represented as a float) and simple operations on colours, including conversions to/from the standard BGR pixel format.

Scene definition and I/O:
- Scene.h and Scene.cpp: the header file contains the data structure to represent a scene and a single function that initialises this data structure from a file. The scene data structure itself consists of properties of the scene and lists of the various scene objects as described above. The implementation file contains many functions to aid in the scene loading process.
  Scene loading relies upon the functionality provided by the Config class.
- Config.h and Config.cpp: this class provides facilities for parsing the scene file.
- SimpleString.h: this is a helper string class used by the Config class.

Image I/O:
- ImageIO.h and ImageIO.cpp: these files contain the definitions of functions to read and write BMP files.

Miscellaneous:
- Timer.h: this class provides a simple timer that makes use of different system functions depending on whether TARGET_WINDOWS, TARGET_PPU, or TARGET_SPU is defined (we don't use the latter two, but I left this file unchanged in case anyone wanted to see how such cross-platform stuff can be handled).

Executing

The program has the following functionality:
- By default it will attempt to load the scene Scenes/cornell.txt and render it at 1024×1024 with 1×1 samples.
- By default it will output a file named Outputs/[scenefile-name]_[width]x[height]x[sample-level]_[executable-filename].bmp (e.g. with all the default options, Outputs/cornell.txt_1024x1024x1_RayTracerAss1.exe.bmp).
- It takes command line arguments that allow the user to specify the width and height, the anti-aliasing level (must be a power of two), the name of the source scene file, the name of the destination BMP file, and the number of times to perform the render (to improve the timing information).
- Additionally it accepts some arguments (that are currently unused) for setting the number of threads, whether each thread will tint the area that it renders, and the size of the block to render (only applicable for stage 3).
- Further, it accepts an argument -testMode that will fill the output with white.
  When executing with multiple threads this should be a greyscale colour that represents the thread number.
- It loads the specified scene.
- It renders the scene (as many times as requested).
- It produces timing information for the average time taken to produce a render ignoring all file IO, and the number of samples generated per run.
- It outputs the rendered scene as a BMP file.

For example, running the program at the command line with no arguments would perform the first test (as described in the scene file section). On execution this would produce output similar to the following (as well as writing the resultant BMP file to Outputs/cornell.txt_1024x1024x1_RayTracerAss1.exe.bmp):

    average time taken (1 run(s)): 3672ms

Testing Batch Files

A number of batch files are provided that are intended to be executed from the command line, e.g.

For timing:
- Stage1Timing.bat 3 8 will perform all the timing tests required for Stage 1 (i.e. 3 runs with 8 threads on each test scene).
- Stage2Timing.bat 3 8 will perform all the timing tests required for Stage 2 (i.e. 3 runs with 8 threads on each test scene).
- Stage3Timing.bat 3 8 8 will perform all the timing tests required for Stage 3 at a particular block size (i.e. 3 runs with 8 threads and block size 8 on each test scene).

For testing (requires Image Magick installation), e.g.:
- Stage1Tests1.bat will perform all the comparisons required for Stage 1 tests where the thread count evenly divides into the image height.
- Stage1Tests2.bat will perform all the comparisons required for Stage 1 tests where the thread count doesn't evenly divide into the image height.

Ray Tracing

The materials provided with this assignment include an implementation of a simple raytracer based on the raytracing tutorial at codermind.com (links currently broken, but still available via the wayback machine here).
This tutorial also provides a pretty good introduction to the concepts of ray tracing.

Apart from a general structure rewrite and simplification of the code, the supplied implementation has the following similarities and differences to the codermind tutorial:
- it retains support for:
  - spheres;
  - diffuse (Lambert) and specular (Blinn-Phong) lighting;
  - shadows;
  - super-sampled anti-aliasing;
  - simple exposure;
  - a conic perspective camera model; and
  - reflection and refraction.
- it is extended, in that it provides additional support for:
  - (axis-aligned) boxes;
  - (simple) procedural 3D textures;
  - setting the camera's position and (xz) rotation;
  - reading and writing BMP files.
- it is simplified, in that it doesn't provide support for:
  - blobs;
  - gamma calculations;
  - perlin noise based procedural textures;
  - bump mapping;
  - auto exposure estimation;
  - bitmap texturing;
  - cubemap texturing;
  - an environment cubemap;
  - depth of field; and
  - support for reflection and refraction in the same material.

For those of you who want more information (as it's not entirely necessary to understand these concepts to be able to complete the assignment) on some of the general and/or mathematical aspects of raytracing, see:

General explanations:
- Codermind Tutorial (archived mirror): A step-by-step guide and the original source of the codebase of the assignment materials.
- Project Amiga Juggler: A step-by-step guide to the maths and implementation of a raytracer (handling only spheres and a ground plane) in Java:
  - Step 1: vectors, rays, dot-product, and cross-product.
  - Step 5: camera model.
  - Step 8: ray-sphere intersections and Phong shading.
  - Step 9: ray-plane intersections.
  - Step 13: reflection.
- Wikipedia's ray tracing page: a basic outline of the concepts.
- Princeton Ray Casting Lectures: concepts, maths, and some pseudo code.
- Princeton Illumination lectures: concepts, maths, and some pseudo code.
- Softsurfer Algorithms: geometry algorithms.

Intersections:
- Ray-sphere intersections: wikipedia, codermind (part 1), Project Amiga Juggler (step 8), Princeton Ray Casting Lectures (Slides 11-14).
- Ray-AABB (axis-aligned bounding box) intersections: medium.
- Ray-plane intersections: wikipedia, Princeton Ray Casting Lectures (Slide 17), Softsurfer.com Algorithms.
- Ray-triangle intersections: wikipedia, source paper, lighthouse3d, one more explanation.
- Ray-something intersections: Real-time Rendering.

Lighting:
- Diffuse lighting: codermind (part 1), Princeton Illumination Lectures (Slides 19-21).
- Specular lighting: codermind (part 2), Princeton Illumination Lectures (Slides 23-25).
- Shadowing: Project Amiga Juggler (step 8).

Perlin Noise:
- 2D: wikipedia.
- 3D: Understanding Perlin Noise.

One of the scenes uses voxel characters created by miklovesrobots originally in the .VOX format:
- Voxel character / objects: Mini Mike's Metro Minis.
- Magica Voxel editor and resources: MagicaVoxel.
- Voxel file format: voxel-model.
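
As a rough illustration of the Stage 1 requirement that the image height be split into equal (or almost equal) chunks, the following sketch shows one way the row ranges might be computed. The names RowRange and partitionRows are hypothetical and do not come from the provided code; how each range is then handed to a thread is left open, since it depends on the threading approach used in the tutorials.

```cpp
#include <cassert>
#include <vector>

// Half-open range of image rows [begin, end) assigned to one thread.
// (Hypothetical helper type, not part of the provided raytracer.)
struct RowRange {
    int begin;
    int end;
};

// Split `height` rows into `threadCount` contiguous chunks whose sizes
// differ by at most one row. The first (height % threadCount) chunks
// each receive one extra row, so no rows are skipped or rendered twice.
std::vector<RowRange> partitionRows(int height, int threadCount) {
    assert(height >= 0 && threadCount > 0);

    std::vector<RowRange> ranges;
    ranges.reserve(threadCount);

    const int base = height / threadCount;   // rows every chunk gets
    const int extra = height % threadCount;  // leftover rows to spread out

    int next = 0;
    for (int i = 0; i < threadCount; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        ranges.push_back({next, next + rows});
        next += rows;
    }
    return ranges;
}
```

For Test 3 (height 300) with 8 threads, this yields four chunks of 38 rows followed by four chunks of 37 rows. Because each chunk is a disjoint range of rows, threads rendering different chunks write to different parts of the output image.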

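The dynamic scheduling described for Stages 2 and 3 could be sketched along the following lines, assuming std::thread and std::atomic are acceptable (the tutorials may use a different threading API). Block, blockAt, blockCount, and renderDynamic are hypothetical names, and incrementing a per-pixel counter stands in for tracing a ray. Each worker claims the next unclaimed block index from a shared atomic counter, and blocks on the right and bottom edges are clamped so that no extra samples are generated.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// A tile of the image: columns [x0, x1) and rows [y0, y1).
struct Block { int x0, y0, x1, y1; };

// Number of tiles needed to cover the image, rounding up at the edges.
int blockCount(int width, int height, int blockSize) {
    assert(blockSize > 0);
    const int across = (width + blockSize - 1) / blockSize;
    const int down = (height + blockSize - 1) / blockSize;
    return across * down;
}

// Map a block index to its tile, clamping edge tiles to the image bounds.
Block blockAt(int index, int width, int height, int blockSize) {
    assert(blockSize > 0);
    const int across = (width + blockSize - 1) / blockSize;
    const int bx = (index % across) * blockSize;
    const int by = (index / across) * blockSize;
    return {bx, by, std::min(bx + blockSize, width),
            std::min(by + blockSize, height)};
}

// Render with dynamically scheduled blocks; returns samples per pixel.
std::vector<int> renderDynamic(int width, int height, int blockSize,
                               int threadCount) {
    std::vector<int> samples(static_cast<std::size_t>(width) * height, 0);
    std::atomic<int> nextBlock{0};  // the only data shared for scheduling
    const int total = blockCount(width, height, blockSize);

    auto worker = [&]() {
        for (;;) {
            // fetch_add hands each index to exactly one thread.
            const int index = nextBlock.fetch_add(1);
            if (index >= total) break;
            const Block b = blockAt(index, width, height, blockSize);
            for (int y = b.y0; y < b.y1; ++y)
                for (int x = b.x0; x < b.x1; ++x)
                    // Stand-in for tracing a ray through pixel (x, y).
                    ++samples[static_cast<std::size_t>(y) * width + x];
        }
    };

    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return samples;
}
```

For a 500×300 image with block size 64 this gives 8 × 5 = 40 blocks, the last of which covers columns 448 to 499 and rows 256 to 299. The line-based Stage 2 follows the same pattern with the counter handing out row indices instead of block indices.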