
Project CS3103 Operating Systems Version 2.1 (20201021)

CS3103 – Operating Systems
Project: Parallel Zip

Due Date: Tuesday, November 24, 2020 at 8 PM HKT. Late submissions will be penalized as per syllabus.

I. Project Instructions

Overview

In the earlier programming assignment, you implemented a simple compression tool based on run-length encoding, known simply as czip. For this project, you'll implement something similar, except you'll use threads to make a parallel version of czip. We'll call this version pzip.

There are three specific objectives to this project:
- To familiarize yourself with the Linux pthreads library for writing multi-threaded programs.
- To learn how to parallelize a program.
- To learn how to program for performance.

Project Organization

Each group should complete the following pieces of the project. Each piece is explained below:
- Design: 30 points
- Implementation: 30 points
- Evaluation: 40 points

You're required to submit a project report which concisely describes your design, implementation, and experimental results.

Design

The design space for building an effective compression tool is large. A practical zip that achieves better performance will require you to address (at least) the following issues (see Part II, Project Description, for more details):
- How to parallelize the compression.
- How to determine the number of threads to create.
- How to efficiently compress multiple files in parallel.
- How to access the input file efficiently.

In your project report, please describe the main techniques or mechanisms proposed to parallelize the compression. List and describe all the functions used in this project.

Implementation

Your code should be nicely formatted with plenty of comments. Each function should be preceded by a header comment that describes what the function does. The code should be easy to read, properly indented, employ good naming standards, have good structure, and correctly implement the design.
Your code should match your design.

Evaluation

We provide 10 test cases for you to test your code. Before submitting your work, please run your pzip on the test cases and check whether it zips the input files correctly. A time limit is set for each test case, and if this limit is exceeded, the test will fail. For each test case, your grade will be calculated as follows:
- Correctness. Your code will first be measured for correctness, ensuring that it zips input files correctly. You will receive full points if your solution passes the correctness tests performed by the test script. You will receive zero points for this test case if your code is buggy.
- Performance. If you pass the correctness tests, your code will be tested for performance; the test script will record the running time of your program for performance evaluation. Shorter time/higher performance will lead to better scores.

In your project report, summarize and analyze the results. You can also compare your solution with the provided baseline implementation.

Tips: Keep a log of the work you have done. You may wish to list optimizations you tried, what failed, etc. Keeping a good log will make it easy to put together your final writeup.

Bonus

You're encouraged to be creative and innovative, and this project awards bonus points (up to 10 points) for additional and/or exemplary work.
- New ideas/designs are welcome to fully explore the parallelism of the process of zipping files. Additional test cases will be used to test your solution for performance evaluation.
- To encourage healthy competition and desire to improve, we'll provide a scoreboard that shows scores for each group. The latest scores are displayed, rank ordered, on the scoreboard. We will award bonus points to the top 10 groups. Further details will be posted on Canvas soon.

Language/Platform

The project should be written in ANSI standard C.

This project can be done on Linux (recommended), MacOS, or Windows using Cygwin.
Since grading of this project will be done on the gateway Linux server, students who choose to develop their code on any other machine are strongly encouraged to run their programs on the gateway Linux server before turning them in. There will be no points for programs that do not compile and run on the gateway Linux server, even if they run somewhere else.

Handing In

The project can be done individually, in pairs, or in groups, where each group can have a maximum of three members. All students are required to join one project group in Canvas: People section > Groups tab > Project Group > Group 01 to Group 40 or Individual 41 to Individual 50. Contact the TA to add additional groups if necessary. Self sign-up is enabled for these groups. Instead of all students having to submit a solution to the project, Canvas allows one person from each group to submit on behalf of their team. If you work with partner(s), both you and your partner(s) will receive the same grade for the entire project unless you explicitly specify each team member's contribution in your report. Please be sure to indicate who is in your group when submitting the project report.

Before you hand in, make sure to add the requested identifying information about your project group, which contains the project group number, full name, and e-mail address for each group member.

When you're ready to hand in your solution, go to the course site in Canvas, choose the Assignments section > Project group > Project item > Submit Assignment button and upload your files, including the following:
1) A PDF document which concisely describes your design, implementation, and experimental results. If you are working in a team, please also describe each team member's contribution.
2) The source file, i.e., pzip.c.

Academic Honesty

All work must be developed by each group separately. Please write your own code. All submitted source code will be scanned by anti-plagiarism software.
If the code does not work, please state this clearly in the report.

Questions?

If you have questions, please first post them on Canvas so others can get the benefit of the TA's answer. Avoid posting code that will give away your solution or allow others to cheat. If this does not resolve your issue, contact the TA (Mr. LI Jinheng, jinheng.li@my.cityu.edu.hk).

Acknowledgements

This project is taken and modified from the OSTEP book written by Remzi and Andrea Arpaci-Dusseau at the University of Wisconsin. This free OS book is available at https://www.ostep.org. Automated testing scripts are from Kyle C. Hale at the Illinois Institute of Technology.

Disclaimer

The information in this document is subject to change with notice via Canvas. Be sure to download the latest version from Canvas.

II. Project Description

For this project, you will implement a parallel version of zip using threads. First, recall how zip works by reading the description in Assignment 1 Part II. You'll use the same basic specification, with run-length encoding (RLE) as the basic technique.

RLE is quite simple: when you encounter n characters of the same type in a row, the compression tool (pzip) will turn that into the number n and a single instance of the character.

Thus, if we had a file with the following contents:

aaaaaaaaaabbbb

The tool would turn it (logically) into:

10a4b

However, the exact format of the compressed file is quite important; here, you will write out a 4-byte integer in binary format followed by the single character in ASCII. Thus, a compressed file will consist of some number of 5-byte entries, each of which comprises a 4-byte integer (the run length) and the single character. To write out an integer in binary format (not ASCII), you should use fwrite(). Read the man page for more details. For pzip, all output should be written to standard output (the stdout file stream).

Your pzip will externally look the same as czip.
Internally, however, the program will use POSIX threads to parallelize the compression process. The general usage from the command line will be as follows:

$ ./pzip file.txt > file.z

Doing so effectively and with high performance will require you to address (at least) the following issues:
- How to parallelize the compression. The central challenge of this project is to parallelize the compression process. You are required to think about whether the compression process can be separated into several sub-processes, which sub-processes can be done in parallel, and which sub-processes must be done serially by a single thread. Then, you are required to design your parallel zip as appropriate. For example, is it possible to zip several small sub-files using multiple threads instead of zipping a large file using only one thread? If it is possible, how should the large file be divided? How can those small sub-files be zipped using multiple threads? How can the zipped results of several small sub-files be merged? One interesting issue that the best implementations will handle is this: what happens if one thread runs much slower than another? Does the compression give more work to faster threads? This issue is often referred to as the straggler problem.
- How to determine the number of threads to create. On Linux, the number of threads can be determined with interfaces like get_nprocs() and get_nprocs_conf(); you are encouraged to read the man pages for more details. Then, you are required to create an appropriate number of threads to match the number of CPUs available on whichever system your program is running.
- How to efficiently compress multiple files in parallel. In addressing the previous issues, you may have completed the parallel compression of one large file. Now you are required to think about how to parallelize the compression of multiple files. A naive way is to run the parallel compression of each file sequentially, one file after another.
However, this method cannot fully exploit the parallelism available across multiple files. You are required to explore the parallelism between the compression processes of multiple files and design an efficient and fast parallel method to compress multiple files. Note that when the input contains directories, you can first obtain the paths of all files in the directories recursively using readdir(), then compress them as multiple files.
- How to access the input file efficiently. On Linux, there are many ways to read from a file, including C standard library calls like fread() and raw system calls like read(). One particularly efficient way is to use memory-mapped files, available via mmap(). By mapping the input file into the address space, you can then access bytes of the input file via pointers and do so quite efficiently.

To understand how to tackle these problems, you should first understand the basics of thread creation, and perhaps locking and signaling via mutex locks and condition variables. Review the tutorials and read the following chapters from the OSTEP book carefully in order to prepare yourself for this project:
- Intro to Threads
- Threads API
- Locks
- Using Locks
- Condition Variables

III. Project Guidelines

1. Getting Started

The project is to be done on the CSLab SSH gateway server, to which you should already be able to log in. As before, follow the same copy procedure as you did in the previous tutorials to get the project files (code and test files). They are available in /public/cs3103/project/ on the gateway server.
project.zip contains the following files/directories:

/project
├── Makefile
├── pzip – A sample solution (executable file).
├── pzip.c – Modify and hand in pzip.c file.
├── README.txt
└── tests
    ├── bin
    │   ├── generator.py
    │   ├── generic-tester.py
    │   ├── serialized_runner.py
    │   └── test-pzip.csh
    ├── config
    │   ├── 1.json
    │   ├── …
    │   └── 10.json
    ├── stdout
    │   ├── 1.out
    │   ├── 1.rc
    │   ├── …
    │   └── 10.rc
    └── tests-pzip
        ├── 1
        │   └── 1-0.in
        ├── 2
        │   ├── 2-0.in
        │   ├── 2-1.in
        │   ├── …
        │   └── 2-11.in
        ├── …
        └── 10
            ├── 10-0
            │   ├── 10-0-0.in
            │   ├── 10-0-1.in
            │   └── 10-0-2.in
            ├── 10-1
            │   ├── 10-1-0.in
            │   ├── 10-1-1.in
            │   └── 10-1-2.in
            ├── 10-2.in
            ├── …
            └── 10-7.in

Start by copying the provided files to a directory in which you plan to do your work. For example, copy /public/cs3103/project/project.zip to your home directory and extract the files from the ZIP file with the unzip command. Note that the file size of project.zip is only 10 MB, but the uncompressed project directory has a size of 5 GB. It takes about 90 seconds to unzip the project.zip file on our gateway server. After the unzip command extracts all files to the current directory, change to the project directory and take a look at the directory contents:

$ cd ~
$ cp /public/cs3103/project/project.zip .
$ unzip project.zip
$ cd project
$ ls

A sample pzip is also provided (we only provide a single executable file without source code). This pzip uses pthread to support the parallel compression of multiple files or directories. When the input files of the compression process contain directories, the pzip first obtains the paths of all files in the directories recursively using readdir(), then treats the compression process as the compression of multiple files. For the compression of multiple files, the pzip uses mmap to map files into multiple pages and compresses all pages in parallel. The parallel compression process can be treated as a producer-consumer problem.
The pzip uses one producer thread to map files and multiple consumer threads to compress pages.

You can run and test pzip by using the make run command.

$ make run
TEST 1 – single file test, a small file of 10 MB (2 sec timeout)
Test finished in 0.045 seconds
RESULT passed
(content removed for brevity)
TEST 10 – mixed test 3, two directories that contain three small files of 10 MB, 20 MB, 10 MB and three large files of 200 MB, 100 MB, 200 MB, and six small files outside directory of 10 MB, 20 MB, 10 MB, 20 MB, 10 MB, 20 MB (2 sec timeout)
Test finished in 0.252 seconds
RESULT passed

You can also use this one as a baseline implementation for performance evaluation, which means you can compare the execution time of your pzip with that of the provided one in the final report. Note that after building your own pzip (using make or make test), the provided pzip file will be overwritten. But don't worry, you can always copy it from /public/cs3103/project/pzip.

2. Writing your pzip program

The pzip.c is the file that you will be handing in and is the only file you should modify. Write your code from scratch or simply borrow the code from your czip to implement this parallel version of zip. Again, it's a good chance to learn (as a side effect) how to use a proper code editor such as vim (see footnotes 1 and 2).

3. Building your program

A simple makefile that describes how to compile pzip is provided for you.

To compile your pzip.c and to generate the executable file, use the make command within the directory that contains your project. It will display the command used to compile the pzip.

$ make
gcc -Wall -Werror -pthread -O pzip.c -o pzip

Note that the -Werror compiler flag is specified. It causes all warnings to be treated as build errors.
It is better to fix the compile issues than to disable the -Werror flag.

If everything goes well, there will be an executable file pzip in the directory:

$ ls
Makefile pzip pzip.c README.txt tests

If you make changes to pzip.c later, you should re-compile the project by running the make command again.

To remove any files generated by the last make, use the make clean command.

$ make clean
rm -f pzip
$ ls
Makefile pzip.c README.txt tests

4. Testing your C program

We also provide 10 test cases for you to test your code. You can find them in the directory tests/tests-pzip/. The makefile can also trigger the automated testing scripts: type make run (run testing only) or make test (build your program and run testing).

1 Interactive Vim tutorial, https://www.openvim.com/
2 Vim Cheat Sheet, https://vim.rtorr.com

$ make test
TEST 0 – clean build (program should compile without errors or warnings)
Test finished in 0.189 seconds
RESULT passed
TEST 1 – single file test, a small file of 10 MB (2 sec timeout)
Test finished in 0.045 seconds
RESULT passed
TEST 2 – multiple files test, twelve small files of 10 MB, 20 MB, 30 MB, 10 MB, 20 MB, 30 MB, 10 MB, 20 MB, 30 MB, 10 MB, 20 MB, 30 MB (2 sec timeout)
Test finished in 0.101 seconds
RESULT passed
TEST 3 – empty file test (2 sec timeout)
Test finished in 0.013 seconds
RESULT passed
TEST 4 – no file test (2 sec timeout)
Test finished in 0.014 seconds
RESULT passed
TEST 5 – single large file test, a large file of 100 MB (2 sec timeout)
Test finished in 0.074 seconds
RESULT passed
TEST 6 – multiple large files test, six large files of 100 MB, 200 MB, 300 MB, 100 MB, 200 MB, 300 MB (2 sec timeout)
Test finished in 0.435 seconds
RESULT passed
TEST 7 – directory test, a directory that contains twelve small files of 10 MB, 20 MB, 30 MB, 40 MB, 10 MB, 20 MB, 30 MB, 40 MB, 10 MB, 20 MB, 30 MB, 40 MB (2 sec timeout)
Test finished in 0.135 seconds
RESULT passed
TEST 8 – mixed test 1, a directory that contains six small files of 10 MB, 20 MB, 10 MB, 20 MB, 10 MB, 20 MB and six large files outside directory of 100 MB, 200 MB, 300 MB, 100 MB, 200 MB, 300 MB (2 sec timeout)
Test finished in 0.410 seconds
RESULT passed
TEST 9 – mixed test 2, a directory that contains six large files of 100 MB, 200 MB, 100 MB, 200 MB, 100 MB, 200 MB, and six small files outside directory of 30 MB, 40 MB, 30 MB, 40 MB, 30 MB, 40 MB (2 sec timeout)
Test finished in 0.374 seconds
RESULT passed
TEST 10 – mixed test 3, two directories that contain three small files of 10 MB, 20 MB, 10 MB and three large files of 200 MB, 100 MB, 200 MB, and six small files outside directory of 10 MB, 20 MB, 10 MB, 20 MB, 10 MB, 20 MB (2 sec timeout)
Test finished in 0.252 seconds
RESULT passed

The job of those automated scripts is to run your pzip on the test cases in order and check whether your pzip zips the input files correctly. TEST 0 (available for make test) will fail if your program compiles with errors or warnings. A time limit is set for each test case, and if this limit is exceeded, the test will fail. In addition, the script will record the running time of your program for performance evaluation. Shorter time/higher performance will lead to better scores.

Below is a brief description of each test case:
- Test cases 1-6: Test cases 1 and 2 use small files. For test case 3, the file is empty. For test case 4, there is no file. If no files are specified, the program should exit with return code 1 and print "pzip: file1 [file2 …]" (followed by a newline).
  Test case 1) single file test: a small file.
  Test case 2) multiple files test: twelve small files of different file size.
  Test case 3) empty file test.
  Test case 4) no files test.
  Test case 5) single large file test: a large file.
  Test case 6) multiple large files test: six large files of different file size.
- Test cases 7-10: Some files are stored in a directory, and you are required to compress the directory and other files. Do note that if multiple files are passed to pzip, they are compressed into a single compressed output.
The information that multiple files were originally input into pzip is lost.
  Test case 7) directory test: a directory that contains twelve small files of different file size.
  Test case 8) mixed test 1: a directory that contains six small files, and six large files outside the directory.
  Test case 9) mixed test 2: a directory that contains six large files, and six small files outside the directory.
  Test case 10) mixed test 3: two directories that contain three large files and three small files, and six small files outside the directories.

Each test consists of 5 files with different filename extensions:
- n.json (in tests/config/ directory): The configuration of test case n.
  - binary: Indicates that the data type of input and output is binary.
  - filename: The test case number.
  - filesize: The file size of each file. All numbers are included in a list. If an item is itself a list of numbers, it represents a directory.
  - timeout: Time limit (seconds).
  - seed: The seed used to generate the content of input files.
  - description: A short text description of the test.
  - preparation: Code to run before the test, to set something up.
- n.rc (in tests/stdout/ directory): The return code the program should return (usually 0 or 1).
- n.out (in tests/stdout/ directory): The standard output expected from the test.
- n.in (in tests/tests-pzip/ directory): The input file of the test case.

Just like in Assignment 1, you can run and test your pzip manually.
For example, to run your pzip to compress the input file 1-0.in in tests/tests-pzip and save the compressed file as 1.out, enter:

$ ./pzip ./tests/tests-pzip/1/1-0.in > 1.out

To run your pzip to compress multiple input files (the input files 2-0.in, 2-1.in, 2-2.in) in tests/tests-pzip, and save the compressed file as 2.out, enter:

$ ./pzip tests/tests-pzip/2/2-0.in tests/tests-pzip/2/2-1.in tests/tests-pzip/2/2-2.in > 2.out

To run your pzip to compress the input directory 7-0 in tests/tests-pzip and save the compressed file as 7.out, enter:

$ ./pzip tests/tests-pzip/7/7-0 > 7.out

To run your pzip to compress the input directory 8-0 and some input files (the input files 8-0.in, 8-1.in) in tests/tests-pzip and save the compressed file as 8.out, enter:

$ ./pzip tests/tests-pzip/8/8-0 tests/tests-pzip/8/8-0.in tests/tests-pzip/8/8-1.in > 8.out

To run other test cases, please run as follows:

$ ./pzip tests/tests-pzip/3/3-0.in > 3.out
$ ./pzip tests/tests-pzip/5/5-0.in > 5.out
$ ./pzip tests/tests-pzip/9/9-0 tests/tests-pzip/9/9-0.in tests/tests-pzip/9/9-1.in > 9.out
