Chris Pollett> CS255
( Print View )

Student Corner:
[Submit Sec1]
[Grades Sec1]

[Lecture Notes]
[Discussion Board]

Course Info:
[Texts & Links]
[Description]
[Course Outcomes]
[Outcomes Matrix]
[Course Schedule]
[Grading]
[Requirements/HW/Quizzes]
[Class Protocols]
[Exam Info]
[Regrades]
[University Policies]
[Announcements]

HW Assignments:
[Hw1] [Hw2] [Hw3]
[Hw4] [Hw5] [Quizzes]

Practice Exams:
[Midterm] [Final]

HW#2 --- last modified February 16 2022 23:32:26.

Solution set.

Due date: Mar 7

Files to be submitted:
  Hw2.zip

Purpose: To gain experience with analyzing and coding parallel algorithms

Related Course Outcomes:

CLO2 -- Analyze or code a parallel algorithm using a thread library

CLO3 -- Analyze or code a parallel algorithm using a library such as OpenCL

Specification:

This homework will consist of two parts, a written part and a coding part. Both parts will be submitted in the Hw2.zip file. The written part should be in a file Hw2.pdf. For the written part of the homework you should write solutions for the following questions. In what you turn in, make sure to write the names and student ids for each group member. For each problem, first copy and paste the question, then write your solution to it beneath it.

  1. Do problem 27.1-1 out of CLRS.
  2. Do problem 27.1-7 out of CLRS.
  3. Do problem 27.2-4 out of CLRS
  4. Suppose we have an array of n many `n times n` matrices `A_1, A_2,... A_i ...,A_n`. Design a multithreaded algorithm which computes `\prod_{i=1}^n A_i` with work at most `O(n^4)` and span `O(log^3n)`.
  5. Suppose we have an array A[1] to A[n] where each entry of which has a positive integer. Devise a CREW PRAM algorithm that sets each of these A[i]'s to the average value rounded down of the A[i]'s in `O(log n)` steps using at most `n` processors.

For the coding part of this homework I would like you to write two parallel programs in Java: ThreadMaxProd.java and JoclMaxProd.java. These program should take as input two equal length column vectors `vec v` and `vec w` and return the pair (`argmax_i v_i cdot w_i`, `max_i v_i cdot w_i`) where v_i, w_i is the value of row `i` in `vec v` and `vec w` respectively. The first program makes use of Java Threads and the second makes use of JOCL jar file that provides Java bindings to OpenCL. The column vectors will come from a file as described below. I would like you to code both of your programs so that their spans are `O(log n)`. ThreadMaxProd.java will be compiled from the command line via:

javac ThreadMaxProd.java

It can use either classic Java Threads or the Fork Join/Parallel Array frameworks in java.util.concurrent. We didn't talk about the latter so you'll be own your own to learn about if you want to use those. To run your program I will then type:

java ThreadMaxProd filename_with_vector_data

On this input, your program should read in the contents filename_with_vector_data, which should consist of lines with two comma separated integers followed by a new line character/line. It make two vectors `vec x` and `vec y` from these, and the index at which `x_i cdot y_i` achieves the largest value followed by a comma, followed by that largest value. For example, I might have the file my_vectors.txt with contents:

1,-1
2,1
10,5
3,4

should output:

2,50

Your JoclMaxProd.java program should do exactly the same thing, but use JOCL rather than Java Threads. I.e., I'll compile it with

javac JoclMaxProd .java

You can assume I have set up the classpath to find the JoCL jar file. Then I'll run your program with a line like:

java JoclMaxProd  filename_with_vector_data

For each program you should add a mechanism of your choice to time just the portion of the code in which parallel processing is done (not reading in the file, you probably want to be using System.nanoTime()). I want you to do experiments with both programs varying the length of the vector you compute the outer of. Look up the number of cores the machine you are experimenting on has, and the number of GPU shader processors it has. If you plot time versus the length of vectors you compute the norms of, does it match what you'd expect in each case if Brent's Theorem were an equality rather than an inequality? Write up your experiments also in Hw2.pdf. This timing should be turned off by default.

The above concludes the description of the required homework.

Point Breakdown

Written problems (1pt each - graded 0, 1/2 (partial), 1 ( correct)). 5pts
Programs compile and run from the command line as described (1pt). 1pt
ThreadMaxProd.java uses a `O(log n)` span algorithm to compute the max product pair using Java Threads(1pt). Program correctly computes the max product pair of a set of test vectors (1/2pt) 1.5pts
JoclMaxProd.java uses a `O(log n)` span algorithm to compute the max product pair using JOCL(1pt). Program correctly computes the max product pair of a set of test vectors (1/2pt) 1.5pts
Experiments write up 1pt
Total 10pts