Comp 4510

Comp 4510 – Assignment 3 (Part A)
Due: Due anytime before October 21, 2021 (11:59pm)
Instructions:
This assignment should be completed on the Mercury machine using MPI. The MPI Programming Environment
document on the course website describes how to log into these machines and use them – please read it before
you begin.
You will be expected to follow the Assignment Guidelines, which are available in the assignments folder on the
course website. This document describes how to organize and hand in your code. Please review it before you
begin. Note that a failure to follow these standards will result in a loss of marks. Always make sure your program
works for any input and optimize your code (do not add unnecessary statements to make your program workmarks
will be deducted.). Please place the answers to the written part of the programming questions (I) and
written questions (II) in a single word document. Submit all programs that you have written or made
modifications to. Also submit all output files. Give appropriate names to the output files: for example,
question1_datasize_processsize.out.) You need not submit the original files provided by the instructor if not
requested.
This assignment is to learn MPI programming through an interesting maths application.
Total Marks: [50] This assignment is worth 10% of the total assignment marks.
I. PROGRAMMING QUESTIONS [35]

  1. [10] Consider the example we discussed in class from LAM-MPI manual (page 12). The example
    was to illustrate dynamic load balancing using an on-demand approach. We will now look at
    converting this program into a static load balancing single program multiple data approach with some
    modifications. [Note: Think about how you will send and receive: batches, non-batches, point-topoint,
    collective, etc. Choose the MPI routines that best fits the architecture to reduce
    communication/synchronization latencies.]
    a. Assume there are N (NUM_WORK_REQs) exam papers, q markers. Assume N is divisible
    by q.
    b. Each marker is assigned N/q exam papers to mark. Assume one of the markers (Process 0
    as in the manual) is also the manager who collects the results of the exam papers.
    c. Assume an array (A) of size N, where the ? row in A is the ID of the exam paper [Or, you
    may choose another data structure such as a linked list and add arguments to the list as
    needed.]
    d. Each exam paper (j) stores two values: ID of the marker and mark [ “struct” (record) data
    type would be appropriate].
    e. Each marker (similar to the slave() procedure that computes result) computes the
    mark of paper j as follows: mark = ( + 1)* j +N/q.
    f. Each marker sends the ID and mark of the exam papers it was assigned, to the manager.
    g. The manager should print out for each exam paper in sequence, the marker ID, paper ID and
    mark on each line. That is “Marker ID = …, Paper ID = …, Mark = …”)
  2. [25] In this question, you will use MPI to parallelize a program that generates Julia Set Fractals (a
    type of mathematically defined image). Several sample pictures of Julia Set Fractals are shown on
    the following page. The sequential program that generated these images is available in the
    assignment folder on the course website.
    The Sequential Program
    Download the sequential code (and upload it to Mercury) and try running it. Look at the output image
    you get (it’s written to the file out.bmp when the program finishes). You will need to transfer the
    image back to your own local machine to view it.
    Take a look at the main() and compute_row() functions. The main() function contains a loop
    that calls compute_row() for each row in the output image. The compute_row() function, in turn,
    contains a loop that goes through each column (pixel) in the row, does some calculations to figure out
    what colour it should be, then stores the result in the data array. It also updates the hist array, which
    contains the counts (frequencies) of each of the distinct values in the data array.
    Parallelizing the Code
    Your task is to make modifications to the sequential code to parallelize it using MPI function calls. Each
    of the modifications you will need to make are described below. To run your parallel program, you will
    need a parallel myjob script (the one in the Julia example code only launches one process). You can
    use a copy of the parallel myjob script from the hello MPI example provided on the course website.
    Your parallel MPI version of the code will launch Q processes. The root process will act as a
    “supervisor” (master), and the others will be the “workers” (slaves).
    We will parallelize the code by partitioning the work of computing the rows (currently done by the
    loop in main()) among the Q-1 worker processes. Each of the Q–1 workers should compute exactly
    HEIGHT / (Q-1) rows of the output image. You may assume that
    HEIGHT % (Q-1) == 0 so that each worker gets exactly the same number of rows with no
    leftovers (or shortages).
    In order to do this, each worker process will need its own copy of a data array and a hist array (each
    with enough space to process its own chunk of rows). When a worker is finished its computation (which
    it can do by simply calling compute_row() - though you may have to make some small modifications
    to the indexing of the arrays inside the function), it should send these arrays back to the supervisor
    process.
    The supervisor process’s job is to wait to receive the data and hist arrays from each of the worker
    processes. When a worker sends its results, it should store the chunk of the data array that is received
    in the appropriate location in its own (full size) data array, and add the values in the received hist
    array into its own hist array. Finally, once all workers have sent back their results, the supervisor
    should call the functions hist_eq() and write_bmp() to output the image, as main() does at the
    end of the existing sequential code.
    When you are finished, generate a test image using your program. To do this, change the SCALE and C
    constants in the julia_iters.h file to these values:
    define SCALE 4.0define C -0.839 + 0.4I【Comp 4510】This will generate a computationally intensive fractal. Run your program with 41 processes (1
    supervisor and 40 workers.1 Note that this may take some time (usually between 5 and 10 minutes).
    Include the resulting image (out.bmp) with your code when you submit your assignment.
    Notes:
    ? The only functions you’ll need to modify here are main() and compute_row() - though
    you are free (and encouraged) to add your own new functions if you want.
    ? While you do not need to understand the math that goes into computing the fractal to complete
    this question (i.e. the stuff that the other functions in the code do), if you are curious, a brief
    description is provided below.
    ? The provided code uses a simple BMP library (“Quick and Dirty BMP”) to write the image
    files. This is what the “libqdbmp.a” file (and corresponding header file) is for. The existing
    makefile links it with your program.
    ? You can use PSCP for secure file transfer between the mercury server (MPI programming
    environment) and your local machine. Details on how to use it is given in the link
    https://it.cornell.edu/manage...
    If you’re interested, here are the settings for generating each of the images below (feel free to try out
    your own as well!):
    SCALE = 3.0 SCALE = 3.0 SCALE = 3.0 SCALE = 3.0
    C = 1.0 + 0.5I C = 1.0 + 0.05I C = 0.55 + 0.85I C = 0.05 + 1.0
    Although it is not necessary to understand the Julia set algorithm to do this question, if you are
    interested, a brief description is given below:
    The math
    The Julia set is a mathematically defined set of complex numbers. Suppose we have a complex number
    0. To test whether or not 0is in the Julia set, we can use the following process:
  3. Generate a new complex number 1 by setting i = 1 in the following equation:
    = (?1)
  4. Repeat the above process (setting i = 2, 3, ... n), to generate a sequence of complex numbers
    0, 1, 2, . . . . If the imaginary part of the values in the sequence tend toward infinity, then 0is not in
    the set. If they do not, then 0is in the set.
    Generating the image
  5. Note: do NOT hardcode the number of processes in your program – use the MPI_Comm_size()
    function as described in class. You can find an example of this in the hello_world example code
    available on the course website.
    To generate an image using the Julia set, we go through every pixel (x, y) and combine them in some
    way (how doesn’t really matter, so long as it’s consistent) to create our initial complex number 0. We
    then apply steps 1 and 2 of the above process in a loop, stopping when either:
    a. the imaginary part of the number exceeds 50 (indicating that it is tending toward infinity), or
    b. we exceed 216 iterations (indicating that it is not).
    When the loop stops, we record the number of iterations it took in an array called data.
    After we have completed the steps from above paragraph for every pixel in the output image, the data
    array is full of the iteration counts for each pixel in the output image. We can now generate the image
    by interpreting each of these counts as a 24-bit RGB colour value (with the lower 8 bits representing
    blue, the next 8 bits green, and the upper 8 representing red), and writing those colour values into an
    BMP file.
    However, there is a problem with this process. If all of the iteration counts are very similar (say
    between 100 and 110), they will all generate very similar colours, making it very hard to see the
    patterns in the image. To prevent this, before we interpret the data array as RGB values, we use a
    technique called histogram equalization to normalize the distribution of the iteration counts it contains.
    This involves the use of another array called hist (used to store the frequencies of each of the values
    in data), which you’ll see in the code. After we have tweaked the data values to fix the colours, we
    convert them to RGB colour values and write them to the image file.
    II. WRITTEN QUESTIONS [15]:
  6. [11] Suppose MPI_COMM_WORLD consists of the three processes 0,1, and 2, and suppose the
    following code is executed:
    int x, y, z;
    switch(my_rank) {
    case 0: x=0; y=1; z=2;
    MPI_Bcast(&x, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Send(&y, 1, MPI_INT, 2, 43, MPI_COMM_WORLD);
    MPI_Bcast(&z, 1, MPI_INT, 1, MPI_COMM_WORLD);
    break;
    case 1: x=3; y=8; z=5;
    MPI_Bcast(&x, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast(&y, 1, MPI_INT, 1, MPI_COMM_WORLD);
    break;
    case 2: x=6; y=7; z=8;
    MPI_Bcast(&z, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Recv(&x, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD,
    &status);
    MPI_Bcast(&y, 1, MPI_INT, 1, MPI_COMM_WORLD);
    break;
    }
    Indicate in the table below, the output of the values (x, y, z) after execution of each of the
    instructions by each of the processes. Show all your work.
    Process 0 Process 1 Process 2
    Bcast, x = ? Bcast, x = ? Bcast, z = ?
    Send, y = ? Bcast, y = ? Recv, x = ?
    Bcast, z = ? z = ? Bcast, y = ?
    (x,y,z) = ? (x,y,z) = ? (x,y,z) = ?
  7. [2] Consider the following code for two processes:
    if (myrank == 0) then {
    compute (); %this is a function
    MPI_Barrier();
    else
    donothing(); %this is a function;
    Is this a safe program? Explain your answer.
  8. [2] Consider the following code for two processes:
    if (rank == 0) {
    X = 5;
    MPI_ISend(&X, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, &request1);
    MPI_Test(&request1, &flag, &stat1);
    X= X +2;
    Dosomething(); //This function is NOT dependent on X
    }
    else
    MPI_RECV(&X, 1, MPI_INT, 0, tag, MPI_COMM_WORLD,&status);
    The intended behaviour is that Process 1 should receive X = 5. Explain why this is not guaranteed.
    Rewrite the code so this is guaranteed.

    推荐阅读