MIPS simulator 项目

【MIPS simulator 项目】Project 1

  1. Overview
    The first project is going to be a MIPS simulator. In general, you will be building a program that
    simulates the execution of an assembly language file. The input of your program will be a MIPS
    file that contains MIPS assembly language code. Examples will be given in the latter part of this
    instruction.
    1.1 Readings:
    Please read Appendix A from the text book before you start writing your first project. Make sure
    you understand what each MIPS instruction is doing. All of the supplementary materials for this
    project can be found in Appendix A, such as the register numbers, instructions, and their machine
    code format.
  2. MIPS
    MIPS is an assembly language, which is a low level programming language that each line of
    code corresponds to one machine instruction.
    2.1 Instruction
    Machines cannot understand high level programming languages directly, such as C/C++/JAVA.
    High level programming languages are "translated" to machine instructions that machine can
    understand. Assembly languages, including MIPS we deal with, are the readable (although
    difficult) version of machine code. In assembly languages, one instruction tells the computer to do
    one thing exactly. For example, an instruction may look like this:
    This instruction tells the computer to add up the things stored in register $t1 and register $t2
    and store the result in $t3. Here, registers are small chunks of memory in CPU used for program
    execution. The MIPS assembly language has three types of instruction in general: I-type, R-type,
    and J-type, each corresponds to a pattern of the 32 bits of machine code. Details can be found in
    Appendix A of the text book. The above instruction has the machine code:
    It does not make sense at a glance, however, it follows a certain pattern. The add instruction is
    a R-instruction, so it follows the pattern of:
    add $t0, $t1, $t2
    00000001001010100100000000100000
    R-instruction:
    op rs rt rd shamt funct
    | 6bits | 5bits | 5bits | 5bits | 5bits | 6bits |
  3. op: operation code, all zeros for R-instructions.
  4. rs: the first register operand
  5. rt: the second register operand
  6. rd: the destination register
  7. shamt: shift amount. 0 when N/A
    Here, we go through how this instruction and its machine code corresponds to each other.
  8. The first 6 bits for R-instruction are for operation code, which are all zeros for R-type.
  9. The following three 5-bit slots are the register numbers specified in the instruction. "rs" and
    "rt" represents the first and second register operand in the instruction, and "rd" represents
    the destination register. Here, the register number of $t0, $t1, and $t2 are 8, 9, 10,
    respectively. These translate to 01000, 01001, 01010 in binary, respectively. Putting them in
    the right place, we have the middle 15-bits.
  10. "shamt" represents the shift amount, which are only in certain instructions (such as sll, srl),
    and 0's are filled when N/A. In add instruction, these 5 bits are zeros.
  11. The last 6 bits are for function code. The function code for add is 32, which is 100000 in
    binary.
    Machine codes of other R-instructions are constructed through the same process. As for Iinstructions
    and J-instructions, I will not go through an example. The formats of these two types of
    instructions are:
    2.2 MIPS programs
    Now you know what do MIPS instructions look like, but what does a MIPS program look like?
    Here is a general format a MIPS program follows:
  12. funct: function code, which is used to identify which R-instruction this is.
    The add instruction has the format:
    add $rd, $rs, $rt
    Therefore, for add $t0, $t1, $t2, we have:
  13. 01001 01010 01000 00000 100000
    I-instruction:
    op rs rt immediate
    | 6bits | 5bits | 5bits | 16bits |
  14. op: the operation code that specifies which I-instruction this is.
  15. rs: register that contains the base address
  16. rt: the destination/source register (depends on the operation)
  17. immediate: a numerical value or offset (depends on the operation)
    J-instruction:
    op address
    | 6bits | 26bits |
  18. op: the operation code that specifies which J-instruction this is.
  19. address: the address to jump to, usually associate with a label. Since the
    address of an instruction in the memory is always divisible by 4 (think about
    why), the last two bits are always zero, so the last two bits are dropped.
    In general, a MIPS program looks like this. All of the MIPS code goes under the .text section, and
    all of the static data in the program are under .data section. As you can see, for each piece of
    static data, we have a name to it, just like what we do in high level programming languages. "str1"
    is the name to that piece of data, .asciiz is the data type of it, and "hello world!\n" is the value.
    There are many other types of data, you can find them in Appendix A.
    For the code part, as you can see, we also have a "name" called "main". This is the label
    representing the line of code. Usually this is used for indicating the start of a loop, a function, or a
    procedure. Recall that all of these codes are stored somewhere in the memory, which means that
    each line of code has a specific address. To better understand this, see section 2.3.
    2.3 How computer runs the program?
    With the idea that all of the codes are stored in memory, and each has an address, we can now
    talk about how computers run these codes. Long story short, the computer runs the programs
    following the machine cycle.
    2.3.1 Machine cycle
    A shorter version of a machine cycle looks like this:
  20. The computer loads the line of instruction PC is "pointing at".
  21. The computer increment PC by 4 (think about why).
  22. The computer runs the instruction loaded.
    This goes on until the program terminates. PC in this context represents the "program counter".
    In other words, PC is the "pointer" the computer maintains that stores the address of the
    next instruction to be executed.
  23. Project 1 details
    3.1 Requirements
  24. Your project 1 should be written in C/C++ only.
  25. You will need to write your own makefile/cmake.
  26. The testing environment is provided by Mr. Yifan Zhu and Mr. Guochao Xie. You need to
    make sure your program can execute without a problem on the VM/Docker they provided.
    You can access the testing environment through VM instruction on BB.
  27. The detailed list of the MIPS instructions you need to support will be announced later.
    3.2 Assembler
    .data #static data go here
    str1: .asciiz "hello world!\n"
    .text #MIPS code goes here
    main: add $t0, $t1, $t2
    ...
    3.2.1 Overview
    The first task you are doing is assembling the given MIPS code to their corresponding machine
    code. Here's a quick example of what you need to do:
    Remember that you are not assembling the .data section, nor the labels. The .data section will
    be loaded to your memory in your simulation part. The labels in your .text section will be
    translated to its corresponding address when needed. For example, when you see j R , you will
    put the address of the line of instruction label R is indicating.
    3.2.2 Details
    This part of your program is easy but needs your patience. You need to be able to:
  28. Read file line by line.
  29. Find the segments of the MIPS file. (.data, .text)
  30. Tokenize the line read in.
  31. Discard useless information, such as comments.
    Here are some ideas of how to implement this part:
  32. You need to scan through the file for the first time. This time, discard all of the comments
    (following a "#"). Remember you only need to deal with .text segment for assembling. Find all
    of the labels in the code, and store them with their corresponding address for later
    reference.
  33. You need to scan through the file for the second time, line by line. This time, you need to
    identify which instruction the line is (R, I, J). According to the instruction type, you can
    assemble the line. For lines with the label, you can refer to the stored information for the
    label's address.
    3.3 Simulator
    3.3.1 Overview
    This is the major part of your project 1. You need to have a full understanding of how computer
    executes programs, and how are things stored in memory. Your code will need to be capable of
    executing the MIPS code line by line.
    MIPS code:
    .text
    R: add $s0, $s1, $s2 #r instructions
    addu $s0, $s1, $s2
    sub $s0, $s1, $s2
    subu $s0, $s1, $s2
    Machine code:
    00000010001100101000000000100000
    00000010001100101000000000100001
    00000010001100101000000000100010
    00000010001100101000000000100011
    3.3.2 Memory & register simulation
    The first thing you will need to do is memory simulation. Think about your simulator as a minicomputer,
    that has its own main memory, CPU, etc. To simulate main memory, you need to
    dynamically allocate a block of memory with C/C++, with a size of 6MB. Here is a figure of what
    does a real computer memory look like.
    Your simulated memory should also have these components. Also, since most of you are using
    a 64-bit computer, you need to "translate" the real address of your allocated memory to a 32-bit
    simulated address. Specifically:
  34. Let's say you have the pointer named "real_mem" storing the real address of the block of
    memory allocated. The first thing you need to do is to map the value of "real_mem" to
    400000_hex. Then the real address will have a 1-to-1 mapping relationship to the simulated
    address. For instance, if the address mentioned in the MIPS testing file is 500000_hex (such
    as lw, where we want to load the data storing on 500000_hex), then, you should access it at
    real address of: (real_mem + 500000_hex - 400000_hex).
  35. The dynamically allocated 6MB memory block is pointing at the start of your text segment,
    and your text segment will be 1MB in size. The end of text segment will be at simulated
    address 400000_hex+1MB, or at address real_mem+1MB.
  36. The static data segment will start at simulated address 500000_hex, or at real address
    (real_mem+1MB).
  37. The dynamic data segment will start at wherever your static data section ends.
  38. The stack segment will start at the highest address 1000000_hex (real_mem+6MB), and it
    grows downwards (whenever you put things in there, the address decreases).
    You should also simulate the registers. The registers should not be a part of your simulated
    memory. Recall that registers are in CPU. In this project, you are not accessing the real registers,
    however, you will allocate memory for the 32 general purpose registers, which are:
    Your code should initiate the registers as described by its functionality. For example, the stack
    pointer register, $sp, should always store the current stack top. You should initialize it with a value
    of 1000000_hex.
    3.3.3 Putting things in the right place
    Your simulator should take a MIPS file as input, and you should put everything in the right place
    before your simulation. After the simulated memory is ready, you will read a MIPS file, and:
  39. Put the data in .data segment of MIPS file piece by piece in the static data segment. The
    whole block (4 bytes) is assigned to a piece of data even if it is not full. For example:
    Here, each character of .asciiz type occupies 1 byte, so "hell" occupies the first block. The first
    two bytes of the second block is used by "o" and a terminating sign "\0", but the last two
    bytes of the second block is not used. However, when we put the next piece of data in the
    memory, we start a new block. The data type .word occupies 4 bytes, so the third block is
    assigned to this piece of data.
  40. Assemble the .text segment of the MIPS file (section 3.2), and put the assembled machine
    code in the text segment of your simulated memory (the first line of code has the lowest
    address). The assembled machine code is 32 bits, which is 4 bytes. Thus, you can translate it
    to a decimal number and store it as an integer.
    3.3.4 Start simulating
    Your code should maintain a PC, which points to the first line of code in the simulated memory.
    Your code should have a major loop, simulating the machine cycle. Following the machine cycle,
    your code should be able to:
  41. Go to your simulated memory to fetch a line of machine code stored at the address PC
    indicates.
  42. PC=PC+4
  43. From the machine code, be able to know what the instruction is and do the corresponding
    things.
    The third step of the machine cycle requires you to write a C/C++ function for each instruction to
    do what it's supposed to. For example, for the add instruction, we can write in C:
    In the third step of the machine cycle, when we read a line of machine code and the
    corresponding instruction is add , we can simply call the function.
  44. Miscellaneous
    .data
    str1: .asciiz "hello"
    int1: .word 1
    in memory:
    | hell | o\0-- | 1 |
    void add (int rs, int rt, int* rd){
    rd = rs+ *rt;
    }
    4.1 Deadline
    The deadline of this project is 3/21/2021 midnight.
    4.2 Submission
    You should put all of your source files in a folder and compress it in a zip. Name it with your
    student ID. Submit it through BB.
    4.3 Grading
    This project worth 30% of your total grade. The grading details of this project will be:
  45. Assembling - 20%
  46. Memory & register simulation - 30%
  47. Proper MIPS execution - 40%
    Each unsuccessful MIPS instruction will result in 3% deduction.
  48. Report - 10%
    NOTICE: If your code does not support makefile/cmake, you will lose 50% automatic.
    4.4 Report
  49. The report of this project should be no longer than 5 pages.
  50. In your report, you should write:
  51. Your big picture thoughts and ideas, showing us you really understand how are MIPS
    programs executed in computers. (This part should be the major part of your report.)
  52. The high level implementation ideas. i.e. how you break down the problem into small
    problems, and the modules you implemented, etc.
  53. The implementation details. i.e. what structures did you define and how are they used.
  54. In your report, you should not:
  55. Include too many screenshots your code.
  56. Copy and paste others' report.
    4.5 Honesty
    We take your honesty seriously. If you are caught copying others' code, you will get an
    automatic 0 in this project. Please write your own code.

    推荐阅读