[COMP6714 Source Code] COMP6714 Project
It will take you quite some time to complete this project; therefore, we earnestly
recommend that you start working as early as possible. You should read the specs carefully at
least 2-3 times before you start coding.
Project Specification
Instructions

  1. This notebook contains instructions for COMP6714-Project.
  2. You are required to complete your implementation for part-1 in a file project.py
    provided along with this notebook. Please DO NOT ALTER the name of the file.
  3. You are not allowed to print out unnecessary stuff. We will not consider any output printed
    out on the screen. All results should be returned in appropriate data structures via
    corresponding functions.
  4. You can submit your implementation for Project via give.
  5. For each question, we have provided you with detailed instructions along with question
    headings. In case of problems, you can post your query @ Piazza.
  6. You are allowed to add other functions and/or import modules (you may have to for this
    project), but you are not allowed to define global variables. Only functions are allowed in
    project.py.
  7. You should not import unnecessary and non-standard modules/libraries. Loading such
    libraries at test time will lead to errors and hence a mark of 0 for your project. If you are
    not sure, please ask @ Piazza.

Note:
Submission deadline for the Project is 20:59:59 on 19th Nov, 2021.
LATE PENALTY: 10% on day-1 and 30% on each subsequent day.

Allowed Libraries:
You are required to write your implementation for the project using Python 3.6.5. You are
allowed to use any Python standard libraries (https://docs.python.org/3.6/l...).

Part One - Group Varint Encoding
Input Format:
The function encode() should receive one argument:
posting_list, which is a list of integers, where each integer represents a document ID
(all the document IDs are sorted).
Output Format:
Your output should be a bytearray, which is the group varint encoding for posting_list.
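The specification does not spell out the byte layout, so the following is only a minimal sketch under stated assumptions, inferred from the toy example values in this notebook: document IDs are first converted to gaps, each group of four gaps is preceded by one selector byte holding four 2-bit length codes (first gap in the highest bits), and each gap is stored in little-endian order using 1 to 4 bytes. How a final group with fewer than four gaps should be handled is not stated; writing it short, as done here, is an assumption.

```python
def encode(posting_list):
    """Sketch of group varint encoding over the gaps of sorted document IDs."""
    # Convert sorted document IDs into gaps; the first gap is the first ID itself.
    gaps = []
    previous = 0
    for doc_id in posting_list:
        gaps.append(doc_id - previous)
        previous = doc_id

    encoded = bytearray()
    for start in range(0, len(gaps), 4):
        group = gaps[start:start + 4]
        selector = 0
        payload = bytearray()
        for slot, gap in enumerate(group):
            # Bytes needed for this gap; a gap of 0 still takes one byte.
            length = max(1, (gap.bit_length() + 7) // 8)
            if length > 4:
                raise ValueError("gap does not fit in 4 bytes")
            # Two selector bits per slot, first slot in the highest bits.
            selector |= (length - 1) << (6 - 2 * slot)
            # Assumed little-endian byte order, matching the toy example bytes.
            payload += gap.to_bytes(length, "little")
        encoded.append(selector)
        encoded += payload
    return encoded
```

Under these assumptions, encode([1, 16, 527, 131598]) yields the eight bytes 06 01 0f ff 01 ff ff 01 that appear in the toy example.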
In [1]:
def encode(posting_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let posting_list be:
In [2]:
posting_list = [1, 16, 527, 131598]
In [3]:
encoded_list = encode(posting_list)
In [6]:
[bin(code)[2:].zfill(8) for code in encoded_list]
Out[6]:
['00000110',
 '00000001',
 '00001111',
 '11111111',
 '00000001',
 '11111111',
 '11111111',
 '00000001']
Part Two - Group Varint Decoding
Input Format:
The function decode() should receive one argument:
encoded_list, which is a bytearray that corresponds to the encoded binary sequence.
Output Format:
Your output should be a list of integers, where each integer represents a document ID that is
decoded from the encoded list.
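A decoder can be sketched in the same hedged way. The layout assumed here is inferred from the toy example bytes rather than stated by the specification: each group starts with a selector byte whose four 2-bit fields (first value in the highest bits) give each value's length minus one, values are little-endian gaps, and document IDs are recovered by a running sum. Stopping as soon as the input runs out, so that a final group may hold fewer than four values, is an assumption.

```python
def decode(encoded_list):
    """Sketch of group varint decoding back into sorted document IDs."""
    doc_ids = []
    previous = 0
    position = 0
    while position < len(encoded_list):
        selector = encoded_list[position]
        position += 1
        for slot in range(4):
            if position >= len(encoded_list):
                break  # assumed: a short final group simply ends here
            # Each 2-bit field stores (length - 1) for one value.
            length = ((selector >> (6 - 2 * slot)) & 0b11) + 1
            chunk = encoded_list[position:position + length]
            gap = int.from_bytes(chunk, "little")
            position += length
            # Undo the gap encoding with a running sum.
            previous += gap
            doc_ids.append(previous)
    return doc_ids
```

Under these assumptions, decoding bytearray(b'\x06\x01\x0f\xff\x01\xff\xff\x01') gives [1, 16, 527, 131598].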
In [55]:
def decode(encoded_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let encoded_list be:
In [66]:
encoded_list = bytearray(b'\x06\x01\x0f\xff\x01\xff\xff\x01')
In [67]:
decoded_list = decode(encoded_list)
In [9]:
decoded_list
Out[9]:
[1, 16, 527, 131598]
Part Three - Evaluation
In this part, you need to implement a function that computes the F1 score and MAP with the
given information.
Input Format:
The function evaluation() should receive two arguments:
rel_list is a list of 0s and 1s, where 0 indicates that the corresponding document is
irrelevant, and 1 indicates that the corresponding document is relevant.
total_rel_doc is an integer that indicates the total number of documents relevant to the query.
Output Format:
Your output should be two float numbers, where the first one is the F1 score, and the second
one is the MAP.
In [71]:
def evaluation(rel_list, total_rel_doc):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let rel_list and total_rel_doc be:
In [94]:

0.43
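The specification names the two metrics but not their formulas, so the following is a minimal sketch under assumptions: rel_list is the ranked result list for a single query, precision is the number of relevant results divided by len(rel_list), recall is the number of relevant results divided by total_rel_doc, and "MAP" for a single query reduces to average precision normalised by total_rel_doc. Returning 0.0 for both values when nothing relevant is retrieved is also an assumption.

```python
def evaluation(rel_list, total_rel_doc):
    """Sketch: return (F1, average precision) for one ranked result list."""
    retrieved = len(rel_list)
    relevant_retrieved = sum(rel_list)
    # Assumed convention: both scores are 0.0 when they are otherwise undefined.
    if retrieved == 0 or total_rel_doc == 0 or relevant_retrieved == 0:
        return 0.0, 0.0

    precision = relevant_retrieved / retrieved
    recall = relevant_retrieved / total_rel_doc
    f1 = 2 * precision * recall / (precision + recall)

    # Average precision: mean of precision@k over the ranks k of relevant
    # results, divided by the total number of relevant documents.
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(rel_list, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    average_precision = precision_sum / total_rel_doc
    return f1, average_precision
```

For the hypothetical input rel_list = [1, 0, 1, 0] with total_rel_doc = 3 (not the notebook's own example), this sketch gives F1 = 4/7 (about 0.571) and average precision = 5/9 (about 0.556).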
