[COMP6714 Source Code] COMP6714 Project
This project will take quite some time to complete, so we strongly recommend that you start
working as early as possible. You should read the specs carefully at least two or three times
before you start coding.
Project Specification
Instructions
- This notebook contains instructions for COMP6714-Project.
- You are required to complete your implementation for part-1 in a file project.py
provided along with this notebook. Please DO NOT ALTER the name of the file.
- You are not allowed to print out unnecessary output. We will not consider any output printed
out on the screen. All results should be returned in appropriate data structures via
corresponding functions.
- You can submit your implementation for Project via give .
- For each question, we have provided you with detailed instructions along with question
headings. In case of problems, you can post your query @ Piazza.
- You are allowed to add other functions and/or import modules (you may have to for this
project), but you are not allowed to define global variables. Only functions are allowed in
project.py.
- You should not import unnecessary or non-standard modules/libraries. Loading such
libraries at test time will lead to errors and hence a mark of 0 for your project. If you are not
sure, please ask @ Piazza.
Allowed Libraries:
You are required to write your implementation for the project using Python 3.6.5. You are
allowed to use any module from the Python standard library (https://docs.python.org/3.6/l...).
Note:
Submission deadline for the Project is 20:59:59 on 19th Nov, 2021.
LATE PENALTY: 10% on day-1 and 30% on each subsequent day.
Part One - Group Varint Encoding
Input Format:
The function encode() should receive one argument:
posting_list, which is a list of integers, where each integer represents a document ID
(the document IDs are sorted).
Output Format:
Your output should be a bytearray that holds the group varint encoding of posting_list.
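The spec does not spell out the byte layout, so the following is only a sketch inferred from the toy example of this part, not a reference solution. It assumes that document IDs are first turned into gaps (the first gap being the first ID itself), that each group of four gaps is preceded by one selector byte holding four 2-bit codes (byte length minus one, first gap in the highest bits), and that each gap is stored little-endian. The handling of a final group with fewer than four IDs is purely an assumption here: unused selector slots stay 00 and the decoder stops when the bytes run out. A matching decoder, the subject of Part Two, is included only so that the round trip can be checked.

```python
def encode(posting_list):
    """Sketch: group varint encode a sorted list of document IDs."""
    encoded = bytearray()
    # Convert document IDs to gaps; the first gap is the first ID itself.
    gaps = []
    previous = 0
    for doc_id in posting_list:
        gaps.append(doc_id - previous)
        previous = doc_id
    # Emit one selector byte followed by the payload for each group of four.
    for start in range(0, len(gaps), 4):
        group = gaps[start:start + 4]
        selector = 0
        payload = bytearray()
        for slot in range(4):
            selector <<= 2
            if slot < len(group):
                gap = group[slot]
                size = max(1, (gap.bit_length() + 7) // 8)
                if gap < 0 or size > 4:
                    raise ValueError("gap must fit in 1 to 4 bytes")
                selector |= size - 1  # 2-bit code: byte length minus one
                payload += gap.to_bytes(size, "little")
        encoded.append(selector)
        encoded += payload
    return encoded


def decode(encoded_list):
    """Sketch: invert encode() under the same layout assumptions."""
    doc_ids = []
    previous = 0
    pos = 0
    while pos < len(encoded_list):
        selector = encoded_list[pos]
        pos += 1
        for slot in range(4):
            if pos >= len(encoded_list):
                break  # assumed convention: a short final group just ends
            size = ((selector >> (6 - 2 * slot)) & 0b11) + 1
            gap = int.from_bytes(encoded_list[pos:pos + size], "little")
            pos += size
            previous += gap  # undo the gap transform
            doc_ids.append(previous)
    return doc_ids
```

With the toy posting list [1, 16, 527, 131598], this sketch produces the bytes 06 01 0f ff 01 ff ff 01, which correspond to the eight binary strings listed in the toy example.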
In [1]:
def encode(posting_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let posting_list be:
In [2]:
posting_list = [1, 16, 527, 131598]
In [3]:
encoded_list = encode(posting_list)
In [6]:
[bin(code)[2:].zfill(8) for code in encoded_list]
Out[6]:
['00000110',
 '00000001',
 '00001111',
 '11111111',
 '00000001',
 '11111111',
 '11111111',
 '00000001']
Part Two - Group Varint Decoding
Input Format:
The function decode() should receive one argument:
encoded_list, which is a bytearray that holds the encoded binary sequence.
Output Format:
Your output should be a list of integers, where each integer represents a document ID that is
decoded from the encoded list.
In [55]:
def decode(encoded_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let encoded_list be:
In [66]:
encoded_list = bytearray(b'\x06\x01\x0f\xff\x01\xff\xff\x01')
In [67]:
decoded_list = decode(encoded_list)
In [9]:
decoded_list
Out[9]:
[1, 16, 527, 131598]
Part Three - Evaluation
In this part, you need to implement a function that computes the F1 score and MAP from the
given information.
Input Format:
The function evaluation() should receive two arguments:
rel_list is a list of 0s and 1s, where 0 indicates that the corresponding document is
irrelevant, and 1 indicates that the corresponding document is relevant.
total_rel_doc is an integer that indicates the total number of documents relevant to the query.
Output Format:
Your output should be two float numbers, where the first one is the F1 score, and the second
one is the MAP.
In [71]:
def evaluation(rel_list, total_rel_doc):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let rel_list and total_rel_doc be:
In [94]:

0.43
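The spec gives no formulas for Part Three, so the sketch below rests on assumptions rather than on the markers' definition. It assumes rel_list is in ranked order, that precision is the fraction of listed documents that are relevant, that recall is the number of relevant documents listed divided by total_rel_doc, and that MAP for a single query is the average precision normalised by total_rel_doc. Other conventions exist (for example, dividing by the number of relevant documents actually retrieved), so check which one is expected before relying on this.

```python
def evaluation(rel_list, total_rel_doc):
    """Sketch: return (F1, average precision) for one ranked result list."""
    retrieved_relevant = sum(rel_list)
    # Guard against division by zero when nothing relevant is available.
    if not rel_list or total_rel_doc == 0 or retrieved_relevant == 0:
        return 0.0, 0.0

    precision = retrieved_relevant / len(rel_list)
    recall = retrieved_relevant / total_rel_doc
    f1 = 2 * precision * recall / (precision + recall)

    # Average precision: sum precision@k at every rank k holding a relevant
    # document, then normalise by the total number of relevant documents
    # (assumed convention).
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(rel_list, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    average_precision = precision_sum / total_rel_doc

    return f1, average_precision
```

For a hypothetical input such as rel_list = [1, 0, 1, 0] with total_rel_doc = 4, precision and recall are both 0.5, so F1 is 0.5, and the average precision is (1/1 + 2/3) / 4.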