CS602 - Data-Driven Development with Python Spring 2019 Programming Assignment
Programming Assignment 3
Getting started
Complete the reading and practice assignments posted on the course schedule. Review class handouts and
examples. This assignment requires knowledge of loops, lists and function definitions.
Programming Project: PurchaseSummary worth: 20 points
Create an item purchase summary matrix.
In this assignment you will be processing order data from a fictional coffee shop. The café’s owner would
like to optimize the business operations and has hired you to create a few summary analyses of the data.
The first task concerns looking at a summary of how many coffee drinks and pastries get sold within
different periods of a work day during any period of seven consecutive days.
The data, collected from online and in-person orders, includes a set of uniform records from January 2019,
each storing the following fields:
- Day of the month: 1 through 31.
- Time of the order: the time range of orders corresponds to the working hours of the business, which
is 6 am until midnight.
- First and last name of the person making the order, if known. 'anon' represents orders by
anonymous customers.
- Number of items ordered for each of the following products: espresso, cappuccino, americano,
muffin, and scone.
The data is supplied in a file, orderlog.py, which is posted with the assignment. For simplicity, orderlog.py
contains a definition of a Python list, stored in a variable called orderlst, as follows:
orderlst=[
['dayofmonth', 'time', 'first name', 'last name', 'espresso',
'capuccino', 'americano', 'muffin', 'scone'],
['22', '22:46:52', 'Arlyne', 'Highthorne', 3,0,4,1,0],
['2', '20:31:28', 'Mapenda', 'Rugeley', 0,1,1,3,1],
['3', '20:56:28', 'anon', 'anon', 0,0,2,2,1],
['24', '13:27:00', 'Miles', 'Cappy', 2,0,0,3,0],
. . . the rest of the content is omitted . . .
As you can see, orderlst is a two-dimensional list in which the first row holds the column titles and
each remaining row has the same structure, holding the fields described above in the order shown.
To use the orderlst list in your program, download orderlog.py into your project folder and include the
following code at the beginning of your program:
import orderlog
ORDERS = orderlog.orderlst # rename for brevity
This will make the orderlst content available to the program as the global variable ORDERS.
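As a quick illustration of this layout, the sketch below uses a two-row stand-in for orderlog.orderlst (the real file is not reproduced here) and separates the header row from the data rows. The names header and rows are illustrative only and are not part of the assignment.

```python
# Stand-in for orderlog.orderlst; in your program the list comes from orderlog.py.
ORDERS = [
    ['dayofmonth', 'time', 'first name', 'last name', 'espresso',
     'capuccino', 'americano', 'muffin', 'scone'],
    ['22', '22:46:52', 'Arlyne', 'Highthorne', 3, 0, 4, 1, 0],
    ['2', '20:31:28', 'Mapenda', 'Rugeley', 0, 1, 1, 3, 1],
]

header = ORDERS[0]   # column titles
rows = ORDERS[1:]    # one list per order

first = rows[0]
day = int(first[0])                      # the day of the month is stored as a string
scones = first[header.index('scone')]    # item counts are already integers
print(day, first[1], scones)             # 22 22:46:52 0
```

Note that the day and time fields are strings while the item counts are integers, so the day needs an int() conversion before any arithmetic.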
The program that you write must work as follows. We begin with a brief summary of the requirements:
- Ask the user for the following input:
a. Which item purchases to summarize. The input must be entered as an integer code (0
through 4); see the sample interaction.
b. The length of a unit time interval for which summaries are displayed, specified in minutes.
c. Starting date of the weekly summary (between 1 and 25).
- Create and display a matrix summarizing the quantity of the purchased item during each unit time
interval, starting from 6 am and ending at midnight, for each day starting from the specified start
date and ending 6 days later.
The following interaction, in which the user's input follows each prompt, illustrates one execution of the
program:
This program presents a weekly purchase summary.
Enter code to select item type:
0(espresso) 1(capuccino) 2(americano) 3(muffin) 4(scone): 4
Enter the time interval in minutes: 120
Please enter a starting date (between 1 and 25): 2
Displaying number of purchases of scones for 7 days starting on Wednesday, Jan 2 2019
TIME \ DAY      Wed Thu Fri Sat Sun Mon Tue
  6:00 - 7:59 |   0   0   0   0   0   1   0
  8:00 - 9:59 |   0   0   0   0   2   3   0
10:00 - 11:59 |   2   4   7   7   1   1   0
12:00 - 13:59 |   0   7   0   0   1   0   1
14:00 - 15:59 |   3   1   0   0   1   1   4
16:00 - 17:59 |   5   0   0   2   4   3   0
18:00 - 19:59 |   0   0   3   2   0   2   1
20:00 - 21:59 |   1   1   0   4   1   0   0
22:00 - 23:59 |   5   6   0   4   3   0   0
Bye!
Notice that the output above displays a matrix, annotated with row and column labels. The column labels
on top of the matrix show days of the week, starting with the day corresponding to the date entered by the
user (Wed), and ending exactly 6 days later. The row labels define the beginning and end time of each
successive unit time interval (as requested by user), starting from the 6 am opening time until the midnight
closing.
The next interaction is based on the following smaller orderlst list created specifically for testing purposes:
orderlst=[
['dayofmonth', 'time', 'first name', 'last name', 'espresso', 'cappuccino',
'americano', 'muffin', 'scone'],
['3', '09:05:54', 'anon', 'anon', 1,1,3,0,4],
['3', '09:15:54', 'anon', 'anon', 1,1,3,0,4],
['6', '10:17:17', 'Jacquelin', 'Trevale', 2,1,3,1,0],
['7','12:42:56', 'anon', 'anon', 0,2,0,2,3],
['7','12:59:56', 'anon', 'anon', 0,0,1,2,3],
['8','06:00:00', 'anon', 'anon', 2,0,0,1,0],
['8','12:30:00', 'anon', 'anon', 0,0,5,2,3],
['21', '12:33:23', 'Zuleyma', 'Pemdevon', 2,2,3,2,1],
['23', '10:01:52', 'Hongliang', 'Sapone', 1,0,3,3,1],
['30','18:31:38', 'anon', 'anon', 2,2,4,1,2]
]
Note how invalid input is treated in the interaction below:
This program presents a weekly purchase summary.
Enter code to select item type:
0(espresso) 1(cappuccino) 2(americano) 3(muffin) 4(scone): 6
Enter code to select item type:
0(espresso) 1(cappuccino) 2(americano) 3(muffin) 4(scone): 0
Enter the time interval in minutes: 90
Please enter a starting date (between 1 and 25): 28
Please enter a starting date (between 1 and 25): 0
Please enter a starting date (between 1 and 25): 3
Displaying number of purchases of espressos for 7 days starting on Thursday, Jan 3 2019
TIME \ DAY      Thu Fri Sat Sun Mon Tue Wed
  6:00 - 7:29 |   0   0   0   0   0   2   0
  7:30 - 8:59 |   0   0   0   0   0   0   0
 9:00 - 10:29 |   2   0   0   2   0   0   0
10:30 - 11:59 |   0   0   0   0   0   0   0
12:00 - 13:29 |   0   0   0   0   0   0   0
13:30 - 14:59 |   0   0   0   0   0   0   0
15:00 - 16:29 |   0   0   0   0   0   0   0
16:30 - 17:59 |   0   0   0   0   0   0   0
18:00 - 19:29 |   0   0   0   0   0   0   0
19:30 - 20:59 |   0   0   0   0   0   0   0
21:00 - 22:29 |   0   0   0   0   0   0   0
22:30 - 23:59 |   0   0   0   0   0   0   0
Bye!
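The repeated prompts in the interaction above can be produced by a small input loop. The following is a minimal sketch, not a required design: the function names item_names, item_prompt and read_int_in_range are hypothetical, and the read parameter exists only so the loop can be exercised without a keyboard. It assumes the user always types an integer.

```python
def item_names(orders):
    """Return the product titles from the header row (index 4 to the end)."""
    return orders[0][4:]


def item_prompt(names):
    """Build the menu line, e.g. '0(espresso) 1(muffin): '."""
    return ' '.join(f'{code}({name})' for code, name in enumerate(names)) + ': '


def read_int_in_range(prompt, low, high, read=input):
    """Repeat the prompt until an integer between low and high (inclusive) is entered."""
    while True:
        value = int(read(prompt))   # assumes the input consists of digits only
        if low <= value <= high:
            return value
```

With a header row in hand, the item code could then be read as read_int_in_range(item_prompt(names), 0, len(names) - 1), so the upper bound follows the number of products rather than a hard-coded 4.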
Important Notes and Requirements:
- Your program should not use any global variables except for ORDERS (as defined above).
- The item type list should be generated from the first entry of the ORDERS list, assuming the item
titles start at index 4 and end at the end of the list. If product titles in the header row of the log
file change, or more products are introduced and their counts are added past ‘scone’, your
program should still work correctly and require no modifications.
- Your program must verify the validity of the item code parameter and the date: they must be
within the correct bounds. If the entered value does not fit the bounds, the input must be repeated
until a correct value is entered. As a simplifying assumption, assume that the unit time interval
entered divides the length of the day with remainder 0, i.e. you do not need to check validity of this
parameter.
- To determine which day of the week a specified start date in January 2019 falls on, use functions of
the Python datetime module, which you must import, as illustrated by the code segment below:
import datetime
# find the day of week for Feb 22, 2019
startDate = datetime.date(2019, 2, 22)
print(startDate.month, startDate.day, startDate.year, "is a",
      startDate.strftime('%A'), "or", startDate.strftime('%a'))
produces: 2 22 2019 is a Friday or Fri
You can consult https://docs.python.org/3/lib... if you are looking for complete
documentation of the Python datetime module and its functions.
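Building on the segment above, the seven column labels can be obtained by adding a datetime.timedelta to the start date. This is one possible sketch; the function name week_day_names and its default arguments are illustrative. The abbreviations returned by strftime('%a') depend on the locale; the English names shown assume Python's default locale.

```python
import datetime


def week_day_names(start_day, year=2019, month=1, days=7):
    """Return abbreviated weekday names for consecutive days from start_day."""
    start = datetime.date(year, month, start_day)
    return [(start + datetime.timedelta(days=offset)).strftime('%a')
            for offset in range(days)]


print(week_day_names(3))   # ['Thu', 'Fri', 'Sat', 'Sun', 'Mon', 'Tue', 'Wed']
```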
Requirements on function definitions:
- You must define the following functions, plus define others as you see fit:
a. Function composeOrderMatrix(), with five integer parameters, denoting the start date,
item type, opening time, closing time, and the length of the unit interval. The function should
create and return a two-dimensional list, representing the order summary matrix shown in the
interaction as the shaded part of the order summary. In the matrix, each of the seven columns,
starting from the start date, represents one day’s data. Each entry in row r represents the
quantity of the item ordered in time interval number r+1 from the beginning of the day.
First, create the two-dimensional list populated with as many rows of 0s as the number of time
intervals that fit in the work day (this must be calculated). The length of each row must
equal the number of days.
b. Function labelString() that will produce a string label shown in the leftmost column of the
output. The function will be passed the interval number, opening time, and the length of the
unit interval as input parameters and must return a string defining the start and end time of
the interval, as shown in the sample interaction.
c. Function printWeek() with five parameters: the summary matrix, opening time, closing time,
the length of the unit interval, and the start date. The function should display the content of
the matrix as shown, with the exact formatting, which includes
i. the header row showing the day names
ii. the leftmost column, showing the time intervals,
and displays all items aligned as shown.
d. Define function main() to start and organize the program flow, read user input and call other
functions as needed.
- There should be no code outside of the functions, except for the definition of the ORDERS global
variable and a call to main().
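The matrix construction described in item a. can be sketched as follows. This is an illustration under stated assumptions rather than the required solution: it takes the data rows as an explicit first argument so that it runs standalone (the assignment's composeOrderMatrix() has five parameters and reads the global ORDERS), it assumes the opening and closing times are passed as minutes since midnight, and the name compose_order_matrix is hypothetical.

```python
def compose_order_matrix(orders, start_day, item_code, open_min, close_min,
                         interval, days=7):
    """Sum the quantities of one item per time interval (rows) and day (columns).

    orders: data rows only, with the header row already removed.
    open_min, close_min, interval: minutes (assumed representation).
    """
    n_rows = (close_min - open_min) // interval
    # Build each row separately so the rows do not share one list object.
    matrix = [[0] * days for _ in range(n_rows)]
    for order in orders:
        col = int(order[0]) - start_day
        if not 0 <= col < days:
            continue                     # order falls outside the 7-day window
        hours, minutes, _seconds = order[1].split(':')
        minute_of_day = int(hours) * 60 + int(minutes)
        if not open_min <= minute_of_day < close_min:
            continue                     # order falls outside working hours
        row = (minute_of_day - open_min) // interval
        matrix[row][col] += order[4 + item_code]   # item counts start at index 4
    return matrix
```

For the small test list shown earlier, calling this with start day 3, item code 0, times 360 and 1440, and interval 90 gives 12 rows, with 2 in row 2 of the first column, matching the 9:00 - 10:29 entry for Thursday.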
Hints:
- Remove the first row from the ORDERS list to get rid of the column headers.
- Conduct all time arithmetic in minute-based representation, converting from minutes to hours and
minutes when you need to display the time intervals.
- Create and use your own small orderlog for testing and debugging. Use the supplied file when done.
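The minute-based arithmetic suggested in the hints might look like the sketch below, which builds a row label from an interval number. It assumes that interval numbers start at 0 and that the opening time is given in minutes since midnight; the names to_clock and label_string are illustrative, and the padding needed to align the labels in a column is left out.

```python
def to_clock(minutes):
    """Convert minutes since midnight to an h:mm string, e.g. 450 -> '7:30'."""
    hours, mins = divmod(minutes, 60)
    return f'{hours}:{mins:02d}'


def label_string(interval_number, open_min, interval):
    """Return 'start - end' for the given zero-based interval number."""
    start = open_min + interval_number * interval
    end = start + interval - 1           # last minute that belongs to the interval
    return f'{to_clock(start)} - {to_clock(end)}'


print(label_string(0, 360, 120))    # 6:00 - 7:59
print(label_string(11, 360, 90))    # 22:30 - 23:59
```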
Grading
The grading schema for this project is roughly as follows:
Your program should compile without syntax errors to receive any credit. If a part of your program is
working, you will receive partial credit, but only if the program compiles without syntax errors.
4 points will be awarded for correctly handling the input, including repeated entry for invalid input
values.
8 points for correctly constructing the matrix, satisfying all code requirements.
6 points for correctly displaying the matrix in the specified form, satisfying all code requirements.
2 points will be awarded for good programming style, as defined in handout 1. Make sure to include
introductory comments to functions, documenting what the function does, the function’s parameters
and the return values.
Created by Tamara Babaian on Feb. 18, 2019