Department of Computer and Mathematical Sciences
STAC51: Categorical Data Analysis
Winter 2021
Instructor: Sohee Kang
E-mail: sohee.kang@utoronto.ca
Office: IC 483
Online Office Hours: Monday 5-6 pm and Wednesday 5-6 pm
Phone: (416) 208-4749
TA: Bo Chen
E-mail: bojacob.chen@mail.utoronto.ca
TA: Lehang Zhong
E-mail: lehang.zhong@mail.utoronto.ca
Course Description: In this course we discuss statistical models for categorical data. Topics
include contingency tables, generalized linear models, logistic regression, multinomial responses,
logit models for nominal responses, log-linear models for two-way tables, three-way tables and
higher dimensions, models for matched pairs, repeated categorical response data, correlated and
clustered responses, and statistical analyses using R. Students will be expected to interpret
R code and output on tests and the exam.
Prerequisite(s): STAB27H3 or STAB57H3 or MGEB12H3 or PSYC08H3
Credit Hours: 3
Required Text: An Introduction to Categorical Data Analysis, 3rd Edition
Author(s): Alan Agresti
WebLink for 2nd edition: https://search.library.utoron...
Sub-text1: Categorical Data with R, 3rd edition
Author: Alan Agresti
Sub-text2: Analysis of Categorical Data with R (2014)
Author(s): Bilder C. and Loughin T.
Course Objectives:
At the completion of this course, students will be able to:
- use R software to conduct categorical data analysis.
- identify designs of contingency tables and recommend appropriate measures of association
and statistical tests.
- develop models for binary response and polytomous categorical responses, interpret results
and diagnose model fits.
- interpret and communicate categorical data methods to a technical audience.
Grade Components:
Case Study and Presentation 15%
Assignments 15%
Quizzes 15%
Midterm Exam 20%
Final Exam 30%
Attendance 5%
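As a rough illustration of how these weights combine, the R sketch below computes a course
grade as a weighted average. The component scores are hypothetical values chosen only for
the example, and the variable names are not part of the course material; the official
calculation may differ (for instance, in rounding).

```r
# Weights from the grade breakdown above, as percentages of the course grade.
weights <- c(case_study = 15, assignments = 15, quizzes = 15,
             midterm = 20, final_exam = 30, attendance = 5)

# Hypothetical component scores out of 100 (illustrative only).
scores <- c(case_study = 80, assignments = 90, quizzes = 70,
            midterm = 75, final_exam = 85, attendance = 100)

# Sanity check: the listed components account for the whole grade.
stopifnot(sum(weights) == 100)

# Weighted average: each score contributes in proportion to its weight.
course_grade <- sum(weights * scores[names(weights)]) / 100
course_grade
```

With these made-up scores the weighted average is 81.5.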
Course Policy:
• Communication
– Important announcements, lecture notes, additional material, and other course info will
be posted on Quercus. Check it regularly. You are responsible for keeping up with
announcements from instructors on Quercus and via e-mail.
– Check “Piazza” before you send an e-mail, and make sure that you are not asking for
information that is already on “Piazza”. In general, I will not answer questions about
the course material by e-mail. Such questions are more appropriately discussed during
my office hours or those of the TAs.
– E-mail is appropriate for private communication. Use your utoronto.ca account and
include STAC51 in the subject line.
• Oral Assessment
If the instructor has doubts about your assessment result (the deviation is large), she
will conduct a follow-up oral assessment. If the oral assessment confirms the doubts,
the original assessment score will be replaced with 0.
• No makeup quizzes or exams will be given.
Learning Components:
• Tutorial
Students are expected to attend the weekly tutorial to gain practical R programming experience.
Quizzes will be conducted in tutorial. You must turn on your video so that the TAs can
invigilate.
• Assignments
Three assignments (5% each) will be distributed. Assignments are done in groups of two
unless you prefer to work individually.
• Quiz
Three quizzes (5% each) will take place after the assignments are handed in.
• Case Study and Presentation
Students will work on a case study as a group and submit a report. A group may have at
most four members, and you can choose your group members. For the report, students will
write R code, interpret R output, and use R Markdown (an R package). More details, such
as the content and deadline, will be communicated later. No late reports will be accepted.
Each group will present its case study (5 minutes) on the last day of lecture.
• Attendance
Attendance is expected and will be taken at each class and tutorial.
• Computing
Statistical computing is a key part of the class. In-class analysis will be conducted
in R and all course material (code and data) is in R format. R is free and available for
download at http://www.r-project.org, and you can find manuals and installation guidelines
on this site.
For basics in R, here are suggested documents: R for Beginners by Emmanuel Paradis, An
Introduction to R by W. N. Venables, D. M. Smith, and the R Core Team, and A (very) short
introduction to R by Paul Torfs and Claudia Brauer. More information and documentation
are available on The R Project website. Students are expected to write R code and interpret
R output on assignments, tests, and the exam.
Outline of Topics:
Chapter   Content
Ch. 1     • Introduction
          • Distributions for categorical data
          • Statistical inference for categorical data
Ch. 2     • Describing contingency tables, independence of categorical variables
          • Comparing proportions, relative risk, odds ratio
Ch. 2     • Inference for contingency tables, chi-squared tests of independence
          • Exact tests for small samples
Ch. 3     • Introduction to generalized linear models: generalized linear models for binary
            data, Poisson loglinear models, negative binomial GLMs
Ch. 4     • Logistic regression
Ch. 5     • Building, checking, and applying logistic regression models
Ch. 6     • Models for multinomial responses
Ch. 7     • Loglinear models for two-way tables, loglinear models for three-way tables,
            inference for loglinear models
Ch. 8     • Models for matched pairs
University Policies
• Academic Integrity:
Academic integrity is essential to the pursuit of learning and scholarship in a university,
and to ensuring that a degree from the University of Toronto is a strong signal of each student's
individual academic achievement. As a result, the University treats cases of cheating
and plagiarism very seriously. The University of Toronto's Code of Behaviour on Academic
Matters (http://www.governingcouncil.u...) outlines the behaviours
that constitute academic dishonesty and the processes for addressing academic offences.
Potential offences include, but are not limited to:
In papers and assignments:
– Using someone else's ideas or words without appropriate acknowledgment.
– Submitting your own work in more than one course without the permission of the instructor.
– Making up sources or facts.
– Obtaining or providing unauthorized assistance on any assignment.
On tests and exams:
– Using or possessing unauthorized aids.
– Looking at someone else's answers during an exam or test.
– Misrepresenting your identity.
In academic work:
– Falsifying institutional documents or grades.
– Falsifying or altering any documentation required by the University, including (but not
limited to) doctor's notes.
All suspected cases of academic dishonesty will be investigated following procedures outlined
in the Code of Behaviour on Academic Matters. If you have questions or concerns about what
constitutes appropriate academic behaviour or appropriate research and citation methods, you
are expected to seek out additional information on academic integrity from your instructor
or from other institutional resources (see http://www.utoronto.ca/academ...).
• Accessibility:
Students with diverse learning styles and needs are welcome in this course. In particular,
if you have a disability/health consideration that may require accommodations, please feel
free to approach me and/or the AccessAbility Services Office as soon as possible. I will
work with you and AccessAbility Services to ensure you can achieve your learning goals in
this course. Enquiries are confidential. The UTSC AccessAbility Services staff (located in
S302) are available by appointment to assess specific needs, provide referrals and arrange
appropriate accommodations (416) 287-7560 or