【EECS6127 分类器】EECS6127–Winter 2020–2021
Assignment 1
Due: 11:59:59 PM, February 6, 2021.
A full score on this assignment is 25 marks. However there are 28 marks achievable on
this assignment (up to 3 bonus marks). To gain full marks, explain your answers carefully
and prove your claims.
- Bayes classifier and Bayes risk
Recall that we model the data generation as a probability distribution D over X ×
{0, 1}. D can be defined in term of its marginal distribution over X , which we will
denote by DX and the conditional labeling distribution, which is defined by the
regression function
μ(x) = P
(x,y)~D
[y = 1 | x]
Let’s consider a 2-dimensional Euclidean domain, that is X = R
2
, and the
following process of data generation: The marginal distribution over X is uniform
over two square areas [1, 2] × [1, 2] ∪ [3, 4] × [1.5, 2.5]. Points in the first
square Q1 = [1, 2] × [1, 2] are labeled 0 (blue) and points in the second square
Q2 = [3, 4] × [1.5, 2.5] are labeled 1 (red), as in the illustration below.
(a) Describe the density function of DX, and the regression function, Bayes predictor
and Bayes risk of D.
(b) Consider the two distributions D1 and D2
that we obtain by “projecting” onto
each of the axes. Formally, we are marginalizing out one of the features to
obtain D1 and D2
. Both are distributions over R× {0, 1}. Describe the density
functions of D1
X
and D2
X
, and the regression functions, Bayes predictors and
Bayes risks of D1 and D2
.
1
(c) Consider the hypothesis classes Hinit and Hdecst from Question 2 on this assignment.
Determine the approximation errors of Hinit for D1 and D2 and the
approximation error of Hdecst for D.
(d) For the classes Hinit and Hdecst consider their closures under function complements
and again determine the approximation errors on D, D1 and D2
. -
- 3 + 3 + 3 marks
- VC-dimension
We define hypothesis classes Hinit of initial segments over domain X = R and Hdecst
of decisions stumps over domain X = R - as follows:
Hinit = {ha | a ∈ R}, where ha(x) = 1[x ≤ a] ,
and
Hdecst = {h
i
a
| a ∈ R, i ∈ {1, 2}}, where h
i
a
((x1, x2)) = 1[xi ≤ a] .
Further, for a hypothesis class H ∈ {0, 1}
X , we define the class of complements of
H, denoted by Hc
, as the class where we flip all the labels of the functions in H,
that is
H
c = {h
c
| h ∈ H} where h
c
(x) = |h(x) ? 1|.
Finally , we let Hcc = H ∪ Hc denote the closure of H under complements.
(a) Determine the VC-dimensions of Hinit and Hdecst.
(b) Show that for every hypothesis class H, we have VC(H) = VC(Hc
).
(c) Determine the VC-dimensions of Hcc
init and Hcc
decst.
(d) Show that, for every k, there exists a hypothesis class H with VC-dimension
k and VC(Hcc) = 2k + 1.
Hint: consider domain X = N and classes
Hk = {h ∈ {0, 1}
X
| |h
?1
(1)| ≤ k}
That is Hk is the class of functions that map at most k natural numbers to 1
and the remaining ones to 0. -
- 3 + 3 + 3 marks
- Empirical and true risk
Recall Claim 2, from Lecture 2. We showed that, over an uncountable domain,
there exists a learner A (the “stubborn learner”) and a distribution D, such that
for every sample size m and all samples S from Dm:
|LS(A(S)) ? LD(A(S))| = 1
This is not true for countable domains (as we will show later in the course). For
this exercise, we will show a similar, but weaker statement for countable domains.
Without loss of generality, we can assume X = N. Prove that, for every sample size
m, and every > 0, there exists a learner A and a distribution D over X × {0, 1},
such that for all samples S from Dm we have
推荐阅读
- 数据分析|python机器学习之模型选择与优化
- 人工智能+大数据|逻辑回归(使用激活函数sigmoid)详细介绍
- 统计学|灰色关联分析,Python实现GRA(gray relation analysis)
- python|总结|图像分割5大经典方法
- 机器学习|河南郑州二手房房价预测
- java|为什么使用开源软件_为什么要使用开源软件()
- java|为什么要使用开源软件()
- leetcode|LeetCode 48. Rotate Image 时间复杂度(O(n))
- 大数据|关于Vision Transformer的一些思考