AI Notes | Math Foundations: Orthogonal Matrices and the QR Decomposition

Orthogonal matrices

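  • If an $n \times n$ square matrix $A$ satisfies $A^TA = E$, where $E$ is the identity matrix, then $A$ is called an orthogonal matrix (the analogue over the complex field is called a unitary matrix).
    • $A$ is orthogonal if and only if its column (row) vectors are all unit vectors and are pairwise orthogonal.
  • If $A$ is an orthogonal matrix and $x$ is a vector, then the map $x \mapsto Ax$ is called an orthogonal transformation.
    • An orthogonal transformation does not change the length of a vector: with $y = Ax$, we have $y^Ty = (Ax)^TAx = x^TA^TAx = x^TEx = x^Tx$.
  • Properties of orthogonal matrices
    • If $A$ is an orthogonal matrix, then its inverse $A^{-1}$ (which equals $A^T$) is also an orthogonal matrix.
    • If $P$ and $Q$ are orthogonal matrices, then their product $PQ$ is also an orthogonal matrix.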
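
The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `rotation` and `is_orthogonal`, the angles, and the test vector are illustrative choices, not part of the original notes.

```python
import numpy as np

def rotation(theta):
    """2-D rotation matrix, a standard example of an orthogonal matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def is_orthogonal(M, tol=1e-12):
    """True if M^T M equals the identity up to rounding error."""
    return np.allclose(M.T @ M, np.eye(M.shape[1]), atol=tol)

# Two arbitrary rotation angles (illustrative values).
P = rotation(0.3)
Q = rotation(1.1)

x = np.array([3.0, 4.0])   # a vector of length 5
y = P @ x                  # orthogonal transformation y = P x

print(is_orthogonal(P))                                   # P^T P = E
print(np.isclose(np.linalg.norm(y), np.linalg.norm(x)))   # length is preserved
print(is_orthogonal(np.linalg.inv(P)))                    # the inverse is orthogonal
print(is_orthogonal(P @ Q))                               # the product is orthogonal
```

All four checks should print `True`; a non-orthogonal matrix such as `[[1, 1], [0, 1]]` fails the first test.
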
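QR decomposition (orthogonal-triangular decomposition)
  • Every $m \times n$ matrix $A$ with full column rank can be written as $A_{m \times n} = Q_{m \times m} \cdot R_{m \times n}$, where $Q$ is an orthogonal matrix and $R$ is upper triangular.
  • Dropping the last $m - n$ columns of $Q$ and the zero rows of $R$ gives the reduced form $A_{m \times n} = Q_{m \times n} \cdot R_{n \times n}$, in which $Q$ has orthonormal columns and $R$ is a nonsingular upper triangular matrix. If the diagonal entries of $R$ are required to be positive, this reduced factorization is unique.
  • This factorization is called the QR decomposition. It is commonly used to compute the eigenvalues of $A$, to invert $A$, and to solve least-squares problems.
  • In short, the QR decomposition factors a matrix into the product of an orthogonal matrix and an upper triangular matrix.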
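  • Here $Q$ is an orthogonal matrix, so $Q^TQ = E$, and $R$ is an upper triangular matrix.
  • In practice, the QR decomposition is often used to solve linear least-squares problems.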
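
To make the least-squares use concrete, here is a minimal sketch using `numpy.linalg.qr`. The data points are hypothetical (chosen to lie exactly on the line $y = 1 + 2t$), and the variable names are illustrative. With the reduced factorization $A = QR$, minimizing $\|Ac - y\|$ reduces to solving the triangular system $Rc = Q^Ty$.

```python
import numpy as np

# Hypothetical data: fit y ≈ c0 + c1 * t to four points on the line y = 1 + 2 t.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
A = np.column_stack([np.ones_like(t), t])   # 4 x 2, full column rank

# Reduced QR: Q is 4 x 2 with orthonormal columns, R is 2 x 2 upper triangular.
Q, R = np.linalg.qr(A, mode="reduced")

# Least squares: min ||A c - y||  is equivalent to solving  R c = Q^T y.
c = np.linalg.solve(R, Q.T @ y)
print(c)

# Full QR for comparison: Q_full is 4 x 4 orthogonal, R_full is 4 x 2.
Q_full, R_full = np.linalg.qr(A, mode="complete")
print(Q.shape, R.shape, Q_full.shape, R_full.shape)
```

Note that NumPy does not promise positive diagonal entries in `R`, so its factors may differ from a hand computation by the signs of matching columns of `Q` and rows of `R`.
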
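Gram-Schmidt orthogonalization
  • The process turns a linearly independent set of vectors into an orthonormal set, from which an orthogonal matrix can then be built.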
  • Let $\alpha_1, \alpha_2, \ldots, \alpha_r$ be linearly independent, and let $[u, v]$ denote the inner product. Set $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2 - \frac{[\beta_1, \alpha_2]}{[\beta_1, \beta_1]} \beta_1$, $\beta_3 = \alpha_3 - \frac{[\beta_1, \alpha_3]}{[\beta_1, \beta_1]} \beta_1 - \frac{[\beta_2, \alpha_3]}{[\beta_2, \beta_2]} \beta_2$, $\cdots$
  • $\beta_r = \alpha_r - \frac{[\beta_1, \alpha_r]}{[\beta_1, \beta_1]} \beta_1 - \frac{[\beta_2, \alpha_r]}{[\beta_2, \beta_2]} \beta_2 - \cdots - \frac{[\beta_{r-1}, \alpha_r]}{[\beta_{r-1}, \beta_{r-1}]} \beta_{r-1}$
  • Then $\beta_1, \beta_2, \cdots, \beta_r$ are pairwise orthogonal and equivalent to $\alpha_1, \alpha_2, \cdots, \alpha_r$.
  • Normalizing each vector, $\eta_1 = \frac{\beta_1}{||\beta_1||}, \eta_2 = \frac{\beta_2}{||\beta_2||}, \cdots, \eta_r = \frac{\beta_r}{||\beta_r||}$ is an orthonormal set equivalent to $\alpha_1, \alpha_2, \ldots, \alpha_r$.
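
The recurrence above translates almost line by line into code. The following is a minimal sketch of classical Gram-Schmidt, assuming NumPy and linearly independent input columns; the function name `gram_schmidt` and the sample vectors are illustrative, not from the original notes.

```python
import numpy as np

def gram_schmidt(alphas):
    """Classical Gram-Schmidt on the columns of `alphas`.

    Returns (betas, etas): the pairwise orthogonal columns beta_k and
    their normalized versions eta_k. Assumes the columns of `alphas`
    are linearly independent (otherwise some beta_k would be zero).
    """
    alphas = np.asarray(alphas, dtype=float)
    betas = np.zeros_like(alphas)
    for k in range(alphas.shape[1]):
        beta = alphas[:, k].copy()
        # Subtract the projection of alpha_k onto each earlier beta_j.
        for j in range(k):
            b = betas[:, j]
            beta -= (b @ alphas[:, k]) / (b @ b) * b
        betas[:, k] = beta
    # Normalize every column: eta_k = beta_k / ||beta_k||.
    etas = betas / np.linalg.norm(betas, axis=0)
    return betas, etas

# Hypothetical input: three linearly independent vectors stored as columns.
alphas = np.array([[1.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
betas, etas = gram_schmidt(alphas)
print(np.round(etas.T @ etas, 12))   # identity matrix up to rounding
```

Classical Gram-Schmidt is used here because it mirrors the formulas directly; in floating point it can lose orthogonality for ill-conditioned inputs, which is one reason library routines typically rely on other methods.
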
Example 1
  • Find the QR (orthogonal-triangular) decomposition of the matrix $A = \left( \begin{array}{ccc} 1 & 1 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right)$.
  • Analysis
    • It is easy to see that $A \in C_3^{4 \times 3}$ (a $4 \times 3$ matrix of rank 3), i.e. $A$ has full column rank.
    • Apply Gram-Schmidt orthogonalization to the three column vectors of $A = [\alpha_1, \alpha_2, \alpha_3]$ to obtain an orthonormal set. First orthogonalize:
    • $\beta_1 = \alpha_1 = [1 \ \ 1 \ \ 0 \ \ 0]^T$
    • $\beta_2 = \alpha_2 - \frac{(\alpha_2, \beta_1)}{(\beta_1, \beta_1)} \beta_1 = \alpha_2 - \frac{1}{2} \beta_1 = [\frac{1}{2} \ \ -\frac{1}{2} \ \ 1 \ \ 0]^T$
    • $\beta_3 = \alpha_3 - \frac{(\alpha_3, \beta_1)}{(\beta_1, \beta_1)} \beta_1 - \frac{(\alpha_3, \beta_2)}{(\beta_2, \beta_2)} \beta_2 = \alpha_3 + \frac{1}{2} \beta_1 + \frac{1}{3} \beta_2 = [-\frac{1}{3} \ \ \frac{1}{3} \ \ \frac{1}{3} \ \ 1]^T$
    • Then normalize these vectors to obtain an orthonormal set
      • $\eta_1 = \frac{1}{||\beta_1||} \beta_1 = [\frac{\sqrt{2}}{2} \ \ \frac{\sqrt{2}}{2} \ \ 0 \ \ 0]^T$
      • $\eta_2 = \frac{1}{||\beta_2||} \beta_2 = [\frac{\sqrt{6}}{6} \ \ -\frac{\sqrt{6}}{6} \ \ \frac{\sqrt{6}}{3} \ \ 0]^T$
      • $\eta_3 = \frac{1}{||\beta_3||} \beta_3 = [-\frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{6} \ \ \frac{\sqrt{3}}{2}]^T$
    • $\Rightarrow Q = (\eta_1, \eta_2, \eta_3) = \left[ \begin{array}{ccc} \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6} & -\frac{\sqrt{3}}{6} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{3}}{6} \\ 0 & \frac{\sqrt{6}}{3} & \frac{\sqrt{3}}{6} \\ 0 & 0 & \frac{\sqrt{3}}{2} \end{array} \right]$
    • Solving the orthogonalization relations $\beta_1 = \alpha_1$, $\beta_2 = \alpha_2 - \frac{1}{2} \beta_1$, $\beta_3 = \alpha_3 + \frac{1}{2} \beta_1 + \frac{1}{3} \beta_2$ for the $\alpha_i$ $\Rightarrow$
      • $\alpha_1 = \beta_1$
      • $\alpha_2 = \frac{1}{2} \beta_1 + \beta_2$
      • $\alpha_3 = -\frac{1}{2} \beta_1 - \frac{1}{3} \beta_2 + \beta_3$
    • Next, express each $\alpha_i$ in terms of the orthonormal vectors $\eta_i$
      • Use $\beta_1 = ||\beta_1|| \eta_1$, $\beta_2 = ||\beta_2|| \eta_2$, $\beta_3 = ||\beta_3|| \eta_3$, with $||\beta_1|| = \sqrt{2}$, $||\beta_2|| = \frac{\sqrt{6}}{2}$, $||\beta_3|| = \frac{2\sqrt{3}}{3}$, together with $\alpha_1 = \beta_1$, $\alpha_2 = \frac{1}{2} \beta_1 + \beta_2$, $\alpha_3 = -\frac{1}{2} \beta_1 - \frac{1}{3} \beta_2 + \beta_3$
      • $\Rightarrow \alpha_1 = \sqrt{2} \eta_1$, $\alpha_2 = \frac{\sqrt{6}}{2} \eta_2 + \frac{\sqrt{2}}{2} \eta_1$, $\alpha_3 = \frac{2\sqrt{3}}{3} \eta_3 - \frac{\sqrt{6}}{6} \eta_2 - \frac{\sqrt{2}}{2} \eta_1$ $\Rightarrow R = \left[ \begin{array}{ccc} \sqrt{2} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & \frac{\sqrt{6}}{2} & -\frac{\sqrt{6}}{6} \\ 0 & 0 & \frac{2\sqrt{3}}{3} \end{array} \right]$
      • This gives the QR decomposition of $A$:
      • $A = (\alpha_1 \ \ \alpha_2 \ \ \alpha_3) = QR = \left[ \begin{array}{ccc} \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6} & -\frac{\sqrt{3}}{6} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{3}}{6} \\ 0 & \frac{\sqrt{6}}{3} & \frac{\sqrt{3}}{6} \\ 0 & 0 & \frac{\sqrt{3}}{2} \end{array} \right] \left[ \begin{array}{ccc} \sqrt{2} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ 0 & \frac{\sqrt{6}}{2} & -\frac{\sqrt{6}}{6} \\ 0 & 0 & \frac{2\sqrt{3}}{3} \end{array} \right]$
      • In short: $A_{4 \times 3} = QR = Q_{4 \times 3} R_{3 \times 3}$
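
As a sanity check on Example 1, the sketch below (assuming NumPy) enters $Q = (\eta_1, \eta_2, \eta_3)$ by hand, recovers $R$ as $Q^TA$ (valid because the columns of $Q$ are orthonormal and $A = QR$), and compares the result with `numpy.linalg.qr`. The sign normalization is an added step, needed only because NumPy does not guarantee a positive diagonal in `R`.

```python
import numpy as np

A = np.array([[1.0, 1.0, -1.0],
              [1.0, 0.0,  0.0],
              [0.0, 1.0,  0.0],
              [0.0, 0.0,  1.0]])

s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)

# Q = (eta_1, eta_2, eta_3) from the worked example.
Q = np.array([[s2 / 2,  s6 / 6, -s3 / 6],
              [s2 / 2, -s6 / 6,  s3 / 6],
              [0.0,     s6 / 3,  s3 / 6],
              [0.0,     0.0,     s3 / 2]])

# Since Q^T Q = E and A = Q R, the triangular factor is R = Q^T A.
R = Q.T @ A

print(np.allclose(Q.T @ Q, np.eye(3)))   # columns of Q are orthonormal
print(np.allclose(np.triu(R), R))        # R is upper triangular
print(np.allclose(Q @ R, A))             # Q R reproduces A
print(np.round(R, 4))

# Compare with NumPy after forcing a positive diagonal on its R.
Q_np, R_np = np.linalg.qr(A)
signs = np.sign(np.diag(R_np))
Q_np, R_np = Q_np * signs, signs[:, None] * R_np
print(np.allclose(Q_np, Q), np.allclose(R_np, R))
```

The agreement with NumPy after sign normalization reflects the uniqueness of the reduced factorization when the diagonal of $R$ is positive.
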
