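The Trace of a Matrix and Derivatives of the Trace

Definition of the trace: the trace of an $n \times n$ matrix $A$ is the sum of the entries on its main diagonal, written $\operatorname{tr}(A)$. That is,
$\operatorname{tr}(A)=\sum\limits_{i=1}^{n} a_{i i}$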
Theorem 1:
$\operatorname{tr}(A B)=\operatorname{tr}(B A) $
where $A$ is an $n \times m$ matrix and $B$ is an $m \times n$ matrix.
Proof:
$\operatorname{tr}(A B)=\sum\limits_{i=1}^{n}(A B)_{i i}=\sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} A_{i j} B_{j i}=\sum\limits_{j=1}^{m} \sum\limits_{i=1}^{n} B_{j i} A_{i j}=\sum\limits_{j=1}^{m}(B A)_{j j}=\operatorname{tr}(B A) $
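
Theorem 1 can be checked numerically with a small sketch. The helper names `matmul` and `trace` and the sample matrices below are illustrative assumptions, not part of the original text; matrices are plain lists of rows. Note that $AB$ is $2 \times 2$ while $BA$ is $3 \times 3$, yet the traces agree.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def trace(X):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))


# Sample data chosen for illustration: A is 2 x 3, B is 3 x 2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]

print(trace(matmul(A, B)), trace(matmul(B, A)))  # 212 212
```
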
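Theorem 2:
$\operatorname{tr}(A B C)=\operatorname{tr}(C A B)=\operatorname{tr}(B C A) $
Proof:
Treat $AB$ or $BC$ as a single matrix and apply Theorem 1: $\operatorname{tr}((A B) C)=\operatorname{tr}(C(A B))$ and $\operatorname{tr}(A(B C))=\operatorname{tr}((B C) A)$.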
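
Theorem 2 covers cyclic shifts only; swapping two factors is not a cyclic shift and can change the trace. The sketch below illustrates this with hand-picked $2 \times 2$ matrices (the helpers and the data are assumptions made for this example).

```python
from functools import reduce


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def trace_of_product(*matrices):
    """Trace of the product of the given matrices, multiplied left to right."""
    P = reduce(matmul, matrices)
    return sum(P[i][i] for i in range(len(P)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
C = [[0, 0], [1, 0]]

# The three cyclic shifts share one trace.
cyclic = [trace_of_product(A, B, C),
          trace_of_product(B, C, A),
          trace_of_product(C, A, B)]
print(cyclic)                     # [1, 1, 1]

# Swapping B and C is not a cyclic shift, and here the trace differs.
print(trace_of_product(A, C, B))  # 4
```
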
Theorem 3:
$\frac{\partial \operatorname{tr}(A B)}{\partial A}=\frac{\partial \operatorname{tr}(B A)}{\partial A}=B^{T} $
where $A$ is an $m \times n$ matrix and $B$ is an $n \times m$ matrix.
Proof:
$\operatorname{tr}(A B)=\operatorname{tr}\left(\begin{array}{cccc}a_{11} & a_{12} & \cdots & a_{1 n} \\a_{21} & a_{22} & \cdots & a_{2 n} \\\vdots & \vdots & \ddots & \vdots \\a_{m 1} & a_{m 2} & \cdots & a_{m n}\end{array}\right)\left(\begin{array}{cccc}b_{11} & b_{12} & \cdots & b_{1 m} \\b_{21} & b_{22} & \cdots & b_{2 m} \\\vdots & \vdots & \ddots & \vdots \\b_{n 1} & b_{n 2} & \cdots & b_{n m}\end{array}\right)$
Keeping only the diagonal entries of the product, we get
$\operatorname{tr}(A B)=\sum\limits_{i=1}^{n} a_{1 i} b_{i 1}+\sum\limits_{i=1}^{n} a_{2 i} b_{i 2}+\ldots+\sum\limits_{i=1}^{n} a_{m i} b_{i m}=\sum\limits_{i=1}^{m} \sum\limits_{j=1}^{n} a_{i j} b_{j i}$
$\frac{\partial \operatorname{tr}(A B)}{\partial a_{i j}}=b_{j i} \Rightarrow \frac{\partial \operatorname{tr}(A B)}{\partial A}=B^{T}$
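
A numerical sanity check of Theorem 3 is sketched below. Because $\operatorname{tr}(AB)$ is linear in $A$, increasing $a_{ij}$ by 1 changes the trace by exactly $b_{ji}$, so integer arithmetic suffices. The function `trace_gradient` and the sample matrices are assumptions introduced for this illustration.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def trace(X):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))


def transpose(X):
    return [list(col) for col in zip(*X)]


def trace_gradient(A, B):
    """Entry (i, j) is the change in tr(AB) when a_ij is increased by 1.

    tr(AB) is linear in A, so this difference equals the partial derivative.
    """
    base = trace(matmul(A, B))
    grad = []
    for i in range(len(A)):
        row = []
        for j in range(len(A[0])):
            bumped = [r[:] for r in A]  # copy A so the input is not modified
            bumped[i][j] += 1
            row.append(trace(matmul(bumped, B)) - base)
        grad.append(row)
    return grad


A = [[1, 2, 3],
     [4, 5, 6]]      # m x n with m = 2, n = 3
B = [[7, 8],
     [9, 10],
     [11, 12]]       # n x m

print(trace_gradient(A, B))  # [[7, 9, 11], [8, 10, 12]]
print(transpose(B))          # [[7, 9, 11], [8, 10, 12]]
```
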
Theorem 4:
$\frac{\partial \operatorname{tr}\left(A^{T} B\right)}{\partial A}=\frac{\partial \operatorname{tr}\left(B A^{T}\right)}{\partial A}=B$
Proof:
The proof follows the same steps as Theorem 3 and is omitted here.
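For readers who want the omitted steps, one way to fill them in is sketched here, assuming $A$ and $B$ are both $m \times n$ matrices (a dimension the statement leaves implicit). Expanding the diagonal of $A^{T} B$ entry by entry gives
$\operatorname{tr}\left(A^{T} B\right)=\sum\limits_{j=1}^{n}\left(A^{T} B\right)_{j j}=\sum\limits_{j=1}^{n} \sum\limits_{i=1}^{m} a_{i j} b_{i j}$
$\frac{\partial \operatorname{tr}\left(A^{T} B\right)}{\partial a_{i j}}=b_{i j} \Rightarrow \frac{\partial \operatorname{tr}\left(A^{T} B\right)}{\partial A}=B$
Since $\operatorname{tr}\left(B A^{T}\right)=\operatorname{tr}\left(A^{T} B\right)$ by Theorem 1, the second form has the same derivative.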
Theorem 5:
$\operatorname{tr}(A)=\operatorname{tr}\left(A^{T}\right) $
Proof:
Omitted; transposition leaves the main-diagonal entries unchanged.
Theorem 6:
If $a \in \mathbb{R}$, then $\operatorname{tr}(a)=a $
Proof:
Treat $a$ as a $1 \times 1$ matrix.
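Theorem 7:
$\frac{\partial \operatorname{tr}\left(A B A^{T} C\right)}{\partial A}=C A B+C^{T} A B^{T} $
Proof: apply the product rule, differentiating with respect to one occurrence of $A$ at a time while holding the other fixed (the overlined factors are treated as constants):
$\begin{aligned}\frac{\partial \operatorname{tr}\left(A B A^{T} C\right)}{\partial A} &=\frac{\partial \operatorname{tr}\left(A\, \overline{B A^{T} C}\right)}{\partial A}+\frac{\partial \operatorname{tr}\left(A^{T}\, \overline{C A B}\right)}{\partial A}\quad\quad(\text{product rule, Theorem 2}) \\&=\left(B A^{T} C\right)^{T}+C A B\quad\quad(\text{Theorem 3, Theorem 4}) \\&=C A B+C^{T} A B^{T}\end{aligned}$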
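
The result of Theorem 7 can be checked numerically; the following is a minimal sketch with made-up integer matrices and hypothetical helper names. Since $\operatorname{tr}(A B A^{T} C)$ is quadratic in each entry of $A$, a central difference with step 1 recovers the partial derivative exactly, so no floating-point tolerance is needed.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def f(A, B, C):
    """The scalar function tr(A B A^T C)."""
    return trace(matmul(matmul(matmul(A, B), transpose(A)), C))


def numeric_gradient(A, B, C):
    """Central differences with step 1, exact because f is quadratic in a_ij."""
    grad = []
    for i in range(len(A)):
        row = []
        for j in range(len(A[0])):
            up = [r[:] for r in A]
            down = [r[:] for r in A]
            up[i][j] += 1
            down[i][j] -= 1
            row.append((f(up, B, C) - f(down, B, C)) // 2)
        grad.append(row)
    return grad


def formula_gradient(A, B, C):
    """The closed form C A B + C^T A B^T stated in Theorem 7."""
    left = matmul(matmul(C, A), B)
    right = matmul(matmul(transpose(C), A), transpose(B))
    return [[l + r for l, r in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]


# Illustrative data: A is 2 x 3, B is 3 x 3, C is 2 x 2 (B and C not symmetric).
A = [[1, 2, 0], [-1, 3, 2]]
B = [[2, 0, 1], [1, -1, 3], [0, 4, 1]]
C = [[1, 2], [3, -1]]

print(numeric_gradient(A, B, C) == formula_gradient(A, B, C))  # True
```
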
Examples:
$\begin{array}{l}\operatorname{tr}(A)=\sum_{i=1}^{n} a_{i i} \\\operatorname{tr}(A B C)=\operatorname{tr}(B C A)=\operatorname{tr}(C A B) \\ \operatorname{tr}(A B)=\operatorname{tr}(B A) \\\frac{\partial \operatorname{tr}(A B)}{\partial A}=\frac{\partial \operatorname{tr}(B A)}{\partial A}=B^{T}\\\operatorname{tr}(A)=\operatorname{tr}\left(A^{T}\right) \\ \frac{\partial \operatorname{tr}\left(A^{T} B A\right)}{\partial A}=B A+B^{T} A \\ \frac{\partial \operatorname{tr}\left(A X B X C^{T}\right)}{\partial X}=A^{T} C X^{T} B^{T}+B^{T} X^{T} A^{T} C \\ \frac{\partial\operatorname{tr}\left(A B A^{T}\right)}{\partial A}=A B+A B^{T} \\\frac{\partial \operatorname{tr}(A X B X)}{\partial X}=A^{T} X^{T} B^{T}+B^{T} X^{T} A^{T} \\ \frac{\partial \operatorname{tr}\left(A X B X^{T}\right)}{\partial X}=A X B+A^{T} X B^{T}\\\frac{\partial \operatorname{tr}\left(A^{T} B\right)}{\partial A}=\frac{\partial \operatorname{tr}\left(B A^{T}\right)}{\partial A}=B \\\frac{\partial\operatorname{tr}\left(A^{T} X B^{T}\right)}{\partial X}=\frac{\partial \operatorname{tr}\left(B X^{T} A\right)}{\partial X}=A B\end{array}$
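
Identities like these are easy to get wrong by a transpose, so a quick numerical check can help. The sketch below tests two of the identities listed above, the derivatives of $\operatorname{tr}(A X B X)$ and $\operatorname{tr}(A X B X^{T})$ with respect to $X$, using a generic central-difference helper. All names and the $2 \times 2$ integer data are assumptions for illustration; the step-1 difference is exact only because both functions are quadratic in $X$.

```python
from functools import reduce


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def chain(*matrices):
    """Product of several matrices, multiplied left to right."""
    return reduce(matmul, matrices)


def T(X):
    """Transpose."""
    return [list(col) for col in zip(*X)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def add(X, Y):
    return [[x + y for x, y in zip(xr, yr)] for xr, yr in zip(X, Y)]


def numeric_gradient(f, X):
    """Central differences with step 1; exact when f is quadratic in X."""
    grad = []
    for i in range(len(X)):
        row = []
        for j in range(len(X[0])):
            up = [r[:] for r in X]
            down = [r[:] for r in X]
            up[i][j] += 1
            down[i][j] -= 1
            row.append((f(up) - f(down)) // 2)
        grad.append(row)
    return grad


A = [[1, 2], [0, -1]]
B = [[3, 1], [2, 5]]
X = [[2, -1], [4, 3]]

# d tr(A X B X) / dX = A^T X^T B^T + B^T X^T A^T
grad_axbx = numeric_gradient(lambda M: trace(chain(A, M, B, M)), X)
expected_axbx = add(chain(T(A), T(X), T(B)), chain(T(B), T(X), T(A)))

# d tr(A X B X^T) / dX = A X B + A^T X B^T
grad_axbxt = numeric_gradient(lambda M: trace(chain(A, M, B, T(M))), X)
expected_axbxt = add(chain(A, X, B), chain(T(A), X, T(B)))

print(grad_axbx == expected_axbx, grad_axbxt == expected_axbxt)  # True True
```
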
