Source: https://math.stackexchange.com/questions/265917/intuitive-explanation-of-a-definition-of-the-fisher-information
Fisher information

The likelihood function is $L(\bm{X};\theta)=\prod_{i=1}^{n}f(X_i;\theta)$, where $\bm{X}=(X_1,X_2,\cdots,X_n)$ is a set of independent and identically distributed random variables and $\theta$ is the parameter to be estimated.
Following maximum likelihood estimation (MLE), differentiating the log-likelihood with respect to $\theta$ gives the score function

$$S(\bm{X};\theta)=\sum_{i=1}^{n}\frac{\partial\log f(X_i;\theta)}{\partial \theta} \tag{1}$$
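As a concrete illustration (an assumed example, not taken from the original post), let $X_i\sim\mathcal{N}(\theta,\sigma^2)$ with $\sigma^2$ known. Then (1) can be worked out in closed form:

$$\log f(X_i;\theta)=-\frac{1}{2}\log(2\pi\sigma^2)-\frac{(X_i-\theta)^2}{2\sigma^2},\qquad S(\bm{X};\theta)=\sum_{i=1}^{n}\frac{X_i-\theta}{\sigma^2}$$

Setting $S(\bm{X};\theta)=0$ in this model recovers the familiar MLE $\hat{\theta}=\frac{1}{n}\sum_{i=1}^{n}X_i$.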
The expectation of the score is zero. Here $f(\bm{X};\theta)=\prod_{i=1}^{n}f(X_i;\theta)$ denotes the joint density of the sample. Since a density integrates to one,

$$\int\cdots\int f(\bm{X};\theta)\,d\bm{X}=1 \tag{2}$$

it follows that

$$\frac{\partial}{\partial\theta}\int\cdots\int f(\bm{X};\theta)\,d\bm{X}=0 \tag{3}$$

Assuming differentiation and integration can be interchanged, the left-hand side of (3) is

$$\begin{aligned}
\frac{\partial}{\partial\theta}\int\cdots\int f(\bm{X};\theta)\,d\bm{X}
= & \int\cdots\int \frac{\partial f(\bm{X};\theta)}{\partial\theta}\,d\bm{X}\\
= & \int\cdots\int \frac{\frac{\partial f(\bm{X};\theta)}{\partial\theta}}{f(\bm{X};\theta)}\,f(\bm{X};\theta)\,d\bm{X}\\
= & \int\cdots\int \frac{\partial\log f(\bm{X};\theta)}{\partial\theta}\,f(\bm{X};\theta)\,d\bm{X}\\
= & \mathbb{E}\left[S(\bm{X};\theta)\right]
\end{aligned} \tag{4}$$

which proves the claim.
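The zero-mean property can also be checked numerically. The sketch below is only an illustration under assumed choices that do not come from the original post: a normal model $X_i\sim\mathcal{N}(\theta,\sigma^2)$ with known $\sigma$, for which the score is $\sum_i (X_i-\theta)/\sigma^2$, and hypothetical helper names `score_normal_mean` and `mean_score`. It averages the score over many simulated samples drawn at the true $\theta$.

```python
import random


def score_normal_mean(xs, theta, sigma):
    """Score of an i.i.d. N(theta, sigma^2) sample with respect to theta."""
    return sum((x - theta) / sigma**2 for x in xs)


def mean_score(theta=1.5, sigma=2.0, n=5, trials=20000, seed=0):
    """Monte Carlo estimate of E[S(X; theta)] evaluated at the true theta."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(trials):
        # Draw one sample of size n from the assumed model.
        xs = [rng.gauss(theta, sigma) for _ in range(n)]
        total += score_normal_mean(xs, theta, sigma)
    return total / trials


if __name__ == "__main__":
    # Should be close to 0, up to Monte Carlo noise.
    print(mean_score())
```

The average is only approximately zero because it is a finite-sample estimate; evaluating the score at a value other than the true $\theta$ would shift it away from zero.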
Fisher information is defined as the variance of the score, that is, of the derivative of the log-likelihood:

$$\mathbb{V}[S(\bm{X};\theta)]=\mathbb{V}\left[\frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\right] \tag{5}$$
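For instance, under an assumed model $X_i\sim\mathcal{N}(\theta,\sigma^2)$ with $\sigma^2$ known (an illustration, not part of the original post), the score is $\sum_{i=1}^{n}(X_i-\theta)/\sigma^2$, and independence of the $X_i$ gives

$$\mathbb{V}[S(\bm{X};\theta)]=\sum_{i=1}^{n}\frac{\mathbb{V}[X_i]}{\sigma^4}=\frac{n\sigma^2}{\sigma^4}=\frac{n}{\sigma^2}$$

so in this model the information grows linearly with the sample size and shrinks as the noise variance increases.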
Since the score $S(\bm{X};\theta)=\frac{\partial \log L(\bm{X};\theta)}{\partial\theta}$ has zero expectation, and assuming the log-likelihood $\log L(\bm{X};\theta)$ is twice differentiable in $\theta$, differentiate both sides of $\mathbb{E}[S(\bm{X};\theta)]=0$ from (4) once more with respect to $\theta$:

$$\frac{\partial}{\partial\theta}\int\cdots\int \frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\,f(\bm{X};\theta)\,d\bm{X}=0 \tag{6}$$

Expanding the left-hand side of (6) with the product rule:

$$\int\cdots\int \frac{\partial^2 \log L(\bm{X};\theta)}{\partial\theta^2}\,f(\bm{X};\theta)\,d\bm{X}+\underbrace{\int\cdots\int \frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\,\frac{\partial f(\bm{X};\theta)}{\partial\theta}\,d\bm{X}}_{(8)}=0 \tag{7}$$

The term (8) can be rewritten as

$$\begin{aligned}
(8)= & \int\cdots\int \frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\,\frac{\frac{\partial f(\bm{X};\theta)}{\partial\theta}}{f(\bm{X};\theta)}\,f(\bm{X};\theta)\,d\bm{X}\\
= & \int\cdots\int \left(\frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\right)^2 f(\bm{X};\theta)\,d\bm{X}\\
= & \mathbb{V}\left[\frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\right]
\end{aligned} \tag{9}$$

The second equality uses $\frac{\partial f/\partial\theta}{f}=\frac{\partial\log f(\bm{X};\theta)}{\partial\theta}=\frac{\partial\log L(\bm{X};\theta)}{\partial\theta}$, because the joint density $f(\bm{X};\theta)$ is the likelihood $L(\bm{X};\theta)$. The last equality holds because the score has mean zero, so its second moment equals its variance.

Combining (7) and (9) gives

$$\begin{aligned}
\mathbb{V}[S(\bm{X};\theta)]
= & \mathbb{V}\left[\frac{\partial \log L(\bm{X};\theta)}{\partial\theta}\right]\\
= & -\int\cdots\int \frac{\partial^2 \log L(\bm{X};\theta)}{\partial\theta^2}\,f(\bm{X};\theta)\,d\bm{X}\\
= & -\mathbb{E}\left[\frac{\partial^2 \log L(\bm{X};\theta)}{\partial\theta^2}\right]
\end{aligned}$$
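The identity between the variance of the score and the negative expected second derivative can be verified exactly on a small discrete model. The sketch below assumes an i.i.d. Bernoulli($p$) sample, which is an illustrative choice and not from the original post, and the function name `bernoulli_fisher_check` is hypothetical. The multiple integral becomes a finite sum over all $2^n$ outcomes, so no simulation is needed.

```python
from itertools import product


def bernoulli_fisher_check(p=0.3, n=3):
    """Exactly compute E[S], V[S] and -E[d^2 log L / dp^2] for n i.i.d. Bernoulli(p)."""
    mean_s = 0.0     # accumulates E[S]
    mean_s2 = 0.0    # accumulates E[S^2]
    mean_hess = 0.0  # accumulates E[d^2 log L / dp^2]
    for xs in product((0, 1), repeat=n):
        k = sum(xs)                                # successes in this outcome
        prob = p**k * (1 - p)**(n - k)             # joint probability f(X; p)
        score = k / p - (n - k) / (1 - p)          # d log L / dp
        hess = -k / p**2 - (n - k) / (1 - p)**2    # d^2 log L / dp^2
        mean_s += prob * score
        mean_s2 += prob * score**2
        mean_hess += prob * hess
    var_s = mean_s2 - mean_s**2
    return mean_s, var_s, -mean_hess


if __name__ == "__main__":
    mean_s, var_s, neg_expected_hess = bernoulli_fisher_check()
    # mean_s is zero up to rounding; the other two values agree with each other.
    print(mean_s, var_s, neg_expected_hess)
```

For this model both quantities equal $n/(p(1-p))$ in closed form, which gives a third value to compare against.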