diff --git a/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.pdf b/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.pdf index 9218ed4..92561ef 100644 Binary files a/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.pdf and b/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.pdf differ diff --git a/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.tex b/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.tex index d58bd9d..e6b276d 100644 --- a/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.tex +++ b/probability-theory-and-mathematical-statistics/exercise/2-random-variables-and-distribution/random-variables-and-distribution.tex @@ -141,7 +141,7 @@ $(X,Y)$联合概率=条件概率×边缘概率。 \draw[black, densely dashed](0.6,0) -- (0.6,1) node[above]{$y=y_0$}; \end{tikzpicture} - 所以$P\{X+Y>1\}=\iint\limits_D\dfrac{1}{x}\textrm{d}\delta$,$D=x+y>1\cap01\}=\iint\limits_D\dfrac{1}{x}\textrm{d}\sigma$,$D=x+y>1\cap0\dfrac{3}{2}\right\}$出现时,否定假设$H_0$,接受$H_1$,求犯第一类错误概率和第二类错误概率$\alpha\beta$。 + +解:$\alpha=P\left\{U>\dfrac{3}{2}\bigg|H_0\right\}=\displaystyle{\int_\frac{3}{2}^2\dfrac{1}{2}\,\textrm{d}x=\dfrac{1}{4}}$。 + +$\beta=P\left\{U\leqslant\dfrac{3}{2}\bigg|H_1\right\}=\displaystyle{\int_0^{\frac{3}{2}}\dfrac{x}{2}\,\textrm{d}x=\dfrac{9}{16}}$。 + \end{document} diff --git a/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.pdf 
b/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.pdf index 77ea3e2..46f21b1 100644 Binary files a/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.pdf and b/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.pdf differ diff --git a/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.tex b/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.tex index 38c0b41..e0626c4 100644 --- a/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.tex +++ b/probability-theory-and-mathematical-statistics/knowledge/2-random-variables-and-distribution/random-variables-and-distribution.tex @@ -281,32 +281,32 @@ $=1-F(t)=1-P\{X\leqslant t\}=P\{X>t\}$。 \subsubsection{正态分布} -\textcolor{violet}{\textbf{定义:}}如果$X$的概率密度为$f(x)=\dfrac{1}{\sqrt{2\pi}\delta}e^{-\frac{1}{2}(\frac{x-\mu}{\delta})^2}$($-\infty<x<+\infty$,$\delta>0$),则称$X$服从参数为$(\mu,\delta^2)$的\textbf{正态分布},称$X$为\textbf{正态变量},记为$X\sim N(\mu,\delta^2)$。 +\textcolor{violet}{\textbf{定义:}}如果$X$的概率密度为$f(x)=\dfrac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}$($-\infty<x<+\infty$,$\sigma>0$),则称$X$服从参数为$(\mu,\sigma^2)$的\textbf{正态分布},称$X$为\textbf{正态变量},记为$X\sim N(\mu,\sigma^2)$。 -$f(x)$的图形关于$x=\mu$对称,即$f(\mu-x)=f(\mu+x)$,并在$x=\mu$处有唯一最大值$f(\mu)=\dfrac{1}{\sqrt{2\pi}\delta}$。$\mu-\delta$和$\mu+\delta$为拐点。 +$f(x)$的图形关于$x=\mu$对称,即$f(\mu-x)=f(\mu+x)$,并在$x=\mu$处有唯一最大值$f(\mu)=\dfrac{1}{\sqrt{2\pi}\sigma}$。$x=\mu-\sigma$和$x=\mu+\sigma$为拐点。 \begin{tikzpicture}[scale=2] \draw[-latex](-2,0) -- (2,0) node[below]{$x$}; \draw[-latex](0,-0.25) -- (0,1.25) node[above]{$y$}; \filldraw[black] (0,0) node[below]{$O$}; \draw[black, thick, 
domain=-2:2] plot (\x,{pow(e,-\x*\x/2)}); - \draw[black, densely dashed](-1,0.65) -- (-1,0) node[below]{$\mu-\delta$}; - \draw[black, densely dashed](1,0.65) -- (1,0) node[below]{$\mu+\delta$}; - \draw[black, densely dashed](0,1) -- (-1,1) node[left]{$\dfrac{1}{\sqrt{2\pi}\delta}$}; + \draw[black, densely dashed](-1,0.65) -- (-1,0) node[below]{$\mu-\sigma$}; + \draw[black, densely dashed](1,0.65) -- (1,0) node[below]{$\mu+\sigma$}; + \draw[black, densely dashed](0,1) -- (-1,1) node[left]{$\dfrac{1}{\sqrt{2\pi}\sigma}$}; \filldraw[black] (0.35,0.25) node{$x=\mu$}; \filldraw[black] (1,1) node{$f(x)$}; \end{tikzpicture} -当$\mu=0$,$\delta=1$时的正态分布$N(0,1)=\dfrac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$为\textbf{标准正态分布},记为$\phi(x)$,分布函数为$\varPhi(x)=\displaystyle{\int_{-\infty}^x\dfrac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}\,\textrm{d}t}$。$\phi(x)$为偶函数,$\varPhi(0)=\dfrac{1}{2}$,$\varPhi(-x)=1-\varPhi(x)$。 +当$\mu=0$,$\sigma=1$时的正态分布$N(0,1)=\dfrac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$为\textbf{标准正态分布},记为$\phi(x)$,分布函数为$\varPhi(x)=\displaystyle{\int_{-\infty}^x\dfrac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}\,\textrm{d}t}$。$\phi(x)$为偶函数,$\varPhi(0)=\dfrac{1}{2}$,$\varPhi(-x)=1-\varPhi(x)$。 若$X\sim N(0,1)$,$P\{X>\mu_\alpha\}=\alpha$,则称$\mu_\alpha$为标准正态分布的\textbf{上侧$\alpha$分位数/上$\alpha$分位点}。 -若$X\sim N(\mu,\delta^2)$,则 +若$X\sim N(\mu,\sigma^2)$,则 \begin{itemize} - \item $F(x)=P\{X\leqslant x\}=P\left\{\dfrac{X-\mu}{\delta}\leqslant\dfrac{x-\mu}{\delta}\right\}=\varPhi\left(\dfrac{x-\mu}{\delta}\right)$。(标准化) + \item $F(x)=P\{X\leqslant x\}=P\left\{\dfrac{X-\mu}{\sigma}\leqslant\dfrac{x-\mu}{\sigma}\right\}=\varPhi\left(\dfrac{x-\mu}{\sigma}\right)$。(标准化) \item $F(\mu-x)+F(\mu+x)=1$。 - \item $P\{a0$,$-1<\rho<1$,则称$(X,Y)$服从参数为$\mu_1,\mu_2,\delta_1^2,\delta_2^2,\rho$的\textbf{二维正态分布},记为$(X,Y)\sim N(\mu_1,\mu_2;\delta_1^2,\delta_2^2;\rho)$。此时: +其中$\mu_1,\mu_2\in R$,$\sigma_1,\sigma_2>0$,$-1<\rho<1$,则称$(X,Y)$服从参数为$\mu_1,\mu_2,\sigma_1^2,\sigma_2^2,\rho$的\textbf{二维正态分布},记为$(X,Y)\sim 
N(\mu_1,\mu_2;\sigma_1^2,\sigma_2^2;\rho)$。此时: \begin{itemize} - \item $X\sim N(\mu_1,\delta_1^2)$,$Y\sim N(\mu_2,\delta_2^2)$,$\rho$为$X$与$Y$的相关系数,即$\rho=\dfrac{Cov(X,Y)}{\sqrt{DX}\sqrt{DY}}=\dfrac{Cov(X,Y)}{\delta_1\delta_2}$。 + \item $X\sim N(\mu_1,\sigma_1^2)$,$Y\sim N(\mu_2,\sigma_2^2)$,$\rho$为$X$与$Y$的相关系数,即$\rho=\dfrac{Cov(X,Y)}{\sqrt{DX}\sqrt{DY}}=\dfrac{Cov(X,Y)}{\sigma_1\sigma_2}$。 \item $X,Y$的条件分布都是正态分布。 \item $aX+bY$($a\neq0$或$b\neq0$)服从正态分布。 \item $XY$相互独立的充要条件是$XY$不相关,即$\rho=0$。 diff --git a/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.pdf b/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.pdf index 8d78cb3..b3c7e5b 100644 Binary files a/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.pdf and b/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.pdf differ diff --git a/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.tex b/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.tex index ffc0a2d..df49085 100644 --- a/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.tex +++ b/probability-theory-and-mathematical-statistics/knowledge/3-digital-features/digital-features.tex @@ -71,7 +71,7 @@ \subsubsection{概念} -\textcolor{violet}{\textbf{定义:}}设$X$是随机变量,若$E[(X-EX)^2]$存在,则称$E[(X-EX)^2]$为$X$的\textbf{方差},记为$DX$,即$DX=E[(X-EX)^2]=E(X^2)-(EX)^2$。称$\sqrt{DX}$为$X$的\textbf{标准差}或\textbf{均方差},记为$\delta(X)$,称随机变量$X^*=\dfrac{X-EX}{\sqrt{DX}}$为$X$的\textbf{标准化随机变量},此时$EX^*=0$,$DX^*=1$。 +\textcolor{violet}{\textbf{定义:}}设$X$是随机变量,若$E[(X-EX)^2]$存在,则称$E[(X-EX)^2]$为$X$的\textbf{方差},记为$DX$,即$DX=E[(X-EX)^2]=E(X^2)-(EX)^2$。称$\sqrt{DX}$为$X$的\textbf{标准差}或\textbf{均方差},记为$\sigma(X)$,称随机变量$X^*=\dfrac{X-EX}{\sqrt{DX}}$为$X$的\textbf{标准化随机变量},此时$EX^*=0$,$DX^*=1$。 \subsubsection{性质} @@ -105,7 
+105,7 @@ 二项分布$B(n,p)$ & $P\{X=k\}=C_n^kp^k(1-p)^{n-k}$,$k=0,\cdots,n$ & $np$ & $np(1-p)$ \\ \hline 泊松分布$P(\lambda)$ & $P\{X=k\}=\dfrac{\lambda^k}{k!}e^{-\lambda}$,$k=0,\cdots$ & $\lambda$ & $\lambda$ \\ \hline 几何分布$G(p)$ & $P\{X=k\}=(1-p)^{k-1}$,$p,k=1,\cdots$ & $\dfrac{1}{p}$ & $\dfrac{1-p}{p^2}$ \\ \hline - 正态分布$N(\mu,\delta^2)$ & $f(x)=\dfrac{1}{\sqrt{2\pi}\delta}\exp\left\{-\dfrac{(x-\mu)^2}{2\delta^2}\right\}$,$x\in R$ & $\mu$ & $\delta^2$ \\ \hline + 正态分布$N(\mu,\sigma^2)$ & $f(x)=\dfrac{1}{\sqrt{2\pi}\sigma}\exp\left\{-\dfrac{(x-\mu)^2}{2\sigma^2}\right\}$,$x\in R$ & $\mu$ & $\sigma^2$ \\ \hline 均匀分布$U(a,b)$ & $f(x)=\dfrac{1}{b-a}$,$a0$ & $\dfrac{1}{\lambda}$ & $\dfrac{1}{\lambda^2}$ \\ \hline \end{tabular} diff --git a/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.pdf b/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.pdf index 734b6f4..47c0be5 100644 Binary files a/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.pdf and b/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.pdf differ diff --git a/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.tex b/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.tex index 4e7e09f..a72b38b 100644 --- a/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.tex +++ 
b/probability-theory-and-mathematical-statistics/knowledge/4-law-of-large-numbers-and-central-limit-theorem/law-of-large-numbers-and-central-limit-theorem.tex @@ -104,11 +104,11 @@ $C.$服从同一泊松分布\qquad$D.$服从同一连续型分布 \section{中心极限定理} -中心极限定理总结来看均为:若$X_i$独立同分布于某一分布$F$(期望$\mu$与方差$\delta^2$存在),则当$n$充分大时$\sum\limits_{i=1}^nX_i$近似服从正态分布,即$\sum\limits_{i=1}^nX_i\overset{n\to\infty}{\sim}N(n\mu,n\delta^2)$。 +中心极限定理总结来看均为:若$X_i$独立同分布于某一分布$F$(期望$\mu$与方差$\sigma^2$存在),则当$n$充分大时$\sum\limits_{i=1}^nX_i$近似服从正态分布,即$\sum\limits_{i=1}^nX_i\overset{n\to\infty}{\sim}N(n\mu,n\sigma^2)$。 \subsection{列维-林德伯格定理} -\textcolor{violet}{\textbf{定义:}}假设$\{X_n\}$是独立同分布的随机变量序列,若$EX_i=\mu$,$DX_i=\delta^2>0$($i=1,2,\cdots$)存在,则对任意的实数$x$,有$\lim\limits_{n\to\infty}P\left\{\dfrac{\sum\limits_{i=1}^nX_i-n\mu}{\sqrt{n}\delta}\leqslant x\right\}=\dfrac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{-\frac{t^2}{2}}\,\textrm{d}t=\varPhi(x)$。(正态分布标准化) +\textcolor{violet}{\textbf{定义:}}假设$\{X_n\}$是独立同分布的随机变量序列,若$EX_i=\mu$,$DX_i=\sigma^2>0$($i=1,2,\cdots$)存在,则对任意的实数$x$,有$\lim\limits_{n\to\infty}P\left\{\dfrac{\sum\limits_{i=1}^nX_i-n\mu}{\sqrt{n}\sigma}\leqslant x\right\}=\dfrac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{-\frac{t^2}{2}}\,\textrm{d}t=\varPhi(x)$。(正态分布标准化) 定理要求:独立、同分布、期望方差存在。 diff --git a/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.pdf b/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.pdf index 796c056..2008647 100644 Binary files a/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.pdf and b/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.pdf differ diff --git a/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.tex b/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.tex index af9e268..07fd377 100644 --- 
a/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.tex +++ b/probability-theory-and-mathematical-statistics/knowledge/5-mathematical-statistics/mathematical-statistics.tex @@ -95,14 +95,14 @@ $X_{(1)}$的分布函数为$F_{(1)}(x)=1-[1-F(x)]^n$,概率密度为$f_{(1)}(x \subsubsection{性质} -设总体$X$的期望$EX=\mu$,方差$DX=\delta^2$,样本$X_1,X_2,\cdots,X_n$取自$X$,$\overline{X}$和$S^2$分别为样本的均值和方差,则: +设总体$X$的期望$EX=\mu$,方差$DX=\sigma^2$,样本$X_1,X_2,\cdots,X_n$取自$X$,$\overline{X}$和$S^2$分别为样本的均值和方差,则: \begin{itemize} \item $EX_i=\mu$。 - \item $DX_i=\delta^2$。 + \item $DX_i=\sigma^2$。 \item $E\overline{X}=EX=\mu$。 - \item $D\overline{X}=D\left(\dfrac{1}{n}\sum\limits_{i=1}^nx_i\right)=\dfrac{1}{n^2}n\delta^2=\dfrac{1}{n}DX=\dfrac{\delta^2}{n}$。 - \item $E(S^2)=E\left(\dfrac{1}{n-1}\sum\limits_{i=1}^n(x_i-\overline{x})^2\right)=E\left(\dfrac{1}{n-1}\sum\limits_{i=1}^n(x_i^2-2x_i\overline{x}+\overline{x}^2)\right)=$\\$E\left(\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nx_i^2-2\overline{x}\cdot\sum\limits_{i=1}^nx_i+n\overline{x}^2\right)\right)=E\left(\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nx_i^2-n\overline{x}^2\right)\right)=$\\$\dfrac{1}{n-1}E\left(\sum\limits_{i=1}^nx_i^2-n\overline{x}^2\right)=\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nEx_i^2-nE\overline{x}^2\right)=\dfrac{n}{n-1}[(Ex_i)^2+Dx_i-(E\overline{x})^2-D\overline{x}]=\dfrac{n}{n-1}\left(\mu^2+\delta^2-\mu^2-\dfrac{\delta^2}{n}\right)=DX=\delta^2$。 + \item $D\overline{X}=D\left(\dfrac{1}{n}\sum\limits_{i=1}^nx_i\right)=\dfrac{1}{n^2}n\sigma^2=\dfrac{1}{n}DX=\dfrac{\sigma^2}{n}$。 + \item 
$E(S^2)=E\left(\dfrac{1}{n-1}\sum\limits_{i=1}^n(x_i-\overline{x})^2\right)=E\left(\dfrac{1}{n-1}\sum\limits_{i=1}^n(x_i^2-2x_i\overline{x}+\overline{x}^2)\right)=$\\$E\left(\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nx_i^2-2\overline{x}\cdot\sum\limits_{i=1}^nx_i+n\overline{x}^2\right)\right)=E\left(\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nx_i^2-n\overline{x}^2\right)\right)=$\\$\dfrac{1}{n-1}E\left(\sum\limits_{i=1}^nx_i^2-n\overline{x}^2\right)=\dfrac{1}{n-1}\left(\sum\limits_{i=1}^nEx_i^2-nE\overline{x}^2\right)=\dfrac{n}{n-1}[(Ex_i)^2+Dx_i-(E\overline{x})^2-D\overline{x}]=\dfrac{n}{n-1}\left(\mu^2+\sigma^2-\mu^2-\dfrac{\sigma^2}{n}\right)=DX=\sigma^2$。 \end{itemize} \subsection{三大分布} @@ -155,50 +155,249 @@ $X_{(1)}$的分布函数为$F_{(1)}(x)=1-[1-F(x)]^n$,概率密度为$f_{(1)}(x 若随机变量$X\sim N(0,1)$,$Y\sim\chi^2(n)$,$XY$相互独立,则随机变量$t=\dfrac{X}{\sqrt{Y/n}}$服从自由度为$n$的$t$分布,记为$t\sim t(n)$。 -当$t\to\infty$时,$t$分布就是正态分布。其是偶函数,所以$Et=0$。 +当自由度$n\to\infty$时,$t$分布趋于标准正态分布。其概率密度是偶函数,所以$Et=0$($n>1$时期望存在)。 + +$t$分布用于根据小样本来估计呈正态分布且方差未知的总体的均值。 + +\begin{tikzpicture}[scale=1.5] + \draw[-latex](-2,0) -- (2,0) node[below]{$x$}; + \draw[-latex](0,-0.25) -- (0,3) node[above]{$y$}; + \filldraw[black] (0,0) node[below]{$O$}; + \draw[brown, thick, domain=-2:2] plot (\x,{pow(e,-\x*\x/2)/pow(6, -0.5)}); + \filldraw[black] (0.75,2.5) node{$n=+\infty$}; + \draw[orange, thick, domain=-2:2] plot (\x,{pow(e,-\x*\x/2)/pow(4, -0.5)}); + \filldraw[black] (0.75,2) node{$n=2$}; + \draw[red, thick, domain=-2:2] plot (\x,{pow(e,-\x*\x/2)/pow(3, -0.5)}); + \filldraw[black] (0.5,1) node{$n=1$}; + \filldraw[black] (0.25,3) node{$f(x)$}; + \filldraw[black] (2,0.2) node{$\alpha$}; +\end{tikzpicture} + +\paragraph{性质} \leavevmode \medskip + +由$t$分布的概率密度$f(x)$图形的对称性可知$P\{t>-t_\alpha(n)\}=1-\alpha=P\{t>t_{1-\alpha}(n)\}$,所以$t_{1-\alpha}(n)=-t_\alpha(n)$。 \subsubsection{\texorpdfstring{$F$分布}{}} -若随机变量$X_1,X_2,\cdots,X_n$ +\paragraph{概念} \leavevmode \medskip + 
+若随机变量$X\sim\chi^2(n_1)$,$Y\sim\chi^2(n_2)$,且$X$与$Y$相互独立,则$F=\dfrac{X/n_1}{Y/n_2}$服从自由度为$(n_1,n_2)$的$F$分布,记为$F\sim F(n_1,n_2)$,其中$n_1$为第一自由度,$n_2$为第二自由度。 + +\begin{tikzpicture}[scale=2] + \draw[-latex](-0.25,0) -- (3,0) node[below]{$x$}; + \draw[-latex](0,-0.25) -- (0,2) node[above]{$y$}; + \filldraw[black] (0,0) node[below]{$O$}; + \draw[green, thick, domain=0.1:3] plot (\x,{pow(\x,4)*pow(e,-\x*\x/2)/pow(\x,2)*pow(e,-\x*\x/2)}); + \filldraw[black] (2,0.55) node{$n_1=10,n_2=6$}; + \draw[cyan, thick, domain=0.1:3] plot (\x,{pow(\x,4)*pow(e,-\x*\x/2)/pow(\x,3)*pow(e,-\x*\x/2)*4}); + \filldraw[black] (1.5,2) node{$n_1=10,n_2=\infty$}; + \filldraw[black] (0.25,2) node{$f(x)$}; +\end{tikzpicture} + +\paragraph{性质} \leavevmode \medskip + +\begin{itemize} + \item 若$F\sim F(n_1,n_2)$,则$\dfrac{1}{F}\sim F(n_2,n_1)$。 + \item $F_{1-\alpha}(n_1,n_2)=\dfrac{1}{F_\alpha(n_2,n_1)}$。 +\end{itemize} + +证明性质2:设$F\sim F(n_2,n_1)$。 + +由上$\alpha$分位点的定义,$P\{F>F_\alpha(n_2,n_1)\}=\alpha$,所以$P\{F\leqslant F_\alpha(n_2,n_1)\}=1-\alpha$。 + +取倒数:$P\left\{\dfrac{1}{F}\geqslant\dfrac{1}{F_\alpha(n_2,n_1)}\right\}=1-\alpha$。 + +又根据性质1:$\dfrac{1}{F}\sim F(n_1,n_2)$,所以$P\left\{\dfrac{1}{F}\geqslant F_{1-\alpha}(n_1,n_2)\right\}=1-\alpha$。 + +比较两式,即$F_{1-\alpha}(n_1,n_2)=\dfrac{1}{F_\alpha(n_2,n_1)}$。 \subsection{正态总体下结论} +设$X_1,X_2,\cdots,X_n$是来自正态总体$N(\mu,\sigma^2)$的一个样本,$\overline{X}$,$S^2$分别是样本的均值和方差,则: + +\begin{enumerate} + \item $\overline{X}\sim N\left(\mu,\dfrac{\sigma^2}{n}\right)$,即$\dfrac{\overline{X}-\mu}{\sigma/\sqrt{n}}=\dfrac{\sqrt{n}(\overline{X}-\mu)}{\sigma}\sim N(0,1)$。 + \item $\dfrac{1}{\sigma^2}\sum\limits_{i=1}^n(X_i-\mu)^2\sim\chi^2(n)$。 + \item $\dfrac{(n-1)S^2}{\sigma^2}=\sum\limits_{i=1}^n\left(\dfrac{X_i-\overline{X}}{\sigma}\right)^2\sim\chi^2(n-1)$($\mu$未知时,在2中用$\overline{X}$代替$\mu$)。 + \item $\overline{X}$与$S^2$相互独立,$\dfrac{\sqrt{n}(\overline{X}-\mu)}{S}\sim t(n-1)$($\sigma$未知时在1中用$S$代替$\sigma$)。进一步有$\dfrac{n(\overline{X}-\mu)^2}{S^2}\sim F(1,n-1)$。 +\end{enumerate} + \section{参数点估计} 
\subsection{概念} +\textcolor{violet}{\textbf{定义:}}设总体$X$的分布函数为$F(x;\theta)$,其中$\theta$为一个未知参数,$X_1,X_2,\cdots,$\\$X_n$是取自总体$X$的一个样本。由样本构造一个适当的统计量$\hat{\theta}(X_1,X_2,\cdots,X_n)$作为参数$\theta$的估计,称统计量$\hat{\theta}(X_1,X_2,\cdots,X_n)$为$\theta$的\textbf{估计量},一般记为$\hat{\theta}=\hat{\theta}(X_1,X_2,\cdots,X_n)$。 + +如果$x_1,x_2,\cdots,x_n$是样本的一个观察值,将其代入估计量$\hat{\theta}$中得到值$\hat{\theta}(x_1,$\\$x_2,\cdots,x_n)$,并以此值作为未知参数$\theta$的近似值,则称这个值为未知参数$\theta$的\textbf{估计值}。 + +建立一个适当的统计量作为未知参数$\theta$的估计量并以相应的观察值作为未知参数估计值的问题,就是参数$\theta$的\textbf{点估计问题}。 + \subsection{方法} \subsubsection{矩估计法} +\textbf{例题:}设$X_1,X_2,\cdots,X_n$是来自总体$X$的简单随机样本,总体$X$的概率分布为$X\sim\left(\begin{array}{ccc} + -1 & 0 & 2 \\ + 2\theta & \theta & 1-3\theta +\end{array}\right)$,其中$0<\theta<\dfrac{1}{3}$,求参数$\theta$的矩估计量。 + +解:令$\overline{X}=EX$,即$\dfrac{1}{n}\sum\limits_{i=1}^nX_i=(-1)\cdot2\theta+0\cdot\theta+2(1-3\theta)=2-8\theta$。 + +所以$\hat{\theta}=\dfrac{2-\overline{X}}{8}$。 + +\textbf{例题:}设总体$X$的概率密度为$f(x)=\left\{\begin{array}{ll} + (1+\theta)x^\theta, & 0<x<1 \\ + 0, & \text{其他} +\end{array}\right.$,其中$\theta>-1$为未知参数,设$X_1,X_2,\cdots,X_n$为来自总体$X$的样本容量为$n$的简单随机样本,求$\theta$的矩估计量。 + +解:令$\overline{X}=EX$,$EX=\int_{-\infty}^{+\infty}xf(x)\,\textrm{d}x=\int_0^1x(1+\theta)x^\theta\,\textrm{d}x=(1+\theta)\dfrac{x^{\theta+2}}{\theta+2}\bigg|_0^1=\dfrac{1+\theta}{2+\theta}$。 + +解得$\hat{\theta}=\dfrac{2\overline{X}-1}{1-\overline{X}}$。 + \subsubsection{最大似然估计} +\paragraph{定义} \leavevmode \medskip + +对未知参数$\theta$进行估计时,在该参数可能取值的范围$I$内选取,使得样本获得此观测值$x_1,x_2,\cdots,x_n$的概率最大的参数值$\hat{\theta}$作为$\theta$的估计,这样的$\hat{\theta}$最有利于$x_1,x_2,\cdots,x_n$的出现。 + +设总体$X$是离散型,其概率分布为$P\{X=x\}=p(x;\theta)$,$\theta\in I$,$\theta$为未知参数,$X_1,X_2,\cdots,X_n$为$X$的一个样本,则$X_1,X_2,\cdots,X_n$取值为$x_1,x_2,\cdots,x_n$的概率为$P\{X_1=x_1,X_2=x_2,\cdots,X_n=x_n\}=\prod\limits_{i=1}^nP\{X_i=x_i\}=\prod\limits_{i=1}^np(x_i;\theta)$。显然这个概率值为$\theta$的函数,记为$L(\theta)=L(x_1,x_2,\cdots,x_n;\theta)=\prod\limits_{i=1}^np(x_i;\theta)$。称$L(\theta)$为样本的\textbf{似然函数}。 + 
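+例如(以下总体分布与样本值均为说明用的假设,并非上文内容):设总体$X\sim B(1,p)$,即$P\{X=x\}=p^x(1-p)^{1-x}$($x=0,1$),$0<p<1$为未知参数,则样本观测值$x_1,x_2,\cdots,x_n$对应的似然函数可写为$L(p)=\prod\limits_{i=1}^np^{x_i}(1-p)^{1-x_i}=p^{\sum\limits_{i=1}^nx_i}(1-p)^{n-\sum\limits_{i=1}^nx_i}$。若$n=5$且观测值为$1,0,1,1,0$,则$L(p)=p^3(1-p)^2$,它只是未知参数$p$的函数。
+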
+\textcolor{violet}{\textbf{定义:}}若存在$\hat{\theta}\in I$,使得$L(x_1,x_2,\cdots,x_n;\hat{\theta})=\max\limits_{\theta\in I}L(x_1,x_2,\cdots,x_n;\theta)$,则称$\hat{\theta}=\hat{\theta}(x_1,x_2,\cdots,x_n)$为参数$\theta$的\textbf{最大似然估计值},对应的统计量$\hat{\theta}(X_1,X_2,\cdots,X_n)$称为参数$\theta$的\textbf{最大似然估计量}。 + +同理若总体$X$为连续型随机变量,其概率密度为$f(x;\theta)$,$\theta\in I$,则样本的\textbf{似然函数}为$L(\theta)=L(x_1,x_2,\cdots,x_n;\theta)=\prod\limits_{i=1}^nf(x_i;\theta)$。 + +\textcolor{violet}{\textbf{定义:}}若存在$\hat{\theta}\in I$,使得$L(x_1,x_2,\cdots,x_n;\hat{\theta})=\max\limits_{\theta\in I}\prod\limits_{i=1}^nf(x_i;\theta)$,则称$\hat{\theta}=\hat{\theta}(x_1,x_2,\cdots,x_n)$为参数$\theta$的\textbf{最大似然估计值},对应的统计量$\hat{\theta}(X_1,X_2,\cdots,X_n)$称为参数$\theta$的\textbf{最大似然估计量}。 + +\paragraph{步骤} \leavevmode \medskip + +\begin{enumerate} + \item 写出样本的似然函数。$L(\theta)=L(x_1,x_2,\cdots,x_n;\theta_1,\theta_2,\cdots,\theta_k)=\prod\limits_{i=1}^np(x_i;\theta_1,$\\$\theta_2,\cdots,\theta_k)$或$\prod\limits_{i=1}^nf(x_i;\theta_1,\theta_2,\cdots,\theta_k)$。 + \item 如果$p(x;\theta_1,\theta_2,\cdots,\theta_k)$或$f(x;\theta_1,\theta_2,\cdots,\theta_k)$关于$\theta_i$可微,则令$\dfrac{\partial L(\theta)}{\partial\theta_i}=0$或$\dfrac{\partial\ln L(\theta)}{\partial\theta_i}=0$。由于$L(\theta)$是乘积形式,且$\ln x$单调增,所以$L(\theta)$与$\ln L(\theta)$在同一$\theta$处取极值,所以更多采用后面一种对数似然方程组来解。求得$\theta_i$的最大似然估计量为$\hat{\theta}_i=\hat{\theta}_i(X_1,X_2,\cdots,X_n)$($i=1,2,\cdots,k$)。 + \item 如果$p(x;\theta_1,\theta_2,\cdots,\theta_k)$或$f(x;\theta_1,\theta_2,\cdots,\theta_k)$不可微,或似然方程组无解,则应由定义用其他方法求$\hat{\theta}$,如当$L(\theta)$为$\theta$的单调函数时,$\hat{\theta}$为$\theta$的取值上限或下限。 +\end{enumerate} + +即将概率密度或概率分布连乘,然后取对数,再求导令其为0解出$\hat{\theta}$。 + +\textbf{例题:}设总体$X$的概率分布为: + +\begin{tabular}{c|cccc} + \hline + $X$ & 0 & 1 & 2 & 3 \\ \hline + $P$ & $\theta^2$ & $2\theta(1-\theta)$ & $\theta^2$ & $1-2\theta$ \\ \hline +\end{tabular} \medskip + +其中$\theta\in\left(0,\dfrac{1}{2}\right)$为未知参数,从总体$X$中抽取容量为8的一组样本,其样本值为3,1,3,0,3,1,2,3。求$\theta$的矩估计值和最大似然估计值。 + 
+解:矩估计:$EX=0\cdot\theta^2+1\cdot2\theta(1-\theta)+2\cdot\theta^2+3(1-2\theta)=3-4\theta$,样本均值$\overline{x}=\dfrac{3+1+3+0+3+1+2+3}{8}=2$,令$EX=\overline{x}$,得矩估计值$\hat{\theta}=\dfrac{1}{4}$。 + +最大似然估计:首先将各样本值对应的概率相乘:$L(\theta)=(1-2\theta)^4[2\theta(1-\theta)]^2\cdot\theta^2\cdot\theta^2=4\theta^6(1-\theta)^2(1-2\theta)^4$。 + +对其求对数:$\ln L(\theta)=\ln4+6\ln\theta+2\ln(1-\theta)+4\ln(1-2\theta)$。 + +对其求导:$\dfrac{\textrm{d}\ln L(\theta)}{\textrm{d}\theta}=\dfrac{6}{\theta}-\dfrac{2}{1-\theta}-\dfrac{8}{1-2\theta}=0$,即$12\theta^2-14\theta+3=0$。解得$\theta=\dfrac{7\pm\sqrt{13}}{12}$。 + +由于$0<\theta<\dfrac{1}{2}$,而$\dfrac{7+\sqrt{13}}{12}>\dfrac{1}{2}$,舍去较大的根,得到$\hat{\theta}=\dfrac{7-\sqrt{13}}{12}$。 + +\textbf{例题:}设总体$X$的概率密度为$f(x)=\left\{\begin{array}{ll} + (1+\theta)x^\theta, & 0<x<1 \\ + 0, & \text{其他} +\end{array}\right.$,其中$\theta>-1$为未知参数,设$X_1,X_2,\cdots,X_n$为来自总体$X$的样本容量为$n$的简单随机样本,求$\theta$的最大似然估计量。 + +解:这是上面的矩估计的题目的延伸。 + +首先$L(\theta)=(1+\theta)x_1^\theta\cdot(1+\theta)x_2^\theta\cdots(1+\theta)x_n^\theta=(1+\theta)^n\prod\limits_{i=1}^nx_i^\theta$($0<x_i<1$)。 + +取对数:$\ln L(\theta)=n\ln(1+\theta)+\theta\sum\limits_{i=1}^n\ln x_i$。 + +对其求导:$\dfrac{\textrm{d}\ln L(\theta)}{\textrm{d}\theta}=\dfrac{n}{1+\theta}+\sum\limits_{i=1}^n\ln x_i=0$,解得$\hat{\theta}=-\dfrac{n}{\sum\limits_{i=1}^n\ln x_i}-1$。 + +最大似然估计量为$-\dfrac{n}{\sum\limits_{i=1}^n\ln X_i}-1$。 + +\textcolor{orange}{注意:}估计值用小写$x$,估计量用大写$X$。 + \subsection{估计量评价标准} +不同的估计法所产生的估计量有所差异,需要有一套标准来评判估计量。 + \subsubsection{无偏性} +\textcolor{violet}{\textbf{定义:}}若参数$\theta$的估计量$\hat{\theta}=\hat{\theta}(X_1,X_2,\cdots,X_n)$对一切$n$及$\theta\in I$,有$E\hat{\theta}=\theta$,则称$\hat{\theta}$为$\theta$的\textbf{无偏估计量}。 + +\textbf{例题:}设$X_1,X_2,\cdots,X_n$是正态总体$X\sim N(\mu,\sigma^2)$的简单随机样本,为使$D=k\sum\limits_{i=1}^{n-1}(X_{i+1}-X_i)^2$成为总体方差$\sigma^2$的无偏估计量,求$k$。 + +解:已知总体方差为$\sigma^2$,无偏性要求$ED=\sigma^2$,所以代入: + +$ED=\sigma^2=kE\left(\sum\limits_{i=1}^{n-1}(X_{i+1}-X_i)^2\right)=kE\left(\sum\limits_{i=1}^{n-1}(X_{i+1}^2-2X_iX_{i+1}+X_i^2)\right)$。 + +由于$X_i$与$X_{i+1}$相互独立,$E(X_i^2)=E(X_{i+1}^2)=\mu^2+\sigma^2$,$E(X_iX_{i+1})=EX_i\cdot EX_{i+1}=\mu^2$,所以求和中每一项的期望为$2(\mu^2+\sigma^2)-2\mu^2=2\sigma^2$,从而$\sigma^2=k\cdot2(n-1)\sigma^2$,解得$k=\dfrac{1}{2(n-1)}$。 + \subsubsection{有效性} -最小方差性。 +也称为最小方差性。 + 
+\textcolor{violet}{\textbf{定义:}}设$\hat{\theta_1}=\hat{\theta_1}(X_1,X_2,\cdots,X_n)$与$\hat{\theta_2}=\hat{\theta_2}(X_1,X_2,\cdots,X_n)$都是$\theta$的无偏估计量,若$D(\hat{\theta_1})0$,有$\lim\limits_{n\to\infty}P\{\vert\hat{\theta}-\theta\vert<\epsilon\}=1$,即$\hat{\theta}\overset{P}{\longrightarrow}\theta(n\to\infty)$,则称$\hat{\theta}$为$\theta$的\textbf{一致估计量}(\textbf{相合估计量})。 \section{参数区间估计与假设检验} \subsection{区间估计} +区间估计是根据样本估计总体期望$\mu$所在的区间。 + \subsubsection{概念} +\textcolor{violet}{\textbf{定义:}}已知从总体$X$中取出一部分样本$X_n$,则这些样本的平均值$\overline{X}$不一定等于$X$的期望即应该的平均值$\mu$,但是其之间的差距应该不大,即差距较小的概率较大,从而表示为$P(\vert\overline{X}-\mu\vert<\Delta)=1-\alpha$,$\alpha$为\textbf{显著性水平},其一般是一个较小的正数。而$1-\alpha$称为\textbf{置信度}或\textbf{置信水平}。 + \subsubsection{正态总体均值的置信空间} -\subsection{检设检验} +假设$X\sim N(\mu,\sigma^2)$(若不服从正态分布就用中心极限定理来解决),则$\overline{X}\sim N\left(\mu,\dfrac{\sigma^2}{n}\right)$,$P\left(\left\vert\dfrac{\overline{X}-\mu}{\sigma/\sqrt{n}}\right\vert<\dfrac{\Delta}{\sigma/\sqrt{n}}\right)=1-\alpha$。记$\dfrac{\overline{X}-\mu}{\sigma/\sqrt{n}}=Z$,则$Z\sim N(0,1)$。 + +$\therefore P\left(\vert Z\vert<\dfrac{\Delta}{\sigma/\sqrt{n}}\right)=1-\alpha$,从而中间面积为$1-\alpha$,得到两端面积$\dfrac{\alpha}{2}$。 + +得到上$\alpha$分位数$Z_\frac{\alpha}{2}$,$\therefore\dfrac{\Delta}{\sigma/\sqrt{n}}=Z_\frac{\alpha}{2}$,解得$\Delta=Z_\frac{\alpha}{2}\dfrac{\sigma}{\sqrt{n}}$。 + +代入:解得$\mu\in(\overline{X}-\Delta,\overline{X}+\Delta)=(\overline{X}-Z_\frac{\alpha}{2}\dfrac{\sigma}{\sqrt{n}},\overline{X}+Z_\frac{\alpha}{2}\dfrac{\sigma}{\sqrt{n}})$。 + +这个$\mu$所处的区间就是\textbf{置信区间},区间上限就是\textbf{置信上限},区间下限就是\textbf{置信下限}。 + +当$\sigma$未知的时候就无法求出置信区间了,所以根据正态总体下的结论,用样本方差$S$代替方差$\sigma$,且$\dfrac{\sqrt{n}(\overline{X}-\mu)}{S}\sim t(n-1)$。 + +所以$P\left(\left\vert\dfrac{\overline{X}-\mu}{S/\sqrt{n}}\right\vert<\dfrac{\Delta}{S/\sqrt{n}}\right)=1-\alpha$,令$\dfrac{\overline{X}-\mu}{S/\sqrt{n}}=t$,所以$t\sim t(n-1)$。 + 
+可得上$\dfrac{\alpha}{2}$分位点$t_\frac{\alpha}{2}(n-1)$,所以$\dfrac{\Delta}{S/\sqrt{n}}=t_\frac{\alpha}{2}(n-1)$,解得$\Delta=t_\frac{\alpha}{2}(n-1)\dfrac{S}{\sqrt{n}}$。 + +代入:解得$\mu\in(\overline{X}-\Delta,\overline{X}+\Delta)=\left(\overline{X}-t_\frac{\alpha}{2}(n-1)\dfrac{S}{\sqrt{n}},\overline{X}+t_\frac{\alpha}{2}(n-1)\dfrac{S}{\sqrt{n}}\right)$。 + +综上:求置信区间的关键是求$\Delta$: + +\begin{itemize} + \item 当$\sigma$已知时,$\Delta=Z_\frac{\alpha}{2}\dfrac{\sigma}{\sqrt{n}}$。 + \item 当$\sigma$未知时,$\Delta=t_\frac{\alpha}{2}(n-1)\dfrac{S}{\sqrt{n}}$。 +\end{itemize} + +\subsection{假设检验} + +已经有了对期望$\mu$的假设,对这个假设进行检验。若样本均值$\overline{X}$落在拒绝域中,就拒绝原假设。 \subsubsection{思想} +已经有了假设总体期望为$\mu=\mu_0$。若该假设成立,则$P(\vert\overline{X}-\mu_0\vert<\Delta)=1-\alpha$,所以取对立事件$P(\vert\overline{X}-\mu_0\vert\geqslant\Delta)=\alpha$,这是一个小概率事件。若这个小概率事件在一次抽样中发生了,则否定原假设。 + +若$\sigma$已知,则$\Delta=Z_\frac{\alpha}{2}\dfrac{\sigma}{\sqrt{n}}$,$\overline{X}$的取值区间$(-\infty,\mu_0-\Delta]\cup[\mu_0+\Delta,+\infty)$称为\textbf{拒绝域},即小概率事件发生的区间。 + +若$\sigma$未知,则$\Delta=t_\frac{\alpha}{2}(n-1)\dfrac{S}{\sqrt{n}}$,拒绝域的形式相同。 + \subsubsection{正态总体下的六大检验与拒绝域} \subsection{两类错误} +第一类错误(弃真):若$H_0$为真,按检验法则否定$H_0$。发生概率为$\alpha=P\{\text{拒绝}H_0|H_0\text{为真}\}$。 + +第二类错误(存伪):若$H_0$为假,按检验法则接受$H_0$。发生概率为$\beta=P\{\text{接受}H_0|H_0\text{为假}\}=P\{\text{接受}H_0|H_1\text{为真}\}$。 + \end{document}