Sampling distribution

Mean

$\bar{X} = \frac{X_1 + ... + X_n}{n} = \frac{1}{n}\sum X_i$

  • $\min_a \sum (x_i - a)^2 = \sum (x_i - \bar{x})^2$
    • The sample mean $\bar{x}$ is the point that minimizes the sum of squared distances to the $x_i$.
  • $E(\bar{X}) = \mu$
    • The sample mean is an unbiased estimator of $\mu$.
  • $Var(\bar{X}) = \frac{\sigma^2}{n}$
    • The variance shrinks as $n$ grows, so a larger sample gives a more accurate estimate; the sketch below checks both properties by simulation.
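As a quick check of the last two properties, a minimal NumPy sketch (the normal population and the parameters are illustrative assumptions, not part of the result):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 3.0, 2.0, 25, 100_000

# Draw `reps` independent samples of size n and compute each sample mean.
xbars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(xbars.mean())  # ~ mu:           E(X-bar) = mu
print(xbars.var())   # ~ sigma**2 / n: Var(X-bar) = 4 / 25 = 0.16
```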

Variance

$S^2 = \frac{1}{n-1}\sum (X_i - \bar{X})^2$

  • $(n-1)S^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2$
    • This form needs only a single pass over the data (see the sketch below).
  • $E(S^2) = \sigma^2$
    • The sample variance is an unbiased estimator of $\sigma^2$.
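The identity above translates directly into a one-scan algorithm: accumulate $\sum x_i$ and $\sum x_i^2$ together, then combine them at the end. A minimal sketch:

```python
def one_pass_variance(data):
    """Sample variance S^2 via (n-1)S^2 = sum(x_i^2) - n*xbar^2, in one scan."""
    n, total, total_sq = 0, 0.0, 0.0
    for x in data:                    # single pass; the data is never stored
        n += 1
        total += x
        total_sq += x * x
    xbar = total / n
    return (total_sq - n * xbar * xbar) / (n - 1)

print(one_pass_variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))  # 4.571...
```

Note that in floating point this identity can suffer catastrophic cancellation when the mean is large relative to the spread; Welford's online algorithm is the numerically stable alternative.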

Lemma

  1. Let $X_1,...,X_n$ be a sample from the same distribution and let $g(x)$ be a function of it. If $E(g(X_1))$ and $Var(g(X_1))$ exist, then:

    $E(\sum g(X_i)) = n \cdot E(g(X_1))$

    $Var(\sum g(X_i)) = n \cdot Var(g(X_1))$

  2. The mgf of the sample mean is related to the mgf of a single observation by:

    $M_{\bar{X}}(t) = [M_X(\frac{t}{n})]^n$
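This follows from independence, since the expectation of a product of independent terms factorizes:

$M_{\bar{X}}(t) = E(e^{t\bar{X}}) = E(e^{\frac{t}{n}\sum X_i}) = \prod_{i=1}^n E(e^{\frac{t}{n}X_i}) = [M_X(\tfrac{t}{n})]^n$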

Theorem

For a random sample $X_1,...,X_n$ from a normal population $N(\mu, \sigma^2)$:

  1. $\bar{X}$ and $S^2$ are independent

  2. $\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$

  3. $\frac{(n-1)S^2}{\sigma^2} \sim \chi^2(n-1)$

    Why are the degrees of freedom $n-1$?

    • Computing $S^2$ uses the sample mean, which imposes one linear constraint on the deviations: $\sum (X_i - \bar{X}) = 0$, so only $n-1$ of them can vary freely.
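Assuming NumPy is available, here is a minimal simulation sketch of Theorems 2 and 3 (the parameters are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.0, 1.5, 10, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)      # sample variance S^2 (divisor n-1)

# Theorem 2: X-bar ~ N(mu, sigma^2 / n)
print(xbar.mean(), xbar.var())        # ~ 0.0, ~ sigma**2 / n = 0.225

# Theorem 3: (n-1)S^2/sigma^2 ~ chi^2(n-1), with mean n-1 and variance 2(n-1)
q = (n - 1) * s2 / sigma**2
print(q.mean(), q.var())              # ~ 9, ~ 18
```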

Convolution theorem

If $X$ and $Y$ are two independent continuous random variables, then the pdf of $Z = X + Y$ is:

$f_Z(z) = \int_{-\infty}^{+\infty} f_X(w) f_Y(z-w) \, dw$
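As an example, for independent $X, Y \sim \mathrm{Uniform}(0,1)$ the convolution gives the triangular pdf ($f_Z(z) = z$ on $[0,1]$ and $2-z$ on $[1,2]$). A minimal sketch approximating the integral with a Riemann sum:

```python
import numpy as np

# Evaluate f_Z(z) numerically for X, Y ~ Uniform(0, 1).
dw = 0.001
w = np.arange(0.0, 1.0, dw)              # support of f_X
f_X = np.ones_like(w)                    # f_X(w) = 1 on [0, 1)

def f_Y(t):
    return ((0.0 <= t) & (t < 1.0)).astype(float)

for z in (0.5, 1.0, 1.5):
    fz = np.sum(f_X * f_Y(z - w)) * dw   # Riemann sum for the integral
    print(z, fz)                         # ~ 0.5, 1.0, 0.5 (triangular pdf)
```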

Order statistics

  • The order statistics of a random sample $X_1,...,X_n$ are the sample values placed in ascending order, denoted by $X_{(1)},...,X_{(n)}$.

Distribution

discrete case

Define $P_i = p_1 + p_2 + ... + p_i$; then:

$P(X_{(j)} \le x_i) = \sum_{k=j}^n C_n^k P_i^k (1-P_i)^{n-k}$

continuous case

$f_{X_{(j)}}(x) = \frac{n!}{(j-1)!(n-j)!} f_X(x) [F_X(x)]^{j-1} [1-F_X(x)]^{n-j}$
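A minimal simulation sketch of this formula for a Uniform(0,1) sample, where $f_X(x) = 1$ and $F_X(x) = x$ (the values of $n$, $j$, and the evaluation point are arbitrary):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
n, j, reps = 5, 2, 200_000

# Empirical: draw samples of size n, sort, keep the j-th smallest value.
xj = np.sort(rng.uniform(size=(reps, n)), axis=1)[:, j - 1]

# Theoretical pdf at x for Uniform(0,1): f(x) = 1, F(x) = x on [0, 1].
coef = factorial(n) / (factorial(j - 1) * factorial(n - j))
x = 0.3
pdf = coef * x**(j - 1) * (1 - x)**(n - j)        # = 2.058 here

# Histogram-style density estimate in a small window around x.
h = 0.01
emp = np.mean(np.abs(xj - x) < h) / (2 * h)
print(pdf, emp)                                    # should be close
```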

Joint distribution

$f_{X_{(i)},X_{(j)}}(u,v) = \frac{n!}{(i-1)!(j-i-1)!(n-j)!} f_X(u) f_X(v) [F_X(u)]^{i-1} [F_X(v) - F_X(u)]^{j-i-1} [1-F_X(v)]^{n-j}$ for $u < v$

Limit theory

Convergence in probability

A sequence $X_1, X_2, ...$ converges in probability to a r.v. $X$ if, for every $\epsilon > 0$:

$\lim\limits_{n\to\infty} P(|X_n - X| \ge \epsilon) = 0$

  • Using Chebyshev's inequality, one can show that $\bar{X} - \mu$ converges in probability to 0.

Weak law of large numbers

Let $X_1, X_2, ...$ be i.i.d. with $E(X_i) = \mu$ and $Var(X_i) = \sigma^2 < \infty$; then for every $\epsilon > 0$:

$\lim\limits_{n\to\infty} P(|\bar{X} - \mu| < \epsilon) = 1$

  • The proof again uses Chebyshev's inequality: the variance of the sample mean tends to 0, which yields the weak law (see the derivation below).
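Concretely, applying Chebyshev's inequality to $\bar{X}$:

$P(|\bar{X} - \mu| \ge \epsilon) \le \frac{Var(\bar{X})}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0$

The bound vanishes precisely because $Var(\bar{X}) = \sigma^2/n \to 0$ as $n \to \infty$.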

Almost sure convergence

A sequence $X_1, X_2, ...$ converges almost surely to a r.v. $X$ if, for every $\epsilon > 0$:

$P(\lim\limits_{n\to\infty} |X_n - X| < \epsilon) = 1$

  • Almost sure convergence implies convergence in probability.

Strong law of large numbers

Let $X_1, X_2, ...$ be i.i.d. r.v.s with $E(X_i) = \mu$ and $Var(X_i) = \sigma^2 < \infty$; then for every $\epsilon > 0$:

$P(\lim\limits_{n\to\infty} |\bar{X}_n - \mu| < \epsilon) = 1$

Convergence in distribution

A sequence $X_1, X_2, ...$ converges in distribution to a r.v. $X$ if, at every point $x$ where $F_X$ is continuous:

$\lim\limits_{n\to\infty} F_{X_n}(x) = F_X(x)$

  • Convergence in distribution is the weakest of these modes of convergence.
  • Convergence in probability implies convergence in distribution.

Central limit theorem

Let $X_1, X_2, ...$ be a sequence of i.i.d. r.v.s whose mgfs exist in a neighborhood of 0. Let $E(X_i) = \mu$ and $Var(X_i) = \sigma^2 > 0$. Then:

$\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \to N(0,1)$ in distribution
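A minimal sketch of the CLT in action, standardizing sample means of a deliberately skewed Exponential(1) population (which has $\mu = \sigma = 1$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 100_000
mu = sigma = 1.0                     # Exponential(1): mean 1, sd 1

xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (xbar - mu) / sigma

# z should be approximately N(0, 1) despite the skewed population.
print(z.mean(), z.std())             # ~ 0, ~ 1
print(np.mean(z < 1.96))             # ~ 0.975, the standard normal CDF at 1.96
```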

Slutsky's Theorem

Let $X_n \to X$ in distribution and $Y_n \to a$, a constant, in probability. Then:

  1. $Y_n X_n \to aX$ in distribution
  2. $X_n + Y_n \to X + a$ in distribution

This tells us that the limit of such a product is the product of the limits. Since $\sigma / S_n \to 1$ in probability, it follows that:

$\frac{\sqrt{n}(\bar{X}_n - \mu)}{S_n} = \frac{\sigma}{S_n} \cdot \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \to N(0,1)$
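This is what licenses replacing the unknown $\sigma$ with $S_n$ in large-sample inference. A minimal simulation sketch of the studentized mean (same illustrative Exponential(1) population as above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, mu = 50, 100_000, 1.0        # Exponential(1) has mean 1

x = rng.exponential(scale=1.0, size=(reps, n))
t = np.sqrt(n) * (x.mean(axis=1) - mu) / x.std(axis=1, ddof=1)

# By Slutsky, the studentized mean is approximately N(0, 1),
# even though the true sigma never appears in the statistic.
print(t.mean(), t.std())              # ~ 0, ~ 1
```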
