[The Variance/Bias Tradeoff] The Hill Estimator, Kernel Density Estimation, Non-Parametric Regression

[1] The Hill Estimator
The Hill estimator is one way of estimating the tail index $\alpha$.
Suppose $X_{1}, X_{2},...,X_{n}$ are independent non-negative random variables.
* Heavy-tailed data: $1-F(x) = P(X_{i}>x)=x^{-\alpha}L(x)$,
where $\alpha(=\frac{1}{\gamma})>0$ is an unknown parameter (called the tail index) that describes the heaviness of the right tail, and $L(x)$ is a slowly varying function satisfying $\lim_{x\rightarrow \infty}\frac{L(tx)}{L(x)}=1$ for all $t>0$.
* The Hill estimator based on the $k$ largest order statistics $X_{(n)}\geq X_{(n-1)}\geq \cdots \geq X_{(1)}$ is $\hat{\gamma}_{k}=\frac{1}{k}\sum_{i=1}^{k}\log\frac{X_{(n-i+1)}}{X_{(n-k)}}$, giving $\hat{\alpha}_{k}=1/\hat{\gamma}_{k}$.
* Bias increases with k, variance decreases with k. The choice of k can be guided by a Hill plot, as in the sketch below.
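As a concrete illustration, here is a minimal Python sketch (the function name hill_estimator, the simulated Pareto sample with true $\alpha=2$, and the grid of k values are all my own choices for the example):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of alpha from the k largest order statistics."""
    x = np.sort(np.asarray(x, dtype=float))        # ascending order
    # gamma_hat = (1/k) * sum_{i=1}^{k} log( X_{(n-i+1)} / X_{(n-k)} )
    gamma_hat = np.mean(np.log(x[-k:] / x[-k - 1]))
    return 1.0 / gamma_hat                          # alpha_hat = 1 / gamma_hat

# Pareto-type data with true alpha = 2, i.e. 1 - F(x) = x^{-2} for x >= 1
rng = np.random.default_rng(0)
sample = (1.0 - rng.uniform(size=10_000)) ** (-1.0 / 2.0)

# A crude Hill plot: alpha_hat as a function of k
for k in (50, 100, 500, 2000):
    print(k, hill_estimator(sample, k))
```

Plotting $\hat{\alpha}_{k}$ against k and reading off the region where the estimates stabilize near a common value is exactly the Hill plot idea.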

[2] Kernel Density Estimation 
The density estimator: $\hat{f}_{h}(x)= \frac{1}{nh}\sum_{i=1}^n w\left(\frac{x-x_{i}}{h}\right)$, where $w$ is a symmetric probability density. 
* The bandwidth h: If h is too large, the estimator $\hat{f}$ is too smooth, whereas if h is too small, the estimator $\hat{f}$ is too noisy. Therefore, the bias and variance depend on the bandwidth h! 
* Remark) Since the bias depends on the unknown density $f$, the choice of the bandwidth h is complicated.
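A minimal sketch of this estimator in Python, assuming a Gaussian kernel for $w$ (the simulated data and the bandwidth grid are arbitrary choices for illustration):

```python
import numpy as np

def kde(x_grid, data, h):
    """f_hat_h(x) = (1/(n h)) * sum_i w((x - x_i)/h), Gaussian kernel w."""
    u = (np.asarray(x_grid)[:, None] - np.asarray(data)[None, :]) / h
    w = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)   # standard normal density
    return w.mean(axis=1) / h

rng = np.random.default_rng(1)
data = rng.normal(size=500)                # true density: standard normal
grid = np.linspace(-4.0, 4.0, 201)

for h in (0.05, 0.3, 1.5):                 # too small -> noisy; too large -> oversmoothed
    print(h, kde(grid, data, h).max())     # true peak is 1/sqrt(2*pi) ~ 0.399
```

Running it with the three bandwidths makes the tradeoff visible: the smallest h gives a jagged, high-variance estimate, while the largest flattens the peak (oversmoothing bias).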

 
[3] Non-Parametric Regression using Kernel Smoothing 
In non-parametric regression, our model is $y_{i}=g(x_{i})+ \varepsilon_{i}$.
We need inference about $g(x)$, a smooth function of $x$. We can estimate $g(x)$ by $\hat{g}(x)=\sum_{i\in S(x)} w_{i}(x)y_{i}$, where $S(x)= \{ i : |x_{i} -x |\leq h\}$ for a bandwidth h. Our $\hat{g}(x)$ will be a loess smoother, which uses weighted linear regression within each neighborhood. With a large bandwidth the estimated function is smoother but more severely biased; with a small bandwidth it is less smooth, indicating smaller bias but larger variance.
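A minimal sketch of this idea, using uniform weights $w_{i}(x)=1/|S(x)|$ on the window instead of the full weighted-linear loess fit (the sine example and the bandwidth grid are my own choices):

```python
import numpy as np

def local_average(x_grid, x, y, h):
    """g_hat(x0) = sum_{i in S(x0)} w_i(x0) * y_i, with S(x0) = {i : |x_i - x0| <= h}
    and uniform weights w_i(x0) = 1/|S(x0)| (a simplification of loess)."""
    g_hat = np.full(len(x_grid), np.nan)
    for j, x0 in enumerate(x_grid):
        in_window = np.abs(x - x0) <= h          # the neighborhood S(x0)
        if in_window.any():
            g_hat[j] = y[in_window].mean()
    return g_hat

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 300))
y = np.sin(x) + rng.normal(scale=0.3, size=300)  # y_i = g(x_i) + eps_i
grid = np.linspace(0.5, 2.0 * np.pi - 0.5, 5)

for h in (0.1, 0.5, 2.0):   # small h: low bias, high variance; large h: smooth but biased
    print(h, np.round(local_average(grid, x, y, h), 3))
```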

 

Expected Value & Variance & Covariance



| Expected Value

  • Discrete Case: $\mathsf {E(X)= \sum_{i} x_{i} \cdot P(X=x_{i})}$        
  • Continuous Case: $\mathsf {E(X)= \int_{-\infty}^{\infty} x \cdot f(x)dx }$ 
  • Properties   (a, b $ \in \mathbb{R}$)  
       If X$\geq$ 0, then E(X)$\geq$ 0         
       Proof 
       $\mathsf { E(X)=\sum_{x}x\cdot P(X=x)=\sum_{x>0} x\cdot P(X=x)\geq \sum_{x>0} 0 \cdot P(X=x)=0}$               
 
       E(aX) = a E(X) 
       Proof
       $\mathsf {E(aX)=\sum _{x}a\cdot x \cdot P(X=x)=a \sum_{x} x \cdot P(X=x)=a \cdot E(X) }$ 

        E(X+Y) = E(X) + E(Y)
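
A quick Monte Carlo check of these three properties (a sketch; the exponential and normal samples are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.exponential(scale=2.0, size=1_000_000)   # X >= 0, true E(X) = 2
Y = rng.normal(loc=1.0, size=1_000_000)          # true E(Y) = 1
a = 3.0

print(X.mean())                                  # E(X) >= 0, approx. 2
print((a * X).mean(), a * X.mean())              # E(aX) = a E(X)
print((X + Y).mean(), X.mean() + Y.mean())       # E(X+Y) = E(X) + E(Y)
```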


| Variance
  • $\mathsf{ Var(X)= E(X^2)- E(X)^2 }$ 
     Proof  (write $\mu = \mathsf{E(X)}$)
      $\mathsf{Var(X)= E[(X- E(X))^2] = E[(X- \mu)^2]= \sum_{x} (x-\mu)^2 \cdot P(X=x)}$ 
            $\mathsf{= \sum_{x}(x^2 - 2\mu x + \mu^2) \cdot P(X=x)} $
            $\mathsf{ = \sum_{x}x^2\cdot P(X=x)-2\mu \cdot \sum_{x}x\cdot P(X=x)+ \mu^2 \sum_{x}P(X=x)}$ 
            $\mathsf{=E(X^2)-2\mu^2+\mu^2 = E(X^2)-\mu^2 = E(X^2)-E(X)^2}$ 

  • Properties   (a, b $ \in \mathbb{R}$)  
        Var(a)=0  (a constant always takes the same value, so there is no variance) 

        Var(aX+b)= $a^2 \cdot$Var(X)
        Proof 
        From $\mathsf {E(aX+b)=aE(X)+b}$, and $\mathsf {E[(aX+b)^2]=a^2E(X^2)+2abE(X)+b^2}$ 
        $\mathsf {Var(aX+b)=E[(aX+b)^2]-[E(aX+b)]^2}$ 
                         $\mathsf {=a^2E(X^2)+2abE(X)+b^2-[aE(X)+b]^2}$ 
                         $\mathsf {=a^2E(X^2)+2abE(X)+b^2-[a^2(E(X))^2+2abE(X)+b^2]}$ 
                         $\mathsf {=a^2[E(X^2)-E(X)^2]=a^2\cdot Var(X)}$   
 
        Var(X+Y)= Var(X)+Var(Y)+ 2Cov(X,Y)
        Proof
        From $\mathsf {E(X+Y)=E(X)+E(Y)}$, and $\mathsf {E[(X+Y)^2]=E(X^2)+2E(XY)+E(Y^2)}$ 
        $\mathsf {Var(X+Y)=E[(X+Y)^2]-[E(X+Y)]^2}$
                          $\mathsf {=E(X^2)+E(Y^2)+2E(XY)-[ (E(X))^2 + (E(Y))^2 +2E(X)E(Y)] }$ 
                          $\mathsf {=E(X^2)-(E(X))^2+E(Y^2)-(E(Y))^2+2 [E(XY)-E(X)E(Y)] }$ 
                          $\mathsf {=Var(X)+Var(Y)+2Cov(X,Y)}$ 

       Var(X+Y)= Var(X)+Var(Y) if and only if X and Y are uncorrelated, i.e. Cov(X,Y)=0
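
These variance identities can be checked numerically the same way (a sketch; the correlated pair below is an arbitrary construction):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=1_000_000)
Y = 0.5 * X + rng.normal(size=1_000_000)         # Y is correlated with X
a, b = 3.0, 7.0

print(np.var(b * np.ones(10)))                   # Var(constant) = 0
print(np.var(a * X + b), a**2 * np.var(X))       # Var(aX+b) = a^2 Var(X)
cov_xy = np.cov(X, Y, ddof=0)[0, 1]
print(np.var(X + Y), np.var(X) + np.var(Y) + 2.0 * cov_xy)   # Var(X+Y) identity
```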



| Covariance
  • A measure of how much two random variables change together!
  • Cov(X,Y)=E{ [X-E(X)][Y-E(Y)] } = E(XY)-E(X)E(Y)
       

|If X and Y are independent, then 
  • P(X=x, Y=y)=P(X=x)P(Y=y)
  • E(XY)=E(X)E(Y)
     Proof
      $\mathsf {E(XY)=\sum_{x,y}xy \cdot P(X=x,Y=y)= \sum_{x}\sum_{y}xy \cdot P(X=x)P(Y=y)}$        
      $\mathsf {=\sum_{x} x\cdot P(X=x) \cdot \sum_{y}y \cdot P(Y=y)=E(X)E(Y)}$ 
      
  •  Cov(X,Y)=0
      Proof
      Cov(X,Y)=E(XY)-E(X)E(Y)=E(X)E(Y)-E(X)E(Y)=0
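
And a matching numerical check for the independent case (a sketch with two independently drawn normal samples):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=1_000_000)
Y = rng.normal(size=1_000_000)                   # drawn independently of X

print((X * Y).mean(), X.mean() * Y.mean())       # E(XY) approx. E(X)E(Y)
print(np.cov(X, Y, ddof=0)[0, 1])                # Cov(X,Y) approx. 0
```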