Showing posts with label Probability Examples. Show all posts

Independent Event Example 1 - Solution.

Example 1.) Mathematical Statistics by K.Knight. Chapter 1, 1.2
Suppose that A and B are independent events. Determine which of the following pairs of events are always independent and which are always disjoint.
(a) A and $B^c$
(c) $A^c$ and $B^c$  


Solution
(a) A and $B^c$
We need to check whether $P(A \cap B^c)=P(A)\cdot P(B^c)$ or not.
Since $A=(A \cap B) \cup (A \cap B^c)$ is a disjoint union, $P(A \cap B^c)=P(A)-P(A \cap B)=P(A)-P(A)\cdot P(B)$
         $=P(A)\left \{ 1-P(B) \right \} = P(A) \cdot P(B^c)$
Therefore, it's true! They are independent.

(c) $A^c$ and $B^c$  
We need to check whether $P(A^c \cap B^c)=P(A^c) \cdot P(B^c)$ or not.
We know $(A^c \cap B^c)=(A \cup B)^c$, $\Rightarrow P( (A \cup B)^c) = 1 - P(A\cup B)=1 - P(A)-P(B)+P(A)\cdot P(B)$
     $= 1 - P(A)+P(B)(P(A)-1)=(1-P(A))(1-P(B))=P(A^c) \cdot P(B^c)$
Therefore, it's true! They are independent.  
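We can sanity-check both results numerically. A minimal sketch (Python stdlib only; the probabilities 0.3 and 0.6 and the event construction are arbitrary choices for illustration):

```python
import random

random.seed(0)
N = 100_000

# A and B are independent by construction: two separate random draws.
trials = [(random.random() < 0.3, random.random() < 0.6) for _ in range(N)]

p_a = sum(a for a, b in trials) / N
p_bc = sum(not b for a, b in trials) / N
p_a_and_bc = sum(a and not b for a, b in trials) / N

p_ac = 1 - p_a
p_ac_and_bc = sum((not a) and (not b) for a, b in trials) / N

# Joint frequencies should match the products up to sampling error.
print(abs(p_a_and_bc - p_a * p_bc))    # small
print(abs(p_ac_and_bc - p_ac * p_bc))  # small
```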



Independent Event
Definition) Two events A, and B are independent if and only if
$P(A\cap B)=P(A)\cdot P(B)$
$P(A|B)= \frac{ P(A \cap B)}{P(B)}=P(A)$, where $P(B) > 0$
 
Example 1.) Mathematical Statistics by K.Knight. Chapter 1, 1.2
Suppose that A and B are independent events. Determine which of the following pairs of events are always independent and which are always disjoint.
(a) A and $B^c$
(c) $A^c$ and $B^c$  
Solution??!!
 

Example) about MLE, Pivot, N-P Lemma

Example) University of Toronto STA355 2013 Final Test Q1.

Suppose that $X_{1}, X_{2},...X_{n}$ are independent exponential random variables with density $f(x;\lambda)=\lambda\cdot\exp(-\lambda\cdot x)$ for $x \geq 0$, $\lambda > 0$ 

a) Find the MLE of $\lambda$, and find the limiting distribution of $\sqrt{n}(\hat{\lambda_{n}}-\lambda)$.

b) A pivot for $\lambda$ is $2\lambda \sum_{i=1}^{n}X_{i}\sim \chi^2_{2n}$.
Show how you can use this pivot to construct a CI for $\lambda$. 

c) Test $H_{0}: \lambda =1$ vs. $H_{1}: \lambda > 1$ using the test statistic $T=2\sum_{i=1}^{n}X_{i}$.
For an alpha level test, for what values of T would reject $H_{0}$?
=================================================================================
Solution 
a) * First the MLE of $\lambda$,
  Likelihood, $L(\lambda)=\prod_{i=1}^{n} f(x_{i};\lambda)= \lambda^n \cdot \exp(-\lambda \cdot \sum_{i=1}^{n}x_{i})$
  Log likelihood, $l(\lambda)= n \log \lambda - \lambda \sum_{i=1}^{n}x_{i}$
  Taking a derivative w.r.t. $\lambda$, $l'(\lambda)= \frac{n}{\lambda} - \sum_{i=1}^{n}x_{i}=0$
  Therefore, $\hat{\lambda_{n}}= \frac{1}{\bar{X}}$ 
  * Limiting distribution of $\sqrt{n}(\hat{\lambda_{n}}-\lambda)$
  $I(\lambda)=Var(\frac{d}{d\lambda}\log f(X_{1}; \lambda))=Var(\frac{1}{\lambda}-X_{1}) =Var(X_{1})=\frac{1}{\lambda^2}$ 
  Therefore, $\sqrt{n}(\hat{\lambda_{n}}-\lambda) \rightarrow N(0, I(\lambda)^{-1})=N(0, \lambda^2)$ in distribution.
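By the standard MLE asymptotics, $\sqrt{n}(\hat{\lambda_{n}}-\lambda)$ is approximately $N(0, \lambda^2)$. A quick simulation sketch to check the spread (stdlib only; $\lambda=2$, n=200, and the number of replications are arbitrary choices):

```python
import math
import random

random.seed(1)
lam, n, reps = 2.0, 200, 2000

zs = []
for _ in range(reps):
    xs = [random.expovariate(lam) for _ in range(n)]
    lam_hat = 1 / (sum(xs) / n)          # MLE = 1 / sample mean
    zs.append(math.sqrt(n) * (lam_hat - lam))

mean = sum(zs) / reps
sd = math.sqrt(sum((z - mean) ** 2 for z in zs) / reps)
print(sd)  # should be close to lambda = 2
```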

b) In this question, follow three steps;
1) Find a pivot statistic ($T=h(\theta; X_{i})$), From the question, we know $2\lambda \sum_{i=1}^{n}X_{i}\sim \chi^2_{2n}$.
2) Pick a & b such that $P(a < \chi_{2n}^2 < b)=1-2\alpha$, e.g. the $\alpha$ and $1-\alpha$ quantiles of $\chi_{2n}^2$.
3) And rearrange the 2nd step w. r. t. a parameter given a distribution from the question. 
  $P(\frac{a}{2\sum X_{i}} < \lambda <\frac{b}{2\sum X_{i}})= 1-2\alpha$. Therefore, a CI for $\lambda$ is $(\frac{a}{2\sum X_{i}}, \frac{b}{2\sum X_{i}})$.
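A sketch of the three steps in code, assuming scipy is available for the $\chi^2$ quantiles (the sample values are made up for illustration):

```python
# Pivot-based CI for lambda: 2*lambda*sum(X_i) ~ chi^2_{2n}
from scipy.stats import chi2

alpha = 0.05
x = [0.3, 1.1, 0.7, 2.4, 0.5, 1.9, 0.2, 0.8]   # hypothetical sample
n, s = len(x), sum(x)

# a, b: alpha and 1-alpha quantiles of chi^2_{2n}, so coverage is 1 - 2*alpha
a = chi2.ppf(alpha, 2 * n)
b = chi2.ppf(1 - alpha, 2 * n)

lower, upper = a / (2 * s), b / (2 * s)
print((lower, upper))
```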

c) We need to use the N-P lemma! 
 For any $\lambda_{1}>1$, the likelihood ratio is $\frac{f(x_{1}, x_{2},...,x_{n}; \lambda_{1})}{f(x_{1}, x_{2},...,x_{n}; 1)}$ = $\lambda_{1}^n \cdot \exp((1-\lambda_{1})\sum X_{i})$,
 which is a decreasing function of $\sum_{i=1}^nX_{i}$ (as $1-\lambda_{1}<0$).

 Therefore, the most powerful $\alpha$ level test rejects the null hypothesis when $T < c_{\alpha}$,
 where $c_{\alpha}$ is the $\alpha$ quantile of $\chi_{2n}^2$.
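The resulting decision rule, sketched in code (again assuming scipy for the quantile; the data are hypothetical):

```python
from scipy.stats import chi2

alpha = 0.05
x = [0.3, 1.1, 0.7, 2.4, 0.5, 1.9, 0.2, 0.8]   # hypothetical sample
n = len(x)

T = 2 * sum(x)                      # under H0 (lambda = 1), T ~ chi^2_{2n}
c_alpha = chi2.ppf(alpha, 2 * n)    # alpha quantile of chi^2_{2n}

reject = T < c_alpha                # small T favors lambda > 1
print(reject)
```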
 

Hypothesis Testing - Likelihood Ratio Test Example (1)


Example) Mathematical Statistics by Keith Knight, Chapter 7-19

Let $X_{1}, X_{2},...X_{n}$ be iid N$(\mu,\sigma^2)$. Both are unknown. 
We want to test $H_{0}: \mu=\mu_{0}$ vs. $H_{1}: \mu \neq \mu_{0}$.   

$\triangleright$ Solution.
As we know, the MLEs of $\mu$ and $\sigma ^2$ are the following: $\widehat{\mu}=\bar{X}$, and $\widehat{\sigma^2}=\frac{1}{n}\sum(X_{i}-\bar{X})^2$.
Under the $H_{0}$, the MLE of $\widehat{\sigma^2_{0}}= \frac{1}{n}\sum(X_{i}-\mu_{0})^2$.

$f(x_{1},... x_{n};\mu, \sigma^2)= \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\exp[-\frac{(x_{i}-\mu)^2}{2\sigma^2}]= (2\pi\sigma^2)^{-\frac{n}{2}} \exp[-\frac{\sum(X_{i}-\mu)^2}{2\sigma^2}]$

* We know the Likelihood Ratio test will reject $H_{0}$ when $\Lambda$ $\geq$ k,
where $\Lambda = (\frac{\widehat{\sigma^2_{0}}}{\widehat{\sigma^2}})^\frac{n}{2}$, and k is chosen so that the test has the specified level $\alpha$.

$\frac{\widehat{\sigma^2_{0}}}{\widehat{\sigma^2}}=\frac{\sum(X_{i}-\mu_{0})^2}{\sum(X_{i}-\bar{X})^2} = 1+\frac{n(\bar{x}-\mu_{0})^2}{\sum(X_{i}-\bar{x})^2}$

* Why? $\sum ((X_{i}-\bar{x})+(\bar{x}-\mu_{0}))^2=\sum(X_{i}-\bar{x})^2+n(\bar{x}-\mu_{0})^2$ !!
* But Wait!! We can use the T distribution, where $T=\frac{\sqrt{n}(\bar{x}-\mu_{0})}{S}$, where $S^2= \frac{1}{n-1}\sum(X_{i}-\bar{x})^2$!
  
Therefore, we can rearrange the equation: $\frac{\widehat{\sigma^2_{0}}}{\widehat{\sigma^2}}=1+\frac{1}{n-1}\cdot\frac{n(\bar{x}-\mu_{0})^2}{S^2}=1+\frac{T^2}{n-1}$
Therefore, when $H_{0}$ is true, T has a Student's t distribution with n-1 degrees of freedom.

Now that we know the distribution of $\Lambda$ (through T), we can define the rejection region.
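The algebraic identity $\widehat{\sigma^2_{0}}/\widehat{\sigma^2}=1+T^2/(n-1)$ can be checked numerically. A stdlib sketch ($\mu_0$ and the simulated sample are arbitrary choices):

```python
import math
import random

random.seed(2)
mu0 = 1.0
x = [random.gauss(1.5, 2.0) for _ in range(30)]
n = len(x)
xbar = sum(x) / n

sigma2_hat = sum((xi - xbar) ** 2 for xi in x) / n        # MLE under H1
sigma2_0_hat = sum((xi - mu0) ** 2 for xi in x) / n       # MLE under H0
S2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)          # sample variance

T = math.sqrt(n) * (xbar - mu0) / math.sqrt(S2)
lhs = sigma2_0_hat / sigma2_hat
rhs = 1 + T ** 2 / (n - 1)
print(abs(lhs - rhs))  # ~0 up to floating point error
```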

Uniform Distribution

Continuous Random Variable 
X~U(a,b): all intervals of the same length within the range (a,b) are equally probable!


$\star F(x)= \frac{x-a}{b-a}$ where $a < x < b$  
$\star f(x)=\frac{1}{b-a}\cdot I_{(a,b)}(X)$ 

$\star E(X)= \frac{a+b}{2}$, $Var(X)=\frac{(b-a)^2}{12}$
Proof
$E(X)=\int_{a}^{b}x \cdot \frac{1}{b-a} dx= \frac{x^2}{2(b-a)}|_{a}^b = \frac{b^2-a^2}{2(b-a)}=\frac{a+b}{2}$
$E(X^2)=\int_{a}^{b}x^2 \cdot \frac{1}{b-a} dx= \frac{x^3}{3(b-a)}|_{a}^b = \frac{b^3-a^3}{3(b-a)}=\frac{b^2+ab+a^2}{3}$
$Var(X)=\frac{a^2+ab+b^2}{3}-(\frac{a+b}{2})^2=\frac{(b-a)^2}{12}$
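A quick numerical check of these two formulas with a midpoint Riemann sum (stdlib only; a=2, b=5 are arbitrary choices):

```python
a, b = 2.0, 5.0
m = 100_000
dx = (b - a) / m

# E(X) and E(X^2) as midpoint Riemann sums of x*f(x) and x^2*f(x)
ex = sum((a + (i + 0.5) * dx) * dx / (b - a) for i in range(m))
ex2 = sum((a + (i + 0.5) * dx) ** 2 * dx / (b - a) for i in range(m))
var = ex2 - ex ** 2

print(ex, var)  # close to (a+b)/2 = 3.5 and (b-a)^2/12 = 0.75
```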



Method of Moments in Uniform Distribution 
$\star$ Moment Generating Function m(t) is... (for X~U(0,$\theta$))
$m(t)=\int_{0}^{\theta}e^{tx}\cdot f(x)dx=\int_{0}^{\theta} e^{tx}\cdot \frac{1}{\theta}dx = \frac{1}{\theta t}e^{tx} \Big|^{\theta}_{0}= \frac{e^{\theta t}-1}{\theta t}$, for $t \neq 0$ 

Example) Mathematical Statistics and Data Analysis, 3ED, Chapter 8. Q53(a)
Find the method of moments estimate of $\theta$ and its mean and variance. 
Solution??!!

Binomial Distribution Example_using generating function

Example) 

Let Y ~ Binomial(15, $\frac {1}{3}$). Evaluate Var(Y).  
(We know variance of binomial distribution is npq. However what if we don't know this formula?) 

$\triangleright$ Think First 
$Var(Y)=E(Y^2)-[E(Y)]^2$. So we need the second moment, which means we need to use generating function. 

$\triangleright$ Solution
By using generating function, $G(z)=E(z^Y)=(q+pz)^n$
$G'(z)=E(Y \cdot z^{Y-1})=n \cdot (q+pz)^{n-1}\cdot p$ 
So when z=1, $G'(1)=E(Y)=np$ 

$G''(z)=E(Y (Y-1) \cdot z^{Y-2})=n (n-1)\cdot (q+pz)^{n-2}\cdot p^2$, 
so when z=1, $G''(1)=E(Y(Y-1))=E(Y^2)-E(Y)=n(n-1)p^2$  

$Var(Y)=n(n-1)p^2+np-(np)^2=n^2p^2-np^2+np-n^2p^2=npq \because(q=1-p)$ 

$\therefore Var(Y)=15 \cdot \frac{1}{3} \cdot \frac{2}{3}=\frac {30}{9}=\frac{10}{3}$ 
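The same moments can be computed directly from the pmf, mirroring $G'(1)$ and $G''(1)$ (stdlib sketch):

```python
from math import comb

n, p = 15, 1 / 3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

ey = sum(k * pmf[k] for k in range(n + 1))               # G'(1)  = np
eyy1 = sum(k * (k - 1) * pmf[k] for k in range(n + 1))   # G''(1) = n(n-1)p^2
var = eyy1 + ey - ey**2                                  # E(Y^2) - E(Y)^2

print(var)  # 10/3 = npq
```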

Poisson Distribution Example_Likelihood Ratio testing_1

Example) Mathematical Statistics and Data Analysis 3ED, Chapter 9, Q7. 

Let $X_{1}, \cdots ,X_{n}$ be a sample from a Poisson distribution. Find the likelihood ratio for testing $H_{0}: \lambda = \lambda_{0}$ versus $H_{A}: \lambda = \lambda_{1}$, where $\lambda_{1}> \lambda_{0}$. Use the fact that the sum of independent Poisson random variables follows a Poisson distribution to explain how to determine a rejection region for a test at level $\alpha$   


$\triangleright$ Think First
The LRT rejects $H_{0}\Leftrightarrow \frac {L(data|H_{0})}{L(data|H_{1})} < C$ 

If $X_{1}$ ~Poisson($\lambda_{1}$ ) and $X_{2}$ ~ Poisson($\lambda_{2}$), and $X_{1}$ & $X_{2}$ independent. 
then ($X_{1}$+$X_{2}$) ~ Poisson ($\lambda_{1}$+$\lambda_{2}$)
Thus, $\sum x_{i}$ ~ (under $H_{0}$) Poisson ($n \cdot \lambda_{0}$)

$\triangleright$ Solution
First, the likelihood ratio is...
$\rightarrow \frac {L(data|H_{0})}{L(data|H_{1})}=\prod \frac {e^{-\lambda_{0}}\lambda_{0}^{x_{i}}}{x_{i}!} / \prod \frac {e^{-\lambda_{1}}\lambda_{1}^{x_{i}}}{x_{i}!} = e^{n(\lambda_{1}-\lambda_{0})} \cdot (\frac {\lambda_{0}}{\lambda_{1}})^{\sum x_{i}}$ 

Second, LRT is...
Reject $H_{o}\Leftrightarrow$ $e^{n(\lambda_{1}-\lambda_{0})} \cdot (\frac {\lambda_{0}}{\lambda_{1}})^{\sum x_{i}}$ < C
                  $(\sum x_{i}) \ln \frac {\lambda_{0}}{\lambda_{1}} < \ln C-n(\lambda_{1}-\lambda_{0})$ 
                   $\sum x_{i} > \frac {\ln C-n(\lambda_{1}-\lambda_{0})}{\ln \lambda_{0}-\ln \lambda_{1}}$, some constant we again call C. (The inequality flips because $\ln \frac {\lambda_{0}}{\lambda_{1}}< 0$.)

Thus, reject $H_{o}\Leftrightarrow \sum x_{i}> C$ 

Finally significant level $\alpha$ is...
$\alpha$ = P(reject $H_{0}$ | $H_{0}$ ) = $P(\sum x_{i} > C | \lambda = \lambda_{0})$ 

So, we have the following eqtn on C 
$\alpha = P(Y> C)$ where Y~ Poisson ($n \cdot \lambda_{0}$)
$\alpha = 1-F_{(n \cdot \lambda_{0})}(C)$, where $F_{(n \cdot \lambda_{0})}$ is the CDF of Y. 

$\therefore C= F^{-1}_{n\cdot \lambda_{0}}(1-\alpha)$  
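In code, C can be found by accumulating the Poisson($n\lambda_{0}$) pmf until the upper tail drops below $\alpha$ (stdlib sketch; n=10, $\lambda_{0}=1$, $\alpha=0.05$ are illustrative choices):

```python
from math import exp

alpha, n, lam0 = 0.05, 10, 1.0
mu = n * lam0          # under H0, Y = sum(x_i) ~ Poisson(mu)

# Smallest integer k with P(Y > k) <= alpha, via the pmf recursion
cdf, k, pmf = 0.0, 0, exp(-mu)
while True:
    cdf += pmf         # cdf = F(k)
    if 1 - cdf <= alpha:
        break
    k += 1
    pmf *= mu / k

print(k)  # reject H0 when sum(x_i) > k
```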

Poisson Distribution Example_Sufficient

Example)

Let $X_{1}, X_{2}$ be a random sample of size 2 from a Poisson distribution, $f(x_{1})=\frac {\lambda^{x_{1}} e^{-\lambda}}{x_{1}!}$. Show that $T=X_{1}+ X_{2}$ is a sufficient statistic for $\lambda$. 

$\triangleright$  Solution
The joint distribution of the random sample is $\prod_{i=1}^{2}f(x_{i}|\lambda)=\frac {\lambda^{x_{1}+x_{2}}}{x_{1}!x_{2}!} \cdot e^{-2\lambda}$ 

The joint probability of $X_{1}=x_{1}, X_{2}=x_{2}$ and $T=X_{1}+X_{2}=t$ is
$\rightarrow f(x_{1}, x_{2}, t|\lambda)=\frac {\lambda^t}{x_{1}!x_{2}!} \cdot e^{-2\lambda}$ 

We know $T=X_{1}+X_{2}$ has a Poisson distribution with parameter $2\lambda$
$\rightarrow g(t|\lambda)=\frac {(2\lambda)^t}{t!}\cdot e^{-2\lambda}$  

Consequently, the conditional distribution of the sample, given T=t is
$\rightarrow f(x_{1}, x_{2}|T=t; \lambda)=\frac {f(x_{1}, x_{2}, t|\lambda)}{g(t|\lambda)}=\frac {\lambda^t}{x_{1}!x_{2}!}\cdot e^{-2\lambda}/ \frac {(2\lambda)^t}{t!}\cdot e^{-2\lambda}=\frac{t!}{x_{1}!x_{2}!}\cdot \left(\frac{1}{2}\right)^t=\binom{t}{x_{1}}\left(\frac{1}{2}\right)^t$ 
$\therefore$  This does not depend on the parameter $\lambda$, as the $\lambda$ cancels out, so T is sufficient. 
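The conditional probability works out to $\binom{t}{x_{1}}(\frac{1}{2})^t$; a quick check that it comes out the same for different $\lambda$ (stdlib sketch; the $\lambda$ values are arbitrary):

```python
from math import comb, exp, factorial

def conditional(x1, x2, lam):
    """P(X1=x1, X2=x2 | T=x1+x2) for a size-2 Poisson(lam) sample."""
    t = x1 + x2
    joint = lam**t / (factorial(x1) * factorial(x2)) * exp(-2 * lam)
    marginal_t = (2 * lam) ** t / factorial(t) * exp(-2 * lam)
    return joint / marginal_t

c1 = conditional(3, 2, 0.7)
c2 = conditional(3, 2, 4.2)
binom_form = comb(5, 3) * 0.5**5

print(c1, c2, binom_form)  # all three are equal
```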

Poisson Distribution Example_MLE

Example) Mathematical Statistics and Data Analysis, 3ED, Chapter 8. Q3. 

One of the earliest applications of the Poisson distribution was made by Student(1907) in studying errors made in counting yeast cells or blood corpuscles with a haemacytometer. In this study, yeast cells were killed and mixed with water and gelatin; the mixture was then spread on a glass and allowed to cool. Four different concentrations were used. Counts were made on 400 squares and the data are summarized in the following table; (we deal with only one data set here) 

(#cells, Concentration 2)
= (0, 103) (1, 143) (2, 98) (3, 42) (4, 8) (5, 4) (6, 2) (7, 0) (8, 0) (9, 0) (10, 0) (11, 0) (12, 0)

a) Find the log-likelihood for $\lambda$. 
b) Find the maximum likelihood estimate of $\lambda$. 
c) Calculate the maximum likelihood estimate from the data.
d) Find an approximate 95% confidence interval for $\lambda $

$P(Y=y)= \frac {e^{-\lambda} \lambda^y}{y!}$ 

$\triangleright$ Solution (a)
$l(\lambda) = \sum y_{i} \cdot \log \lambda - n\lambda - \sum \log(y_{i}!)$ 

$\triangleright$ Solution (b)
$l'(\lambda) = \frac {\sum y_{i}}{\lambda} -n=(set)0\rightarrow \hat {\lambda}=\frac {\sum y_i}{n} = \bar{y}$ 

$\triangleright$ Solution (c)
$\hat{\lambda}=\bar{y}= \frac {529}{400}=1.3225$ 

$\triangleright$ Solution (d)
We can find a variance by using Fisher information: 
$- \frac {d^2}{d\lambda^2}[\sum y_i \log \lambda-n\lambda]=\frac {\sum y_{i}}{\lambda^2}$; evaluating at $\hat {\lambda}=\bar{y}$ gives $\frac {n^2}{\sum y_{i}}$, so $\sigma ^2_{\hat{\lambda}} \approx \frac {\sum y_{i}}{n^2}$ 
$\hat {\lambda} \pm Z_{0.975} \frac {\sqrt{\sum y_{i}}}{n} = 1.3225 \pm1.96 \cdot \frac {23}{400}=(1.210, 1.435)$ 
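The arithmetic in (c) and (d) can be reproduced from the table (stdlib sketch):

```python
from math import sqrt

# Concentration-2 counts: {number of cells: frequency}
counts = {0: 103, 1: 143, 2: 98, 3: 42, 4: 8, 5: 4, 6: 2}
n = sum(counts.values())                       # 400 squares
total = sum(k * f for k, f in counts.items())  # 529 cells in total

lam_hat = total / n          # MLE = sample mean = 1.3225
se = sqrt(total) / n         # sqrt(sum y_i) / n = 23/400
ci = (lam_hat - 1.96 * se, lam_hat + 1.96 * se)
print(lam_hat, ci)
```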

Poisson Distribution

Discrete Random Variable 
: X~Poisson ($\lambda$): the number of events occurring in a fixed time interval at rate $\lambda$. We assume that the times between events are independent of one another! 
$\bigstar P(X=k)= e^{-\lambda} \cdot \frac {\lambda^k}{k!}$, k=0,1,...,  0 $\leqslant$  $\lambda < \infty$ 

$\bigstar$ Moment Generating Function, m(t)=$e^{\lambda(e^t-1)}$ 
Proof 
m(t)=$E[e^{tX}]=\sum_{x=0}^{\infty}e^{tx}\cdot e^{-\lambda}\cdot \frac {\lambda^x}{x!}= e^{-\lambda}\sum_{x=0}^{\infty} \frac {e^{tx}\lambda^x}{x!}$ 
               (tip!! $\sum p(x) = 1 \rightarrow \sum_{x=0}^{\infty}e^{-\lambda}\cdot \frac{\lambda^x}{x!}=1 \rightarrow e^{-\lambda} \cdot \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}=e^{-\lambda}\cdot e^{\lambda}=1$) 
                = $e^{-\lambda}\cdot \sum_{x=0}^{\infty}\frac{(e^t\lambda)^x}{x!}=e^{-\lambda}\cdot e^{e^t \cdot \lambda}=e^{\lambda(e^t-1)}$  

$\bigstar$ E(X)=$\lambda$, Var(X)=$\lambda$ 
Proof 
$\rightarrow$ by using the mgf, we need E[X]=m'(0). Write $m(t)=e^u$, where $u=\lambda \cdot(e^t-1)$  
$m'(t)= \frac{dm}{du}\cdot \frac{du}{dt}=e^u \cdot\lambda e^t=e^{\lambda(e^t-1)}\cdot \lambda e^t$ $\therefore m'(0)=e^0 \cdot \lambda e^0= \lambda$ 

$\rightarrow$ For finding the variance, we need $E[X^2]=m''(0)$
$m'(t)=e^{\lambda(e^t-1)}\cdot \lambda e^t$, $m''(t)=\lambda e^t \cdot e^{\lambda(e^t-1)} + \lambda e^t \cdot e^{\lambda(e^t-1)}\cdot \lambda e^t = \lambda e^t e^{\lambda(e^t-1)}(1+\lambda e^t)$ $\rightarrow m''(0)=\lambda(\lambda+1)$ 
$\therefore Var(X)=E(X^2)-[E(X)]^2=\lambda^2+\lambda - \lambda^2=\lambda$ 

Maximum Likelihood Estimate (MLE) in Poisson Distribution 
Proof
$L(\lambda)=\prod_{i=1}^{n}(\frac {\lambda^{k_{i}}\cdot e^{-\lambda}}{k_{i}!})= L(X_1, X_2,...,X_n|\lambda)$ 
$l(\lambda)=-n\lambda+ \sum x_{i}\log \lambda-\log(\prod x_{i}!)$
$l'(\lambda)=-n + \frac {\sum x_{i}}{\lambda}=(set)0$ $\rightarrow n\lambda=\sum x_{i}\rightarrow \lambda= \frac{\sum x_{i}}{n}$ 
$\hat{\lambda}=\frac{1}{n} \sum x_{i}=\bar{x}$  

So the expected value and variance of $\hat{\lambda}$ is...
$E[\bar{x}]=E[\frac{\sum x_{i}}{n}]=\frac {1}{n}\cdot n\cdot E[x]=\lambda$ 
$Var[\bar{x}]=Var[\frac{\sum x_{i}}{n}]=\frac {1}{n^2}\cdot \sum Var(x_{i})= \frac {\lambda}{n}$ 



Example) Mathematical Statistics and Data Analysis, 3ED, Chapter 8. Q3. 
One of the earliest applications of the Poisson distribution was made by Student(1907) in studying errors made in counting yeast cells or blood corpuscles with a haemacytometer. In this study, yeast cells were killed and mixed with water and gelatin; the mixture was then spread on a glass and allowed to cool. Four different concentrations were used. Counts were made on 400 squares and the data are summarized in the following table; (we deal with only one data set here) 
(#cells, Concentration 2)
= (0, 103) (1, 143) (2, 98) (3, 42) (4, 8) (5, 4) (6, 2) (7, 0) (8, 0) (9, 0) (10, 0) (11, 0) (12, 0)
a) Find the log-likelihood for $\lambda$. 
b) Find the maximum likelihood estimate of $\lambda$. 
c) Calculate the maximum likelihood estimate from the data.
d) Find an approximate 95% confidence interval for $\lambda $

$\bigstar$ Sufficient Statistics 
Proof
$X_{1}, \cdots, X_n \sim$ Poisson,  $ P(X_{1}=x_{1}, \cdots, X_{n}=x_{n})= \frac {\lambda^{\sum x_{i}} e^{-n\lambda}}{\prod x_{i}!}$ 
$\rightarrow \lambda^{\sum x_{i}}e^{-n\lambda}\cdot \frac {1}{\prod x_{i}!}$,  where $g(\sum x_{i}, \lambda) = \lambda^{\sum x_{i}}e^{-n\lambda}$, $h(X_{1}, \cdots , X_{n})= \frac {1}{\prod x_{i}!}$ 

Sufficient Statistics Example
Let $X_{1}, X_{2}$ be a random sample of size 2 from a Poisson distribution, $f(x_{1})=\frac {\lambda^{x_{1}} e^{-\lambda}}{x_{1}!}$. Show that $T=X_{1}+ X_{2}$ is a sufficient statistic for $\lambda$.
Solution??!! 


$\bigstar$ Poisson Distribution is a part of exponential Family 
Proof
$p_{\theta}(x)=e^{-\theta} \cdot \frac {\theta^x}{x!}= \exp [x \log \theta - \theta - \log x!]$, where $T(x)=x$, $c(\theta)=\log \theta$, $d(\theta)=-\theta$, $s(x)=-\log x!$ 




$\bigstar$ Likelihood Ratio Testing
Example) Mathematical Statistics and Data Analysis 3ED, Chapter 9, Q7. 
Let $X_{1}, \cdots ,X_{n}$ be a sample from a Poisson distribution. Find the likelihood ratio for testing $H_{0}: \lambda = \lambda_{0}$ versus $H_{A}: \lambda = \lambda_{1}$, where $\lambda_{1}> \lambda_{0}$. Use the fact that the sum of independent Poisson random variables follows a Poisson distribution to explain how to determine a rejection region for a test at level $\alpha$   

Geometric Distribution Example_1

Example) Mathematical Statistics and Data Analysis, 3ED, Chapter8. Q8

In an ecological study of the feeding behavior of birds, the number of hops between flights was counted for several birds. For the following data, (a) fit a geometric distribution, (b) find an approximate 95% confidence interval for p. 
(# Hops, Frequency)= (1, 48) (2, 31) (3, 20) (4, 9) (5, 6) (6,5) (7,4) (8,2) (9,1) (10,1) (11,2) (12,1)

If X follows a Geometric distribution, then $P(X=k)=p(1-p)^{k-1}$, k=1,2,3... 

$\triangleright$ Solution (a)
$\hat{p}= \frac {1}{\bar{x}}$, $f(k)= N \cdot P(X=k)$
$\rightarrow \hat{p}= \frac {\sum f_{i}}{\sum f_i \cdot x_i} = \frac {130}{363}=0.3581$  

$\triangleright$ Solution (b)
The easiest way is to use the MLE and get the estimated variance from the curvature of the likelihood function. 
$L(p)=\prod_{i=1}^{n}p(1-p)^{x_{i}-1}$ 
$l(p)=\sum [\log p + (x_{i}-1) \log (1-p)]=n \log p - n \log(1-p)+ \log(1-p)\sum x_{i}$
$l^{(1)}(p)= \frac {n}{p}+ \frac {n}{1-p}- \frac {1}{1-p}\sum x_{i}= \frac {n}{p(1-p)}-\frac{1}{1-p}\sum x_{i}$ 
$l^{(2)}(p)= \frac {-n}{p^2}+ \frac {n}{(1-p)^2}- \frac {1}{(1-p)^2}\sum x_{i}= \frac {n}{p^2(1-p)^2}[-(1-p)^2+p^2-\bar{x}p^2]$ (b/c $\sum x_{i}=n \bar{x}$ ) 
Here, in order to get the asymptotic standard error of the MLE, we evaluate the 2nd derivative at the MLE, i.e. at $\bar {X}= \frac {1}{\hat{p}}$ 
$\rightarrow l^{(2)}(\hat{p})= \frac {n}{\hat {p}^2(1-\hat {p})^2}[-(1-\hat{p})^2+ \hat{p}^2- \hat {p}]=\frac {-n}{\hat{p}^2(1-\hat{p})}$ $\rightarrow Var(\hat{p})\approx \frac {\hat{p}^2(1-\hat{p})}{n}$  

$\therefore$ 95% CI for p
= $\hat {p}$ $\pm$ $Z_{0.975}\cdot\sqrt{\frac {\hat{p}^2(1-\hat{p})}{n}}=0.36 \pm$ $1.96 \cdot \sqrt{\frac{(0.36)^2(1-0.36)}{130}}=(0.31, 0.41)$ 
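Reproducing the estimate and CI from the hop-count table (stdlib sketch):

```python
from math import sqrt

# {number of hops: frequency}
freq = {1: 48, 2: 31, 3: 20, 4: 9, 5: 6, 6: 5, 7: 4, 8: 2, 9: 1, 10: 1, 11: 2, 12: 1}
n = sum(freq.values())                       # 130 observations
total = sum(k * f for k, f in freq.items())  # 363 hops in total

p_hat = n / total                            # 1 / sample mean = 130/363
se = sqrt(p_hat**2 * (1 - p_hat) / n)        # from the curvature of l(p)
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)
print(p_hat, ci)
```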


Geometric Distribution

Discrete Random Variable

$\bigstar$ X~Geom(p) : the # of Bernoulli trials needed to get ONE success, each with success probability p.

$\bigstar$ $P(X=k)=(1-p)^{k-1}\cdot p$,  k=1,2,3...
$\bigstar$ E(X)= $\frac {1}{p}$, Var(x)= $\frac {1-p}{p^2}= \frac {q}{p^2}$ 


Method of Moments(MoM) in Geometric Distribution
Proof
$M(t)=E(e^{tX})=\sum_{x=1}^{\infty}e^{tx}\cdot P(X=x)= p \sum_{x=1}^{\infty}e^{tx} \cdot (1-p)^{x-1}\cdot$ $ \frac {e^t}{e^t}$
          =  $p\cdot e^t \sum_{x=1}^{\infty}e^{t(x-1)}\cdot (1-p)^{x-1}=p \cdot e^t \sum_{x=1}^{\infty}(e^t \cdot (1-p))^{x-1}$  $\leftarrow$ Geometric Series!
        (we multiply by   $ \frac {e^t}{e^t}$ to make a geometric series)
        = $p \cdot e^t \sum_{k=0}^{\infty} (e^t(1-p))^k = \frac {p\cdot e^t}{1-e^t(1-p)}$, for $|e^t(1-p)| < 1$ 

$M^{(1)}(t)= \frac {d}{dt} \frac {p\cdot e^t}{1-e^t(1-p)}= \frac {p \cdot e^t[1-e^t(1-p)]+p \cdot e^t \cdot e^t(1-p)}{[1-e^t(1-p)]^2}$
$M^{(1)}(0)= \frac {p \cdot[1-(1-p)]+(1-p)p}{[1-(1-p)]^2}= \frac {p^2+p-p^2}{p^2} = \frac {p}{p^2}= \frac {1}{p} =\mu_{1}$  

$p= \frac {1}{\mu_{1}}$, $\therefore \hat{p}= \frac {1}{\hat{\mu_{1}}}$ 


Maximum Likelihood Estimate (MLE) in Geometric Distribution 
Proof
$L(p)=\prod P(Y_{i}=y_{i}|p)=p^n \cdot (1-p)^{\sum{(y_{i}-1)}}$
$l(p)=n \log p + \sum (y_{i}-1) \cdot \log(1-p)$ 
$l'(p)= \frac {n}{p}- \frac{\sum (y_{i}-1)}{1-p}=(set)0$ $\rightarrow n(1-p)=p \cdot \sum (y_i-1) \rightarrow \hat{p}= \frac {1}{\bar{y}}$ 



Example) Mathematical Statistics and Data Analysis, 3ED, Chapter8. Q8
In an ecological study of the feeding behavior of birds, the number of hops between flights was counted for several birds. For the following data, (a) fit a geometric distribution, (b) find an approximate 95% confidence interval for p. 
(# Hops, Frequency)= (1, 48) (2, 31) (3, 20) (4, 9) (5, 6) (6,5) (7,4) (8,2) (9,1) (10,1) (11,2) (12,1)
Solution??!! 

Binomial Distribution Example_Hypothesis Testing 2

Example) I have no idea where I found this example, Sorry! 

An experimenter has prepared a drug dosage level that she claims will induce sleep for 80% of people suffering from insomnia. After examining the dosage, we feel that her claims regarding the effectiveness of the dosage are inflated. In an attempt to disprove her claim, we administer her prescribed dosage to 20 insomniacs and we observe Y, the number for whom the drug dose induces sleep. The rejection region was found to be {Y $\leq$ 12}. 

(a)  $H_{0}$? $H_{1}$?
(b) In terms of this problem, what's a Type I error? 
(c) Find $\alpha$ 
(d) In terms of this problem, what's a Type II error? 
(e) Find  $\beta$ when p=0.6
(f) Find $\beta$ when p=0.4 


$\triangleright$ Solution (a) 
$H_{0}$ : p=0.8
$H_{1}$ : p < 0.8

$\triangleright$ Solution (b) 
Rejecting $H_{0}$ when $H_{0}$ is true $\rightarrow$ Conclude that drug is worse than claimed when in fact 80% of insomnia are able to sleep after taking the drug. 

$\triangleright$ Solution (c) 
$\alpha$ = P(reject $H_{0}$ when $H_{0}$ is true) 
  = P(Y $\leq$ 12) where Y is binomial (20, 0.8) 
  = P(Y=0)+P(Y=1)+...+P(Y=12) =  $\binom{20}{0}(0.8)^0(0.2)^{20}+ \cdots +\binom{20}{12}(0.8)^{12}(0.2)^8= 0.032$
  
$\triangleright$ Solution (d) 
Do not reject $H_{0}$ when $H_{1}$ is true. $\rightarrow$ Conclude that drug is effective as claimed when in fact it is worse than claimed. 

$\triangleright$ Solution (e) 
 $\beta$= P(do not reject $H_{0}$ when $H_{1}$ is true)=P(Y$\geq$ 13) where Y~Binomial(20, 0.6)
  = 1-P(Y $\leq$ 12) =0.416 (power = 0.584)

$\triangleright$ Solution (f) 
$\beta$ = P(Y$\geq$ 13) = 1-P(Y $\leq$ 12) =0.021, (Y~Binomial (20, 0.4)) 
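All three probabilities can be reproduced with the Binomial(20, p) CDF (stdlib sketch using math.comb):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(Y <= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(k + 1))

alpha = binom_cdf(12, 20, 0.8)          # P(Y <= 12 | p = 0.8)
beta_06 = 1 - binom_cdf(12, 20, 0.6)    # P(Y >= 13 | p = 0.6)
beta_04 = 1 - binom_cdf(12, 20, 0.4)    # P(Y >= 13 | p = 0.4)
print(round(alpha, 3), round(beta_06, 3), round(beta_04, 3))
```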

Binomial Distribution Example_Hypothesis test

Example) Mathematical Statistics and Data Analysis 3ED, Chapter 9. Q1. 

A coin is thrown independently 10 times to test the hypothesis that the probability of head is $\frac {1}{2}$ . $H_{1}: p\neq \frac {1}{2}$. The test rejects if either 0 or 10 heads are observed. 
a) What's the significance level of test?
b) If in fact, the probability of head is 0.1. What's the power of the test? 

$\triangleright$ Think First!
This is a binomial example. X~Bin(10, 0.5) as a coin is thrown 10 times and the probability of head is 0.5. 

$\triangleright$  Solution (a)
$\alpha =$ P(reject $H_{0}$ |$H_{0}$ is true) = P(X=0 | $H_{0}$) + P(X=10 | $H_{0}$)

Under $H_{0}$, $\alpha = \binom{10}{0}(0.5)^0(0.5)^{10}+ \binom{10}{10}(0.5)^{10}(0.5)^0 = \frac{1}{1024}+\frac{1}{1024}=$ 0.0020   

$\triangleright$  Solution (b)
$1-\beta$ = P(reject $H_{0}$ when $H_{1}$ is true) =  P(X=0|$H_{1}$)+P(X=10|$H_{1}$)
         = $\binom{10}{0}(0.1)^0(0.9)^{10}+ \binom{10}{10}(0.1)^{10}(0.9)^0 =$ 0.3487
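Both numbers are simple to reproduce (the rejection region is {0 heads} or {10 heads}):

```python
# Significance level under H0 (p = 0.5) and power when p = 0.1
alpha = 0.5**10 + 0.5**10     # = 2/1024
power = 0.9**10 + 0.1**10     # P(reject | p = 0.1)
print(round(alpha, 4), round(power, 4))
```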

Binomial Distribution

Discrete Random Variable 
: X~ Bin(n, p) = # of success in n Bernoulli trials each with success probability p.

$\bigstar P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$, k=0,1,...,n
$\bigstar$ E(X)=np, Var(X)=np(1-p)=npq (where q=1-p) 


Finding Variance w/o formula Example) 
Let Y ~ Binomial(15, $\frac {1}{3}$). Evaluate Var(Y).  
(We know variance of binomial distribution is npq. However what if we don't know this formula?)

Example) Mathematical Statistics and Data Analysis 3ED Chapter8. Q31.
George spins a coin three times and observed no heads. He then gives the coin to Hilary. She spins it until the first head occurs, and ends up spinning it four times total. Let $\theta $ denote the probability the coin comes up heads. 
a) What is the likelihood of $\theta $?
b) What is the MLE of $\theta $?


Hypothesis Testing 
Example) Mathematical Statistics and Data Analysis 3ED, Chapter 9. Q1
A coin is thrown independently 10 times to test the hypothesis that the probability of head is $\frac {1}{2}$ . $H_{1}: p\neq \frac {1}{2}$. The test rejects if either 0 or 10 heads are observed. 
a) What's the significance level of test?
b) If in fact, the probability of head is 0.1. What's the power of the test? 
Solution??!!

Example) 
An experimenter has prepared a drug dosage level that she claims will induce sleep for 80% of people suffering from insomnia. After examining the dosage, we feel that her claims regarding the effectiveness of the dosage are inflated. In an attempt to disprove her claim, we administer her prescribed dosage to 20 insomniacs and we observe Y, the number for whom the drug dose induces sleep. The rejection region was found to be {Y $\leq$ 12}. 
(a)  $H_{0}$? $H_{1}$?
(b) In terms of this problem, what's a Type I error? 
(c) Find $\alpha$ 
(d) In terms of this problem, what's a Type II error? 
(e) Find  $\beta$ when p=0.6
(f) Find $\beta$ when p=0.4

Binomial Example_likelihood&MLE

Example) Mathematical Statistics and Data Analysis 3ED Chapter 8. Q31.

George spins a coin three times and observed no heads. He then gives the coin to Hilary. She spins it until the first head occurs, and ends up spinning it four times total. Let $\theta $ denote the probability the coin comes up heads.
a) What is the likelihood of $\theta $?
b) What is the MLE of $\theta $  


$\triangleright$ Think First!! 
This is a binomial distribution example! $P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$
Each single spin is a Bernoulli trial with pmf $f(x|\theta)=\theta^x(1-\theta)^{1-x}$, x=0 or 1  


$\triangleright$ Solution (a)  
They spin a coin 7 times in total. (George: 3 times, Hilary 4 times) 
Let X be 1 (if the result is head), 0 (otherwise) 

George's case (X): the likelihood of $\theta $ is $L_{1}(\theta | X_{1},...,X_{n})= \prod_{i=1}^{n}f(X_{i}|\theta)=\prod \theta^{x_{i}}(1-\theta)^{1-x_{i}}=\theta^{\sum x_{i}}(1-\theta)^{n-\sum x_{i}}$ 
                       George spins a coin three times, here n is 3. 
                       There is no head, $X_{1}=X_{2}=X_{3}=0$  

Hilary case (Y): PDF of Y: $g(Y|\theta)=\theta (1-\theta)^{y-1}$
                        The likelihood of Y: $L_{2}(\theta|y)=\theta(1-\theta)^{y-1}$ 
                        Here Y is the number of toss required. So Y is 4.
                        (why $\theta^1$ ? b/c there is one head)    

$\rightarrow$ the likelihood of both George and Hilary $\theta $ is...
$\therefore$ $ L(\theta|X_{1},...,X_{n}, Y)= \left [ \prod_{i=1}^{n}\theta^{x_{i}}(1-\theta)^{1-x_i} \right ]\theta (1-\theta)^{y-1}$  $=\theta^{1+\sum x_{i}}(1-\theta)^{n-\sum x_{i}+y-1}$ 

$\triangleright$ Solution (b)  
$\Rightarrow$ Now log likelihood is $= \left [1+\sum _{i=1}^{n} x_{i} \right] \ln(\theta)+ \left [ n - \sum _{i=1}^{n} x_{i}+y -1 \right ] \ln (1-\theta)$
$\Rightarrow$ Take a derivative with respect to $\theta $ is $\frac{d \ln(L)}{d \theta}=\frac{1+\sum x_{i}}{\theta}+\frac{n-\sum x_{i} +y -1}{\theta - 1}$ = (set) 0
    $\Rightarrow \frac{1+\sum x_{i}}{\theta}=\frac{n-\sum x_{i} +y -1}{1-\theta} \rightarrow (n+y)\theta =1+\sum x_{i} \rightarrow \theta = \frac{1+\sum_{i=1}^{n}x_{i}}{n+y}$ 

     $\star \frac{d}{dx}(\log (1-x))=\frac{1}{x-1}$ 

Now n=3, $X_{1}=X_{2}=X_{3}=0, Y=4$, 
$\therefore \theta_{MLE}= \frac{1+0+0+0}{3+4}=\frac {1}{7}$  
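The closed-form answer can be confirmed by maximizing the likelihood $\theta(1-\theta)^6$ over a fine grid (stdlib sketch):

```python
# Joint likelihood from (a) with n=3, x1=x2=x3=0, y=4: theta * (1-theta)^6
thetas = [i / 100000 for i in range(1, 100000)]
lik = [t * (1 - t) ** 6 for t in thetas]
mle = thetas[lik.index(max(lik))]
print(mle)  # ~ 1/7 = 0.142857...
```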
  

Sufficient Statistic Example

Example) I have no idea where I found this example, Sorry!

Let $X_{1},...,X_{n}$ be a random sample from a distribution with the following density function: $f(x|\theta)= \frac {2x}{\theta}\exp (\frac{-x^2}{\theta})$, x>0. Show that $\sum_{i=1}^{n}X_{i}^2$ is a sufficient statistic for $\theta$.

$\triangleright$ Think First!
We can use the likelihood: $L(\theta)$ is the joint density of the sample! 

$\triangleright$ Solution
$L(x_{1},...,x_{n}|\theta) = \frac {2x_{1}}{\theta}\exp (\frac{-x_{1}^2}{\theta})\cdots \frac {2x_{n}}{\theta}\exp (\frac{-x_{n}^2}{\theta})= \frac{2^n}{\theta^n} \cdot \prod x_{i}\exp \frac{-\sum x_{i}^2}{\theta}$ 

Let $g(\sum x_{i}^2, \theta)=\frac{2^n}{\theta^n} \exp \frac{-\sum x_{i}^2}{\theta}$, where g is the function of  $\sum x_{i}^2$ and $\theta$.
Also let $h(x_{1},...,x_{n}) = \prod x_{i}$ 

We have $L(x_{1},...,x_{n}|\theta)$ = $g(\sum x_{i}^2, \theta)\cdot h(x_{1},...,x_{n})$ 

$\therefore \sum x_{i}^2$ is a sufficient statistic for $\theta $

Bernoulli Distribution Example - Likelihood Ratio

Example) I have no idea where I found this example, sorry!

$Y_{1},...,Y_{n}$ denote a random sample from Bernoulli $P(Y_{i}|p)=p^{y_{i}}(1-p)^{1-y_{i}}$ , where $y_{i}$=0 or 1. Suppose $H_{0}:P=P_{0}$ , $H_{1}:P=P_{a}$, where $P_{0} < $ $P_{a}$

(a) Show that  $\frac{L(P_{o})}{L(P_{a})}=[\frac{P_{0}\cdot (1-P_{a})}{(1-P_{0}) \cdot P_{a}}]^{\sum y_{i}}\cdot (\frac{1-P_{0}}{1-P_{a}})^n$ 
(b) Argue that $\frac{L(P_{o})}{L(P_{a})}$ < K iff $\sum y_{i} > k$ for some constant k 


$\triangleright$ Solution (a)
$\frac{L(P_{o})}{L(P_{a})}=\frac{P_{0}^{\sum y_{i}}\cdot (1-P_{0})^{n-\sum y_{i}}}{P_{a}^{\sum y_{i}}\cdot (1-P_{a})^{n-\sum y_{i}}}$ $=\frac{(\frac {P_{0}}{1-P_{0}})^{\sum y_{i}}\cdot (1-P_{0})^n}{(\frac {P_{a}}{1-P_{a}})^{\sum y_{i}}\cdot (1-P_{a})^n}$ $= [\frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})}]^{\sum y_{i}}\cdot (\frac{1-P_{0}}{1-P_{a}})^n$



$\triangleright$ Solution (b)
$\frac{L(P_{o})}{L(P_{a})} \leq k \Leftrightarrow \log \frac{L(P_{o})}{L(P_{a})}\leq \log k$ $\Leftrightarrow \sum y_{i} \log A +n\cdot \log B \leq \log k$ 

where A=$\frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})}$, B=$\frac{1-P_{0}}{1-P_{a}}$ from (a), $\Rightarrow \sum y_{i} \log \frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})}\leq \log k - n\cdot \log \frac {1-P_{0}}{1-P_{a}}$
Since $P_{0} < P_{a}$, $\frac {P_{0}}{P_{a}} < 1$ and $\frac {1-P_{a}}{1-P_{0}} < 1$ $\Rightarrow \log \frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})} < \log 1 = 0$ 

$\therefore \sum y_{i}$ $\large \geq \frac{\log k - n\cdot \log \frac {1-P_{0}}{1-P_{a}}}{\log \frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})}} $, some constant. (Dividing by the negative quantity $\log \frac {P_{0}(1-P_{a})}{P_{a}(1-P_{0})}$ flips the inequality.)  
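The monotonicity behind this argument can be checked numerically: the ratio from (a) is strictly decreasing in $\sum y_{i}$ when $P_{0} < P_{a}$, so a small ratio corresponds to a large $\sum y_{i}$ (sketch; $P_{0}=0.3$, $P_{a}=0.6$, n=10 are arbitrary choices):

```python
p0, pa, n = 0.3, 0.6, 10

def ratio(s):
    """L(P0)/L(Pa) as a function of s = sum(y_i)."""
    a = (p0 * (1 - pa)) / (pa * (1 - p0))
    b = (1 - p0) / (1 - pa)
    return a**s * b**n

ratios = [ratio(s) for s in range(n + 1)]
print(all(ratios[i] > ratios[i + 1] for i in range(n)))  # strictly decreasing
```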

Bernoulli(p) Distribution

Discrete Random Variable 
: There are only TWO possible outcomes. (e.g. male or female, success or failure, 1 or 0)

$\bigstar$ $f(x)=p^x(1-p)^{1-x}$ , x=0 or 1 (two outcomes)
                                   0$\leq$ p$\leq$ 1 (probability is always between 0 and 1) 

$\bigstar$ E(X)=p, Var(x)=p(1-p)=pq (where q=1-p) 
Proof
$f(1|p)=p^1(1-p)^{1-1}=p$ (probability of being 1)
$f(0|p)=p^0(1-p)^1=1-p$  (probability of being 0)

If we have n data (n samples), we can calculate the likelihood by using joint distribution. 
$\Rightarrow$ $P(X_{1}=x_{1},X_{2}=x_{2},...,X_{n}=x_{n})=L=\prod_{i=1}^{n}p^{x_{i}}(1-p)^{1-x_{i}}$

To maximize? Take the derivative with respect to p and set it equal to 0. 
$\Rightarrow$ Solve $\frac{dL}{dp}=0$ for $\hat{p}_{MLE}$   
(it is easier to maximize the log likelihood, because the log turns the product into a sum) 

$\Rightarrow \log L=l=\log(\prod_{i=1}^{n}p^{x_{i}}(1-p)^{1-x_{i}})=\sum_{i=1}^{n} \log(p^{X_i}(1-p)^{1-X_{i}})$
   $=\sum_{i=1}^{n}[x_{i}\cdot \log p+(1-X_{i})\cdot \log(1-p) ]$ 
   $= \log p \sum_{i=1}^{n}X_{i}+\log(1-p)\sum_{i=1}^{n}(1-X_{i})$ 
   $= n\bar{X}\cdot \log p+n(1-\bar{X})\cdot \log(1-p)$ 
   $\because \sum_{i=1}^{n}X_{i}=n\bar{X}\Rightarrow \bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i}$ 

$\Rightarrow \frac{dl}{d\hat{p}}=\frac{n\bar{X}}{\hat{p}} - \frac{n(1-\bar{X})}{1-\hat{p}} = 0$ (for maximizing)
   $\Rightarrow \frac{n\bar{X}}{\hat{p}}=\frac {n(1-\bar{X})}{1-\hat{p}} \Rightarrow \bar{X}-\bar{X}\hat{p}=\hat{p}-\hat{p}\bar{X}$ $\Rightarrow$  $\hat{p}_{MLE}=\bar{X}$



$\bigstar$ Show that $T=\sum_{i=1}^{n}X_{i}$ is a sufficient statistic. 
Proof
By independence, the joint distribution of the random sample is
$\prod_{i}^{n}p^{x_{i}}(1-p)^{1-x_{i}}=p^{\sum X_{i}}(1-p)^{n-\sum X_{i}} \cdot 1$ ,
where $p^{\sum X_{i}}(1-p)^{n-\sum X_{i}}$  = $g(\sum x_{i},p)$ , and 1= $h(X_{1},...,X_{n})$


$\bigstar$ Show that Bernoulli distribution is part of the exponential family. 
Proof 
We need to show $f_{\theta}(x)= \exp \left[\sum_{j=1}^{k}c_{j}(\theta)\cdot T_{j}(x)+d(\theta)+ s(x)\right]$ 

parameter p, where p= P(X=1)
$p(x|p)=p^x(1-p)^{1-x}$
$p(x|p)=\exp[\log(p^x(1-p)^{1-x})]=\exp[x\cdot \log p+(1-x)\cdot \log(1-p)]$ 
          = $\exp [x\cdot \log \frac{p}{1-p} + \log(1-p)]$     
This shows the Bernoulli distribution belongs to the exponential family with parameter $c(\theta)=\log \frac {p}{1-p}$, $T(x)=x, d(\theta)=\log (1-p), s(x)=0$  



Likelihood Ratio Example 
$Y_{1},...,Y_{n}$ denote a random sample from Bernoulli $P(Y_{i}|p)=p^{y_{i}}(1-p)^{1-y_{i}}$ , where $y_{i}$=0 or 1. Suppose $H_{0}:P=P_{0}$ , $H_{1}:P=P_{a}$, where $P_{0} < $ $P_{a}$

(a) Show that  $\frac{L(P_{o})}{L(P_{a})}=[\frac{P_{0}\cdot (1-P_{a})}{(1-P_{0}) \cdot P_{a}}]^{\sum y_{i}}\cdot (\frac{1-P_{0}}{1-P_{a}})^n$ 
(b) Argue that $\frac{L(P_{o})}{L(P_{a})}$ < K iff $\sum y_{i} > k$ for some constant k