
Case Study: SLR with 1 dummy variable in R

 
Case Study: The Spock Conspiracy Trial in R
The main question: is there evidence that women were underrepresented on the venires of Spock's judge compared with those of the other judges?
 
1. Prepare Data
If you don't have the data set, follow the procedure below: install the package and then load it. Our case study data set is called "case0502".
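A minimal sketch of this step. The post does not name the package, so the code assumes it is Sleuth2 from CRAN, which ships "case0502" with lowercase column names ("percent", "judge"). The object name `spock` is only illustrative.

```r
# Assumption: the data set lives in the CRAN package Sleuth2.
if (!requireNamespace("Sleuth2", quietly = TRUE)) {
  install.packages("Sleuth2", repos = "https://cloud.r-project.org")
}
library(Sleuth2)

spock <- case0502                      # copy the data set under a shorter name
names(spock) <- tolower(names(spock))  # no-op for Sleuth2; guards against capitalized names

str(spock)          # structure: one numeric column, one grouping column
table(spock$judge)  # number of venires per judge
```

If you use Sleuth3 instead, the columns are capitalized, which is why the sketch lowercases the names.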
 
2. Make a Boxplot
Our goal is to compare the mean values of Spock's judge and the other judges (A~F). First of all, we can make a boxplot displaying the five-number summary: minimum, first quartile, median, third quartile, and maximum.
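The boxplot can be sketched as follows, again assuming the Sleuth2 version of the data. The axis labels are illustrative. Note that R draws each whisker only as far as the most extreme point within 1.5 times the box length, so the whisker ends equal the minimum and maximum only when no point is flagged as an outlier.

```r
library(Sleuth2)  # assumption: case0502 comes from Sleuth2

spock <- case0502
names(spock) <- tolower(names(spock))

# One box per judge; boxplot() also returns the plotted statistics invisibly.
bp <- boxplot(percent ~ judge, data = spock,
              xlab = "Judge",
              ylab = "Percent of women on venire")

# One column per judge: lower whisker, lower hinge, median, upper hinge, upper whisker.
bp$stats
```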
 
3. Conduct the SLR 
First, we split the judges into two groups, Spock's and the others, using the ifelse function, because we want to compare the means of two groups. Then we fit the simple linear regression with the lm function. The response variable is "percent" and the explanatory variable is "twogroup", a dummy (indicator) variable.
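A hedged sketch of the grouping and the regression, assuming the Sleuth2 data. The group labels "Spock" and "Others" are illustrative choices, and `grepl()` is used so the match does not depend on the exact spelling of the Spock level.

```r
library(Sleuth2)  # assumption: case0502 comes from Sleuth2

spock <- case0502
names(spock) <- tolower(names(spock))

# Dummy variable: "Spock" for Spock's judge, "Others" for judges A to F.
spock$twogroup <- factor(ifelse(grepl("Spock", spock$judge), "Spock", "Others"))

# Simple linear regression of percent on the two-level factor.
fit <- lm(percent ~ twogroup, data = spock)
summary(fit)

# With "Others" as the reference level, the slope is the difference in group means.
coef(fit)[["twogroupSpock"]]
```

Because factor levels are sorted alphabetically, "Others" becomes the baseline, so the intercept is the mean for the other judges and the slope is Spock's mean minus that baseline.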
 
4. Conclusion  
Calling anova( ) on the fitted model gives the one-way ANOVA table. The main idea is that the between-groups mean square (twogroup Mean Sq = 1600.6, which equals its Sum Sq because it has 1 degree of freedom) is much larger than the within-groups mean square (Residuals Mean Sq = 49.79), which is evidence that the means differ. With such a small p-value (1.03e-06), there is strong evidence of a difference in means between the two groups.
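The ANOVA table can be reproduced with a sketch like the one below, under the same assumptions as before (Sleuth2 data, illustrative group labels). It also recomputes the F statistic by hand as the ratio of the two mean squares.

```r
library(Sleuth2)  # assumption: case0502 comes from Sleuth2

spock <- case0502
names(spock) <- tolower(names(spock))
spock$twogroup <- factor(ifelse(grepl("Spock", spock$judge), "Spock", "Others"))

fit <- lm(percent ~ twogroup, data = spock)
tab <- anova(fit)  # rows: twogroup, Residuals
tab

# F statistic = between-groups mean square / within-groups (residual) mean square.
f_by_hand <- tab["twogroup", "Mean Sq"] / tab["Residuals", "Mean Sq"]
f_by_hand
```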
Remark) We will reach the same conclusion if we conduct a two-sample t-test; see the case study "t-test in R".
 
 
 

Case Study: t-test in R


Case Study: The Spock Conspiracy Trial in R
The main question: is there evidence that women were underrepresented on the venires of Spock's judge compared with those of the other judges?


1. Prepare Data  
If you don't have the data set, follow the procedure below: install the package and then load it. Our case study data set is called "case0502".


2. Make a Boxplot

Our goal is to compare the mean values of Spock's judge and the other judges (A~F). First of all, we can make a boxplot displaying the five-number summary: minimum, first quartile, median, third quartile, and maximum.

 
3. Conduct the t-test
The data consist of 7 subgroups: A, B, C, D, E, F, and Spock's. We want to compare the mean for Spock's judge with the mean for the other judges, so we first form two groups. Then we conduct the two-sample t-test using t.test.
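A minimal sketch of the test, assuming the Sleuth2 version of "case0502". The column name `twogroup` and its labels are illustrative. By default `t.test()` uses `var.equal = FALSE`, which is the Welch test with Satterthwaite degrees of freedom.

```r
library(Sleuth2)  # assumption: case0502 comes from Sleuth2

spock <- case0502
names(spock) <- tolower(names(spock))

# Collapse the seven judges into two groups.
spock$twogroup <- factor(ifelse(grepl("Spock", spock$judge), "Spock", "Others"))

# Two-sample t-test; unequal variances are allowed by default (Welch).
tt <- t.test(percent ~ twogroup, data = spock)
tt

tt$parameter  # Satterthwaite degrees of freedom (generally not an integer)
tt$p.value
```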
 
The two groups do not have the same variance, so we use the Satterthwaite approximation: the degrees of freedom are calculated with the Satterthwaite approximation formula.
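For reference, the standard form of the statistic and its approximate degrees of freedom is given below. The notation is an assumption, not taken from the post: the two groups have sample means \bar{y}_1 and \bar{y}_2, sample variances s_1^2 and s_2^2, and sizes n_1 and n_2.

```latex
t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\mathrm{df} \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2 / n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^{2}}{n_2 - 1}}
```

The resulting degrees of freedom always lie between min(n_1, n_2) - 1 and n_1 + n_2 - 2, and are usually not an integer.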

4. Conclusion
The null hypothesis is that there is no difference in mean between Spock's judge and the other judges. Because the p-value is small (1.303e-06), we reject the null hypothesis: there is strong evidence of a difference between the two groups.
Remark) We will reach the same conclusion if we conduct the SLR; see the case study "SLR with 1 dummy variable in R".
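The link between the two analyses can be checked with a short sketch, under the same assumptions as above (Sleuth2 data, illustrative group labels). The SLR on a dummy variable matches the equal-variance (pooled) t-test exactly, while the Welch test gives a slightly different p-value but the same conclusion.

```r
library(Sleuth2)  # assumption: case0502 comes from Sleuth2

spock <- case0502
names(spock) <- tolower(names(spock))
spock$twogroup <- factor(ifelse(grepl("Spock", spock$judge), "Spock", "Others"))

# Pooled-variance t-test and Welch t-test.
pooled <- t.test(percent ~ twogroup, data = spock, var.equal = TRUE)
welch  <- t.test(percent ~ twogroup, data = spock)

# p-value for the slope of the dummy variable in the SLR.
fit <- lm(percent ~ twogroup, data = spock)
slope_p <- summary(fit)$coefficients["twogroupSpock", "Pr(>|t|)"]

# The pooled test and the SLR slope test agree up to rounding error.
same_p <- isTRUE(all.equal(pooled$p.value, slope_p, tolerance = 1e-6))
c(pooled = pooled$p.value, slr = slope_p, welch = welch$p.value)
```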



- Reference: http://www.inside-r.org/node/159733