cd ~/Documents/f09_210A/website
capture log close
log using optional.log, replace

*this is an optional set of exercises meant to let you review all the material to date

*it's based on the auto data
sysuse auto.dta, clear

*take a look at the variables: miles per gallon and foreign
* what kind of variables are these? 
* -- if categorical, how many categories?
* -- if continuous, can you get a rough feel for the shape of the distribution?
desc mpg foreign
sum mpg foreign, detail
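*a possible extra check (a sketch, not part of the original exercise):
* tabulate lists each category of foreign with its count, which is easier
* to read for a categorical variable than summarize output
tabulate foreign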

*let's graph mpg
histogram mpg
graph export mpg.png, replace

*for mpg we know the sample mean, but we don't know how certain to be of it
* can you calculate the standard error of the sample mean?
* hand draw the distribution of estimates, labeling the mean 
* and the 95% confidence interval

*NOTE: the n is fairly small, so technically we should use the t distribution,
* but you can still use the normal approximation for this exercise
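
*one way to check your hand calculation (a minimal sketch; the scalar names
* se_mpg, lo_mpg and hi_mpg are made up for illustration)
* summarize stores the mean, sd and n in r(mean), r(sd) and r(N)
quietly summarize mpg
scalar se_mpg = r(sd)/sqrt(r(N))
scalar lo_mpg = r(mean) - 1.96*se_mpg
scalar hi_mpg = r(mean) + 1.96*se_mpg
display "mean = " r(mean) "   se = " se_mpg
display "95% CI (normal approximation): " lo_mpg " to " hi_mpg
*for comparison, -mean- reports the standard error and a t-based interval
mean mpg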

*what is the difference between a graph of mpg and a graph of estimates of mpg?
* do they look the same? why or why not? 
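
*if you want to check your hand drawing, here is a rough sketch of the
* distribution of estimates: a normal curve centered on the sample mean with
* the standard error as its sd (local names and the +/- 4 se plotting range
* are arbitrary choices for illustration)
quietly summarize mpg
local m  = r(mean)
local se = r(sd)/sqrt(r(N))
local lo = `m' - 4*`se'
local hi = `m' + 4*`se'
twoway function y = normalden(x, `m', `se'), range(`lo' `hi') xline(`m') ///
    xtitle("Estimate of mean MPG") ytitle("Density")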

*let's see mpg separately for foreign and domestic cars
*first let's eyeball the distributions for the two groups
twoway (kdensity mpg if foreign==0) (kdensity mpg if foreign==1), ///
    xtitle(MPG) ytitle(Kernel Density) ///
    legend(order(1 "detroit" 2 "foreign"))
graph export mpg_foreign.png, replace

*do the mpg distributions for foreign and domestic cars overlap?
* which group tends to get better mpg?

*that's the distribution of the data; now let's make distributions of estimates
*first, we need to know the mean and sd separately by group
sum mpg if foreign==0
sum mpg if foreign==1

*calculate the standard error of mean mpg separately by group and draw the two 
* distributions of estimates on the same graph
*do they overlap much?
*hand calculate the t statistic for the difference in means to see if it is statistically significant
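
*a sketch for checking your hand work (local names are made up for
* illustration; the 15-30 plotting range is picked by eye, adjust as needed)
*NOTE: this t uses a separate standard error for each group, i.e. it does
* not assume the two groups have equal variances
quietly summarize mpg if foreign==0
local m0  = r(mean)
local se0 = r(sd)/sqrt(r(N))
quietly summarize mpg if foreign==1
local m1  = r(mean)
local se1 = r(sd)/sqrt(r(N))
display "domestic: mean = " `m0' "   se = " `se0'
display "foreign:  mean = " `m1' "   se = " `se1'
*the two distributions of estimates on one graph
twoway (function y = normalden(x, `m0', `se0'), range(15 30)) ///
       (function y = normalden(x, `m1', `se1'), range(15 30)), ///
       xtitle("Estimate of mean MPG") ytitle("Density") ///
       legend(order(1 "domestic" 2 "foreign"))
*t = difference in means / standard error of the difference
local t_hand = (`m0' - `m1') / sqrt(`se0'^2 + `se1'^2)
display "hand-calculated t = " `t_hand'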

*compare the results to an automated t-test
ttest mpg, by(foreign)

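*if your estimate and Stata's are slightly off, remember that Stata uses a true 
* "t", not the normal approximation, and by default it assumes the two groups
* have equal variances (it pools them when computing the standard error)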
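
*one more comparison you could try: the unequal option drops the equal-variance
* assumption, so its standard error for the difference is built from the two
* group standard errors, as in a hand calculation done group by group
ttest mpg, by(foreign) unequal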


*repeat all of the above, substituting another continuous variable (e.g., weight, trunk, headroom) for "mpg"
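
*a possible shortcut for the automated steps only (a sketch; the graphs and
* hand calculations still need to be done for each variable): loop over the
* suggested variables and rerun summarize and ttest for each
foreach v of varlist weight trunk headroom {
    summarize `v', detail
    ttest `v', by(foreign)
}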

log close