********************************************************************
*Now you will replicate some of the main results of the paper.
*First, you will investigate the relationship between birth weight
*near the cutoff and one-year mortality. Then, you will check whether
*gestational age, an indicator of health, is indeed smooth across the
*1500g mark.
********************************************************************
*Start with some basic commands to get set up and save your work
********************************************************************

clear
set more off
capture log close

*The cd command tells Stata where to look for the data, where to save files, etc.
*To find the path to the data file in Windows, right click on the file -> properties -> location

cd "C:\Users\mac285\Desktop" //change this to your own directory; the quotes are needed if the path contains spaces

log using myresults.log, replace 
*You've created a log file: a permanent record of your results. 
*This is what you will turn in with your pset

use adkw.dta //tell Stata which data you want to use

**********************************
*Generate your independent variables: VLBW, VLBW*(g-1500), and (1-VLBW)*(g-1500)
**********************************

*Create a dummy variable for VLBW using the generate command
[          ]  //Hint: generate VLBW = ...

*Generate and define VLBW*(g-1500). Call it [lowBW]
[          ]

*Generate and define (1-VLBW)*(g-1500). Call it [highBW]
[          ] 
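
*A minimal sketch of the same pattern on Stata's built-in auto dataset, NOT on adkw.dta.
*Assumptions: the 3000 lb cutoff and the names light, lowW, and highW are hypothetical,
*chosen only for illustration. The block is commented out so it does not replace your data;
*to try it, paste the commands into a separate do-file.
/*
sysuse auto, clear
*dummy = 1 below the cutoff; the if-condition keeps missing weights missing
generate byte light = weight < 3000 if !missing(weight)
*distance from the cutoff, nonzero only below it
generate lowW = light * (weight - 3000)
*distance from the cutoff, nonzero only at or above it
generate highW = (1 - light) * (weight - 3000)
list weight light lowW highW in 1/5
*/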

***************************
*Now run the regression. Tell Stata to calculate robust standard errors by adding the option [, robust]
****************************

[          ]  //Hint: regress death1year ...

estimates table, b(%-8.5f) se(%-8.5f) title(Mortality Coefficient Results) //This will make a table of your coefficients and robust standard errors
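
*A hedged sketch of this kind of regression on Stata's built-in auto dataset, NOT on adkw.dta.
*Assumptions: the outcome (price), the 3000 lb cutoff, and the names light, lowW, and highW
*are hypothetical stand-ins. Because both slope terms are centered at the cutoff, the
*coefficient on the dummy is the estimated jump in the outcome at the cutoff.
*The block is commented out so it does not overwrite your own estimation results.
/*
sysuse auto, clear
generate byte light = weight < 3000 if !missing(weight)
generate lowW = light * (weight - 3000)
generate highW = (1 - light) * (weight - 3000)
*separate linear slopes on each side of the cutoff, robust standard errors
regress price light lowW highW, robust
estimates table, b(%9.4f) se(%9.4f)
*/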

*****************************
*That last regression will have generated a table of coefficients in your log file.
*Remember, we are interested in a discontinuity at the VLBW threshold.
*Now, make a graph to see whether gestational age is smooth across the threshold.
*Remember, our underlying assumption is that there should be no jump in health risk
*as you move across the threshold, so we do not expect a discontinuity here.
******************************

*Use the "collapse" command to calculate the mean gestational age for each birthweight.
*Graphing the means will be easier than graphing every data point.
[          ]

[          ] //Make a scatter plot of gestational age on the y-axis and birthweight on the x-axis FROM 1450 TO 1550 GRAMS   

graph export mygraph.pdf, replace //this will save your graph as a PDF
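
*A minimal sketch of collapsing to means and plotting a window around a cutoff,
*using Stata's built-in auto dataset, NOT adkw.dta. Assumptions: the variables mpg and
*weight, the 3000 lb cutoff, and the 2500-3500 window are hypothetical stand-ins.
*The block is commented out so it does not replace your data or your graph.
/*
sysuse auto, clear
*one observation per distinct weight, holding the mean of mpg at that weight
collapse (mean) mpg, by(weight)
*plot only the window around the cutoff and mark the cutoff with a vertical line
twoway scatter mpg weight if inrange(weight, 2500, 3500), xline(3000)
*/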

log close //close the log so the complete record is written to myresults.log
clear

