Thursday, March 10, 2011

An easier to use IV regression command in R

Update: I have added some functionality to my ivregress() command. Check out my newer post here.

After I posted my last video tutorial on how to use my IV regression function, I received a comment asking why I didn't write the command a different way to make the syntax easier to read.

The answer is that I didn't know how to write an easier-to-use function a year ago, when I wrote the ivreg() function. After some digging, I figured out how to work with "formula objects" in R, and the result is an easier-to-use IV regression function called ivregress().
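For readers who haven't worked with formula objects before, here is a small sketch of what they let a function do. This is not the code inside ivregress(); the formulas and variable names below are placeholders chosen only for illustration, and the snippet uses base R functions only.

```r
# Placeholder formulas in the same two-formula style (names are hypothetical).
second_stage <- Y ~ X1 + X2     # outcome on regressors
first_stage  <- X1 ~ Z1 + Z2    # endogenous regressor on instruments

# A formula is an ordinary R object that a function can take apart.
class(second_stage)

# all.vars() returns the variable names in order, left-hand side first.
outcome     <- all.vars(second_stage)[1]
regressors  <- all.vars(second_stage)[-1]
endogenous  <- all.vars(first_stage)[1]
instruments <- all.vars(first_stage)[-1]

# The exogenous regressors are whatever is left after dropping the endogenous one.
exogenous <- setdiff(regressors, endogenous)

# reformulate() builds a new formula from character vectors, e.g. a first
# stage that includes both the instruments and the exogenous regressors.
full_first_stage <- reformulate(c(instruments, exogenous), response = endogenous)
print(full_first_stage)
```

The point is that a user can type readable formulas, and the function can still recover which variable is the outcome, which is endogenous, and which are instruments.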

How to "install" ivregress()

Here's the code you need to run to define ivregress() and its companion summary command sum.iv(). This will provide instrumental variables regression estimates if you have one endogenous regressor with one or more instruments.



You only need to run the above code once to define the function object ivregress() for all future uses (as long as you don't write over it or clear your workspace).

How to use ivregress()

To create an "IV object", call ivregress() as follows (the function relies on the car package, so load it first):

library(car)
myivobject = ivregress(Y~X1+X2+...+Xk, Xj ~ Z1+Z2+...+Zl, dataframe)

where X1, X2, ..., Xk are the second-stage regressors, Xj is the one endogenous regressor among them that you want to instrument for, and Z1, Z2, ..., Zl are its instruments.
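To make the roles of the two formulas concrete, here is a rough sketch of textbook two-stage least squares done by hand with lm() on simulated data. It is not the ivregress() source, and it assumes the first stage includes the exogenous regressor alongside the instruments. The data, the variable names, and the true coefficient of 2 on X1 are all made up for the example.

```r
# Simulated data: X1 is endogenous because it shares the unobserved u with Y.
set.seed(1)
n  <- 5000
Z1 <- rnorm(n)                      # instrument 1
Z2 <- rnorm(n)                      # instrument 2
X2 <- rnorm(n)                      # exogenous regressor
u  <- rnorm(n)                      # unobserved confounder
X1 <- 0.8 * Z1 + 0.5 * Z2 + 0.3 * X2 + u + rnorm(n)
Y  <- 1 + 2 * X1 - 1 * X2 + 2 * u + rnorm(n)
dat <- data.frame(Y, X1, X2, Z1, Z2)

# First stage: endogenous regressor on the instruments and exogenous regressor.
first     <- lm(X1 ~ Z1 + Z2 + X2, data = dat)
dat$X1hat <- fitted(first)

# Second stage: replace X1 with its first-stage fitted values.
second    <- lm(Y ~ X1hat + X2, data = dat)
beta_2sls <- coef(second)

# Naive OLS for comparison; it is biased because X1 is correlated with u.
beta_ols <- coef(lm(Y ~ X1 + X2, data = dat))

# The standard errors printed by summary(second) are not valid for 2SLS.
# Recompute residuals with the actual X1, then use sigma^2 * (Xhat'Xhat)^-1.
X_actual <- cbind(1, dat$X1, dat$X2)
resid_iv <- dat$Y - as.vector(X_actual %*% beta_2sls)
sigma2   <- sum(resid_iv^2) / (n - length(beta_2sls))
Xhat     <- model.matrix(second)
se_2sls  <- sqrt(diag(sigma2 * solve(crossprod(Xhat))))

print(round(cbind(estimate = beta_2sls, std.error = se_2sls), 3))
print(round(beta_ols, 3))
```

In this simulation the 2SLS coefficient on X1hat should land close to 2, while the OLS coefficient on X1 is pushed upward by the confounder. The degrees-of-freedom correction used for sigma2 is a choice; different software makes different choices here, which changes the standard errors but not the coefficients.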

Then use sum.iv() from the previous posts to produce summary output. Sometime soon, I will post a video tutorial showing how to use ivregress() to perform IV regression.

1 comment:


  1. Hi

    I have a question regarding your code, the tsls() command in the "sem" package, and the ivreg() command in the "AER" package. All three estimate 2SLS, but the results I get from ivregress in Stata are different from the results I get using the two R packages. Interestingly, I get the same result as Stata when I use your code. From the structure of your code, I can see that it's very similar to what Stata does. Do you know what the two R packages are doing that gives me different results?

    Thanks
