This directory contains copies of the official IRS sales tax tables for 2004 and later years in several formats. The original spreadsheet files were supplied by Alan Plumley, Wu-Lang Lee and Ahmad R. Qadri of the IRS Research Division. They are formatted for IRS Publications 600 and 600-A, and the published format is poorly suited to machine processing. Here at the NBER we have done a lot of cutting and pasting and some further processing to create more convenient flat files, and we have also reconstructed the regressions from which the tables are made. The results are made available here to save others the same work. We expect to update the file each year as the form becomes available.
The most usable file format is always a flat file. Here each record corresponds to the sales tax allowance for a particular family size, in a particular income class, in a particular state, in a particular year. The 8 columns are:
For ease of use, the file contains only numeric values. It should be easy to read with any statistical package or database language.
The amount of tax comes from the IRS file and, because all values are integers, should be exactly as printed in the official forms.
Because the IRS calculated the allowance amounts from a formula, and the data supplied contain roughly enough information to recover the formula's parameters, we have run the regressions suggested by the 2004 and 2005 FAQs posted on the IRS website and by Mr. Plumley's letter (below). These files have the regression parameters for every state and year. The fitted formulas would not be suitable for filling out your tax form, but for statistical analysis of tax liabilities they are superior to the tables because they have continuous first derivatives.
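For readers who want to repeat this kind of fit, the following is a minimal sketch of a log-log least-squares regression of the allowance on income and family size. It is not the code used to produce the files here. The function name fit_log_linear, the income brackets, and the "true" coefficients are all made up for illustration, and the data are synthetic rather than taken from the IRS tables.

```python
import math

def fit_log_linear(rows):
    """Least-squares fit of log(tax) = a + b*log(income) + c*log(famsize).

    rows: (income, famsize, tax) tuples, all strictly positive.
    Returns (a, b, c, rmse), with rmse measured in log units.
    """
    rows = list(rows)
    X = [(1.0, math.log(inc), math.log(fam)) for inc, fam, _ in rows]
    y = [math.log(tax) for _, _, tax in rows]
    # Normal equations (X'X) beta = X'y, stored as an augmented 3x4 matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [v - f * p for v, p in zip(A[r], A[col])]
    beta = [A[i][3] / A[i][i] for i in range(3)]
    resid = [yi - sum(b * xi for b, xi in zip(beta, x)) for x, yi in zip(X, y)]
    rmse = math.sqrt(sum(e * e for e in resid) / len(resid))
    return beta[0], beta[1], beta[2], rmse

# Synthetic stand-in for one state-year of the table (NOT IRS data):
# hypothetical income brackets and made-up "true" coefficients.
TRUE_A, TRUE_B, TRUE_C = 2.0, 0.40, 0.25
brackets = [(0, 20000), (20000, 30000), (30000, 40000), (40000, 50000),
            (50000, 60000), (60000, 80000), (80000, 100000)]
rows = []
for lo, hi in brackets:
    mid = (lo + hi) / 2.0  # bracket midpoint stands in for income
    for fam in range(1, 6):
        tax = math.exp(TRUE_A + TRUE_B * math.log(mid) + TRUE_C * math.log(fam))
        rows.append((mid, fam, round(tax)))  # table amounts are integers

a, b, c, rmse = fit_log_linear(rows)
print(f"a={a:.4f} b={b:.4f} c={c:.4f} rmse={rmse:.5f}")
```

Because the synthetic amounts are rounded to whole dollars, the fit recovers the made-up coefficients only approximately. A statistical package's own regression command would do the same job with less code.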
Looking at the first line of coefficients.csv, we see that for Alabama in 2004 the best estimate of the sales tax allowance is given by the following formula:
log(salestax) = 1.911 + 0.3940*log(income) + 0.2443*log(famsize)
These are natural logs. Note that the RMSE refers to our ability to reproduce the published table from three parameters, which appears to be quite good, in spite of the fact that we use the midpoint of the income brackets for the income term, rather than the median (used by the IRS to create the tables).
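As an illustration, the formula above can be evaluated directly. The sketch below uses the Alabama 2004 coefficients quoted above. The function names are hypothetical, and income is assumed to be in dollars, which is not stated here. The second function shows the continuous first derivative with respect to income that makes the fitted form convenient for statistical work.

```python
import math

# Coefficients quoted above for Alabama, 2004.
INTERCEPT = 1.911
INCOME_COEF = 0.3940
FAMSIZE_COEF = 0.2443

def sales_tax_allowance(income, famsize,
                        a=INTERCEPT, b=INCOME_COEF, c=FAMSIZE_COEF):
    """Predicted allowance from log(salestax) = a + b*log(income) + c*log(famsize)."""
    if income <= 0 or famsize <= 0:
        raise ValueError("income and famsize must be positive")
    return math.exp(a + b * math.log(income) + c * math.log(famsize))

def income_derivative(income, famsize,
                      a=INTERCEPT, b=INCOME_COEF, c=FAMSIZE_COEF):
    """d(salestax)/d(income) = b * salestax / income, continuous for income > 0."""
    return b * sales_tax_allowance(income, famsize, a, b, c) / income

# Example call (income assumed to be in dollars).
print(sales_tax_allowance(50000, 2))
print(income_derivative(50000, 2))
```

Exponentiating both sides gives the equivalent multiplicative form salestax = exp(a) * income^b * famsize^c, so b and c are the elasticities of the allowance with respect to income and family size.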
27 March 2007
4 February 2008
15 November 2010
Date: Wed, 07 Feb 2007 13:48:28 -0500
From: Alan Plumley
To: Daniel Feenberg

Hi, Dan.

A number of people (e.g., software developers) have asked for the underlying formulas, and the IRS decision has been not to release the formulas themselves, but rather to release the tables as spreadsheets (which I've attached). I'm guessing that for what you're doing, you could easily derive a curve for each state from the amounts in the table cells. We use the same Cobb-Douglass form for each state separately, with Tax as a function of both Income and Exemptions (i.e., family size). However, I'm not sure you would have enough information to determine the appropriate local sales tax amount. That depends on the local sales tax rate, and there are zillions of those (well, almost).

I'm also attaching the FAQs for 2004 and 2005. 2006 was the same as 2005. Let me know if you have any additional questions.

Best wishes,
Alan Plumley
IRS RAS Office of Research