IRS Sales Tax Tables
This directory contains copies of the official IRS sales tax tables for 2004 onward in several formats. The original spreadsheet files were supplied by Alan Plumley, Wu-Lang Lee and Ahmad R. Qadri of the IRS Research Division. These are formatted for IRS Publications 600 and 600-A, and the published format is poorly suited for machine processing. Here at the NBER we have done a lot of cutting and pasting and some further processing to create more convenient flat files, and have also reconstructed the regressions from which the tables are made. The results are made available here to save others the same work. We expect to update the file each year as the form becomes available.
The most usable file format is always a flat file, and here each record corresponds to the sales tax allowance for a particular family size, in a particular income class, in a particular state, in a particular year. The 8 columns are:
For ease of use, the file contains no embedded non-numeric values, so it should be easy to read with any statistical package or database language.
The amount of tax comes from the IRS file and, because all values are integers, should be exactly as printed in the official forms.
Because the allowance amount was calculated by the IRS from a formula, and the data supplied contain roughly enough information to recreate the formula parameters, we have run the regressions suggested by the 2004 and 2005 FAQs posted on the IRS website and by Mr Plumley's letter (below). The resulting files have the regression parameters for every state and year. The fitted values would not be suitable for filling out your tax form, but for statistical analysis of tax liabilities they are superior to the tables because they have continuous first derivatives.
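As an illustration of the kind of regression described above, here is a minimal sketch of an ordinary least squares fit of log(tax) on log(income) and log(family size). It is not the code used at the NBER. The function names, the income midpoints, and the coefficients used to generate the check data are all hypothetical, and the check data are synthetic rather than taken from any IRS table.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_loglog(rows):
    """Fit log(tax) = a + b*log(income) + c*log(famsize) by least squares.

    rows: iterable of (income, famsize, tax) with all values positive.
    Returns the tuple (a, b, c).
    """
    rows = list(rows)
    X = [(1.0, math.log(inc), math.log(fam)) for inc, fam, _ in rows]
    y = [math.log(tax) for _, _, tax in rows]
    k = 3
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return tuple(solve_linear(xtx, xty))

# Synthetic check: generate taxes from hypothetical coefficients, then recover them.
TRUE = (2.0, 0.4, 0.25)  # hypothetical values, not from any published file
midpoints = [10_000, 25_000, 35_000, 45_000, 55_000, 65_000]  # hypothetical brackets
rows = [
    (inc, fam, math.exp(TRUE[0] + TRUE[1] * math.log(inc) + TRUE[2] * math.log(fam)))
    for inc in midpoints
    for fam in range(1, 6)
]
a, b, c = fit_loglog(rows)
print(f"a={a:.4f} b={b:.4f} c={c:.4f}")
```

With real table entries the recovery would not be exact, since the published amounts are rounded to integers and the income term for each bracket has to be approximated.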
Looking at the first line of coefficients.csv, we see that for Alabama in 2004 the best estimate of the sales tax allowance is given by the following formula:
log(salestax) = 1.911 + 0.3940*log(income) + 0.2443*log(famsize)
These are natural logs. Note that the RMSE measures how well the three parameters reproduce the published table. The fit appears to be quite good, even though we use the midpoint of each income bracket for the income term rather than the median, which the IRS used to create the tables.
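The Alabama 2004 formula above can be evaluated directly. The sketch below exponentiates the fitted log and also computes the derivative of the allowance with respect to income, which is what makes the fitted form convenient for statistical work. The function names are hypothetical, and the sketch assumes that income is expressed in dollars; check the downloaded files before relying on that.

```python
import math

# Coefficients quoted above for Alabama, 2004.
A, B, C = 1.911, 0.3940, 0.2443

def sales_tax_allowance(income, famsize, a=A, b=B, c=C):
    """Fitted allowance: exp(a + b*ln(income) + c*ln(famsize))."""
    return math.exp(a + b * math.log(income) + c * math.log(famsize))

def marginal_allowance(income, famsize, a=A, b=B, c=C):
    """Derivative of the fitted allowance with respect to income.

    For a log-log form, d(tax)/d(income) = b * tax / income.
    """
    return b * sales_tax_allowance(income, famsize, a, b, c) / income

# Example: a two-person family with an income of 50,000 (assumed to be dollars).
tax = sales_tax_allowance(50_000, 2)
slope = marginal_allowance(50_000, 2)
print(f"allowance: {tax:.2f}, change per extra unit of income: {slope:.5f}")
```

Because the form is log-linear, the income coefficient 0.3940 is the elasticity of the allowance with respect to income, the same at every income level.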
Files for download
Date: Wed, 07 Feb 2007 13:48:28 -0500
From: Alan Plumley