A working paper by Ochmann and Peichl describes 12 measures of progressivity for an income tax system. One need not subscribe to any particular normative view about progressivity to find positive applications for such measures.

Peichl has written a Stata .ado file to calculate all 12 measures, and I would like to thank him for sending me the code. Here we present the results of applying those formulas to a time series of US micro data, using two data samples.

    Using Annual IRS Public Use Files

  1. Federal Income Tax by year (1960 - 2011) .dta .csv .html

    Using In/Deflated 1984 Data, uniform across states

  2. Federal and State Income Tax by year and state (1977 - 2016) .dta .csv .html

The calculations of federal tax differ across years both because the laws change and because the sample of tax returns changes. The calculations using the IRS public use files are based on stratified random samples of 100,000+ taxpayer returns per year. While returns are heavily redacted to preserve confidentiality, this should not affect the numeric results shown here. The files greatly oversample high income taxpayers, and the results here compensate for that with the weights supplied.
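To illustrate how sampling weights undo the oversampling of high income returns, here is a minimal sketch in Python. It is not Peichl's .ado code. The records, weights and function names are hypothetical, and the "redistributive effect" shown (the drop in the Gini coefficient from pre-tax to post-tax income, in the spirit of Reynolds-Smolensky-type measures) may not match the exact definitions in the working paper.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the weighted Lorenz curve (trapezoid rule)."""
    pairs = sorted(zip(values, weights))          # rank records by value
    total_w = sum(w for _, w in pairs)            # weighted population
    total_v = sum(v * w for v, w in pairs)        # weighted total of values
    cum_v = 0.0
    area = 0.0                                    # twice the area under the Lorenz curve
    for v, w in pairs:
        prev_share = cum_v / total_v
        cum_v += v * w
        area += (w / total_w) * (prev_share + cum_v / total_v)
    return 1.0 - area


# Hypothetical sample: the high income return is oversampled, so it
# carries a small weight (it stands for few taxpayers in the population).
incomes = [20_000.0, 50_000.0, 1_000_000.0]
taxes = [1_000.0, 6_000.0, 300_000.0]
weights = [5_000.0, 3_000.0, 10.0]

post_tax = [y - t for y, t in zip(incomes, taxes)]

gini_pre = weighted_gini(incomes, weights)
gini_post = weighted_gini(post_tax, weights)
redistributive_effect = gini_pre - gini_post      # positive if the tax equalizes incomes

# Weighted average tax rate for the population the sample represents.
avg_rate = (sum(w * t for w, t in zip(weights, taxes))
            / sum(w * y for w, y in zip(weights, incomes)))
```

A record with weight 2 is treated exactly like two identical records with weight 1, which is the property that lets the weights restore the population picture from a stratified sample.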

The federal plus state tax calculations using inflated/deflated 1984 data differ across years and states only because of differences in laws. These calculations are based on approximately 1,591 nationally representative returns, each subjected in turn to the tax law of each of the states. The numeric state-ids run from 1 to 51 and are those used by SOI. State "0" denotes no state tax.

The 1984 data is of interest because the actual income tax paid by a taxpayer is endogenous, i.e. it depends upon his income. So any regression explaining income (or anything else) in terms of taxes should be wary of using the actual tax liabilities directly as an explanatory variable. But variation in state tax laws across states and years is exogenous to individual labor supply and realization decisions. So an instrumental variable that depends only on the state law (or, to a lesser extent, on the year), and not on the individual, has the potential to be a valid instrument.
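The idea of an instrument that depends only on the law can be sketched as follows. This is an illustration, not TAXSIM: the fixed sample, the state laws (an exemption and a flat rate) and all names are hypothetical stand-ins. The point is only that the same returns are run through every state-year law, so the resulting rate varies with the law and not with any individual's behavior.

```python
# Hypothetical fixed sample of incomes (a stand-in for the 1984 returns).
FIXED_SAMPLE = [10_000.0, 30_000.0, 60_000.0, 150_000.0]

# Hypothetical laws keyed by (state id, year): (exemption, flat rate).
# These are NOT real tax parameters.
STATE_LAWS = {
    (0, 1990): (0.0, 0.00),       # state "0": no state tax
    (1, 1990): (5_000.0, 0.03),
    (1, 1991): (5_000.0, 0.05),   # same state, law changed
    (2, 1990): (20_000.0, 0.06),
}


def state_tax(income, exemption, rate):
    """Tax under a toy law: flat rate on income above an exemption."""
    return max(income - exemption, 0.0) * rate


def simulated_rate(law, sample=FIXED_SAMPLE):
    """Average tax rate of the fixed sample under one state-year law."""
    exemption, rate = law
    total_tax = sum(state_tax(y, exemption, rate) for y in sample)
    return total_tax / sum(sample)


# One instrument value per state-year; it is a function of the law only.
instrument = {key: simulated_rate(law) for key, law in STATE_LAWS.items()}
```

In a regression, a value like `instrument[(state, year)]` would be merged onto each observation by state and year and used in place of, or as an instrument for, the actual tax liability.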

The inflation factor used to convert 1984 data to other years is just nominal GDP divided by the over-20 population.
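As a rough sketch of that conversion, assuming the factor for a given year is nominal GDP per over-20 person in that year relative to the same ratio in 1984 (the normalization to 1984 is my reading, not something stated above). The GDP and population figures below are round placeholders, not actual statistics.

```python
BASE_YEAR = 1984

# Placeholder series: nominal GDP and over-20 population by year.
# Round illustrative numbers only; substitute the real series.
nominal_gdp = {1984: 4_000.0, 2000: 10_000.0}
pop_over_20 = {1984: 160.0, 2000: 200.0}


def inflation_factor(year):
    """GDP per over-20 person in `year`, relative to the base year."""
    per_capita = nominal_gdp[year] / pop_over_20[year]
    base = nominal_gdp[BASE_YEAR] / pop_over_20[BASE_YEAR]
    return per_capita / base


def inflate(amount_1984, year):
    """Scale a 1984 dollar amount to the level of another year."""
    return amount_1984 * inflation_factor(year)
```

With the placeholder numbers, a 1984 amount is doubled when carried to 2000, since GDP per over-20 person goes from 25 to 50 in these made-up units.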

Before taking these tables too seriously, the visitor is advised to read the working paper cited above, and to read something about TAXSIM and the IRS public use files. Before publishing, you should probably speak with me. I am at 617-588-0343 and don't mind the call.

Daniel Feenberg


Last revised December 3, 2017