Sunday, June 21, 2009

Iran's Election: A statistical bibliography

(Updated 6/22 1pm ET) The upwelling of international interest in Iran's disputed election has attracted a handful of amateur and professional statisticians. Here I attempt to collate these various efforts under one roof.

Raw data
  • The Iranian government has released vote counts for 366 districts in XLS format. Roukema and Mebane supplied links to an earlier and a later version of the district-level counts here and here, but the links were inactive at the time of this posting. Roukema provided links to archived versions here and here.

  • Here is a version of the district counts with province names and candidate names transliterated into Roman script. The source is not specified, although it should not be difficult to check for consistency with the official sources above.

  • Walter Mebane has supplied ballot box level data for 23 provinces in this zipfile. The files are in CSV format, one for each province.
Academic analysis
  • Walter Mebane, professor of political science and statistics at the University of Michigan, has been updating a working paper on a daily basis. He has a track record of peer-reviewed publications on the statistical analysis of election results. All of his papers are available for download here. He uses a binomial analysis and a second-digit Benford analysis.

  • Boudewijn F. Roukema, an astrophysicist at the Nicolaus Copernicus University in Poland, has submitted a preprint to the arXiv server that tests the first digit of the district-level counts against Benford's Law. The article has not yet been peer reviewed, but it has been submitted to a major statistical journal.

  • Jon Baron, professor of psychology at the University of Pennsylvania, has published some analysis and data on his website concerning the first digit, under the hypothesis that leading ones were converted to twos.

  • Bernd Beber and Alexandra Scacco, both political science graduate students at Columbia University, examined deviations from randomness in the last two digits of the 29 provincial vote counts. The technical version of their paper is available here and the version that appeared in the Washington Post is here. They also have a paper on the last Nigerian election, which is under peer review.
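For readers who want to try digit tests of the kind listed above, here is a minimal sketch in Python. It is not the code used by any of these authors. It only computes the standard Benford probabilities for the first and second digits, a uniform reference for the last digit, and a Pearson chi-square statistic against each. The function names are my own, and the last-digit check is a simplification of the two-digit analysis by Beber and Scacco.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Benford's Law for the leading digit: P(d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_second_digit_probs():
    """Benford's Law for the second digit, d = 0..9.

    P(d) = sum over leading digits k = 1..9 of log10(1 + 1/(10k + d)).
    """
    return {
        d: sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    }


def chi_square(counts, probs):
    """Pearson chi-square statistic of observed digit counts vs. expected probabilities."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no digits to test")
    return sum(
        (counts.get(d, 0) - total * p) ** 2 / (total * p)
        for d, p in probs.items()
    )


def first_digit_statistic(vote_counts):
    """Chi-square (8 degrees of freedom) of leading digits against Benford."""
    digits = [int(str(n)[0]) for n in vote_counts if n > 0]
    return chi_square(Counter(digits), benford_first_digit_probs())


def second_digit_statistic(vote_counts):
    """Chi-square (9 degrees of freedom) of second digits against Benford."""
    digits = [int(str(n)[1]) for n in vote_counts if n >= 10]
    return chi_square(Counter(digits), benford_second_digit_probs())


def last_digit_statistic(vote_counts):
    """Chi-square (9 degrees of freedom) of last digits against a uniform distribution."""
    digits = [n % 10 for n in vote_counts if n >= 0]
    uniform = {d: 0.1 for d in range(10)}
    return chi_square(Counter(digits), uniform)


if __name__ == "__main__":
    # Synthetic numbers for illustration only, NOT election data.
    synthetic = [2 ** k for k in range(1, 60)]
    print("first digit :", first_digit_statistic(synthetic))
    print("second digit:", second_digit_statistic(synthetic))
    print("last digit  :", last_digit_statistic(synthetic))
```

To use it on the real data, pass a list of one candidate's vote counts (one integer per district or ballot box) to each function. At the 5% level the chi-square critical values are about 15.51 for 8 degrees of freedom and 16.92 for 9. A large statistic says only that the digits deviate from the reference distribution. It does not say why.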
Blogs
  • Nate Silver has discussed the above results on his blog.
  • On the same blog, Andrew Gelman weighed in here.
  • Stochastic Democracy has a series of posts.
  • A since-debunked notion held that a time series of partial counts was too regular.
  • This blog is summarizing the statistical analyses in Farsi.
I will try to update this post as new results become available. Let me know if you find anything useful.

4 comments:

Haghighat-yaab said...

Washington Post also published an article on this:
http://www.washingtonpost.com/wp-dyn/content/article/2009/06/20/AR2009062000004.html?hpid=opinionsbox1

I'm summarizing the studies on my weblog too (in Farsi though):
http://haghighatyaab.blogspot.com

David said...

This is a great summary, thanks for letting me know! I'll look over some of what I've missed.

Haghighat-yaab said...
This comment has been removed by the author.
Haghighat-yaab said...

Hi again.
Thanks for listing my blog too :-).

I'm not sure whether you're summarizing only the statistical analyses based on Benford's law and other digit laws, or all sorts of statistical analyses. If it is the latter, this is also a very interesting study:

http://www.chathamhouse.org.uk/files/14234_iranelection0609.pdf

By the way, I think I still have some decent arguments for the "debunked" idea that there is excessive regularity in the vote shares per province. I've talked about this in my blog (I take it that you can read Farsi). I've also replied to Nate Silver's arguments against the idea.

Keep up the bibliography.