Just How 'Transparent' Is the Obama White House? Scientists Try to Help
Computer scientists sort through reams of government data.
Nov. 16, 2010 -- Quick -- who has been the most frequent visitor to the White House since the beginning of the Obama administration?
You don't know? Most of us would have no idea, despite the mountains of information the administration has posted at Data.gov, a website created in 2009 "to increase public access to high value, machine readable datasets generated by the executive branch of the federal government."
The Obama administration has promised to be, in its word, "transparent" about the workings of government. But it concedes that there is so much data -- most of it unsorted, without any context -- that even if you had heard of Data.gov, you would have terrible trouble making sense of it. It does have "machine readable datasets" on it, more than 270,000 of them.
Which brings us to James Hendler, a computer scientist at Rensselaer Polytechnic Institute in Troy, N.Y. No, he's not a frequent visitor to the White House -- more on that in a second -- but he is at a conference in Washington today, talking about how he and his team have been figuring out how to find information in all that digital noise. The administration says it is happy to have Hendler and his colleagues; it agrees the amount of material it posts can be daunting.
"The government collects data for very specific purposes," Hendler said in a telephone interview with ABC News. "You, as a citizen, could use that data for something. Maybe you want to track budgets."
But the raw stuff collected by government agencies -- from ozone monitors to the White House visitors' log -- is often in a form most users would not regard as user-friendly. The Environmental Protection Agency, for instance, listed readings from ozone meters nationwide, without their locations. And the White House listed every person cleared through security, just by name, date and home state, without including their affiliations or reasons for visiting.
Hendler and his team solved the problem by doing "mashups," a little like what musicians do when they combine parts of different songs, except they combined the mass of material from Data.gov with other lists. (The EPA does keep lists online of the locations of ozone sensors, just not in one place.)
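For readers curious what such a mashup looks like in practice, here is a minimal sketch in Python. It is not the Rensselaer team's code; the field names (`sensor_id`, `ozone_ppb`, `site`) and the sample values are invented for illustration. It simply matches each reading against a separate list of sensor locations by a shared identifier.

```python
# Hypothetical sample data: every field name and value here is invented
# for illustration and does not come from Data.gov or the EPA.
readings = [
    {"sensor_id": "A1", "ozone_ppb": 41},
    {"sensor_id": "B2", "ozone_ppb": 58},
    {"sensor_id": "C3", "ozone_ppb": 37},
]
locations = [
    {"sensor_id": "A1", "site": "Site A"},
    {"sensor_id": "B2", "site": "Site B"},
]


def mashup(readings, locations, key="sensor_id"):
    """Attach location fields to each reading that has a matching key."""
    # Index the location list by the shared key for fast lookup.
    by_key = {row[key]: row for row in locations}
    merged = []
    for reading in readings:
        extra = by_key.get(reading[key])
        if extra is None:
            continue  # skip readings whose sensor has no known location
        merged.append({**extra, **reading})
    return merged


combined = mashup(readings, locations)
for row in combined:
    print(row["site"], row["ozone_ppb"])
```

The reading from sensor "C3" is dropped because no location is listed for it. A real mashup would have to decide what to do with such unmatched records.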
So, it turns out that as of May, when the Rensselaer team last went through the list, the most frequent visitor to the White House was (drum roll, please) Anna Burger.