Tuesday, 20th April 2010
Open data from the World Bank
Making data freely available seems to be very fashionable at the moment, and that can only be a good thing. At the start of this month, Ordnance Survey released some of their data, allowing people to access and make use of huge amounts of geographical data. The postcode data was made available (to a degree and after a struggle) this year. Last year, Tim Berners-Lee got the UK government to make its data (collected at the taxpayers' expense) freely available at data.gov.uk. And today, the World Bank has opened up its development data at data.worldbank.org.
I'm looking forward to seeing what people can do with all this information, and I have been re-inspired to attempt something useful with data about countries (I was previously going to use data from the Guardian Data Store, but this should be easier). I have got myself an API key and have been learning how to use it. Once I know the general way in which APIs work (which is something I've been meaning to learn for a long time), I should have a huge wealth of information at my fingertips.
A bit of code
I have finally learnt how to use Python to interrogate and search websites, which I can see is an incredibly powerful tool. The key is the urllib module, which is part of the standard Python installation. It makes it very simple to open a URL, which can then be treated like a file. All you need to do then is read the file/URL and parse it (in this case, the data is in XML format, so you use an XML parser, much like editing SVGs).
The following code gets all the countries for which the World Bank has data:
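# Python 2 code; in Python 3, urlopen lives in urllib.request
import urllib

root = "http://open.worldbank.org/"
my_api_key = "YOUR_API_KEY"  # placeholder: substitute your own key
queryURL = root + "countries?api_key=" + my_api_key

# Open the URL like a file, read the XML response, then close the connection
sock = urllib.urlopen(queryURL)
XML = sock.read()
sock.close()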
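The parsing step mentioned earlier could look something like the sketch below. I'm assuming Python 3 and the standard library's xml.etree.ElementTree here, and the sample XML is made up for illustration: the element names (countries, country, name) and the id attribute are my guesses, not the World Bank's documented format, so they would need adjusting to match the real response.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response; the real World Bank XML may use
# different element and attribute names.
SAMPLE_XML = """<countries>
  <country id="BRA"><name>Brazil</name></country>
  <country id="IND"><name>India</name></country>
  <country id="KEN"><name>Kenya</name></country>
</countries>"""


def parse_countries(xml_text):
    """Return a list of (id, name) pairs from the assumed XML layout."""
    root = ET.fromstring(xml_text)
    # Walk every <country> element and pull out its id and name
    return [(c.get("id"), c.findtext("name")) for c in root.iter("country")]


countries = parse_countries(SAMPLE_XML)
for code, name in countries:
    print(code, name)
```

In practice, the string read from the URL would be passed to parse_countries in place of SAMPLE_XML.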