On 6th June 2015, we held Phandeeyar’s first civic tech and open data hackathon. A total of 58 hackers came to build tech products that make the newly released census data easier for the public to access, utilize, and understand. Teams spent the day hard at work cleaning datasets, building APIs, writing apps to visualize and query the data, and building dashboards in Excel.
Myanmar’s First Census in 30 Years
The hackathon was held barely a week after the Department of Population released the main results of the 2014 Myanmar Population and Housing Census on May 29. The release of the data was much anticipated since it was the first nationwide census to be held in over 30 years (the previous one was in 1983). It was a rare chance for data geeks to dig into various facets of the country’s demographic, social and economic statistics, disaggregated down to the township level. From UN organizations, to businesses, to passionate individuals, many folks had been poring over the datasets since the day they were released.
The official census results were released in 16 multi-table Excel files: one at the Union level, and one for each of the 15 states and regions. The Union-level Excel file contained 69 individual tables, divided into 10 categories (such as demographics, social, migration, education, etc.), while the Excel files for the states and regions each contained 44 individual tables, also divided into 10 categories. This meant that in total, the dataset contained 729 individual Excel tables.
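For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The counts come straight from the paragraph above; the constant and function names are our own and purely illustrative.

```python
# Counts as stated in the post; the names are illustrative only.
UNION_TABLES = 69              # tables in the single Union-level file
STATE_REGION_FILES = 15        # one Excel file per state or region
TABLES_PER_STATE_REGION = 44   # tables in each state/region file


def total_files():
    """One Union-level file plus one file per state or region."""
    return 1 + STATE_REGION_FILES


def total_tables():
    """Union-level tables plus the tables across all state/region files."""
    return UNION_TABLES + STATE_REGION_FILES * TABLES_PER_STATE_REGION


print(total_files(), total_tables())
```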
We have compiled details of the kinds of data available, along with the administrative levels (state/region, district, and township) and the breakdowns (gender, urban/rural, age groups) at which they are available, in this Excel spreadsheet uploaded to Phandeeyar’s GitHub.
The sheer number of tables presented an initial challenge for the hackers: how do we gather all the data from these disparate tables into a coherent, standardized database that is machine readable? During the week leading up to the hackathon, our partners at the Myanmar Information Management Unit (MIMU) had been busy cleaning up the dataset. Due to limited time, they did not manage to create a fully standardized database, but they did incorporate standardized geographical Place Codes (Pcodes) into the official dataset. This modified dataset is now also available on GitHub.
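To give a rough idea of what "standardized and machine readable" could mean here, the sketch below reshapes one wide table (one row per place, one column per indicator) into a long format with one row per place and indicator, keyed by Pcode. This is only an illustration under our own assumptions, not the format MIMU or the hackathon teams used: the column names, the Pcodes, and the numbers are placeholders and not real census figures.

```python
import csv
import io

# Hypothetical excerpt of a wide table. The Pcodes, township names and
# values are placeholders, NOT real census data.
WIDE_CSV = """pcode,township,male,female
PCODE_A,Township A,100,110
PCODE_B,Township B,200,190
"""


def wide_to_long(wide_text, id_columns=("pcode", "township")):
    """Turn one row per place into one row per (place, indicator) pair."""
    reader = csv.DictReader(io.StringIO(wide_text))
    records = []
    for row in reader:
        for column, value in row.items():
            if column in id_columns:
                continue  # identifier columns are not indicators
            records.append({
                "pcode": row["pcode"],   # the Pcode is the join key
                "indicator": column,
                "value": int(value),
            })
    return records


long_rows = wide_to_long(WIDE_CSV)
for record in long_rows:
    print(record)
```

Once every table is in a shape like this, tables from different Excel files can be stacked together and joined on the Pcode, which is what makes the standardized codes so useful.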
The Goals for the Hackathon
The challenges presented to the hackers were to build the following four different products using the census data:
– An API to access the census data
– A Table Builder to allow users to build customised datasets
– Interactive dashboards to visualize the census data
– An open data website to serve as a home for all the other products
However, when the hackers self-organized at the beginning of the day, a few of them volunteered to form a group to work on a fifth challenge:
– To clean the data and compile it into a standardized, machine-readable format
What Happened Throughout the Day
Hackers came to Phandeeyar bright and early, from as early as 8AM, to get their tea and mohinga. The event officially started with a series of presentations:
– Petra Righetti, program specialist from UNFPA, gave a talk about the importance of the census data to the country and the potential for the projects initiated at the hackathon to have a positive impact on Myanmar’s development
– Reena Badiani-Magnusson, senior poverty economist from the World Bank, presented from the perspective of a data user, and explained how having products that make the data easily accessible would benefit researchers
– Ko Nway Aung, GIS manager from MIMU, gave a more technical description of the datasets that were released, and also explained the subtleties of the various geographic and administrative divisions, such as townships and sub-townships.
After the presentations, the hacking began. The hackers self-organized into groups tackling the challenges described above, depending on their expertise and interest. They formed 8 teams, some of which worked on different versions of the same challenge. Every challenge had at least one team working on it except the open data website: most of the teams felt that building a website to house all the other products was a less challenging task and decided to dive right into the data instead.
What We Achieved
A few interactive visualizations
Querying and Visualization Platform