Tuesday, 26 January 2016

Improving air quality in Indian smart cities - Part 1 - The needs

According to the Government of India (GOI), there is no specific definition of a "smart city". The definition, or the guidelines used to define a smart city, will change from city to city, state to state and country to country. More information on smart cities under the GOI guidelines can be found on their website.

According to Wikipedia, a smart city uses information and communication technologies (ICT) to

 a. enhance the quality, performance and interactivity of urban services
 b. reduce costs and resource consumption
 c. improve contact between citizens and government.

One example of interaction between services could be civic authorities prioritizing drainage clearance in areas likely to receive heavy rainfall. Another example could be a commuter receiving alerts about likely traffic jam points based on real-time weather data. Even information like "is it likely to rain today in the CBD" can be useful. Such examples enhance the quality of urban services and make possible things that cannot be done right now.

The reduction in cost is tied to the right allocation of resources, and that can only be brought about when the right data is available for making decisions. Smart street lighting is one good example of data from the physical environment combining with a decision system to save money. Smart lights that turn on and off based on need save power compared to ones that operate on timers and are left on between dusk and dawn.

To improve contact between Government and citizens, both need to be looking at the same data. That means having open data platforms that can be accessed and analyzed by all stakeholders. Much of the promise of a smart city is about getting the right data from the environment, sharing it with the stakeholders and then creating intelligent systems to drive better decisions.

Citywide environment monitoring is one of the core infrastructure requirements of a smart city. Environment monitoring would include weather, air pollution, traffic data, water quality and so on, all of which fall under the sustainable environment category as per the Government of India. We are going to discuss air quality monitoring in this paper. It looks at the current situation, explains air quality monitoring terms, and discusses the associated challenges and possible solutions.

The model so far has relied on data from stations installed by Government agencies. Typical stations are high cost, so there can be only a few of them in a city. The coverage is poor. Today our metros are spread across 40 km in length and breadth. Look at the map of Bangalore, Delhi, Mumbai or Hyderabad and see the vast area that we are qualifying as a "city". Now think of the population densities of a typical Indian metro and you can do the math about the huge populations living in the metros. If you install four stations from Yelahanka to Electronic City in Bangalore then people in Koramangala may not be getting the right data about their environment.

Someone who said, “I want to know the pollution right outside my window” summarized this beautifully. Right now there is no easy way to do that in our cities. The typical digital signages put up by agencies, which display information in a static manner, are not of much use for policymaking or preventive action. Suppose you stand outside the Majestic Railway station in Bangalore and stare at a board displaying the NOx levels in ppm. How is that information useful?

As with every system, we need to start with the requirements. Why install these signages? The goals are to

1) Increase awareness about air pollution
2) Identify hotspots of pollution for preventive action
3) Identify events that trigger pollution
4) Archive pollution data for record keeping
5) Make this data available to researchers and citizens.

The first big problem we see with the current deployment model is the identification of hotspots for preventive action. If the density of stations is low then the data is only representative of a particular location, and using it as a proxy for other areas is a sure-shot recipe for failure. If you have 10 stations in a metro that covers 1600 sq km then you have on average just one station for 160 sq km, or for millions of citizens! We are not solving the "what is right outside my window" problem.
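The arithmetic is worth spelling out. Here is a minimal Python sketch that uses only the figures quoted above (1600 sq km and 10 stations); the function name is made up for illustration.

```python
import math


def coverage_per_station(city_area_sq_km: float, stations: int) -> tuple[float, float]:
    """Return (area per station in sq km, side of the equivalent square in km)."""
    if stations <= 0:
        raise ValueError("need at least one station")
    area = city_area_sq_km / stations
    # Side length of a square with the same area, to make the scale tangible.
    return area, math.sqrt(area)


area, side = coverage_per_station(1600, 10)
print(f"{area:.0f} sq km per station, a square about {side:.1f} km on a side")
```

With these numbers, each station stands in for a square roughly 12.6 km on a side, which is a long way from "right outside my window".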

To do so we have to increase the density of the network. More stations means more data from the immediate neighbourhood, which is the answer to the "what is right outside my window" question. So how can we increase the density of air quality stations? Budget is often a constraint and the only way forward is to bring the cost of stations down. However, cutting cost tends to bring down the quality of data as well. How can we achieve the right balance between price and performance? One possible solution is to create low cost stations with a do-it-yourself approach.

The DIY approach means you are supplying your own time, so labor cost is not factored in. Since you are familiar with the intricacies of the system, you can identify and pick materials with a good price to reliability ratio. That means you can build the system at a fraction of the cost. That is the "expert who is ready to roll up his sleeves" approach. The downside is that it assumes familiarity with the system and the know-how to interpret the data and fix glitches. To get more people on board, we can document the parts and process and release it to a wider audience in the hope that more people will follow suit. The success of this approach would depend on novices joining the project and ultimately turning into experts because of their innate curiosity. The movement has to become a social phenomenon to have the desired impact.

Then there are citizen science and crowdsourced projects, like “Air Quality Egg”, that demand money but not expertise. They should have a better chance of success. The motto of such projects is "any data is better than no data". Such projects assume that citizens would deploy their sensors in large numbers and collaborate on data. The good part is that they really help create awareness and kickstart debates. Everyone will sit up and take notice when you raise X $ on Kickstarter for your air quality project. It can indeed have a galvanizing effect on your visitors when they see so many installations on your project dashboard.

That is the good part about such projects. However, there is a big issue that most of these projects tend to just ignore: good air quality sensors are not available at dirt-cheap prices. Air quality may emerge as a market lucrative enough for manufacturers to take notice, and with newer research we may get to a point where quality sensors become available at throwaway prices, but that is not the case right now. How much of "a feature is required" or "what is good enough" are definitely debatable points, but projects like Air Quality Egg pay scant regard to the quality of sensors or the process involved in getting the right data. This situation is very different from an expert striking a good balance between price and features. To cut down on cost, the stations typically use *cheap* sensors that are either low quality or would require extra work to make the data reliable.

One good example is citizen projects using the PPD42NJ (2) sensors (or derivatives) for measuring PM levels. The cheapest good PM unit we have found is one from Dylos that retails for $200. The PPD42NJ is about $20 and you can get even cheaper Korean or Chinese knock-offs. However, the cheap sensors come with a caveat: you have to calibrate each sensor individually. That means recording data from your sensor, comparing it against the readings from a more reliable (and expensive) instrument and then encoding that information for each station. That is not impossible, and with the right knowledge it can be done.

However, to get to that point you have to plot graphs, compare the two graphs, do a curve fit to find the equation for your sensor and then encode that as a program that runs on your station. Typically that is beyond the capabilities of most citizens. Also, the time involved in doing the right thing would deviate from the "turn on and get data" motto of the project. What happens instead is that everyone just uses an equation that someone published on the Internet (3) for his or her particular sensor and treats that as gospel for PM levels. Unscrupulous researchers of the publish-more-useless-papers industry, who just want to latch onto the oh-air-quality-is-so-hot-now trend and continue the bad data collection methodologies, do not help the situation. Most such papers have less data collection honesty than afternoon student lab experiments (4).
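The calibration step described above can be sketched in a few lines of Python. This is only an illustration: it assumes a straight-line relationship between the raw sensor reading and the reference instrument, whereas a real sensor may need a higher-order curve. The paired readings are made-up numbers, not measurements, and the function names are hypothetical.

```python
def fit_linear_calibration(raw, reference):
    """Least-squares fit of: reference = slope * raw + intercept."""
    n = len(raw)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(raw_value, slope, intercept):
    """Convert one raw reading into the reference instrument's units."""
    return slope * raw_value + intercept


# Made-up paired readings: cheap sensor output vs. reference instrument.
raw_readings = [1.0, 2.0, 3.0, 4.0]
reference_readings = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_linear_calibration(raw_readings, reference_readings)
print(apply_calibration(2.5, slope, intercept))
```

The slope and intercept belong to one physical sensor. Each station would carry its own pair, which is exactly the per-sensor effort that borrowing someone else's published equation skips.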

However, if the whole premise of improving air quality is based on having the right data then we should worry about getting the right data. What is the way forward? Surely we cannot have those $50,000 Thomas Fischer stations (5) deployed at every nook and corner. The economics will not work. Do citizen projects serve no purpose? Their range and resolution may not be right, but can they not provide broad indicators? How do we strike the right balance in this price vs. quality debate?

The right way is to create a hierarchy of data collection units. First, we can use the very fine instruments from Government and laboratories as a reference for the other data collection units. For those units we pick sensors whose quality is good enough for the task. For air quality, for example, we are fine with a 100 ppm or 250 ppm CO sensor, but we pick an NDIR one instead of a cheap chemical sensor. (6)

Second, we use expertise and time investment as a substitute for price. Go back to the PM sensor example: a cheap sensor can be made reliable with the right instruments and expertise. You cannot bring down the price without developing the expertise. That is even more critical in India because sensors are typically imported and duty needs to be paid on them.

We need expertise to untangle the intricacies involved in sensor measurements, work with the right instruments and churn out the next level of stations to cover a city in a systematic manner. These units would be augmented by citizen science projects. There should be a process behind deployment and professionals in charge of assessing the quality of data. We can partition the city into a grid at block level and provide one instrument in each grid cell. If such instruments cost 10x less, we can expect 10x more coverage than current numbers. That means 10x as many points of quality data on the same budget. Hopefully, with cost coming down, more actors can sponsor such instruments, including schools, hospitals, corporates and NGOs.
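The grid idea can be sketched in Python as follows. The bounding-box corner, the cell size and the function names are assumptions chosen for illustration, not a proposed layout for any actual city. The conversion from degrees to kilometres is a flat-earth approximation that is reasonable over a city-sized area.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def cells_in_city(side_km: float, cell_km: float) -> int:
    """Number of square cells needed to tile a square city of the given side."""
    per_side = math.ceil(side_km / cell_km)
    return per_side * per_side


def grid_cell(lat, lon, south, west, cell_km):
    """Map a point to (row, col), counted from the south-west corner of the grid."""
    km_north = (lat - south) * KM_PER_DEG_LAT
    # A degree of longitude shrinks with latitude; the corner's latitude is
    # used for the whole grid as an approximation.
    km_east = (lon - west) * KM_PER_DEG_LAT * math.cos(math.radians(south))
    if km_north < 0 or km_east < 0:
        raise ValueError("point lies outside the grid")
    return int(km_north // cell_km), int(km_east // cell_km)


# Hypothetical south-west corner and a 4 km cell size, purely illustrative.
SOUTH, WEST, CELL_KM = 12.85, 77.45, 4.0

print(cells_in_city(40, CELL_KM))                    # cells for a 40 km x 40 km metro
print(grid_cell(12.90, 77.45, SOUTH, WEST, CELL_KM))  # cell holding one sample point
```

A 40 km by 40 km metro tiled with 4 km cells needs 100 instruments, which matches the 10x step up from the 10 stations used in the earlier example.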

The Government can augment its current offering with these instruments to improve coverage. This should make identification of hotspots easier and provide granular data for decision-making that takes care of local problems. How can we create more awareness? We cannot just leave it to the government and civic authorities to spearhead the awareness campaigns. Air quality has a price in the form of health concerns for the whole community. To start the right debates, the whole community has to chip in. That means taking on board citizens, schools, hospitals, corporates and NGOs.

Measurement and signages are the first step to quantify the problem, but to really solve it we need to invite more actors to design solutions. We need to bring down emission levels, cut pollution, create watchdogs for industries, publish the right data and involve more people; only a holistic approach can provide solutions. For instance, you can have industrial watchdogs, but if the measurements are not right then industries can get away scot-free. You can have the right measurements and yet the watchdog agency can look the other way if we don’t have the right pressure groups to ensure compliance.

Fixing just one part is not the answer. To address the concerns we need to create pressure groups, have more discussions, create more literature and design solutions. Awareness about this topic has to be integrated into the social fabric and cultural practices. The RTO should check the NOx/SOx pollution levels of vehicles, citizens should report increased CO levels in their vicinity and hospitals should display PM2.5 levels outside their doors. That is about the awareness and data capture parts.

To go from data to action, we need to do the right analysis too. We need to

A) Make the data available in the public domain
B) Provide the right format for different stakeholders

Tomorrow, the Government may install 500 stations in a city, but if the data is not open and available for analysis then it defeats the purpose of having stations. Look at the “Digital India” initiative. GOI departments just dump their PDF, Excel and Word files, and it would be a humongous task to sift through the data and make sense of it. Citizens and researchers should demand open data that can be consumed easily by machines. What do we mean by that?

We should not assume that human intelligence will be available to act on the data and make sense of it. To illustrate, a table of numbers in Excel or PDF is also *digital* data, but it is not easy to parse and get data out of it without human intervention. A human understands that the following two tables hold the same data.

 T    | RH
 30.2 | 80.0

 RH   | T
 80.0 | 30.2

However, a machine cannot reach that conclusion easily. Most of the *digital* data dumps assume that humans will look at the data and make sense of it.

To achieve scale, we need systems where machines, instead of humans, can analyze large chunks of data. However, machines do not have cognitive skills, and to achieve the above goals the involved parties need to agree beforehand on formats and conventions.
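To make the point about agreed conventions concrete, here is a minimal Python sketch. It assumes the parties have agreed on the column names T and RH and on a pipe-delimited layout with a header row; both are assumptions for illustration, not an existing standard. Once the names are fixed, column order stops mattering to the machine.

```python
import csv
import io


def parse_readings(text: str) -> list[dict[str, float]]:
    """Parse pipe-delimited rows into dictionaries keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|", skipinitialspace=True)
    # Strip stray spaces around header names; float() tolerates padded values.
    return [{key.strip(): float(value) for key, value in row.items()} for row in reader]


table_a = "T | RH\n30.2 | 80.0\n"
table_b = "RH | T\n80.0 | 30.2\n"

# Both layouts yield the same record because fields are looked up by name.
print(parse_readings(table_a) == parse_readings(table_b))
```

The parser has no idea what T or RH mean. It reaches the same result for both tables only because the header names were agreed in advance.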

Machine-level automation requires this trait to be a central tenet of system design and not something bolted on as an afterthought. A simple way to achieve that is to provide an API for the data that different programs can use to access information. We need systems that publish simple APIs. Something like: I give you a latitude and longitude, and you return a number that is the carbon monoxide level from the nearest station. More coverage and open systems are the only way to "know" more.
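The core of such a lookup might look like the following Python sketch. The station names, coordinates and CO values are invented for illustration, and the function names are hypothetical; a real service would sit behind a network API and read from live data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    name: str
    lat: float
    lon: float
    co_ppm: float  # latest carbon monoxide reading


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_co_ppm(lat, lon, stations):
    """Return the CO reading of the station closest to (lat, lon)."""
    if not stations:
        raise ValueError("no stations available")
    nearest = min(stations, key=lambda s: haversine_km(lat, lon, s.lat, s.lon))
    return nearest.co_ppm


# Invented stations and readings, for illustration only.
STATIONS = [
    Station("station-a", 12.97, 77.59, 1.2),
    Station("station-b", 13.10, 77.59, 0.8),
]

print(nearest_co_ppm(12.98, 77.60, STATIONS))
```

The caller needs to know nothing about how stations are named or where they are. That is what makes the interface usable by other programs.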

You have to create the right data and then publish it in the right way so that more people can participate. The ultimate aim is that

- Machines can consume data from other machines
- Machines can generate different formats for different target audiences.
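As a small illustration of one record serving two audiences, the Python sketch below renders the same reading as JSON for other programs and as a plain sentence for a citizen-facing display. The field names and the sample reading are assumptions made up for this example.

```python
import json


def to_json(reading: dict) -> str:
    """Machine-facing format: keys sorted so the output is stable."""
    return json.dumps(reading, sort_keys=True)


def to_sentence(reading: dict) -> str:
    """Human-facing format for a signboard or an app."""
    return (
        f"{reading['station']}: PM2.5 is {reading['pm25_ug_m3']:.0f} ug/m3 "
        f"at {reading['time']}"
    )


# Made-up reading with hypothetical field names.
reading = {
    "station": "example-station-1",
    "time": "2016-01-26T09:00:00+05:30",
    "pm25_ug_m3": 61.4,
}

print(to_json(reading))
print(to_sentence(reading))
```

Both outputs come from the same structured record, so a new audience only needs a new formatter and never a new data source.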

The second part of this paper will look at what to measure, what the Air Quality Index (AQI) is, which gases are included in the AQI and what impact they have on health.


  1. Very interesting. Please check our advanced air quality monitor that could make your life much easier to collect measurements: https://airvisual.com/monitor

  2. Have you guys checked this out? http://breathe.indiaspend.org/

  3. @siddhya - Yes, we have checked Breathe from IndiaSpend. They are doing an incredible job with PM monitoring.