Air quality monitoring platform using data mining, big data, and machine learning
Event Type
Poster Presentation
Time
Thursday, April 15, 2:58pm - 3:00pm EDT
Location
Digital Health
Description
The WHO has stated that exposure to outdoor air pollution is responsible for 4.2 million premature deaths every year. Studies have shown that exposure to increasing levels of air pollution is associated with infant mortality from respiratory causes in the postnatal period, as well as with low birth weight, premature birth, and intrauterine growth retardation. Air pollution has also been linked to cardiovascular disease, childhood asthma, and atopic dermatitis. These issues are especially prevalent in Mongolia, widely considered to be the most polluted country in the world.

The government of Mongolia has adopted specific policies and strategies aimed at reducing air pollution. These include providing free electricity in winter to low-income households in the ger district, distributing efficient stoves and face masks to low-income households, and banning the sale and use of raw coal. However, these measures have not overcome the social, structural, and habitual barriers associated with the local population.

The national government and non-profit organizations have pivoted to providing technological solutions. UNICEF in particular has deployed a number of ubiquitous, readily available air quality monitoring sensors that collect air pollution data 24/7 at high velocity. This has increased both the volume of data available and the body of literature that uses these data to understand the impact of air pollution on human health. At the same time, because airborne pollution is a complex mixture, these sensors collect a variety of data types. Although traditional epidemiological and environmental health models have been used to analyze such data for decades, the increasing amount and complexity of the data require new methods of analysis. We emphasize that the prediction-based and knowledge discovery methods found in data mining and machine learning can help epidemiologists, scientists, and governments better understand the data.

We have partnered with UNICEF to develop a platform capable of collecting particulate matter, CO2, and air quality index data from different air quality units, as well as non-sensor data such as hospital admissions. This approach will focus on aggregating data and generating a unique, informative, synthetic raw data source that will be used for future analysis. The platform will be developed using data fusion, a process recommended for IoT solutions that aggregates heterogeneous data from different sources and generates new raw data for further analysis. Additionally, data mining algorithms and machine learning will be applied to hypothesis generation, prediction, and forecasting.
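
As a rough illustration of the data fusion idea described above, the following Python sketch combines readings from several air quality units with daily hospital admission counts into one record per day. It is only a minimal sketch under stated assumptions, not the platform's actual design: the function name `fuse_daily`, the record fields (`unit`, `timestamp`, `pm25`), the daily aggregation window, and the sample values are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def fuse_daily(sensor_readings, admissions):
    """Fuse sensor readings and admission counts into one record per day.

    Hypothetical helper: field names and daily granularity are assumptions.
    """
    # Group PM2.5 readings from all units by calendar day.
    by_day = defaultdict(list)
    for reading in sensor_readings:
        day = datetime.fromisoformat(reading["timestamp"]).date()
        by_day[day].append(reading["pm25"])

    fused = []
    for day in sorted(by_day):
        values = by_day[day]
        fused.append({
            "date": day.isoformat(),
            "pm25_mean": mean(values),
            "n_readings": len(values),
            # Days without an admissions record stay None rather than zero.
            "admissions": admissions.get(day.isoformat()),
        })
    return fused


# Made-up sample data, for illustration only.
readings = [
    {"unit": "A", "timestamp": "2021-01-10T08:00:00", "pm25": 100.0},
    {"unit": "B", "timestamp": "2021-01-10T20:00:00", "pm25": 200.0},
    {"unit": "A", "timestamp": "2021-01-11T08:00:00", "pm25": 50.0},
]
admissions = {"2021-01-10": 12}

fused = fuse_daily(readings, admissions)
for row in fused:
    print(row)
```

Keeping a missing admissions count as `None` instead of zero is a deliberate choice in this sketch, so that later analysis can tell "no data" apart from "no admissions".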

When developing the platform, we intend to use User-Centered Design (UCD) to understand the needs of the various stakeholders and to incorporate their requirements into design and development. The process translates empathetic elements such as thoughts, frustrations, desires, and feelings into systematic elements such as goals, interaction styles, design philosophies, and capabilities.