2:00 PM - 2:20 PM
[MGI31-02] SNAP-CII: Algorithm to estimate aerosol concentration using commonly used camera
★Invited Papers
Keywords: Aerosol, Machine learning, Image analysis
We used sky images taken in 10 cities in Japan (Fukuoka, Kitakyushu, Nagasaki, Obama, Osaka, Kyoto, Nagoya, Fuji, Tokyo, and Sapporo). Aerosol concentration was represented by the suspended particulate matter (SPM) concentration measured by the Atmospheric Environmental Regional Observation System (AEROS). Data for sunny daytime conditions (9:00 to 18:00) were selected using the global solar irradiance (GSI) and diffuse solar irradiance (DSI) measured by the Himawari-8 satellite; the selection criterion in this study was DSI/GSI < 0.15. Temperature and relative humidity data were obtained from the Japan Meteorological Agency database <https://www.data.jma.go.jp/gmd/risk/obsdl/index.php, last accessed February 14, 2023>.
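The selection rule above can be sketched as a small filter. This is an illustrative sketch only: the function name `is_clear_daytime`, its argument names, and the treatment of the 9:00 and 18:00 boundaries as inclusive are assumptions, not details given in this abstract.

```python
def is_clear_daytime(hour, gsi, dsi, max_ratio=0.15):
    """Return True if a sample passes the sunny-daytime selection.

    hour: local time in decimal hours (hypothetical input format).
    gsi:  global solar irradiance; dsi: diffuse solar irradiance (same units).
    """
    # Daytime window 9:00 to 18:00 (inclusive bounds are an assumption).
    if not (9.0 <= hour <= 18.0):
        return False
    # Without positive global irradiance the ratio is undefined.
    if gsi <= 0:
        return False
    # Clear-sky criterion from the text: DSI/GSI < 0.15.
    return dsi / gsi < max_ratio
```

For example, a noon sample with GSI = 800 and DSI = 80 gives a ratio of 0.1 and is kept, while DSI = 200 gives 0.25 and is rejected.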
A machine learning (ML) model was developed to classify the SPM concentration into three classes (Low: 0 to 10, Middle: 10 to 30, High: > 30 µg/m³) from the sky image data. The class boundaries were derived from the 2021 Air Quality Guidelines (AQG) of the World Health Organization: 10 µg/m³ is the mean of the annual-mean AQG levels for PM2.5 and PM10, and 30 µg/m³ is the mean of the corresponding 24-hour levels. The pixel values (RGB) of the images were converted to ratios of linearized RGB (linB/linG, linG/linR, linB/linR) and ratios of CIE XYZ (Z/Y, Y/X, Z/X). Because these ratios vary with the viewing angle, we calculated their gradients in the vertical direction to obtain values independent of the viewing angle of the camera. These gradients of the pixel value ratios were the input variables of the ML models. To account for seasonality, the cosine of the solar zenith angle, the cosine of the solar azimuth angle, temperature, and relative humidity were also included. Cloud pixels were removed using the criterion (linB - linR)/(linB + linR) < 0.2.
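A minimal sketch of this feature extraction is given below, under several assumptions that the abstract does not specify: images are 8-bit sRGB arrays with row 0 at the top, linearization uses the standard sRGB transfer function, XYZ comes from the standard sRGB (D65) matrix, the "vertical gradient" is taken as the slope of a straight-line fit to the row-averaged ratio, and pixels satisfying the cloud criterion are the ones discarded. The names `spm_class`, `srgb_to_linear`, and `vertical_gradient_features` are hypothetical.

```python
import numpy as np

# Standard linear-sRGB (D65) to CIE XYZ matrix (assumed camera color space).
RGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])

def spm_class(spm):
    """Map SPM to 0 (Low, <= 10), 1 (Middle, <= 30), 2 (High, > 30).

    Assigning exactly 10 to Low is an assumption; the text leaves it open.
    """
    return int(np.digitize(spm, [10.0, 30.0], right=True))

def srgb_to_linear(rgb8):
    """Convert 8-bit sRGB values to linear RGB in [0, 1]."""
    c = np.asarray(rgb8, dtype=np.float64) / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def vertical_gradient_features(rgb8, eps=1e-9):
    """Return the vertical slope of each pixel-value ratio for one image.

    rgb8: array of shape (height, width, 3), 8-bit sRGB.
    """
    lin = srgb_to_linear(rgb8)
    r, g, b = lin[..., 0], lin[..., 1], lin[..., 2]
    xyz = lin @ RGB_TO_XYZ.T
    x, y, z = xyz[..., 0], xyz[..., 1], xyz[..., 2]

    # Pixels with (linB - linR)/(linB + linR) < 0.2 are treated as cloud.
    clear = (b - r) / (b + r + eps) >= 0.2
    count = clear.sum(axis=1)          # clear pixels per image row
    keep = count > 0                   # rows with at least one clear pixel
    if keep.sum() < 2:
        raise ValueError("not enough cloud-free rows to fit a gradient")
    rows = np.arange(lin.shape[0])[keep]

    ratios = {
        "linB/linG": b / (g + eps), "linG/linR": g / (r + eps),
        "linB/linR": b / (r + eps),
        "Z/Y": z / (y + eps), "Y/X": y / (x + eps), "Z/X": z / (x + eps),
    }
    features = {}
    for name, ratio in ratios.items():
        # Row-wise mean over clear pixels, then a straight-line fit vs. row.
        profile = np.where(clear, ratio, 0.0).sum(axis=1)[keep] / count[keep]
        features[name] = float(np.polyfit(rows, profile, 1)[0])
    return features
```

The six slopes returned here would be combined with the cosine of the solar zenith angle, the cosine of the solar azimuth angle, temperature, and relative humidity to form one input vector per image.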
We tested three types of classification models: K-nearest neighbor (KN), support vector machine (SVM), and random forest (RF). The data were randomly divided into training and test sets in a ratio of 6:4, and hyperparameter tuning was performed for each model. The accuracy of each model was calculated as the number of correct predictions divided by the total number of samples. For the test image data from Fukuoka in 2021, the accuracies were 70.5%, 72.8%, and 71.6% for KN, SVM, and RF, respectively. The largest source of error was that approximately 15 to 18% of the “Low” data were misclassified as “Middle.”
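The evaluation procedure can be sketched with scikit-learn as follows. The abstract does not state which library, hyperparameter grids, or feature scaling were used, so those choices are assumptions, and the data here are synthetic stand-ins (10 features, 3 classes) rather than the sky image features. The accuracies this sketch produces are therefore unrelated to the reported values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data: 10 input variables, labels 0/1/2
# standing in for Low/Middle/High.
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Random 6:4 split into training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)

# Model plus an illustrative (assumed) hyperparameter grid for each type.
candidates = {
    "KN": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
           {"kneighborsclassifier__n_neighbors": [3, 5, 9]}),
    "SVM": (make_pipeline(StandardScaler(), SVC()),
            {"svc__C": [0.1, 1.0, 10.0]}),
    "RF": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50, 100]}),
}

accuracies = {}
confusions = {}
for name, (model, grid) in candidates.items():
    # Tune on the training data only, using 3-fold cross-validation.
    search = GridSearchCV(model, grid, cv=3)
    search.fit(X_train, y_train)
    pred = search.predict(X_test)
    # Accuracy = correct predictions / total number of test samples.
    accuracies[name] = accuracy_score(y_test, pred)
    # Row-normalized confusion matrix: entry [0, 1] is the fraction of
    # true class 0 ("Low") predicted as class 1 ("Middle").
    confusions[name] = confusion_matrix(y_test, pred, labels=[0, 1, 2],
                                        normalize="true")

for name, acc in accuracies.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

The row-normalized confusion matrix is included because it is the quantity that a statement such as "a given share of Low data were misclassified as Middle" refers to.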
In this study, we developed a three-class classification model of aerosol concentration based on sky image data so that aerosol concentration can be measured more conveniently. The classification algorithm, named SNAP-CII, was developed for clear, sunny conditions, and its accuracy exceeded 70% for the KN, SVM, and RF models on the Fukuoka test data.