9:45 AM - 10:00 AM
[MGI29-04] Machine learning-based method for concentration simulation and source apportionment of Zn, Cu and Pb in Kosaka River, Northeast Japan
Keywords: heavy metal pollutants, machine learning technique, simulate, contamination sources, pollution level, river system
We collected river water and sediment samples along the Kosaka River. The analytical results indicated a high possibility of Zn pollution in the tributaries. To trace the sources of heavy metals specifically, the study area was divided into individual tributary polygons and mainstream polygons based on digital elevation model (DEM) data in QGIS. For an individual tributary polygon, the heavy metal loads in the tributary were assumed to depend only on the features within that polygon, such as geological features, land use type, precipitation, and mine site information. For the mainstream, the heavy metal loads depended on both the internal features of the mainstream polygon and the tributaries merging into the mainstream. Several machine learning algorithms, such as random forest (RF) and support vector machine (SVM), were applied to build the models so that the results of different algorithms could be cross-checked against each other. Selected internal features of the polygons were used as input variables to develop models for predicting heavy metal concentrations in river water and sediment.
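The polygon dependency described above can be illustrated with a minimal Python sketch. It is not the model used in this study: the `Polygon` class, the `predict_load` function, the feature names, and the linear `toy_model` (a stand-in for a trained RF or SVM regressor) are all hypothetical. The sketch also assumes that tributary loads simply add to the mainstream's internal contribution, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Polygon:
    """A tributary or mainstream polygon with its internal features."""
    name: str
    features: Features
    # Tributary polygons that merge into this polygon (empty for a tributary).
    tributaries: List["Polygon"] = field(default_factory=list)


def predict_load(polygon: Polygon, model: Callable[[Features], float]) -> float:
    """Estimate the heavy metal load leaving a polygon.

    A tributary polygon depends only on its own features. A mainstream
    polygon adds the loads of the tributaries merging into it
    (additivity is an assumption made for this sketch).
    """
    own = model(polygon.features)
    inflow = sum(predict_load(t, model) for t in polygon.tributaries)
    return own + inflow


def toy_model(features: Features) -> float:
    # Hypothetical linear stand-in for a trained regressor.
    return 2.0 * features["mine_sites"] + 0.5 * features["precipitation"]


trib_a = Polygon("tributary A", {"mine_sites": 1.0, "precipitation": 2.0})
trib_b = Polygon("tributary B", {"mine_sites": 0.0, "precipitation": 4.0})
mainstream = Polygon(
    "mainstream reach",
    {"mine_sites": 0.0, "precipitation": 2.0},
    tributaries=[trib_a, trib_b],
)

print(predict_load(trib_a, toy_model))      # load from tributary A alone
print(predict_load(mainstream, toy_model))  # internal load plus both tributaries
```

In practice the `model` argument would be replaced by the prediction function of whichever fitted algorithm is being tested, so the same polygon structure can be reused when algorithms are compared.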
The preliminary modeling results indicate that the method is feasible and effective. However, some input variables need to be adjusted to make the model more accurate. Sensitivity analysis is currently being conducted to refine the input variables and calculate their importance. The variable importance can be used to infer the sources of heavy metals and to estimate their background concentrations. After the input variables are readjusted, the coefficient of determination (R²) and the mean absolute error (MAE) will be used as statistical measures to evaluate the performance of the applied machine learning algorithms, and the optimal algorithm will then be selected.
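For reference, the two evaluation measures can be computed as in the following sketch, which uses the standard definitions of R² and MAE. The function names and the sample values are illustrative only and are not data from this study. Libraries such as scikit-learn provide equivalent functions (`r2_score`, `mean_absolute_error`).

```python
from typing import Sequence


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE: the average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """R²: 1 minus the ratio of residual to total sum of squares."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not measurements from the Kosaka River).
observed = [1.0, 3.0, 5.0, 7.0]
predicted = [2.0, 6.0, 5.0, 7.0]

print("MAE:", mean_absolute_error(observed, predicted))
print("R2:", r_squared(observed, predicted))
```

A higher R² and a lower MAE on held-out samples would favor one algorithm over another, although the two measures need not agree on the ranking.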