The Journal of
the Korean Society on Water Environment

Bimonthly
  • ISSN : 2289-0971 (Print)
  • ISSN : 2289-098X (Online)
  • KCI Accredited Journal

Title Cost-Sensitive Deep Learning for High-Concentration Chl-a Forecasting in River Systems
Authors 강덕준(Dejun Jiang) ; 권혁구(Hyuk-Ku Kwon)
DOI https://doi.org/10.15681/KSWE.2026.42.3.267
Page pp.267-279
ISSN 2289-0971
Keywords Algal bloom prediction; Cost-sensitive learning; Gated recurrent unit; Imbalanced regression; Local outlier factor
Abstract Predicting chlorophyll-a (Chl-a) concentrations in riverine systems remains a significant challenge for water quality management. In the Gapcheon River basin, conventional regression models consistently underestimate peak events because high-concentration observations are statistically rare. This study introduces an end-to-end deep learning framework that addresses data quality and distributional bias without generating synthetic samples. The framework uses 7-day input sequences to predict the next-step Chl-a concentration. Density-based anomaly detection with the local outlier factor (LOF) selectively removes non-representative records while preserving valid seasonal extremes. A sigmoid-weighted cost-sensitive loss function then shifts training emphasis toward the underrepresented bloom tail, so that bidirectional gated recurrent unit (GRU) and long short-term memory (LSTM) networks prioritize peak magnitude and rising-limb dynamics. A systematic evaluation across three weighting regimes and five random seeds showed that GRU consistently outperformed LSTM. The GRU Baseline achieved an overall R² of 0.87 ± 0.01 and an RMSE of 3.84 ± 0.14 μg/L. On the high-concentration subset, the moderate Thresh_0.75 setting yielded the best mean performance, with GRU Thresh_0.75 reaching an R² of 0.53 ± 0.03 and an RMSE of 6.37 ± 0.19 μg/L. Permutation-based feature attribution indicated that models trained with the cost-sensitive loss relied heavily on hydrological dilution signals and on dissolved oxygen and pH, which are indicators of acute eutrophication. These findings demonstrate that targeted loss modification provides a coherent and plausible approach to forecasting high concentrations in continuously monitored river systems.
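The abstract names two mechanics that are easy to illustrate: slicing a monitoring record into 7-day input windows with a next-step target, and weighting the regression loss with a sigmoid so that errors on high Chl-a targets cost more. The sketch below is one plausible NumPy reading of those ideas and is not the authors' implementation. The weight form 1 + max_extra * sigmoid(steepness * (y - threshold)), the parameter names, the placement of Chl-a in column 0, and the reading of "Thresh_0.75" as the 0.75 quantile of the targets are all assumptions made for illustration. The demo data are synthetic.

```python
import numpy as np


def make_windows(series, window=7):
    """Turn a (T, F) record into (T - window, window, F) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    n = series.shape[0] - window
    if n <= 0:
        raise ValueError("record must be longer than the window")
    X = np.stack([series[i:i + window] for i in range(n)])
    y = series[window:, 0]  # assumption: column 0 holds Chl-a
    return X, y


def sigmoid_weights(y_true, threshold, steepness=1.0, max_extra=4.0):
    """Per-sample weights rising smoothly from 1 to 1 + max_extra around the threshold.

    Hypothetical weight form; the paper's exact function is not given in the abstract.
    """
    z = steepness * (np.asarray(y_true, dtype=float) - threshold)
    z = np.clip(z, -50.0, 50.0)  # keep exp() well within floating-point range
    return 1.0 + max_extra / (1.0 + np.exp(-z))


def weighted_mse(y_true, y_pred, threshold, steepness=1.0, max_extra=4.0):
    """Weight-normalized squared error that emphasizes high-concentration targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = sigmoid_weights(y_true, threshold, steepness, max_extra)
    return float(np.sum(w * (y_pred - y_true) ** 2) / np.sum(w))


# Synthetic demo record: 60 days, 3 features (not real monitoring data).
rng = np.random.default_rng(0)
demo = rng.gamma(shape=2.0, scale=5.0, size=(60, 3))
X_demo, y_demo = make_windows(demo, window=7)

# Assumption: the threshold is taken as the 0.75 quantile of the training targets.
tau = np.quantile(y_demo, 0.75)
mean_forecast = np.full_like(y_demo, y_demo.mean())
print(X_demo.shape, y_demo.shape, weighted_mse(y_demo, mean_forecast, threshold=tau))
```

In a training loop the same weights would multiply the per-sample squared error inside the framework's loss function. Normalizing by the weight sum keeps the loss on the same scale as an unweighted MSE, which makes the weighting regimes easier to compare.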