AIQUAM (Artificial Intelligence-based water QUAlity Model) is trained to predict seawater contamination levels using a supervised machine learning approach. The training data are constructed by coupling microbiological field samples, categorized into three discrete classes based on Escherichia coli concentration (MPN/100 g), with time series of pollutant concentrations simulated by the high-resolution WaComM++ Lagrangian model. Each training instance corresponds to the 72-hour time window preceding a microbiological sampling event, representing the cumulative exposure of mussels to bacterial contaminants.
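The coupling of sampling events with simulated concentrations can be illustrated with a minimal sketch. It assumes that the simulated series at a sampling site is available as a mapping from hourly timestamps to concentration values, and that the window covers the 72 hours strictly before the sampling time. Both are assumptions for illustration, as are the names `extract_window` and `build_training_set`. Class labels are taken as given, since the class thresholds are not stated here.

```python
from datetime import datetime, timedelta

WINDOW_HOURS = 72  # from the text: 72-hour window before each sampling event


def extract_window(series, sample_time, hours=WINDOW_HOURS):
    """Return the hourly concentrations for the `hours` preceding `sample_time`.

    `series` maps hourly datetimes to simulated concentrations (hypothetical
    layout). Returns None if any hour is missing, so that incomplete windows
    can be skipped rather than padded.
    """
    window = []
    for h in range(hours, 0, -1):  # oldest hour first, sampling hour excluded
        t = sample_time - timedelta(hours=h)
        if t not in series:
            return None
        window.append(series[t])
    return window


def build_training_set(series, samples):
    """Pair each labeled sample with its preceding concentration window."""
    X, y = [], []
    for sample_time, label in samples:
        window = extract_window(series, sample_time)
        if window is not None:
            X.append(window)
            y.append(label)
    return X, y


# Toy data: 100 hours of synthetic concentrations and two labeled samples.
start = datetime(2024, 1, 1)
series = {start + timedelta(hours=i): float(i) for i in range(100)}
samples = [
    (start + timedelta(hours=80), 0),  # full 72-hour history available
    (start + timedelta(hours=50), 2),  # history incomplete, so it is skipped
]
X, y = build_training_set(series, samples)
```

In this toy case only the first sample yields a training instance, because the second one lacks a complete 72-hour history in the simulated series.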
To address class imbalance in the labeled dataset, in which Class 0 samples predominate, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to augment the underrepresented classes. The feature space comprises the hourly concentration values across the 72-hour window, forming multivariate time series inputs. AIQUAM evaluates multiple time series classification (TSC) models, including K-Nearest Neighbors (KNN), KNN combined with Dynamic Time Warping (KNN+DTW), and Convolutional Neural Networks (CNN). Among these, the standard KNN model achieved the best performance, with an overall classification accuracy of approximately 93%, and was therefore adopted as the baseline for operational deployment.
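The oversampling and classification steps can be sketched as follows. This is a simplified, standard-library illustration of the SMOTE idea (a synthetic sample is an interpolation between a minority sample and one of its nearest same-class neighbours) followed by a plain Euclidean KNN vote. It is not the implementation used for AIQUAM, which is not specified here. The function names, the toy series of length 4 (instead of 72), and the values of `k` are assumptions. In practice, library implementations such as `SMOTE` in imbalanced-learn and `KNeighborsClassifier` in scikit-learn would likely be used.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_like(X, y, k=2, seed=0):
    """Oversample each minority class up to the majority class size.

    Simplified SMOTE-style interpolation: pick a real sample, pick one of its
    k nearest same-class neighbours, and place a new point between them.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two real samples
        for _ in range(target - n):
            base = rng.choice(members)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: euclidean(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Return the majority label among the k nearest training series."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: euclidean(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy imbalanced dataset: four Class 0 series, two each of Class 1 and Class 2.
X = [
    [0.0, 0.1, 0.0, 0.1], [0.1, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.1], [0.1, 0.1, 0.0, 0.0],
    [1.0, 1.1, 1.0, 1.1], [1.1, 1.0, 1.1, 1.0],
    [2.0, 2.1, 2.0, 2.1], [2.1, 2.0, 2.1, 2.0],
]
y = [0, 0, 0, 0, 1, 1, 2, 2]

X_bal, y_bal = smote_like(X, y)
predicted = knn_predict(X_bal, y_bal, [1.05, 1.05, 1.05, 1.05])
```

After oversampling, each class holds four series, and the query series close to the Class 1 examples is assigned to Class 1.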

In the inference phase, AIQUAM uses real-time or forecasted data from WaComM++ to generate spatially and temporally resolved predictions of water quality. Specifically, the model takes as input the predicted bacterial concentration time series at each grid point and applies the trained classifiers to estimate the categorical quality class (0: low contamination, 1: moderate, 2: high) at each location and forecast timestep.
The classification task is handled through a weighted majority voting ensemble, leveraging the outputs of the different base classifiers to enhance robustness and reduce model bias. The inference process is designed to be lightweight and parallelizable, enabling scalable deployment in high-performance computing (HPC) or cloud-native environments. AIQUAM’s predictions are intended to support decision-making in environmental monitoring and aquaculture management, offering a predictive layer on top of traditional oceanographic simulations to proactively identify contamination risks.
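The weighted majority vote can be illustrated with a short sketch. The classifier names, the weights, the grid coordinates, and the tie-breaking rule are all hypothetical, since the weighting scheme is not specified here. The sketch only shows how per-classifier labels at each grid point could be combined into a single class. Because each grid point is processed independently, the same function could be mapped over the grid in parallel.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-classifier class labels into one label by weighted vote.

    `predictions` maps classifier name -> predicted class (0, 1 or 2).
    `weights` maps classifier name -> vote weight (hypothetical values).
    Ties are resolved toward the higher contamination class, which is an
    assumption made here for caution, not a rule taken from the source.
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    return max(scores, key=lambda label: (scores[label], label))


# Hypothetical weights for the three base classifiers named in the text.
weights = {"knn": 0.40, "knn_dtw": 0.35, "cnn": 0.25}

# Hypothetical per-classifier predictions at three grid points.
grid_predictions = {
    (0, 0): {"knn": 0, "knn_dtw": 0, "cnn": 1},
    (0, 1): {"knn": 0, "knn_dtw": 2, "cnn": 2},
    (1, 0): {"knn": 1, "knn_dtw": 2, "cnn": 1},
}

# One ensemble class per grid point; each point is independent of the others.
class_map = {
    point: weighted_vote(preds, weights)
    for point, preds in grid_predictions.items()
}
```

With these illustrative weights, the second grid point is assigned Class 2 even though the highest-weighted classifier predicts Class 0, because the two other classifiers together outweigh it.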




