Anomaly Detection in Sewer Systems Using Gaussian Process Regression

Mohsen Rezaee, Peter Melville-Shreeve, Hussein Rappel · CCWI 2025 - 21st Computing & Control for the Water Industry Conference · 2025

anomaly-detection, gaussian-process-regression, sewer-systems, probabilistic-forecasting, blockage-detection

Authors: Mohsen Rezaee, Peter Melville-Shreeve, Hussein Rappel
Year: 2025
Tags: gaussian-process-regression, anomaly-detection, sewer-systems, blockage-detection, probabilistic-forecasting, urban-drainage

TL;DR

A GPR model is trained on hydraulic simulator output to probabilistically forecast manhole water depth in a combined sewer; deviations outside a 95% credible interval sustained for ≥3 consecutive hours trigger a blockage alarm. The approach is demonstrated on a single simulated blockage scenario, providing an alarm with roughly 8 hours of lead time before manhole overflow.

First pass — the five C's

Category. Research prototype / proof-of-concept (conference short paper).

Context. Urban hydroinformatics / smart sewer monitoring. Builds on: Rezaee et al. (2025) preprint — same sewer network and emulator; Rosin et al. (2022) — ANN + statistical process control for near-real-time blockage detection; Glyn-Davies & Girolami (2022) — GP-based anomaly detection in streaming data; Williams & Rasmussen (2006) — foundational GP textbook.

Correctness. Load-bearing assumptions: (1) the hydraulic simulator faithfully represents real sensor observations, so that training on simulator output transfers to field deployment; (2) blockages manifest as sustained, directional depth deviations that fall outside the 95% credible interval; (3) a fixed 3-hour persistence threshold is operationally meaningful. Assumption (1) is never validated against real sensor data in this paper.

Contributions.
- Frames sewer anomaly detection as a credible-interval exceedance problem under full probabilistic GPR output.
- Shows directional signatures differ upstream (depth rise) and downstream (depth drop) of a blockage, enabling localization.
- Establishes a 3-hour consecutive-exceedance trigger that provides lead time before overflow in the tested scenario.
- Notes explicitly that gradual blockages embedded in training data constitute a systematic miss-detection failure mode.
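The directional-signature idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the rule "a rise followed immediately downstream by a drop brackets the blockage" are assumptions made for this note. Only the node names J2 and J3 come from the paper.

```python
def deviation_sign(observed, lower, upper):
    """+1 if the depth is above the interval, -1 if below, 0 if inside."""
    if observed > upper:
        return 1
    if observed < lower:
        return -1
    return 0


def locate_blockage(signs_by_node):
    """Find the adjacent node pair that brackets a blockage.

    `signs_by_node` is a list of (node_name, sign) ordered from upstream to
    downstream. Assumed rule: depth rises (+1) just upstream of the blockage
    and drops (-1) just downstream of it.
    """
    for (up, s_up), (down, s_down) in zip(signs_by_node, signs_by_node[1:]):
        if s_up == 1 and s_down == -1:
            return (up, down)
    return None


# Illustrative depths and interval bounds in metres (made-up numbers).
signs = [
    ("J2", deviation_sign(0.90, 0.40, 0.60)),  # upstream: depth rise
    ("J3", deviation_sign(0.10, 0.30, 0.50)),  # downstream: depth drop
]
bracket = locate_blockage(signs)
```

With these toy values the rule returns the J2 to J3 reach, which matches where the paper inserts its blockage.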

Clarity. Writing is concise and accessible; however, quantitative model performance metrics are withheld ("model checking metrics indicate a satisfactory fit"), weakening reproducibility for a methods paper.

Second pass — content

Main thrust: GPR is fitted on one month of simulator-generated data (15-minute intervals, inputs: time and rainfall depth) to produce 95% credible intervals for manhole water depth; sustained exceedance triggers a blockage alarm roughly 8 hours before overflow occurs.
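To make the credible-interval construction concrete, here is a minimal pure-Python sketch of GP regression with inputs (time, rainfall depth). Everything beyond the standard GP posterior equations is an assumption: the paper does not describe its kernel, so a squared-exponential kernel is swapped in, and the hyperparameters, noise level, and toy data are invented for illustration.

```python
import math


def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel (an assumed choice, not the paper's)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return var * math.exp(-0.5 * d2 / length ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x


def solve_lower_transposed(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x


def gp_predict(X, y, x_star, noise=1e-2, length=1.0, var=1.0):
    """Posterior mean and 95% interval for a noisy observation at x_star."""
    n = len(X)
    K = [[rbf(X[i], X[j], length, var) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y))  # K^{-1} y
    k_star = [rbf(x, x_star, length, var) for x in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = var + noise - sum(t * t for t in v)
    half = 1.96 * math.sqrt(max(variance, 0.0))  # 95% of a Gaussian
    return mean, mean - half, mean + half


# Toy inputs: [time in hours, rainfall depth in mm]; targets: depth in metres.
X = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.5], [3.0, 0.2]]
y = [0.30, 0.32, 0.45, 0.38]
mean, lower, upper = gp_predict(X, y, [1.5, 0.2])
```

An observation is then flagged when it falls outside `[lower, upper]`. A real application would fit the hyperparameters (for example by maximising the marginal likelihood) rather than fix them as done here.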

Supporting evidence:
- Training set: 1 month of data, 15-minute resolution (≈2,880 points); no separate validation set size reported.
- Blockage inserted between nodes J2 (upstream) and J3 (downstream); downstream depth falls below the 95% credible interval, upstream depth rises above it.
- Alarm triggered after 3 consecutive hours of exceedance.
- Manhole overflow occurs approximately 11 hours after blockage initiation, leaving ≈8 hours of operational lead time.
- No precision, recall, F1, or RMSE values are reported anywhere in the paper.
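The persistence rule above is simple enough to sketch. The version below assumes that "3 consecutive hours" at 15-minute sampling means 12 back-to-back samples outside the interval; the paper does not spell out this counting, and the function name is hypothetical.

```python
SAMPLES_PER_HOUR = 4  # 15-minute sampling interval


def first_alarm_index(observed, lower, upper, hours=3):
    """Index of the sample that completes the first sustained exceedance.

    Returns None if no run of out-of-interval samples reaches the required
    length. Any in-interval sample resets the run counter.
    """
    needed = hours * SAMPLES_PER_HOUR
    run = 0
    for i, (obs, lo, hi) in enumerate(zip(observed, lower, upper)):
        run = run + 1 if (obs < lo or obs > hi) else 0
        if run >= needed:
            return i
    return None


# Toy series: 20 normal samples, then a sustained rise above the interval.
observed = [1.0] * 20 + [2.0] * 15
lower = [0.8] * 35
upper = [1.2] * 35
alarm = first_alarm_index(observed, lower, upper)
```

In this toy series the exceedance starts at index 20, so the alarm fires at index 31, three hours after onset. Applied to the paper's scenario, the same arithmetic gives the reported lead time: overflow at about 11 hours after onset minus the 3-hour persistence window leaves about 8 hours.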

Figures & tables: Two figures (Figures 1 and 2) show time-series plots of predicted mean (red line) and observed values (blue bullets) at downstream and upstream manholes respectively. The 95% credible interval appears as shading (described as "shaded red"). Axis units and labels are referenced in captions but not described in detail in the text; statistical significance is not reported; no error bars on predictions beyond the credible interval itself; no tables of metrics.

Follow-up references:
- Rezaee et al. (2025) preprint: the extended version of this work with full methodology and likely more results.
- Rosin et al. (2022): nearest competing method (ANN + SPC for blockage detection) that should serve as a direct baseline.
- Glyn-Davies & Girolami (2022): GP-based anomaly detection framework directly relevant to the probabilistic approach used here.
- Williams & Rasmussen (2006): essential background for GPR kernel selection and inference.

Third pass — critique

Implicit assumptions:
- Simulator-to-reality transfer: the GPR is trained entirely on PySWMM/SWMM hydraulic model output, not real sensor data. If the simulator does not capture real sensor noise, drift, or diurnal patterns accurately, the credible intervals may be miscalibrated in deployment. This assumption is untested, and its failure would break the anomaly detection logic.
- Stationarity of the GP prior: the kernel choice and its hyperparameters are not described; non-stationarity in real sewer flow (storm events, seasonal infiltration) could inflate false positive rates.
- The 3-hour threshold is treated as a tunable parameter but no sensitivity analysis is provided; its optimality is asserted, not demonstrated.
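One way to probe the miscalibration concern is an empirical coverage check on anomaly-free held-out data: a well-calibrated 95% interval should contain roughly 95% of observations. The sketch below is a suggestion of this note, not something the paper reports, and the data are made up.

```python
def empirical_coverage(observed, lower, upper):
    """Fraction of observations that fall inside their interval."""
    inside = sum(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))
    return inside / len(observed)


# Toy held-out series: 19 of 20 points lie inside the interval.
observed = [1.0] * 19 + [1.5]
lower = [0.8] * 20
upper = [1.2] * 20
coverage = empirical_coverage(observed, lower, upper)
```

Coverage well below the nominal level on real sensor data would suggest intervals that are too narrow, and therefore an inflated false alarm rate; coverage well above it would suggest intervals too wide to catch blockages early.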

Missing context or citations:
- No comparison to the Rosin et al. (2022) ANN + SPC method, which is cited but not benchmarked against.
- No engagement with threshold-free or scoring-based anomaly detection approaches (e.g., isolation forests, autoencoders) that are common in this domain.
- Real-world sensor noise, missing data, and data drift are acknowledged in passing but no citation or method addresses them.
- Only a single blockage type and location tested; pump failure and structural collapse (mentioned in the introduction as target anomalies) are not evaluated.

Possible experimental / analytical issues:
- Single synthetic scenario: one blockage event in one network; no cross-validation, no multiple blockage severities, no partial/gradual blockage experiments beyond acknowledging they are a limitation.
- No quantitative metrics reported (precision, recall, false positive rate, detection latency distribution); "satisfactory fit" is unverifiable.
- Training and test data are both from the same simulator run; there is no out-of-distribution test.
- Kernel specification entirely omitted, so readers cannot assess or reproduce the GPR fit.
- The claim that GPR "performs well with limited data" is invoked as motivation, but 2,880 training points is not a limited dataset for a scalar regression task; this framing is unsupported.
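For reference, the missing metrics are cheap to compute once per-sample alarm flags and ground-truth blockage labels exist. The sketch below uses per-sample counting and hypothetical function names; the labels are toy values, not results from the paper.

```python
def detection_metrics(flags, truth):
    """Per-sample precision, recall, and false positive rate."""
    tp = sum(1 for f, t in zip(flags, truth) if f and t)
    fp = sum(1 for f, t in zip(flags, truth) if f and not t)
    fn = sum(1 for f, t in zip(flags, truth) if not f and t)
    tn = sum(1 for f, t in zip(flags, truth) if not f and not t)

    def ratio(num, den):
        return num / den if den else None  # undefined when the denominator is 0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }


def detection_latency(flags, truth):
    """Samples from anomaly onset to the first flag at or after onset."""
    if True not in truth:
        return None
    onset = truth.index(True)
    for i in range(onset, len(flags)):
        if flags[i]:
            return i - onset
    return None  # anomaly never flagged


# Toy labels: anomaly spans samples 3 to 6; one false alarm at sample 1.
truth = [False, False, False, True, True, True, True, False]
flags = [False, True, False, False, True, True, True, False]
metrics = detection_metrics(flags, truth)
latency = detection_latency(flags, truth)
```

Reporting these over many simulated blockage events, rather than one, would also give the detection latency distribution that the notes above ask for.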

Ideas for future work:
- Validate on real sensor data from operational sewer networks, including sensor faults and missing observations, to test the simulator-to-reality gap.
- Systematic sensitivity analysis of the persistence threshold (3 hours) across blockage severities and network topologies to derive principled threshold selection.
- Benchmark against Rosin et al. (2022) and at least one non-probabilistic baseline (e.g., LSTM, ARIMA) using standard detection metrics on a shared test set.
- Incorporate physics-informed kernel design (e.g., encoding hydraulic residence time or pipe capacity constraints) to improve robustness to gradual blockages, consistent with the stated future direction.
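The threshold sensitivity analysis suggested above could start from something as small as the sweep below, which trades false alarms against detection delay for several persistence lengths. The exceedance series, the onset index, and all names are invented for illustration; a real study would run this over many simulated events.

```python
def alarm_indices(exceed, needed):
    """Indices at which a run of exceedances first reaches `needed` samples."""
    run, alarms = 0, []
    for i, flag in enumerate(exceed):
        run = run + 1 if flag else 0
        if run == needed:  # fires once per run
            alarms.append(i)
    return alarms


def sweep_persistence(exceed, onset, hours_options, samples_per_hour=4):
    """Return (hours, false_alarms, delay_hours) for each persistence length.

    Alarms before `onset` count as false alarms; the delay is measured from
    onset to the first alarm at or after it (None if the event is missed).
    """
    rows = []
    for hours in hours_options:
        alarms = alarm_indices(exceed, hours * samples_per_hour)
        false_alarms = sum(1 for a in alarms if a < onset)
        hits = [a for a in alarms if a >= onset]
        delay = (hits[0] - onset + 1) / samples_per_hour if hits else None
        rows.append((hours, false_alarms, delay))
    return rows


# Toy series: a short noise burst (5 samples), then a blockage from index 40.
exceed = [False] * 10 + [True] * 5 + [False] * 25 + [True] * 40
rows = sweep_persistence(exceed, onset=40, hours_options=[1, 2, 3])
```

In this toy case a 1-hour window reacts fastest but also fires on the noise burst, while 2-hour and 3-hour windows suppress the false alarm at the cost of a longer delay.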

Methods

Gaussian Process Regression
probabilistic forecasting
95% credible interval construction
hydraulic simulation via PySWMM
data-driven surrogate modelling

Datasets

combined sewer system water-depth data at manholes (15-minute intervals, one month of training data), produced by the hydraulic simulator rather than measured by field sensors
hydraulic simulator-generated data

Claims

GPR enables probabilistic forecasting of water depth in combined sewer systems with quantified uncertainty.
Deviations from 95% credible intervals derived from GPR predictions can be used to flag potential sewer anomalies such as blockages.
A 3-hour consecutive-deviation threshold provides sufficient lead time for operational response before manhole overflow occurs.
Probabilistic approaches outperform deterministic methods for anomaly detection in sewer systems due to inherent uncertainties in inputs.
Gradual blockages that develop slowly may evade detection because their effects become embedded in training data.