A data-driven method for estimating sewer inflow and infiltration based on temperature and conductivity monitoring

Jingyu Ge, Jiuling Li, Ruihong Qiu, Tao Shi, Chenming Zhang, Zi Huang, Zhiguo Yuan · Water Research · 2024



Authors: Jingyu Ge, Jiuling Li, Ruihong Qiu, Tao Shi, Chenming Zhang, Zi Huang, Zhiguo Yuan
Year: 2024
Tags: sewer-monitoring, inflow-infiltration, time-series-reconstruction, prophet-model, water-quality-sensors, urban-drainage

TL;DR

A Prophet-model-based algorithm reconstructs dynamic base wastewater flow (BWF) temperature and conductivity profiles from in-sewer measurements; deviations from that baseline are fed into a three-source mass/energy balance to separately quantify surface-water inflow and groundwater infiltration fractions without flow meters. Reconstruction is validated on real data from two Australian catchments (KGE 0.88–0.99); I/I quantification is validated only against simulated data for a small real Australian sewer network (KGE 0.76–0.97).

First pass — the five C's

Category. Research prototype / methodology paper.

Context. Urban sewer engineering / urban hydrology. Builds on: Taylor & Letham (2018) Prophet decomposition model; Zhang et al. (2018a) conductivity-only unit-hydrograph calibration for I/I; Guo et al. (2022) experimental validation of temperature and conductivity as suitable I/I tracers; Figueroa et al. (2021, 2023) in-sewer thermal-hydraulic modelling used here as the simulation engine.

Correctness. Load-bearing assumptions: (1) surface runoff, groundwater, and BWF have sufficiently distinct temperature and conductivity to keep the 3×3 balance system well-conditioned; (2) conductivity behaves as a conserved scalar (no chemical reactions or biofilm uptake); (3) BWF periodicity is stationary across the calibration-to-prediction window; (4) permanent groundwater infiltration is validly absorbed into BWF. All four are plausible in many settings but are not quantitatively bounded in the paper.

Contributions.

  • Prophet-based reconstruction algorithm that separates BWF temperature and conductivity patterns (trend, multi-period, event, residual) from I/I-induced deviations using only dry-weather filtered in-sewer measurements.
  • Three-source mass/energy balance framework that analytically solves for inflow ratio α(t) and infiltration ratio β(t) relative to BWF, requiring no geophysical coefficients.
  • Two-case simulation study on a real Australian network distinguishing intermittent vs. permanent groundwater infiltration scenarios.
  • Demonstrated that conductivity reconstruction is feasible but harder than temperature reconstruction (lower KGE, higher SD), attributed to weaker seasonal signal.
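The reconstruction idea in the first contribution can be illustrated without Prophet itself. The sketch below swaps in plain harmonic regression (linear trend plus Fourier terms, fitted by least squares) as a stand-in; it is not the authors' algorithm. The periods (24 h and 168 h), the Fourier order, and all function names are assumptions made for illustration.

```python
import numpy as np

def design_matrix(t_hours, periods_h, order=2):
    """Columns: intercept, linear trend, and sin/cos pairs for each period."""
    t = np.asarray(t_hours, dtype=float)
    cols = [np.ones_like(t), t]
    for p in periods_h:
        for k in range(1, order + 1):
            cols.append(np.sin(2 * np.pi * k * t / p))
            cols.append(np.cos(2 * np.pi * k * t / p))
    return np.column_stack(cols)

def reconstruct_baseline(t_hours, values, dry_mask, periods_h=(24.0, 168.0)):
    """Fit on dry-weather samples only, then predict the baseline everywhere."""
    x = design_matrix(t_hours, periods_h)
    y = np.asarray(values, dtype=float)
    coef, *_ = np.linalg.lstsq(x[dry_mask], y[dry_mask], rcond=None)
    return x @ coef

# Synthetic example (hypothetical numbers): four weeks of hourly temperature.
t = np.arange(672.0)
true_baseline = 22.0 + 0.001 * t + 1.5 * np.sin(2 * np.pi * t / 24.0)
measured = true_baseline.copy()
measured[200:260] -= 2.0            # an I/I event cools the wastewater
dry = np.ones(t.size, dtype=bool)
dry[200:260] = False                # exclude the event from calibration

baseline = reconstruct_baseline(t, measured, dry)
deviation = measured - baseline     # near zero in dry weather, about -2 in the event
```

The deviation series is what a downstream balance would consume. Prophet adds changepoint-based trends, event terms, and priors that this stand-in does not have.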

Clarity. Logical structure and mathematics are clear; key methodological details (model calibration priors, simulation parameters, BWF flow rate assumption) are deferred to supplementary material that is not reproduced in the paper.

Second pass — content

Main thrust: Fit a Prophet time-series model to dry-weather in-sewer temperature and conductivity data to recover what those signals would be without I/I; then solve a linear system derived from mass and energy conservation — using measured surface-water and groundwater temperature/conductivity as the other two end-members — to recover the fractional contributions of inflow and infiltration at each time step.
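A minimal sketch of the balance step, assuming ideal mixing of three streams with equal density and heat capacity and with conductivity treated as conserved. Under those assumptions each tracer X satisfies X_mix (1 + α + β) = X_bwf + α X_sw + β X_gw, which gives two linear equations in α and β. This is a reduced form written for these notes, not the paper's exact equations, and the function and variable names are hypothetical.

```python
import numpy as np

def solve_inflow_infiltration(t_mix, c_mix, t_bwf, c_bwf, t_sw, c_sw, t_gw, c_gw):
    """Return (alpha, beta): inflow and infiltration as ratios of BWF at one time step.

    Rearranged mixing equations for each tracer X in {T, C}:
        alpha * (X_sw - X_mix) + beta * (X_gw - X_mix) = X_mix - X_bwf
    """
    a = np.array([[t_sw - t_mix, t_gw - t_mix],
                  [c_sw - c_mix, c_gw - c_mix]], dtype=float)
    b = np.array([t_mix - t_bwf, c_mix - c_bwf], dtype=float)
    alpha, beta = np.linalg.solve(a, b)
    return float(alpha), float(beta)

# Forward-mix hypothetical end-members, then recover the ratios.
t_bwf, c_bwf = 24.0, 1200.0   # reconstructed BWF (deg C, uS/cm), made-up values
t_sw, c_sw = 18.0, 100.0      # surface water, made-up values
t_gw, c_gw = 21.0, 3000.0     # groundwater, made-up values
alpha_true, beta_true = 0.3, 0.2
total = 1.0 + alpha_true + beta_true
t_mix = (t_bwf + alpha_true * t_sw + beta_true * t_gw) / total
c_mix = (c_bwf + alpha_true * c_sw + beta_true * c_gw) / total

alpha, beta = solve_inflow_infiltration(t_mix, c_mix, t_bwf, c_bwf, t_sw, c_sw, t_gw, c_gw)
```

Multiplying α and β by a BWF flow rate would give absolute flows, which is where the BWF flow assumption noted in the critique enters.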

Supporting evidence:

  • BWF reconstruction KGE (10 random 70/30 splits): T_a1 testing avg 0.9332 (SD 0.0398); T_a2 testing avg 0.9630 (SD 0.0185); C_a2 testing avg 0.8842 (SD 0.0538). All are well above the −0.41 "better than mean" threshold.
  • Case 1 (intermittent groundwater): inflow KGE 0.8873, infiltration KGE 0.8622, total I/I KGE 0.9730.
  • Case 2 (permanent groundwater infiltration): inflow KGE 0.8357, infiltration KGE 0.7628, total I/I KGE 0.8894. Accuracy degrades but remains above −0.41.
  • Simulation network: 1 pumping station, 0.549 km rising main, 2.098 km gravity pipe, 11 inflow locations, 10 infiltration points, 5-minute sampling, 1% white noise added per HACH sensor specs.
  • Three months of simulation data per case with hourly sensor inputs from the Australian Bureau of Meteorology.
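For readers who want to reproduce the metric, here is a minimal sketch of KGE in its original three-component form (correlation, variability ratio, bias ratio). Whether the paper uses this variant or a modified one is not stated in these notes, so treat the choice as an assumption. Under this form, a constant prediction equal to the observed mean scores 1 − √2 ≈ −0.41 (taking the correlation as zero), which is the origin of the threshold quoted above.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency: 1 is a perfect match, lower is worse."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]        # linear correlation
    variability = sim.std() / obs.std()    # ratio of standard deviations
    bias = sim.mean() / obs.mean()         # ratio of means
    return float(1.0 - np.sqrt((r - 1.0) ** 2
                               + (variability - 1.0) ** 2
                               + (bias - 1.0) ** 2))
```

The function is undefined when either series is constant or when the observed mean is zero; a production version would need to guard those cases.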

Figures & tables:

  • Fig. 2 (frequency-amplitude bar charts for model structure selection): axes labeled, no error bars needed; clearly identifies dominant periodicities used as inputs to Prophet.
  • Fig. 3 (four-panel BWF reconstruction for T_a1): well-labeled, shows raw → filtered → reconstructed → deviation sequence; no pointwise confidence intervals on the reconstructed profile, so the reader cannot assess reconstruction uncertainty at individual time steps.
  • Fig. 4 (Case 1 I/I quantification, 6 panels): α(t), β(t) time series plus inflow/infiltration/total comparisons with residual bars; axes labeled; no confidence intervals; residual bars are unsigned-magnitude only, making it hard to distinguish systematic from random error.
  • Table 2: reconstruction KGE with mean and SD across 10 splits; appropriate uncertainty reporting.
  • Table 3: I/I quantification KGE for only two cases; no uncertainty measure; N=2 is too small to characterise method performance.

Follow-up references:

  • Zhang et al. (2018a): nearest prior method (conductivity-only unit hydrograph); key quantitative comparison target missing from this paper.
  • Guo et al. (2022): experimental basis for choosing temperature and conductivity; confirms feasibility claims.
  • Figueroa et al. (2023): in-sewer thermal-hydraulic framework used as the simulation engine here; understanding it is needed to assess simulation validity.
  • Perez et al. (2024): sanitary sewer unit hydrograph model; the most directly competing recent method.

Third pass — critique

Implicit assumptions:

  • The simulation model that generates the synthetic I/I data uses the same physical equations (advection-diffusion, heat transfer) that the method implicitly relies on through the mass/energy balance. This makes the validation circular: the physics encoded in the simulation will naturally support physics-based recovery.
  • BWF flow rate is assumed known in order to convert α(t) and β(t) to absolute flows; this is quietly deferred to practitioners and carries its own measurement uncertainty, but error propagation from BWF flow uncertainty to final I/I estimates is not quantified.
  • Multi-period Prophet stationarity over the calibration window: the method silently assumes no structural changes in water use, population, or infrastructure during the few-month fitting period.
  • The ill-conditioning condition (ΔT₂/ΔC₂ ≈ ΔT₃/ΔC₃) is identified qualitatively, but no quantitative criterion or reliability metric is provided to flag unreliable α/β estimates in practice.
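One way the missing reliability metric could look is sketched below. It treats (ΔT₂, ΔC₂) and (ΔT₃, ΔC₃) as two vectors, rescales each axis to comparable units, and returns the absolute sine of the angle between them: 0 when the ratios coincide (the ill-conditioned case) and 1 when the end-members are maximally distinct. The index, the scaling arguments, and any cutoff applied to it are hypothetical and do not come from the paper.

```python
import numpy as np

def end_member_separation(dt2, dc2, dt3, dc3, t_scale=1.0, c_scale=1.0):
    """|sin(angle)| between the two scaled end-member deviation vectors.

    t_scale and c_scale are user-chosen normalisers (for example typical
    sensor ranges) so that temperature and conductivity are comparable.
    Undefined if either deviation vector is exactly zero.
    """
    v2 = np.array([dt2 / t_scale, dc2 / c_scale], dtype=float)
    v3 = np.array([dt3 / t_scale, dc3 / c_scale], dtype=float)
    cross = v2[0] * v3[1] - v2[1] * v3[0]
    return float(abs(cross) / (np.linalg.norm(v2) * np.linalg.norm(v3)))

# Equal ratios dT/dC: the two sources cannot be told apart.
collinear = end_member_separation(-2.0, -400.0, -1.0, -200.0)
# One source differs only in temperature, the other only in conductivity.
orthogonal = end_member_separation(1.0, 0.0, 0.0, 1.0)
```

Time steps whose index falls below a chosen cutoff could be flagged rather than reported; the cutoff itself would need calibration against data.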

Missing context or citations:

  • No quantitative comparison against any existing method (flow-meter-based, stable isotope, conductivity-only unit hydrograph) on the same network or dataset; the claim of superior cost-effectiveness is argued narratively, not demonstrated.
  • No testing outside coastal southeastern Queensland; generalizability to cold climates (where groundwater and surface-water temperatures may converge seasonally), arid regions, or different pipe materials is unaddressed.
  • The Prophet model is adopted without comparison to simpler alternatives (SARIMA, harmonic regression, seasonal decomposition); no ablation justifies it as the best choice.
  • Sensor fusion literature (e.g., Kalman-filter-based state estimation for sewer systems) is not engaged.

Possible experimental / analytical issues:

  • The I/I quantification is validated entirely on synthetic data generated by the simulation model; no real-world I/I ground truth exists, so the validation cannot rule out that errors in the simulation model cancel errors in the estimation algorithm.
  • Three months per case is insufficient to test seasonal variation in groundwater and surface-water temperature/conductivity end-members; the method's performance in winter vs. summer is unknown.
  • Rainfall screening thresholds (η = 1, 3, 8 mm; w = 24, 48, 72 h) are presented as given choices with no sensitivity analysis, even though calibration sample quality is highly sensitive to these values.
  • Table 3 has N = 2 scenarios with no replication or uncertainty quantification; no statistical test is possible.
  • Data not shared (authors state no permission) and the simulation code is in MATLAB without a repository link, so the paper is not reproducible as published.
  • 1% white noise is an optimistic sensor error model; sensor drift, fouling-induced bias, and communication dropouts (acknowledged to occur in T_a2 as multi-hour constant readings) are not systematically included in the simulation noise model.
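The missing sensitivity analysis on η and w is cheap to sketch. The example below assumes a sample counts as dry when rainfall accumulated over the trailing w hours (current hour included) stays below η mm; the paper's exact screening rule is not reproduced in these notes, so this rule, the function name, and the synthetic rainfall series are all assumptions.

```python
import itertools
import numpy as np

def dry_weather_mask(rain_mm, eta_mm, window_h):
    """True where trailing-window rainfall is below eta_mm (hourly series)."""
    rain = np.asarray(rain_mm, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(rain)))
    end = np.arange(1, rain.size + 1)
    start = np.maximum(end - window_h, 0)   # window is truncated at the series start
    trailing = csum[end] - csum[start]
    return trailing < eta_mm

# Synthetic hourly rainfall: a single 5 mm burst at hour 10.
rain = np.zeros(100)
rain[10] = 5.0
mask = dry_weather_mask(rain, eta_mm=1.0, window_h=24)

# Fraction of samples retained for calibration under each threshold pair.
retained = {
    (eta, w): float(dry_weather_mask(rain, eta, w).mean())
    for eta, w in itertools.product((1.0, 3.0, 8.0), (24, 48, 72))
}
```

Running the same loop on the real rainfall record and refitting the baseline for each pair would show how strongly reconstruction KGE depends on the screening choice.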

Ideas for future work:

  1. Real-world I/I ground truth validation: instrument a catchment simultaneously with the proposed T/C sensors and calibrated flow meters during a controlled period; compare method output against flow-derived I/I to break the simulation circularity.
  2. Ill-conditioning detection and propagation: derive a closed-form sensitivity index from the denominator of Eq. (7) to flag and exclude time steps where end-member similarity renders α/β unreliable; quantify uncertainty propagation from BWF flow rate error.
  3. Automated calibration sample selection: replace the manual η/w threshold approach with an unsupervised anomaly detection step (e.g., isolation forest or CUSUM on the T/C residuals) to reduce dependence on external rainfall and groundwater-level data.
  4. Spatial I/I localization experiment: implement the upstream-downstream differential approach outlined in Section 5.2 on a real multi-node network to test whether the single-point method's residuals translate accurately to localized pipe-segment attribution.

Methods

  • Prophet time series model
  • discrete Fourier transform
  • Bayesian information criterion
  • L-BFGS optimization
  • mass/energy balance equations
  • instantaneous unit hydrograph model
  • Saint-Venant equations
  • advection-diffusion equation
  • Kling-Gupta Efficiency (KGE)
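
As an illustration of how the discrete Fourier transform in this list can pick out dominant periodicities (the kind of frequency-amplitude view described for Fig. 2), here is a minimal NumPy sketch on a synthetic hourly signal. The signal, its amplitudes, and the function name are made up for the example; how the paper combines this with the Bayesian information criterion is not reproduced here.

```python
import numpy as np

def dominant_periods(values, dt_hours, n_peaks=2):
    """Return [(period_hours, amplitude), ...] for the strongest spectral peaks."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                                 # drop the zero-frequency term
    amp = np.abs(np.fft.rfft(x)) * 2.0 / x.size      # single-sided amplitude
    freq = np.fft.rfftfreq(x.size, d=dt_hours)       # cycles per hour
    order = np.argsort(amp[1:])[::-1][:n_peaks] + 1  # skip bin 0, largest first
    return [(float(1.0 / freq[k]), float(amp[k])) for k in order]

# Four weeks of hourly data with a daily and a weaker weekly cycle.
t = np.arange(672.0)
signal = (20.0
          + 1.0 * np.sin(2 * np.pi * t / 24.0)
          + 0.5 * np.sin(2 * np.pi * t / 168.0))
peaks = dominant_periods(signal, dt_hours=1.0)
```

The record length here is an exact multiple of both periods, so each cycle falls on a single frequency bin; with real data, spectral leakage would spread each peak over neighbouring bins.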

Datasets

  • real-life sewer temperature and conductivity data from two Australian catchments
  • simulated sewer network data based on a real coastal Australian city sewer

Claims

  • A Prophet-model-based algorithm can reliably reconstruct base wastewater flow (BWF) temperature and conductivity profiles with KGE values ranging from 0.8842 to 0.9939.
  • Inflow and infiltration can be separately quantified by combining reconstructed BWF profiles with mass/energy balance equations using only temperature and conductivity measurements.
  • The proposed method achieves I/I quantification KGE values between 0.7628 and 0.9730, demonstrating robust performance across different groundwater scenarios.
  • Temperature and conductivity sensors provide a more cost-effective and reliable alternative to flow meters for sewer I/I monitoring.
  • The method is applicable to both combined and sanitary sewers and can account for I/I driven by factors beyond rainfall, including seawater tides.