Performance improvement of hydrological models using Unscented Kalman Filter-type data assimilation and data fusion

Parisa Almasi, Alireza Moghaddam Nia, Shahram Khalighi Sigaroodi, Ali Salajeghe, Saeed Soltani Koopaei, Dawei Han, Ebrahim Ahmadisharaf, Amirhossein Nazari · Applied Water Science · 2026

Authors: Parisa Almasi, Alireza Moghaddam Nia, Shahram Khalighi Sigaroodi, Ali Salajeghe, Saeed Soltani Koopaei, Dawei Han, Ebrahim Ahmadisharaf, Amirhossein Nazari
Year: 2026
Tags: unscented-kalman-filter, data-assimilation, data-fusion, hydrological-modeling, semi-arid, streamflow-simulation

TL;DR

Applies the Unscented Kalman Filter (UKF) for state-only updating and a weighted-average data fusion scheme to three hydrological models (HBV, SWAT, WetSpa) differing in spatial complexity, on a single 216 km² semi-arid Iranian watershed. UKF yields up to a 29.8% NSE gain in validation; data fusion yields up to a 10.2% NSE gain, with UKF consistently outperforming fusion.

First pass — the five C's

Category. Research prototype / applied comparison study.

Context. Hydrological data assimilation; builds on Kalman filtering lineage (Kalman 1960 → EKF Jazwinski 1970 → EnKF Evensen 1994 → UKF Julier et al. 1995/1997); draws on EnKF hydrological applications (Moradkhani et al. 2005, Clark et al. 2008) and prior limited UKF hydrology work (Jiang et al. 2018, Sun et al. 2020); references weighted data fusion (Abrahart & See 2002, Zhuo & Han 2017).

Correctness. Load-bearing assumptions: (1) UKF state-only updating (no parameter updating) is sufficient to characterize predictive uncertainty; (2) NSE-proportional weights are an adequate fusion rule; (3) process noise Q and measurement noise R are constant throughout simulation — all assumed, none validated. These are non-trivial and the paper does not test sensitivity to them.
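
Assumptions (1) and (3) can be made concrete with a minimal sketch of a state-only UKF cycle for a scalar storage state with fixed Q and R. This is not the paper's implementation: the storage model, the discharge relation, the constants `K_REC` and `EXPONENT`, the noise values, the forcing/observation pairs, and the choice kappa = 2 are all hypothetical and chosen only for illustration.

```python
import math


def sigma_points_1d(x, P, kappa=2.0):
    """Three symmetric sigma points and their weights for a scalar state."""
    spread = math.sqrt((1.0 + kappa) * P)
    points = [x, x + spread, x - spread]
    w_side = 1.0 / (2.0 * (1.0 + kappa))
    weights = [kappa / (1.0 + kappa), w_side, w_side]
    return points, weights


def ukf_step(x, P, u, y_obs, f, h, Q, R, kappa=2.0):
    """One predict/update cycle. Q and R are fixed scalars (constant noise).

    Only the state x is updated; model parameters inside f and h stay fixed.
    """
    # Predict: propagate sigma points through the state model, add process noise.
    pts, w = sigma_points_1d(x, P, kappa)
    fx = [f(s, u) for s in pts]
    x_pred = sum(wi * v for wi, v in zip(w, fx))
    P_pred = sum(wi * (v - x_pred) ** 2 for wi, v in zip(w, fx)) + Q

    # Update: redraw sigma points around the prediction, map to observation space.
    pts, w = sigma_points_1d(x_pred, P_pred, kappa)
    hy = [h(s) for s in pts]
    y_pred = sum(wi * v for wi, v in zip(w, hy))
    P_yy = sum(wi * (v - y_pred) ** 2 for wi, v in zip(w, hy)) + R
    P_xy = sum(wi * (s - x_pred) * (v - y_pred) for wi, s, v in zip(w, pts, hy))

    gain = P_xy / P_yy
    x_new = x_pred + gain * (y_obs - y_pred)
    P_new = P_pred - gain * P_yy * gain
    return x_new, P_new


# Hypothetical nonlinear reservoir: storage S drains as K_REC * S**EXPONENT.
K_REC, EXPONENT = 0.05, 1.5


def storage_model(s, rain):
    s = max(s, 0.0)
    return s + rain - K_REC * s ** EXPONENT


def discharge(s):
    return K_REC * max(s, 0.0) ** EXPONENT


state, var = 20.0, 25.0
for rain, q_obs in [(5.0, 4.8), (0.0, 4.1), (12.0, 6.0)]:  # made-up forcing/observations
    state, var = ukf_step(state, var, rain, q_obs, storage_model, discharge, Q=1.0, R=0.04)
```

Q enters `P_pred` and R enters `P_yy` directly, so changing either value changes the gain and every updated state. That is why unreported, untested noise values make the NSE gains hard to reproduce.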

Contributions.

  • Applies UKF to three spatially distinct model structures (lumped, semi-distributed, distributed) within a single semi-arid catchment and compares improvement magnitudes across structures.
  • Documents that UKF gain is inversely related to model spatial complexity (largest for lumped HBV, smallest for distributed WetSpa).
  • Compares UKF alone against UKF followed by fusion, reporting maximum NSE gains of 29.8% (UKF) and 10.2% (fusion).
  • Provides one of the few UKF applications to arid/semi-arid Iranian hydrology.

Clarity. Writing is functional but repetitive: the Results and Discussion section (Section 3) appears verbatim twice in the manuscript, apparently an editing artifact in the pre-publication version. The prose is straightforward but lacks quantitative depth beyond NSE.

Second pass — content

Main thrust: UKF-based state assimilation applied sequentially to HBV, SWAT, and WetSpa substantially reduces prediction error in a data-scarce semi-arid watershed; NSE-weighted averaging of the post-UKF outputs provides additional but smaller gains, and UKF dominates across all calibration and validation comparisons.

Supporting evidence:

  • Baseline NSE in calibration ranges from 0.57 to 0.66 across the three models; baseline NSE in validation reaches up to 0.69. Without assimilation, the three model structures do not differ significantly in NSE.
  • UKF validation-phase NSE improvement: 18.8%–29.8%; highest for HBV (lumped), lowest for WetSpa (distributed).
  • Data fusion (weighted average of UKF outputs) NSE improvement: 2%–15% in calibration; a maximum of 10.2% is reported for validation.
  • WetSpa achieves slightly higher baseline NSE than HBV or SWAT, attributed to its distributed spatial structure.
  • All models fail to reproduce peak flows accurately; this is attributed to the daily timestep in a 216 km² basin whose concentration time is under 24 hours, and to sparse low-altitude rain gauges.
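
The percentages above are easier to interpret with the metric written out. The sketch below computes NSE and a percentage gain. It assumes the reported gains are relative to the baseline NSE, which this summary does not confirm. The function names and numeric inputs are illustrative and not taken from Table 2.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus (squared error / variance of observations)."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_spread


def relative_gain_pct(nse_before, nse_after):
    """Percentage change in NSE relative to the baseline value (assumed convention)."""
    return 100.0 * (nse_after - nse_before) / nse_before


# Illustrative values only: a baseline of 0.60 rising to 0.75 is a gain of roughly 25%.
gain = relative_gain_pct(0.60, 0.75)
```

Under this convention the same absolute NSE increase yields a larger percentage for a model with a lower baseline, which is worth keeping in mind when comparing gains across HBV, SWAT, and WetSpa.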

Figures & tables: Figure 1 (watershed map with gauge locations) and Figure 2 (observed vs. simulated hydrographs including UKF results for calibration) are referenced. Table 1 lists calibration parameters and optimization methods per model. Table 2 contains NSE values before/after UKF and data fusion for calibration and validation. Axis labeling, error bars, confidence intervals, and statistical significance are not described in the text; no uncertainty bands on hydrographs are mentioned. The provided text of the pre-publication manuscript does not include the actual figure/table content, so visualization quality cannot be confirmed.

Follow-up references:

  • Sun et al. (2020): direct predecessor UKF hydrology application with stability modifications; most relevant methodological context.
  • Jiang et al. (2018, Hydrology Research): UKF state estimation in conceptual hydrological models; closest methodological parallel.
  • Moradkhani et al. (2005): dual state–parameter EnKF estimation; benchmark for what this study deliberately omits (parameter updating).
  • Jahanshahi et al. (2025): data fusion combining ML and physics-based models in a semi-arid basin; direct comparison point for fusion performance.

Third pass — critique

Implicit assumptions:

  • Q (process noise covariance) and R (measurement noise covariance) are held constant throughout. UKF performance is highly sensitive to these choices, yet their values are neither reported nor justified; different values would directly alter every reported NSE gain.
  • UKF is applied in state-estimation-only mode. Parameter uncertainty, which is likely dominant in a data-scarce semi-arid basin, is entirely excluded, limiting the generalizability of the improvement claims.
  • NSE-proportional weighting assumes model errors are uncorrelated. In a single small watershed with shared meteorological forcing, model outputs are correlated, violating this premise.
  • The "satisfactory" threshold of NSE > 0.50 used as the fusion inclusion criterion is asserted, not justified; the sensitivity of fusion results to this threshold is untested.
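
The fusion rule questioned above can be sketched as follows. The summary describes it only as NSE-proportional weighted averaging with an NSE > 0.50 inclusion criterion, so the normalization (weights summing to one over the retained models), the function names, and the example scores are assumptions made for illustration.

```python
def nse_weights(nse_scores, threshold=0.50):
    """Weights proportional to NSE, restricted to models above the threshold."""
    kept = {name: score for name, score in nse_scores.items() if score > threshold}
    if not kept:
        raise ValueError("no model exceeds the inclusion threshold")
    total = sum(kept.values())
    return {name: score / total for name, score in kept.items()}


def fuse(simulations, weights):
    """Weighted average of the retained models' series, one value per timestep."""
    series = [(weights[name], simulations[name]) for name in weights]
    n_steps = len(series[0][1])
    return [sum(w * values[t] for w, values in series) for t in range(n_steps)]


# Hypothetical scores: the third model falls below 0.50 and is dropped.
weights = nse_weights({"HBV": 0.6, "SWAT": 0.6, "WetSpa": 0.4})
fused = fuse(
    {"HBV": [1.0, 2.0], "SWAT": [3.0, 4.0], "WetSpa": [100.0, 100.0]},
    weights,
)
```

Nothing in this rule accounts for correlation between model errors. Weights that minimize the variance of the fused error depend on the full error covariance between models, which is the basis of the third bullet above.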

Missing context or citations:

  • No comparison to EnKF, the dominant ensemble filter in hydrology and a natural benchmark; the paper justifies UKF theoretically but never tests it against EnKF.
  • Particle filter methods are mentioned in passing (via Chatzi & Smyth) but not considered as alternatives, despite their relevance to non-Gaussian semi-arid hydrology.
  • No engagement with remote-sensing-assisted assimilation (e.g., GRACE, SMAP soil moisture), which is acknowledged as future work but is relevant as a comparison baseline in data-scarce settings.
  • HBV is calibrated manually, while SWAT uses SUFI-2 and WetSpa uses PEST; this inconsistency in calibration rigor is not discussed as a potential confound in the model-structure comparison.

Possible experimental / analytical issues:

  • The primary comparison, "UKF outperforms data fusion (29.8% vs. 10.2%)", is structurally biased: data fusion is applied to UKF-filtered outputs, not to raw model outputs. Fusion can therefore only add to an already-improved baseline, which biases the comparison against it; a fair comparison would also apply fusion to the raw outputs.
  • NSE is the sole reported metric; PBIAS, KGE, RMSE, and flow-duration curve statistics are absent, obscuring whether the improvement comes from high flows (acknowledged as systematically underestimated) or low flows.
  • No uncertainty quantification (credible intervals, ensemble spread) is reported for either UKF or fusion outputs, despite uncertainty reduction being the stated goal.
  • Single catchment (216 km²), single climate (semi-arid Iran), 24-year record; there is no cross-validation across basins or climates, so generalization claims are unsupported.
  • No statistical significance testing (e.g., Wilcoxon or Diebold-Mariano test) is applied to NSE differences.
  • The Results and Discussion section appears duplicated verbatim in the manuscript, a pre-publication artifact that raises editing concerns.
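
The complementary metrics named above are cheap to compute alongside NSE. The following is a sketch using standard definitions, not anything reported in the paper: RMSE, PBIAS with the sign convention in which positive values indicate underestimation, and the 2009 form of the Kling-Gupta efficiency. The function names are illustrative.

```python
import math
from statistics import fmean, pstdev


def rmse(obs, sim):
    """Root-mean-square error, in the units of the flow series."""
    return math.sqrt(fmean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def pbias(obs, sim):
    """Percent bias; positive means the simulation underestimates on average."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mu_o, mu_s = fmean(obs), fmean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)])
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

A simulation that tracks timing well but halves every flow keeps a correlation of 1 while PBIAS rises to 50% and KGE drops sharply. This is the kind of systematic peak underestimation that a single NSE value does not separate out.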

Ideas for future work:

  • Repeat the experiment with data fusion applied to raw (pre-UKF) model outputs to produce an unconfounded UKF-vs.-fusion comparison.
  • Include joint state–parameter UKF or dual EnKF to quantify how much of the remaining error is attributable to parameter uncertainty versus state uncertainty, particularly given the manually calibrated HBV baseline.
  • Apply UKF at sub-daily timesteps (if data become available) or on a larger basin with a longer concentration time to test whether the peak-flow underestimation is timestep-driven or filter-driven.
  • Test adaptive noise covariance estimation (e.g., innovation-based Q/R adaptation) to assess the sensitivity of the 18.8%–29.8% NSE gains to the constant-noise assumption.

Methods

  • Unscented Kalman Filter (UKF)
  • weighted-average data fusion
  • HBV lumped rainfall-runoff model
  • SWAT semi-distributed model
  • WetSpa distributed model
  • PEST parameter estimation
  • SUFI-2 uncertainty fitting
  • Nash-Sutcliffe efficiency (NSE) evaluation
  • sigma-point generation via symmetric point algorithm
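
For the last item, one common symmetric sigma-point set can be sketched as below: 2n + 1 points with a scaling parameter kappa. The paper's exact variant and scaling are not given in this summary, so the construction, the kappa value, and the function names are assumptions.

```python
import math


def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - partial)
            else:
                L[i][j] = (A[i][j] - partial) / L[j][j]
    return L


def symmetric_sigma_points(mean, cov, kappa=0.0):
    """Return 2n + 1 sigma points and weights matching the given mean and covariance.

    Requires n + kappa > 0. Points are mean +/- columns of chol((n + kappa) * cov).
    """
    n = len(mean)
    L = cholesky([[(n + kappa) * c for c in row] for row in cov])
    points = [list(mean)]
    for j in range(n):
        column = [L[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, column)])
        points.append([m - c for m, c in zip(mean, column)])
    w_side = 1.0 / (2.0 * (n + kappa))
    weights = [kappa / (n + kappa)] + [w_side] * (2 * n)
    return points, weights


# Hypothetical two-state example (e.g., soil moisture and groundwater storage).
points, weights = symmetric_sigma_points([1.0, 2.0], [[4.0, 1.0], [1.0, 3.0]], kappa=1.0)
```

The weighted mean and covariance of the returned points reproduce the inputs exactly. This is what lets the UKF propagate second-order statistics through a nonlinear model without computing Jacobians.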

Datasets

  • Menderjan watershed daily streamflow (1993-2017)
  • Iran Meteorological Organization weather station data
  • Iran Water Resources Management Company stream gauge data

Claims

  • UKF-based data assimilation improved Nash-Sutcliffe efficiency by up to 29.8% for the lumped HBV model during validation.
  • UKF consistently outperformed weighted-average data fusion across calibration and validation stages for all three model structures.
  • The greatest relative improvement from UKF was observed in the lumped model (HBV) and the smallest in the distributed model (WetSpa).
  • Data fusion using weighted averages provided modest NSE gains of up to 10.2%, less than those achieved by UKF.
  • Model spatial structure (lumped vs. semi-distributed vs. distributed) had little effect on overall NSE but strongly influenced simulation of specific runoff components such as peak flows and baseflows.