Performance improvement of hydrological models using Unscented Kalman Filter-type data assimilation and data fusion

Parisa Almasi, Alireza Moghaddam Nia, Shahram Khalighi Sigaroodi, Ali Salajeghe, Saeed Soltani Koopaei, Dawei Han, Ebrahim Ahmadisharaf, Amirhossein Nazari · Applied Water Science · 2026

hydrological-modeling · data-assimilation · kalman-filter · data-fusion · semi-arid-hydrology · runoff-simulation


Authors: Parisa Almasi, Alireza Moghaddam Nia, Shahram Khalighi Sigaroodi, Ali Salajeghe, Saeed Soltani Koopaei, Dawei Han, Ebrahim Ahmadisharaf, Amirhossein Nazari
Year: 2026
Tags: unscented-kalman-filter, data-assimilation, data-fusion, hydrological-modeling, semi-arid, streamflow-simulation

TL;DR

Applies the Unscented Kalman Filter (UKF) for state-only updating and a weighted-average data fusion scheme to three hydrological models (HBV, SWAT, WetSpa) differing in spatial complexity, on a single 216 km² semi-arid Iranian watershed. UKF yields up to a 29.8% NSE gain in validation; data fusion yields up to a 10.2% NSE gain, with UKF consistently outperforming fusion.

First pass — the five C's

Category. Research prototype / applied comparison study.

Context. Hydrological data assimilation; builds on Kalman filtering lineage (Kalman 1960 → EKF Jazwinski 1970 → EnKF Evensen 1994 → UKF Julier et al. 1995/1997); draws on EnKF hydrological applications (Moradkhani et al. 2005, Clark et al. 2008) and prior limited UKF hydrology work (Jiang et al. 2018, Sun et al. 2020); references weighted data fusion (Abrahart & See 2002, Zhuo & Han 2017).

Correctness. Load-bearing assumptions: (1) UKF state-only updating (no parameter updating) is sufficient to characterize predictive uncertainty; (2) NSE-proportional weights are an adequate fusion rule; (3) process noise Q and measurement noise R are constant throughout simulation — all assumed, none validated. These are non-trivial and the paper does not test sensitivity to them.

Contributions.
- Applies UKF to three spatially distinct model structures (lumped/semi-distributed/distributed) within a single semi-arid catchment, comparing improvement magnitudes across structures.
- Documents that UKF gain is inversely related to model spatial complexity (largest for lumped HBV, smallest for distributed WetSpa).
- Compares UKF-then-fusion sequencing against UKF alone, reporting 29.8% vs. 10.2% maximum NSE gain.
- Provides one of the few UKF applications to arid/semi-arid Iranian hydrology.

Clarity. Writing is functional but repetitive: the Results and Discussion section (Section 3) appears verbatim twice in the manuscript, apparently an editing artifact in the pre-publication version. The prose is straightforward but offers little quantitative depth beyond NSE.

Second pass — content

Main thrust: UKF-based state assimilation applied sequentially to HBV, SWAT, and WetSpa substantially reduces prediction error in a data-scarce semi-arid watershed; NSE-weighted averaging of the post-UKF outputs provides additional but smaller gains, and UKF dominates across all calibration and validation comparisons.

Supporting evidence:
- Baseline NSE (calibration) across all three models: 0.57–0.66; baseline NSE (validation) up to 0.69. No significant structural differences without assimilation.
- UKF validation-phase NSE improvement: 18.8%–29.8%; highest for HBV (lumped), lowest for WetSpa (distributed).
- Data fusion (weighted average of UKF outputs) NSE improvement: 2%–15% in calibration, maximum 10.2% noted for validation.
- WetSpa achieves slightly higher baseline NSE than HBV or SWAT, attributed to distributed spatial structure.
- All models failed to reproduce peak flows accurately; attributed to daily timestep in a 216 km² basin where concentration time < 24 hours, and sparse low-altitude rain gauges.
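For reference, the NSE figures above can be recomputed from paired observed and simulated series in a few lines. The sketch below is illustrative only: the function names are hypothetical, and reading the reported percentage gains as relative changes in NSE, (after - before) / before, is an assumption, since these notes do not state how the paper computes them.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of observations."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def relative_gain_pct(nse_before, nse_after):
    """Relative NSE change in percent (ASSUMED definition of the reported gains)."""
    return 100.0 * (nse_after - nse_before) / nse_before
```

Under this reading, the same absolute NSE increase produces a larger percentage gain for a model with a lower baseline, which is worth keeping in mind when comparing gains across HBV, SWAT, and WetSpa.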

Figures & tables: Figure 1 (watershed map with gauge locations) and Figure 2 (observed vs. simulated hydrographs including UKF results for calibration) are referenced. Table 1 lists calibration parameters and optimization methods per model. Table 2 contains NSE values before/after UKF and data fusion for calibration and validation. Axes labeling, error bars, confidence intervals, and statistical significance are not described in the text; no uncertainty bands on hydrographs are mentioned. The pre-publication manuscript does not include the actual figure/table content in the provided text, so visualization quality cannot be confirmed.

Follow-up references:
- Sun et al. (2020): direct predecessor UKF hydrology application with stability modifications; most relevant methodological context.
- Jiang et al. (2018, Hydrology Research): UKF state estimation in conceptual hydrological models; closest methodological parallel.
- Moradkhani et al. (2005): dual state–parameter EnKF estimation; benchmark for what this study deliberately omits (parameter updating).
- Jahanshahi et al. (2025): data fusion combining ML and physics-based models in a semi-arid basin; direct comparison point for fusion performance.

Third pass — critique

Implicit assumptions:
- Q (process noise covariance) and R (measurement noise covariance) are held constant throughout; UKF performance is highly sensitive to these choices, and their values are not reported or justified. Violation would directly alter all reported NSE gains.
- UKF is applied in state-estimation-only mode; parameter uncertainty, which is likely dominant in a data-scarce semi-arid basin, is entirely excluded, limiting the generalizability of the improvement claims.
- NSE-proportional weighting assumes model errors are uncorrelated; in a single small watershed with shared meteorological forcing, model outputs are correlated, violating this premise.
- The "satisfactory" threshold of NSE > 0.50 used as the fusion inclusion criterion is asserted without justification; the sensitivity of fusion results to this threshold is untested.
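The weighting rule and the independence concern in the list above can be made concrete with a short sketch. It assumes that weights are each model's calibration NSE divided by the sum over the models that pass the 0.50 threshold; the notes describe the rule only as "NSE-proportional", so the exact normalization, the function names, and the dictionary-based interface are all hypothetical.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def nse_weighted_fusion(obs_cal, sims_cal, sims_new, threshold=0.50):
    """Fuse model outputs with weights proportional to calibration NSE.

    sims_cal / sims_new map a model name to its simulated series.
    Models with calibration NSE <= threshold are dropped (assumed rule).
    """
    scores = {name: nse(obs_cal, sim) for name, sim in sims_cal.items()}
    kept = {name: s for name, s in scores.items() if s > threshold}
    if not kept:
        raise ValueError("no model passes the NSE threshold")
    total = sum(kept.values())
    weights = {name: s / total for name, s in kept.items()}
    fused = sum(w * np.asarray(sims_new[name], dtype=float)
                for name, w in weights.items())
    return fused, weights


def residual_correlation(obs, sims):
    """Correlation matrix of model residuals, in the insertion order of sims."""
    obs = np.asarray(obs, dtype=float)
    residuals = np.vstack([obs - np.asarray(s, dtype=float) for s in sims.values()])
    return np.corrcoef(residuals)
```

Off-diagonal entries of `residual_correlation` close to 1 would indicate that the models err together, in which case averaging them removes little error regardless of how the weights are chosen.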

Missing context or citations:
- No comparison to EnKF, which is the dominant ensemble filter in hydrology and a natural benchmark; the paper justifies UKF theoretically but never tests it against EnKF.
- Particle filter methods are mentioned in passing (via Chatzi & Smyth) but not considered as alternatives despite relevance to non-Gaussian semi-arid hydrology.
- No engagement with remote-sensing-assisted assimilation (e.g., GRACE, SMAP soil moisture); acknowledged as future work but relevant as a comparison baseline in data-scarce settings.
- HBV is calibrated manually while SWAT uses SUFI-2 and WetSpa uses PEST; this inconsistency in calibration rigor is not discussed as a potential confound in the model-structure comparison.

Possible experimental / analytical issues:
- The primary comparison, "UKF outperforms data fusion (29.8% vs. 10.2%)", is structurally biased: data fusion is applied to UKF-filtered outputs, not to raw model outputs, so its gain is measured against an already-improved baseline and is likely to appear smaller. A fair comparison would apply fusion to the raw outputs separately.
- NSE is the sole reported metric; PBIAS, KGE, RMSE, or flow-duration curve statistics are absent, obscuring whether high flows (acknowledged as systematically underestimated) or low flows improve.
- No uncertainty quantification (credible intervals, ensemble spread) is reported for either UKF or fusion outputs, despite uncertainty reduction being the stated goal.
- Single catchment (216 km²), single climate (semi-arid Iran), 24-year record; with no cross-validation across basins or climates, generalization claims are unsupported.
- No statistical significance testing (e.g., Wilcoxon signed-rank or Diebold-Mariano test) is applied to NSE differences.
- The Results and Discussion section appears duplicated verbatim in the manuscript, a pre-publication artifact that raises editing concerns.
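The metrics named above as missing are cheap to compute alongside NSE. The sketch below uses common definitions: KGE in its 2009 form (correlation, variability ratio, bias ratio) and PBIAS with the sign convention in which positive values mean underestimation. Which variants the authors would choose is unknown, so both the definitions and the function names should be read as assumptions.

```python
import numpy as np


def rmse(obs, sim):
    """Root-mean-square error, in the units of the flow series."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def pbias(obs, sim):
    """Percent bias; positive values indicate underestimation (assumed convention)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(100.0 * np.sum(obs - sim) / np.sum(obs))


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 form)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))
```

A simulation that scales every observed flow by 0.9 has perfect correlation but a PBIAS of about 10% and a KGE of about 0.86, the kind of systematic underestimation that a single NSE value can hide.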

Ideas for future work:
- Repeat the experiment with data fusion applied to raw (pre-UKF) model outputs to produce an unconfounded UKF-vs.-fusion comparison.
- Include joint state–parameter UKF or dual EnKF to quantify how much of the remaining error is attributable to parameter uncertainty vs. state uncertainty, particularly given the manually calibrated HBV baseline.
- Apply UKF at sub-daily timesteps (if data become available) or on a larger basin with longer concentration time to test whether the peak-flow underestimation is timestep-driven or filter-driven.
- Test adaptive noise covariance estimation (e.g., innovation-based Q/R adaptation) to assess sensitivity of the 18.8%–29.8% NSE gains to the constant-noise assumption.
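As a rough illustration of the last idea, a standard innovation-based estimator recovers the measurement noise variance R by subtracting the filter's predicted output variance from the sample variance of recent innovations. The scalar sketch below is not taken from the paper; the window handling, the `floor` safeguard, and all names are hypothetical.

```python
import numpy as np


def estimate_R(innovations, pyy_model, floor=1e-6):
    """Innovation-based estimate of a scalar measurement noise variance.

    innovations : recent values of (observation - predicted observation)
    pyy_model   : predicted output variance from the filter, excluding R,
                  for the same time steps (a scalar or an array)
    floor       : lower bound that keeps the estimate positive
    """
    nu = np.asarray(innovations, dtype=float)
    c_nu = np.mean(nu ** 2)                    # sample innovation variance
    r_hat = c_nu - float(np.mean(pyy_model))   # remove the model-driven part
    return max(r_hat, floor)
```

Rerunning the filter with `estimate_R` updated over a sliding window, and comparing the resulting NSE gains against the constant-R runs, would give a first indication of how much the reported improvements depend on the fixed noise settings.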

Methods

Unscented Kalman Filter (UKF)
weighted-average data fusion
HBV lumped rainfall-runoff model
SWAT semi-distributed model
WetSpa distributed model
PEST parameter estimation
SUFI-2 uncertainty fitting
Nash-Sutcliffe efficiency (NSE) evaluation
sigma-point generation via symmetric point algorithm
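To make the UKF entries in the list above concrete, here is a minimal sketch of symmetric sigma-point generation and one forecast/update cycle with additive noise. It is not the authors' implementation: the `kappa` scaling, the function names, and the way the model is passed in as `f` (state transition) and `h` (observation operator) are assumptions chosen for illustration.

```python
import numpy as np


def sigma_points(x, P, kappa=0.0):
    """Symmetric sigma points: the mean plus/minus scaled square-root columns of P."""
    n = x.size
    S = np.linalg.cholesky((n + kappa) * P)   # lower-triangular square root
    pts = np.vstack([x, x + S.T, x - S.T])    # shape (2n + 1, n)
    w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)                # centre weight (zero when kappa = 0)
    return pts, w


def ukf_step(x, P, f, h, Q, R, y_obs, kappa=0.0):
    """One UKF cycle. f and h must map a 1-D state to 1-D arrays."""
    # Forecast: push sigma points through the model and add process noise Q.
    X, w = sigma_points(x, P, kappa)
    Xf = np.array([f(p) for p in X])
    x_f = w @ Xf
    dX = Xf - x_f
    P_f = dX.T @ (w[:, None] * dX) + Q

    # Update: redraw sigma points, map them to observation space, add R.
    X2, w = sigma_points(x_f, P_f, kappa)
    Y = np.array([h(p) for p in X2])
    y_f = w @ Y
    dY = Y - y_f
    P_yy = dY.T @ (w[:, None] * dY) + R       # innovation covariance
    P_xy = (X2 - x_f).T @ (w[:, None] * dY)   # state-output cross covariance
    K = P_xy @ np.linalg.inv(P_yy)            # Kalman gain
    x_a = x_f + K @ (np.atleast_1d(y_obs) - y_f)
    P_a = P_f - K @ P_yy @ K.T
    return x_a, P_a
```

For a toy linear reservoir, for example `f = lambda s: s + 5.0 - 0.2 * s` with discharge `h = lambda s: 0.2 * s`, this step reproduces the ordinary Kalman filter update exactly. Q and R are passed in unchanged at every call, which mirrors the constant-noise assumption questioned in the critique.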

Datasets

Menderjan watershed daily streamflow (1993-2017)
Iran Meteorological Organization weather station data
Iran Water Resources Management Company stream gauge data

Claims

UKF-based data assimilation improved Nash-Sutcliffe efficiency by up to 29.8% for the lumped HBV model during validation.
UKF consistently outperformed weighted-average data fusion across calibration and validation stages for all three model structures.
The greatest relative improvement from UKF was observed in the lumped model (HBV) and the smallest in the distributed model (WetSpa).
Data fusion using weighted averages provided modest NSE gains of up to 10.2%, less than those achieved by UKF.
Model spatial structure (lumped vs. semi-distributed vs. distributed) had little effect on overall NSE but strongly influenced simulation of specific runoff components such as peak flows and baseflows.