Uncertainty-Aware Self-Attention Model for Time Series Prediction with Missing Values

Abstract

Missing values in time series data present a significant challenge, often degrading the performance of downstream tasks such as classification and forecasting. Traditional approaches address this issue by first imputing the missing values and then solving the predictive task independently. Recent methods have leveraged self-attention models to enhance imputation quality and accelerate inference. These models, however, predict values based on all input positions, including the missing ones, which can compromise the fidelity of the imputed data. In this paper, we propose the Uncertainty-Aware Self-Attention (UASA) model to overcome these limitations. Our approach introduces two novel techniques: (i) a self-attention mechanism with a partially observed diagonal that captures complex non-local dependencies in time series data, a characteristic also observed in fractional-order systems; and (ii) uncertainty quantification in data imputation to better inform downstream tasks. The attention mechanism draws inspiration from fractional calculus, where non-integer-order derivatives better characterize complex dynamical systems with long-memory effects, providing a more comprehensive mathematical framework for handling temporal data. The UASA model comprises an upstream component for data imputation and a downstream component for time series prediction, trained jointly in an end-to-end fashion to optimize imputation accuracy and task-specific objectives simultaneously. For classification tasks, the UASA model performs well even under high missing data rates, achieving a ROC-AUC of (Formula presented.), a PR-AUC of (Formula presented.), and an F1-score of (Formula presented.). For forecasting tasks on the AUST-Gait dataset, the UASA model achieves a Mean Squared Error (MSE) of 0.72 under (Formula presented.) missing data conditions (i.e., complete data input). Under the end-to-end training strategy evaluated across all missing data rates, the model achieves an average MSE of 0.74, showcasing its adaptability and robustness across diverse missing data scenarios.
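
The idea of attention that does not draw on missing positions can be illustrated with a minimal sketch. This is not the authors' UASA implementation. It assumes one possible reading of a "partially observed diagonal": attention scores toward missing time steps are blocked, so a diagonal entry survives only where that step was observed. The function name masked_diagonal_attention, the random projection weights, and the zero-filling of missing inputs are all hypothetical choices made for illustration.

```python
import numpy as np

def masked_diagonal_attention(x, observed, seed=0):
    """Single-head self-attention over time steps (illustrative only).

    x        : (T, D) array; missing time steps are assumed zero-filled
    observed : (T,) boolean array, True where the time step was observed
    Returns (output, weights) with shapes (T, D) and (T, T).
    """
    x = np.asarray(x, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    if not observed.any():
        raise ValueError("at least one time step must be observed")

    T, D = x.shape
    rng = np.random.default_rng(seed)
    # Hypothetical random projections stand in for learned weights.
    w_q, w_k, w_v = (rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(3))
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    scores = q @ k.T / np.sqrt(D)  # (T, T) scaled dot-product scores

    # Assumption: no query may attend to a missing key. This also removes
    # the diagonal entry of every missing step, leaving the diagonal
    # only "partially observed".
    scores[:, ~observed] = -np.inf

    # Row-wise softmax; exp(-inf) evaluates to exactly 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Usage: four time steps, the second one missing.
x = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 1.0], [0.5, -1.0]])
observed = np.array([True, False, True, True])
output, weights = masked_diagonal_attention(x, observed)
```

In this sketch the representation of the missing step is built only from observed steps, which is the property the abstract contrasts with models that predict from all input positions.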

Publication DOI: https://doi.org/10.3390/fractalfract9030181
Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
Aston University (General)
Funding Information: This work was supported by the University Synergy Innovation Program of Anhui Province (grant numbers GXXT-2022-053); Anhui New Era Education Quality Project (Graduate Education); Provincial Graduate students “Innovation and Entrepreneurship Star” (grant
Additional Information: Copyright © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Uncontrolled Keywords: missing values, self-attention, time series predictive, uncertainty, Analysis, Statistical and Nonlinear Physics, Statistics and Probability
Publication ISSN: 2504-3110
Data Access Statement: The original data presented in the study are openly available at https://github.com/LIbbbao/AUST_gait (accessed on 13 March 2025). Further inquiries can be directed to the first author or the corresponding author.
Last Modified: 01 Oct 2025 07:13
Date Deposited: 18 Sep 2025 15:52
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
https://www.mdp ... 04-3110/9/3/181 (Publisher URL)
PURE Output Type: Article
Published Date: 2025-03
Published Online Date: 2025-03-16
Accepted Date: 2025-03-14
Authors: Li, Jiabao
Wang, Chengjun
Su, Wenhang
Ye, Dongdong
Wang, Ziyang (ORCID Profile 0000-0003-1605-0873)

Download

Version: Published Version

License: Creative Commons Attribution

