Rethinking Hybrid U-Shape Network with Pixel-Level Feature Learning for Retinal Vessel Segmentation

Abstract

Retinal vessel segmentation is a critical, non-invasive medical imaging task in computer vision, essential for diagnosing fundus diseases. Although deep learning methods dominate this field, existing U-shaped encoder-decoder networks with skip connections face limitations when handling discrepancies among multi-scale features. Shallow encoder and decoder stages produce high-resolution but low-dimensional feature maps, effectively capturing fine vessel details, whereas deeper stages (such as the bottleneck) generate lower-resolution, high-dimensional feature maps rich in semantic information. Traditional U-shaped architectures often struggle to integrate these distinct types of features effectively. To address these challenges, this paper introduces a redesigned U-shaped network that incorporates modified convolution and transformer layers tailored specifically for segmenting slender and tortuous retinal vessel structures. A Multi-Core Channel-Spatial Attention (MCCSA) block replaces conventional skip connections, enhancing the extraction of high-frequency texture features in shallow stages. For deeper stages, a Pixel-level Vision Transformer (P-ViT) is introduced to model semantic interconnections among pixels, thereby improving semantic feature recognition. Furthermore, a Pixel-level Residual Dynamic Adaptive Convolutional Neural Network (P-CNN) is proposed to better capture the intricate curved topology of blood vessels. The proposed method is evaluated on two publicly available benchmark datasets, demonstrating significant segmentation performance improvements over existing U-shaped methods. Our contributions include enhanced multi-scale feature integration, improved semantic feature learning, and refined extraction of vessel topology.
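The paper's exact MCCSA design is not reproduced here; as a rough illustration of the channel-spatial attention family such a skip-connection replacement belongs to, below is a minimal NumPy sketch. All weights are random placeholders, the function name and reduction ratio `r` are assumptions, and a simple weighted sum stands in for the learned spatial convolution a real block would use.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(feat, r=4, rng=np.random.default_rng(0)):
    """Sketch of channel-then-spatial attention over feat of shape (C, H, W).

    Weights are random placeholders; a trained block would learn them.
    """
    C, H, W = feat.shape
    # Channel attention: squeeze spatial dims via average- and max-pooling,
    # pass both vectors through a shared 2-layer MLP, sum, squash with sigmoid.
    avg = feat.mean(axis=(1, 2))                      # (C,)
    mx = feat.max(axis=(1, 2))                        # (C,)
    W1 = rng.standard_normal((C // r, C)) * 0.1       # bottleneck projection
    W2 = rng.standard_normal((C, C // r)) * 0.1       # expansion back to C
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)      # shared ReLU MLP
    ca = sigmoid(mlp(avg) + mlp(mx))                  # (C,) channel weights
    feat = feat * ca[:, None, None]
    # Spatial attention: pool over channels, combine the two maps.
    s_avg = feat.mean(axis=0)                         # (H, W)
    s_max = feat.max(axis=0)                          # (H, W)
    sa = sigmoid(0.5 * (s_avg + s_max))               # stand-in for a learned conv
    return feat * sa[None, :, :]

# Toy usage on a non-negative 8-channel feature map.
feat_in = np.abs(np.random.default_rng(1).standard_normal((8, 16, 16)))
out = channel_spatial_attention(feat_in)
print(out.shape)
```

Because both attention maps lie in (0, 1), the block reweights rather than replaces features, which is why it can sit in place of an identity skip connection without destabilizing training.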

Publication DOI: https://doi.org/10.1109/ACCESS.2026.3663080
Divisions: College of Engineering & Physical Sciences
College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
Aston University (General)
Additional Information: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Publication ISSN: 2169-3536
Data Access Statement: The code, evaluation metrics, trained network weights, and datasets will be made publicly available at https://github.com/ziyangwang007/CVPixUNet.
Last Modified: 12 Feb 2026 08:06
Date Deposited: 11 Feb 2026 10:42
Related URLs: https://ieeexpl ... cument/11386860 (Publisher URL)
PURE Output Type: Article
Published Date: 2026-02-09
Published Online Date: 2026-02-09
Accepted Date: 2026-02-01
Authors: Wang, Ziyang (ORCID Profile 0000-0003-1605-0873)
Wu, Mian

Version: Accepted Version

License: Creative Commons Attribution
