A multi-scale attention-based Swin transformer model for medical image segmentation - Scientific Reports

Accurate medical image segmentation plays a vital role in disease diagnosis and helps physicians analyze target regions effectively. To support this task, there is a strong need for intelligent systems that streamline diagnostics and minimize human errors.

Conventional neural networks often suffer from large parameter counts, heavy computational demands, and suboptimal accuracy. This study introduces a Swin transformer-based architecture, optimized for segmentation tasks, that targets these three problems.

Model Architecture and Methodology

The proposed approach uses a Swin transformer as the encoder to capture deep visual features from medical images. By combining shifted windows with an attention mechanism, the model extracts essential details more effectively across different regions of an image.
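The window mechanism described above can be illustrated with a small sketch. In the general Swin transformer design, the feature map is split into non-overlapping windows, attention is computed inside each window, and the window grid is shifted between successive layers so that neighbouring windows exchange information. The code below shows only the partitioning and the shift, using plain Python lists so that it runs without dependencies. The function names `cyclic_shift` and `window_partition` are hypothetical, and nothing here is taken from the paper's implementation.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and to the left by `shift` cells, wrapping at the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping `window` x `window` blocks (row-major order)."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be divisible by the window size")
    return [
        [grid[r][c0:c0 + window] for r in range(r0, r0 + window)]
        for r0 in range(0, h, window)
        for c0 in range(0, w, window)
    ]


# Illustrative 4x4 grid of token ids 0..15 (assumed toy input).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular windows: attention would be restricted to each 2x2 block.
regular = window_partition(grid, 2)

# Shifted windows: shift by half the window size, then partition again.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # tokens grouped by the regular grid
print(shifted[0])  # tokens drawn from four different regular windows
```

In this toy case the first shifted window groups tokens 5, 6, 9 and 10, each of which belonged to a different regular window, which is how shifting lets information cross window boundaries. The attention computation itself, and the masking that full implementations apply to wrapped-around tokens, are omitted.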

To remain computationally efficient while achieving high precision, the architecture is designed to balance segmentation accuracy against parameter count and computational cost.

Decoder Components

The decoder incorporates a specially designed Dynamic Feature Fusion Block (DFFB) that enhances extracted representations and enables multi-scale feature analysis. This multi-level processing helps the system understand the structural complexity of medical images with greater precision.
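The summary does not describe the internals of the DFFB, so the following is only a generic sketch of multi-scale feature fusion, not the paper's block. It assumes a single-channel feature map, average pooling at several scales, nearest-neighbour upsampling back to the input size, and a softmax-weighted sum of the branches. The `logits` argument is a hypothetical stand-in for weights that a real model would learn or predict from the input.

```python
import math


def avg_pool(fmap, k):
    """Average-pool a 2D map with a k x k kernel and stride k."""
    h, w = len(fmap), len(fmap[0])
    return [[sum(fmap[r + i][c + j] for i in range(k) for j in range(k)) / (k * k)
             for c in range(0, w, k)]
            for r in range(0, h, k)]


def upsample_nearest(fmap, k):
    """Repeat every cell k times along both axes."""
    return [[v for v in row for _ in range(k)] for row in fmap for _ in range(k)]


def softmax(xs):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_multiscale(fmap, scales=(1, 2), logits=None):
    """Blend views of `fmap` pooled at each scale, weighted by softmax(logits)."""
    if logits is None:
        logits = [0.0] * len(scales)  # equal weighting by default
    weights = softmax(logits)
    h, w = len(fmap), len(fmap[0])
    fused = [[0.0] * w for _ in range(h)]
    for k, weight in zip(scales, weights):
        branch = upsample_nearest(avg_pool(fmap, k), k)
        for r in range(h):
            for c in range(w):
                fused[r][c] += weight * branch[r][c]
    return fused


# Assumed toy input: a 2x2 feature map fused at its native scale and a 2x coarser one.
fmap = [[1.0, 2.0], [3.0, 4.0]]
fused = fuse_multiscale(fmap)
print(fused)
```

With equal weights, each output cell is the average of its fine-scale value and the coarse-scale mean of its neighbourhood, so local detail and wider context both contribute to the fused representation.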

Additionally, a Dynamic Attention Enhancement Block refines these outputs by applying both spatial and channel attention mechanisms. These methods ensure that critical image areas receive additional focus, improving the final segmentation quality.
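To make the distinction between the two attention types concrete, here is a deliberately simplified sketch. Channel attention rescales whole feature channels, while spatial attention rescales individual pixel positions across all channels. The gates below are computed directly from averages passed through a sigmoid, with no learned layers, which is an assumption made for brevity. The function names are hypothetical and this is not the paper's Dynamic Attention Enhancement Block.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel of feat[c][h][w] by a gate from its global average."""
    out = []
    for channel in feat:
        mean = sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each pixel position by a gate from the mean across channels."""
    n, h, w = len(feat), len(feat[0]), len(feat[0][0])
    gates = [[sigmoid(sum(feat[c][r][x] for c in range(n)) / n) for x in range(w)]
             for r in range(h)]
    return [[[feat[c][r][x] * gates[r][x] for x in range(w)] for r in range(h)]
            for c in range(n)]


# Assumed toy input: 2 channels of 2x2 features, one silent and one active.
feat = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
refined = spatial_attention(channel_attention(feat))
print(refined[1][0][0])
```

Applying the channel gate first and the spatial gate second is one common ordering; the summary does not say which order, if any, the paper uses.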

“The overarching goal of this research is to design an optimized architecture for medical image segmentation that maintains high accuracy while reducing the number of network parameters and minimizing computational costs.”

Summary

This work presents an optimized Swin transformer model and advanced attention modules for accurate, efficient segmentation of medical images.

Author’s summary: The study proposes a Swin transformer-based segmentation model that enhances diagnostic accuracy while reducing computational complexity and improving multi-scale feature analysis.

Nature, 2025-11-07