Abstract: Colorectal cancer (CRC) is a prevalent disease, and polyps are its precursors. Accurate polyp segmentation is therefore crucial for early CRC prevention. However, polyps vary widely in size and their boundaries are often unclear, which makes accurate segmentation a challenging task. This paper proposes the vision Mamba attention feature fusion UNet (VMA-UNet), an asymmetric U-shaped encoder-decoder model built on the state space model (SSM). VMA-UNet incorporates attention feature fusion (AFF) to enhance the feature representation of small polyps. A new loss function, the IUD loss, which combines the intersection over union (IoU) loss and the Dice loss, is proposed to handle both large and small polyps and to mitigate data imbalance. Evaluated on multiple datasets, VMA-UNet demonstrates robust performance, particularly in small polyp segmentation, which indicates its practical value. The proposed network overcomes inherent shortcomings of convolutional neural networks (CNNs) and Transformers: it models long-range interactions well while maintaining linear computational complexity. This study thus introduces a new SSM-based method for polyp segmentation.
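
As a rough illustration of the IUD loss named in the abstract, the sketch below adds a soft IoU loss to a Dice loss over a flattened mask. The abstract does not state how the two terms are weighted, so the weighted sum, the weights `w_iou` and `w_dice`, the smoothing constant `eps`, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def soft_iou_loss(pred: Sequence[float], target: Sequence[float],
                  eps: float = 1e-6) -> float:
    """Return 1 - soft IoU between predicted probabilities and a binary mask."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1.0 - (inter + eps) / (union + eps)


def dice_loss(pred: Sequence[float], target: Sequence[float],
              eps: float = 1e-6) -> float:
    """Return 1 - soft Dice coefficient for the same inputs."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def iud_loss(pred: Sequence[float], target: Sequence[float],
             w_iou: float = 1.0, w_dice: float = 1.0) -> float:
    """Weighted sum of the IoU and Dice losses.

    The weighting scheme is an assumption; the abstract only says that
    the two losses are combined.
    """
    return w_iou * soft_iou_loss(pred, target) + w_dice * dice_loss(pred, target)


if __name__ == "__main__":
    # Flattened 2x2 example: one of two predicted pixels overlaps the mask.
    pred = [1.0, 1.0, 0.0, 0.0]
    target = [1.0, 0.0, 0.0, 0.0]
    print(iud_loss(pred, target))
```

Both terms are computed from region overlap rather than per-pixel error, so neither is dominated by the large background area surrounding a small polyp, which is the usual motivation for using such losses on imbalanced segmentation data.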