Abstract: To address the low detection accuracy caused by the widely varying scales of apple leaf disease spots and their similarity to the background, this paper proposes a multi-scale lightweight network (MSL-Net). First, a multiplexed aggregated feature extraction network is built from the residual bottleneck block (RES-Bottleneck) and middle partial-convolution (MP-Conv) to capture multi-scale spatial features and strengthen the focus on disease features, so that disease targets are better distinguished from background information. Second, a lightweight feature fusion network is designed using the scale-fuse concatenation (SF-Cat) and triple-scale sequence feature fusion (TSSF) modules to merge multi-scale feature maps comprehensively. Depthwise convolution (DWConv) and GhostNet lighten the network, while the cross stage partial bottleneck with 3 convolutions ghost-normalization attention module (C3-GN) reduces missed detections by suppressing irrelevant background information. Finally, soft non-maximum suppression (Soft-NMS) is applied in the post-processing stage to reduce misdetection of densely clustered disease spots. The results show that MSL-Net improves mean average precision at an intersection over union threshold of 0.5 (mAP@0.5) by 2.0% over the baseline, you only look once version 5s (YOLOv5s), while reducing parameters by 44% and computation by 27%, and that it outperforms other state-of-the-art (SOTA) models overall. The method also performs well compared with the latest research.
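To illustrate the post-processing step, the following is a minimal sketch of Soft-NMS, which lowers the scores of overlapping boxes instead of deleting them outright. The abstract does not say which Soft-NMS variant or hyperparameters MSL-Net uses, so the Gaussian decay, the values `sigma=0.5` and `score_thresh=0.001`, the `(x1, y1, x2, y2)` box format, and the function names `iou` and `soft_nms` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant; hyperparameters are illustrative).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    candidates = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best_pos = max(range(len(candidates)), key=lambda k: candidates[k][1])
        best_idx, best_score = candidates.pop(best_pos)
        if best_score < score_thresh:
            break
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes according to their overlap
        # with the selected box; drop any that fall below the threshold.
        decayed = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        candidates = decayed
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes and one isolated box (toy data).
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 3))
```

In this toy example the second box overlaps the first with an IoU of about 0.68. Hard NMS with a typical IoU threshold of 0.5 would discard it, whereas Soft-NMS keeps it with a reduced score, which is why the technique can help when disease spots are densely packed.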