Indistinguishable points attention-aware network for infrared small object detection
Abstract: As aircraft become more maneuverable, multi-frame infrared small-target detection methods can no longer meet detection requirements. In recent years, single-frame infrared small-target detection methods based on deep learning have made significant progress. However, infrared small targets usually lack shape features, and their boundaries blur into the background, which makes accurate segmentation difficult. To address this, an indistinguishable points attention-aware network for infrared small object detection is proposed. First, a point-based region proposal module obtains potential target regions while filtering out redundant background. Then, to achieve high-quality segmentation, a mask boundary refinement module identifies the disordered, non-local indistinguishable points in the coarse mask, fuses the multi-scale features of these points, and models attention pixel by pixel. Finally, a point detection head re-predicts from the attention-aware features of the indistinguishable points to generate a fine segmentation mask. On the public datasets NUDT-SIRST and IRDST, the proposed method reaches mAP of 87.4 and 63.4 and F-measures of 0.8935 and 0.7056, respectively. It segments targets accurately across multiple detection scenarios and target morphologies and suppresses false alarms while keeping computational overhead under control.
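To make the selection of indistinguishable points in the coarse mask concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a PointRend-style biased sampling scheme in which k is an oversampling factor and γ is the fraction of points chosen by uncertainty, which is one plausible reading of the k and γ settings in Table 5. The function name `select_uncertain_points`, the use of distance from 0.5 as the uncertainty measure, and the nested-list mask format are all illustrative assumptions.

```python
import random


def select_uncertain_points(prob_mask, n_points, k=3, gamma=0.75, seed=0):
    """Pick n_points (row, col) coordinates from a coarse probability mask.

    Assumed scheme (PointRend-style, not confirmed by the source):
      1. draw k * n_points candidate pixels uniformly at random
         (with replacement, so duplicates are possible);
      2. keep the int(gamma * n_points) candidates whose foreground
         probability is closest to 0.5 (the most ambiguous ones);
      3. fill the remaining slots with fresh uniformly sampled pixels.
    """
    rng = random.Random(seed)
    height, width = len(prob_mask), len(prob_mask[0])

    def uniform(n):
        # n pixel coordinates sampled uniformly over the mask
        return [(rng.randrange(height), rng.randrange(width)) for _ in range(n)]

    candidates = uniform(k * n_points)
    # Sort candidates from most to least uncertain.
    candidates.sort(key=lambda p: abs(prob_mask[p[0]][p[1]] - 0.5))
    n_hard = int(gamma * n_points)
    return candidates[:n_hard] + uniform(n_points - n_hard)


if __name__ == "__main__":
    # Toy 8x8 coarse mask: confident background with a blurry 2-pixel target.
    mask = [[0.0] * 8 for _ in range(8)]
    mask[3][3], mask[3][4] = 0.5, 0.6
    print(select_uncertain_points(mask, n_points=4, k=3, gamma=0.75))
```

Under this reading, k=1 with γ=0.00 reduces to purely uniform random sampling, while k=10 with γ=1.00 places every point on the most ambiguous candidates.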
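Table 4. Comparison of different region proposal modules

| Number of proposals | Point-based region proposal: mAP | Point-based region proposal: F-measure | RPN: mAP | RPN: F-measure |
|---|---|---|---|---|
| 1000 | 87.9 | 0.8927 | 86.2 | 0.8425 |
| 256 | 87.5 | 0.8962 | 85.8 | 0.8412 |
| 128 | 87.4 | 0.8935 | 85.2 | 0.8406 |
| 64 | 86.0 | 0.8901 | 84.5 | 0.8397 |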
Table 5. Detection results of different point selection strategies

| Point selection strategy | mAP |
|---|---|
| Uniform selection | 86.7 |
| k=1, γ=0.00 | 86.9 |
| k=3, γ=0.75 | 87.4 |
| k=10, γ=1.00 | 85.8 |
Table 1. Hyperparameter settings of the traditional algorithms

| Traditional algorithm | Hyperparameter settings |
|---|---|
| Top-hat | Nhood=ones(5) |
| LEF | h=0.2, α=0.5, P=9 |
| AADCDD | Inner window size={3, 5, 7, 9}, outer window size=19 |
| TLLCM | Window size={3, 5, 7, 9}, k=9 |
Table 2. Quantitative comparison of each method on the NUDT-SIRST and IRDST datasets

| Method | NUDT-SIRST: mAP | NUDT-SIRST: F-measure (Pre, Rec) | IRDST: mAP | IRDST: F-measure (Pre, Rec) |
|---|---|---|---|---|
| Top-hat | 1.5 | 0.3599 (0.2850, 0.4884) | 0.7 | 0.0088 (0.0045, 0.4107) |
| LEF | 6.4 | 0.1151 (0.0748, 0.2498) | 2.5 | 0.1219 (0.0686, 0.5470) |
| AADCDD | 1.6 | 0.1490 (0.3838, 0.0924) | 1.4 | 0.0705 (0.0521, 0.1090) |
| TLLCM | 16.5 | 0.0724 (0.0479, 0.1476) | 6.1 | 0.1881 (0.1254, 0.3759) |
| ALCNet | 69.3 | 0.7595 (0.7035, 0.8251) | 46.5 | 0.5929 (0.5461, 0.6486) |
| DNANet | 86.9 | 0.8645 (0.9070, 0.8259) | 62.1 | 0.6697 (0.7124, 0.6319) |
| RDIAN | 82.4 | 0.8900 (0.8990, 0.8811) | 60.0 | 0.7102 (0.7092, 0.7113) |
| Proposed method | 87.4 | 0.8935 (0.8923, 0.8948) | 63.4 | 0.7056 (0.7183, 0.6935) |
Table 3. Average single-image inference time of the deep learning methods

| Method | NUDT-SIRST: Time/s | IRDST: Time/s |
|---|---|---|
| ALCNet | 0.104 | 0.166 |
| DNANet | 0.089 | 0.259 |
| RDIAN | 0.065 | 0.114 |
| Proposed method | 0.099 | 0.121 |
Table 6. Results of fusing different features at the indistinguishable points

| Fine-grained features | Coarse mask | Position embedding | mAP |
|---|---|---|---|
| √ | | | 85.5 |
| √ | √ | | 85.8 |
| √ | √ | √ | 87.4 |
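Table 7. Detection results of different refinement schemes

| Refinement scheme | mAP |
|---|---|
| CNN (16×16) | 85.5 |
| MLP (16×16) | 86.2 |
| Mask boundary refinement module (S=3) | 87.4 |
| Mask boundary refinement module (S=6) | 87.6 |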
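The F-measure values in Table 2 can be checked against the precision and recall listed beside them. The sketch below assumes the F-measure is the standard F1 score (the harmonic mean of precision and recall), which the source does not state explicitly; `f_measure` is an illustrative helper name.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) for the proposed method, from Table 2.
reported = {
    "NUDT-SIRST": (0.8923, 0.8948, 0.8935),
    "IRDST": (0.7183, 0.6935, 0.7056),
}

for dataset, (pre, rec, f_reported) in reported.items():
    f_computed = f_measure(pre, rec)
    print(f"{dataset}: computed F = {f_computed:.4f}, reported F = {f_reported}")
```

Both computed values agree with the reported ones to within rounding of the four-decimal precision and recall figures.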