



ZHENG Yi-zhen, DAI Jian, ZHANG Tian, XU Kun

Citation: ZHENG Yi-zhen, DAI Jian, ZHANG Tian, XU Kun. Multimodal feature fusion based on heterogeneous optical neural networks[J]. Chinese Optics, 2023, 16(6): 1343-1355. doi: 10.37188/CO.2023-0036




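    DAI Jian (born 1987), male, from Hefei, Anhui, is an associate professor and doctoral supervisor at the School of Electronic Engineering, Beijing University of Posts and Telecommunications. His research focuses mainly on microwave photonics and integrated photonics. E-mail: daijian@bupt.edu.cn

    ZHANG Tian (born 1988), female, from Xiaogan, Hubei, is an associate professor and doctoral supervisor at the School of Electronic Engineering, Beijing University of Posts and Telecommunications. Her research focuses mainly on intelligent optical computing, intelligent design and optimization of photonic devices, and micro/nano photonics. E-mail: ztian@bupt.edu.cn

    XU Kun (born 1973), male, from Hunan, is a professor and doctoral supervisor at the School of Electronic Engineering, Beijing University of Posts and Telecommunications. His research focuses mainly on information photonics. E-mail: xukun@bupt.edu.cn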

  • CLC number: TP183

Multimodal feature fusion based on heterogeneous optical neural networks

Funds: Supported by the National Natural Science Foundation of China (No. 62171055, No. 61705015, No. 61625104, No. 61821001, No. 62135009, No. 61971065); National Key Research and Development Program (No. 2019YFB1803504); the State Key Laboratory of Information Photonics and Optical Communications (Beijing University of Posts and Telecommunications) (No. IPOC2020ZT08, No. IPOC2020ZT03)
  • Abstract:



  • Figure 1.  Schematic diagram of the heterogeneous photonic neural network

    Figure 2.  Optical convolution results

    Figure 3.  (a) Structure of the AbsSquared nonlinear activation function and (b) its test results

    Figure 4.  Optical power waveforms at the output ports

    Figure 5.  Selection of learning rate and optimizer

    Figure 6.  Spatial attention module

    Figure 7.  Schematic diagram of the attention-based heterogeneous photonic neural network

    Figure 8.  Selection of learning rate and optimizer for the attention-based heterogeneous photonic neural network

    Figure 9.  Effect of random Gaussian noise on training-set accuracy

    Table 1.  Share of training time taken by each part of the heterogeneous electronic neural network with concatenation fusion

    Table 2.  Share of training time taken by each part of the heterogeneous electronic neural network with attention-based fusion

    Table 3.  Comparison of classification results with advanced methods
  • Received:  2023-03-01
  • Revised:  2023-04-04
  • Published online:  2023-07-11


