Citation: SHAO Xiao-feng, SU Jing-yi, WANG Jin. On-Chip Training and Its Noise Immunity for Optical Convolutional Neural Networks[J]. Chinese Optics. doi: 10.37188/CO.2025-0016
Hybrid optical-electronic optical convolutional neural networks (OCNNs) combine the parallel linear computation capability of photonic devices with the nonlinear processing advantages of electronic components, demonstrating significant potential in classification tasks. However, fabrication inaccuracies in the photonic devices and circuit noise in FPGA-based backpropagation notably degrade network performance. In this work, a hybrid OCNN is constructed in which the linear computations are performed by optical computing layers based on Mach-Zehnder interferometers (MZIs), while the pooling operations and the training process are implemented on an FPGA. This study focuses on the feasibility of on-chip training on the FPGA, analyzing the impact of noise on training performance and proposing network optimization strategies to enhance the noise immunity of the OCNN. Specifically, noise immunity is improved by adjusting the pooling method and pooling size, and Dropout regularization is introduced after the pooling layer to further enhance the model's recognition accuracy. Experimental results indicate that the proposed on-chip training scheme effectively mitigates errors caused by fabrication inaccuracies in the photonic devices, whereas circuit noise remains the primary factor limiting OCNN performance. Notably, under high circuit-noise conditions, e.g., when the standard deviation of the MZI phase error caused by circuit noise reaches 0.003, the combination of max pooling and Dropout regularization significantly improves the recognition accuracy of the OCNN, which reaches a maximum of 78%. This research provides valuable insights for implementing on-chip training in OCNNs and explores new approaches for deploying hybrid optical-electronic architectures in high-noise environments.
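The two ingredients highlighted in the abstract, Gaussian phase noise on the MZI phase shifters and max pooling followed by Dropout on the electronic side, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the use of NumPy/PyTorch, the MZI convention (two ideal 50:50 couplers with an internal phase theta and an input phase phi), the layer sizes, the feature-map shape, and the Dropout rate are illustrative assumptions; only the phase-noise standard deviation of 0.003 rad is taken from the abstract.

import numpy as np
import torch
import torch.nn as nn

# Ideal 50:50 directional coupler (beam-splitter) matrix.
BS = np.array([[1, 1j], [1j, 1]], dtype=complex) / np.sqrt(2)

def mzi(theta, phi):
    # 2x2 MZI transfer matrix, applied right to left:
    # input phase shifter phi -> coupler -> internal phase shifter theta -> coupler.
    return BS @ np.diag([np.exp(1j * theta), 1.0]) @ BS @ np.diag([np.exp(1j * phi), 1.0])

def noisy_mzi(theta, phi, sigma=0.003, rng=None):
    # Same MZI with Gaussian phase errors (modeling circuit noise) on both shifters.
    rng = np.random.default_rng() if rng is None else rng
    d_theta, d_phi = rng.normal(0.0, sigma, size=2)
    return mzi(theta + d_theta, phi + d_phi)

rng = np.random.default_rng(0)
T_ideal = mzi(0.7, 1.2)
T_noisy = noisy_mzi(0.7, 1.2, sigma=0.003, rng=rng)
print("Frobenius deviation from ideal MZI:", np.linalg.norm(T_ideal - T_noisy))

# Electronic head: max pooling with Dropout placed directly after the pooling layer.
head = nn.Sequential(
    nn.MaxPool2d(kernel_size=2),   # pooling method and size are the tuned hyperparameters
    nn.Dropout(p=0.5),             # regularization inserted after pooling
    nn.Flatten(),
    nn.LazyLinear(10),             # hypothetical 10-class output
)
features = torch.randn(1, 8, 12, 12)          # hypothetical feature maps from the optical layers
print("logits shape:", head(features).shape)  # -> torch.Size([1, 10])

In a full simulation, the noisy 2x2 MZI matrices would be composed into a Clements-style mesh (Ref. [20]) to form each optical layer's weight matrix before the electronic head is applied.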
[1] ZHAO X, WANG L M, ZHANG Y F, et al. A review of convolutional neural networks in computer vision[J]. Artificial Intelligence Review, 2024, 57(4): 99. doi: 10.1007/s10462-024-10721-6
[2] SUN Y N, XUE B, ZHANG M J, et al. Evolving deep convolutional neural networks for image classification[J]. IEEE Transactions on Evolutionary Computation, 2020, 24(2): 394-407. doi: 10.1109/TEVC.2019.2916183
[3] WALDROP M M. More than Moore[J]. Nature, 2016, 530(7589): 144-148.
[4] WANG Y. Neural networks on chip: From CMOS accelerators to in-memory-computing[C]. Proceedings of 2018 31st IEEE International System-on-Chip Conference (SOCC), IEEE, 2018: 1-3.
[5] FU T ZH, ZHANG J F, SUN R, et al. Optical neural networks: progress and challenges[J]. Light: Science & Applications, 2024, 13(1): 263.
[6] FELDMANN J, YOUNGBLOOD N, WRIGHT C D, et al. All-optical spiking neurosynaptic networks with self-learning capabilities[J]. Nature, 2019, 569(7755): 208-214. doi: 10.1038/s41586-019-1157-8
[7] GU L X, ZHANG L F, NI R H, et al. Giant optical nonlinearity of Fermi polarons in atomically thin semiconductors[J]. Nature Photonics, 2024, 18(8): 816-822. doi: 10.1038/s41566-024-01434-x
[8] ASHTIANI F, ON M B, SANCHEZ-JACOME D, et al. Photonic max-pooling for deep neural networks using a programmable photonic platform[C]. Proceedings of 2023 Optical Fiber Communications Conference and Exhibition (OFC), IEEE, 2023: 1-3.
[9] SHAO X F, SU J Y, LU M H, et al. All-optical convolutional neural network with on-chip integrable optical average pooling for image classification[J]. Applied Optics, 2024, 63(23): 6263-6271. doi: 10.1364/AO.524502
[10] YU Y Z, CAO Y, WANG G, et al. Optical diffractive convolutional neural networks implemented in an all-optical way[J]. Sensors, 2023, 23(12): 5749. doi: 10.3390/s23125749
[11] XU SH F, WANG J, WANG R, et al. High-accuracy optical convolution unit architecture for convolutional neural networks by cascaded acousto-optical modulator arrays[J]. Optics Express, 2019, 27(14): 19778-19787. doi: 10.1364/OE.27.019778
[12] CHENG Y, ZHANG J N, ZHOU T K, et al. Photonic neuromorphic architecture for tens-of-task lifelong learning[J]. Light: Science & Applications, 2024, 13(1): 56.
[13] QI J, WANG SH, LIU ZH, et al. A gradient-free training approach for optical neural networks based on stochastic functions[J]. Proceedings of SPIE, 2024, 13236: 132360R.
[14] GU J, ZHU H, FENG C, et al. L2ight: enabling on-chip learning for optical neural networks via efficient in-situ subspace optimization[C]. Proceedings of the 35th International Conference on Neural Information Processing Systems, Curran Associates Inc., 2021: 662.
[15] FANG M Y S, MANIPATRUNI S, WIERZYNSKI C, et al. Design of optical neural networks with component imprecisions[J]. Optics Express, 2019, 27(10): 14009-14029. doi: 10.1364/OE.27.014009
[16] SHOKRANEH F, GEOFFROY-GAGNON S, LIBOIRON-LADOUCEUR O. The diamond mesh, a phase-error- and loss-tolerant field-programmable MZI-based optical processor for optical neural networks[J]. Optics Express, 2020, 28(16): 23495-23508. doi: 10.1364/OE.395441
[17] MOJAVER K H R, ZHAO B K, LEUNG E, et al. Addressing the programming challenges of practical interferometric mesh based optical processors[J]. Optics Express, 2023, 31(15): 23851-23866. doi: 10.1364/OE.489493
[18] TSAI Y H, HAMSICI O C, YANG M H. Adaptive region pooling for object detection[C]. Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2015: 731-739.
[19] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
[20] CLEMENTS W R, HUMPHREYS P C, METCALF B J, et al. Optimal design for universal multiport interferometers[J]. Optica, 2016, 3(12): 1460-1465. doi: 10.1364/OPTICA.3.001460
[21] MITTAL S. A survey of FPGA-based accelerators for convolutional neural networks[J]. Neural Computing and Applications, 2020, 32(4): 1109-1139. doi: 10.1007/s00521-018-3761-1
[22] HAMERLY R, BANDYOPADHYAY S, ENGLUND D. Asymptotically fault-tolerant programmable photonics[J]. Nature Communications, 2022, 13(1): 6831. doi: 10.1038/s41467-022-34308-3
[23] SALEHIN I, KANG D K. A review on dropout regularization approaches for deep neural networks within the scholarly domain[J]. Electronics, 2023, 12(14): 3106. doi: 10.3390/electronics12143106
[24] GHOLAMALINEZHAD H, KHOSRAVI H. Pooling methods in deep neural networks, a review[J]. arXiv preprint arXiv:2009.07485, 2020.
[25] SALEHIN I, KANG D K. A review on dropout regularization approaches for deep neural networks within the scholarly domain[J]. Electronics, 2023, 12(14): 3106. doi: 10.3390/electronics12143106
[26] JU Y G. Scalable optical convolutional neural networks based on free-space optics using lens arrays and a spatial light modulator[J]. Journal of Imaging, 2023, 9(11): 241. doi: 10.3390/jimaging9110241
[27] HE K M, ZHANG X Y, REN SH Q, et al. Deep residual learning for image recognition[C]. Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2016: 770-778.
[28] PEARL N, TREIBITZ T, KORMAN S. NAN: Noise-aware NeRFs for burst-denoising[C]. Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2022: 12662-12671.
[29] XIE J, LIU S Y, CHEN J X, et al. Huber loss based distributed robust learning algorithm for random vector functional-link network[J]. Artificial Intelligence Review, 2023, 56(8): 8197-8218. doi: 10.1007/s10462-022-10362-7