Abstract: NTRU is a lattice-based public-key cryptosystem with reasonably short, easily generated keys, high speed, and low memory requirements, which makes it a viable candidate for wireless networks. This paper presents two optimized designs based on the enhanced NTRU algorithm. The first is a lightweight, fast NTRU core that performs encryption only. It has a gate count of 1175 and a power consumption of 1.51 μW, and it completes the whole encryption process in 1498 μs at 500 kHz, which makes it well suited to wireless sensor networks. The second is a high-speed NTRU core capable of both encryption and decryption, with delays of 16,064 μs and 128,010 μs, respectively. It consists of 25,758 equivalent gates and has a total power consumption of 59.2 μW, which could be reduced greatly if low-power methods were adopted. This core is recommended for base stations or servers in wireless networks.
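For orientation, the encryption step that such a core implements can be sketched in software. The sketch below follows the textbook NTRU form e = r * h + m (mod q), where `*` is cyclic convolution in Z_q[x]/(x^n - 1). It assumes the public key `h` already absorbs the small modulus p. The function names and the toy parameters are illustrative only and are not taken from the paper, whose "enhanced NTRU algorithm" may differ in detail.

```python
def cyclic_convolve(a, b, n, q):
    """Multiply polynomials a and b in Z_q[x]/(x^n - 1).

    Coefficients are stored lowest degree first; both lists have length n.
    """
    out = [0] * n
    for i, ai in enumerate(a):
        if ai == 0:
            continue  # blinding polynomials are sparse, so skip zero terms
        for j, bj in enumerate(b):
            k = (i + j) % n  # x^n wraps around to 1
            out[k] = (out[k] + ai * bj) % q
    return out


def ntru_encrypt(m, r, h, n, q):
    """Textbook NTRU encryption: e = r * h + m (mod q).

    m: message polynomial, r: random blinding polynomial, h: public key.
    Assumption: h already includes the factor p.
    """
    rh = cyclic_convolve(r, h, n, q)
    return [(x + y) % q for x, y in zip(rh, m)]


# Toy example with n = 3, q = 7 (far too small for real use).
ciphertext = ntru_encrypt(m=[1, 0, -1], r=[1, 1, 0], h=[0, 1, 0], n=3, q=7)
```

The convolution dominates the cost, which is why hardware designs concentrate on it. As a side note, 1498 μs at 500 kHz corresponds to about 749 clock cycles for one encryption.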
Funding: Supported in part by the National Natural Science Foundation of China (No. 62162016) and in part by the Innovation Project of Guangxi Graduate Education (Nos. YCBZ2023132 and YCSW2023304).
Abstract: The efficient implementation of the Advanced Encryption Standard (AES) is crucial for network data security. This paper presents novel hardware implementations of the AES S-box, a core component, using tower field representations and Boolean Satisfiability (SAT) solvers. Our research makes several contributions. First, we have optimized the GF(2^4) inversion, achieving a 31.35% area reduction (15.33 GE) compared to the best known implementations. Second, we have enhanced multiplication implementations for transformation matrices using a SAT method based on local solutions. This approach has yielded notable improvements, such as a 22.22% reduction in area (42.00 GE) for the top transformation matrix in the GF((2^4)^2)-type S-box implementation. Furthermore, we have proposed new implementations of GF(((2^2)^2)^2)-type and GF((2^4)^2)-type S-boxes, with the GF(((2^2)^2)^2)-type demonstrating superior performance. This implementation offers two variants: a small-area variant that sets new area records, and a fast variant that establishes new benchmarks in Area-Execution-Time (AET) and energy consumption. Our approach significantly improves upon existing S-box implementations, offering advancements in area, speed, and energy consumption. These optimizations contribute to more efficient and secure AES implementations, potentially enhancing various cryptographic applications in the field of network security.
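As a functional reference for what any S-box circuit must compute, the following minimal sketch evaluates the AES S-box directly in GF(2^8): a multiplicative inverse modulo the AES polynomial x^8 + x^4 + x^3 + x + 1, followed by the standard affine transformation. It is not the tower-field decomposition the paper optimizes. It only defines the input/output behavior that a GF((2^4)^2) or GF(((2^2)^2)^2) implementation has to match. The function names are illustrative.

```python
def gf256_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce back to 8 bits
    return result


def gf256_inv(a):
    """Multiplicative inverse in GF(2^8) via a^254; 0 maps to 0 by convention."""
    result, power, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf256_mul(result, power)
        power = gf256_mul(power, power)
        exp >>= 1
    return result if a else 0


def aes_sbox(x):
    """AES S-box: field inversion followed by the affine transformation."""
    inv = gf256_inv(x)
    out = 0
    for i in range(8):
        bit = (inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8))
        bit ^= (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8)) ^ (0x63 >> i)
        out |= (bit & 1) << i
    return out


SBOX = [aes_sbox(x) for x in range(256)]
```

A table generated this way can serve as the golden model when checking a tower-field netlist exhaustively over all 256 inputs.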
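Abstract: To address the poor small-object detection performance, missed detections, and false detections in unmanned aerial vehicle (UAV) imagery, the BDSYOLO (BiFPN-Dual-Small target detection-YOLO) model is proposed as an improvement on YOLOv8. First, RepViTBlock (Revisiting mobile CNN from ViT perspective Block) and the EMA (Efficient Multi-scale Attention) mechanism are used to construct C2f-RE (C2f-RepViTBlock Efficient multi-scale attention), which improves the deep C2f (faster implementation of CSP bottleneck with 2 Convolutions) modules in the backbone network, strengthening the model's ability to extract small-object features while reducing the parameter count. Second, the neck network is rebuilt with a Bidirectional Feature Pyramid Network (BiFPN) so that features from different levels can be fused with one another. Third, on top of the improved neck network, a dual small-object detection layer is constructed that combines shallow and shallowest-layer features to improve the model's ability to detect small objects. Finally, the improved loss function Inner-EIoU (Inner-Efficient-Intersection over Union) is introduced, which measures the width-height relationship in a more reasonable way and addresses the inherent limitations of Intersection over Union (IoU). Experimental results show that, on the VisDrone2019 dataset, the improved model raises precision, recall, mAP@50, and mAP@50:95 over the original model by 8.5, 7.7, 9.2, and 6.3 percentage points, respectively, while having only 2.23×10^6 parameters and a model size reduced by 19.1%. The proposed model thus achieves a degree of lightweighting while significantly improving performance.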
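To make the loss terminology concrete, the sketch below computes plain IoU and the commonly published EIoU loss for axis-aligned boxes given as (x1, y1, x2, y2). EIoU adds three penalties to 1 - IoU: normalized center distance, width difference, and height difference. The abstract does not give the exact Inner-EIoU formulation, so the "Inner" auxiliary-box scaling is not reproduced here. The function names and the `eps` guard are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) form."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def eiou_loss(pred, gt, eps=1e-9):
    """EIoU loss: 1 - IoU + center, width, and height penalties."""
    # Width and height of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    # Squared distance between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    # Width and height differences, each normalized by the enclosing box.
    dw = (pred[2] - pred[0]) - (gt[2] - gt[0])
    dh = (pred[3] - pred[1]) - (gt[3] - gt[1])
    width_term = dw * dw / (cw * cw + eps)
    height_term = dh * dh / (ch * ch + eps)
    return 1.0 - iou(pred, gt) + center_term + width_term + height_term


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```

Unlike plain IoU, the EIoU loss still produces a useful gradient signal when two boxes do not overlap, because the center and size penalties remain nonzero.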
Funding: Supported by Northern Border University Researchers Supporting Project number (NBU-FFR-2025-432-03), Northern Border University, Arar, Saudi Arabia.
Abstract: Accurate brain tumor classification in medical imaging requires real-time processing and efficient computation, making hardware acceleration essential. Field Programmable Gate Arrays (FPGAs) offer parallelism and reconfigurability, making them well suited to such tasks. In this study, we propose a hardware-accelerated Convolutional Neural Network (CNN) for brain cancer classification, implemented on the PYNQ-Z2 FPGA. Our approach optimizes the first Conv2D layer using different numerical representations: 8-bit fixed-point (INT8), 16-bit fixed-point (FP16), and 32-bit fixed-point (FP32), while the remaining layers run on an ARM Cortex-A9 processor. Experimental results demonstrate that FPGA acceleration significantly outperforms the CPU (Central Processing Unit) based approach, and they emphasize the importance of selecting the appropriate numerical representation for hardware acceleration in medical imaging. On the PYNQ-Z2 FPGA, INT8 achieves a 16.8% reduction in latency and 22.2% power savings compared to FP32, making it well suited to real-time and energy-constrained applications. FP16 offers a strong balance, with a drop in accuracy of only 0.1 percentage points compared to FP32 (94.1% vs. 94.2%) while improving latency by 5% and reducing power consumption by 11.1%. Compared to prior works, the proposed FPGA-based CNN model achieves the highest classification accuracy (94.2%) with a throughput of up to 1.562 FPS, outperforming GPU-based and traditional CPU methods in both accuracy and hardware efficiency. These findings demonstrate the effectiveness of FPGA-based AI acceleration for real-time, power-efficient, and high-performance brain tumor classification, showcasing its practical potential in next-generation medical imaging systems.
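The trade-off between bit width and accuracy can be illustrated with a minimal quantization sketch. It assumes a symmetric, per-tensor scheme in which weights are mapped to signed integers of a given width through a single scale factor. The abstract does not specify the paper's actual quantization scheme, so the function names and the example weights here are hypothetical.

```python
def quantize_symmetric(values, bits=8):
    """Map floats to signed integers of the given width using one scale factor."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 32767 for 16 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


# Hypothetical 3x3 convolution kernel, flattened row by row.
kernel = [0.5, -1.0, 0.25, 0.1, 0.0, -0.3, 0.75, -0.6, 0.05]

q8, scale8 = quantize_symmetric(kernel, bits=8)
q16, scale16 = quantize_symmetric(kernel, bits=16)

# Worst-case rounding error for each width; it is bounded by half a step.
error8 = max(abs(v - d) for v, d in zip(kernel, dequantize(q8, scale8)))
error16 = max(abs(v - d) for v, d in zip(kernel, dequantize(q16, scale16)))
```

Narrower integers cut storage and arithmetic cost at the price of a larger rounding step, which is consistent with the pattern the abstract reports: the 8-bit variant saves the most latency and power, while the 16-bit variant stays closest to the 32-bit accuracy.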