Funding: Supported by the National Natural Science Foundation of China (61301155, 61571003), the Ministry of Education (MCM20130111), the Funds for the Central Universities (ZYGX2014J001), and the State Grid Power (W2015000333).
Abstract: A novel collaborative beamforming algorithm is proposed for a wireless communication system with multiple transmitters and one receiver. All transmitters take part in the collaboration, and the weighted message is transmitted simultaneously. To maximize the beamforming gain, the transmitters use one-bit feedback information to adjust their phase offsets. The algorithm tracks the direction in which the signal strength at the receiver can increase. Directional search and perturbation theory are used to achieve phase alignment. The feasibility of the proposed algorithm is demonstrated both experimentally and theoretically. Simulation results show that the proposed algorithm improves the convergence speed of phase alignment.
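The one-bit feedback loop described in the abstract can be illustrated with a minimal sketch. Note that this is a generic random-perturbation baseline, not the directional search variant the abstract proposes: each transmitter perturbs its phase at random, and the receiver returns a single bit saying whether the received strength improved. The function names, the unit-amplitude channel model, and all parameter values (`num_tx`, `iterations`, `step`, `seed`) are assumptions made for illustration.

```python
import cmath
import math
import random


def received_strength(phases, channel_phases):
    """Magnitude of the coherent sum of unit-amplitude transmissions."""
    return abs(sum(cmath.exp(1j * (p + c)) for p, c in zip(phases, channel_phases)))


def one_bit_feedback_align(num_tx=10, iterations=2000, step=0.1, seed=0):
    """Random-perturbation phase alignment driven by one feedback bit.

    Returns (initial_strength, final_strength, num_tx).
    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    # Unknown channel phase of each transmitter (assumed static).
    channel_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_tx)]
    phases = [0.0] * num_tx
    best = received_strength(phases, channel_phases)
    initial = best
    for _ in range(iterations):
        # Every transmitter applies a small independent random perturbation.
        trial = [p + rng.uniform(-step, step) for p in phases]
        strength = received_strength(trial, channel_phases)
        # The receiver feeds back one bit: did the strength improve?
        if strength > best:
            # Bit = 1: transmitters keep the perturbed phases.
            phases, best = trial, strength
        # Bit = 0: transmitters discard the perturbation.
    return initial, best, num_tx


if __name__ == "__main__":
    initial, final, n = one_bit_feedback_align()
    print(f"initial strength {initial:.3f}, final {final:.3f}, upper bound {n}")
```

Because a perturbation is kept only when the feedback bit reports an improvement, the received strength never decreases, and it is bounded above by the number of transmitters, which corresponds to perfect phase alignment. A directional search, as named in the abstract, would presumably reuse information about which perturbation direction succeeded instead of drawing a fresh random one each round.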
Funding: Supported by the National Natural Science Foundation of China under Grant 62104025.
Abstract: Deep learning typically requires large amounts of labeled data and often struggles with generalization, posing challenges for intelligent systems. In the real world, most electrocardiogram (ECG) signals are unlabeled, which limits the use of smart devices in ECG-related applications. Unsupervised learning methods, such as contrastive learning, have emerged as a solution to this constraint. However, most contrastive learning encoders rely on deep neural networks with many parameters, making them unsuitable for hardware implementation. This article introduces a hardware-friendly universal ECG encoder with around 1k parameters, based on contrastive learning, together with a fine-tuning framework for ECG-related tasks. We apply the encoder to a dual-task system for ECG-based arrhythmia classification and authentication, achieving 98.2% and 99.7% accuracy on the MIT-BIH dataset, respectively, with a false acceptance rate (FAR) of 0.274 and a false rejection rate (FRR) of 0.707 for authentication. We propose a dynamic averaging template concatenation technique that significantly improves neural network generalization. We also develop an energy-efficient hardware architecture optimized for the entire system and successfully implement it on an FPGA.
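The FAR and FRR figures quoted above are the standard error rates of an authentication system. The sketch below shows how they are usually computed from similarity scores and a decision threshold. The abstract does not state the unit of its figures or the decision rule used, so the "accept if score >= threshold" rule, the function name `far_frr`, and the sample scores are all assumptions for illustration, not data from the paper.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) as fractions under an 'accept if score >= threshold' rule.

    FAR: share of impostor attempts that are wrongly accepted.
    FRR: share of genuine attempts that are wrongly rejected.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


# Hypothetical similarity scores, invented for this example.
genuine = [0.91, 0.85, 0.78, 0.40]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]

far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)  # 1 of 5 impostors accepted, 1 of 4 genuine users rejected
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the two rates are normally reported together for a fixed operating point.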
Funding: Supported by the National Natural Science Foundation of China under Grants 62104025, 62104229, and 62104259.
Abstract: Object detection algorithms based on convolutional neural networks (CNNs) significantly enhance accuracy by expanding network scale. As network parameters increase, large-scale networks demand substantial memory resources, making deployment on hardware challenging. Although most neural network accelerators utilize off-chip storage, frequent access to external memory restricts processing speed, hindering the ability to meet the frame rate requirements of embedded systems. This creates a trade-off in which the speed and accuracy of embedded target detection accelerators cannot be simultaneously optimized. In this paper, we propose PODALA, an energy-efficient accelerator developed through an algorithm-hardware co-design methodology. For the object detection algorithm, we develop an optimized algorithm that combines the inverse-residual structure with depthwise separable convolution, effectively reducing network parameters while preserving high detection accuracy. For the hardware accelerator, we develop a custom layer fusion technique for PODALA to minimize memory access requirements. The overall design employs a streaming hardware architecture that combines a computing array with a refined ping-pong output buffer to execute different layer fusion computing modes efficiently. Our approach substantially reduces memory usage through optimizations in both algorithmic and hardware design. Evaluated on the Xilinx ZCU102 FPGA platform, PODALA achieves 78 frames per second (FPS) and an energy efficiency of 79.73 GOPS/W, underscoring its superiority over state-of-the-art solutions.
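To make the parameter reduction from depthwise separable convolution concrete, the following sketch compares weight counts for a single layer. The layer shape (3x3 kernel, 64 input channels, 128 output channels) is a hypothetical example, not a layer from PODALA, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape chosen for illustration.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)        # 73728
dws = depthwise_separable_params(k, c_in, c_out)  # 8768
ratio = dws / std                                 # equals 1/c_out + 1/k**2
print(std, dws, round(ratio, 4))
```

For this shape the separable layer needs roughly 12% of the weights of the standard layer. The ratio is 1/c_out + 1/k^2, so the saving is dominated by the kernel size once the output channel count is large.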