Funding: Supported by the National Key R&D Program of China (No. 2021ZD0110400), the National Natural Science Foundation of China (Nos. 62406114 and 62306117), the Guangzhou Basic and Applied Basic Research Foundation (Nos. 2023A04J1687 and 2024A04J3681), the Fundamental Research Funds for the Central Universities (Nos. 2024ZYGXZR074 and 2023ZYGXZR023), the Guangdong Basic and Applied Basic Research Foundation (No. 2024A1515010220), the Postdoctoral Fellowship Program of CPSF (No. GZC20230841), the South China University of Technology-TCL Technology Innovation Fund, and the CAAI-MindSpore Open Fund developed on OpenI Community.
Abstract: Large language models (LLMs) have immense potential to enhance the capabilities of Cyber-Physical-Social Intelligence (CPSI) systems, enabling them to better engage with complex cyber, physical, and social environments. However, the high inference latency of LLMs, which is inherited from the autoregressive decoding process, hinders their wide application in CPSI systems. To address this challenge, current approaches have incorporated speculative decoding, which predicts multiple subsequent tokens in parallel and thereby accelerates inference. Nevertheless, the accuracy of these decoding heads falls short of that of autoregressive decoding. In light of these limitations, we propose ResDecode, a novel speculative decoding method characterized by efficient and accurate decoding heads. Within the lightweight draft model, we propose a residual decoding head that compensates for the full context encoder's limited capability on long-range dependencies, thus improving accuracy. ResDecode achieves a maximum speedup ratio of 3.2× on MT-bench compared with vanilla autoregressive decoding.
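The draft-then-verify idea behind speculative decoding can be illustrated with a minimal toy sketch. This is not ResDecode's residual decoding head, whose details the abstract does not give; it is generic greedy speculative decoding. The functions `target_next` and `draft_next` are hypothetical stand-ins for a large model and a lightweight draft model, and the "verification pass" is simulated with a loop where a real system would use one batched forward pass.

```python
from typing import List, Tuple

VOCAB = 50  # hypothetical toy vocabulary size


def target_next(prefix: List[int]) -> int:
    """Hypothetical stand-in for the large model's greedy next token."""
    return (sum(prefix) * 31 + len(prefix)) % VOCAB


def draft_next(prefix: List[int]) -> int:
    """Hypothetical cheap draft model: wrong whenever len(prefix) % 4 == 0."""
    guess = target_next(prefix)
    return guess if len(prefix) % 4 else (guess + 1) % VOCAB


def autoregressive(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(target_next(seq))
    return seq


def speculative(prompt: List[int], n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (sequence, verification passes)."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(seq) < goal:
        # 1) The draft model proposes k tokens cheaply.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(seq + draft))
        # 2) The target model checks all k positions. In a real system this
        #    is one parallel forward pass; here it is simulated sequentially.
        passes += 1
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(seq + accepted)
            if tok == expected:
                accepted.append(tok)       # draft token confirmed
            else:
                accepted.append(expected)  # replace first mismatch, stop
                break
        else:
            # All k draft tokens matched: the same pass yields a bonus token.
            accepted.append(target_next(seq + accepted))
        seq.extend(accepted)
    return seq[:goal], passes
```

Because every kept token is the target model's own greedy choice, the output matches plain autoregressive decoding; the gain comes from needing fewer target passes than generated tokens, and it grows with the draft model's accuracy.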
Funding: Supported in part by the National Natural Science Foundation of China (No. 61976144), the Stable Support Plan Program of Shenzhen Natural Science Fund, China (No. 20200925155017002), and the National Key Research and Development Program of China (No. 2020AAA0140000).
Abstract: Great progress has been made toward accurate face detection in recent years. However, heavy models and expensive computation make it difficult to deploy many detectors on mobile and embedded devices, where model size and latency are highly constrained. In this paper, we present a millisecond-level anchor-free face detector, YuNet, which is specifically designed for edge devices. There are several key contributions to improving the efficiency-accuracy trade-off. First, we analyse the influential state-of-the-art face detectors of recent years and summarize rules for reducing model size. Then, a lightweight face detector, YuNet, is introduced. Our detector contains a tiny and efficient feature extraction backbone and a simplified pyramid feature fusion neck. To the best of our knowledge, YuNet has the best trade-off between accuracy and speed. It has only 75856 parameters, less than 1/5 of the size of other small detectors. In addition, a training strategy is presented for the tiny face detector, and it can effectively train models with the same distribution as the training set. The proposed YuNet achieves 81.1% mAP (single-scale) on the WIDER FACE validation hard track with high inference efficiency (Intel i7-12700K: 1.6 ms per frame at 320×320). Because of its unique advantages, the repository for YuNet and its predecessors has been popular on GitHub and has gained more than 11K stars at https://github.com/ShiqiYu/libfacedetection.
Keywords: Face detection, object detection, computer vision, lightweight, inference efficiency, anchor-free mechanism.
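To put the reported figures in perspective, the short calculation below converts the parameter count into an approximate weight size and the per-frame latency into throughput. The 4 bytes per parameter (32-bit floats) is an assumption; the abstract does not state the storage precision, and the function names are illustrative only.

```python
PARAMS = 75856          # parameter count reported in the abstract
LATENCY_MS = 1.6        # reported per-frame latency at 320x320 (Intel i7-12700K)
BYTES_PER_PARAM = 4     # ASSUMPTION: float32 weights, not stated in the source


def weight_size_kib(params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in KiB."""
    return params * bytes_per_param / 1024


def frames_per_second(latency_ms: float) -> float:
    """Throughput implied by a per-frame latency, ignoring I/O overhead."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print(f"weights: ~{weight_size_kib(PARAMS, BYTES_PER_PARAM):.1f} KiB")
    print(f"throughput: ~{frames_per_second(LATENCY_MS):.0f} frames/s")
```

Under that assumption the raw weights come to roughly 296 KiB, and 1.6 ms per frame corresponds to about 625 frames per second for the detector alone.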