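Abstract: High Performance Network Over Coax (HINOC) is a hybrid fiber-coaxial access technology that has now reached its third generation. To achieve 10 Gigabit Ethernet access rates, third-generation HINOC introduces a multi-channel bonding mechanism. While this mechanism effectively extends the channel bandwidth of a HINOC network, it tends to cause the data stream received at the deframing end of HIMAC (HINOC Medium Access Control) to arrive out of order. To address this problem, the paper proposes a deframing reordering method. The out-of-order deframing caused by multi-channel bonding is resolved through the design and implementation of four key techniques: reordering-queue buffer management, enqueue logical address calculation, timeout detection and flushing, and dequeue decision. The key functions are verified by simulation and by board-level testing. Experimental results show that the proposed method effectively handles the disorder caused by multi-channel bonding, keeps the system running stably when errors occur, is robust, and meets the functional and performance requirements of HIMAC 3.0 for 10-gigabit coaxial broadband access.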
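The four techniques named in the abstract (buffer management, enqueue address calculation, timeout flushing, dequeue decision) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the class name `ReorderQueue`, the window size, the tick-based timeout, and the use of `seq % size` as the "logical address" are all hypothetical, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReorderQueue:
    """Toy reordering queue; every parameter here is an illustrative assumption."""
    size: int = 8        # number of buffer slots (hypothetical)
    timeout: int = 4     # ticks to wait for a missing frame (hypothetical)
    expected: int = 0    # next sequence number to release
    waited: int = 0      # ticks spent waiting since the last release
    slots: Dict[int, bytes] = field(default_factory=dict)  # slot address -> payload

    def enqueue(self, seq: int, payload: bytes) -> bool:
        """Store a frame at its logical address, or reject it."""
        offset = seq - self.expected
        if offset < 0 or offset >= self.size:
            return False  # stale frame or outside the reorder window
        self.slots[seq % self.size] = payload  # assumed address calculation
        return True

    def dequeue_ready(self) -> List[bytes]:
        """Release every frame that is next in sequence."""
        released = []
        while self.expected % self.size in self.slots:
            released.append(self.slots.pop(self.expected % self.size))
            self.expected += 1
            self.waited = 0
        return released

    def tick(self) -> List[bytes]:
        """Advance time by one tick; on timeout, skip the missing frames."""
        if not self.slots:
            self.waited = 0
            return []
        self.waited += 1
        if self.waited < self.timeout:
            return []
        # Timeout: give up on the gap and jump to the earliest buffered frame.
        while self.expected % self.size not in self.slots:
            self.expected += 1
        return self.dequeue_ready()
```

In this model a frame that arrives early waits in its slot until the gap before it is filled or the timeout expires, so one lost frame delays the stream by at most `timeout` ticks instead of stalling it.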
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 61773330).
Abstract: Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the conditional independence constraint inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results demonstrate state-of-the-art performance in AVSR tasks.
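To make the contrast between softmax-based and ReLU-based attention concrete, the sketch below computes both kinds of attention weights for one query. It does not reproduce the GSA module: the function names are hypothetical, and dividing the ReLU output by the sequence length is an assumed normalisation that the abstract does not specify.

```python
import math
from typing import List


def scaled_scores(query: List[float], keys: List[List[float]]) -> List[float]:
    """Dot-product scores between one query and every key, scaled by sqrt(d)."""
    scale = math.sqrt(len(query))
    return [sum(q * k for q, k in zip(query, key)) / scale for key in keys]


def softmax_weights(scores: List[float]) -> List[float]:
    """Standard softmax: every position receives a strictly positive weight."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relu_weights(scores: List[float]) -> List[float]:
    """ReLU attention: non-positive scores become exactly zero.

    Division by the sequence length is an assumption made for this sketch.
    """
    n = len(scores)
    return [max(0.0, s) / n for s in scores]


def attend(weights: List[float], values: List[List[float]]) -> List[float]:
    """Weighted sum of the value vectors."""
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


query = [1.0, 0.0]
keys = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

scores = scaled_scores(query, keys)
dense = softmax_weights(scores)   # all three weights are positive
sparse = relu_weights(scores)     # only the first weight is non-zero
context = attend(sparse, values)
```

Because ReLU sets many weights to exactly zero, positions with non-positive scores contribute nothing to the output. That sparsity also discards some context, which is the loss the abstract says the context restoration block is meant to recover.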