Funding: Supported in part by the National Natural Science Foundation of China (No. 61773330).
Abstract: Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the conditional independence constraint inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results demonstrate its state-of-the-art performance in AVSR tasks.
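The "progressively increasing loss weights" idea can be illustrated with a small sketch. The abstract does not give the actual schedule, so the linear ramp, the `inter_share` split between intermediate and final losses, and the function names `progressive_weights` and `combine_losses` are all assumptions made for illustration, not the authors' RIC-CTC implementation.

```python
def progressive_weights(num_layers: int, total: float = 1.0) -> list[float]:
    """Linearly increasing weights (assumed schedule) that sum to `total`."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    # 1 + 2 + ... + num_layers, used to normalize the ramp.
    denom = num_layers * (num_layers + 1) / 2
    return [total * (i + 1) / denom for i in range(num_layers)]


def combine_losses(intermediate_losses, final_loss, inter_share=0.3):
    """Weighted sum of CTC losses; deeper intermediate layers weigh more.

    `inter_share` (hypothetical) is the fraction of the total loss
    assigned to the intermediate layers; the rest goes to the final layer.
    """
    weights = progressive_weights(len(intermediate_losses), total=inter_share)
    inter = sum(w * loss for w, loss in zip(weights, intermediate_losses))
    return (1.0 - inter_share) * final_loss + inter


if __name__ == "__main__":
    # Three intermediate CTC losses (shallow to deep) plus the final loss.
    print(progressive_weights(3))
    print(combine_losses([2.4, 1.9, 1.5], final_loss=1.2))
```

Under this assumed schedule, shallow layers contribute little to the objective while layers closer to the output are supervised more strongly; any monotonically increasing schedule would fit the abstract's description equally well.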
Funding: Financially supported by the National Natural Science Foundation of China (No. 22273026), the Scientific Research Innovation Capability Support Project for Young Faculty (No. ZYGXQNJSKYCXNLZCXM-I15), the Basic and Applied Basic Research Foundation of Guangdong Province (No. 2024A1515012401), the GJYC program of Guangzhou (No. 2024D03J0002), the China Postdoctoral Science Foundation (No. 2024M750938), and the Postdoctoral Fellowship Program of CPSF (No. GZC20240492).
Abstract: In contrast to cyclic polymers with ring-like backbones, side-chain cyclization is another intriguing structural feature that has not been extensively studied. In this study, a library of orthogonally protected monomers featuring monocyclic, dicyclic, or tricyclic pendant motifs was designed and prepared based on malic acid derivatives. Polyesters with precise chemical structures and uniform chain lengths were prepared modularly through iterative growth. Meticulous control over the chemical details allows for a close investigation of the topological effects on the polymer properties. Compared to their linear side-chain counterparts, the presence of cyclic pendant groups has a significant impact on chain conformation, leading to a reduction in hydrodynamic volume and an enhancement in the glass transition temperature. These results underscore the potential of tailoring polymer properties through rational engineering of side-chain topology.