Audio-visual speech recognition (AVSR), which integrates audio and visual modalities to improve recognition performance and robustness in noisy or adverse acoustic conditions, has attracted significant research interest. However, Conformer-based architectures remain computationally expensive because the space and time complexity of their softmax-based attention mechanisms grows quadratically with sequence length. In addition, Conformer-based architectures may not provide sufficient flexibility for modeling local dependencies at different granularities. To mitigate these limitations, this study introduces a novel AVSR framework based on a ReLU-based Sparse and Grouped Conformer (RSG-Conformer) architecture. Specifically, we propose a Global-enhanced Sparse Attention (GSA) module incorporating an efficient context restoration block to recover lost contextual cues. Concurrently, a Grouped-scale Convolution (GSC) module replaces the standard Conformer convolution module, providing adaptive local modeling across varying temporal resolutions. Furthermore, we integrate a Refined Intermediate Contextual CTC (RIC-CTC) supervision strategy. This approach applies progressively increasing loss weights combined with convolution-based context aggregation, thereby further relaxing the conditional independence constraint inherent in standard CTC frameworks. Evaluations on the LRS2 and LRS3 benchmarks validate the efficacy of our approach, with word error rates (WERs) reduced to 1.8% and 1.5%, respectively. These results demonstrate its state-of-the-art performance in AVSR tasks.
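The abstract above mentions progressively increasing loss weights for intermediate CTC supervision but does not give the schedule. The following is a minimal sketch of one possible reading, assuming a linear schedule normalized to sum to 1; the function names `increasing_weights` and `weighted_intermediate_loss` are hypothetical and are not taken from the paper.

```python
def increasing_weights(num_losses: int) -> list[float]:
    """Weights proportional to 1, 2, ..., n, normalized to sum to 1.

    Assumption: a linear schedule. The paper only states that the
    weights increase progressively with depth.
    """
    if num_losses <= 0:
        raise ValueError("num_losses must be positive")
    total = num_losses * (num_losses + 1) / 2
    return [(i + 1) / total for i in range(num_losses)]


def weighted_intermediate_loss(intermediate_losses: list[float]) -> float:
    """Combine per-layer intermediate losses, weighting deeper layers more."""
    weights = increasing_weights(len(intermediate_losses))
    return sum(w * loss for w, loss in zip(weights, intermediate_losses))
```

With three intermediate losses, the shallowest layer receives weight 1/6 and the deepest 3/6, so later layers, which are closer to the final prediction, dominate the auxiliary objective.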
With the increasingly prominent trend of globalization, English, as the common language of international communication, plays an increasingly important role in university education. As a key link in English teaching, the college English audio-visual oral course not only imparts language knowledge and skills, but also shoulders the important task of cultivating students' critical thinking. As one of the essential core qualities of modern talents, critical thinking ability plays an irreplaceable role in students' in-depth understanding of English knowledge, improving intercultural communication ability and cultivating innovative thinking. This paper expounds the significance of cultivating students' critical thinking ability in college English audio-visual and oral teaching, and puts forward a series of innovative teaching strategies to cultivate students' critical thinking ability, combining practical teaching experience with cutting-edge education theory, in order to provide new ideas and practical guidance for the improvement of college English teaching quality and the development of students' comprehensive quality.
We observe that the response speed of a linear time-invariant system to a step reference input depends not only on the system parameters but also on the magnitude of the step input. Based on this observation, we demonstrate a method to schedule the magnitude of the reference input to achieve a faster response.
Low heat input welding is widely used in industry. The microstructure and toughness of welded joints under low heat input conditions have received less attention than those under high heat input. The impact toughness, microstructure and failure mechanisms of the coarse-grain heat-affected zone (CGHAZ) in a micro-alloyed steel were investigated by welding thermal simulation with the heat input ranging from 15 to 65 kJ/cm. The impact toughness of the CGHAZ is highly sensitive to variations in low heat input. The failure mechanisms were discussed from the viewpoints of micro-void formation and micro-crack propagation. Micro-voids preferentially form and grow at the soft phase of grain boundary ferrite (GBF). At heat inputs of no more than 22 kJ/cm, martensite was dominantly formed, and the micro-cracks initiated from the GBF propagated into the grain interiors, leading to brittle fracture and low toughness. When the heat input was increased to 31.2 kJ/cm, granular bainite became the dominant constituent, causing cracks to deflect away from the GBF and propagate into prior austenite grains. The high density of high-angle and low-angle grain boundaries and the presence of retained austenite effectively restricted crack propagation, resulting in ductile fracture behavior and enhanced toughness. High heat input (62.3 kJ/cm) promoted coarse GBF formation, providing continuous paths for micro-crack propagation. This direct intergranular crack progression caused brittle fracture and low toughness. Industrial cold cracking in the CGHAZ can thus be controlled by optimizing the heat input to maximize toughness.
This study integrates explicit input enhancement into comparative continuation writing, defined as a task in which learners produce a continuation by comparing their own expression with an input text, aligning with its discourse structure and linguistic features, while developing their own ideas. It aims to examine whether English as a Foreign Language (EFL) learners in China exhibit differences in discourse competence and writing performance when completing comparative continuation writing combined with different input enhancement techniques, and whether the alignment effect occurs at the discourse level. Sixty first-year Chinese senior middle school students were divided into four groups: three groups engaged in comparative continuation writing with varying input enhancement, achieved by combining different techniques, while a control group performed a designated-topic writing task. The results revealed that the three comparative continuation writing groups outperformed the designated-topic writing group in discourse competence, particularly in the use of temporal connectives. However, differences and some inconsistencies were observed among the comparative continuation writing groups across individual indices. The study highlights effective ways to incorporate comparative continuation writing into English instruction and demonstrates how explicit input enhancement can complement the task, simultaneously activating the alignment effect proposed by the xu-argument and enhancing discourse competence in writing.
In this paper, we study the control of a rotating flexible body-beam system (RFBBS), which consists of a tip mass attached to the free end and a rigid disk attached to the clamped end of an Euler-Bernoulli beam. The boundary control input is affected by both an unknown disturbance and nonlinear input backlash. First, the input backlash is treated as the desired control input combined with a nonlinear input error, which converts it into an external disturbance, and the control signal is then designed through the energy-based control method. Next, the closed-loop system's stability is analysed through the Lyapunov direct method. Finally, the efficacy of the proposed control scheme is tested through numerical simulations utilizing the finite difference method.
Soil greenhouse gas (GHG) emissions contribute profoundly to global warming; however, how plant detritus input alters GHG emissions is poorly understood. Here, we used detritus input and removal treatments (i.e., DIRT: control, CK; double litter, DL; no roots with double litter, NRDL; no litter, NL; no roots, NR; no roots and no litter, NRNL) to assess the effects of litter and root inputs on soil CO₂, CH₄, and N₂O fluxes in a coniferous (Pinus yunnanensis) and a broad-leaf forest (Quercus pannosa) in a subalpine region in southwestern China. Litter addition increased CO₂ emissions by 22.22% on average, but did not significantly alter CH₄ uptake or N₂O emission compared to the CK. Litter removal (NL and NRNL) significantly reduced CO₂ emissions by 30.22% on average and N₂O emissions by 31.16% on average from both forest soils, but did not significantly affect soil CH₄ uptake. Root removal (NR and NRNL) generally decreased these three soil GHG fluxes. Changes in β-1,4-glucosidase (BG), an enzyme involved in C cycling, and in phospholipid fatty acid (PLFA) biomass were projected to influence CO₂ emissions, while soil microclimate (temperature and moisture) combined with BG activity mainly regulated CH₄ uptake. Alterations in dissolved organic nitrogen, microbial biomass nitrogen and BG were mainly responsible for changes in N₂O emissions. Interestingly, coniferous forest soil seemed to promote CH₄ uptake more than the broad-leaf forest soil, but CO₂ and N₂O fluxes were not significantly affected by forest type. As expected, litter addition significantly increased the warming potential, while litter removal lowered it. These findings reveal the divergent roles of plant detritus input and forest type in shaping soil GHG fluxes, thereby providing insights into forest management and into predicting the contributions of subalpine forests to global warming.
Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully. Researchers tend to leverage these two modalities to improve the performance of previously considered single-modality tasks or address new challenging problems. In this paper, we provide a comprehensive survey of recent audio-visual learning development. We divide the current audio-visual learning tasks into four different subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation learning. State-of-the-art methods, as well as the remaining challenges of each subfield, are further discussed. Finally, we summarize the commonly used datasets and challenges.
Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach to story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames and integrates them with silence clip detection results, as well as shot segmentation results, to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of approaches that use only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, an accuracy rate of 85.8% and a recall rate of 97.5% are obtained. The experimental results show that the approach is valid and robust.
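The abstract above reports an accuracy rate and a recall rate for detected story boundaries without saying how detections are matched to the ground truth. As a rough illustration only, the sketch below scores detected boundary frames against reference frames using a simple greedy match within a frame tolerance. Reading "accuracy rate" as precision, the tolerance parameter, and the function name `boundary_scores` are all assumptions, not details from the paper.

```python
def boundary_scores(detected, reference, tolerance=0):
    """Return (precision, recall) for detected boundary frame indices.

    A detection counts as a hit if it lies within `tolerance` frames of a
    reference boundary that has not already been matched (greedy matching,
    an assumption made for this sketch).
    """
    unmatched = sorted(reference)
    hits = 0
    for frame in sorted(detected):
        match = next((r for r in unmatched if abs(r - frame) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)  # each reference boundary matches once
            hits += 1
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

For example, with detections at frames 100, 250, 400 and 900, reference boundaries at 98, 400 and 700, and a tolerance of 5 frames, two detections are hits, giving a precision of 0.5 and a recall of 2/3.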
In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusion of audio and omnidirectional camera feature inputs, providing a comprehensive solution to enhance the precision and resilience of drone classification and 3D localization. Specifically, AV-FDTI employs a CRNN network to capture vital temporal dynamics within the audio domain and utilizes a pretrained ResNet50 model for image feature extraction. Furthermore, we adopt a visual information entropy and cross-attention-based mechanism to enhance the fusion of visual and audio data. Notably, our system is trained on automated Leica tracking annotations, offering ground truth data with millimeter-level accuracy. Comprehensive comparative evaluations demonstrate the superiority of our solution over existing systems. In our commitment to advancing this field, we will release this work as open-source code together with the wearable AV-FDTI design, contributing valuable resources to the research community.
This paper presents a thorough review of audio-visual translation studies both at home and abroad. Reviewing foreign achievements in this specific field of translation studies can shed some light on domestic audio-visual translation practice and research. The review of Chinese scholars' audio-visual translation studies aims to offer potential directions and guidelines for future studies and to identify neglected aspects. Based on the summary of relevant studies, possible topics for further study are proposed.
Emotion recognition has become an important task in modern human-computer interaction. A multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learning, and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of the training samples in the current layer are decided by the recognition results of the ensemble classifier in the upper layer. Thus, the training procedure using the current cue can focus more on the samples that were difficult for the previous cue. Our MBHMM classifier combines these ensemble classifiers and takes advantage of the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
On February 10, 2019 (US Central Time), the China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for the American audience at the world's top-level Buntrock Hall at Symphony Center.
Mongolian audio-visual works are an important vehicle for exploring the true significance of this national culture. This paper argues that the Mongolian people in Inner Mongolia continually strengthen individuals' sense of identification with the ethnic group as a whole through the influence of film, television and music, and on this basis continually develop a new culture in line with modern and contemporary life, further enhancing their sense of belonging to the ethnic nation.
Based on the current situation of college audio-visual English teaching in China, this article points out that avoidance in class is a serious phenomenon in college audio-visual English teaching. After further analysis, and taking into account the characteristics of college English audio-visual teaching in China, it proposes applying the task-based teaching method to college audio-visual English teaching and outlines its steps, in an attempt to alleviate students' avoidance behavior.
Zhuang culture, a representative of the native ethnic culture of Guangxi, China, is of great significance to Chinese culture. In order to promote traditional culture, enrich the teaching content of the College English Audio-Visual Speaking Course, and enhance the intercultural communication ability of college students, this paper, from a multicultural perspective, explores the classroom practice of integrating indigenous Zhuang cultural elements into the College English Audio-Visual Speaking Course, providing new perspectives and a reference for multicultural education in foreign languages.
By distinguishing audio-visual interpretation from visual interpretation, this paper makes clear that the two belong to different categories in essence and working method, so as to avoid misunderstanding and confusion between them in learning. There are also some misconceptions in how they are taught. This paper explores teaching methods for visual interpretation and audio-visual interpretation, with the aim of making the teaching process more reasonable and scientific.
The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme for MPEG-4 audio-visual objects (AVOs) over the DiffServ network with a QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the IP DiffServ per-hop behaviors (PHBs). This scheme can selectively discard packets of low importance in order to avoid network congestion. Simulation results show that the quality of the received video can adapt gracefully to the network state, as compared with the "best-effort" manner. Also, by allowing the content provider to define the prioritization of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.
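The abstract above states that object priority values are mapped to DiffServ per-hop behaviors but does not give the mapping. The sketch below shows one hypothetical mapping onto the standard Assured Forwarding (AF) codepoints, whose DSCP values follow the known pattern AFxy = 8x + 2y. The choice of AF classes, the rule that enhancement layers receive a higher drop precedence, and the function names are assumptions for illustration, not the scheme from the paper.

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP value of Assured Forwarding codepoint AFxy.

    x = af_class (1..4), y = drop_precedence (1..3); for example AF11 = 10.
    """
    if not 1 <= af_class <= 4:
        raise ValueError("af_class must be in 1..4")
    if not 1 <= drop_precedence <= 3:
        raise ValueError("drop_precedence must be in 1..3")
    return 8 * af_class + 2 * drop_precedence


def dscp_for_object(priority: int, layer: int) -> int:
    """Hypothetical mapping from an object's (priority, layer) to a DSCP.

    Assumption: priority 1..4 selects the AF class, and the base layer
    (layer 0) gets the lowest drop precedence, so packets from enhancement
    layers are discarded first under congestion.
    """
    drop_precedence = min(layer + 1, 3)
    return af_dscp(priority, drop_precedence)
```

Under this mapping, the base layer of a priority-4 object is marked AF41 (DSCP 34), while a second enhancement layer of a priority-2 object is marked AF23 (DSCP 22) and is dropped earlier within its class.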
This paper addresses the lane-keeping control problem for autonomous ground vehicles subject to input saturation and uncertain system parameters. An enhanced adaptive terminal sliding mode based prescribed performance control scheme is proposed, which keeps the lateral position error of the vehicle within the prescribed performance boundaries at all times. This is achieved in two steps. First, an improved performance function is introduced into the controller design so that the stringent initial condition requirements can be relaxed, which yields a global prescribed performance control result. Second, a multivariable adaptive terminal sliding mode based controller is developed so that both input saturation and parameter uncertainties are handled effectively, which ensures robust lane-keeping control. Finally, the proposed control strategy is validated through numerical simulations, demonstrating its effectiveness.
Dear Editor, It is well known that event-triggered control (ETC) is an effective approach to addressing networked control problems for Industry 5.0. Its feasibility, however, has so far been restricted to canonical nonlinear systems. Considering this, a gradient-based adaptive ETC scheme for noncanonical nonlinear systems is developed in this letter, where hysteresis input constraints are also considered. By proper decomposition, the technical issue of handling ETC-induced measurement errors and hysteresis inputs can be transformed into a robustness problem with respect to bounded disturbance-like terms, which is then addressed by integrating a switching modification strategy into the adaptive design and developing a novel augmented error-based analysis framework. Experimental results based on a practical piezoactuator confirm the effectiveness of the proposed scheme.
Funding (RSG-Conformer AVSR study): supported in part by the National Natural Science Foundation of China (61773330).
Funding (critical thinking in the college English audio-visual oral course): A Study on the Teaching Reform of College English Audio-Visual Oral Course Oriented towards the Cultivation of Critical Thinking Ability (2501032339).
Funding (low heat input welding study): supported by the National Natural Science Foundation of China (No. 51804232); the Beijing Municipal Natural Science Foundation (No. 2212041); the Interdisciplinary Research Project for Young Teachers of USTB (Fundamental Research Funds for the Central Universities) (FRF-IDRY-20-020); and the GIMRT Program of the Institute for Materials Research, Tohoku University (202303-RDKGE-0518).
Funding (rotating flexible body-beam system study): supported in part by the National Natural Science Foundation of China under Grant Nos. 62403263 and 62373207; in part by the Natural Science Foundation of Qingdao, China under Grant No. 24-4-4-zrjj-88-jch; in part by the Team Plan for Youth Innovation of Universities in Shandong Province under Grant No. 2024KJH148; and in part by the Foundation of Key Laboratory of Autonomous Systems and Networked Control (South China University of Technology), Ministry of Education under Grant No. 2024A01.
Funding (soil greenhouse gas and detritus input study): supported by the National Natural Science Foundation of China (32130069), the National Key Research and Development Program of China (2024YFF1306700), and the Scientific Research Foundation of Education Department of Yunnan Province (2024Y004).
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB1001001), the Beijing Natural Science Foundation (No. JQ18017), and the National Natural Science Foundation of China (No. 61976002).
Abstract: Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully. Researchers tend to leverage these two modalities to improve the performance of previously considered single-modality tasks or to address new challenging problems. In this paper, we provide a comprehensive survey of recent audio-visual learning development. We divide the current audio-visual learning tasks into four subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation learning. State-of-the-art methods, as well as the remaining challenges of each subfield, are further discussed. Finally, we summarize the commonly used datasets and challenges.
Abstract: Video data are composed of multimodal information streams, including visual, auditory, and textual streams. This paper therefore describes an approach to story segmentation for news video using multimodal analysis. The proposed approach detects the topic-caption frames and integrates them with silence-clip detection results and shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of approaches that use only image analysis techniques. On test data with 135 400 frames, an accuracy rate of 85.8% and a recall rate of 97.5% are obtained for the detection of boundaries between news stories. The experimental results show that the approach is valid and robust.
Funding: National Research Foundation, Singapore, under its Medium-Sized Center for Advanced Robotics Technology Innovation (CARTIN), under project WP5 within the Delta-NTU Corporate Lab, with funding support from A*STAR under its IAF-ICP program (Grant no: I2201E0013) and Delta Electronics Inc.
Abstract: In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusion of audio and omnidirectional camera feature inputs, providing a comprehensive solution to enhance the precision and resilience of drone classification and 3D localization. Specifically, AV-FDTI employs a CRNN network to capture vital temporal dynamics within the audio domain and utilizes a pretrained ResNet50 model for image feature extraction. Furthermore, we adopt a visual information entropy and cross-attention-based mechanism to enhance the fusion of visual and audio data. Notably, our system is trained on automated Leica tracking annotations, offering accurate ground truth data with millimeter-level accuracy. Comprehensive comparative evaluations demonstrate the superiority of our solution over existing systems. In our commitment to advancing this field, we will release this work as open-source code and a wearable AV-FDTI design, contributing valuable resources to the research community.
Abstract: This paper presents a thorough review of audio-visual translation studies both at home and abroad. The review of foreign achievements in this field of translation studies can shed some light on our national audio-visual practice and research. The review of Chinese scholars' audio-visual translation studies is intended to suggest potential directions and guidelines for development, and to point out neglected aspects. Based on the summary of relevant studies, possible topics for further study are proposed.
Funding: Supported by the National Natural Science Foundation of China (60905006) and the NSFC-Guangdong Joint Fund (U1035004).
Abstract: Emotion recognition has become an important task in modern human-computer interaction. A multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learning, and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of the training samples in the current layer are determined by the recognition results of the ensemble classifier in the upper layer. Thus, the training procedure using the current cue can focus more on the samples that were difficult for the previous cue. Our MBHMM classifier is composed of these ensemble classifiers and takes advantage of the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
Abstract: On February 10 (US Central Time), 2019, the China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for the American audience at the world's top-level Buntrock Hall at Symphony Center.
Funding: This paper is an interim research result of the Basic Research Project of Beijing Institute of Graphic Communication: Research on the Artistic, Modern Communication and Publishing of Dian-shi Zhai Pictorial (1884-1898) (Serial Number Eb202008).
Abstract: Mongolian audio-visual works are an important vehicle for exploring the true significance of this national culture. This paper argues that, through the influence of film, television, and music, the Mongolian people in Inner Mongolia constantly strengthen individuals' sense of identity with the ethnic group as a whole, and on this basis constantly develop a new culture in line with modern and contemporary life, further enhancing their sense of belonging to the ethnic nation.
Abstract: Based on the current situation of college audio-visual English teaching in China, this article points out that avoidance in class is a serious phenomenon in college audio-visual English teaching. After further analysis, and taking into account the characteristics of college English audio-visual teaching in China, it proposes applying the task-based teaching method to college audio-visual English teaching and sets out the steps for doing so, in an attempt to alleviate student avoidance.
Funding: Supported by the Guangxi University of Chinese Medicine School-Level Education and Teaching Reform and Research Project: Integration and Innovative Practice of Ideological and Political Education and Zhuang Ethnic Culture in College English Audio-Visual Speaking Course (Project No. 2022B073).
Abstract: Zhuang culture, a representative of the native ethnic culture of Guangxi, China, is of great significance to Chinese culture. In order to promote traditional culture, enrich the teaching content of the College English Audio-Visual Speaking Course, and enhance the intercultural communication ability of college students, this paper, from a multicultural perspective, explores the classroom practice of integrating indigenous Zhuang cultural elements into the College English Audio-Visual Speaking Course, providing new perspectives and a reference for multicultural education in foreign languages.
Abstract: By distinguishing audio-visual interpretation from visual interpretation, this paper makes clear that the two belong to different categories in essence and in working method, so as to avoid misunderstanding and confusion between them in learning. At the same time, there are some misconceptions in how they are taught. This paper explores teaching methods for visual interpretation and audio-visual interpretation, with the aim of making the teaching process more reasonable and scientific.
Abstract: The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme for MPEG-4 audio-visual objects (AVOs) over the DiffServ network with a QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the IP DiffServ per-hop behaviors (PHBs). The scheme can selectively discard packets of low importance in order to avoid network congestion. Simulation results show that the quality of the received video adapts gracefully to the network state, as compared with the "best-effort" manner. Also, by allowing the content provider to define the prioritization of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.
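The priority-to-PHB mapping and selective discard described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's scheme: the `AVOPacket` type, the priority scale, the rule in `phb_for`, and the capacity-based `selective_discard` are all hypothetical and chosen only for illustration. The DSCP code points are the standard values for Expedited Forwarding and Assured Forwarding class 1.

```python
from dataclasses import dataclass

# Standard DSCP code points: EF = 46, AF11/AF12/AF13 = 10/12/14.
# Which PHB each group of objects receives is an assumption of this sketch.
DSCP = {"EF": 46, "AF11": 10, "AF12": 12, "AF13": 14}

# Order of importance: packets marked with later entries are discarded first.
KEEP_ORDER = ["EF", "AF11", "AF12", "AF13"]


@dataclass
class AVOPacket:
    object_id: str
    priority: int  # hypothetical scale: 0 = most important object
    layer: int     # 0 = base layer, higher = enhancement layer


def phb_for(pkt: AVOPacket) -> str:
    """Map an object's priority and scalable layer to a PHB name (assumed rule)."""
    if pkt.layer == 0:
        return "EF" if pkt.priority == 0 else "AF11"
    return "AF12" if pkt.layer == 1 else "AF13"


def selective_discard(packets, capacity):
    """Keep at most `capacity` packets, discarding the least important first.

    Surviving packets stay in arrival order; ties are broken by arrival order
    because `sorted` is stable.
    """
    ranked = sorted(range(len(packets)),
                    key=lambda i: KEEP_ORDER.index(phb_for(packets[i])))
    keep = set(ranked[:capacity])
    return [p for i, p in enumerate(packets) if i in keep]


if __name__ == "__main__":
    queue = [
        AVOPacket("background", priority=2, layer=2),
        AVOPacket("speech", priority=0, layer=0),
        AVOPacket("speaker", priority=1, layer=1),
        AVOPacket("speaker", priority=1, layer=0),
    ]
    for pkt in selective_discard(queue, capacity=2):
        print(pkt.object_id, phb_for(pkt), DSCP[phb_for(pkt)])
```

With room for only two packets, the sketch keeps the two base-layer packets and drops the enhancement layers, which is the kind of graceful degradation the abstract contrasts with best-effort delivery.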
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2023YFA1011803, in part by the Natural Science Foundation of Chongqing, China under Grant CSTB2023NSCQ-MSX0588, in part by the Fundamental Research Funds for the Central Universities, China under Grant 2023CDJKYJH047, in part by the National Natural Science Foundation of China under Grant 62273064, Grant 61991400, Grant 61991403, Grant 61933012, Grant 62250710167, and Grant 62203078, and in part by the Innovation Support Program for International Students Returning to China under Grant cx2022016.
Abstract: This paper addresses the lane-keeping control problem for autonomous ground vehicles subject to input saturation and uncertain system parameters. An enhanced adaptive terminal sliding mode based prescribed performance control scheme is proposed, which keeps the lateral position error of the vehicle within the prescribed performance boundaries at all times. This is achieved in two steps. First, an improved performance function is introduced into the controller design so that the stringent initial condition requirements can be relaxed, which in turn allows a global prescribed performance control result. Second, a multivariable adaptive terminal sliding mode based controller is developed so that both input saturation and parameter uncertainties are handled effectively, which in turn ensures robust lane-keeping control. Finally, the proposed control strategy is validated through numerical simulations, demonstrating its effectiveness.
Abstract: Dear Editor, It is well known that event-triggered control (ETC) is an effective approach to addressing networked control problems for Industry 5.0. Its feasibility, however, has so far been restricted to canonical nonlinear systems. Considering this, a gradient-based adaptive ETC scheme for noncanonical nonlinear systems is newly developed in this letter, in which hysteresis input constraints are also considered. By proper decomposition, the technical issue of handling ETC-induced measurement errors and hysteresis inputs can be transformed into a robustness problem with respect to bounded disturbance-like terms, which is then addressed by integrating a switching modification strategy into the adaptive design and developing a novel augmented error-based analysis framework. Experimental results based on a practical piezoactuator confirm the effectiveness of the proposed scheme.