Video data are composed of multimodal information streams, including visual, auditory, and textual streams. This paper therefore describes an approach to story segmentation for news video based on multimodal analysis. The proposed approach detects topic-caption frames and integrates them with silence-clip detection results and shot segmentation results to locate news story boundaries. Integrating audio-visual features with text information overcomes the weakness of approaches that use only image analysis techniques. On test data of 135,400 frames, detection of the boundaries between news stories achieves an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show that the approach is valid and robust.
This paper presents a thorough review of audio-visual translation studies both in China and abroad. Reviewing the achievements of foreign scholars in this field of translation studies sheds light on audio-visual translation practice and research in China. The review of Chinese scholars' audio-visual translation studies identifies potential directions for development as well as aspects that have been neglected. Based on this summary of relevant studies, possible topics for further study are proposed.
Post-earthquake rescue missions are full of challenges due to the unstable structure of ruins and successive aftershocks. Most current rescue robots lack the ability to interact with their environment, leading to low rescue efficiency. The proposed multimodal electronic skin (e-skin) not only reproduces the pressure, temperature, and humidity sensing capabilities of natural skin but also adds sensing functions beyond it: perceiving object proximity and NO2 gas. Its multilayer stacked structure, based on Ecoflex and organohydrogel, gives the e-skin mechanical properties similar to those of natural skin. Rescue robots integrated with the multimodal e-skin and artificial intelligence (AI) algorithms show strong environmental perception capabilities and can accurately distinguish objects and identify human limbs through grasping, laying the foundation for automated post-earthquake rescue. In addition, combining the e-skin with NO2 wireless alarm circuits allows robots to sense toxic gases in the environment in real time and so take appropriate measures to protect trapped people from the toxic environment. Multimodal e-skin powered by AI algorithms and hardware circuits exhibits powerful environmental perception and information processing capabilities and, as an interface for interaction with the physical world, dramatically expands the application scenarios of intelligent robots.
Emotion recognition has become an important task in modern human-computer interaction. A multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learning, and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of the training samples in the current layer are determined by the recognition results of the ensemble classifier in the upper layer. Thus, training on the current cue can focus more on the samples that were difficult under the previous cue. The MBHMM classifier combines these ensemble classifiers and takes advantage of the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that the approach is effective and promising.
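The hand-off of sample weights from one layer to the next can be illustrated with a minimal Python sketch. The abstract does not give the exact rule, so the standard AdaBoost exponential re-weighting is used here as an assumption, and the function name `initial_weights` is hypothetical.

```python
import math


def initial_weights(prev_correct: list[bool]) -> list[float]:
    """Initial sample weights for the next layer (cue).

    prev_correct[i] is True when the previous layer's ensemble classified
    training sample i correctly. The standard AdaBoost re-weighting rule is
    used as a stand-in; the paper's exact rule is not stated in the abstract.
    """
    n = len(prev_correct)
    uniform = 1.0 / n
    # Error rate of the previous ensemble under uniform weights.
    err = sum(not c for c in prev_correct) / n
    if err == 0.0 or err >= 0.5:
        # Nothing to emphasise, or the ensemble is no better than chance.
        return [uniform] * n
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples are scaled up, correct ones scaled down.
    raw = [uniform * math.exp(-alpha if c else alpha) for c in prev_correct]
    total = sum(raw)
    return [w / total for w in raw]
```

With this rule, the samples the previous cue got wrong end up carrying half of the total weight, which is one way the next layer could be made to focus on them.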
On February 10, 2019 (US Central Time), the China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for an American audience at the world-class Buntrock Hall at Symphony Center.
Mongolian audio-visual works are an important vehicle for exploring the true significance of Mongolian national culture. This paper argues that the Mongolian people of Inner Mongolia continually strengthen individual identification with the ethnic group as a whole through the influence of film, television, and music, and on this basis continually develop a new culture suited to modern and contemporary life, further strengthening their sense of belonging to the ethnic nation.
Based on the current state of college audio-visual English teaching in China, this article points out that avoidance in class is a serious phenomenon in college audio-visual English teaching. After further analysis, and taking into account the characteristics of college English audio-visual teaching in China, it proposes applying the task-based teaching method to college audio-visual English teaching, sets out its steps, and attempts to alleviate avoidance among students through this method.
By distinguishing audio-visual interpretation from visual interpretation, this paper makes clear that the two belong to different categories in essence and in working method, so as to avoid misunderstanding and confusion between them in learning. There are also some misconceptions about how they are taught. This paper explores teaching methods for visual interpretation and audio-visual interpretation with the aim of making the teaching process more reasonable and scientific.
Functionally referential signals are a complex form of communication that conveys information about the external environment. Such signals have been found in a range of mammal and bird species and have helped us understand the complexities of animal communication. Corvids are well known for their extraordinary cognitive abilities, but relatively little attention has been paid to their vocal function. Here, we investigated the functionally referential signals of a cooperatively breeding corvid species, the Azure-winged Magpie (Cyanopica cyanus). Through field observations, we suggest that the Azure-winged Magpie uses referential alarm calls to distinguish two types of threats: 'rasp' calls for terrestrial threats and 'chatter' calls for aerial threats. A playback experiment revealed that Azure-winged Magpies responded to the two call types with qualitatively different behaviors. They sought cover by flying into the bushes in response to 'chatter' calls, and flew to or stayed at higher positions in response to 'rasp' calls, displaying a shorter response time to 'chatter' calls. Significant differences in acoustic structure were found between the two types of calls. Given the extensive cognitive abilities of corvids and the fact that referential signals were once thought to be unique to primates, these findings are important for expanding our understanding of social communication and language evolution.
The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme for MPEG-4 audio-visual objects (AVOs) over a DiffServ network with a QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the IP DiffServ per-hop behaviors (PHBs). The scheme can selectively discard packets of low importance in order to avoid network congestion. Simulation results show that, compared with best-effort delivery, the quality of the received video adapts gracefully to the network state. In addition, by allowing the content provider to define the priority of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.
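The mapping from object priority to per-hop behavior, and the selective discarding of low-importance packets, can be sketched in Python as follows. The three priority classes, the particular assignment of classes to PHBs, and the names `dscp_for` and `shed_load` are illustrative assumptions rather than the paper's scheme; only the DSCP code point values themselves are standard.

```python
from dataclasses import dataclass

# Standard DSCP code points (EF, Assured Forwarding class 1, best effort).
DSCP = {"EF": 46, "AF11": 10, "AF12": 12, "AF13": 14, "BE": 0}

# Hypothetical mapping: (priority class, scalable layer) -> PHB name.
PHB_FOR = {
    ("high", "base"): "EF",
    ("high", "enhancement"): "AF11",
    ("medium", "base"): "AF11",
    ("medium", "enhancement"): "AF12",
    ("low", "base"): "AF13",
    ("low", "enhancement"): "BE",
}

# Order in which PHBs are discarded under congestion (first is dropped first).
DROP_ORDER = ["BE", "AF13", "AF12", "AF11", "EF"]


@dataclass
class Packet:
    object_id: str
    priority: str  # "high" | "medium" | "low"
    layer: str     # "base" | "enhancement"
    size: int      # bytes


def dscp_for(packet: Packet) -> int:
    """Return the DSCP value used to mark this packet."""
    return DSCP[PHB_FOR[(packet.priority, packet.layer)]]


def shed_load(packets: list[Packet], capacity: int) -> list[Packet]:
    """Keep the most important packets that fit into `capacity` bytes."""
    rank = {DSCP[name]: i for i, name in enumerate(DROP_ORDER)}
    # Most important first; sorted() is stable, so arrival order is
    # preserved among packets of the same class.
    by_importance = sorted(packets, key=lambda p: rank[dscp_for(p)], reverse=True)
    kept, used = [], 0
    for p in by_importance:
        if used + p.size <= capacity:
            kept.append(p)
            used += p.size
    return kept
```

For example, given a background enhancement-layer packet, a high-priority base-layer packet, and a medium-priority enhancement-layer packet of 500 bytes each, `shed_load(..., 1000)` keeps the latter two and discards the background packet.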
Overlooking false alarm suppression in heterogeneous change detection leads to inferior detection performance. This paper proposes a method to handle false alarms in heterogeneous change detection. A lightweight two-channel network is built by combining a convolutional neural network (CNN) and a graph convolutional network (GCN). CNNs learn feature difference maps of multitemporal images, and attention modules adaptively fuse CNN-based and graph-based features at different scales. GCNs with a new kernel filter adaptively distinguish between nodes with the same label and nodes with different labels, generating change maps. Experimental evaluation on two datasets validates the efficacy of the proposed method in addressing false alarms.
Funding: A Study on the Teaching Reform of College English Audio-Visual Oral Course Oriented towards the Cultivation of Critical Thinking Ability (2501032339).
Abstract: With the increasingly prominent trend of globalization, English, as the common language of international communication, plays an increasingly important role in university education. As a key link in English teaching, the college English audio-visual oral course not only imparts language knowledge and skills but also shoulders the important task of cultivating students' critical thinking. As one of the essential core qualities of modern talent, critical thinking ability plays an irreplaceable role in deepening students' understanding of English knowledge, improving their intercultural communication ability, and cultivating innovative thinking. This paper expounds the significance of cultivating students' critical thinking ability in college English audio-visual and oral teaching and, drawing on practical teaching experience and cutting-edge education theory, puts forward a series of innovative teaching strategies for cultivating that ability, in order to provide new ideas and practical guidance for improving the quality of college English teaching and developing students' comprehensive quality.
Funding: National Research Foundation, Singapore, under its Medium-Sized Center for Advanced Robotics Technology Innovation (CARTIN), under project WP5 within the Delta-NTU Corporate Lab, with funding support from A*STAR under its IAF-ICP program (Grant no: I2201E0013) and Delta Electronics Inc.
Abstract: In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusion of audio and omnidirectional camera feature inputs, providing a comprehensive solution to enhance the precision and resilience of drone classification and 3D localization. Specifically, AV-FDTI employs a CRNN network to capture vital temporal dynamics within the audio domain and utilizes a pretrained ResNet50 model for image feature extraction. Furthermore, we adopt a visual information entropy and cross-attention-based mechanism to enhance the fusion of visual and audio data. Notably, our system is trained on automated Leica tracking annotations, which offer ground truth data with millimeter-level accuracy. Comprehensive comparative evaluations demonstrate the superiority of our solution over existing systems. In our commitment to advancing this field, we will release this work as open-source code and a wearable AV-FDTI design, contributing valuable resources to the research community.
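The cross-attention step mentioned in this abstract can be illustrated with a minimal, dependency-free Python sketch of scaled dot-product attention, in which audio feature vectors act as queries over visual feature vectors. This is a generic illustration and not the AV-FDTI implementation: there are no learned projections, the entropy-based weighting is omitted, and the function names and toy dimensions are assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional vectors (e.g. audio features)
    keys, values: lists of equal length (e.g. visual features)
    Returns (outputs, weights), one entry per query.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights
```

Each output vector is a mixture of the visual features, weighted by how strongly they match the corresponding audio feature; in a trained model the queries, keys, and values would first pass through learned linear projections.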
Funding: Supported by the Guangxi University of Chinese Medicine School-Level Education and Teaching Reform and Research Project: Integration and Innovative Practice of Ideological and Political Education and Zhuang Ethnic Culture in College English Audio-Visual Speaking Course (Project No. 2022B073).
Abstract: Zhuang culture, a representative of the native ethnic culture of Guangxi, China, is of great significance to Chinese culture. In order to promote traditional culture, enrich the teaching content of the College English Audio-Visual Speaking Course, and enhance the intercultural communication ability of college students, this paper explores, from a multicultural perspective, classroom practices for integrating indigenous Zhuang cultural elements into the College English Audio-Visual Speaking Course, providing new perspectives and a reference for multicultural education in foreign languages.
Funding: Chongqing Engineering University Undergraduate Innovation and Entrepreneurship Training Program Project: Wireless Fire Automatic Alarm System (Project No.: CXCY2024017); Chongqing Municipal Education Commission Science and Technology Research Project: Development and Research of Chongqing Wireless Fire Automatic Alarm System (Project No.: KJQN202401906).
Abstract: This article explores the design of a wireless fire alarm system supported by advanced data fusion technology. It discusses the basic design ideas of the wireless fire alarm system, hardware design analysis, software design analysis, and simulation analysis, all supported by data fusion technology. It is hoped that this analysis can provide a reference for the rational application of data fusion technology to meet the actual design and application requirements of the system.
Funding: National Geographic Society, UCLA (Faculty Senate and the Division of Life Sciences), a Rocky Mountain Biological Laboratory research fellowship, NSF IDBR-0754247, and DEB-1119660 and 1557130, all to D.T.B.; DBI-0242960, 0731346, 1226713, and 1755522 to the RMBL. K.A. was an NSF GRFP fellow during the final preparation of this manuscript.
Abstract: Emitting alarm calls may be costly, but few studies have asked whether calling increases a caller's risk of predation and reduces its survival. Since observing animals calling and being killed is relatively rare, we capitalized on over 24,000 h of observations of marmot colonies and asked whether variation in the rate at which yellow-bellied marmots (Marmota flaviventer) alarm called was associated with the probability of summer mortality, a proxy for predation. Using a generalized mixed model that controlled for factors that influenced the likelihood of survival, we found that marmots that called at higher rates were substantially more likely to die over the summer. Because virtually all summer mortality is due to predation, these results suggest that calling is indeed costly for marmots. Additionally, the results from a Cox survival analysis showed that marmots that called more lived significantly shorter lives. Prior studies have shown that marmots reduce the risk by emitting calls only when close to their burrows, but this newly quantified survival cost suggests a constraint on eliminating risks. Quantifying the cost of alarm calling using a similar approach in other systems will help us better understand its true costs, which is an essential value for theoretical models of calling and social behavior.
Funding: Funded by the National Natural Science Foundation of China (No. 32301295 to JW, 32101242 to LM, and 32260253 to LW) and the High-Level Talents Research Start-Up Project of Hebei University (No. 521100222044 to JW).
Abstract: Avian alarm calls mediate defenses against brood parasites and predators. These calls facilitate communication among adults and alert nestlings to potential danger. While heterospecific call recognition has been extensively studied in adult birds, nestlings, which lack direct predation experience and heterospecific alarm exposure, represent an ideal system to investigate the response to interspecific warning cues. This study explored the recognition capabilities of 5–6-day-old nestlings in Oriental Reed Warbler (Acrocephalus orientalis), a common host of the Common Cuckoo (Cuculus canorus). We exposed the nestlings to playbacks of alarm calls directed at parasites and raptors from conspecific, Vinous-throated Parrotbill (Sinosuthora webbiana, sympatric species), Isabelline Shrike (Lanius isabellinus, allopatric species) and Common Tailorbird (Orthotomus sutorius, allopatric species) adults. Results indicated that there was no significant difference in the responses of nestlings to the alarm calls of conspecific and allopatric adults directed at cuckoos and sparrowhawks. In addition, interestingly, nestlings significantly reduced their begging in response to conspecific and unfamiliar allopatric Isabelline Shrike and Common Tailorbird alarm calls but exhibited a weak response to the sympatric Vinous-throated Parrotbill. Whether older warbler nestlings with more social experience exhibit stronger responses to the alarm calls of Vinous-throated Parrotbill adults requires further investigation.
Funding: Funded by the Education Department of Hainan Province (no. HnjgY 2022-12) and the National Natural Science Foundation of China (no. 32260127).
Abstract: Alarm calls in bird vocalizations serve as acoustic signals announcing danger. Owing to the convergent evolution of alarm calls, some bird species can benefit from eavesdropping on certain parameters of alarm calls of other species. Vocal mimicry, displayed by many bird species, aids defense against predators and may help brood parasites during parasitism. In the coevolutionary dynamics between brood parasites, such as the common cuckoo (Cuculus canorus), and their hosts, female cuckoo vocalizations can induce hosts to leave the nest, increasing the probability of successful parasitism and reducing the risk of host attacks. Such cuckoo calls were thought to mimic those of the sparrowhawk. However, owing to their similarity to alarm calls, we propose a new hypothesis: female cuckoos cheat their hosts by mimicking the parameters of the host alarm call. In this study, we tested this new hypothesis and the sparrowhawk mimicry hypothesis simultaneously by manipulating the syllable rate in male and female common cuckoo vocalizations and playing them in front of the host, the Oriental reed warbler (Acrocephalus orientalis). The results indicate that, similar to a normal female cuckoo call, a female call with a reduced syllable rate prompted the hosts to leave their nests more frequently and rapidly than male cuckoo calls. Additionally, male cuckoo calls with an increased syllable rate did not prompt the hosts to leave their nests more frequently or quickly than male cuckoo calls with a normal syllable rate. Our results further confirm that female common cuckoos mimic the vocalizations of Eurasian sparrowhawks (Accipiter nisus), reveal the functional mechanisms underlying such mimicry, and support the theory of imperfect mimicry.
Funding: Supported by the National Natural Science Foundation of China (Grants U23A20660, 52008099, and 52378288), the Major Science and Technology Project of Yunnan Province, China (Grant 202502AD080007), and the China Railway Engineering Corporation Science and Technology Research and Development Project (Grant 2022-Key-44).
Abstract: Long-span railway bridges face multiple types of risk in service. Classical methods struggle to provide targeted alarm information that distinguishes load anomalies from structural anomalies. To enable structural health monitoring systems to alarm accurately on the different risks of long-span railway bridges, this paper proposes a cross-cooperative alarm method using principal and secondary indicators during high-wind periods. It provides the prior criterion for monitoring systems under special conditions, defining the principal and secondary indicators, alarm levels, and thresholds based on the relationship between dynamic equilibrium equations and multiple linear regression analysis. Analysis of one-year monitoring data from a long-span railway cable-stayed bridge shows that the 10-min average cross-bridge wind speed (excitation indicator) can be selected as the principal indicator, while lateral displacement (response indicator) can serve as the secondary indicator. The threshold levels of the secondary indicator prioritize the safety of bridge operation (mainly the safety of trains traversing the bridge), with values significantly lower than structural safety thresholds. This approach enhances alarm timeliness and effectively distinguishes between load anomalies, structural anomalies, and equipment failures. Consequently, it improves alarm accuracy and provides timely decision support for bridge maintenance, train traversing, and emergency treatment.
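The two-indicator idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the threshold values, the function names `ten_minute_mean` and `cross_check`, and the decision table itself are hypothetical and are not taken from the paper, which derives its criteria from dynamic equilibrium equations and regression analysis.

```python
from statistics import mean

# Hypothetical thresholds for illustration only; the abstract gives no values.
WIND_LIMIT_M_S = 20.0   # principal indicator: 10-min mean cross-bridge wind speed
DISP_LIMIT_MM = 50.0    # secondary indicator: lateral displacement


def ten_minute_mean(samples, sample_rate_hz=1.0):
    """Average the most recent 10 minutes of wind-speed samples."""
    window = int(600 * sample_rate_hz)
    return mean(samples[-window:])


def cross_check(wind_mean_m_s, lateral_disp_mm):
    """Combine both indicators into one alarm label (assumed decision table)."""
    wind_high = wind_mean_m_s > WIND_LIMIT_M_S
    disp_high = abs(lateral_disp_mm) > DISP_LIMIT_MM
    if wind_high and disp_high:
        # Response is large and so is the excitation: attribute it to the load.
        return "load anomaly"
    if disp_high:
        # Large response without matching excitation: structure or sensor suspect.
        return "possible structural anomaly or equipment failure"
    if wind_high:
        return "high wind, response within limit"
    return "normal"
```

Checking the excitation and the response together, rather than thresholding either one alone, is what lets a scheme of this kind separate a wind-driven displacement from one that has no matching load.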
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB1001001), the Beijing Natural Science Foundation (No. JQ18017), and the National Natural Science Foundation of China (No. 61976002).
Abstract: Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully. Researchers tend to leverage these two modalities to improve the performance of previously considered single-modality tasks or to address new challenging problems. In this paper, we provide a comprehensive survey of recent audio-visual learning development. We divide the current audio-visual learning tasks into four different subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation learning. State-of-the-art methods, as well as the remaining challenges of each subfield, are further discussed. Finally, we summarize the commonly used datasets and challenges.
Abstract: Video data are composed of multimodal information streams, including visual, auditory and textual streams, so this paper describes an approach to story segmentation for news video using multimodal analysis. The proposed approach detects the topic-caption frames and integrates them with silence clip detection results and shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of approaches that use only image analysis techniques. On test data with 135 400 frames, news story boundaries are detected with an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show that the approach is valid and robust.
Abstract: This paper presents a thorough review of audio-visual translation studies both at home and abroad. The review of foreign achievements in this specific field of translation studies sheds some light on our national audio-visual practice and research. The review of Chinese scholars' audio-visual translation studies is intended to suggest potential directions for development and to offer guidelines for neglected studies and aspects. Based on the summary of relevant studies, possible topics for further study are proposed.
Funding: Supported by the National Natural Science Foundation of China (61801525); the independent fund of the State Key Laboratory of Optoelectronic Materials and Technologies (Sun Yat-sen University) under grant No. OEMT-2022-ZRC-05; the Opening Project of State Key Laboratory of Polymer Materials Engineering (Sichuan University) (Grant No. sklpme2023-3-5); the Foundation of the State Key Laboratory of Transducer Technology (No. SKT2301); the Shenzhen Science and Technology Program (JCYJ20220530161809020 & JCYJ20220818100415033); the Young Top Talent of Fujian Young Eagle Program of Fujian Province; the Natural Science Foundation of Fujian Province (2023J02013); and the National Key R&D Program of China (2022YFB2802051).
Abstract: Post-earthquake rescue missions are full of challenges due to the unstable structure of ruins and successive aftershocks. Most current rescue robots lack the ability to interact with their environments, leading to low rescue efficiency. The proposed multimodal electronic skin (e-skin) not only reproduces the pressure, temperature, and humidity sensing capabilities of natural skin but also develops sensing functions beyond it: perceiving object proximity and NO2 gas. Its multilayer stacked structure based on Ecoflex and organohydrogel endows the e-skin with mechanical properties similar to natural skin. Rescue robots integrated with multimodal e-skin and artificial intelligence (AI) algorithms show strong environmental perception capabilities and can accurately distinguish objects and identify human limbs through grasping, laying the foundation for automated post-earthquake rescue. In addition, the combination of e-skin and NO2 wireless alarm circuits allows robots to sense toxic gases in the environment in real time and thereby adopt appropriate measures to protect trapped people from the toxic environment. Multimodal e-skin powered by AI algorithms and hardware circuits exhibits powerful environmental perception and information processing capabilities, which, as an interface for interaction with the physical world, dramatically expands the application scenarios of intelligent robots.
Funding: Supported by the National Natural Science Foundation of China (60905006) and the NSFC-Guangdong Joint Fund (U1035004).
Abstract: Emotion recognition has become an important task of modern human-computer interaction. A multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learning, and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of training samples in the current layer are decided by the recognition results of the ensemble classifier in the upper layer. Thus the training procedure using the current cue can focus more on the samples that were difficult according to the previous cue. Our MBHMM classifier combines these ensemble classifiers and takes advantage of the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
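The layer-to-layer weighting described above can be illustrated with a small sketch. The abstract does not give the exact rule, so the code below assumes the standard AdaBoost reweighting formula applied to the previous layer's results; the function name `next_layer_weights` is hypothetical, and this is not presented as the authors' implementation.

```python
import math


def next_layer_weights(prev_correct):
    """Initial sample weights for the next layer (cue).

    prev_correct[i] is True if the previous layer's ensemble classified
    training sample i correctly. Assumption: weights follow the standard
    AdaBoost update, starting from uniform weights.
    """
    n = len(prev_correct)
    err = sum(1 for ok in prev_correct if not ok) / n
    if err == 0.0 or err == 1.0:
        # Degenerate case: the update is undefined, fall back to uniform weights.
        return [1.0 / n] * n
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples are scaled by exp(alpha), correct ones by exp(-alpha).
    raw = [math.exp(-alpha if ok else alpha) for ok in prev_correct]
    total = sum(raw)
    return [w / total for w in raw]
```

With an error rate below 0.5, samples the previous cue got wrong start the next layer with larger weights, which is the "focus on difficult samples" behavior the abstract describes.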
Abstract: On February 10, 2019 (US Central Time), the China National Peking Opera Company (CNPOC) and the Hubei Chime Bells National Chinese Orchestra presented a fantastic audio-visual performance of Chinese Peking Opera and Chinese chime bells for the American audience at the world's top-level Buntrock Hall at Symphony Center.
Funding: This paper is an interim result of the Basic Research Project of Beijing Institute of Graphic Communication: Research on the Artistic, Modern Communication and Publishing of Dian-shi Zhai Pictorial (1884-1898) (Serial Number Eb202008).
Abstract: Mongolian audio-visual works are an important vehicle for exploring the true significance of this national culture. This paper argues that, through the influence of film, television and music, the Mongolian people in Inner Mongolia constantly strengthen the individual's sense of identification with the ethnic group as a whole and, on this basis, continually develop a new culture in line with modern and contemporary life, further enhancing their sense of belonging to the ethnic nation.
Abstract: Based on the current situation of college audio-visual English teaching in China, this article points out that avoidance in class is a serious phenomenon in college audio-visual English teaching. After further analysis, and taking into account the characteristics of college English audio-visual teaching in China, it proposes applying the task-based teaching method to college audio-visual English teaching and sets out its steps, in an attempt to alleviate students' avoidance.
Abstract: Distinguishing audio-visual interpretation from visual interpretation makes clear that the two differ in both nature and working methods, which helps learners avoid misunderstanding and confusing them. At the same time, there are some misconceptions in how the two are taught. This paper explores the teaching methods of visual interpretation and audio-visual interpretation in order to make their teaching more reasonable and scientific.
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 32170516 and 31872243 to Y.Z.).
Abstract: Functionally referential signals are a complex form of communication that conveys information about the external environment. Such signals have been found in a range of mammal and bird species and have helped us understand the complexities of animal communication. Corvids are well known for their extraordinary cognitive abilities, but relatively little attention has been paid to their vocal function. Here, we investigated the functionally referential signals of a cooperatively breeding corvid species, the Azure-winged Magpie (Cyanopica cyanus). Through field observations, we suggest that the Azure-winged Magpie uses referential alarm calls to distinguish two types of threats: 'rasp' calls for terrestrial threats and 'chatter' calls for aerial threats. A playback experiment revealed that Azure-winged Magpies responded to the two call types with qualitatively different behaviors. They sought cover by flying into the bushes in response to 'chatter' calls, and flew to or stayed at higher positions in response to 'rasp' calls, displaying a shorter response time to 'chatter' calls. Significant differences in acoustic structure were found between the two types of calls. Given the extensive cognitive abilities of corvids and the fact that referential signals were once thought to be unique to primates, these findings are important for expanding our understanding of social communication and language evolution.
Abstract: The object-based scalable coding in MPEG-4 is investigated, and a prioritized transmission scheme for MPEG-4 audio-visual objects (AVOs) over the DiffServ network with a QoS guarantee is proposed. MPEG-4 AVOs are extracted and classified into different groups according to their priority values and scalable layers (visual importance). These priority values are mapped to the IP DiffServ per-hop behaviors (PHBs). The scheme can selectively discard packets of low importance in order to avoid network congestion. Simulation results show that the quality of the received video adapts gracefully to the network state, in contrast to the 'best-effort' manner. Also, by allowing the content provider to define the prioritization of each audio-visual object, the adaptive transmission of object-based scalable video can be customized based on the content.
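As a rough illustration of mapping object priorities to DiffServ per-hop behaviors, the sketch below uses the Assured Forwarding code points, whose DSCP value for class x and drop precedence y is 8x + 2y (AF11 = 10, AF43 = 38). The mapping from priority and scalable layer to an AF class and drop precedence is an assumption made for this example, and `dscp_for_avo` is a hypothetical name; the abstract does not state which PHBs the scheme actually uses.

```python
def af_dscp(af_class, drop_precedence):
    """DSCP value of Assured Forwarding code point AFxy: 8*x + 2*y."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("AF class must be 1..4 and drop precedence 1..3")
    return 8 * af_class + 2 * drop_precedence


def dscp_for_avo(priority, layer):
    """Pick a DSCP for an audio-visual object (assumed mapping).

    priority: 0 (most important) .. 3 (least important) -> AF4 .. AF1
    layer:    0 is the base layer; higher enhancement layers get a higher
              drop precedence, so they are discarded first under congestion.
    """
    if not 0 <= priority <= 3 or layer < 0:
        raise ValueError("priority must be 0..3 and layer non-negative")
    af_class = 4 - priority
    drop_precedence = min(layer + 1, 3)
    return af_dscp(af_class, drop_precedence)
```

Under this assumed mapping, the base layer of the most important object is marked AF41 (DSCP 34), while a high enhancement layer of the least important object is marked AF13 (DSCP 14) and is the first candidate for dropping.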
Funding: This work was supported by the Natural Science Foundation of Heilongjiang Province (LH2022F049).
Abstract: Overlooking the issue of false alarm suppression in heterogeneous change detection leads to inferior detection performance. This paper proposes a method to handle false alarms in heterogeneous change detection. A lightweight two-channel network is built based on the combination of a convolutional neural network (CNN) and a graph convolutional network (GCN). CNNs learn feature difference maps of multitemporal images, and attention modules adaptively fuse CNN-based and graph-based features at different scales. GCNs with a new kernel filter adaptively distinguish between nodes with the same labels and those with different labels, generating change maps. Experimental evaluation on two datasets validates the efficacy of the proposed method in addressing false alarms.