Background: Sickle cell anemia (SCA), a genetic hemoglobin disorder, is associated with inner ear compromise and poor auditory processing. In humans, auditory processing differs physiologically between males and females, which may also hold for SCA due to gender-specific pathophysiological changes of the disease. Objective: To investigate gender differences in psychoacoustical abilities and speech perception in noise in SCA individuals, and to compare them with a normal healthy (NH) population. Methods: 80 SCA and 80 NH participants aged 15-40 years, all with normal hearing, were included and further grouped by gender. Auditory discrimination for frequency, intensity, and duration at 500 Hz and 4000 Hz; temporal processing (gap detection threshold and modulation detection threshold); and Speech Perception In Noise (SPIN) at 0 dB SNR were evaluated and compared between males and females of the SCA and NH populations. Results: SCA participants performed more poorly than NH participants on all experimental measures. In the NH population, males performed more poorly than females on psychoacoustical measures, whereas within the SCA population the reverse was true. Female participants performed better on the SPIN test in both populations. Conclusions: The adverse impact of SCA on the auditory system due to circulatory changes might cause the poorer performance in SCA. The poorer performance by female SCA participants is possibly due to the additional impact of lower Hb levels overlaid on sickle disease. Estrogen levels and gender differences in auditory processing might lead to the better performance by females within the NH population. SPIN performance depends on attentional demands and sensorimotor processing strategies in noise beyond psychoacoustical processing, which may explain the better female performance in both populations.
Background: Research has shown that musicians outperform non-musicians in speech perception in noise (SPiN) tasks. However, it remains unclear whether the advantages of musical training are substantial enough to slow down the decline in SPiN performance associated with aging. Objectives: Therefore, we assessed SPiN performance in a continuum of age groups comprising musicians and non-musicians. The goal was to compare how the aging process affected the SPiN performance of musicians and non-musicians. Method: A cross-sectional descriptive mixed design was used, involving 150 participants divided into 75 musicians and 75 non-musicians. Each age group (10-19, 20-29, 30-39, 40-49, and 50-59) consisted of 15 musicians and 15 non-musicians. Six Kannada sentence lists were combined with four-talker babble. At +5, 0, and -5 dB signal-to-noise ratios (SNRs), the percent-correct Speech Identification Scores were calculated. Results: The repeated-measures ANOVA (RM ANOVA) revealed significant main effects and interaction effects between SNR, musicianship, and age groups (p < 0.05). Small to large effect sizes were noted (ηp² = 0.05 to 0.17). A significant interaction effect and follow-up post hoc tests showed that SPiN abilities deteriorated more rapidly with increasing age in non-musicians compared to musicians, especially at difficult SNRs. Conclusions: Musicians had better SPiN abilities than non-musicians across all age groups. Also, age-related deterioration in SPiN abilities was faster in non-musicians compared to musicians.
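The partial eta squared (ηp²) effect sizes reported above follow a standard formula, ηp² = SS_effect / (SS_effect + SS_error). As a quick reference, a minimal sketch; the sums of squares below are invented for illustration and are not taken from the study:

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: the proportion of variance attributable
    to an effect, partialling out the other effects in the design."""
    return ss_effect / (ss_effect + ss_error)

# Commonly cited benchmarks: ~0.01 small, ~0.06 medium, ~0.14 large.
print(partial_eta_squared(12.0, 228.0))            # 0.05 -> small effect
print(round(partial_eta_squared(41.0, 200.0), 2))  # 0.17 -> large effect
```

The same statistic is what RM ANOVA software reports per effect; only the sums of squares differ.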
Background: It is crucial to study the effect of changes in thresholds (T) and most comfortable levels (M) on behavioral measurements in young children using cochlear implants. This would help the clinician with the optimization and validation of programming parameters. Objective: The study attempted to describe the changes in behavioral responses with modification of T and M levels. Methods: Twenty-five participants in the age range of 5 to 12 years using HR90K/HiFocus1J or HR90K Advantage/HiFocus1J implants with Harmony speech processors took part in the study. Experimental programs were created from the everyday program by decreasing T levels, raising T levels, or decreasing M levels. Sound-field thresholds and speech perception were measured at 50 dB HL for the three experimental programs and the everyday program. Conclusion: The results indicated that only reductions of M levels resulted in significantly (p < 0.01) poorer aided thresholds and speech perception. On the other hand, variation in T levels did not produce significant changes in either sound-field thresholds or speech perception. The results highlight that M levels must be correctly established in order to prevent decreased audibility and speech perception.
Musical training can counteract age-related decline in speech perception in noisy environments. However, it remains unclear whether older non-musicians and musicians rely on functional compensation or functional preservation to counteract the adverse effects of aging. This study utilized resting-state functional connectivity (FC) to investigate functional lateralization, a fundamental organizational feature, in older musicians (OM), older non-musicians (ONM), and young non-musicians (YNM). Results showed that OM outperformed ONM and achieved performance comparable to YNM in speech-in-noise and speech-in-speech tasks. ONM exhibited reduced lateralization compared to YNM in the lateralization index (LI) of intrahemispheric FC (LI_intra) in the cingulo-opercular network (CON) and the LI of interhemispheric heterotopic FC (LI_he) in the language network (LAN). Conversely, OM showed higher neural alignment to YNM (i.e., a more similar lateralization pattern) compared to ONM in CON, LAN, the frontoparietal network (FPN), the dorsal attention network (DAN), and the default mode network (DMN), indicating preservation of youth-like lateralization patterns due to musical experience. Furthermore, in ONM, stronger left-lateralized and lower alignment-to-young LI_intra in the somatomotor network (SMN) and DAN, and LI_he in DMN, correlated with better speech performance, indicating a functional compensation mechanism. In contrast, stronger right-lateralized LI_intra in FPN and DAN and higher alignment-to-young LI_he in LAN correlated with better performance in OM, suggesting a functional preservation mechanism. These findings highlight the differential roles of functional preservation and compensation of lateralization in speech perception in noise among elderly individuals with and without musical expertise, offering insights into successful aging theories from the lens of functional lateralization and speech perception.
Speech perception is essential for daily communication. Background noise or concurrent talkers, however, can make it challenging for listeners to track the target speech (i.e., the cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail-party listening environments in English and Mandarin Chinese. The review starts with an introduction section followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings with comparisons between the two languages. Future research directions with respect to the differences in the literature on the reviewed topic between the two languages are also discussed.
Based on the Motor Theory of speech perception, the interaction between the auditory and motor systems plays an essential role in speech perception. Since the Motor Theory was proposed, it has received remarkable attention in the field. However, each of the three hypotheses of the theory still needs further verification. In this review, we focus on how the auditory-motor anatomical and functional associations play a role in speech perception, and discuss why previous studies could not reach an agreement, in particular whether the motor system's involvement in speech perception is task-load dependent. Finally, we suggest that the function of the auditory-motor link is particularly useful for speech perception under adverse listening conditions, and that the further revised Motor Theory is a potential solution to the "cocktail-party" problem.
The interference of a tonal language poses challenges for Chinese learners of English in acquiring word stress. The lack of symmetry between word stress problems in production and perception, and the absence of attention to specific stress patterns in teaching and learning, can reduce the effectiveness of word stress acquisition. The purpose of this paper is twofold: to examine the relationship between English word stress production and perception, and to investigate how English word stress production and perception are affected by specific stress patterns. Ninety participants were involved in a production task and a perception task. Test words were selected based on 26 stress patterns in three categories: syllabic structure, phonological similarity, and vowel reduction. The results show that the production and perception of English word stress differ significantly, without a strong linear correlation. Although the accuracy of word stress perception was higher than that of production for the test words in general, the comparative status of production and perception varied across different stress patterns. Specifically, in the syllabic structure category, the symmetry rate of word stress assignment in production and perception was highest for ˈσCVCC (e.g., climax, abend), while the symmetry rate for ˈσoCVV(C) (e.g., abdicate, importune) was the lowest and the most problematic for production. In the phonological similarity category, production and perception of word stress were most symmetrical for words with the suffix “-eous” and most asymmetrical for words with the suffix “-ese,” which was also the most problematic for production. Identification of vowel reduction was more challenging for /ɒ/ than /æ/ in both production and perception. It is suggested that Chinese ESL teachers prioritize the teaching of stress patterns with low symmetry to achieve efficient learning outcomes.
Computer-aided pronunciation training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language (L2) learners' speech. In order to further facilitate learning, we aim to develop a principle-based method for generating a gradation of the severity of mispronunciations. This paper presents an approach towards gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes. This is used to compute the auditory confusion of the native language (L1). PD is found to correlate well with the mispronunciations detected in a CAPT system for Chinese learners of English, i.e., L1 being Chinese (Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to help us grade the severity of errors (i.e., mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for the learners.
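The abstract does not spell out how the perceptual distance (PD) is computed. Purely as an illustration of the idea, one common way to quantify how confusable two phonemes are is a symmetrized Kullback-Leibler divergence between their listener-confusion profiles; the profiles below are hypothetical, not taken from the paper:

```python
import math

def symmetric_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence between two probability
    distributions; a larger value means the two phonemes are
    more perceptually distant (less confusable)."""
    def kl(a, b):
        return sum(x * math.log((x + eps) / (y + eps)) for x, y in zip(a, b))
    return 0.5 * (kl(p, q) + kl(q, p))

# Hypothetical confusion profiles over four response categories:
pd_close = symmetric_kl([0.60, 0.30, 0.05, 0.05], [0.50, 0.40, 0.05, 0.05])
pd_far   = symmetric_kl([0.90, 0.05, 0.03, 0.02], [0.05, 0.90, 0.03, 0.02])
print(pd_close < pd_far)  # True: easily confused phonemes lie closer
```

Under such a measure, "errors that confuse more distant phonemes are more severe" reduces to ranking detected confusions by their distance value.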
Realization of an intelligent human-machine interface requires us to investigate human mechanisms and learn from them. This study focuses on the communication between speech production and perception within the human brain and on realizing it in an artificial system. A physiological research study based on electromyographic signals (Honda, 1996) suggested that speech communication in the human brain might be based on a topological mapping between speech production and perception, according to an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception in terms of a transformed auditory feedback (TAF) experiment. The model simulation indicated that there exists an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate system consisting of force-dependent equilibrium positions, and that the mapping from the motor space to the kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies were compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response to a perturbation in the feedback sound. This implies that vowel production is controlled with reference to perception monitoring.
Objective: To contribute to clarifying the existence of subclinical hearing deficits associated with aging. Design: In this work, we study and compare the auditory perceptual and electrophysiological performance of normal-hearing young and adult subjects (tonal audiometry, high-frequency tone thresholds, digit triplets in noise, and click-evoked auditory brainstem response). Study sample: 45 normal-hearing volunteers were evaluated and divided into two groups according to age. 27 subjects were included in the "young group" (mean 22.1 years), and 18 subjects (mean 42.22 years) were included in the "adult group." Results: In the perceptual tests, the adult group presented significantly worse tonal thresholds at the high frequencies (12 and 16 kHz) and worse performance on the digit triplet tests in noise. In the electrophysiological test using the auditory brainstem response technique, the adult group presented significantly lower wave I and wave V amplitudes and longer wave V latencies at the supra-threshold level. At the threshold level, we observed a significantly longer wave V latency in the adult group. In addition, in the partial correlation analysis, controlling for hearing level, we observed a negative relationship between age and both speech-in-noise performance and high-frequency thresholds. No significant association was observed between age and the auditory brainstem response. Conclusion: The results are compatible with subclinical hearing loss associated with aging.
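The partial correlation used in that analysis (age vs. speech-in-noise performance, controlling for hearing level) can be computed by correlating the residuals of each variable after regressing out the covariate. A stdlib-only sketch; all numbers are synthetic, invented for illustration:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

def residuals(y, covar):
    """Residuals of y after simple linear regression on covar."""
    mc, my = sum(covar) / len(covar), sum(y) / len(y)
    beta = (sum((c - mc) * (v - my) for c, v in zip(covar, y))
            / sum((c - mc) ** 2 for c in covar))
    return [v - (my + beta * (c - mc)) for c, v in zip(covar, y)]

def partial_corr(x, y, covar):
    """Correlation of x and y with covar partialled out of both."""
    return pearson(residuals(x, covar), residuals(y, covar))

# Synthetic example: age and SNR loss both partly driven by hearing level.
age     = [20, 25, 30, 35, 40, 45, 50]
hearing = [5, 6, 8, 9, 11, 12, 14]               # dB HL (made up)
snrloss = [1.0, 1.2, 1.9, 2.1, 2.8, 3.1, 3.9]    # dB (made up)
print(round(partial_corr(age, snrloss, hearing), 3))
```

This residual-based formulation is equivalent to the usual first-order partial correlation formula and generalizes naturally to multiple covariates.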
The aim of the study was to determine the development of syntax in the language development of children who are deaf or hard of hearing and who are taught new dynamic linguistic features with the help of computers. The sample consisted of 70 children who are deaf or hard of hearing, aged 7-17 years. The following variables were applied to assess language development: total number of words used, total number of different words used, and the correct and incorrect statements (sentences) of the respondents. We calculated the basic statistical parameters, from which it was found that the experimental computer-teaching program for children who are deaf or hard of hearing gave better results in the development of syntax. Also, canonical discriminant analysis revealed a statistically significant difference in the applied variables between the control and experimental groups at a level of statistical significance of p = 0.000. The results showed a significant improvement in the experimental group, and that the dynamic computer programming activities to which participants of the experimental group were exposed contribute to better linguistic competence in children who are deaf or hard of hearing.
Older adults often find it difficult to perceive speech, especially in noisy conditions. Though the hearing aid is one of the rehabilitative devices available to older adults to alleviate hearing loss, some of them may experience annoyance through the hearing aid and hence reject it, possibly due to circuitry noise and/or background noise. Acceptable noise level is a direct behavioural measure estimating the extent to which a person is able to put up with noise while simultaneously listening to speech. Acceptable noise level is a central auditory measure and is not influenced by age, gender, presentation level, or speaker. Using this measure, we can quantify the annoyance level experienced by an individual. This information is of utmost importance, and caution should be exercised before setting the parameters in a hearing aid, especially for those who are unable to accept noise. In this review article, an attempt has been made to document how to optimize the hearing aid program by setting parameters such as the noise reduction circuit, microphone sensitivity, and gain. These adjustments of parameters might help to reduce the rejection rate of hearing aids, especially in those individuals who are annoyed by background noise. Copyright © 2015 The Authors. Production & hosting by Elsevier (Singapore) Pte Ltd on behalf of PLA General Hospital Department of Otolaryngology Head and Neck Surgery. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Objective: To demonstrate the performance benefit of the Automatic Scene Classifier (SCAN) algorithm available in the Nucleus 6 (CP900 series) sound processor over the default processing algorithms of the previous-generation Nucleus 5 (CP810) and Freedom Hybrid™ sound processors. Methods: Eighty-two cochlear implant recipients (40 Nucleus 5 processor users and 42 Freedom Hybrid processor users) listened to and repeated AzBio sentences in noise with their current processor and with the Nucleus 6 processor. Results: The SCAN algorithm, when enabled, yielded statistically significant non-inferior and superior performance when compared to the Nucleus 5 and Freedom Hybrid sound processors programmed with ASC + ADRO. Conclusion: The results of these studies demonstrate the superior performance and clinical utility of the SCAN algorithm in the Nucleus 6 processor over the Nucleus 5 and Freedom Hybrid processors.
Objective: To evaluate the auditory function of an individual with genetically confirmed hemochromatosis. Methods: A 57-year-old male with mildly impaired sound detection thresholds underwent a range of behavioural, electroacoustic, and electrophysiologic assessments. These included the recording of otoacoustic emissions and auditory brainstem responses, measurement of monaural temporal resolution, and evaluation of binaural speech processing. Findings for this patient were subsequently compared with those of 80 healthy controls with similar audiometric thresholds. Results: The patient showed the three cardinal features of auditory neuropathy, presenting with evidence of normal cochlear outer hair cell function, disrupted neural activity in the auditory nerve/brainstem, and impaired temporal processing. His functional hearing ability (speech perception) was significantly affected and suggested a reduced capacity to use localization cues to segregate signals in the presence of background noise. Conclusion: We present the first case of an individual with hemochromatosis and auditory neuropathy. The findings for this patient highlight the need for careful evaluation of auditory function in individuals with the disorder.
The Perception Spectrogram Structure Boundary (PSSB) parameter is proposed for speech endpoint detection as a preprocessing step for speech or speaker recognition. First, hearing-perception-based speech enhancement is carried out. Then, two-dimensional enhancement is performed on the sound spectrogram according to the difference between the deterministic distribution characteristic of speech and the random distribution characteristic of noise. Finally, an endpoint decision is made using the PSSB parameter. Experimental results show that, in low-SNR environments from -10 dB to 10 dB, the proposed algorithm achieves higher accuracy than existing endpoint detection algorithms. A detection accuracy of 75.2% is reached even at the extremely low SNR of -10 dB. The algorithm is therefore suitable for speech endpoint detection in low-SNR environments.
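The PSSB parameter itself is specific to that paper. For contrast, the classical short-time-energy endpoint detector that such spectrogram-based methods aim to improve upon at low SNR can be sketched as follows; frame length, threshold factor, and the toy signal are arbitrary illustrative choices:

```python
import math
import random

def short_time_energy(signal, frame_len=160):
    """Mean squared amplitude per non-overlapping frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def detect_endpoints(signal, frame_len=160, factor=4.0):
    """Return (start_frame, end_frame) of the region whose energy
    exceeds `factor` times a noise floor estimated from the first
    five frames (assumed to contain no speech), or None."""
    e = short_time_energy(signal, frame_len)
    noise_floor = sum(e[:5]) / 5
    voiced = [i for i, v in enumerate(e) if v > factor * noise_floor]
    return (voiced[0], voiced[-1]) if voiced else None

# Toy example: low-level Gaussian noise with a tone burst standing
# in for speech between samples 3200 and 4800 (8 kHz sampling).
random.seed(0)
sig = [random.gauss(0, 0.01) for _ in range(8000)]
for n in range(3200, 4800):
    sig[n] += 0.5 * math.sin(2 * math.pi * 440 * n / 8000)

start, end = detect_endpoints(sig)
print(start, end)  # frames 20 and 29, i.e. samples 3200-4800
```

Energy thresholding of this kind degrades sharply once the SNR approaches 0 dB, which is precisely the regime where structure-based parameters such as PSSB are claimed to help.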
Perception is the interaction interface between an intelligent system and the real world. Without sophisticated and flexible perceptual capabilities, it is impossible to create advanced artificial intelligence (AI) systems. For the next-generation AI, called 'AI 2.0', one of the most significant features will be that AI is empowered with intelligent perceptual capabilities that can simulate the mechanisms of the human brain and are likely to surpass the human brain in terms of performance. In this paper, we briefly review the state-of-the-art advances across different areas of perception, including visual perception, auditory perception, speech perception, and perceptual information processing and learning engines. On this basis, we envision several R&D trends in intelligent perception for the forthcoming era of AI 2.0, including: (1) human-like and transhuman active vision; (2) auditory perception and computation in actual auditory settings; (3) speech perception and computation in natural interaction settings; (4) autonomous learning of perceptual information; (5) large-scale perceptual information processing and learning platforms; and (6) urban omnidirectional intelligent perception and reasoning engines. We believe these research directions should be highlighted in future plans for AI 2.0.
Background: Many factors interfere with a listener attempting to grasp speech in noisy environments. Spatial hearing, by which speech and noise can be spatially separated, may play a crucial role in speech recognition in the presence of competing noise. This study aimed to assess whether, and to what degree, spatial hearing benefits speech recognition in young normal-hearing participants in both quiet and noisy environments. Methods: Twenty-eight young participants were tested with the Mandarin Hearing In Noise Test (MHINT) in quiet and noisy environments. The assessment method was characterized by modifications of speech and noise configurations, as well as by changes in the speech presentation mode. The benefit of spatial hearing was measured by the speech recognition threshold (SRT) variation between speech condition 1 (SC1) and speech condition 2 (SC2). Results: There was no significant difference in the SRT between SC1 and SC2 in quiet. The SRT in SC1 was about 4.2 dB lower than that in SC2 in both the speech-shaped and four-babble noise conditions. SRTs measured in both SC1 and SC2 were lower in the speech-shaped noise condition than in the four-babble noise condition. Conclusion: Spatial hearing in young normal-hearing participants contributes to speech recognition in noisy environments, but provides no benefit to speech recognition in quiet environments, which may be due to the offset of auditory extrinsic redundancy against the lack of spatial hearing.
Brain mechanisms of lexical-semantic processing have been well researched using the electroencephalography (EEG) technique with high temporal resolution. However, the detailed brain dynamics regarding spatial connectivity and spectral characteristics remain to be clarified. For this reason, this study performed frequency-specific effective connectivity analysis on EEG recordings during the processing of real words and pseudowords. In addition, we introduced fMRI-based network templates into a representational similarity analysis to compare the functional differences between real words and pseudowords in different frequency bands. Our results revealed that real words rapidly activate the brain network for speech perception and complete comprehension efficiently, especially when the first syllable of the real word has clear categorical features. In contrast, pseudowords were delayed in the initiation of speech perception and required a longer time span for meaning retrieval. The frequency-specific analysis showed that the theta, alpha, and beta rhythms contribute more to semantic processing than gamma oscillations. These results show that semantic processing is frequency-specific and time-dependent on the word categories.
This paper aims to examine the second language (L2) phonetic categorical perception (CP) pattern of Chinese learners of English, regarding the contrast of dark /l/ and the vowel /?/. Three perception experiments were carried out progressively: a simple identification task, an AXB identification task, and a revised AX discrimination task. The study discovered a significant difference across vowel contexts in the perception of dark /l/ and the vowel /?/, in which high vowels stand out, and demonstrated that English proficiency as evaluated by standard examinations is not reflected in L2 phonetic discrimination. The study also proved the validity of adding reference stimuli in enhancing CP performance, but this improvement only benefits the identification tasks. The study helps to fill the current knowledge gap concerning Chinese L2 learners' difficulty in distinguishing dark /l/ and the vowel /?/. The new findings contribute to a deeper understanding of the vowel-context effect on CP performance, as well as implications for second language teaching in exploring the connections between L2 speech perception and production.
文摘Background: Sickle cell anemia(SCA), a genetic hemoglobin disorder, suggests essential inner ear compromise and poor auditory processing. In humans, auditory processing differs physiologically between males and females, possibly true for SCA due to gender-specific disease pathophysiological changes. Objective: To investigate gender differences in psychoacoustical abilities, and speech perception in noise in SCA individuals and further compare with normal healthy(NH) population. Methods: 80 SCA and 80 NH normal-hearing participants aged 15-40 years were included and further grouped based on gender. Auditory discrimination for frequency, intensity, and duration at 500Hz and 4000Hz;temporal processing(Gap detection threshold & Modulation Detection Threshold) and Speech Perception In Noise(SPIN) at 0d BSNR tests were evaluated and compared between males and females of SCA and NH population. Results: SCA performed poorer compared to NH for all experimental measures. In the NH population, males performed poorer than females in psychoacoustical measures whereas within the SCA population, the reverse was true. Female participants performed better in the SPIN test in both populations. Conclusions: The adverse impact of SCA on the auditory system due to circulatory changes might cause poorer performance in SCA. Poorer performance by Female SCA is possibly due to the contrary impact of lower Hb level overlying Sickle disease.Estrogen levels and gender preference in auditory processing might lead to better performance by females within the NH population. SPIN performance depends on different attentional demands and sensorimotor processing strategies in noise beyond psychoacoustical processing may lead to better female performance in both populations.
文摘Background:Research has shown that musicians outperform non-musicians in speech perception in noise(SPiN)tasks.However,it remains unclear whether the advantages of musical training are substantial enough to slow down the decline in SPiN performance associated with aging.Objectives:Therefore,we assessed SPiN performances in a continuum of age groups comprising musicians and non-musicians.The goal was to compare how the aging process affected SPiN performances of musicians and non-musicians.Method:A cross-sectional descriptive mixed design was used,involving 150 participants divided into 75 musicians and 75 non-musicians.Each age group(10-19,20-29,30-39,40-49,and 50-59)consisted of15 musicians and 15 non-musicians.Six Kannada sentence lists were combined with four-talker babble.At+5,0,and-5 dB signal-to-noise ratios(SNRs),the percent correct Speech Identification Scores were calculated.Results:The repeated measure ANOVA(RM ANOVA)revealed significant main effects and interaction effects between SNR,musicianship,and age groups(p<0.05).A small to large effect size was noted(ηp2=0.05 to0.17).A significant interaction effect and follow-up post hoc tests showed that SPiN abilities deteriorated more rapidly with increasing age in nonmusicians compared to musicians,especially at difficult SNRs.Conclusions:Musicians had better SPiN abilities than non-musicians across all age groups.Also,age-related deterioration in SPiN abilities was faster in non-musicians compared to musicians.
Abstract: Background: It is crucial to study the effect of changes in thresholds (T) and most comfortable levels (M) on behavioral measurements in young children using cochlear implants. This would help the clinician with the optimization and validation of programming parameters. Objective: The study attempted to describe the changes in behavioral responses with modification of T and M levels. Methods: Twenty-five participants aged 5 to 12 years using HR90K/HiFocus1J or HR90K Advantage/HiFocus1J implants with Harmony speech processors participated in the study. A decrease in T levels, a rise in T levels, or a decrease in M levels relative to the everyday program was used to create experimental programs. Sound-field thresholds and speech perception were measured at 50 dB HL for the three experimental programs and the everyday program. Conclusion: The results indicated that only reductions of M levels resulted in significantly (p < 0.01) poorer aided thresholds and speech perception. On the other hand, variation in T levels did not produce significant changes in either sound-field thresholds or speech perception. The results highlight that M levels must be correctly established in order to prevent decreased audibility and speech perception.
Funding: Supported by STI 2030-Major Projects (2021ZD0201500), the National Natural Science Foundation of China (31822024, 31671172, and 32300881), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB32010300), and the Scientific Foundation of the Institute of Psychology, Chinese Academy of Sciences (E1CX172005 and E1CX4725CX).
Abstract: Musical training can counteract age-related decline in speech perception in noisy environments. However, it remains unclear whether older non-musicians and musicians rely on functional compensation or functional preservation to counteract the adverse effects of aging. This study utilized resting-state functional connectivity (FC) to investigate functional lateralization, a fundamental organizational feature, in older musicians (OM), older non-musicians (ONM), and young non-musicians (YNM). Results showed that OM outperformed ONM and achieved performance comparable to YNM in speech-in-noise and speech-in-speech tasks. ONM exhibited reduced lateralization compared with YNM in the lateralization index (LI) of intrahemispheric FC (LI_intra) in the cingulo-opercular network (CON) and the LI of interhemispheric heterotopic FC (LI_he) in the language network (LAN). Conversely, OM showed higher neural alignment to YNM (i.e., a more similar lateralization pattern) compared to ONM in CON, LAN, the frontoparietal network (FPN), the dorsal attention network (DAN), and the default mode network (DMN), indicating preservation of youth-like lateralization patterns due to musical experience. Furthermore, in ONM, stronger left-lateralized and lower alignment-to-young LI_intra in the somatomotor network (SMN) and DAN and LI_he in DMN correlated with better speech performance, indicating a functional compensation mechanism. In contrast, stronger right-lateralized LI_intra in FPN and DAN and higher alignment-to-young LI_he in LAN correlated with better performance in OM, suggesting a functional preservation mechanism. These findings highlight the differential roles of functional preservation and compensation of lateralization in speech perception in noise among elderly individuals with and without musical expertise, offering insights into theories of successful aging through the lens of functional lateralization and speech perception.
Abstract: Speech perception is essential for daily communication. Background noise or concurrent talkers, however, can make it challenging for listeners to track the target speech (i.e., the cocktail party problem). The present study reviews and compares existing findings on speech perception and unmasking in cocktail-party listening environments in English and Mandarin Chinese. The review starts with an introduction followed by related concepts of auditory masking. The next two sections review factors that release speech perception from masking in English and Mandarin Chinese, respectively. The last section presents an overall summary of the findings with comparisons between the two languages. Future research directions with respect to differences between the two languages in the literature on the reviewed topic are also discussed.
Funding: Supported by the National Basic Research Development Program of China (2009CB320901, 2011CB707805, 2013CB329304), the National Natural Science Foundation of China (31170985, 91120001, 61121002), and "985" project grants from Peking University.
Abstract: Based on the Motor Theory of speech perception, the interaction between the auditory and motor systems plays an essential role in speech perception. Since the Motor Theory was proposed, it has received remarkable attention in the field. However, each of the three hypotheses of the theory still needs further verification. In this review, we focus on how the auditory-motor anatomical and functional associations play a role in speech perception and discuss why previous studies could not reach an agreement, in particular whether the motor system's involvement in speech perception is task-load dependent. Finally, we suggest that the function of the auditory-motor link is particularly useful for speech perception under adverse listening conditions, and that the further revised Motor Theory is a potential solution to the "cocktail party" problem.
Funding: Supported by the General Research Fund (No. 18600218).
Abstract: The interference of a tonal language poses challenges for Chinese learners of English in acquiring word stress. The lack of symmetry between word stress problems in production and perception, and the absence of attention to specific stress patterns in teaching and learning, can reduce the effectiveness of word stress acquisition. The purpose of this paper is twofold: to examine the relationship between English word stress production and perception, and to investigate how English word stress production and perception are affected by specific stress patterns. Ninety participants were involved in a production task and a perception task. Test words were selected based on 26 stress patterns in three categories: syllabic structure, phonological similarity, and vowel reduction. The results show that the production and perception of English word stress differ significantly, without a strong linear correlation. Although the accuracy of word stress perception was higher than production for the test words in general, the comparative status of production and perception varied across different stress patterns. Specifically, in the syllabic structure category, the symmetry rate of word stress assignment in production and perception was highest for ˈσCVCC (e.g., climax, abend), while the symmetry rate for ˈσoCVV(C) (e.g., abdicate, importune) was the lowest and the most problematic for production. In the phonological similarity category, production and perception of word stress were most symmetrical for words with the suffix “-eous” and most asymmetrical for words with the suffix “-ese,” which was also the most problematic for production. Identification of vowel reduction was more challenging for /ɒ/ than /æ/ in both production and perception. It is suggested that Chinese ESL teachers prioritize the teaching of stress patterns with low symmetry rates to achieve efficient learning outcomes.
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2013CB329304, the National Natural Science Foundation of China under Grant No. 61370023, and the Major Project of the National Social Science Foundation of China under Grant No. 13&ZD189; partially supported by the General Research Fund of the Hong Kong SAR Government under Project No. 415511 and the CUHK Teaching Development Grant.
Abstract: Computer-aided pronunciation training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second-language (L2) learners' speech. In order to further facilitate learning, we aim to develop a principled method for generating a gradation of the severity of mispronunciations. This paper presents an approach to gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes. This is used to compute the auditory confusion of the native language (L1). PD is found to correlate well with the mispronunciations detected by a CAPT system for Chinese learners of English, i.e., L1 being Chinese (Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to help grade the severity of errors (i.e., mispronunciations that confuse more distant phonemes are more severe) and accordingly prioritize the order of corrective feedback generated for learners.
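The abstract's core idea, that phonemes which listeners rarely confuse are perceptually "far apart", can be illustrated with a toy distance derived from a confusion matrix. This is an assumed, simplified formulation (a symmetrized negative log of confusion probability), not the PD computation of the paper, and the three-phoneme matrix is hypothetical.

```python
import math

def perceptual_distance(confusions, a, b):
    """Toy perceptual distance between phonemes `a` and `b`:
    frequently confused phonemes get a small distance.
    `confusions[x][y]` is the probability phoneme `x` is heard as `y`."""
    # Symmetrize the two directed confusion probabilities,
    # then map (0, 1] onto [0, inf) with a negative log.
    p = 0.5 * (confusions[a][b] + confusions[b][a])
    return -math.log(max(p, 1e-12))

# Hypothetical listener confusion matrix (each row sums to 1)
confusions = {
    "i": {"i": 0.90, "e": 0.08, "a": 0.02},
    "e": {"i": 0.10, "e": 0.85, "a": 0.05},
    "a": {"i": 0.01, "e": 0.04, "a": 0.95},
}

# /i/ and /e/ are confused far more often than /i/ and /a/,
# so their distance comes out smaller (a milder error under the
# paper's grading logic).
d_ie = perceptual_distance(confusions, "i", "e")
d_ia = perceptual_distance(confusions, "i", "a")
```

Under the paper's grading logic, a mispronunciation substituting a large-distance phoneme would be prioritized for corrective feedback.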
Abstract: Realization of an intelligent human-machine interface requires us to investigate human mechanisms and learn from them. This study focuses on communication between speech production and perception within the human brain and on realizing it in an artificial system. A physiological research study based on electromyographic signals (Honda, 1996) suggested that speech communication in the human brain might be based on a topological mapping between speech production and perception, according to an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception in terms of a transformed auditory feedback (TAF) experiment. The model simulation indicated that there exists an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate system consisting of force-dependent equilibrium positions, and that the mapping from the motor space to the kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies were compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response to a perturbation in the feedback sound. This implies that vowel production is controlled with reference to perception monitoring.
Funding: Supported by a grant from the University of Chile (UI-10/16) to EA.
Abstract: Objective: To contribute to clarifying the existence of subclinical hearing deficits associated with aging. Design: In this work, we study and compare the auditory perceptual and electrophysiological performance of normal-hearing young and adult subjects (tonal audiometry, high-frequency tone thresholds, digit triplets in noise, and click-evoked auditory brainstem response). Study sample: 45 normal-hearing volunteers were evaluated and divided into two groups according to age. 27 subjects were included in the “young group” (mean 22.1 years), and 18 subjects (mean 42.22 years) were included in the “adult group.” Results: In the perceptual tests, the adult group presented significantly worse tonal thresholds at the high frequencies (12 and 16 kHz) and worse performance on the digit-triplet test in noise. In the electrophysiological test using the auditory brainstem response technique, the adult group presented significantly lower wave I and wave V amplitudes and longer wave V latencies at the supra-threshold level. At the threshold level, we observed a significantly longer wave V latency in the adult group. In addition, in the partial correlation analysis controlling for hearing level, we observed a negative relationship between age and both speech-in-noise performance and high-frequency thresholds. No significant association was observed between age and the auditory brainstem response. Conclusion: The results are compatible with subclinical hearing loss associated with aging.
Abstract: The aim of the study was to determine the development of syntax in the language development of children who are deaf or hard-of-hearing and who are taught new dynamic linguistic features with the help of computers. The sample consisted of 70 children who are deaf or hard-of-hearing, aged 7-17 years. To assess language development, the following variables were applied: total number of words used, total number of different words used, and the correct and incorrect statements (sentences) of the respondents. We calculated the basic statistical parameters, from which it was found that the experimental computer-based teaching program for children who are deaf or hard-of-hearing gave better results in the development of syntax. Also, canonical discriminant analysis revealed a statistically significant difference in the applied variables between the control and experimental groups at a level of statistical significance of p = 0.000. The results showed a significant improvement in the experimental group and that the dynamic computer programming activities presented to participants of the experimental group contribute to better linguistic competence in children who are deaf or hard-of-hearing.
Abstract: Older adults often find it difficult to perceive speech, especially in noisy conditions. Though hearing aids are among the rehabilitative devices available to older adults to alleviate hearing loss, some of them may experience annoyance with a hearing aid and hence reject it, possibly due to circuitry noise and/or background noise. Acceptable noise level is a direct behavioural measure estimating the extent to which a person is able to put up with noise while simultaneously listening to speech. Acceptable noise level is a central auditory measure and is not influenced by age, gender, presentation level, or speaker. Using this measure, we can quantify the annoyance level experienced by an individual. This information is of utmost importance, and caution should be exercised before setting the parameters in a hearing aid, especially for those who are unable to accept noise. In this review article, an attempt has been made to document how to optimize the hearing aid program by setting parameters such as the noise reduction circuit, microphone sensitivity, and gain. These adjustments might help to reduce the rejection rate of hearing aids, especially in those individuals who are annoyed by background noise. Copyright © 2015 The Authors. Production & hosting by Elsevier (Singapore) Pte Ltd on behalf of PLA General Hospital Department of Otolaryngology Head and Neck Surgery. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract: Objective: To demonstrate the performance benefit of the Automatic Scene Classifier (SCAN) algorithm available in the Nucleus 6 (CP900 series) sound processor over the default processing algorithms of the previous-generation Nucleus 5 (CP810) and Freedom Hybrid™ sound processors. Methods: Eighty-two cochlear implant recipients (40 Nucleus 5 processor users and 42 Freedom Hybrid processor users) listened to and repeated AzBio sentences in noise with their current processor and with the Nucleus 6 processor. Results: The SCAN algorithm, when enabled, yielded statistically significant non-inferior and superior performance compared to the Nucleus 5 and Freedom Hybrid sound processors programmed with ASC + ADRO. Conclusion: The results of these studies demonstrate the superior performance and clinical utility of the SCAN algorithm in the Nucleus 6 processor over the Nucleus 5 and Freedom Hybrid processors.
Funding: Supported by the HEARing CRC (established and supported under the Australian Government's Cooperative Research Centres Program).
Abstract: Objective: To evaluate the auditory function of an individual with genetically confirmed hemochromatosis. Methods: A 57-year-old male with mildly impaired sound detection thresholds underwent a range of behavioural, electroacoustic, and electrophysiologic assessments. These included the recording of otoacoustic emissions and auditory brainstem responses, measurement of monaural temporal resolution, and evaluation of binaural speech processing. Findings for this patient were subsequently compared with those of 80 healthy controls with similar audiometric thresholds. Results: The patient showed the three cardinal features of auditory neuropathy, presenting with evidence of normal cochlear outer hair cell function, disrupted neural activity in the auditory nerve/brainstem, and impaired temporal processing. His functional hearing ability (speech perception) was significantly affected and suggested a reduced capacity to use localization cues to segregate signals in the presence of background noise. Conclusion: We present the first case of an individual with hemochromatosis and auditory neuropathy. The findings for this patient highlight the need for careful evaluation of auditory function in individuals with the disorder.
Funding: Supported by the National Natural Science Foundation of China (61071215, 61271359, 61372146).
Abstract: The Perception Spectrogram Structure Boundary (PSSB) parameter is proposed for speech endpoint detection as a preprocessing step for speech or speaker recognition. First, hearing-perception-based speech enhancement is carried out. Then, two-dimensional enhancement is performed on the sound spectrogram, exploiting the difference between the deterministic distribution characteristic of speech and the random distribution characteristic of noise. Finally, an endpoint decision is made using the PSSB parameter. Experimental results show that, in low-SNR environments from -10 dB to 10 dB, the proposed algorithm achieves higher accuracy than existing endpoint detection algorithms. A detection accuracy of 75.2% can be reached even at an extremely low SNR of -10 dB. The algorithm is therefore suitable for speech endpoint detection in low-SNR environments.
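To make the endpoint-detection task concrete, the sketch below implements a deliberately simple short-time-energy baseline: the kind of detector the PSSB method is designed to outperform at low SNR. It is an illustrative assumption, not the spectrogram-based PSSB algorithm itself; frame length and threshold ratio are arbitrary choices.

```python
def detect_endpoints(samples, frame_len=160, threshold_ratio=0.1):
    """Locate the first and last frames whose short-time energy exceeds
    a fraction of the peak frame energy, and return the corresponding
    (start, end) sample indices. A simple energy-based baseline, far
    less robust at low SNR than the paper's PSSB approach."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energies = [sum(s * s for s in f) for f in frames]
    threshold = threshold_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len

# Synthetic example: silence, a loud segment, silence
signal = [0.0] * 800 + [1.0] * 800 + [0.0] * 800
start, end = detect_endpoints(signal)
```

Because frame energy collapses under heavy noise, purely energy-based detectors degrade quickly below 0 dB SNR, which motivates structure-based parameters like PSSB.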
Funding: Supported by the Strategic Consulting Research Project of the Chinese Academy of Engineering (No. 2016-ZD-04-03).
Abstract: Perception is the interaction interface between an intelligent system and the real world. Without sophisticated and flexible perceptual capabilities, it is impossible to create advanced artificial intelligence (AI) systems. For the next-generation AI, called 'AI 2.0', one of the most significant features will be that AI is empowered with intelligent perceptual capabilities that can simulate the mechanisms of the human brain and are likely to surpass the human brain in terms of performance. In this paper, we briefly review state-of-the-art advances across different areas of perception, including visual perception, auditory perception, speech perception, and perceptual information processing and learning engines. On this basis, we envision several R&D trends in intelligent perception for the forthcoming era of AI 2.0, including: (1) human-like and transhuman active vision; (2) auditory perception and computation in actual auditory settings; (3) speech perception and computation in natural interaction settings; (4) autonomous learning of perceptual information; (5) large-scale perceptual information processing and learning platforms; and (6) urban omnidirectional intelligent perception and reasoning engines. We believe these research directions should be highlighted in future plans for AI 2.0.
Funding: This research was supported by a grant from the National Natural Science Foundation of China (No. 30973309).
Abstract: Background: Many factors interfere with a listener attempting to grasp speech in noisy environments. Spatial hearing, by which speech and noise can be spatially separated, may play a crucial role in speech recognition in the presence of competing noise. This study aimed to assess whether, and to what degree, spatial hearing benefits speech recognition in young normal-hearing participants in both quiet and noisy environments. Methods: Twenty-eight young participants were tested with the Mandarin Hearing In Noise Test (MHINT) in quiet and noisy environments. The assessment method was characterized by modifications of speech and noise configurations, as well as by changes of speech presentation mode. The benefit of spatial hearing was measured by the speech recognition threshold (SRT) variation between speech condition 1 (SC1) and speech condition 2 (SC2). Results: There was no significant difference in SRT between SC1 and SC2 in quiet. SRT in SC1 was about 4.2 dB lower than that in SC2 in both the speech-shaped and four-babble noise conditions. SRTs measured in both SC1 and SC2 were lower in the speech-shaped noise condition than in the four-babble noise condition. Conclusion: Spatial hearing in young normal-hearing participants contributes to speech recognition in noisy environments but provides no benefit to speech recognition in quiet environments, which may be due to auditory extrinsic redundancy offsetting the lack of spatial hearing.
Funding: Supported partially by JSPS KAKENHI Grant (20K11883).
Abstract: Brain mechanisms of lexical-semantic processing have been well researched using the electroencephalography (EEG) technique with its high temporal resolution. However, the detailed brain dynamics regarding spatial connectivity and spectral characteristics remain to be clarified. For this reason, this study performed frequency-specific effective connectivity analysis on EEG recordings during the processing of real words and pseudowords. In addition, we introduced fMRI-based network templates into a representational similarity analysis to compare the functional differences between real words and pseudowords in different frequency bands. Our results revealed that real words could rapidly activate the brain network for speech perception and complete comprehension efficiently, especially when the first syllable of the real word has clear categorical features. In contrast, pseudowords were delayed in the initiation of speech perception and required a longer time span for meaning retrieval. The frequency-specific analysis showed that the theta, alpha, and beta rhythms contribute more to semantic processing than gamma oscillations. These results showed that semantic processing is frequency-specific and time-dependent on the word categories.
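At its core, the representational similarity analysis mentioned in this abstract compares two representational dissimilarity matrices (RDMs) by correlating their off-diagonal entries. The sketch below shows that generic core step with a Pearson correlation over the upper triangles; it is an assumed simplification, not the study's fMRI-template pipeline, and the matrices are hypothetical.

```python
import math

def rsa_similarity(rdm_a, rdm_b):
    """Pearson-correlate the upper-triangle entries of two symmetric
    representational dissimilarity matrices (lists of lists).
    Higher values mean the two representations rank condition pairs
    more similarly."""
    n = len(rdm_a)
    xs = [rdm_a[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [rdm_b[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical 3-condition RDMs; rdm2 is a scaled copy of rdm1,
# so the two representations agree perfectly on pairwise structure.
rdm1 = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
rdm2 = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]

r = rsa_similarity(rdm1, rdm2)
```

In practice, published RSA work often uses a rank (Spearman) correlation instead of Pearson to avoid assuming a linear relationship between the two RDMs.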
Funding: Supported by the Shanghai Social Science Project (2018BYY003), the Major Program of the National Social Science Foundation of China (No. 15ZDB103), and the China Scholarship Council.
Abstract: This paper aims to examine the second-language (L2) phonetic categorical perception (CP) pattern of Chinese learners of English, regarding the contrast between dark /l/ and the vowel /?/. Three perception experiments were carried out progressively: a simple identification task, an AXB identification task, and a revised AX discrimination task. The study discovered a significant difference across vowel contexts in the perception of dark /l/ and the vowel /?/, in which high vowels stand out, and demonstrated that English proficiency evaluated by standard examinations is not reflected in L2 phonetic discrimination. The study also proved the validity of adding reference stimuli in enhancing CP performance, but this improvement only benefits the identification tasks. The study helps to fill the current knowledge gap concerning Chinese L2 learners' difficulty in distinguishing dark /l/ and the vowel /?/. The new finding contributes to a deeper understanding of the vowel-context effect on CP performance, as well as implications for second language teaching in exploring the connections between L2 speech perception and production.