684 articles found
AI-Generated Text Detection: A Comprehensive Review of Active and Passive Approaches
1
Authors: Lingyun Xiang, Nian Li, Yuling Liu, Jiayong Hu. Computers, Materials & Continua, 2026, Issue 3, pp. 201-229 (29 pages)
The rapid advancement of large language models (LLMs) has driven the pervasive adoption of AI-generated content (AIGC), while also raising concerns about misinformation, academic misconduct, biased or harmful content, and other risks. Detecting AI-generated text has thus become essential to safeguard the authenticity and reliability of digital information. This survey reviews recent progress in detection methods, categorizing approaches into passive and active categories based on their reliance on intrinsic textual features or embedded signals. Passive detection is further divided into surface linguistic feature-based and language model-based methods, whereas active detection encompasses watermarking-based and semantic retrieval-based approaches. This taxonomy enables systematic comparison of methodological differences in model dependency, applicability, and robustness. A key challenge for AI-generated text detection is that existing detectors are highly vulnerable to adversarial attacks, particularly paraphrasing, which substantially compromises their effectiveness. Addressing this gap highlights the need for future research on enhancing robustness and cross-domain generalization. By synthesizing current advances and limitations, this survey provides a structured reference for the field and outlines pathways toward more reliable and scalable detection solutions.
Keywords: AI-generated text detection; large language models; text classification; watermarking
A Synthetic Speech Detection Model Combining Local-Global Dependency
2
Authors: Jiahui Song, Yuepeng Zhang, Wenhao Yuan. Computers, Materials & Continua, 2026, Issue 1, pp. 1312-1326 (15 pages)
Synthetic speech detection is an essential task in the field of voice security, aimed at identifying deceptive voice attacks generated by text-to-speech (TTS) systems or voice conversion (VC) systems. In this paper, we propose a synthetic speech detection model called TFTransformer, which integrates both local and global features to enhance detection capabilities by effectively modeling local and global dependencies. Structurally, the model is divided into two main components: a front-end and a back-end. The front-end of the model uses a combination of SincLayer and two-dimensional (2D) convolution to extract high-level feature maps (HFM) containing local dependency of the input speech signals. The back-end uses a time-frequency Transformer module to process these feature maps and further capture global dependency. Furthermore, we propose TFTransformer-SE, which incorporates a channel attention mechanism within the 2D convolutional blocks. This enhancement aims to more effectively capture local dependencies, thereby improving the model’s performance. The experiments were conducted on the ASVspoof 2021 LA dataset, and the results showed that the model achieved an equal error rate (EER) of 3.37% without data augmentation. Additionally, we evaluated the model using the ASVspoof 2019 LA dataset, achieving an EER of 0.84%, also without data augmentation. This demonstrates that combining local and global dependencies in the time-frequency domain can significantly improve detection accuracy.
Keywords: synthetic speech detection; Transformer; local-global; time-frequency domain
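The equal error rate (EER) quoted in the abstract above is the operating point at which the false acceptance rate equals the false rejection rate. As a minimal sketch, and not the authors' evaluation code, it can be approximated by sweeping a threshold over detector scores. The function name, the score convention (a higher score means "more likely bona fide"), and the toy scores are assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Assumed convention: a trial is accepted when score >= threshold.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(bonafide_scores) | set(spoof_scores)):
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < threshold for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed trials scored at or above the threshold.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): one bona fide trial scores below one spoofed trial.
toy_bonafide = [0.9, 0.8, 0.7, 0.4]
toy_spoof = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(toy_bonafide, toy_spoof)
```

With finitely many trials the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate instead.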
Detection of Maliciously Disseminated Hate Speech in Spanish Using Fine-Tuning and In-Context Learning Techniques with Large Language Models
3
Authors: Tomás Bernal-Beltrán, Ronghao Pan, José Antonio García-Díaz, María del Pilar Salas-Zárate, Mario Andrés Paredes-Valverde, Rafael Valencia-García. Computers, Materials & Continua, 2026, Issue 4, pp. 353-390 (38 pages)
The malicious dissemination of hate speech via compromised accounts, automated bot networks and malware-driven social media campaigns has become a growing cybersecurity concern. Automatically detecting such content in Spanish is challenging due to linguistic complexity and the scarcity of annotated resources. In this paper, we compare two predominant AI-based approaches for the forensic detection of malicious hate speech: (1) fine-tuning encoder-only models that have been trained in Spanish and (2) In-Context Learning techniques (Zero- and Few-Shot Learning) with large-scale language models. Our approach goes beyond binary classification, proposing a comprehensive, multidimensional evaluation that labels each text by: (1) type of speech, (2) recipient, (3) level of intensity (ordinal) and (4) targeted group (multi-label). Performance is evaluated on an annotated Spanish corpus using standard metrics such as precision, recall and F1-score, together with stability-oriented metrics (Zero-to-Few Shot Retention and Zero-to-Few Shot Gain) that assess the transition from zero-shot to few-shot prompting. The results indicate that fine-tuned encoder-only models (notably MarIA and BETO variants) consistently deliver the strongest and most reliable performance: in our experiments their macro F1-scores lie roughly in the range of 46%–66% depending on the task. Zero-shot approaches are much less stable and typically yield substantially lower performance (observed F1-scores range approximately 0%–39%), often producing invalid outputs in practice. Few-shot prompting (e.g., Qwen3 8B, Mistral 7B) generally improves stability and recall relative to pure zero-shot, bringing F1-scores into a moderate range of approximately 20%–51% but still falling short of fully fine-tuned models. These findings highlight the importance of supervised adaptation and discuss the potential of both paradigms as components in AI-powered cybersecurity and malware forensics systems designed to identify and mitigate coordinated online hate campaigns.
Keywords: hate speech detection; malicious communication campaigns; AI-driven cybersecurity; social media analytics; large language models; prompt-tuning; fine-tuning; in-context learning; natural language processing
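The macro F1-scores reported in the abstract above average the per-class F1 with equal weight, so rare classes count as much as frequent ones. The sketch below shows that computation in plain Python. The function name and the toy labels are hypothetical, and treating any label seen only in the predictions (for example an invalid model output) as its own class is an assumption of this sketch, not something the abstract specifies.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy example (hypothetical labels): one "hate" text is missed.
toy_true = ["hate", "hate", "none", "none"]
toy_pred = ["hate", "none", "none", "none"]
toy_score = macro_f1(toy_true, toy_pred)
```

Here the "hate" class has F1 = 2/3 and the "none" class has F1 = 4/5, so the macro average is their plain mean.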
Upholding Academic Integrity amidst Advanced Language Models: Evaluating BiLSTM Networks with GloVe Embeddings for Detecting AI-Generated Scientific Abstracts
4
Authors: Lilia-Eliana Popescu-Apreutesei, Mihai-Sorin Iosupescu, Sabina Cristiana Necula, Vasile-Daniel Pavaloaia. Computers, Materials & Continua, 2025, Issue 8, pp. 2605-2644 (40 pages)
The increasing fluency of advanced language models, such as GPT-3.5, GPT-4, and the recently introduced DeepSeek, challenges the ability to distinguish between human-authored and AI-generated academic writing. This situation is raising significant concerns regarding the integrity and authenticity of academic work. In light of the above, the current research evaluates the effectiveness of Bidirectional Long Short-Term Memory (BiLSTM) networks enhanced with pre-trained GloVe (Global Vectors for Word Representation) embeddings to detect AI-generated scientific abstracts drawn from the AI-GA (Artificial Intelligence Generated Abstracts) dataset. Two core BiLSTM variants were assessed: a single-layer approach and a dual-layer design, each tested under static or adaptive embeddings. The single-layer model achieved nearly 97% accuracy with trainable GloVe, occasionally surpassing the deeper model. Despite these gains, neither configuration fully matched the 98.7% benchmark set by an earlier LSTM-Word2Vec pipeline. Some runs were over-fitted when embeddings were fine-tuned, whereas static embeddings offered a slightly lower yet stable accuracy of around 96%. This lingering gap reinforces a key ethical and procedural concern: relying solely on automated tools, such as Turnitin’s AI-detection features, to penalize individuals risks unjust outcomes. Misclassifications, whether legitimate work is misread as AI-generated or engineered text evades detection, demonstrate that these classifiers should not stand as the sole arbiters of authenticity. A more comprehensive approach is warranted, one which weaves model outputs into a systematic process supported by expert judgment and institutional guidelines designed to protect originality.
Keywords: AI-GA dataset; bidirectional LSTM; GloVe embeddings; AI-generated text detection; academic integrity; deep learning; overfitting; natural language processing
Annoyance-type speech emotion detection in working environment
5
Authors: Wang Qingyun (王青云), Zhao Li (赵力), Liang Ruiyu (梁瑞宇), Zhang Xiaodan (张潇丹). Journal of Southeast University (English Edition) (indexed in EI, CAS), 2013, Issue 4, pp. 366-371 (6 pages)
In order to recognize people's annoyance emotions in the working environment and evaluate emotional well-being, emotional speech in a work environment is induced to obtain adequate samples of emotional speech, and a Mandarin database with two thousand samples is built. In searching for annoyance-type emotion features, the prosodic feature and the voice quality feature parameters of the emotional statements are extracted first. Then an improved back propagation (BP) neural network based on the shuffled frog leaping algorithm (SFLA) is proposed to recognize the emotion. The recognition capabilities of the BP, radial basis function (RBF) and SFLA neural networks are compared experimentally. The results show that the recognition ratio of the SFLA neural network is 4.7% better than that of the BP neural network and 4.3% better than that of the RBF neural network. The experimental results demonstrate that the random initial data trained by the SFLA can optimize the connection weights and thresholds of the neural network, speed up the convergence and improve the recognition rate.
Keywords: speech emotion detection; annoyance type; sentence length; shuffled frog leaping algorithm
Speech Endpoint Detection in Noisy Environments Using EMD and Teager Energy Operator (cited by: 4)
6
Authors: De-Xiang Zhang, Xiao-Pei Wu, Zhao Lv. Journal of Electronic Science and Technology (indexed in CAS), 2010, Issue 2, pp. 183-186 (4 pages)
Accurate endpoint detection is a necessary capability for speech recognition. A new energy measure method based on the empirical mode decomposition (EMD) algorithm and Teager energy operator (TEO) is proposed to locate endpoint intervals of a speech signal embedded in noise. With the EMD, the noisy signals can be decomposed into different numbers of sub-signals called intrinsic mode functions (IMFs), each of which is a zero-mean AM-FM component. Then TEO can be used to extract the desired feature of the modulation energy for IMF components. In order to show the effectiveness of the proposed method, examples are presented to show that the new measure is more effective than traditional measures. The present experimental results show that the measure can be used to improve the performance of endpoint detection algorithms and the accuracy of this algorithm is quite satisfactory and acceptable.
Keywords: empirical mode decomposition; endpoint detection; noisy speech; Teager energy operator
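The abstract above names the Teager energy operator (TEO) without stating its formula. The commonly used discrete form is psi[n] = x[n]^2 - x[n-1]*x[n+1]; the sketch below applies that textbook definition to a plain list of samples. It does not reproduce the paper's EMD stage or its endpoint decision rule, and the function name and test tone are assumptions made for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    One value is produced per interior sample, so the result is two
    samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# For a pure tone A*cos(omega*n) the operator is constant and equals
# (A*sin(omega))^2, so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
psi = teager_energy(tone)
```

Because the output depends on frequency as well as amplitude, the operator is usually applied to narrow-band components, which is consistent with the abstract's use of it on individual IMFs.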
Speech enhancement through voice activity detection using speech absence probability based on Teager energy (cited by: 2)
7
Authors: PARK Yun-sik, LEE Sang-min. Journal of Central South University (indexed in SCIE, EI, CAS), 2013, Issue 2, pp. 424-432 (9 pages)
In this work, a novel voice activity detection (VAD) algorithm that uses speech absence probability (SAP) based on Teager energy (TE) was proposed for speech enhancement. The proposed method employs local SAP (LSAP) based on the TE of noisy speech as a feature parameter for VAD in each frequency subband, rather than conventional LSAP. Results show that the TE operator can enhance the ability to discriminate speech and noise and further suppress noise components. Therefore, TE-based LSAP provides a better representation of LSAP, resulting in improved VAD for estimating noise power in a speech enhancement algorithm. In addition, the presented method utilizes TE-based global SAP (GSAP) derived in each frame as the weighting parameter for modifying the adopted TE operator and improving its performance. The proposed algorithm was evaluated by objective and subjective quality tests under various environments, and was shown to produce better results than the conventional method.
Keywords: speech enhancement; Teager energy; speech absence probability; voice activity detection
Wireless Bioradar Sensor Networks for Speech Detection and Communication (cited by: 1)
8
Authors: Ying Tian, Sheng Li, Jianqi Wang. Engineering (Scientific Research), 2013, Issue 5, pp. 37-41 (5 pages)
Wireless multimedia sensor networks (WMSN) are emerging to serve the collection of acoustic and image information. In the WMSN, microphones are usually employed as sensor nodes for the acquisition of acoustic data. However, those microphone sensors need to be placed close to the sound source and cannot detect sound signals through certain obstacles. To overcome the shortcomings of microphone sensors, we develop a new type of bioradar sensor to achieve non-contact speech detection and investigate theoretically the mechanism of bioradar for speech detection. Results show that the system can successfully detect speech at some distance and even through non-metallic objects with certain thickness. In addition, in order to suppress the noise and improve the quality of the detected speech, we use spectral subtraction and the Wiener filtering algorithm respectively to enhance the bioradar speech and evaluate the performance of the two methods using spectrograms.
Keywords: bioradar sensor; speech detection; wireless sensor networks
Speech Signal Detection Based on Bayesian Estimation by Observing Air-Conducted Speech under Existence of Surrounding Noise with the Aid of Bone-Conducted Speech (cited by: 1)
9
Authors: Hisako Orimoto, Akira Ikuta, Kouji Hasegawa. Intelligent Information Management, 2021, Issue 4, pp. 199-213 (15 pages)
In order to apply speech recognition systems to actual circumstances where hand-writing is difficult, ranging from inspection and maintenance operations in industrial factories to recording and reporting routines at construction sites, countermeasures against surrounding noise are indispensable. In this study, a signal detection method to remove the noise from actual speech signals is proposed by using Bayesian estimation with the aid of bone-conducted speech. More specifically, by introducing Bayes’ theorem based on the observation of air-conducted speech contaminated by surrounding background noise, a new type of algorithm for noise removal is theoretically derived. In the proposed speech detection method, bone-conducted speech is utilized in order to obtain a precise estimate of the speech signals. The effectiveness of the proposed method is experimentally confirmed by applying it to air- and bone-conducted speech measured in a real environment in the presence of surrounding background noise.
Keywords: speech signal detection; Bayesian estimation; air- and bone-conducted speech; surrounding noise
Comparing Fine-Tuning, Zero and Few-Shot Strategies with Large Language Models in Hate Speech Detection in English
10
Authors: Ronghao Pan, José Antonio García-Díaz, Rafael Valencia-García. Computer Modeling in Engineering & Sciences (indexed in SCIE, EI), 2024, Issue 9, pp. 2849-2868 (20 pages)
Large Language Models (LLMs) are increasingly demonstrating their ability to understand natural language and solve complex tasks, especially through text generation. One of the relevant capabilities is contextual learning, which involves the ability to receive instructions in natural language or task demonstrations to generate expected outputs for test instances without the need for additional training or gradient updates. In recent years, the popularity of social networking has provided a medium through which some users can engage in offensive and harmful online behavior. In this study, we investigate the ability of different LLMs, ranging from zero-shot and few-shot learning to fine-tuning. Our experiments show that LLMs can identify sexist and hateful online texts using zero-shot and few-shot approaches through information retrieval. Furthermore, it is found that the encoder-decoder model called Zephyr achieves the best results with the fine-tuning approach, scoring 86.811% on the Explainable Detection of Online Sexism (EDOS) test-set and 57.453% on the Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter (HatEval) test-set. Finally, it is confirmed that the evaluated models perform well in hate text detection, as they beat the best result in the HatEval task leaderboard. The error analysis shows that contextual learning had difficulty distinguishing between types of hate speech and figurative language. However, the fine-tuned approach tends to produce many false positives.
Keywords: hate speech detection; zero-shot; few-shot; fine-tuning; natural language processing
An Adaptive Hate Speech Detection Approach Using Neutrosophic Neural Networks for Social Media Forensics
11
Authors: Yasmine M. Ibrahim, Reem Essameldin, Saad M. Darwish. Computers, Materials & Continua (indexed in SCIE, EI), 2024, Issue 4, pp. 243-262 (20 pages)
Detecting hate speech automatically in social media forensics has emerged as a highly challenging task due to the complex nature of language used in such platforms. Currently, several methods exist for classifying hate speech, but they still suffer from ambiguity when differentiating between hateful and offensive content and they also lack accuracy. The work suggested in this paper uses a combination of the Whale Optimization Algorithm (WOA) and Particle Swarm Optimization (PSO) to adjust the weights of two Multi-Layer Perceptrons (MLPs) for neutrosophic sets classification. During the training process of the MLP, the WOA is employed to explore and determine the optimal set of weights. The PSO algorithm adjusts the weights to optimize the performance of the MLP as fine-tuning. Additionally, in this approach, two separate MLP models are employed. One MLP is dedicated to predicting degrees of truth membership, while the other MLP focuses on predicting degrees of false membership. The difference between these memberships quantifies uncertainty, indicating the degree of indeterminacy in predictions. The experimental results indicate the superior performance of our model compared to previous work when evaluated on the Davidson dataset.
Keywords: hate speech detection; whale optimization; neutrosophic sets; social media forensics
Robust Speech Endpoint Detection in Airplane Cockpit Voice Background
12
Authors: Hongbing CHENG, Ming LEI, Guorong HUANG, Yan XIA. Wireless Sensor Network, 2009, Issue 5, pp. 489-495 (7 pages)
A method of robust speech endpoint detection in airplane cockpit voice background is presented. Based on the analysis of background noise character, a complex Laplacian distribution model directly aiming at noisy speech is established. Then the likelihood ratio test based on binary hypothesis test is carried out. The decision criterion of conventional maximum a posteriori incorporating the inter-frame correlation leads to two separate thresholds. The speech endpoint detection decision is finally made depending on the previous frame and the observed spectrum, and the speech endpoint is searched based on the decision. Compared with the typical algorithms, the proposed method operates robustly in the airplane cockpit voice background.
Keywords: complex Laplacian model; maximum a posteriori criterion; likelihood ratio test; speech endpoint detection; airplane cockpit voice
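The abstract above says the decision for each frame depends on the previous frame's decision and uses two separate thresholds. One common way to realize such a rule is hysteresis on a per-frame likelihood ratio, sketched below. This is an illustration of the general idea only: the function name, the threshold names and values, the assumption that staying in speech uses the lower threshold, and the toy ratios are all hypothetical and are not taken from the paper.

```python
def detect_speech_frames(log_likelihood_ratios, enter_threshold, stay_threshold):
    """Label each frame as speech (True) or non-speech (False).

    Assumed rule: a frame following non-speech must exceed enter_threshold,
    while a frame following speech only needs to exceed stay_threshold.
    """
    decisions = []
    previous_is_speech = False
    for llr in log_likelihood_ratios:
        threshold = stay_threshold if previous_is_speech else enter_threshold
        previous_is_speech = llr > threshold
        decisions.append(previous_is_speech)
    return decisions


# Toy per-frame log-likelihood ratios (hypothetical values).
toy_llr = [0.1, 1.2, 0.6, 0.6, 0.2, 0.6]
toy_decisions = detect_speech_frames(toy_llr, enter_threshold=1.0, stay_threshold=0.5)
```

In this toy run the frames with ratio 0.6 are kept as speech only while the previous frame was speech; the final 0.6 frame follows a non-speech frame and is rejected, which is how the inter-frame dependence shows up.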
Chaotic Elephant Herd Optimization with Machine Learning for Arabic Hate Speech Detection
13
Authors: Badriyya B. Al-onazi, Jaber S. Alzahrani, Najm Alotaibi, Hussain Alshahrani, Mohamed Ahmed Elfaki, Radwa Marzouk, Heba Mohsen, Abdelwahed Motwakel. Intelligent Automation & Soft Computing, 2024, Issue 3, pp. 567-583 (17 pages)
In recent years, the usage of social networking sites has considerably increased in the Arab world. It has empowered individuals to express their opinions, especially in politics. Furthermore, various organizations that operate in the Arab countries have embraced social media in their day-to-day business activities at different scales. This is attributed to business owners’ understanding of social media’s importance for business development. However, the Arabic morphology is too complicated to understand due to the availability of nearly 10,000 roots and more than 900 patterns that act as the basis for verbs and nouns. Hate speech over online social networking sites turns out to be a worldwide issue that reduces the cohesion of civil societies. In this background, the current study develops a Chaotic Elephant Herd Optimization with Machine Learning for Hate Speech Detection (CEHOML-HSD) model in the context of the Arabic language. The presented CEHOML-HSD model majorly concentrates on identifying and categorising the Arabic text into hate speech and normal. To attain this, the CEHOML-HSD model follows different sub-processes as discussed herewith. At the initial stage, the CEHOML-HSD model undergoes data pre-processing with the help of the TF-IDF vectorizer. Secondly, the Support Vector Machine (SVM) model is utilized to detect and classify the hate speech texts made in the Arabic language. Lastly, the CEHO approach is employed for fine-tuning the parameters involved in SVM. This CEHO approach is developed by combining the chaotic functions with the classical EHO algorithm. The design of the CEHO algorithm for parameter tuning shows the novelty of the work. An extensive experimental analysis was executed to validate the enhanced performance of the proposed CEHOML-HSD approach. The comparative study outcomes established the supremacy of the proposed CEHOML-HSD model over other approaches.
Keywords: Arabic language; machine learning; elephant herd optimization; TF-IDF vectorizer; hate speech detection
A Review of Machine Learning Techniques in Cyberbullying Detection (cited by: 1)
14
Authors: Daniyar Sultan, Batyrkhan Omarov, Zhazira Kozhamkulova, Gulnur Kazbekova, Laura Alimzhanova, Aigul Dautbayeva, Yernar Zholdassov, Rustam Abdrakhmanov. Computers, Materials & Continua (indexed in SCIE, EI), 2023, Issue 3, pp. 5625-5640 (16 pages)
Automatic identification of cyberbullying is a problem that is gaining traction, especially in the Machine Learning areas. Not only is it complicated, but it has also become a pressing necessity, considering how social media has become an integral part of adolescents’ lives and how serious the impacts of cyberbullying and online harassment can be, particularly among teenagers. This paper contains a systematic literature review of modern strategies, machine learning methods, and technical means for detecting cyberbullying and the aggressive command of an individual in the information space of the Internet. We undertake an in-depth review of 13 papers from four scientific databases. The article provides an overview of scientific literature to analyze the problem of cyberbullying detection from the point of view of machine learning and natural language processing. In this review, we consider a cyberbullying detection framework on social media platforms, which includes data collection, data processing, feature selection, feature extraction, and the application of machine learning to classify whether texts contain cyberbullying or not. This article seeks to guide future research on this topic toward a more consistent perspective with the phenomenon’s description and depiction, allowing future solutions to be more practical and effective.
Keywords: cyberbullying; hate speech; digital drama; online harassment; detection; classification; machine learning; NLP
ENDPOINT DETECTOR OF NOISY SPEECH SIGNAL USING A RECURRENT NEURAL NETWORK
15
Authors: Wei Xiaodong (韦晓东), Hu Guangrui (胡光锐). Journal of Shanghai Jiaotong University (Science) (indexed in EI), 1999, Issue 1, pp. 60-63 (4 pages)
Introduction: Endpoint detection of speech signal is important in many areas of speech processing technology, such as speech enhancement, speech r...
Keywords: speech endpoint detection; recurrent neural network (RNN); immunity learning
Automated Speech Recognition System to Detect Babies’ Feelings through Feature Analysis
16
Authors: Sana Yasin, Umar Draz, Tariq Ali, Kashaf Shahid, Amna Abid, Rukhsana Bibi, Muhammad Irfan, Mohammed A. Huneif, Sultan A. Almedhesh, Seham M. Alqahtani, Alqahtani Abdulwahab, Mohammed Jamaan Alzahrani, Dhafer Batti Alshehri, Alshehri Ali Abdullah, Saifur Rahman. Computers, Materials & Continua (indexed in SCIE, EI), 2022, Issue 11, pp. 4349-4367 (19 pages)
Diagnosing a baby’s feelings poses a challenge for both doctors and parents because babies cannot explain their feelings through expression or speech. Understanding the emotions of babies and their associated expressions during different sensations such as hunger, pain, etc., is a complicated task. In infancy, all communication and feelings are propagated through cry-speech, which is a natural phenomenon. Several clinical methods can be used to diagnose a baby’s diseases, but nonclinical methods of diagnosing a baby’s feelings are lacking. As such, in this study, we aimed to identify babies’ feelings and emotions through their cry using a nonclinical method. Changes in the cry sound can be identified using our method and used to assess the baby’s feelings. We considered the frequency of the cries from the energy of the sound. The feelings represented by the infant’s cry are judged to represent certain sensations expressed by the child using the optimal frequency of the recognition of a real-world audio sound. We used machine learning and artificial intelligence to distinguish cry tones in real time through feature analysis. The experimental group consisted of 50% each of male and female babies, and we determined the relevancy of the results against different parameters. This application produced real-time results after recognizing a child’s cry sounds. The novelty of our work is that we, for the first time, successfully derived the feelings of young children through the cry-speech of the child, showing promise for end-user applications.
Keywords: cry-to-speak; machine learning; artificial intelligence; cry speech detection; babies
HybridGAD: Identification of AI-Generated Radiology Abstracts Based on a Novel Hybrid Model with Attention Mechanism
17
Authors: Tugba Çelikten, Aytug Onan. Computers, Materials & Continua (indexed in SCIE, EI), 2024, Issue 8, pp. 3351-3377 (27 pages)
Class Title: Radiological imaging method a comprehensive overview purpose. This GPT paper provides an overview of the different forms of radiological imaging and the potential diagnosis capabilities they offer as well as recent advances in the field. Materials and Methods: This paper provides an overview of conventional radiography, digital radiography, panoramic radiography, computed tomography and cone-beam computed tomography. Additionally, recent advances in radiological imaging are discussed, such as imaging diagnosis and modern computer-aided diagnosis systems. Results: This paper details the differences between the imaging techniques, the benefits of each and the current advances in the field to aid in the diagnosis of medical conditions. Conclusion: Radiological imaging is an extremely important tool in modern medicine to assist in medical diagnosis. This work provides an overview of the types of imaging techniques used, the recent advances made and their potential applications.
Keywords: generative artificial intelligence; AI-generated text detection; attention mechanism; hybrid model for text classification
Glide landmark detection using band-limited energy ratio contours
18
Authors: Soojin Park, Jeungyoon Choi, Honggoo Kang. Journal of Measurement Science and Instrumentation (indexed in CAS), 2012, Issue 4, pp. 352-356 (5 pages)
A detection system for American English glides /w y r l/ in a knowledge-based automatic speech recognition system is presented. The method uses detection of dips in band-limited energy to total energy ratios, instead of detecting dips along the unmodified band-limited energy contours. By using the band-limited energy ratio, the dip detection is applicable not only in intervocalic regions but also in non-intervocalic regions. A Gaussian mixture model (GMM) based classifier is then used to separate the detected vowels and nasals. This approach is tested using the TIMIT corpus and results in an overall detection rate of 69.5%, which is a 4.7% absolute increase in detection rate compared with a hidden Markov model (HMM) based phone recognizer.
Keywords: landmarks; glide detection; knowledge-based speech recognition
A Highly Accurate Dysphonia Detection System Using Linear Discriminant Analysis
19
Authors: Anas Basalamah, Mahedi Hasan, Shovan Bhowmik, Shaikh Akib Shahriyar. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 3, pp. 1921-1938 (18 pages)
The recognition of pathological voice is considered a difficult task for speech analysis. Moreover, otolaryngologists have needed to rely on oral communication with patients to discover traces of voice pathologies like dysphonia, which are caused by voice alteration of the vocal folds, and their accuracy is between 60% and 70%. To enhance detection accuracy and reduce the processing time of dysphonia detection, a novel approach is proposed in this paper. We have leveraged Linear Discriminant Analysis (LDA) to train multiple Machine Learning (ML) models for dysphonia detection. Several ML models are utilized, such as Support Vector Machine (SVM), Logistic Regression, and K-nearest neighbor (K-NN), to predict the voice pathologies based on features like Mel-Frequency Cepstral Coefficients (MFCC), Fundamental Frequency (F0), Shimmer (%), Jitter (%), and Harmonic to Noise Ratio (HNR). The experiments were performed using the Saarbrucken Voice Database (SVD) and a privately collected dataset. The K-fold cross-validation approach was incorporated to increase the robustness and stability of the ML models. According to the experimental results, our proposed approach has a 70% increase in processing speed over Principal Component Analysis (PCA) and performs remarkably well, with a recognition accuracy of 95.24% on the SVD dataset, surpassing the previous best accuracy of 82.37%. In the case of the private dataset, our proposed method achieved an accuracy rate of 93.37%. It can be an effective non-invasive method to detect dysphonia.
Keywords: Dimensionality reduction dysphonia detection linear discriminant analysis logistic regression speech feature extraction support vector machine
Audiovisual synchrony detection for fluent speech in early childhood:An eye-tracking study
20
Authors: Han-yu Zhou, Han-xue Yang, Zhen Wei, Guo-bin Wan, Simon S.Y. Lui, Raymond C.K. Chan. PsyCh Journal, 2022, No. 3, pp. 409-418 (10 pages)
During childhood, the ability to detect audiovisual synchrony gradually sharpens for simple stimuli such as flash-beeps and single syllables. However, little is known about how children perceive synchrony for natural and continuous speech. This study investigated young children's gaze patterns while they were watching movies of two identical speakers telling stories side by side. Only one speaker's lip movements matched the voices, and the other one either led or lagged behind the soundtrack by 600 ms. Children aged 3–6 years (n = 94, 52.13% males) showed an overall preference for the synchronous speaker, with no age-related changes in synchrony-detection sensitivity, as indicated by similar gaze patterns across ages. However, viewing time to the synchronous speech was significantly longer in the auditory-leading (AL) condition compared with that in the visual-leading (VL) condition, suggesting asymmetric sensitivities for AL versus VL asynchrony have already been established in early childhood. When further examining gaze patterns on dynamic faces, we found that more attention focused on the mouth region was an adaptive strategy to read visual speech signals and thus associated with increased viewing time of the synchronous videos. Attention to detail, one dimension of autistic traits featured by local processing, has been found to be correlated with worse performances in speech synchrony processing. These findings extended previous research by showing the development of speech synchrony perception in young children, and may have implications for clinical populations (e.g., autism) with impaired multisensory integration.
Keywords: audiovisual autistic traits eye-tracking speech synchrony detection