Journal literature
42 articles found
Detection of Abnormal Cardiac Rhythms Using Feature Fusion Technique with Heart Sound Spectrograms
1
Authors: Saif Ur Rehman Khan, Zia Khan. Journal of Bionic Engineering, 2025, Issue 4, pp. 2030-2049 (20 pages)
A heart attack disrupts the normal flow of blood to the heart muscle, potentially causing severe damage or death if not treated promptly. It can lead to long-term health complications, reduce quality of life, and significantly impact daily activities and overall well-being. Despite the growing popularity of deep learning, several drawbacks persist, such as complexity and the limitation of single-model learning. In this paper, we introduce a residual learning-based feature fusion technique to achieve high accuracy in differentiating abnormal cardiac rhythms from heart sounds. Combining MobileNet with DenseNet201 for feature fusion leverages MobileNet's lightweight, efficient architecture together with DenseNet201's dense connections, resulting in enhanced feature extraction and improved model performance with reduced computational cost. To further enhance the fusion, we employed residual learning to optimize the hierarchical features of abnormal heart sounds during training. The experimental results demonstrate that the proposed fusion method achieved an accuracy of 95.67% on the benchmark PhysioNet-2016 Spectrogram dataset. To further validate the performance, we applied it to the BreakHis dataset with a magnification level of 100X. The results indicate that the model maintains robust performance on the second dataset, achieving an accuracy of 96.55%. This highlights its consistent performance, making it suitable for various applications.
Keywords: Cardiac rhythms; Feature fusion; Residual learning; BreakHis; Spectrogram sound
Health Monitoring of Milling Tool Inserts Using CNN Architectures Trained by Vibration Spectrograms (Cited by: 2)
2
Authors: Sonali S. Patil, Sujit S. Pardeshi, Abhishek D. Patange. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, Issue 7, pp. 177-199 (23 pages)
In-process damage to a cutting tool degrades the surface finish of the job shaped by machining and causes a significant financial loss. This stimulates the need for Tool Condition Monitoring (TCM) to assist detection of failure before it extends to the worse phase. Machine Learning (ML) based TCM has been extensively explored in the last decade. However, most of the research is now directed toward Deep Learning (DL). The "Deep" formulation, hierarchical compositionality, distributed representation and end-to-end learning of Neural Nets need to be explored to create a generalized TCM framework that performs efficiently in a high-noise environment of cross-domain machining. With this motivation, the design of different CNN (Convolutional Neural Network) architectures such as AlexNet, ResNet-50, LeNet-5, and VGG-16 is presented in this paper. Real-time spindle vibrations corresponding to healthy and various faulty configurations of the milling cutter were acquired. This data was transformed into the time-frequency domain and further processed by the proposed architectures in graphical form, i.e., as spectrograms. The model is trained, tested, and validated on different datasets and shows promising results.
Keywords: Milling tool inserts; health monitoring; vibration spectrograms; deep learning; convolutional neural network
Continuous frequency and phase spectrograms: a study of their 2D and 3D capabilities and application to musical signal analysis (Cited by: 1)
3
Authors: Laurent NAVARRO, Guy COURBEBAISSE, Jean-Charles PINOLI. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2008, Issue 2, pp. 199-206 (8 pages)
A new perspective on and extension of the phase spectrogram (PS) and frequency spectrogram (FS) is presented in this paper. These representations result from the coupling of the power spectrogram and the short-time Fourier transform (STFT). The main contribution is the construction of the 3D phase spectrogram (3DPS) and the 3D frequency spectrogram (3DFS). These new tools allow such specific test signals as a small-slope linear chirp, a phase jump and a small frequency jump to be analyzed. An application case of musical signal analysis is reported. The main objective is to detect small frequency and phase variations in order to characterize each type of sound attack without losing the amplitude information given by the power spectrogram.
Keywords: Frequency spectrogram (FS); Phase spectrogram (PS); Time-frequency representations; Musical signals
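The abstract above builds on the STFT, from which a power spectrogram (squared magnitude) and a phase spectrogram (argument of each complex bin) can both be read. A minimal sketch using only the Python standard library follows; the function names, the Hann window, the frame length and the test tone are illustrative assumptions, not the authors' 3DPS/3DFS construction.

```python
import cmath
import math

def stft(signal, frame_len, hop):
    """Return a list of frames; each frame holds complex DFT bins 0..frame_len//2."""
    # Periodic Hann window (an assumed choice) to reduce leakage at frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed segment (O(N^2), fine for a small sketch)
            bins.append(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                            for n in range(frame_len)))
        frames.append(bins)
    return frames

def power_and_phase(frames):
    """Split STFT frames into a power spectrogram and a phase spectrogram."""
    power = [[abs(c) ** 2 for c in frame] for frame in frames]
    phase = [[cmath.phase(c) for c in frame] for frame in frames]
    return power, phase

# Hypothetical test signal: an 8 Hz tone sampled at 64 Hz
fs = 64
tone = [math.sin(2 * math.pi * 8 * n / fs) for n in range(128)]
frames = stft(tone, frame_len=32, hop=16)
power, phase = power_and_phase(frames)
# With 32-sample frames at 64 Hz, each bin spans 2 Hz
peak_bin = max(range(len(power[0])), key=lambda k: power[0][k])
```

The phase values lie in [-pi, pi]; tracking how they change from frame to frame is one way small frequency or phase jumps become visible while the power values keep the amplitude information.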
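Research on Fan Anomaly Recognition with MobileNetV3 Based on an Improved CBAM Attention Mechanism
4
Authors: 刘明, 王荣燕, 王汝旭, 武高旭, 张佳宁, 梁俊祥. 《工业控制计算机》, 2025, Issue 3, pp. 90-92 (3 pages)
Industrial fans play a vital role in production facilities, and the sudden shutdown of a critical fan has a major impact on production safety. Sounds emitted by faulty fans in a -6 dB noise environment are analyzed, spectrograms of the sound samples are extracted, and the MobileNetV3 model is adopted. To address the low degree of parameterization of the model's SE (Squeeze-and-Excitation) attention module, the module is replaced with a Convolutional Block Attention Module (CBAM) optimized with dilated convolution, yielding an improved MobileNetV3 model. Experimental results show that the model reaches a classification accuracy of 98%, an improvement of 2.07 percentage points over the original MobileNetV3 model.
Keywords: dilated convolution; CBAM; MobileNetV3; transfer learning; spectrogram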
Cardiovascular Sound Classification Using Neural Architectures and Deep Learning for Advancing Cardiac Wellness
5
Authors: Deepak Mahto, Sudhakar Kumar, Sunil K. Singh, Amit Chhabra, Irfan Ahmad Khan, Varsha Arya, Wadee Alhalabi, Brij B. Gupta, Bassma Saleh Alsulami. Computer Modeling in Engineering & Sciences, 2025, Issue 6, pp. 3743-3767 (25 pages)
Cardiovascular diseases (CVDs) remain one of the foremost causes of death globally; hence the need for advanced automated diagnostic solutions for early detection and intervention. Traditional auscultation of cardiovascular sounds is heavily reliant on clinical expertise and subject to high variability. To counter this limitation, this study proposes an AI-driven classification system for cardiovascular sounds in which deep learning techniques automate the detection of an abnormal heartbeat. We employ FastAI vision-learner-based convolutional neural networks (CNNs), including ResNet, DenseNet, VGG, ConvNeXt, SqueezeNet, and AlexNet, to classify heart sound recordings. Instead of raw waveform analysis, the proposed approach transforms preprocessed cardiovascular audio signals into spectrograms, which are suited for capturing temporal and frequency-wise patterns. The models are trained on the PASCAL Cardiovascular Challenge dataset while taking into consideration recording variations, noise levels, and acoustic distortions. To demonstrate generalization, external validation was performed using Google's AudioSet Heartbeat Sound dataset, a dataset rich in cardiovascular sounds. Comparative analysis revealed that DenseNet-201, ConvNeXt Large, and ResNet-152 delivered superior performance to the other architectures, achieving an accuracy of 81.50%, a precision of 85.50%, and an F1-score of 84.50%. We also performed statistical significance testing, such as the Wilcoxon signed-rank test, to validate performance improvements over traditional classification methods. Beyond the technical contributions, the research underscores clinical integration, outlining a pathway in which the proposed system can augment conventional electronic stethoscopes and telemedicine platforms in AI-assisted diagnostic workflows. We also discuss in detail issues of computational efficiency, model interpretability, and ethical considerations, particularly concerning algorithmic bias stemming from imbalanced datasets and the need for real-time processing in clinical settings. The study describes a scalable, automated system combining deep learning, feature extraction using spectrograms, and external validation that can assist healthcare providers in the early and accurate detection of cardiovascular disease. AI-driven solutions can be viable in improving access, reducing delays in diagnosis, and ultimately easing the continued global burden of heart disease.
Keywords: Healthy society; cardiovascular system; spectrogram; FastAI; audio signals; computer vision; neural network
Evaluation of cabin noise of various subway systems
6
Authors: Hsiao Mun Lee, Heow Pueh Lee. Acta Mechanica Sinica, 2025, Issue 7, pp. 319-331 (13 pages)
This study examines the variations in noise levels across various subway lines in Singapore and three other cities, and provides a detailed overview of the trends and factors influencing subway noise. Most of the equivalent sound pressure levels (Leq) in typical subway cabins across the Singapore subway lines are below 85 dBA, with some notable exceptions. These variations in noise levels are influenced by several factors, including rolling stock structure, track conditions, and environmental and aerodynamic factors. The spectrogram analysis indicates that the cabin noise is mostly concentrated below 1,000 Hz. This study also analyzes cabin noise in subway systems in Suzhou, Seoul, and Tokyo to allow for broader comparisons. It studies the impact on cabin noise of factors such as stock materials, track conditions (including the quality of the rails and the presence of curves or irregularities), and maintenance frequency.
Keywords: Cabin noise; subway; spectrogram
Omnidirectional Human Behavior Recognition Method Based on Frequency-Modulated Continuous-Wave Radar
7
Authors: SUN Chang, WANG Shaohong, LIN Yanping. Journal of Shanghai Jiaotong University (Science), 2025, Issue 4, pp. 637-645 (9 pages)
Frequency-modulated continuous-wave radar enables the non-contact and privacy-preserving recognition of human behavior. However, the accuracy of behavior recognition is directly influenced by the spatial relationship between human posture and the radar. To address the issue of low accuracy in behavior recognition when the human body is not directly facing the radar, a method combining local outlier factor with Doppler information is proposed for the correction of multi-classifier recognition results. Initially, information such as the distance, velocity, and micro-Doppler spectrogram of the target is obtained using the fast Fourier transform and histogram of oriented gradients-support vector machine methods, followed by preliminary recognition. Subsequently, Platt scaling is employed to transform recognition results into confidence scores. Finally, the Doppler-local outlier factor method is utilized to calibrate the confidence scores, and the result of the classifier with the highest confidence is taken as the recognition outcome. Experimental results demonstrate that this approach achieves an average recognition accuracy of 96.23% for comprehensive human behavior recognition in various orientations.
Keywords: frequency-modulated continuous-wave radar; omnidirectional human behavior recognition; histogram of oriented gradients; support vector machine; micro-Doppler spectrogram; Doppler-local outlier factor
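The Platt scaling step mentioned in the abstract above maps a raw classifier score s to a confidence through a sigmoid, 1 / (1 + exp(A*s + B)), where A and B are normally fitted on held-out data. The sketch below only applies the mapping and then selects the most confident classifier; the labels, scores and (A, B) values are hypothetical placeholders, and the Doppler-local outlier factor calibration from the paper is not reproduced.

```python
import math

def platt_confidence(score, a, b):
    """Map a raw decision score to a probability-like confidence (Platt sigmoid)."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def pick_most_confident(decisions):
    """decisions: list of (label, raw_score, a, b); return (label, confidence)."""
    scored = [(label, platt_confidence(score, a, b))
              for label, score, a, b in decisions]
    return max(scored, key=lambda item: item[1])

# Hypothetical outputs of two SVM classifiers; a negative A makes a larger
# score yield a higher confidence.
decisions = [
    ("walk", 2.0, -1.0, 0.0),
    ("sit", 0.5, -2.0, 0.0),
]
best_label, best_conf = pick_most_confident(decisions)
```

Putting each classifier's output on the same 0-to-1 scale is what makes a "highest confidence wins" rule meaningful across classifiers whose raw scores are not directly comparable.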
Auditory attention model based on Chirplet for cross-corpus speech emotion recognition (Cited by: 1)
8
Authors: 张昕然, 宋鹏, 查诚, 陶华伟, 赵力. Journal of Southeast University (English Edition) (EI, CAS), 2016, Issue 4, pp. 402-407 (6 pages)
To solve the problem of mismatching features in an experimental database, which is a key issue in the field of cross-corpus speech emotion recognition, an auditory attention model based on Chirplet is proposed for feature extraction. First, in order to extract the spectral features, the auditory attention model is employed for detecting variational emotion features. Then, the selective attention mechanism model is proposed to extract the salient gist features, which show their relation to the expected performance in cross-corpus testing. Furthermore, the Chirplet time-frequency atoms are introduced into the model. By forming a complete atom database, the Chirplet can improve spectrum feature extraction, including the amount of information. Samples from multiple databases have the characteristics of multiple components. Hereby, the Chirplet expands the scale of the feature vector in the time-frequency domain. Experimental results show that, compared to the traditional feature model, the proposed feature extraction approach with the prototypical classifier achieves a significant improvement in cross-corpus speech recognition. In addition, the proposed method is more robust to inconsistent sources of the training set and the testing set.
Keywords: speech emotion recognition; selective attention mechanism; spectrogram feature; cross-corpus
Single-Channel Speech Enhancement Using Critical-Band Rate Scale Based Improved Multi-Band Spectral Subtraction (Cited by: 1)
9
Authors: Navneet Upadhyay, Abhijit Karmakar. Journal of Signal and Information Processing, 2013, Issue 3, pp. 314-326 (13 pages)
This paper addresses the problem of single-channel speech enhancement in adverse environments. Improved multi-band spectral subtraction based on the critical-band rate scale is investigated in this study for the enhancement of single-channel speech. In this work, the whole speech spectrum is divided into non-uniformly spaced frequency bands in accordance with the critical-band rate scale of the psycho-acoustic model, and the spectral over-subtraction is carried out separately in each band. In addition, for the estimation of the noise in each band, an adaptive noise estimation approach is used that does not require explicit speech silence detection. The noise is estimated and updated by adaptively smoothing the noisy signal power in each band. The smoothing parameter is controlled by the a-posteriori signal-to-noise ratio (SNR). For the performance analysis of the proposed algorithm, objective measures such as SNR, segmental SNR, and perceptual evaluation of speech quality are computed for a variety of noises at different SNR levels. The speech spectrograms and objective evaluations of the proposed algorithm are compared with those of other standard speech enhancement algorithms and show that the musical structure of the remnant noise and the background noise are better suppressed by the proposed algorithm.
Keywords: single-channel speech enhancement; critical-band rate scale; spectral over-subtraction; adaptive noise estimation; objective measure; speech spectrograms
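The per-band over-subtraction and smoothed noise update described in the abstract above can be sketched as follows. This is a generic spectral over-subtraction rule under stated assumptions: the over-subtraction factor `alpha`, the spectral floor `beta` (taken here as a fraction of the noisy power) and a fixed `smoothing` value are illustrative. The paper instead drives the smoothing parameter from the a-posteriori SNR, and that mapping is not reproduced here.

```python
def update_noise(prev_noise, noisy_power, smoothing):
    """Recursively smooth the noisy power in each band to track the noise level."""
    return [smoothing * p + (1.0 - smoothing) * y
            for p, y in zip(prev_noise, noisy_power)]

def over_subtract(noisy_power, noise_power, alpha, beta):
    """Subtract alpha times the noise estimate in each band, with a spectral floor."""
    # The floor beta * noisy power keeps the result positive and limits the
    # isolated spectral peaks that are heard as "musical" noise.
    return [max(y - alpha * n, beta * y)
            for y, n in zip(noisy_power, noise_power)]

# Hypothetical band powers for one frame with two bands
noisy = [10.0, 1.0]
noise = [2.0, 2.0]
enhanced = over_subtract(noisy, noise, alpha=2.0, beta=0.1)
tracked = update_noise([2.0], [4.0], smoothing=0.5)
```

In a multi-band scheme each band would carry its own `alpha`, so bands with a low SNR are subtracted more aggressively than bands dominated by speech.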
Visual classification of feral cat Felis silvestris catus vocalizations
10
Authors: Jessica L. OWENS, Mariana OLSEN, Amy FONTAINE, Christopher KLOTH, Arik KERSHENBAUM, Sara WALLER. Current Zoology (SCIE, CAS, CSCD), 2017, Issue 3, pp. 331-339 (9 pages)
Cat vocal behavior, in particular the vocal and social behavior of feral cats, is poorly understood, as are the differences between feral and fully domestic cats. The relationship between feral cat social and vocal behavior is important because of the markedly different ecology of feral and domestic cats, and enhanced comprehension of the repertoire and potential information content of feral cat calls can provide both a better understanding of the domestication and socialization process and improved welfare for feral cats undergoing adoption. Previous studies have used conflicting classification schemes for cat vocalizations, often relying on onomatopoeic or popular descriptions of call types (e.g., "miow"). We studied the vocalizations of 13 unaltered domestic cats that complied with our behavioral definition used to distinguish feral cats from domestic. A total of 71 acoustic units were extracted and visually analyzed for the construction of a hierarchical classification of vocal sounds, based on acoustic properties. We identified 3 major categories (tonal, pulse, and broadband) that further break down into 8 subcategories, and show a high degree of reliability when sounds are classified blindly by independent observers (Fleiss' Kappa K = 0.863). Due to the limited behavioral contexts in this study, additional subcategories of cat vocalizations may be identified in the future, but our hierarchical classification system allows for the addition of new categories and new subcategories as they are described. This study shows that cat vocalizations are diverse and complex, and provides an objective and reliable classification system that can be used in future studies.
Keywords: bioacoustics; communication; socialization; spectrograms
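The inter-observer reliability figure quoted in the abstract above is a Fleiss' kappa, which compares the observed agreement among several raters with the agreement expected by chance. The sketch below implements the standard formula; the rating table is made up for illustration and is not the study's data.

```python
def fleiss_kappa(counts):
    """counts[i][j] = number of raters who put item i in category j.

    Every row must sum to the same number of raters n (with n >= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Overall proportion of all assignments that fall in each category
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Agreement on each item: fraction of rater pairs that agree
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_observed = sum(p_item) / n_items
    p_chance = sum(p * p for p in p_cat)
    return (p_observed - p_chance) / (1.0 - p_chance)

# Hypothetical table: 4 sounds, 3 observers, 2 call categories
ratings = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(ratings)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the reported 0.863 indicates that the observers classified the sounds very consistently.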
A Deep CNN-LSTM-Based Feature Extraction for Cyber-Physical System Monitoring
11
Authors: Alaa Omran Almagrabi. Computers, Materials & Continua (SCIE, EI), 2023, Issue 8, pp. 2079-2093 (15 pages)
A potential concept that could be effective for multiple applications is a "cyber-physical system" (CPS). The Internet of Things (IoT) has evolved as a research area, presenting new challenges in obtaining valuable data through environmental monitoring. The existing work solely focuses on classifying the audio system of CPS without utilizing feature extraction. This study employs a deep learning method, CNN-LSTM, and two-way feature extraction to classify audio systems within CPS. The primary objective of this system, which is built upon a convolutional neural network (CNN) with Long Short-Term Memory (LSTM), is to analyze the vocalization patterns of two different species of anurans. It has been demonstrated that CNNs, when combined with mel-spectrograms for sound analysis, are suitable for classifying ambient noises. Initially, the data is augmented and preprocessed. Next, the mel-spectrogram features are extracted through two-way feature extraction. First, Principal Component Analysis (PCA) is utilized for dimensionality reduction, followed by transfer learning for audio feature extraction. Finally, the classification is performed using the CNN-LSTM process. This methodology can potentially be employed for categorizing various biological acoustic objects and analyzing biodiversity indexes in natural environments, resulting in high classification accuracy. The study highlights that this CNN-LSTM approach enables cost-effective and resource-efficient monitoring of large natural regions. The dissemination of updated CNN-LSTM models across distant IoT nodes is facilitated flexibly and dynamically through the utilization of CPS.
Keywords: Cyber-physical system; internet of things; feature extraction; classification; CNN; principal component analysis; mel spectrograms; monitoring; deep learning
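A mel-spectrogram, as used in the abstract above, regroups the linear frequency bins of a spectrogram into bands spaced evenly on the mel scale. The sketch below shows only that frequency mapping, using the common 2595 * log10(1 + f/700) convention; the band count and frequency range are illustrative assumptions, and nothing here is taken from the paper's pipeline.

```python
import math

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (2595 * log10(1 + f/700) convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers(f_min, f_max, n_bands):
    """Center frequencies in Hz of n_bands bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]

# Hypothetical setup: 10 bands between 0 Hz and 8 kHz
centers = mel_band_centers(0.0, 8000.0, 10)
```

The centers crowd together at low frequencies and spread out at high ones, which is why mel features resolve the low-frequency structure of animal calls and ambient sounds more finely than a linear frequency axis.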
Research and Modeling of Nonlinear Acoustic Processes in a Layered Nonlinear Medium with a Porous Fluid-Saturated Inclusion of a Hierarchical Type
12
Authors: Olga Hachay, Veniamin Dryagin, Andrey Khachay. Open Journal of Geology, 2019, Issue 9, pp. 497-506 (10 pages)
Problem statement: The results of a study of seismoacoustic emission arising in a porous two-phase geological environment under acoustic influence are presented. Acoustic emission arising in reservoirs of oil fields is considered using well observations. The regularity of the acoustic emission processes, which manifests itself in the form of discrete signal spectra similar to the oscillations of nonlinearly coupled oscillators, is shown. The spectra have special characteristics for each type of rock. Applied method and design: An algorithm is developed for modeling the resonant acoustic response of a porous fluid-saturated reservoir with hierarchical structure and plastic properties to acoustic frequency excitation. The algorithm is developed as an iterative process for solving integral and integro-differential equations. The frequencies that serve as parameters of the direct problem are taken from the spectra of observed acoustic emission data in oil wells. Typical results: For the first time, a relation was found between the resonant frequencies of the acoustic emission and plastic properties; these frequency values were used in the algorithm for modeling the distribution of longitudinal waves in a fluid-saturated nonlinear plastic environment. Concluding note (Practical value/implications): The analysis of these emission processes can serve as a source of information about the filtration-capacitive properties of productive reservoirs of a porous type with a hierarchical structure. Practical data from oil fields of Western Siberia are used.
Keywords: seismic emission after acoustic impact; reservoir of hierarchical structure with plastic properties; algorithm of 2D modeling; connected with energy spectrograms
Acquisition of pronunciation of consonant clusters by Arabic speakers of English as a second language
13
Authors: Seetha Jayaraman. Sino-US English Teaching, 2010, Issue 1, pp. 46-56 (11 pages)
The study is based on an observation of the pronunciation of a group of undergraduate students of English as a Second Language (ESL) whose mother tongue is Arabic and who have no formal training in the spoken variety of English other than that received in the classroom. The study of the acquisition of pronunciation of consonant clusters at the morphological, and particularly the morphophonological, level indicates that the learners are sensitive to the syllabic structure, viz., the cccv type and the cccvcc type, at the word-initial, medial and final positions. Samples of words with different consonant clusters were tested with a homogeneous group of students. Words of identical morphological categories were used as the data to test the students' level of perception. These were analyzed using Speech Analyzer Version 2.5. The data include consonant clusters such as plosive-fricative, plosive-plosive, fricative-fricative and plosive-fricative-trill/liquid combinations. The results varied according to the perceptual and articulatory abilities of the learners. It was observed that the perception and acquisition of three-consonant clusters, of plosive-plosive combinations in word-initial and word-final positions, and of the plosive-fricative type posed more difficulty for the learners. The tendency to drop one of the consonants of the cluster was more pronounced with syllables ending in plural morphemes and those ending in -mp, -pt, -kt, -nt, -bt, etc. Difficulty was also noticed with the plosive+/r/ and plosive+/l/ combinations, especially in word-initial positions. Across syllable boundaries, these clusters are almost inaudible with some speakers. The difficulty in the articulation of these consonant clusters can be accounted for by mother-tongue influence, as in the case of many other features. The results of the analysis have a pedagogical implication for the use of such words with consonant clusters to teach reading skills to undergraduate students in the present setting and to promote self-learning through the use of speech tools.
Keywords: consonant clusters; acoustic parameters; duration; intelligibility; spectrograms
Research on data diagnosis method of acoustic array sensor device based on spectrogram (Cited by: 4)
14
Authors: Xing Lei, Hang Ji, Qiang Xu, Ting Ye, Shengfu Zhang, Chengjun Huang. Global Energy Interconnection (EI, CAS, CSCD), 2022, Issue 4, pp. 418-433 (16 pages)
Acoustic array sensor devices for partial discharge detection are widely used in power equipment inspection, with the advantages of non-contact operation and precise positioning compared with partial discharge detection methods such as the ultrasonic method and the pulse current method. However, due to the sensitivity of the acoustic array sensor and the influence of interference at the equipment operation site, an acoustic array sensor device that diagnoses the partial discharge type from the phase resolved partial discharge (PRPD) map might occasionally present incorrect results, thus affecting the power equipment operation and maintenance strategy. The acoustic array sensor detection device for power equipment developed in this paper applies an equal-area multi-arm spiral array design with a machine learning fast Fourier transform clean (FFT-CLEAN) sound source localization and identification algorithm. This avoids the interference factors of noise acquisition systems that use a single microphone and a conventional beamforming algorithm, and improves the spatial resolution of the acoustic array sensor device. The paper also proposes a method, based on the acoustic spectrogram, for analyzing and diagnosing the discharge type with the acoustic array sensor device. The method can effectively reduce system misjudgment caused by factors such as the resolution of the acoustic imaging device and the time-domain pulse of the digital signal, and reduce the false alarm rate of the acoustic array sensor device. The proposed method is tested with power cables as the object, and its effectiveness is confirmed by laboratory and field verification.
Keywords: Acoustic array sensor device; Acoustic spectrogram; Partial discharge; Power equipment; False alarm rate
Deep Scalogram Representations for Acoustic Scene Classification (Cited by: 5)
15
Authors: Zhao Ren, Kun Qian, Zixing Zhang, Vedhas Pandit, Alice Baird, Bjorn Schuller. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, Issue 3, pp. 662-669 (8 pages)
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach presented firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
Keywords: Acoustic scene classification (ASC); (bidirectional) gated recurrent neural networks ((B) GRNNs); convolutional neural networks (CNNs); deep scalogram representation; spectrogram representation
Forensic Seismology and Boundary Element Method Application vis-à-vis ROKS Cheonan Underwater Explosion (Cited by: 2)
16
Authors: So Gu Kim. Journal of Marine Science and Application, 2013, Issue 4, pp. 422-433 (12 pages)
On March 26, 2010 an underwater explosion (UWE) led to the sinking of the ROKS Cheonan. The official Multinational Civilian-Military Joint Investigation Group (MCMJIG) report concluded that the cause of the underwater explosion was a 250 kg net explosive weight (NEW) detonation at a depth of 6-9 m from a DPRK "CHT-02D" torpedo. Kim and Gitterman (2012a) determined the NEW and seismic magnitude as 136 kg at a depth of approximately 8 m and 2.04, respectively, using basic hydrodynamics based on theoretical and experimental methods as well as spectral analysis and seismic methods. The purpose of this study was to clarify the cause of the UWE via more detailed methods using bubble dynamics and simulation of propellers as well as forensic seismology. Against the observed bubble pulse period of 0.990 s, periods of 0.976 s and 1.030 s were found for a 136 kg NEW at a detonation depth of 8 m using the boundary element method (BEM) and 3D bubble shape simulations derived for a 136 kg NEW detonation at a depth of 8 m approximately 5 m portside from the hull centerline. Here we show through analytical equations, models and 3D bubble shape simulations that the most probable cause of this underwater explosion was a 136 kg NEW detonation at a depth of 8 m attributable to a ROK littoral "land control" mine (LCM).
Keywords: cepstrum; spectrogram; bubble pulse; toroidal bubble; boundary element method; ICCP; forensic seismology; underwater explosion
Cavitation recognition of axial piston pumps in noisy environment based on Grad-CAM visualization technique 被引量:2
17
作者 Qun Chao Xiaoliang Wei +2 位作者 Jianfeng Tao Chengliang Liu Yuanhang Wang 《CAAI Transactions on Intelligence Technology》 SCIE EI 2023年第1期206-218,共13页
The cavitation in axial piston pumps threatens the reliability and safety of the overall hydraulic system.Vibration signal can reflect the cavitation conditions in axial piston pumps and it has been combined with mach... The cavitation in axial piston pumps threatens the reliability and safety of the overall hydraulic system.Vibration signal can reflect the cavitation conditions in axial piston pumps and it has been combined with machine learning to detect the pump cavitation.However,the vibration signal usually contains noise in real working conditions,which raises concerns about accurate recognition of cavitation in noisy environment.This paper presents an intelligent method to recognise the cavitation in axial piston pumps in noisy environment.First,we train a convolutional neural network(CNN)using the spectrogram images transformed from raw vibration data under different cavitation conditions.Second,we employ the technique of gradient-weighted class activation mapping(Grad-CAM)to visualise class-discriminative regions in the spectrogram image.Finally,we propose a novel image processing method based on Grad-CAM heatmap to automatically remove entrained noise and enhance class features in the spectrogram image.The experimental results show that the proposed method greatly improves the diagnostic performance of the CNN model in noisy environments.The classification accuracy of cavitation conditions increases from 0.50 to 0.89 and from 0.80 to 0.92 at signal-to-noise ratios of 4 and 6 dB,respectively. 展开更多
Keywords: axial piston pump; cavitation recognition; CNN; Grad-CAM; spectrogram image
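The abstract above reports accuracies at signal-to-noise ratios of 4 and 6 dB. A minimal sketch of how a test signal at a chosen SNR could be synthesised is given below. It assumes additive white Gaussian noise, which the abstract does not specify, and all function names are hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus zero-mean Gaussian noise scaled to the target SNR (dB)."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR (dB) actually realised in a noisy copy of clean."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Stand-in for a vibration record: a 50 Hz tone sampled at 10 kHz (illustrative only).
clean = [math.sin(2.0 * math.pi * 50.0 * n / 10_000.0) for n in range(20_000)]
noisy = add_white_noise(clean, snr_db=4.0)
print(f"realised SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

The realised SNR differs slightly from the target because the noise is a finite random sample.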
User Recognition System Based on Spectrogram Image Conversion Using EMG Signals (Cited by: 2)
18
Authors: Jae Myung Kim, Gyu Ho Choi, Min-Gu Kim, Sung Bum Pan. Computers, Materials & Continua (SCIE, EI), 2022, Issue 7, pp. 1213-1227 (15 pages)
Recently, user recognition methods to authenticate personal identity have attracted significant attention, especially with the increased availability of various internet of things (IoT) services through fifth-generation technology (5G) based mobile devices. EMG signals, which are generated inside the body and have unique individual characteristics, are being studied as part of next-generation user recognition methods. However, there is a limitation when applying EMG signals to user recognition systems, as the same operation needs to be repeated while maintaining a constant muscle strength over time. Hence, it is necessary to conduct research on multidimensional feature transformation that includes changes in frequency features over time. In this paper, we propose a user recognition system that applies the short-time Fourier transform (STFT) to EMG signals and converts the signals into EMG spectrogram images while adjusting the time-frequency resolution to extract multidimensional features. The proposed system is composed of a data pre-processing and normalization process, a spectrogram image conversion process, and a final classification process. The experimental results revealed that the proposed EMG spectrogram image-based user recognition system achieves an accuracy of 95.4%, which is 13% higher than that of the EMG signal-based system. This improvement in user recognition accuracy was achieved by using multidimensional features in the time-frequency domain.
Keywords: EMG; user recognition; spectrogram; CNN
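The time-frequency resolution trade-off mentioned in the abstract above can be illustrated with a small STFT sketch. It uses a plain DFT from the standard library with a Hann window; the window lengths, hop sizes, sampling rate, and test tone are all illustrative assumptions, not the settings used in the paper.

```python
import cmath
import math


def stft_magnitude(signal, window_len, hop):
    """Return one magnitude spectrum (bins 0..window_len//2) per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / window_len)
              for n in range(window_len)]
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_len)]
        spectrum = []
        for k in range(window_len // 2 + 1):
            # Direct DFT of the windowed segment at bin k.
            acc = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_len)
                      for n in range(window_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


fs = 1000.0  # assumed sampling rate in Hz
tone = [math.sin(2.0 * math.pi * 125.0 * n / fs) for n in range(512)]

# Short window: more time frames, coarser frequency bins (fs / 64 Hz each).
short = stft_magnitude(tone, window_len=64, hop=32)
# Long window: fewer time frames, finer frequency bins (fs / 128 Hz each).
long_ = stft_magnitude(tone, window_len=128, hop=64)

print(len(short), "frames at", fs / 64, "Hz per bin")
print(len(long_), "frames at", fs / 128, "Hz per bin")
```

Stacking the per-frame spectra column by column gives the spectrogram image; choosing the window length sets how the available resolution is split between time and frequency.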
Cancelable Speaker Identification System Based on Optical-Like Encryption Algorithms (Cited by: 1)
19
Authors: Safaa El-Gazar, Walid El-Shafai, Ghada El-Banby, Hesham F. A. Hamed, Gerges M. Salama, Mohammed Abd-Elnaby, Fathi E. Abd El-Samie. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 10, pp. 87-102 (16 pages)
Biometric authentication is a rapidly growing trend that has gained increasing attention in recent decades. It achieves safe access to systems using biometrics instead of traditional passwords. Using a biometric in its original format makes it usable only once. Therefore, a cancelable biometric template should be used, so that it can be replaced when it is attacked. Cancelable biometrics aims to enhance the security and privacy of biometric authentication. Digital encryption is an efficient technique for generating cancelable biometric templates. In this paper, a highly secure encryption algorithm is proposed to secure biometric data in verification systems. The biometric considered in this paper is the speech signal. The speech signal is transformed into its spectrogram. Then, the spectrogram is encrypted using two cascaded optical encryption algorithms. The first algorithm is Optical Scanning Holography (OSH), chosen for its efficiency as an encryption tool. The OSH-encrypted spectrogram is then encrypted using Double Random Phase Encoding (DRPE) with two Random Phase Masks (RPMs). After the two cascaded optical encryption algorithms, the cancelable template is obtained. Verification is implemented through correlation estimation between enrolled and test templates in their encrypted format. If the correlation value is larger than a threshold value, the user is authorized. The threshold value can be determined from the genuine and imposter correlation distribution curves as the midpoint between the two curves. The optical encryption is implemented in software rather than with an optical setup. The efficiency of the proposed cancelable biometric algorithm is illustrated by the simulation results. It can improve biometric data security without deteriorating the recognition accuracy. Simulation results give close-to-zero values for the Equal Error Rate (EER) and close-to-one values for the Area under Receiver Operator Characteristic (AROC) curve.
Keywords: cancelable biometrics; spectrogram; OSH; DRPE; EER; AROC
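The verification step described in the abstract above (correlate the enrolled and test templates, then compare against a threshold) can be sketched as follows. The sketch assumes Pearson correlation on flattened templates and takes the threshold as the midpoint between the mean genuine and mean imposter scores; both are one reading of the abstract, not the authors' confirmed procedure, and the data are made up for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def midpoint_threshold(genuine_scores, imposter_scores):
    """Midpoint between the mean genuine and mean imposter correlation scores."""
    mean_genuine = sum(genuine_scores) / len(genuine_scores)
    mean_imposter = sum(imposter_scores) / len(imposter_scores)
    return 0.5 * (mean_genuine + mean_imposter)


def is_authorized(enrolled, probe, threshold):
    """Accept the probe template if its correlation with the enrolled one exceeds the threshold."""
    return pearson(enrolled, probe) > threshold


# Illustrative, made-up correlation scores and flattened templates.
threshold = midpoint_threshold([0.95, 0.90, 0.97], [0.05, -0.10, 0.20])
enrolled = [0.1, 0.9, 0.4, 0.7, 0.2]
genuine_probe = [0.12, 0.88, 0.41, 0.69, 0.22]
imposter_probe = [0.9, 0.1, 0.6, 0.3, 0.8]
print(is_authorized(enrolled, genuine_probe, threshold))
print(is_authorized(enrolled, imposter_probe, threshold))
```

In a cancelable scheme the compared vectors would be the encrypted templates, so a compromised template can be revoked by changing the encryption keys or masks.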
Automatic Speaker Recognition Using Mel-Frequency Cepstral Coefficients Through Machine Learning (Cited by: 1)
20
Authors: Uğur Ayvaz, Hüseyin Gürüler, Faheem Khan, Naveed Ahmed, Taegkeun Whangbo, Abdusalomov Akmalbek Bobomirzaevich. Computers, Materials & Continua (SCIE, EI), 2022, Issue 6, pp. 5511-5521 (11 pages)
Automatic speaker recognition (ASR) systems belong to the field of human-machine interaction, and scientists have been using feature extraction and feature matching methods to analyze and synthesize voice signals. One of the most commonly used methods for feature extraction is Mel Frequency Cepstral Coefficients (MFCCs). Recent research shows that MFCCs process the voice signal with high accuracy. MFCCs represent a sequence of voice signal-specific features. This experimental analysis is proposed to distinguish Turkish speakers by extracting MFCCs from speech recordings. Since the human perception of sound is not linear, after the filterbank step in the MFCC method, we converted the obtained log filterbanks into decibel (dB) feature-based spectrograms without applying the Discrete Cosine Transform (DCT). A new dataset was created by converting each spectrogram into a 2-D array. Several learning algorithms were implemented with a 10-fold cross-validation method to detect the speaker. The highest accuracy of 90.2% was achieved using a Multi-layer Perceptron (MLP) with the tanh activation function. The most important output of this study is the inclusion of human voice as a new feature set.
Keywords: automatic speaker recognition; human voice recognition; spatial pattern recognition; MFCCs; spectrogram; machine learning; artificial intelligence
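The abstract above keeps the filterbank energies in decibels and skips the DCT. A minimal sketch of the two ingredients involved, the nonlinear mel frequency scale and the conversion of filterbank energies to dB, is shown below. The mel formula is the widely used 2595 * log10(1 + f/700) variant; the filter count, frequency range, and function names are illustrative assumptions rather than the paper's settings.

```python
import math


def hz_to_mel(freq_hz):
    """Map frequency in Hz to the mel scale (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min, f_max):
    """Centre frequencies (Hz) of filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def power_to_db(energies, floor=1e-10):
    """Convert filterbank energies to dB relative to the largest energy."""
    ref = max(max(energies), floor)
    return [10.0 * math.log10(max(e, floor) / ref) for e in energies]


centers = mel_center_frequencies(n_filters=10, f_min=0.0, f_max=8000.0)
print([round(c) for c in centers])
print(power_to_db([1.0, 0.1, 0.01]))
```

The centre frequencies crowd together at low frequencies and spread out at high ones, which reflects the nonlinear perception of pitch that motivates the mel filterbank.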