6 articles found
Audiovisual speech recognition based on a deep convolutional neural network (Cited by: 1)
1
Authors: Shashidhar Rudregowda, Sudarshan Patilkulkarni, Vinayakumar Ravi, Gururaj H.L., Moez Krichen. Data Science and Management, 2024, No. 1, pp. 25-34 (10 pages)
Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying using visual information, primarily lip movements. In this study, we created a custom dataset for Indian English and divided the task into three main categories: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using mel-frequency cepstral coefficients, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified using a long short-term memory recurrent neural network. Finally, integration was performed using a deep convolutional network. The audio speech of Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, using testing data from 200 epochs. The training accuracy for visual speech recognition using the Indian English dataset was 77.48% and the test accuracy was 76.19% using 60 epochs. After integration, the accuracies of audiovisual speech recognition using the Indian English dataset for training and testing were 94.67% and 91.75%, respectively.
Keywords: Audiovisual speech recognition; Custom dataset; 1D convolution neural network (CNN); Deep CNN (DCNN); Long short-term memory (LSTM); Lipreading; Dlib; Mel-frequency cepstral coefficient (MFCC)
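As a rough illustration of the one-dimensional convolution that the audio classifier in this abstract is built on, the sketch below slides a small kernel along a single sequence of values, such as one MFCC coefficient tracked over time. The function names, the kernel, and the input values are illustrative assumptions and are not taken from the paper.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` along `signal` without padding ("valid" mode).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    n, k = len(signal), len(kernel)
    if k == 0 or k > n:
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


# Hypothetical values of one feature across six audio frames.
frames = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0]
# Hypothetical kernel that responds to rising values.
rise_kernel = [-1.0, 0.0, 1.0]

feature_map = relu(conv1d_valid(frames, rise_kernel))
```

A trained network would learn many such kernels and stack several layers; only the sliding-window operation is shown here.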
An adaptive physics-informed deep learning method for pore pressure prediction using seismic data (Cited by: 6)
2
Authors: Xin Zhang, Yun-Hu Lu, Yan Jin, Mian Chen, Bo Zhou. Petroleum Science (indexed in SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 885-902 (18 pages)
Accurate prediction of formation pore pressure is essential for predicting fluid flow and managing hydrocarbon production in petroleum engineering. Deep learning techniques have recently attracted growing interest because of their potential for pore pressure prediction. However, most traditional deep learning models generalize poorly. To fill this technical gap, we developed a new adaptive physics-informed deep learning model with high generalization capability that predicts pore pressure values directly from seismic data. Specifically, the new model, named CGP-NN, consists of a novel parametric feature extraction approach (1DCPP), a stacked multilayer gated recurrent model (multilayer GRU), and an adaptive physics-informed loss function. Through training, the developed model can automatically select the optimal physical model to constrain the results for each pore pressure prediction. The CGP-NN model generalizes best when the physics-related metric λ = 0.5. A hybrid approach combining the Eaton and Bowers methods is also proposed to build machine-learnable labels, addressing the problem of scarce labels. To validate the developed model and methodology, a case study on a complex reservoir in the Tarim Basin was performed, demonstrating high accuracy in predicting the pore pressure of new wells along with strong generalization ability. The adaptive physics-informed deep learning approach presented here has potential application in the prediction of pore pressures coupled with multiple genesis mechanisms using seismic data.
Keywords: Pore pressure prediction; Seismic data; 1D convolution pyramid pooling; Adaptive physics-informed loss function; High generalization capability
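The abstract does not give the form of the adaptive physics-informed loss, so the following is only a sketch under stated assumptions: the loss is taken to be a λ-weighted blend of a data misfit and a physics misfit, and "selecting the optimal physical model" is approximated by keeping whichever candidate physical prediction agrees best with the network output. The Eaton and Bowers functions use the standard textbook velocity-based forms; all function names and numbers are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def eaton_pressure(overburden, hydrostatic, velocity, normal_velocity, exponent=3.0):
    """Eaton's velocity-based estimate: Pp = S - (S - Ph) * (V / Vn) ** n."""
    return overburden - (overburden - hydrostatic) * (velocity / normal_velocity) ** exponent


def bowers_pressure(overburden, velocity, v0, a, b):
    """Bowers' loading-curve estimate: V = v0 + a * sigma ** b, Pp = S - sigma."""
    if velocity < v0:
        raise ValueError("velocity must be at least v0")
    effective_stress = ((velocity - v0) / a) ** (1.0 / b)
    return overburden - effective_stress


def physics_informed_loss(pred, labels, physics_candidates, lam=0.5):
    """Assumed form: (1 - lam) * data misfit + lam * best-matching physics misfit."""
    data_term = mse(pred, labels)
    # Assumption: the "adaptive" step keeps the physical model closest to the prediction.
    physics_term = min(mse(pred, candidate) for candidate in physics_candidates)
    return (1.0 - lam) * data_term + lam * physics_term


# Hypothetical pressures in arbitrary consistent units.
pred = [1.0, 2.0]
labels = [1.0, 4.0]
candidates = [[1.0, 2.0], [0.0, 0.0]]  # e.g. Eaton-based and Bowers-based curves
loss = physics_informed_loss(pred, labels, candidates, lam=0.5)
```

With `lam=0.5`, the value reported in the abstract as giving the best generalization, the data term and the physics term contribute equally under this assumed form.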
Enhancing Human Action Recognition with Adaptive Hybrid Deep Attentive Networks and Archerfish Optimization
3
Authors: Ahmad Yahiya Ahmad Bani Ahmad, Jafar Alzubi, Sophers James, Vincent Omollo Nyangaresi, Chanthirasekaran Kutralakani, Anguraju Krishnan. Computers, Materials & Continua (indexed in SCIE, EI), 2024, No. 9, pp. 4791-4812 (22 pages)
In recent years, wearable-device-based Human Activity Recognition (HAR) models have received significant attention. Previously developed HAR models use hand-crafted features to recognize human activities, which leads to the extraction of only basic features. The images captured by wearable sensors contain advanced features, allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions. Poor lighting and limited sensor capabilities can degrade data quality, making the recognition of human actions a challenging task. Unimodal HAR approaches are not suitable in a real-time environment. Therefore, an updated HAR model is developed using multiple types of data and an advanced deep learning approach. First, the required signals and sensor data are collected from standard databases. From these signals, the wave features are retrieved. The extracted wave features and sensor data are then given as input to recognize the human activity. An Adaptive Hybrid Deep Attentive Network (AHDAN) is developed by combining a 1D Convolutional Neural Network (1DCNN) with a Gated Recurrent Unit (GRU) for the human activity recognition process. Additionally, the Enhanced Archerfish Hunting Optimizer (EAHO) is proposed to fine-tune the network parameters and enhance the recognition process. An experimental evaluation against various deep learning networks and heuristic algorithms is performed to confirm the effectiveness of the proposed HAR model. The EAHO-based HAR model outperforms traditional deep learning networks, with an accuracy of 95.36, a recall of 95.25, a specificity of 95.48, and a precision of 95.47. The results show that the developed model recognizes human actions effectively and in less time. Additionally, it reduces computational complexity and overfitting through the use of an optimization approach.
Keywords: Human action recognition; multi-modal sensor data and signals; adaptive hybrid deep attentive network; enhanced archerfish hunting optimizer; 1D convolutional neural network; gated recurrent units
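The four scores quoted in this abstract (accuracy, recall, specificity, precision) all derive from confusion-matrix counts. A minimal sketch for the binary case follows; for a multi-class activity recognizer the same counts would usually be taken per class (one-vs-rest) and averaged, which is an assumption here since the abstract does not say how the scores were aggregated. The labels below are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_scores(tp, fp, tn, fn):
    """Return accuracy, recall, specificity, and precision as fractions."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),         # share of real positives that were found
        "specificity": tn / (tn + fp),    # share of real negatives that were rejected
        "precision": tp / (tp + fp),      # share of predicted positives that were right
    }


# Hypothetical labels: 1 = "walking", 0 = any other activity.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

scores = classification_scores(*confusion_counts(y_true, y_pred))
```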
Automatic Classification of Swedish Metadata Using Dewey Decimal Classification: A Comparison of Approaches (Cited by: 2)
4
Authors: Koraljka Golub, Johan Hagelback, Anders Ardo. Journal of Data and Information Science (indexed in CSCD), 2020, No. 1, pp. 18-38 (21 pages)
Purpose: With more and more digital collections of various information resources becoming available, the challenge of assigning subject index terms and classes from quality knowledge organization systems is also increasing. While the ultimate purpose is to understand the value of automatically produced Dewey Decimal Classification (DDC) classes for Swedish digital collections, the paper aims to evaluate the performance of six machine learning algorithms as well as a string-matching algorithm based on characteristics of DDC.
Design/methodology/approach: State-of-the-art machine learning algorithms require at least 1,000 training examples per class. The complete data set at the time of research involved 143,838 records, which had to be reduced to the top three hierarchical levels of DDC in order to provide sufficient training data (totaling 802 classes in the training and testing sample, out of 14,413 classes at all levels).
Findings: Evaluation shows that Support Vector Machine with linear kernel outperforms the other machine learning algorithms as well as the string-matching algorithm on average; the string-matching algorithm outperforms machine learning for specific classes when characteristics of DDC are most suitable for the task. Word embeddings combined with different types of neural networks (simple linear network, standard neural network, 1D convolutional neural network, and recurrent neural network) produced worse results than Support Vector Machine but came close, with the benefit of a smaller representation size. Analysis of the impact of features shows that using keywords, or combining titles and keywords, gives better results than using only titles as input. Stemming only marginally improves the results. Removing stop-words reduced accuracy in most cases, while removing less frequent words increased it marginally. The greatest impact is produced by the number of training examples: 81.90% accuracy on the training set is achieved when at least 1,000 records per class are available in the training set, and 66.13% when too few records (often fewer than 100 per class) are available on which to train, and these figures hold only for the top 3 hierarchical levels (803 instead of 14,413 classes).
Research limitations: Having to reduce the number of hierarchical levels to the top three levels of DDC, because of the lack of training data for all classes, skews the results so that they work in experimental conditions but barely for end users in operational retrieval systems.
Practical implications: In conclusion, for operative information retrieval systems, applying purely automatic DDC classification does not work, whether using machine learning (because of the lack of training data for the large number of DDC classes) or using the string-matching algorithm (because DDC characteristics perform well for automatic classification only in a small number of classes). Over time, more training examples may become available, and DDC may be enriched with synonyms in order to enhance the accuracy of automatic classification, which may also benefit information retrieval performance based on DDC. In order for quality information services to reach the objective of the highest possible precision and recall, automatic classification should never be implemented on its own; instead, machine-aided indexing that combines the efficiency of automatic suggestions with the quality of human decisions at the final stage should be the way for the future.
Originality/value: The study explored machine learning on a large classification system of over 14,000 classes which is used in operational information retrieval systems. Due to the lack of sufficient training data across the entire set of classes, an approach complementing machine learning, that of string matching, was applied. This combination should be explored further since it provides the potential for real-life applications with large target classification systems.
Keywords: LIBRIS; Dewey Decimal Classification; Automatic classification; Machine learning; Support Vector Machine; Multinomial Naive Bayes; Simple linear network; Standard neural network; 1D convolutional neural network; Recurrent neural network; Word embeddings; String matching
Joint Deep Matching Model of OCT Retinal Layer Segmentation
5
Authors: Mei Yang, Yuanjie Zheng, Weikuan Jia, Yunlong He, Tongtong Che, Jinyu Cong. Computers, Materials & Continua (indexed in SCIE, EI), 2020, No. 6, pp. 1485-1498 (14 pages)
Optical Coherence Tomography (OCT) is very important in medicine and provides useful diagnostic information. Measuring retinal layer thicknesses plays a vital role in assessing the pathophysiologic factors of many ocular conditions. Among the existing retinal layer segmentation approaches, learning-based and deep learning-based methods represent the state of the art. However, most of these techniques rely on manually marked layers, and their performance is limited by image quality. To overcome this limitation, we build a framework based on gray value curve matching, which uses deep learning to match the curves for semi-automatic segmentation of retinal layers from OCT. The deep convolutional network learns the column correspondence in the OCT image in an unsupervised manner. The whole OCT image is passed through the deep convolutional neural network, which compares the gray values of each column and matches the gray value sequence of the transformed column with that of the next column. Using this algorithm, once a boundary point is manually specified, we can accurately segment the boundary between retinal layers. Our experimental results, obtained from a 54-subject database of both normal healthy eyes and affected eyes, demonstrate the superior performance of our approach.
Keywords: OCT; retinal segmentation; deep learning; 1D convolution
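To make the idea of column-to-column gray value matching concrete, the sketch below replaces the paper's learned deep matching with a much simpler stand-in: a brute-force search for the vertical shift that best aligns one column's gray value sequence with the next. It then carries a manually chosen boundary row across the image column by column. The function names, the shift range, and the toy image are all assumptions for illustration; this is not the authors' network.

```python
def best_shift(col_a, col_b, max_shift=3):
    """Return the vertical shift s that best aligns col_a[i] with col_b[i + s]."""
    best, best_cost = 0, float("inf")
    # Try small shifts first so that ties favour the smallest displacement.
    for s in sorted(range(-max_shift, max_shift + 1), key=abs):
        pairs = [
            (col_a[i], col_b[i + s])
            for i in range(len(col_a))
            if 0 <= i + s < len(col_b)
        ]
        if not pairs:
            continue
        cost = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if cost < best_cost:
            best, best_cost = s, cost
    return best


def propagate_boundary(columns, start_row, max_shift=3):
    """Track a boundary row from the first column to the last."""
    rows = [start_row]
    for left, right in zip(columns, columns[1:]):
        row = rows[-1] + best_shift(left, right, max_shift)
        rows.append(min(max(row, 0), len(right) - 1))  # stay inside the image
    return rows


# Toy image stored column by column: a bright band that drops one pixel per column.
columns = [
    [10, 10, 10, 80, 80, 20, 20, 20],
    [10, 10, 10, 10, 80, 80, 20, 20],
    [10, 10, 10, 10, 10, 80, 80, 20],
]

# The user marks the top of the bright band in the first column (row 3).
boundary = propagate_boundary(columns, start_row=3)
```

On real OCT data a fixed shift per column would be too crude, which is presumably why the paper learns the correspondence instead.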
Probabilistic simulation of electricity price scenarios using Conditional Generative Adversarial Networks (Cited by: 1)
6
Authors: Viktor Walter, Andreas Wagner. Energy and AI, 2024, No. 4, pp. 110-123 (14 pages)
A novel approach for generative time series simulation of electricity price scenarios is presented. A "Time Series Simulation Conditional Generative Adversarial Network" (TSS-CGAN) generates short-term electricity price scenarios. In particular, the network is capable of generating a 24-dimensional output vector that corresponds to the expected behavior of electricity markets. The model can replace typical approaches from financial mathematics, such as statistical factor models, for modeling the price distribution around a given forecast. The data cover a 3-year period from 2020 to 2023. Our empirical study is conducted on the EPEX SPOT market in Europe. An electricity price scenario comprises the prices of the hourly contracts of a day-ahead auction at the EPEX SPOT power exchange. The model uses multivariate time series as input factors, consisting of point forecasts of electricity prices and fundamental data on generation and load profiles. The architecture of the TSS-CGAN is based on the idea of Conditional Generative Adversarial Networks combined with 1D Convolutional Neural Networks and Bidirectional Long Short-Term Memory. The model is evaluated using qualitative and quantitative criteria. For the evaluation, 10,000 simulations of a test period are carried out. The qualitative criteria are whether the model follows certain electricity market-specific regularities and depicts them adequately. The quantitative analysis includes common error metrics compared against benchmark models such as DeepAR, Prophet and Temporal Fusion Transformer, an examination of the quantile ranges, the error distribution, and a sensitivity analysis. The results show that the TSS-CGAN outperforms benchmark models such as DeepAR by reducing the continuous ranked probability score by 50%, accounts for market-specific circumstances such as the production of fluctuating energies, and reacts correctly to changes in the corresponding variables.
Keywords: Time series simulation; Probabilistic modeling; Day-ahead electricity prices; 1D convolutions; Bidirectional long short-term memory; Generative adversarial networks
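The continuous ranked probability score (CRPS) mentioned in this abstract can be estimated directly from simulated scenarios. The sketch below uses the standard sample-based estimator, CRPS ≈ mean|x_i − y| − ½ · mean|x_i − x_j|, and averages it over the hours of a day. The function names and the two-hour toy data are assumptions; the paper's scenarios are 24-dimensional and it runs 10,000 simulations, and how it aggregates the score across hours is not stated in the abstract.

```python
def crps_from_samples(samples, observed):
    """Sample-based CRPS for a single scalar observation (lower is better)."""
    n = len(samples)
    # Average distance between simulated values and the observed value.
    accuracy_term = sum(abs(x - observed) for x in samples) / n
    # Average distance between all pairs of simulated values (rewards sharpness).
    spread_term = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return accuracy_term - 0.5 * spread_term


def mean_crps(scenarios, observed_prices):
    """Average the per-hour CRPS over all hours of the delivery day."""
    hours = len(observed_prices)
    per_hour = [
        crps_from_samples([scenario[h] for scenario in scenarios], observed_prices[h])
        for h in range(hours)
    ]
    return sum(per_hour) / hours


# Two hypothetical simulated scenarios for a two-hour "day" (prices in EUR/MWh).
scenarios = [
    [1.0, 10.0],
    [3.0, 10.0],
]
observed_prices = [2.0, 12.0]

score = mean_crps(scenarios, observed_prices)
```

The pairwise term is quadratic in the number of scenarios, so for thousands of simulations a sorted-sample formulation would be the more practical choice.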