Journal Literature
3,694 articles found
A Composite Loss-Based Autoencoder for Accurate and Scalable Missing Data Imputation
1
Authors: Thierry Mugenzi, Cahit Perkgoz 《Computers, Materials & Continua》 2026, Issue 1, pp. 1985-2005 (21 pages)
Missing data presents a crucial challenge in data analysis, especially in high-dimensional datasets, where missing data often leads to biased conclusions and degraded model performance. In this study, we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision. The proposed loss combines (i) a guided, masked mean squared error focusing on missing entries; (ii) a noise-aware regularization term to improve resilience against data corruption; and (iii) a variance penalty to encourage expressive yet stable reconstructions. We evaluate the proposed model across four missingness mechanisms, namely Missing Completely at Random, Missing at Random, Missing Not at Random, and Missing Not at Random with quantile censorship, under systematically varied feature counts, sample sizes, and missingness ratios ranging from 5% to 60%. Four publicly available real-world datasets (Stroke Prediction, Pima Indians Diabetes, Cardiovascular Disease, and Framingham Heart Study) were used, and the obtained results show that our proposed model consistently outperforms baseline methods, including traditional and deep learning-based techniques. An ablation study reveals the additive value of each component in the loss function. Additionally, we assessed the downstream utility of imputed data through classification tasks, where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios. The model demonstrates strong scalability and robustness, improving performance with larger datasets and higher feature counts. These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations, making it a promising solution for robust data recovery in clinical applications.
Keywords: missing data imputation autoencoder deep learning missing mechanisms
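The abstract above names three loss terms but does not give their formulas. The following is only a rough sketch of how such a composite loss could be assembled. The function names, the weights `lam_noise` and `lam_var`, and the exact form of the noise and variance terms are all assumptions made for illustration, not the authors' definitions.

```python
from statistics import pvariance


def masked_mse(x_true, x_recon, held_out):
    """Mean squared error restricted to entries flagged as held out (simulated missing)."""
    errors = [(t - r) ** 2 for t, r, m in zip(x_true, x_recon, held_out) if m]
    return sum(errors) / len(errors) if errors else 0.0


def composite_loss(x_true, x_recon, x_recon_noisy, held_out,
                   lam_noise=0.1, lam_var=0.01):
    """Hypothetical composite loss: masked MSE + noise consistency + variance penalty."""
    # (i) reconstruction error on the held-out entries only
    mse_term = masked_mse(x_true, x_recon, held_out)
    # (ii) assumed form: reconstructions from clean and noisy inputs should agree
    noise_term = sum((a - b) ** 2 for a, b in zip(x_recon, x_recon_noisy)) / len(x_recon)
    # (iii) assumed form: keep the reconstruction's spread close to the data's spread
    var_term = (pvariance(x_recon) - pvariance(x_true)) ** 2
    return mse_term + lam_noise * noise_term + lam_var * var_term
```

In an actual autoencoder these terms would be computed on tensors inside a training loop; plain lists are used here only to keep the sketch dependency-free.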
Impact of Data Processing Techniques on AI Models for Attack-Based Imbalanced and Encrypted Traffic within IoT Environments
2
Authors: Yeasul Kim, Chaeeun Won, Hwankuk Kim 《Computers, Materials & Continua》 2026, Issue 1, pp. 247-274 (28 pages)
With the increasing emphasis on personal information protection, encryption through security protocols has emerged as a critical requirement in data transmission and reception processes. Nevertheless, IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices, spanning a range of devices from non-encrypted ones to fully encrypted ones. Given the limited visibility into payloads in this context, this study investigates AI-based attack detection methods that leverage encrypted traffic metadata, eliminating the need for decryption and minimizing system performance degradation, especially in light of these heterogeneous devices. Using the UNSW-NB15 and CICIoT-2023 datasets, encrypted and unencrypted traffic were categorized according to security protocol, and AI-based intrusion detection experiments were conducted for each traffic type based on metadata. To mitigate the problem of class imbalance, eight different data sampling techniques were applied. The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning (DL) models from various perspectives. The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic. In the UNSW-NB15 dataset, the f1-score of encrypted traffic was approximately 0.98, which is 4.3% higher than that of unencrypted traffic (approximately 0.94). In addition, analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower f1-score of roughly 0.43, indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance. Furthermore, when data sampling techniques were applied to encrypted traffic, the recall in the UNSW-NB15 (Encrypted) dataset improved by up to 23.0%, and in the CICIoT-2023 (Encrypted) dataset by 20.26%, showing a similar level of improvement. Notably, in CICIoT-2023, f1-score and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) increased by 59.0% and 55.94%, respectively. These results suggest that data sampling can have a positive effect even in encrypted environments. However, the extent of the improvement may vary depending on data quality, model architecture, and sampling strategy.
Keywords: Encrypted traffic attack detection data sampling technique AI-based detection IoT environment
Handling missing data in large-scale TBM datasets: Methods, strategies, and applications (Cited by: 1)
3
Authors: Haohan Xiao, Ruilang Cao, Zuyu Chen, Chengyu Hong, Jun Wang, Min Yao, Litao Fan, Teng Luo 《Intelligent Geoengineering》 2025, Issue 3, pp. 109-125 (17 pages)
Substantial advancements have been achieved in Tunnel Boring Machine (TBM) technology and monitoring systems, yet the presence of missing data impedes accurate analysis and interpretation of TBM monitoring results. This study aims to investigate the issue of missing data in extensive TBM datasets. Through a comprehensive literature review, we analyze the mechanism of missing TBM data and compare different imputation methods, including statistical analysis and machine learning algorithms. We also examine the impact of various missing patterns and rates on the efficacy of these methods. Finally, we propose a dynamic interpolation strategy tailored for TBM engineering sites. The research results show that K-Nearest Neighbors (KNN) and Random Forest (RF) algorithms can achieve good interpolation results; that the interpolation performance of all methods decreases as the missing rate increases; and that block missing is interpolated worst, followed by mixed missing, while sporadic missing is interpolated best. On-site application results validate the proposed interpolation strategy's capability to achieve robust missing value interpolation, applicable in ML scenarios such as parameter optimization, attitude warning, and pressure prediction. These findings contribute to enhancing the efficiency of TBM missing data processing, offering more effective support for large-scale TBM monitoring datasets.
Keywords: Tunnel boring machine(TBM) missing data imputation Machine learning(ML) Time series interpolation data preprocessing Real-time data stream
Prediction of radionuclide diffusion enabled by missing data imputation and ensemble machine learning (Cited by: 1)
4
Authors: Jun-Lei Tian, Jia-Xing Feng, Jia-Cong Shen, Lei Yao, Jing-Yan Wang, Tao Wu, Yao-Lin Zhao 《Nuclear Science and Techniques》 2025, Issue 10, pp. 47-61 (15 pages)
Missing values in radionuclide diffusion datasets can undermine the predictive accuracy and robustness of machine learning (ML) models. In this study, a regression-based missing data imputation method using a light gradient boosting machine (LGBM) algorithm was employed to impute more than 60% of the missing data, establishing a radionuclide diffusion dataset containing 16 input features and 813 instances. The effective diffusion coefficient (D_e) was predicted using ten ML models. The predictive accuracy of the ensemble meta-models, namely LGBM-extreme gradient boosting (XGB) and LGBM-categorical boosting (CatB), surpassed that of the other ML models, with R^2 values of 0.94. The models were applied to predict the D_e values of EuEDTA^- and HCrO_4^- in saturated compacted bentonites at compactions ranging from 1200 to 1800 kg/m^3, which were measured using a through-diffusion method. The generalization ability of the LGBM-XGB model surpassed that of LGBM-CatB in predicting the D_e of HCrO_4^-. Shapley additive explanations identified total porosity as the most significant influencing factor. Additionally, the partial dependence plot analysis technique yielded clearer results in the univariate correlation analysis. This study provides a regression imputation technique to refine radionuclide diffusion datasets, offering deeper insights into analyzing the diffusion mechanism of radionuclides and supporting the safety assessment of the geological disposal of high-level radioactive waste.
Keywords: Machine learning Radionuclide diffusion BENTONITE Regression imputation missing data Diffusion experiments
Effective and efficient handling of missing data in supervised machine learning
5
Authors: Peter Ayokunle Popoola, Jules-Raymond Tapamo, Alain Guy Honoré Assounga 《Data Science and Management》 2025, Issue 3, pp. 361-373 (13 pages)
The prevailing consensus in statistical literature is that multiple imputation is generally the most suitable method for addressing missing data in statistical analyses, whereas a complete case analysis is deemed appropriate only when the rate of missingness is negligible or when the missingness mechanism is missing completely at random (MCAR). This study investigates the applicability of this consensus within the context of supervised machine learning, with particular emphasis on the interactions between the imputation method, missingness mechanism, and missingness rate. Furthermore, we examine the time efficiency of these "state-of-the-art" imputation methods considering the time-sensitive nature of certain machine learning applications. Utilizing ten real-world datasets, we introduced missingness at rates ranging from approximately 5% to 75% under the MCAR, missing at random (MAR), and missing not at random (MNAR) mechanisms. We subsequently address missing data using five methods: complete case analysis (CCA), mean imputation, hot deck imputation, regression imputation, and multiple imputation (MI). Statistical tests are conducted on the machine learning outcomes, and the findings are presented and analyzed. Our investigation reveals that in nearly all scenarios, CCA performs comparably to MI, even with substantial levels of missingness under the MAR and MNAR conditions and with missingness in the output variable for regression problems. Under some conditions, CCA surpasses MI in terms of its performance. Thus, given the considerable computational demands associated with MI, the application of CCA is recommended within the broader context of supervised machine learning, particularly in big-data environments.
Keywords: CLASSIFICATION IMPUTATION LEARNING missing data Prediction
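To make the terms in the abstract above concrete, here is a minimal sketch of MCAR missingness injection together with the two simplest handling strategies it compares, complete case analysis and mean imputation. The helper names and the use of `None` as the missing marker are illustrative assumptions; the study's own experimental code is not described in the abstract.

```python
import random


def inject_mcar(values, rate, seed=0):
    """MCAR: every entry is dropped with the same probability, regardless of its value."""
    rng = random.Random(seed)
    return [None if rng.random() < rate else v for v in values]


def complete_cases(rows):
    """Complete case analysis (CCA): keep only rows without any missing entry."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(column):
    """Mean imputation: replace each missing entry with the mean of the observed ones."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]
```

MAR and MNAR would differ only in how the drop probability is chosen: it would depend on other observed columns (MAR) or on the value being dropped (MNAR).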
A spatiotemporal recurrent neural network for missing data imputation in tunnel monitoring
6
Authors: Junchen Ye, Yuhao Mao, Ke Cheng, Xuyan Tan, Bowen Du, Weizhong Chen 《Journal of Rock Mechanics and Geotechnical Engineering》 2025, Issue 8, pp. 4815-4826 (12 pages)
Given the swift proliferation of structural health monitoring (SHM) technology within tunnel engineering, there is a demand for proficiently and precisely imputing the missing monitoring data to uphold the precision of disaster prediction. In contrast to other SHM datasets, the monitoring data specific to tunnel engineering exhibits pronounced spatiotemporal correlations. Nevertheless, most methodologies fail to adequately combine these types of correlations. Hence, the objective of this study is to develop a spatiotemporal recurrent neural network (ST-RNN) model, which exploits spatiotemporal information to effectively impute missing data within tunnel monitoring systems. ST-RNN consists of two modules: a temporal module employing a recurrent neural network (RNN) to capture temporal dependencies, and a spatial module employing a multilayer perceptron (MLP) to capture spatial correlations. To confirm the efficacy of the model, several commonly utilized methods are chosen as baselines for conducting comparative analyses. Furthermore, parametric validity experiments are conducted to illustrate the efficacy of the parameter selection process. The experimentation is conducted using original raw datasets wherein various degrees of continuous missing data are deliberately introduced. The experimental findings indicate that the ST-RNN model, incorporating both spatiotemporal modules, exhibits superior interpolation performance compared to other baseline methods across varying degrees of missing data. This affirms the reliability of the proposed model.
Keywords: MONITORING TUNNEL Machine learning INTERPOLATION missing data
A Modified Deep Residual-Convolutional Neural Network for Accurate Imputation of Missing Data
7
Authors: Firdaus Firdaus, Siti Nurmaini, Anggun Islami, Annisa Darmawahyuni, Ade Iriani Sapitri, Muhammad Naufal Rachmatullah, Bambang Tutuko, Akhiar Wista Arum, Muhammad Irfan Karim, Yultrien Yultrien, Ramadhana Noor Salassa Wandya 《Computers, Materials & Continua》 2025, Issue 2, pp. 3419-3441 (23 pages)
Handling missing data accurately is critical in clinical research, where data quality directly impacts decision-making and patient outcomes. While deep learning (DL) techniques for data imputation have gained attention, challenges remain, especially when dealing with diverse data types. In this study, we introduce a novel data imputation method based on a modified convolutional neural network, specifically, a Deep Residual-Convolutional Neural Network (DRes-CNN) architecture designed to handle missing values across various datasets. Our approach demonstrates substantial improvements over existing imputation techniques by leveraging residual connections and optimized convolutional layers to capture complex data patterns. We evaluated the model on publicly available datasets, including Medical Information Mart for Intensive Care (MIMIC-III and MIMIC-IV), which contain critical care patient data, and the Beijing Multi-Site Air Quality dataset, which measures environmental air quality. The proposed DRes-CNN method achieved a root mean square error (RMSE) of 0.00006, highlighting its high accuracy and robustness. We also compared with Low Light-Convolutional Neural Network (LL-CNN) and U-Net methods, which had RMSE values of 0.00075 and 0.00073, respectively. This represented an improvement of approximately 92% over LL-CNN and 91% over U-Net. The results showed that this DRes-CNN-based imputation method outperforms current state-of-the-art models. These results established DRes-CNN as a reliable solution for addressing missing data.
Keywords: data imputation missing data deep learning deep residual convolutional neural network
Dynamic Relative Advantage-Driven Multi-Fault Synergistic Diagnosis Method for Motors under Imbalanced Missing Data Rates
8
Authors: Zhenpeng Teng, Xiaojian Yi, Biao Wang 《Journal of Dynamics, Monitoring and Diagnostics》 2025, Issue 2, pp. 111-120 (10 pages)
Missing data handling is vital for multi-sensor information fusion fault diagnosis of motors to prevent accuracy decay or even model failure, and some promising results have been obtained in several current studies. These studies, however, have the following limitations: 1) effective supervision is neglected for missing data across different fault types and 2) imbalance in missing rates among fault types results in inadequate learning during model training. To overcome these limitations, this paper proposes a dynamic relative advantage-driven multi-fault synergistic diagnosis method to accomplish accurate fault diagnosis of motors under imbalanced missing data rates. Firstly, a cross-fault-type generalized synergistic diagnostic strategy is established based on variational information bottleneck theory, which is able to ensure sufficient supervision in handling missing data. Then, a dynamic relative advantage assessment technique is designed to reduce diagnostic accuracy decay caused by imbalanced missing data rates. The proposed method is validated using multi-sensor data from motor fault simulation experiments, and experimental results demonstrate its effectiveness and superiority in improving diagnostic accuracy and generalization under imbalanced missing data rates.
Keywords: data missing motor fault relative advantage synergistic diagnosis
A Novel Reduced Error Pruning Tree Forest with Time-Based Missing Data Imputation (REPTF-TMDI) for Traffic Flow Prediction
9
Authors: Yunus Dogan, Goksu Tuysuzoglu, Elife Ozturk Kiyak, Bita Ghasemkhani, Kokten Ulas Birant, Semih Utku, Derya Birant 《Computer Modeling in Engineering & Sciences》 2025, Issue 8, pp. 1677-1715 (39 pages)
Accurate traffic flow prediction (TFP) is vital for efficient and sustainable transportation management and the development of intelligent traffic systems. However, missing data in real-world traffic datasets poses a significant challenge to maintaining prediction precision. This study introduces REPTF-TMDI, a novel method that combines a Reduced Error Pruning Tree Forest (REPTree Forest) with a newly proposed Time-based Missing Data Imputation (TMDI) approach. The REPTree Forest, an ensemble learning approach, is tailored for time-related traffic data to enhance predictive accuracy and support the evolution of sustainable urban mobility solutions. Meanwhile, the TMDI approach exploits temporal patterns to estimate missing values reliably whenever empty fields are encountered. The proposed method was evaluated using hourly traffic flow data from a major U.S. roadway spanning 2012-2018, incorporating temporal features (e.g., hour, day, month, year, weekday), a holiday indicator, and weather conditions (temperature, rain, snow, and cloud coverage). Experimental results demonstrated that the REPTF-TMDI method outperformed conventional imputation techniques across various missing data ratios by achieving an average 11.76% improvement in terms of correlation coefficient (R). Furthermore, REPTree Forest achieved improvements of 68.62% in RMSE and 70.52% in MAE compared to existing state-of-the-art models. These findings highlight the method’s ability to significantly boost traffic flow prediction accuracy, even in the presence of missing data, thereby contributing to the broader objectives of sustainable urban transportation systems.
Keywords: Machine learning traffic flow prediction missing data imputation reduced error pruning tree(REPTree) sustainable transportation systems traffic management artificial intelligence
Improving Disease Prevalence Estimates Using Missing Data Techniques
10
Authors: Elhadji Moustapha Seck, Ngesa Owino Oscar, Abdou Ka Diongue 《Open Journal of Statistics》 2016, Issue 6, pp. 1110-1122 (14 pages)
The prevalence of a disease in a population is defined as the proportion of people who are infected. Selection bias in disease prevalence estimates occurs if non-participation in testing is correlated with disease status. Missing data are commonly encountered in most medical research. Unfortunately, they are often neglected or not properly handled during analytic procedures, and this may substantially bias the results of the study, reduce the study power, and lead to invalid conclusions. The goal of this study is to illustrate how to estimate prevalence in the presence of missing data. We consider a case where the variable of interest (response variable) is binary and some of the observations are missing and assume that all the covariates are fully observed. In most cases, the statistic of interest when faced with binary data is the prevalence. We develop a two-stage approach to improve the prevalence estimates: in the first stage, we use the logistic regression model to predict the missing binary observations, and in the second stage we recalculate the prevalence using the observed data and the imputed missing data. Such a model would be of great interest in research studies involving HIV/AIDS, in which people usually refuse to donate blood for testing yet are willing to provide other covariates. The prevalence estimation method is illustrated using simulated data and applied to HIV/AIDS data from the Kenya AIDS Indicator Survey, 2007.
Keywords: Disease Prevalence missing data Non-Participant Logistic Regression Model Prevalence Estimates HIV/AIDS
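The two-stage procedure in the abstract above can be sketched as follows: fit a logistic regression on the tested individuals, predict the untested ones, then recompute prevalence over everyone. The gradient-descent fit, the 0.5 decision threshold, and all function names are assumptions chosen to keep the example dependency-free; the paper's own fitting procedure and imputation rule may differ.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """w = [bias, w1, ..., wk]; x = [x1, ..., xk]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Stage 1: batch gradient descent on the log-loss (assumed fitting method)."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def two_stage_prevalence(X, y):
    """y holds 0, 1, or None (untested). Returns prevalence over observed + imputed."""
    observed = [(xi, yi) for xi, yi in zip(X, y) if yi is not None]
    w = fit_logistic([xi for xi, _ in observed], [yi for _, yi in observed])
    completed = []
    for xi, yi in zip(X, y):
        if yi is None:
            # Stage 2 input: impute with a hard 0.5 threshold (an assumption)
            yi = 1 if predict_proba(w, xi) >= 0.5 else 0
        completed.append(yi)
    return sum(completed) / len(completed)
```

With `X = [[0.0], [0.0], [1.0], [1.0], [1.0], [1.0]]` and `y = [0, 0, 1, 1, None, None]`, the observed-only prevalence is 0.5, while the two-stage estimate is higher because both untested individuals share the covariate value associated with positive results.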
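Application of Data Mining Techniques in the Metallurgical Field
11
Authors: 马智明, 徐荣军, 姚忠卯, 马林海 《河南冶金》 2001, Issue 2, pp. 3-8, 18 (7 pages)
Introduces the background, development, and optimization process of data mining techniques, as well as their application in the metallurgical field.
Keywords: data mining techniques optimization metallurgy application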
Integrating multisource RS data and GIS techniques to assist the evaluation of resource-environment carrying capacity in karst mountainous area (Cited by: 9)
12
Authors: PU Jun-wei, ZHAO Xiao-qing, MIAO Pei-pei, LI Si-nan, TAN Kun, WANG Qian, TANG Wei 《Journal of Mountain Science》 SCIE CSCD 2020, Issue 10, pp. 2528-2547 (20 pages)
The karst mountainous area is an ecologically fragile region with prominent human-land contradictions. The resource-environment carrying capacity (RECC) of this region needs to be further clarified. The development of remote sensing (RS) and geographic information systems (GIS) provides data sources and a processing platform for RECC monitoring. This study analyzed and established the evaluation index system of RECC by considering the particularity of the karst mountainous area of Southwest China; processed multisource RS data (Sentinel-2, Aster-DEM and Landsat-8) to extract the spatial distributions of nine key indexes by GIS techniques (information classification, overlay analysis and raster calculation); proposed the methods of index integration and fuzzy comprehensive evaluation of the RECC by GIS; and took a typical area, Guangnan County in Yunnan Province of China, as an experimental area to explore the effectiveness of the indexes and methods. The results showed that: (1) The important indexes affecting the RECC of the karst mountainous area are water resources, tourism resources, position resources, geographical environment and soil erosion environment. (2) Data on cultivated land, construction land, minerals, transportation, water conservancy, ecosystem services, topography, soil erosion and rocky desertification can be obtained from RS data. GIS techniques integrate the information into the RECC results. The data extraction and processing methods are feasible for evaluating RECC. (3) The RECC of Guangnan County was at the mid-carrying level in 2018. The mid-carrying and low-carrying levels were the main types, accounting for more than 80.00% of the total study area. The areas with high carrying capacity were mainly distributed in the northern regions of the northwest-southeast line of the county, and other areas have a comparatively low carrying capacity. The coordination between regional resource-environment status and socioeconomic development is the key to improving RECC. This study explores the evaluation index system of RECC in the karst mountainous area and the application of multisource RS data and GIS techniques in the comprehensive evaluation. The methods can be applied in related fields to provide suggestions for data/information extraction and integration, and sustainable development.
Keywords: Carrying capacity Multisource RS data GIS techniques Evaluation index system data Integration Karst mountainous area Fuzzy comprehensive evaluation method
Real-time prediction of rock mass classification based on TBM operation big data and stacking technique of ensemble learning (Cited by: 35)
13
Authors: Shaokang Hou, Yaoru Liu, Qiang Yang 《Journal of Rock Mechanics and Geotechnical Engineering》 SCIE CSCD 2022, Issue 1, pp. 123-143 (21 pages)
Real-time prediction of the rock mass class in front of the tunnel face is essential for the adaptive adjustment of tunnel boring machines (TBMs). During the TBM tunnelling process, a large number of operation data are generated, reflecting the interaction between the TBM system and surrounding rock, and these data can be used to evaluate the rock mass quality. This study proposed a stacking ensemble classifier for the real-time prediction of the rock mass classification using TBM operation data. Based on the Songhua River water conveyance project, a total of 7538 TBM tunnelling cycles and the corresponding rock mass classes are obtained after data preprocessing. Then, through the tree-based feature selection method, 10 key TBM operation parameters are selected, and the mean values of the 10 selected features in the stable phase after removing outliers are calculated as the inputs of classifiers. The preprocessed data are randomly divided into the training set (90%) and test set (10%) using simple random sampling. Besides the stacking ensemble classifier, seven individual classifiers are established for comparison. These classifiers include support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), gradient boosting decision tree (GBDT), decision tree (DT), logistic regression (LR) and multilayer perceptron (MLP), where the hyper-parameters of each classifier are optimised using the grid search method. The prediction results show that the stacking ensemble classifier has a better performance than individual classifiers, and it shows a more powerful learning and generalisation ability for small and imbalanced samples. Additionally, a relatively balanced training set is obtained by the synthetic minority oversampling technique (SMOTE), and the influence of sample imbalance on the prediction performance is discussed.
Keywords: Tunnel boring machine(TBM)operation data Rock mass classification Stacking ensemble learning Sample imbalance Synthetic minority oversampling technique(SMOTE)
K-Nearest Neighbor Based Missing Data Estimation Algorithm in Wireless Sensor Networks (Cited by: 8)
14
Authors: Liqiang Pan, Jianzhong Li 《Wireless Sensor Network》 2010, Issue 2, pp. 115-122 (8 pages)
In wireless sensor networks, missing sensor data is inevitable due to the inherent characteristics of such networks, and it causes many difficulties in various applications. To solve the problem, the missing data should be estimated as accurately as possible. In this paper, a k-nearest neighbor based missing data estimation algorithm is proposed based on the temporal and spatial correlation of sensor data. It adopts the linear regression model to describe the spatial correlation of sensor data among different sensor nodes, and utilizes the data information of multiple neighbor nodes to estimate the missing data jointly rather than independently, so that a stable and reliable estimation performance can be achieved. Experimental results on two real-world datasets show that the proposed algorithm can estimate the missing data accurately.
Keywords: missing data ESTIMATION WIRELESS SENSOR NETWORKS
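As a rough illustration of neighbor-based estimation, the sketch below fills a node's missing readings with the average of the k nodes whose co-observed history is most similar. This is a deliberately simplified stand-in: the paper additionally fits linear regression models between nodes, which is not reproduced here, and the data layout (a dict of node name to readings, with `None` for missing) is an assumption.

```python
import math


def knn_impute(readings, target, k=2):
    """Fill missing entries of readings[target] from the k most similar nodes."""

    def distance(a, b):
        # Root-mean-square gap over time steps where both nodes have a reading
        pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not pairs:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

    series = readings[target]
    neighbors = sorted(
        (node for node in readings if node != target),
        key=lambda node: distance(series, readings[node]),
    )[:k]

    filled = list(series)
    for t, value in enumerate(series):
        if value is None:
            candidates = [readings[n][t] for n in neighbors if readings[n][t] is not None]
            if candidates:
                # Joint estimate: average over all selected neighbors at time t
                filled[t] = sum(candidates) / len(candidates)
    return filled
```

If none of the selected neighbors has a reading at a given time step, the entry is left as `None` rather than guessed.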
Comparison of Missing Data Imputation Methods in Time Series Forecasting (Cited by: 4)
15
Authors: Hyun Ahn, Kyunghee Sun, Kwanghoon Pio Kim 《Computers, Materials & Continua》 SCIE EI 2022, Issue 1, pp. 767-779 (13 pages)
Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not include a simulation to generate pseudo-missing data, but instead performs imputation on actual missing data and measures the performance of the forecasting model created therefrom. In an experiment, therefore, several time series forecasting models are trained using different training datasets prepared using each imputation method. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with other imputation methods.
Keywords: missing data imputation method time series forecasting LSTM
Optimal estimation of zonal velocity and transport through Luzon Strait using variational data assimilation technique 被引量:7
16
作者 兰健 鲍献文 高郭平 《Chinese Journal of Oceanology and Limnology》 SCIE CAS CSCD 2004年第4期335-339,共5页
A P-vector method was optimized using variational data assimilation technique, with which the vertical structures and seasonal variations of zonal velocities and transports were investigated. The results showed that w... A P-vector method was optimized using variational data assimilation technique, with which the vertical structures and seasonal variations of zonal velocities and transports were investigated. The results showed that westward and eastward flowes occur in the Luzon Strait in the same period in a year. However the net volume transport is westward. In the upper level (0m -500m),the westward flow exits in the middle and south of the Luzon Strait, and the eastward flow exits in the north. There are two centers of westward flow and one center of eastward flow. In the middle of the Luzon Strait, westward and eastward flowes appear alternately in vertical direction. The westward flow strengthens in winter and weakens in summer. The net volume transport is strong in winter (5.53 Sv) but weak in summer (0.29 Sv). Except in summer, the volume transport in the upper level accounts for more than half of the total volume transport (0m bottom). In summer, the net volume transport in the upper level is eastward (1.01 Sv), but westward underneath. 展开更多
Keywords: South China Sea Luzon Strait zonal velocity and transport variational data assimilation technique
Generalized unscented Kalman filtering based radial basis function neural network for the prediction of ground radioactivity time series with missing data (Cited by: 2)
17
Authors: 伍雪冬, 王耀南, 刘维亭, 朱志宇 《Chinese Physics B》 SCIE EI CAS CSCD 2011, No. 6, pp. 546-551 (6 pages)
On the assumption that random interruptions in the observation process are modeled by a sequence of independent Bernoulli random variables, we first generalize two kinds of nonlinear filtering methods with random interruption failures in the observation, based on the extended Kalman filtering (EKF) and the unscented Kalman filtering (UKF), which are abbreviated as GEKF and GUKF in this paper, respectively. Then the nonlinear filtering model is established by using the radial basis function neural network (RBFNN) prototypes and the network weights as the state equation and the output of the RBFNN to represent the observation equation. Finally, we treat the filtering problem under missing observed data as a special case of nonlinear filtering with random intermittent failures by setting each missing datum to zero, without needing to pre-estimate the missing data, and use the GEKF-based RBFNN and the GUKF-based RBFNN to predict the ground radioactivity time series with missing data. Experimental results demonstrate that the prediction results of the GUKF-based RBFNN accord well with the real ground radioactivity time series, while the prediction results of the GEKF-based RBFNN are divergent.
Keywords: prediction of time series with missing data random interruption failures in the observation neural network approximation
Data-driven fault diagnosis of control valve with missing data based on modeling and deep residual shrinkage network (Cited by: 3)
18
Authors: Feng SUN, He XU, Yu-han ZHAO, Yu-dong ZHANG 《Journal of Zhejiang University-Science A(Applied Physics & Engineering)》 SCIE EI CAS CSCD 2022, No. 4, pp. 303-313 (11 pages)
A control valve is one of the most widely used machines in hydraulic systems. However, it often works in harsh environments and failure occurs from time to time. An intelligent and robust control valve fault diagnosis is therefore important for operation of the system. In this study, a fault diagnosis based on mathematical model (MM) imputation and the modified deep residual shrinkage network (MDRSN) is proposed to solve the problem that data-driven models for control valves are susceptible to changing operating conditions and missing data. Multiple fault time-series samples of the control valve at different openings are collected for fault diagnosis to verify the effectiveness of the proposed method. The effects of the proposed method in missing data imputation and fault diagnosis are analyzed. Compared with random and k-nearest neighbor (KNN) imputation, the accuracies of MM-based imputation are improved by 17.87% and 21.18%, respectively, under a 20.00% data missing rate at valve openings from 10% to 28%. Furthermore, the results show that the proposed MDRSN can maintain high fault diagnosis accuracy with missing data.
Keywords: Control valve Missing data Fault diagnosis Mathematical model (MM) Deep residual shrinkage network (DRSN)
Data Mining as a Technique for Healthcare Approach (Cited by: 5)
19
Authors: E. N. Ekwonwune, C. I. Ubochi, A. E. Duroha 《International Journal of Communications, Network and System Sciences》 2022, No. 9, pp. 149-165 (17 pages)
Data Mining, also known as knowledge discovery in data (KDD), is the process of uncovering patterns and other valuable information from large data sets. According to https://www.geeksforgeeks.org/data-mining/, it can be referred to as knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. With advancing research in the health sector, a multitude of data is available in the healthcare sector. The general problem then becomes how to use the existing information in a more useful, targeted way. Data Mining is therefore the best available technique. The objective of this paper is to review and analyse some of the different Data Mining techniques, such as Application, Classification, Clustering, Regression, etc., applied in the domain of healthcare.
Keywords: Data Mining Techniques Relational Database Knowledge Clustering Classification Regression Healthcare
Encryption with Image Steganography Based Data Hiding Technique in IIoT Environment (Cited by: 2)
20
Authors: Mahmoud Ragab, Samah Alshehri, Hani A. Alhadrami, Faris Kateb, Ehab Bahaudien Ashary, S. Abdel-khalek 《Computers, Materials & Continua》 SCIE EI 2022, No. 7, pp. 1323-1338 (16 pages)
Rapid advancements of the Industrial Internet of Things (IIoT) and artificial intelligence (AI) pose serious security issues by revealing secret data. Therefore, data security becomes a crucial issue in IIoT communication, where secrecy needs to be guaranteed in real time. Practically, AI techniques can be utilized to design image steganographic techniques in IIoT. In addition, encryption techniques play an important role in protecting the actual information generated by the IIoT devices from unauthorized access. In order to accomplish secure data transmission in the IIoT environment, this study presents a novel encryption with image steganography based data hiding technique (EIS-DHT) for the IIoT environment. The proposed EIS-DHT technique involves a new quantum black widow optimization (QBWO) to competently choose the pixel values for hiding secret data in the cover image. In addition, a multi-level discrete wavelet transform (DWT) based transformation process takes place. Besides, the secret image is divided into three R, G, and B bands, which are then individually encrypted using Blowfish, Twofish, and the Lorenz Hyperchaotic System. At last, the stego image is generated by placing the encrypted images into the optimum pixel locations of the cover image. In order to validate the enhanced data hiding performance of the EIS-DHT technique, a set of simulation analyses takes place and the results are inspected in terms of different measures. The experimental outcomes demonstrated the supremacy of the EIS-DHT technique over the other existing techniques and ensure maximum security.
Keywords: IIoT Security Data hiding technique Image steganography Encryption Secure communication