338 articles found
DriftXMiner: A Resilient Process Intelligence Approach for Safe and Transparent Detection of Incremental Concept Drift in Process Mining
1
Authors: Puneetha B.H, Manoj Kumar M.V, Prashanth B.S., Piyush Kumar Pareek. Computers, Materials & Continua, 2026, No. 1, pp. 1086-1118 (33 pages).
Processes supported by process-aware information systems are subject to continuous and often subtle changes due to evolving operational, organizational, or regulatory factors. These changes, referred to as incremental concept drift, gradually alter the behavior or structure of processes, making their detection and localization a challenging task. Traditional process mining techniques frequently assume process stationarity and are limited in their ability to detect such drift, particularly from a control-flow perspective. The objective of this research is to develop an interpretable and robust framework capable of detecting and localizing incremental concept drift in event logs, with a specific emphasis on the structural evolution of control-flow semantics in processes. We propose DriftXMiner, a control-flow-aware hybrid framework that combines statistical, machine learning, and process model analysis techniques. The approach comprises three key components: (1) a Cumulative Drift Scanner that tracks directional statistical deviations to detect early drift signals; (2) a Temporal Clustering and Drift-Aware Forest Ensemble (DAFE) to capture distributional and classification-level changes in process behavior; and (3) Petri net-based process model reconstruction, which enables the precise localization of structural drift using transition deviation metrics and replay fitness scores. Experimental validation on the BPI Challenge 2017 event log demonstrates that DriftXMiner effectively identifies and localizes gradual and incremental process drift over time. The framework achieves a detection accuracy of 92.5%, a localization precision of 90.3%, and an F1-score of 0.91, outperforming competitive baselines such as CUSUM+Histograms and ADWIN+Alpha Miner. Visual analyses further confirm that identified drift points align with transitions in control-flow models and behavioral cluster structures. DriftXMiner offers a novel and interpretable solution for incremental concept drift detection and localization in dynamic, process-aware systems. By integrating statistical signal accumulation, temporal behavior profiling, and structural process mining, the framework enables fine-grained drift explanation and supports adaptive process intelligence in evolving environments. Its modular architecture supports extension to streaming data and real-time monitoring contexts.
Keywords: process mining; concept drift; gradual drift; incremental drift; clustering; ensemble techniques; process model; event log
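The DriftXMiner abstract describes its Cumulative Drift Scanner only at a high level and names CUSUM as a baseline. As a rough illustration of what "accumulating directional deviations" can look like, here is a minimal two-sided CUSUM sketch. It is the classical technique, not the paper's scanner, and the parameter names `target`, `slack`, and `threshold` are assumptions made for this example.

```python
def cusum_detect(values, target, slack=0.5, threshold=5.0):
    """Return the index of the first two-sided CUSUM alarm, or None.

    target    -- assumed in-control mean of the monitored statistic
    slack     -- deviation tolerated per sample before it accumulates
    threshold -- accumulated deviation that raises an alarm
    """
    pos = 0.0  # accumulated upward deviation
    neg = 0.0  # accumulated downward deviation
    for i, x in enumerate(values):
        # Each sum grows only when the sample deviates by more than the slack,
        # and is clamped at zero so old in-control samples are forgotten.
        pos = max(0.0, pos + (x - target - slack))
        neg = max(0.0, neg + (target - x - slack))
        if pos > threshold or neg > threshold:
            return i
    return None
```

A small slack makes the detector sensitive to slow, incremental shifts at the cost of more false alarms; a larger threshold trades detection delay for robustness.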
Dynamic domain analysis for predicting concept drift in engineering AI-enabled software
2
Authors: Murtuza Shahzad, Hamed Barzamini, Joseph Wilson, Hamed Alhoori, Mona Rahimi. Journal of Data and Information Science, 2025, No. 2, pp. 124-151 (28 pages).
Purpose: This research addresses the challenge of concept drift in AI-enabled software, particularly within autonomous vehicle systems where concept drift in object recognition (like pedestrian detection) can lead to misclassifications and safety risks. This study introduces a proactive framework to detect early signs of domain-specific concept drift by leveraging domain analysis and natural language processing techniques. This method is designed to help maintain the relevance of domain knowledge and prevent potential failures in AI systems due to evolving concept definitions. Design/methodology/approach: The proposed framework integrates natural language processing and image analysis to continuously update and monitor key domain concepts against evolving external data sources, such as social media and news. By identifying terms and features closely associated with core concepts, the system anticipates and flags significant changes. This was tested in the automotive domain on the pedestrian concept, where the framework was evaluated for its capacity to detect shifts in the recognition of pedestrians, particularly during events like Halloween and specific car accidents. Findings: The framework demonstrated an ability to detect shifts in the domain concept of pedestrians, as evidenced by contextual changes around major events. While it successfully identified pedestrian-related drift, the system's accuracy varied when overlapping with larger social events. The results indicate the model's potential to foresee relevant shifts before they impact autonomous systems, although further refinement is needed to handle high-impact concurrent events. Research limitations: This study focused on detecting concept drift in the pedestrian domain within autonomous vehicles, with results varying across domains. To assess generalizability, we tested the framework for airplane-related incidents and demonstrated adaptability. However, unpredictable events and data biases from social media and news may obscure domain-specific drifts. Further evaluation across diverse applications is needed to enhance robustness in evolving AI environments. Practical implications: The proactive detection of concept drift has significant implications for AI-driven domains, especially in safety-critical applications like autonomous driving. By identifying early signs of drift, this framework provides actionable insights for AI system updates, potentially reducing misclassification risks and enhancing public safety. Moreover, it enables timely interventions, reducing costly and labor-intensive retraining requirements by focusing only on the relevant aspects of evolving concepts. This method offers a streamlined approach for maintaining AI system performance in environments where domain knowledge rapidly changes. Originality/value: This study contributes a novel domain-agnostic framework that combines natural language processing with image analysis to predict concept drift early. This unique approach, which is focused on real-time data sources, offers an effective and scalable solution for addressing the evolving nature of domain-specific concepts in AI applications.
Keywords: AI-enabled software system; concept drift detection; applied machine learning; autonomous vehicles; natural language processing
Concept Drift Detection and Adaptation Method for IoT Security Framework
3
Authors: Yin Jie, Xie Wenwei, Liang Guangjun, Zhang Lanping, Zhang Xixi. China Communications, 2025, No. 12, pp. 137-147 (11 pages).
With the gradual penetration of the Internet of Things (IoT) into all areas of life, the number of IoT devices is growing explosively. The era of the Internet of Everything is coming, and the importance of IoT security is becoming increasingly prominent. Because there are many types of IoT devices, they may have different security vulnerabilities, and unknown attack forms and virus samples keep appearing. In other words, the large number of IoT devices, large data volumes, and varied attack forms pose a major challenge for malicious traffic identification. To solve these problems, this paper proposes a concept drift detection and adaptation (CDDA) method for an IoT security framework. The performance of the AI model is evaluated by verifying the effectiveness of IoT traffic for data drift detection, so as to select the best AI model. Experimental tests confirm the feasibility of the framework and the adaptive method in practice, and their effect on the performance of IoT traffic identification is also verified.
Keywords: concept drift detection and adaptive (CDDA) method; IoT security; malicious traffic identification
An Optimal Big Data Analytics with Concept Drift Detection on High-Dimensional Streaming Data (cited by: 1)
4
Authors: Romany F. Mansour, Shaha Al-Otaibi, Amal Al-Rasheed, Hanan Aljuaid, Irina V. Pustokhina, Denis A. Pustokhin. Computers, Materials & Continua (SCIE, EI), 2021, No. 9, pp. 2843-2858 (16 pages).
Big data streams have become ubiquitous in recent years, thanks to the rapid generation of massive volumes of data by different applications. It is challenging to apply existing data mining tools and techniques directly to these big data streams. At the same time, streaming data from several applications raises two major problems: class imbalance and concept drift. This paper presents a new Multi-Objective Metaheuristic Optimization-based Big Data Analytics with Concept Drift Detection (MOMBD-CDD) method for high-dimensional streaming data. The MOMBD-CDD model has several operational stages: pre-processing, CDD, and classification. The model overcomes the class imbalance problem with the Synthetic Minority Over-sampling Technique (SMOTE). To determine the oversampling rates and neighboring point values of SMOTE, the Glowworm Swarm Optimization (GSO) algorithm is employed. In addition, the Statistical Test of Equal Proportions (STEPD), a CDD technique, is utilized. Finally, a Bidirectional Long Short-Term Memory (Bi-LSTM) model is applied for classification. To improve classification performance and compute the optimum parameters for the Bi-LSTM model, a GSO-based hyperparameter tuning process is carried out. The performance of the presented model was evaluated using high-dimensional benchmark streaming datasets, namely the intrusion detection (NSL KDDCup) dataset and the ECUE spam dataset. An extensive experimental validation process confirmed the effectiveness of the MOMBD-CDD model. The proposed model attained high accuracy of 97.45% and 94.23% on the KDDCup99 and ECUE Spam datasets, respectively.
Keywords: streaming data; concept drift; classification model; deep learning; class imbalance data
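The MOMBD-CDD abstract names STEPD as its drift detector without giving the test itself. The sketch below shows the statistic as it is commonly described in the drift-detection literature: a continuity-corrected test of equal proportions between the classifier's accuracy on older samples and on a recent window. The function name and argument names are assumptions for illustration, and nothing here reflects how the paper configures or tunes the test.

```python
import math


def stepd_p_value(correct_old, n_old, correct_recent, n_recent):
    """One-sided p-value for 'recent accuracy equals older accuracy'.

    correct_old / n_old       -- correct predictions and sample count before the recent window
    correct_recent / n_recent -- the same counts inside the recent window
    """
    pooled = (correct_old + correct_recent) / (n_old + n_recent)
    inv = 1.0 / n_old + 1.0 / n_recent
    variance = pooled * (1.0 - pooled) * inv
    if variance == 0.0:
        # Pooled accuracy of exactly 0 or 1: both groups agree, so no evidence of change.
        return 1.0
    diff = abs(correct_old / n_old - correct_recent / n_recent)
    z = (diff - 0.5 * inv) / math.sqrt(variance)  # 0.5 * inv is the continuity correction
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

A detector built on this would typically signal a warning or a drift when the p-value falls below two chosen significance levels; those levels and the recent-window size are tuning choices not stated in the abstract.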
Concept Drift Analysis and Malware Attack Detection System Using Secure Adaptive Windowing
5
Authors: Emad Alsuwat, Suhare Solaiman, Hatim Alsuwat. Computers, Materials & Continua (SCIE, EI), 2023, No. 5, pp. 3743-3759 (17 pages).
Concept drift is a major security issue that has to be resolved, since it presents a significant barrier to the deployment of machine learning (ML) models. Due to dynamic behavior changes of attackers (and/or their benign equivalents), the testing data distribution frequently diverges from the original training data over time, resulting in substantial model failures. Due to their dispersed and dynamic nature, distributed denial-of-service attacks pose a danger to cybersecurity, resulting in attacks with serious consequences for users and businesses. This paper proposes a novel design for concept drift analysis and detection of malware attacks such as Distributed Denial of Service (DDOS) in the network. The goal of this architecture combination is to accurately represent data and create an effective cyber security prediction agent. The intrusion detection system and concept drift of the network have been analyzed using secure adaptive windowing with website data authentication protocol (SAW_WDA). The network has been analyzed by the authentication protocol to avoid malware attacks. The data of network users is collected and classified using multilayer perceptron gradient decision tree (MLPGDT) classifiers. Based on the classification output, attackers and authorized users are identified. The experimental results report intrusion detection and concept drift analysis in terms of throughput, end-to-end delay, network security, and network concept drift, and classification results in terms of accuracy, memory, precision, and F1 score.
Keywords: concept drift; machine learning; DDOS; cyber security; SAW_WDA; MLPGDT
Combined Effect of Concept Drift and Class Imbalance on Model Performance During Stream Classification
6
Authors: Abdul Sattar Palli, Jafreezal Jaafar, Manzoor Ahmed Hashmani, Heitor Murilo Gomes, Aeshah Alsughayyir, Abdul Rehman Gilal. Computers, Materials & Continua (SCIE, EI), 2023, No. 4, pp. 1827-1845 (19 pages).
Every application in a smart city environment, such as the smart grid, health monitoring, security, and surveillance, generates non-stationary data streams. Because of this, the statistical properties of the data change over time, leading to class imbalance and concept drift issues. Both issues cause model performance degradation. Most current work has focused on developing an ensemble strategy that trains a new classifier on the latest data to resolve the issue. These techniques suffer when training the new classifier if the data is imbalanced. Also, the class imbalance ratio may change greatly from one input stream to another, making the problem more complex. The existing solutions proposed for the combined issue of class imbalance and concept drift lack an understanding of the correlation of one problem with the other. This work studies the association between concept drift and class imbalance ratio and then demonstrates how changes in class imbalance ratio along with concept drift affect the classifier's performance. We analyzed the effect of both issues on minority and majority classes individually. To do this, we conducted experiments on benchmark datasets using state-of-the-art classifiers specially designed for data stream classification. Precision, recall, F1 score, and geometric mean were used to measure the performance. Our findings show that when class imbalance and concept drift occur together, performance can decrease by up to 15%. Our results also show that an increase in the imbalance ratio can cause a 10% to 15% decrease in the precision scores of both minority and majority classes. The study findings may help in designing intelligent and adaptive solutions that can cope with the challenges of non-stationary data streams, such as concept drift and class imbalance.
Keywords: classification; data streams; class imbalance; concept drift; class imbalance ratio
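The study above reports precision, recall, F1 score, and geometric mean separately for minority and majority classes. A minimal sketch of those four per-class metrics follows, assuming the common definition of the geometric mean as the square root of recall times specificity; the function name and the dictionary layout are illustrative choices, not taken from the paper.

```python
import math


def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall, F1 and G-mean with `positive` treated as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    g_mean = math.sqrt(recall * specificity)
    return {"precision": precision, "recall": recall, "f1": f1, "g_mean": g_mean}
```

Calling the function once with the minority label and once with the majority label gives the two per-class views; on a skewed stream the minority-class numbers usually expose degradation that overall accuracy hides.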
An Ensemble Learning Model Based on Three-Way Decision for Concept Drift Adaptation
7
Authors: Dayong Deng, Wenxin Shen, Zhixuan Deng, Tianrui Li, Anjin Liu. Tsinghua Science and Technology, 2025, No. 5, pp. 2029-2047 (19 pages).
The ensemble learning model can effectively detect drift and utilize diversity to improve the performance of adapting to drift. However, local concept drift can occur in different types at different time points, making it difficult for base learners to distinguish the drift of local boundaries and difficult to determine the drift range. Thus, adapting an ensemble learning model to local concept drifts remains a challenging problem. Moreover, there are often differences in decision boundaries after drift adaptation, so employing an overall diversity measurement is inappropriate. To address these two issues, this paper proposes a novel ensemble learning model called instance-weighted ensemble learning based on the three-way decision (IWE-TWD). In IWE-TWD, a divide-and-conquer strategy is employed to handle uncertain drift and to select base learners; density clustering dynamically constructs density regions to lock the drift range; three-way decision is adopted to estimate whether the region distribution changes, and each instance is weighted with the probability of region distribution change; the diversities between base learners are also determined with three-way decision. Experimental results show that IWE-TWD performs better than state-of-the-art models in data stream classification on ten synthetic data sets and seven real-world data sets.
Keywords: three-way decision; concept drift; ensemble learning; region drift; information fusion
An ensemble method for data stream classification in the presence of concept drift (cited by: 3)
8
Authors: Omid ABBASZADEH, Ali AMIRI, Ali Reza KHANTEYMOORI. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2015, No. 12, pp. 1059-1068 (10 pages).
One recent area of interest in computer science is data stream management and processing. By 'data stream', we refer to continuous and rapidly generated packages of data. Specific features of data streams are immense volume, high production rate, limited data processing time, and data concept drift; these features differentiate the data stream from standard types of data. An issue for the data stream is classification of input data. A novel ensemble classifier is proposed in this paper. The classifier uses base classifiers of two weighting functions under different data input conditions. In addition, a new method is used to determine drift, which emphasizes the precision of the algorithm. Another characteristic of the proposed method is removal of different numbers of the base classifiers based on their quality. Implementation of a weighting mechanism to the base classifiers at the decision-making stage is another advantage of the algorithm. This facilitates adaptability when drifts take place, which leads to classifiers with higher efficiency. Furthermore, the proposed method is tested on a set of standard data and the results confirm higher accuracy compared to available ensemble classifiers and single classifiers. In addition, in some cases the proposed classifier is faster and needs less storage space.
Keywords: data stream; classification; ensemble classifiers; concept drift
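The abstract above mentions two generic mechanisms: weighting base classifiers at decision time and removing low-quality members. The sketch below illustrates both in their simplest form. The paper's actual weighting functions and removal rule are not given in the abstract, so `weighted_vote`, `prune_weakest`, and the `max_size` limit are assumptions for illustration only.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def prune_weakest(members, weights, max_size):
    """Keep the `max_size` highest-weight members, preserving their original order."""
    ranked = sorted(range(len(members)), key=lambda i: weights[i], reverse=True)
    keep = sorted(ranked[:max_size])
    return [members[i] for i in keep], [weights[i] for i in keep]
```

In a streaming setting the weights would be refreshed from each member's recent performance, so that members trained on an outdated concept lose influence and are eventually pruned.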
A Real-Time Deep Learning Approach for Electrocardiogram-Based Cardiovascular Disease Prediction with Adaptive Drift Detection and Generative Feature Replay
9
Authors: Soumia Zertal, Asma Saighi, Sofia Kouah, Souham Meshoul, Zakaria Laboudi. Computer Modeling in Engineering & Sciences, 2025, No. 9, pp. 3737-3782 (46 pages).
Cardiovascular diseases (CVDs) continue to present a leading cause of mortality worldwide, emphasizing the importance of early and accurate prediction. Electrocardiogram (ECG) signals, central to cardiac monitoring, have increasingly been integrated with Deep Learning (DL) for real-time prediction of CVDs. However, DL models are prone to performance degradation due to concept drift and to catastrophic forgetting. To address this issue, we propose a real-time CVDs prediction approach, referred to as ADWIN-GFR, that combines Convolutional Neural Network (CNN) layers, for spatial feature extraction, with Gated Recurrent Units (GRU), for temporal modeling, alongside adaptive drift detection and mitigation mechanisms. The proposed approach integrates Adaptive Windowing (ADWIN) for real-time concept drift detection, a fine-tuning strategy based on Generative Features Replay (GFR) to preserve previously acquired knowledge, and a dynamic replay buffer ensuring variance, diversity, and data distribution coverage. Extensive experiments conducted on the MIT-BIH arrhythmia dataset demonstrate that ADWIN-GFR outperforms standard fine-tuning techniques, achieving an average post-drift accuracy of 95.4%, a macro F1-score of 93.9%, and a remarkably low forgetting score of 0.9%. It also exhibits an average drift detection delay of 12 steps and achieves an adaptation gain of 17.2%. These findings underscore the potential of ADWIN-GFR for deployment in real-world cardiac monitoring systems, including wearable ECG devices and hospital-based patient monitoring platforms.
Keywords: real-time cardiovascular disease prediction; concept drift detection; catastrophic forgetting; fine-tuning; electrocardiogram; convolutional neural networks; gated recurrent units; adaptive windowing; generative feature replay
Online Deep Network Driven by Target Decoupling
10
Authors: 郭虎升, 申聪, 夏浩森, 王文剑. 《小型微型计算机系统》 (Journal of Chinese Computer Systems), PKU Core, 2026, No. 1, pp. 42-50 (9 pages).
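Concept drift is an unavoidable and difficult problem in data stream mining; its typical characteristic is that the data distribution may change over time. To address the overfitting that existing models exhibit on data stream classification tasks, this paper proposes an Online Deep Network driven by Target Decoupling (ODNTD). First, the model learns a task-agnostic feature extractor from the historical data stream, achieving unbiased representation learning with respect to tasks and thereby enhancing the model's generalization ability. Second, the model uses task-specific weight adjustment so that the task-agnostic general feature representation can adapt to concrete tasks; this weight learning for the target task further improves the model's adaptability. Experimental results show that the proposed method generalizes well on data streams with concept drift.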
Keywords: concept drift; representation learning; weight learning; adaptive deep network; feature representation distillation
An Adaptive Portfolio Strategy Based on LSTM and Online Correction in an End-to-End Framework
11
Authors: 刘悦, 张永, 黎嘉豪, 王晓辉. 《系统管理学报》 (Journal of Systems & Management), PKU Core, 2026, No. 1, pp. 233-246 (14 pages).
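Deep learning has a strong memory for long-sequence information and can effectively model complex relationships. This paper uses a many-to-many long short-term memory (LSTM) network to study portfolio strategies in an end-to-end framework. First, within an end-to-end deep learning framework, a portfolio strategy is constructed by combining a many-to-many LSTM neural network with a sliding-window technique. Second, taking the uniform constant rebalanced strategy over a fixed historical window as the benchmark, the recent performance of the neural network strategy is evaluated online and the strategy is corrected to mitigate concept drift. Third, the corrected strategies under multiple historical windows are integrated to form a robust portfolio strategy. Finally, numerical analysis on domestic and international market data shows that the strategy outperforms the comparison strategies in robustness, profitability, and sensitivity to transaction fee rates.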
Keywords: portfolio; end-to-end learning; many-to-many long short-term memory network; online correction; concept drift
A Classification Algorithm for Concept-Drifting Data Streams Based on Incremental Weighting
12
Authors: 吴勇华, 梅颖, 卢诚波. 《上海交通大学学报》 (Journal of Shanghai Jiao Tong University), PKU Core, 2026, No. 1, pp. 112-122, I0005-I0007 (14 pages).
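Concept drift is one of the most common phenomena in data stream mining: the knowledge patterns implicit in a data stream change dynamically over time, which degrades the accuracy of previously built classifiers. To address this problem, a classification algorithm for concept-drifting data streams based on incremental weighting (SCIW) is proposed. The algorithm adopts a heuristic weight-update strategy combined with an adaptive method based on accuracy differences, and also improves the Poisson-distribution-based resampling strategy. SCIW adapts to different types of concept drift and effectively mitigates the decline in classifier accuracy. Experiments on 14 synthetic and 6 real-world datasets show that SCIW and the adaptive random forest (ARF) algorithm perform well in accuracy, clearly outperforming the other compared algorithms; SCIW is clearly better than ARF in time and memory consumption, with an overall average time consumption of about 83% of ARF's and an overall average memory consumption of about 13% of ARF's.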
Keywords: data stream; concept drift; classification algorithm; ensemble learning; adaptive
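The SCIW abstract says it improves a Poisson-distribution-based resampling strategy but does not describe the improvement. For orientation, the sketch below shows the standard starting point used by online bagging style ensembles: each base learner sees each incoming instance k times, with k drawn from a Poisson distribution. The sampler uses Knuth's multiplication method; the function names, the `lam` parameter, and the seeding are assumptions for this example, and none of it reproduces SCIW's modified strategy.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (adequate for small lam)."""
    limit = math.exp(-lam)
    k = 0
    product = rng.random()
    # Multiply uniform draws until the running product drops to exp(-lam) or below.
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def resampling_weights(n_learners, lam=1.0, seed=0):
    """How many times each base learner should train on the current instance."""
    rng = random.Random(seed)
    return [poisson(lam, rng) for _ in range(n_learners)]
```

A weight of 0 means the learner skips the instance, which is what gives each ensemble member a different view of the stream; larger values of `lam` make every learner train on most instances more than once.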
A KNN Fault Diagnosis Model Reduced by an Artificial Endocrine Mechanism and Its Application
13
Authors: 张丰硕, 赵理, 郭鹏旭. 《重庆理工大学学报(自然科学)》 (Journal of Chongqing University of Technology (Natural Science)), PKU Core, 2026, No. 2, pp. 133-140 (8 pages).
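During real-time operation of new energy vehicles, the unbounded nature of the data stream and concept drift make it difficult to update a static KNN (k-nearest neighbors) fault diagnosis model online. To solve this problem, a new energy vehicle fault diagnosis model that improves KNN with an artificial endocrine system (AES-KNN) is proposed. The model treats samples in the online data stream of new energy vehicles as cells and uses an artificial endocrine regulation mechanism to update the classification boundary in the data stream online. Boundary points are reduced based on real-time detection of hormone concentration, a threshold is used to detect whether concept drift has occurred in the data stream, and the online-updated KNN boundary is used for fault diagnosis. In tests on the vehicle fault dataset of the national big data platform for new energy vehicles, the running time was 56.0% lower than that of the traditional incremental KNN model, and the F1 score was 0.99% and 0.75% higher than those of WIN-KNN and RW-KNN, respectively. Experiments show that the model handles concept drift effectively while reducing fault diagnosis time, and achieves higher diagnostic accuracy than traditional models.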
Keywords: fault diagnosis; artificial endocrine system; KNN algorithm; concept drift
An Online-Updated Koopman Operator and Its Application to the Loosening and Conditioning (松散回潮) Process
14
Authors: 吴悦, 曹建福, 曹晔. 《重庆理工大学学报(自然科学)》 (Journal of Chongqing University of Technology (Natural Science)), PKU Core, 2026, No. 1, pp. 166-174 (9 pages).
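Many industrial processes exhibit nonlinear characteristics, and modeling them is of great engineering value. By transforming a nonlinear system into a high-dimensional or infinite-dimensional linear space, the Koopman operator allows nonlinear system dynamics to be analyzed with linear system methods. The traditional extended dynamic mode decomposition method usually works in offline batch mode and struggles when the system evolves over time; moreover, changes in the production environment and equipment state often cause concept drift, which invalidates modeling assumptions based on historical data. An online dynamic decomposition algorithm for the Koopman operator (rEDMD) is proposed, which uses mechanisms such as an adaptive forgetting factor and a limit on the trace of the covariance matrix to improve robustness in non-stationary and noisy environments. Simulation and experimental validation on the loosening and conditioning process show that the method outperforms traditional modeling methods in computational efficiency, prediction accuracy, and adaptability, and offers a new approach to online modeling of nonlinear production processes.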
Keywords: Koopman operator; recursive least squares; concept drift; online learning; loosening and conditioning; nonlinear systems
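The rEDMD abstract lists recursive least squares among its keywords and mentions a forgetting factor and a covariance trace limit. The sketch below reduces those ideas to a scalar model y ≈ theta * x so that the mechanics are visible. It uses a fixed rather than adaptive forgetting factor and a simple cap on the 1x1 covariance, so it is a simplified stand-in under stated assumptions, not the paper's algorithm; `rls_step`, `forgetting`, and `p_max` are hypothetical names.

```python
def rls_step(theta, p, x, y, forgetting=0.98, p_max=1e4):
    """One scalar recursive-least-squares update for the model y ≈ theta * x.

    theta      -- current parameter estimate
    p          -- current (1x1) covariance of the estimate
    forgetting -- factor in (0, 1]; smaller values discount old samples faster
    p_max      -- cap on the covariance, a scalar analogue of a trace limit
    """
    gain = p * x / (forgetting + x * p * x)
    theta = theta + gain * (y - theta * x)
    # Dividing by the forgetting factor inflates the covariance so the
    # estimator stays responsive to changes in the underlying system.
    p = (p - gain * x * p) / forgetting
    # Without a cap, p grows without bound when the input carries no information (x = 0).
    return theta, min(p, p_max)
```

With `forgetting=1.0` this reduces to ordinary recursive least squares; values slightly below 1 let the estimate follow a drifting system at the cost of noisier estimates.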
Learning Association Rules and Tracking the Changing Concepts on Webpages:An Effective Pornographic Websites Filtering Approach
15
Author: Jyh-Jian Sheu. Journal of Electronic Science and Technology (CAS, CSCD), 2018, No. 1, pp. 24-36 (13 pages).
We applied the decision tree algorithm to learn association rules between a webpage's category (pornographic or normal) and the critical features. Based on these rules, we proposed an efficient method of filtering pornographic webpages with the following major advantages: 1) a weighted window-based technique was proposed to estimate the condition of concept drift for the keywords found recently in pornographic webpages; 2) only the contexts of webpages are checked, without scanning pictures; 3) an incremental learning mechanism was designed to incrementally update the pornographic keyword database.
Keywords: concept drift; data mining; decision tree; pornographic websites filtering
Drift Detection Method Using Distance Measures and Windowing Schemes for Sentiment Classification
16
Authors: Idris Rabiu, Naomie Salim, Maged Nasser, Aminu Da'u, Taiseer Abdalla Elfadil Eisa, Mhassen Elnour Elneel Dalam. Computers, Materials & Continua (SCIE, EI), 2023, No. 3, pp. 6001-6017 (17 pages).
Textual data streams have been extensively used in practical applications where consumers of online products have expressed their views regarding online products. Due to changes in data distribution, commonly referred to as concept drift, mining this data stream is a challenging problem for researchers. The majority of the existing drift detection techniques are based on classification errors, which have higher probabilities of false-positive or missed detections. To improve classification accuracy, there is a need to develop more intuitive detection techniques that can identify a great number of drifts in the data streams. This paper presents an adaptive unsupervised learning technique, an ensemble classifier based on drift detection for opinion mining and sentiment classification. To improve classification performance, this approach uses four different dissimilarity measures to determine the degree of concept drifts in the data stream. Whenever a drift is detected, the proposed method builds and adds a new classifier to the ensemble. Before a new classifier is added, the total number of classifiers in the ensemble is checked; if the limit is exceeded, the classifier with the least weight is removed from the ensemble. To this end, a weighting mechanism is used to calculate the weight of each classifier, which decides the contribution of each classifier to the final classification results. Several experiments were conducted on real-world datasets and the results were evaluated on the false positive rate, missed detection rate, and accuracy measures. The proposed method is also compared with state-of-the-art methods, which include DDM, EDDM, and Page-Hinkley with support vector machine (SVM) and Naive Bayes classifiers, which are frequently used in concept drift detection studies. In all cases, the results show the efficiency of our proposed method.
Keywords: data streams; sentiment analysis; concept drift; ensemble classification; adaptive window
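The abstract above contrasts its distance-based detector with error-based baselines such as DDM. For readers unfamiliar with that family, here is a minimal sketch of DDM as it is commonly described: it tracks the running error rate p and its standard deviation s, remembers the point where p + s was lowest, and signals when the current p + s exceeds that minimum by two (warning) or three (drift) standard deviations. The class layout and the default of 30 warm-up samples follow common practice and are assumptions here, not details from the paper.

```python
import math


class DDM:
    """Error-rate drift detector in the style of the Drift Detection Method."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """Feed 1 for a misclassification, 0 otherwise; returns 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.min_samples:
            return "stable"  # too few samples for a meaningful estimate
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s  # remember the best state seen so far
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start over on the new concept
            return "drift"
        if p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"
```

Because the signal depends only on classification errors, it needs true labels and reacts only after accuracy has already dropped, which is the limitation the distance-based approach above is meant to address.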
A Survey of Concept Drift Detection and Adaptation Methods from an Industrial Perspective (cited by: 2)
17
Authors: 周平, 张宇. 《控制与决策》 (Control and Decision), PKU Core, 2025, No. 6, pp. 1774-1792 (19 pages).
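The rapid development of intelligent industrialization has driven continuous innovation in technical equipment, which in turn generates large volumes of real-time data streams. In these streams, the statistical properties of the data may change over time, a phenomenon known as concept drift. Concept drift significantly affects the performance of machine learning models: if it is not identified and handled in time, model performance gradually degrades and leads to wrong decisions, causing non-negligible losses in industrial applications. In view of this, the paper summarizes current research progress on concept drift detection and adaptation from the perspective of industrial applications. First, focusing on industrial concept drift detection in supervised settings, it summarizes in detail the state of the art from the perspectives of performance-based, window-based, and ensemble methods. Second, for the label scarcity common in industrial scenarios, it systematically introduces how semi-supervised and unsupervised learning are applied to industrial concept drift detection; it also discusses the impact of the class imbalance problem prevalent in industrial environments on concept drift detection and reviews strategies for addressing it. Finally, it summarizes concept drift adaptation methods for industrial environments and proposes directions for future research to further improve the performance of concept drift detection methods in complex dynamic environments.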
Keywords: concept drift; industrial scenarios; label scarcity; class imbalance; drift adaptation; survey
A Data Stream Drift Detection Method Based on the Kolmogorov Inequality
18
Authors: 韩萌, 孟凡兴, 李春鹏, 张瑞华, 何菲菲, 丁剑. 《计算机工程与应用》 (Computer Engineering and Applications), PKU Core, 2025, No. 9, pp. 102-115 (14 pages).
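In real-world data environments, the data distribution often changes over time, a phenomenon called concept drift. Concept drift can significantly degrade the performance of the original classification model, so when it occurs the model must be adjusted promptly to adapt to the change in data distribution and keep learning effective. This paper explores the potential of the Kolmogorov inequality for concept drift detection. It proposes an error-rate-based Kolmogorov drift test strategy, designs a concept drift detection method using the Kolmogorov inequality, and uses the algorithm to detect sudden or gradual concept drift in data streams. It also proposes a tail-instance adjustment strategy that reduces the influence of old instances in the drift detection sample set, further lowering drift detection delay. Experiments show that, compared with classical and state-of-the-art drift detectors, the proposed algorithm achieves the best classification accuracy. In drift detection performance, it ranks among the best in both false detection rate and detection delay, achieving a good balance. It also performs well in running time. In an overall comparison across these four metrics it outperforms the other algorithms, meeting the expectations of the study.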
Keywords: concept drift; drift detection; data stream; classification; Kolmogorov inequality
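For reference, the inequality the method above builds on can be stated as follows. This is the standard textbook form of Kolmogorov's maximal inequality; how the paper maps classifier error rates onto the variables X_i is not given in the abstract, so no such mapping is assumed here.

```latex
% X_1, ..., X_n independent, E[X_i] = 0, Var(X_i) finite,
% S_k = X_1 + ... + X_k the partial sums, and lambda > 0:
\[
  \Pr\!\left( \max_{1 \le k \le n} \lvert S_k \rvert \ge \lambda \right)
  \;\le\; \frac{1}{\lambda^{2}} \sum_{i=1}^{n} \operatorname{Var}(X_i)
\]
```

Unlike Chebyshev's inequality, which bounds only the final sum S_n, this bound controls the largest partial sum along the whole sequence, which is what makes it a natural fit for monitoring a stream sample by sample.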
A Dynamic Ensemble Prediction Method for Disk Failure under Concept Drift
19
Authors: 丁建立, 梁烨文, 李静. 《小型微型计算机系统》 (Journal of Chinese Computer Systems), PKU Core, 2025, No. 5, pp. 1105-1111 (7 pages).
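In large-scale data centers, disk logs are generated continuously over time, and the distribution of disk log data changes unpredictably over time, producing concept drift. However, most current disk failure prediction methods are trained offline; their prediction performance gradually degrades over time and they cannot respond to changes in the data distribution. To address this problem, a dynamic ensemble prediction method for disk failure under concept drift, AIDF, is proposed. Every stage of the method, from data analysis to disk failure prediction, runs dynamically, making it a complete automated disk failure prediction method. First, the overall architecture of AIDF is presented. Second, the dynamic ensemble prediction model for disk failure is built, which involves three aspects: real-time analysis of the disk data stream; an improved concept drift detection process for the base learners according to the types of concept drift present in the disk data stream, with dynamic weights assigned to base learners based on disk failure prediction performance to build the ensemble learning model; and, to handle feature selection updates in the disk data stream, a concept-drift-based algorithm for dynamic feature updating and model retraining, which retrains the base learners on data from the recent window when concept drift occurs in the disk data stream and the selected optimal feature set changes. Experimental results show that AIDF copes well with the aging of disk failure prediction models, maintains a failure detection rate above 95% over the long term, and is suitable for real dynamic application environments.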
Keywords: disk failure; concept drift; ensemble learning; dynamic prediction; incremental learning
Elastic Gradient Ensemble for Concept Drift Adaptation
20
Authors: 郭虎升, 张羽桐, 王文剑. 《计算机研究与发展》 (Journal of Computer Research and Development), PKU Core, 2025, No. 5, pp. 1235-1247 (13 pages).
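With the emergence of massive streaming data, concept drift has become an important and challenging problem in streaming data mining. At present, most ensemble learning methods neither identify the type of concept drift specifically nor adopt an efficient ensemble adaptation strategy, so model performance is uneven across drift types. To this end, an elastic gradient ensemble for concept drift adaptation (EGE_CD) method is proposed. The method first extracts the gradient boosting residuals and computes the flowing residual ratio to detect drift points, then computes the residual volatility to identify the drift type. Next, it uses changes in learner loss to extract the drifting learners and deletes the corresponding learners according to the drift type and residual distribution characteristics, achieving elastic gradient pruning. Finally, it combines incremental learning with sliding sampling, optimizes the learner fitting process by computing the optimal fitting rate, and then performs incremental gradient growth according to residual changes. Experimental results show that the proposed method improves the model's stability and adaptability across drift types and achieves good generalization performance.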
Keywords: concept drift; drift type; gradient boosting; drift detection; elastic gradient pruning; incremental gradient growth