In today's connected world,the generation of massive streaming data across diverse domains has become commonplace.In the presence of concept drift,class imbalance,label scarcity,and new class emergence,these chall...In today's connected world,the generation of massive streaming data across diverse domains has become commonplace.In the presence of concept drift,class imbalance,label scarcity,and new class emergence,these challenges jointly degrade representation stability,bias learning toward outdated distributions,and reduce the resilience and reliability of detection in dynamic environments.This paper proposes a streaming classincremental learning(SCIL)framework to address these issues.The SCIL framework integrates an autoencoder(AE)with a multi-layer perceptron for multi-class prediction,employs a dual-loss strategy(classification and reconstruction)for prediction and new class detection,uses corrected pseudo-labels for online training,manages classes with queues,and applies oversampling to handle imbalance.The rationale behind the method's structure is elucidated through ablation studies,and a comprehensive experimental evaluation is performed using both real-world and synthetic datasets that feature class imbalance,incremental classes,and concept drifts.Our results demonstrate that SCIL outperforms strong baselines and state-of-the-art methods.In line with our commitment to Open Science,we make our code and datasets available to the community.展开更多
Logistic regression is a fast classifier and can achieve higher accuracy on small training data.Moreover,it can work on both discrete and continuous attributes with nonlinear patterns.Based on these properties of logi...Logistic regression is a fast classifier and can achieve higher accuracy on small training data.Moreover,it can work on both discrete and continuous attributes with nonlinear patterns.Based on these properties of logistic regression,this paper proposed an algorithm,called evolutionary logistical regression classifier(ELRClass),to solve the classification of evolving data streams.This algorithm applies logistic regression repeatedly to a sliding window of samples in order to update the existing classifier,to keep this classifier if its performance is deteriorated by the reason of bursting noise,or to construct a new classifier if a major concept drift is detected.The intensive experimental results demonstrate the effectiveness of this algorithm.展开更多
基金supported by the European Research Council(ERC)under Grant Agreement No.951424(Water-Futures)by the Republic of Cyprus through the Deputy Ministry of Research,Innovation and Digital Policy.
文摘In today's connected world,the generation of massive streaming data across diverse domains has become commonplace.In the presence of concept drift,class imbalance,label scarcity,and new class emergence,these challenges jointly degrade representation stability,bias learning toward outdated distributions,and reduce the resilience and reliability of detection in dynamic environments.This paper proposes a streaming classincremental learning(SCIL)framework to address these issues.The SCIL framework integrates an autoencoder(AE)with a multi-layer perceptron for multi-class prediction,employs a dual-loss strategy(classification and reconstruction)for prediction and new class detection,uses corrected pseudo-labels for online training,manages classes with queues,and applies oversampling to handle imbalance.The rationale behind the method's structure is elucidated through ablation studies,and a comprehensive experimental evaluation is performed using both real-world and synthetic datasets that feature class imbalance,incremental classes,and concept drifts.Our results demonstrate that SCIL outperforms strong baselines and state-of-the-art methods.In line with our commitment to Open Science,we make our code and datasets available to the community.
文摘Logistic regression is a fast classifier and can achieve higher accuracy on small training data.Moreover,it can work on both discrete and continuous attributes with nonlinear patterns.Based on these properties of logistic regression,this paper proposed an algorithm,called evolutionary logistical regression classifier(ELRClass),to solve the classification of evolving data streams.This algorithm applies logistic regression repeatedly to a sliding window of samples in order to update the existing classifier,to keep this classifier if its performance is deteriorated by the reason of bursting noise,or to construct a new classifier if a major concept drift is detected.The intensive experimental results demonstrate the effectiveness of this algorithm.