Journal Articles
2 articles found
1. Stratified and Un-stratified Sampling in Bagging: Data Mining
Authors: Yousef M. T. El Gimati. Journal of Mathematics and System Science, 2021, Issue 1, pp. 29-36 (8 pages).
Stratified sampling is often used in opinion polls to reduce standard errors, and it is known as a variance reduction technique in sampling theory. The most common resampling approach is based on bootstrapping the dataset with replacement. The main purpose of this work is to investigate extensions of resampling methods to classification problems, specifically with decision trees, using a family of stratification models to improve prediction accuracy by aggregating classifiers built on perturbed datasets. We use bagging as a method of estimating a good decision boundary under a family of stratification models. The overall conclusion is that for decision trees, un-stratified bootstrapping with bagging can yield lower error rates than other sampling strategies on simulated datasets. Based on these experimental results, a possible explanation for why un-stratified sampling performs best is that bagging is itself a method of stratification.
Keywords: bootstrapping; decision boundary; stratification models; resampling; classifier
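The contrast the abstract draws between un-stratified and stratified bootstrapping can be sketched in a few lines. This is a hypothetical illustration, not the paper's code: the base classifier is left abstract (the paper uses decision trees), and the function names `bootstrap` and `bagged_predict` are invented for this sketch.

```python
import random
from collections import Counter

def bootstrap(data, stratified=False):
    """Draw a bootstrap sample of the same size, with replacement.

    Un-stratified: each draw picks uniformly from the whole dataset,
    so class proportions fluctuate from sample to sample.
    Stratified: resampling happens within each class separately,
    so the original class counts are preserved exactly.
    """
    if not stratified:
        return [random.choice(data) for _ in data]
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))
    sample = []
    for group in by_class.values():
        sample.extend(random.choice(group) for _ in group)
    return sample

def bagged_predict(models, x):
    """Bagging's aggregation step: majority vote over the ensemble."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

Each ensemble member would be trained on its own `bootstrap(...)` sample; the paper's observation is that letting class proportions vary across samples (the un-stratified case) can itself act as a form of stratification once the votes are aggregated.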
2. Semi-Discrete Optimal Transport for Long-Tailed Classification
Authors: Lian-Bao Jin, Na Lei, Zhong-Xuan Luo, Jin Wu, Chao Ai, Xianfeng Gu. Journal of Computer Science & Technology, 2025, Issue 1, pp. 252-266 (15 pages).
The long-tailed data distribution poses an enormous challenge for training neural networks in classification. A classification network can be decoupled into a feature extractor and a classifier. This paper takes a semi-discrete optimal transport (OT) perspective to analyze the long-tailed classification problem, where the feature space is viewed as a continuous source domain and the classifier weights are viewed as a discrete target domain. The classifier in effect finds a cell decomposition of the feature space, with each cell corresponding to one class. An imbalanced training set causes the more frequent classes to have larger-volume cells, which means that the classifier's decision boundary is biased towards less frequent classes, reducing classification performance in the inference phase. Therefore, we propose a novel OT dynamic softmax loss, which dynamically adjusts the decision boundary in the training phase to avoid overfitting on the tail classes. In addition, our method incorporates a supervised contrastive loss so that the feature space can satisfy the uniform distribution condition. Extensive and comprehensive experiments demonstrate that our method achieves state-of-the-art performance on multiple long-tailed recognition benchmarks, including CIFAR-LT, ImageNet-LT, iNaturalist 2018, and Places-LT.
Keywords: semi-discrete optimal transport; long-tailed classification; decision boundary; supervised contrastive loss
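The boundary bias the abstract describes, where head classes claim oversized cells, is often countered by shifting logits against class priors. The sketch below shows that generic logit-adjustment idea only as an illustration of boundary re-balancing; it is not the paper's OT dynamic softmax loss, which derives the shift from a semi-discrete optimal transport formulation. The function name and the `tau` parameter are assumptions of this sketch.

```python
import math

def adjusted_softmax(logits, class_counts, tau=1.0):
    """Softmax with a class-prior correction on the logits.

    Subtracting tau * log(prior_c) from each logit moves the decision
    boundary away from frequent (head) classes, so tail classes keep a
    cell of reasonable volume in the feature space.
    """
    total = sum(class_counts)
    adj = [z - tau * math.log(c / total)
           for z, c in zip(logits, class_counts)]
    # numerically stable softmax over the adjusted logits
    m = max(adj)
    exps = [math.exp(a - m) for a in adj]
    s = sum(exps)
    return [e / s for e in exps]
```

With equal raw logits and a 90/10 class split, the tail class receives the larger adjusted probability, which is the direction of correction the paper's dynamically adjusted boundary also aims for.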