期刊文献+
共找到3,767篇文章
< 1 2 189 >
每页显示 20 50 100
Similarity measurement method of high-dimensional data based on normalized net lattice subspace 被引量:4
1
作者 李文法 Wang Gongming +1 位作者 Li Ke Huang Su 《High Technology Letters》 EI CAS 2017年第2期179-184,共6页
The performance of conventional similarity measurement methods is affected seriously by the curse of dimensionality of high-dimensional data.The reason is that data difference between sparse and noisy dimensionalities... The performance of conventional similarity measurement methods is affected seriously by the curse of dimensionality of high-dimensional data.The reason is that data difference between sparse and noisy dimensionalities occupies a large proportion of the similarity,leading to the dissimilarities between any results.A similarity measurement method of high-dimensional data based on normalized net lattice subspace is proposed.The data range of each dimension is divided into several intervals,and the components in different dimensions are mapped onto the corresponding interval.Only the component in the same or adjacent interval is used to calculate the similarity.To validate this method,three data types are used,and seven common similarity measurement methods are compared.The experimental result indicates that the relative difference of the method is increasing with the dimensionality and is approximately two or three orders of magnitude higher than the conventional method.In addition,the similarity range of this method in different dimensions is [0,1],which is fit for similarity analysis after dimensionality reduction. 展开更多
关键词 high-dimensional data the curse of dimensionality SIMILARITY NORMALIZATION SUBSPACE NPsim
在线阅读 下载PDF
A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection 被引量:1
2
作者 Yanlu Gong Junhai Zhou +2 位作者 Quanwang Wu MengChu Zhou Junhao Wen 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2023年第9期1834-1844,共11页
As a crucial data preprocessing method in data mining,feature selection(FS)can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected featu... As a crucial data preprocessing method in data mining,feature selection(FS)can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features.Evolutionary computing(EC)is promising for FS owing to its powerful search capability.However,in traditional EC-based methods,feature subsets are represented via a length-fixed individual encoding.It is ineffective for high-dimensional data,because it results in a huge search space and prohibitive training time.This work proposes a length-adaptive non-dominated sorting genetic algorithm(LA-NSGA)with a length-variable individual encoding and a length-adaptive evolution mechanism for bi-objective highdimensional FS.In LA-NSGA,an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths,and a Pareto dominance-based length change operator is introduced to guide individuals to explore in promising search space adaptively.Moreover,a dominance-based local search method is employed for further improvement.The experimental results based on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those of existing algorithms. 展开更多
关键词 Bi-objective optimization feature selection(FS) genetic algorithm high-dimensional data length-adaptive
在线阅读 下载PDF
Generalized Functional Linear Models:Efficient Modeling for High-dimensional Correlated Mixture Exposures
3
作者 Bingsong Zhang Haibin Yu +11 位作者 Xin Peng Haiyi Yan Siran Li Shutong Luo Renhuizi Wei Zhujiang Zhou Yalin Kuang Yihuan Zheng Chulan Ou Linhua Liu Yuehua Hu Jindong Ni 《Biomedical and Environmental Sciences》 2025年第8期961-976,共16页
Objective Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health.Analysis of these mixture exposures presents several key challenges for environmental epidemio... Objective Humans are exposed to complex mixtures of environmental chemicals and other factors that can affect their health.Analysis of these mixture exposures presents several key challenges for environmental epidemiology and risk assessment,including high dimensionality,correlated exposure,and subtle individual effects.Methods We proposed a novel statistical approach,the generalized functional linear model(GFLM),to analyze the health effects of exposure mixtures.GFLM treats the effect of mixture exposures as a smooth function by reordering exposures based on specific mechanisms and capturing internal correlations to provide a meaningful estimation and interpretation.The robustness and efficiency was evaluated under various scenarios through extensive simulation studies.Results We applied the GFLM to two datasets from the National Health and Nutrition Examination Survey(NHANES).In the first application,we examined the effects of 37 nutrients on BMI(2011–2016 cycles).The GFLM identified a significant mixture effect,with fiber and fat emerging as the nutrients with the greatest negative and positive effects on BMI,respectively.For the second application,we investigated the association between four pre-and perfluoroalkyl substances(PFAS)and gout risk(2007–2018 cycles).Unlike traditional methods,the GFLM indicated no significant association,demonstrating its robustness to multicollinearity.Conclusion GFLM framework is a powerful tool for mixture exposure analysis,offering improved handling of correlated exposures and interpretable results.It demonstrates robust performance across various scenarios and real-world applications,advancing our understanding of complex environmental exposures and their health impacts on environmental epidemiology and toxicology. 展开更多
关键词 Mixture exposure modeling Functional data analysis high-dimensional data Correlated exposures Environmental epidemiology
暂未订购
Syn-Aug:An Effective and General Synchronous Data Augmentation Framework for 3D Object Detection
4
作者 Huaijin Liu Jixiang Du +2 位作者 Yong Zhang Hongbo Zhang Jiandian Zeng 《CAAI Transactions on Intelligence Technology》 2025年第3期912-928,共17页
Data augmentation plays an important role in boosting the performance of 3D models,while very few studies handle the 3D point cloud data with this technique.Global augmentation and cut-paste are commonly used augmenta... Data augmentation plays an important role in boosting the performance of 3D models,while very few studies handle the 3D point cloud data with this technique.Global augmentation and cut-paste are commonly used augmentation techniques for point clouds,where global augmentation is applied to the entire point cloud of the scene,and cut-paste samples objects from other frames into the current frame.Both types of data augmentation can improve performance,but the cut-paste technique cannot effectively deal with the occlusion relationship between the foreground object and the background scene and the rationality of object sampling,which may be counterproductive and may hurt the overall performance.In addition,LiDAR is susceptible to signal loss,external occlusion,extreme weather and other factors,which can easily cause object shape changes,while global augmentation and cut-paste cannot effectively enhance the robustness of the model.To this end,we propose Syn-Aug,a synchronous data augmentation framework for LiDAR-based 3D object detection.Specifically,we first propose a novel rendering-based object augmentation technique(Ren-Aug)to enrich training data while enhancing scene realism.Second,we propose a local augmentation technique(Local-Aug)to generate local noise by rotating and scaling objects in the scene while avoiding collisions,which can improve generalisation performance.Finally,we make full use of the structural information of 3D labels to make the model more robust by randomly changing the geometry of objects in the training frames.We verify the proposed framework with four different types of 3D object detectors.Experimental results show that our proposed Syn-Aug significantly improves the performance of various 3D object detectors in the KITTI and nuScenes datasets,proving the effectiveness and generality of Syn-Aug.On KITTI,four different types of baseline models using Syn-Aug improved mAP by 0.89%,1.35%,1.61%and 1.14%respectively.On nuScenes,four different types of baseline models using Syn-Aug improved mAP by 14.93%,10.42%,8.47%and 6.81%respectively.The code is available at https://github.com/liuhuaijjin/Syn-Aug. 展开更多
关键词 3D object detection data augmentation DIVERSITY GENERALIZATION point cloud ROBUSTNESS
在线阅读 下载PDF
A nearest neighbor search algorithm of high-dimensional data based on sequential NPsim matrix
5
作者 李文法 Wang Gongming +1 位作者 Ma Nan Liu Hongzhe 《High Technology Letters》 EI CAS 2016年第3期241-247,共7页
Problems existin similarity measurement and index tree construction which affect the performance of nearest neighbor search of high-dimensional data. The equidistance problem is solved using NPsim function to calculat... Problems existin similarity measurement and index tree construction which affect the performance of nearest neighbor search of high-dimensional data. The equidistance problem is solved using NPsim function to calculate similarity. And a sequential NPsim matrix is built to improve indexing performance. To sum up the above innovations,a nearest neighbor search algorithm of high-dimensional data based on sequential NPsim matrix is proposed in comparison with the nearest neighbor search algorithms based on KD-tree or SR-tree on Munsell spectral data set. Experimental results show that the proposed algorithm similarity is better than that of other algorithms and searching speed is more than thousands times of others. In addition,the slow construction speed of sequential NPsim matrix can be increased by using parallel computing. 展开更多
关键词 nearest neighbor search high-dimensional data SIMILARITY indexing tree NPsim KD-TREE SR-tree Munsell
在线阅读 下载PDF
A State-Migration Particle Swarm Optimizer for Adaptive Latent Factor Analysis of High-Dimensional and Incomplete Data
6
作者 Jiufang Chen Kechen Liu +4 位作者 Xin Luo Ye Yuan Khaled Sedraoui Yusuf Al-Turki MengChu Zhou 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第11期2220-2235,共16页
High-dimensional and incomplete(HDI) matrices are primarily generated in all kinds of big-data-related practical applications. A latent factor analysis(LFA) model is capable of conducting efficient representation lear... High-dimensional and incomplete(HDI) matrices are primarily generated in all kinds of big-data-related practical applications. A latent factor analysis(LFA) model is capable of conducting efficient representation learning to an HDI matrix,whose hyper-parameter adaptation can be implemented through a particle swarm optimizer(PSO) to meet scalable requirements.However, conventional PSO is limited by its premature issues,which leads to the accuracy loss of a resultant LFA model. To address this thorny issue, this study merges the information of each particle's state migration into its evolution process following the principle of a generalized momentum method for improving its search ability, thereby building a state-migration particle swarm optimizer(SPSO), whose theoretical convergence is rigorously proved in this study. It is then incorporated into an LFA model for implementing efficient hyper-parameter adaptation without accuracy loss. Experiments on six HDI matrices indicate that an SPSO-incorporated LFA model outperforms state-of-the-art LFA models in terms of prediction accuracy for missing data of an HDI matrix with competitive computational efficiency.Hence, SPSO's use ensures efficient and reliable hyper-parameter adaptation in an LFA model, thus ensuring practicality and accurate representation learning for HDI matrices. 展开更多
关键词 data science generalized momentum high-dimensional and incomplete(HDI)data hyper-parameter adaptation latent factor analysis(LFA) particle swarm optimization(PSO)
在线阅读 下载PDF
Censored Composite Conditional Quantile Screening for High-Dimensional Survival Data
7
作者 LIU Wei LI Yingqiu 《应用概率统计》 CSCD 北大核心 2024年第5期783-799,共17页
In this paper,we introduce the censored composite conditional quantile coefficient(cC-CQC)to rank the relative importance of each predictor in high-dimensional censored regression.The cCCQC takes advantage of all usef... In this paper,we introduce the censored composite conditional quantile coefficient(cC-CQC)to rank the relative importance of each predictor in high-dimensional censored regression.The cCCQC takes advantage of all useful information across quantiles and can detect nonlinear effects including interactions and heterogeneity,effectively.Furthermore,the proposed screening method based on cCCQC is robust to the existence of outliers and enjoys the sure screening property.Simulation results demonstrate that the proposed method performs competitively on survival datasets of high-dimensional predictors,particularly when the variables are highly correlated. 展开更多
关键词 high-dimensional survival data censored composite conditional quantile coefficient sure screening property rank consistency property
在线阅读 下载PDF
Dimensionality Reduction of High-Dimensional Highly Correlated Multivariate Grapevine Dataset
8
作者 Uday Kant Jha Peter Bajorski +3 位作者 Ernest Fokoue Justine Vanden Heuvel Jan van Aardt Grant Anderson 《Open Journal of Statistics》 2017年第4期702-717,共16页
Viticulturists traditionally have a keen interest in studying the relationship between the biochemistry of grapevines’ leaves/petioles and their associated spectral reflectance in order to understand the fruit ripeni... Viticulturists traditionally have a keen interest in studying the relationship between the biochemistry of grapevines’ leaves/petioles and their associated spectral reflectance in order to understand the fruit ripening rate, water status, nutrient levels, and disease risk. In this paper, we implement imaging spectroscopy (hyperspectral) reflectance data, for the reflective 330 - 2510 nm wavelength region (986 total spectral bands), to assess vineyard nutrient status;this constitutes a high dimensional dataset with a covariance matrix that is ill-conditioned. The identification of the variables (wavelength bands) that contribute useful information for nutrient assessment and prediction, plays a pivotal role in multivariate statistical modeling. In recent years, researchers have successfully developed many continuous, nearly unbiased, sparse and accurate variable selection methods to overcome this problem. This paper compares four regularized and one functional regression methods: Elastic Net, Multi-Step Adaptive Elastic Net, Minimax Concave Penalty, iterative Sure Independence Screening, and Functional Data Analysis for wavelength variable selection. Thereafter, the predictive performance of these regularized sparse models is enhanced using the stepwise regression. This comparative study of regression methods using a high-dimensional and highly correlated grapevine hyperspectral dataset revealed that the performance of Elastic Net for variable selection yields the best predictive ability. 展开更多
关键词 high-dimensional data MULTI-STEP Adaptive Elastic Net MINIMAX CONCAVE Penalty Sure Independence Screening Functional data Analysis
暂未订购
Optimal Estimation of High-Dimensional Covariance Matrices with Missing and Noisy Data
9
作者 Meiyin Wang Wanzhou Ye 《Advances in Pure Mathematics》 2024年第4期214-227,共14页
The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently influenced by high dimensions and noise. However, most relevant studies are based o... The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently influenced by high dimensions and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy sample under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator , and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, numerical simulation analysis is performed. The result shows that for missing samples with sub-Gaussian noise, if the true covariance matrix is sparse, the hard thresholding estimator outperforms the traditional estimate method. 展开更多
关键词 high-dimensional Covariance Matrix Missing data Sub-Gaussian Noise Optimal Estimation
在线阅读 下载PDF
Making Short-term High-dimensional Data Predictable
10
作者 CHEN Luonan 《Bulletin of the Chinese Academy of Sciences》 2018年第4期243-244,共2页
Making accurate forecast or prediction is a challenging task in the big data era, in particular for those datasets involving high-dimensional variables but short-term time series points,which are generally available f... Making accurate forecast or prediction is a challenging task in the big data era, in particular for those datasets involving high-dimensional variables but short-term time series points,which are generally available from real-world systems.To address this issue, Prof. 展开更多
关键词 RDE MAKING SHORT-TERM high-dimensional data Predictable
在线阅读 下载PDF
OBJECT ORIENTED DATA MODELLING WITH APPLICATIONS TO CAD
11
作者 应维云 傅向阳 周儒荣 《Transactions of Nanjing University of Aeronautics and Astronautics》 EI 1996年第2期69+63-68,共7页
An object oriented data modelling in computer aided design (CAD) databases is focused. Starting with the discussion of data modelling requirements for CAD applications, appropriate data modelling features are introdu... An object oriented data modelling in computer aided design (CAD) databases is focused. Starting with the discussion of data modelling requirements for CAD applications, appropriate data modelling features are introduced herewith. A feasible approach to select the “best” data model for an application is to analyze the data which has to be stored in the database. A data model is appropriate for modelling a given task if the information of the application environment can be easily mapped to the data model. Thus, the involved data are analyzed and then object oriented data model appropriate for CAD applications are derived. Based on the reviewed object oriented techniques applied in CAD, object oriented data modelling in CAD is addressed in details. At last 3D geometrical data models and implementation of their data model using the object oriented method are presented. 展开更多
关键词 computer aided design dataBASES data models object oriented data models complex objects geometrical models
在线阅读 下载PDF
基本表定义变更对数据及数据窗口对象(Data Window Object)的影响 被引量:2
12
作者 谢延东 崔毅 《微计算机信息》 2002年第6期67-68,共2页
本文时基本表定义的变更对表内数据及相关数据窗口对象定义的影响进行了分析整理,给出了一个参照结果表。
关键词 表定义 数据 数据窗口对象 POWERBUILDER 数据窗口控件 数据库
在线阅读 下载PDF
Distributed Parallelization of a Global Atmospheric Data Objective Analysis System 被引量:2
13
作者 赵军 宋君强 李振军 《Advances in Atmospheric Sciences》 SCIE CAS CSCD 2003年第1期159-163,共5页
It is difficult to parallelize a subsistent sequential algorithm. Through analyzing the sequential algorithm of a Global Atmospheric Data Objective Analysis System, this article puts forward a distributed parallel alg... It is difficult to parallelize a subsistent sequential algorithm. Through analyzing the sequential algorithm of a Global Atmospheric Data Objective Analysis System, this article puts forward a distributed parallel algorithm that statically distributes data on a massively parallel processing (MPP) computer. The algorithm realizes distributed parailelization by extracting the analysis boxes and model grid point Iatitude rows with leaped steps, and by distributing the data to different processors. The parallel algorithm achieves good load balancing, high parallel efficiency, and low parallel cost. Performance experiments on a MPP computer arc also presented. 展开更多
关键词 distributed parailelization analysis box data distribution objective analysis
在线阅读 下载PDF
Randomized Latent Factor Model for High-dimensional and Sparse Matrices from Industrial Applications 被引量:14
14
作者 Mingsheng Shang Xin Luo +3 位作者 Zhigang Liu Jia Chen Ye Yuan MengChu Zhou 《IEEE/CAA Journal of Automatica Sinica》 EI CSCD 2019年第1期131-141,共11页
Latent factor(LF)models are highly effective in extracting useful knowledge from High-Dimensional and Sparse(HiDS)matrices which are commonly seen in various industrial applications.An LF model usually adopts iterativ... Latent factor(LF)models are highly effective in extracting useful knowledge from High-Dimensional and Sparse(HiDS)matrices which are commonly seen in various industrial applications.An LF model usually adopts iterative optimizers,which may consume many iterations to achieve a local optima,resulting in considerable time cost.Hence,determining how to accelerate the training process for LF models has become a significant issue.To address this,this work proposes a randomized latent factor(RLF)model.It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices,thereby greatly alleviating computational burden.It also extends a standard learning process for randomized neural networks in context of LF analysis to make the resulting model represent an HiDS matrix correctly.Experimental results on three HiDS matrices from industrial applications demonstrate that compared with state-of-the-art LF models,RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data.I provides an important alternative approach to LF analysis of HiDS matrices,which is especially desired for industrial applications demanding highly efficient models. 展开更多
关键词 Big data high-dimensional and sparse matrix latent factor analysis latent factor model randomized learning
在线阅读 下载PDF
Mapping methods for output-based objective speech quality assessment using data mining 被引量:3
15
作者 王晶 赵胜辉 +1 位作者 谢湘 匡镜明 《Journal of Central South University》 SCIE EI CAS 2014年第5期1919-1926,共8页
Objective speech quality is difficult to be measured without the input reference speech.Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm.T... Objective speech quality is difficult to be measured without the input reference speech.Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm.The degraded speech is firstly separated into three classes(unvoiced,voiced and silence),and then the consistency measurement between the degraded speech signal and the pre-trained reference model for each class is calculated and mapped to an objective speech quality score using data mining.Fuzzy Gaussian mixture model(GMM)is used to generate the artificial reference model trained on perceptual linear predictive(PLP)features.The mean opinion score(MOS)mapping methods including multivariate non-linear regression(MNLR),fuzzy neural network(FNN)and support vector regression(SVR)are designed and compared with the standard ITU-T P.563 method.Experimental results show that the assessment methods with data mining perform better than ITU-T P.563.Moreover,FNN and SVR are more efficient than MNLR,and FNN performs best with 14.50% increase in the correlation coefficient and 32.76% decrease in the root-mean-square MOS error. 展开更多
关键词 objective speech quality data mining multivariate non-linear regression fuzzy neural network support vector regression
在线阅读 下载PDF
Behavior Mining of Spatial Objects with Data Field 被引量:2
16
作者 王树良 伍爵博 +2 位作者 程峰 金红 曾寔 《Geo-Spatial Information Science》 2009年第3期202-211,共10页
The advanced data mining technologies and the large quantities of remotely sensed Imagery provide a data mining opportunity with high potential for useful results. Extracting interesting patterns and rules from data s... The advanced data mining technologies and the large quantities of remotely sensed Imagery provide a data mining opportunity with high potential for useful results. Extracting interesting patterns and rules from data sets composed of images and associated ground data can be of importance in object identification, community planning, resource discovery and other areas. In this paper, a data field is presented to express the observed spatial objects and conduct behavior mining on them. First, most of the important aspects are discussed on behavior mining and its implications for the future of data mining. Furthermore, an ideal framework of the behavior mining system is proposed in the network environment. Second, the model of behavior mining is given on the observed spatial objects, including the objects described by the first feature data field and the main feature data field by means of the potential function. Finally, a case study about object identification in public is given and analyzed. The experimental results show that the new model is feasible in behavior mining. 展开更多
关键词 behavior mining data field spatial object identification spatial data mining
原文传递
CSFW-SC: Cuckoo Search Fuzzy-Weighting Algorithm for Subspace Clustering Applying to High-Dimensional Clustering 被引量:1
17
作者 WANG Jindong HE Jiajing +1 位作者 ZHANG Hengwei YU Zhiyong 《China Communications》 SCIE CSCD 2015年第S2期55-63,共9页
Aimed at the issue that traditional clustering methods are not appropriate to high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the exited soft subsp... Aimed at the issue that traditional clustering methods are not appropriate to high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the exited soft subspace clustering algorithm. In the proposed algorithm, a novel objective function is firstly designed by considering the fuzzy weighting within-cluster compactness and the between-cluster separation, and loosening the constraints of dimension weight matrix. Then gradual membership and improved Cuckoo search, a global search strategy, are introduced to optimize the objective function and search subspace clusters, giving novel learning rules for clustering. At last, the performance of the proposed algorithm on the clustering analysis of various low and high dimensional datasets is experimentally compared with that of several competitive subspace clustering algorithms. Experimental studies demonstrate that the proposed algorithm can obtain better performance than most of the existing soft subspace clustering algorithms. 展开更多
关键词 high-dimensional data CLUSTERING soft SUBSPACE CUCKOO SEARCH FUZZY CLUSTERING
在线阅读 下载PDF
Multiple Data Augmentation Strategy for Enhancing the Performance of YOLOv7 Object Detection Algorithm 被引量:4
18
作者 Abdulghani M.Abdulghani Mokhles M.Abdulghani +1 位作者 Wilbur L.Walters Khalid H.Abed 《Journal on Artificial Intelligence》 2023年第1期15-30,共16页
The object detection technique depends on various methods for duplicating the dataset without adding more images.Data augmentation is a popularmethod that assists deep neural networks in achieving better generalizatio... The object detection technique depends on various methods for duplicating the dataset without adding more images.Data augmentation is a popularmethod that assists deep neural networks in achieving better generalization performance and can be seen as a type of implicit regularization.Thismethod is recommended in the casewhere the amount of high-quality data is limited,and gaining new examples is costly and time-consuming.In this paper,we trained YOLOv7 with a dataset that is part of the Open Images dataset that has 8,600 images with four classes(Car,Bus,Motorcycle,and Person).We used five different data augmentations techniques for duplicates and improvement of our dataset.The performance of the object detection algorithm was compared when using the proposed augmented dataset with a combination of two and three types of data augmentation with the result of the original data.The evaluation result for the augmented data gives a promising result for every object,and every kind of data augmentation gives a different improvement.The mAP@.5 of all classes was 76%,and F1-score was 74%.The proposed method increased the mAP@.5 value by+13%and F1-score by+10%for all objects. 展开更多
关键词 Artificial intelligence object detection YOLOv7 data augmentation data brightness data darkness data blur data noise convolutional neural network
在线阅读 下载PDF
Observation points classifier ensemble for high-dimensional imbalanced classification 被引量:1
19
作者 Yulin He Xu Li +3 位作者 Philippe Fournier‐Viger Joshua Zhexue Huang Mianjie Li Salman Salloum 《CAAI Transactions on Intelligence Technology》 SCIE EI 2023年第2期500-517,共18页
In this paper,an Observation Points Classifier Ensemble(OPCE)algorithm is proposed to deal with High-Dimensional Imbalanced Classification(HDIC)problems based on data processed using the Multi-Dimensional Scaling(MDS)... In this paper,an Observation Points Classifier Ensemble(OPCE)algorithm is proposed to deal with High-Dimensional Imbalanced Classification(HDIC)problems based on data processed using the Multi-Dimensional Scaling(MDS)feature extraction technique.First,dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible.Second,a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space.Third,optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples.Exhaustive experiments have been conducted to evaluate the feasibility,rationality,and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets.Experimental results show that(1)the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data;(2)the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased;and(3)statistical analysis reveals that OPCE yields better HDIC performances on the selected data sets in comparison with eight other HDIC algorithms.This demonstrates that OPCE is a viable algorithm to deal with HDIC problems. 展开更多
关键词 classifier ensemble feature transformation high-dimensional data classification imbalanced learning observation point mechanism
在线阅读 下载PDF
上一页 1 2 189 下一页 到第
使用帮助 返回顶部