192 articles found
LeaDen-Stream: A Leader Density-Based Clustering Algorithm over Evolving Data Stream
1
Authors: Amineh Amini, Teh Ying Wah. Journal of Computer and Communications, 2013, Issue 5, pp. 26-31 (6 pages)
Clustering evolving data streams is important to be performed in a limited time with a reasonable quality. The existing micro clustering based methods do not consider the distribution of data points inside the micro cluster. We propose LeaDen-Stream (Leader Density-based clustering algorithm over evolving data Stream), a density-based clustering algorithm using leader clustering. The algorithm is based on a two-phase clustering. The online phase selects the proper mini-micro or micro-cluster leaders based on the distribution of data points in the micro clusters. Then, the leader centers are sent to the offline phase to form final clusters. In LeaDen-Stream, by carefully choosing between two kinds of micro leaders, we decrease time complexity of the clustering while maintaining the cluster quality. A pruning strategy is also used to filter out real data from noise by introducing dense and sparse mini-micro and micro-cluster leaders. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.
Keywords: evolving data streams; density-based clustering; micro cluster; mini-micro cluster
Adaptive Spectral Clustering Ensemble Selection via Resampling and Population-Based Incremental Learning Algorithm (Cited by: 5)
2
Authors: XU Yuanchun, JIA Jianhua. Wuhan University Journal of Natural Sciences [CAS], 2011, Issue 3, pp. 228-236 (9 pages)
In this paper, we explore a novel ensemble method for spectral clustering. In contrast to the traditional clustering ensemble methods that combine all the obtained clustering results, we propose the adaptive spectral clustering ensemble method to achieve a better clustering solution. This method can adaptively assess the number of the component members, which is not owned by many other algorithms. The component clusterings of the ensemble system are generated by spectral clustering (SC) which bears some good characteristics to engender the diverse committees. The selection process works by evaluating the generated component spectral clustering through resampling technique and population-based incremental learning algorithm (PBIL). Experimental results on UCI datasets demonstrate that the proposed algorithm can achieve better results compared with traditional clustering ensemble methods, especially when the number of component clusterings is large.
Keywords: spectral clustering; clustering ensemble; selective ensemble; resampling; population-based incremental learning algorithm (PBIL); data clustering
Outlier detection based on multi-dimensional clustering and local density
3
Authors: SHOU Zhao-yu, LI Meng-ya, LI Si-min. Journal of Central South University [SCIE, EI, CAS, CSCD], 2017, Issue 6, pp. 1299-1306 (8 pages)
Outlier detection is an important task in data mining. In fact, it is difficult to find the clustering centers in some sophisticated multidimensional datasets and to measure the deviation degree of each potential outlier. In this work, an effective outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) is proposed. ODBMCLD firstly identifies the center objects by the local density peak of data objects, and clusters the whole dataset based on the center objects. Then, outlier objects belonging to different clusters will be marked as candidates of abnormal data. Finally, the top N points among these abnormal candidates are chosen as final anomaly objects with high outlier factors. The feasibility and effectiveness of the method are verified by experiments.
Keywords: data mining; outlier detection; outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) algorithm; deviation degree
Density-Based Clustering Algorithm for Multi-Metric Space Data (Cited by: 2)
4
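Authors: 朱轶凡, 罗程阳, 马瑞遥, 陈璐, 毛玉仁, 高云君. 《软件学报》 (Journal of Software) [PKU Core], 2025, Issue 2, pp. 851-873 (23 pages)
Density-based spatial clustering of applications with noise (DBSCAN) is one of the classic methods in data mining: it can both uncover complex relationships hidden in data and filter out noise, yielding high-quality clusters. However, existing density-based clustering algorithms support only single-modality (single-type) data and struggle in scenarios where multiple modalities (types) coexist. With the rapid development of information technology, data is becoming increasingly multi-modal; real-world data is no longer of a single type but a combination of modalities (types) such as text, images, geographic coordinates, and data features. Existing clustering methods therefore cannot model complex multi-modal data effectively, let alone cluster it efficiently. To address this, a density-based multi-metric space clustering algorithm is proposed. First, to capture the complex relationships among multi-modal data, multi-metric space is used to represent similarity relationships between data objects, and an aggregated multi-metric graph (AMG) index is used to model the multi-modal data. Next, differentiated similarity relationships are used to optimize the graph structure of the AMG, combined with pruning under a best-first search strategy, to achieve efficient multi-modal clustering. Finally, experiments are conducted on real and synthetic datasets under a variety of parameter settings. The results show that the proposed method improves running efficiency by at least one order of magnitude, with high clustering accuracy and good scalability.
Keywords: multi-metric space; multi-metric graph; density-based clustering; data mining; multi-modal data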
Design of a Cluster-Based Data Request Broker Server for Data Grids (Cited by: 1)
5
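Authors: 黄斌, 李春江, 肖侬, 刘波, 付伟. 《计算机应用研究》 (Application Research of Computers) [CSCD, PKU Core], 2004, Issue 9, pp. 185-187 (3 pages)
Data grids provide strong support for data-intensive applications. Data services are the core of a data grid, so the design of the data request broker (DRB) server is key to implementing them. A server with a good architecture and good performance can hide the wide-area distribution and heterogeneity of data and provide unified data access, storage, transfer, and management. A DRB server is implemented on a cluster; it achieves these goals and has a number of advantages. In particular, in high-performance computing with multiple clusters, it can open several connections at once to transfer data in blocks and so obtain aggregate cluster-to-cluster throughput. The detailed design of the cluster-based DRB is presented, the process by which DRBs in multiple autonomous domains cooperate to provide service is described, and the advantages of the design are analyzed.
Keywords: cluster-based; data grid; data request broker server; design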
Network Intrusion Detection Based on SAE-MSCNN
6
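Authors: 王泽辉, 郝秦霞. 《计算机工程与设计》 (Computer Engineering and Design) [PKU Core], 2025, Issue 10, pp. 2858-2865 (8 pages)
Existing network intrusion detection methods overlook how much the correlation among traffic features matters for feature selection, and do not account for the scattered distribution of low-frequency attack samples when balancing data, which degrades detection performance. To address this, a mutual information value fusion (MIVF) method is proposed to select features that are highly relevant to attack behavior yet weakly correlated with one another. A DBSCAN-based improvement of SMOTE is proposed to oversample low-frequency attack samples according to their density-cluster distribution, and an SAE-MSCNN classification model is built to evaluate performance. Validation on the NSL-KDD and UNSW-NB15 datasets gives accuracies of 92.89% and 94.85%, respectively. The results show that the proposed method selects features and balances data effectively, and in particular improves detection accuracy for low-frequency attacks.
Keywords: network intrusion detection; mutual information; feature correlation; feature selection; density clustering; oversampling; data balancing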
Fast Classification of IoT Communication Data Based on Ensemble Learning
7
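Authors: 杨瑞丽, 王俊仃, 邱秀荣. 《通信电源技术》 (Telecom Power Technology), 2025, Issue 5, pp. 4-6 (3 pages)
Data continuously produced by Internet of Things (IoT) devices contains some anomalous records, which lowers the quality and efficiency of IoT communication data classification. A fast classification method for IoT communication data based on ensemble learning is therefore proposed. Communication data is collected from IoT devices; the isolation forest algorithm assigns an anomaly score to each sample, and samples with high anomaly scores are removed. The remaining data is consolidated with the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, and an ensemble learning algorithm then performs the fast classification. Experimental results show that the classification accuracy of the proposed method stays above 97.2% and the mean classification time is about 1.55 s, indicating good application potential.
Keywords: ensemble learning; IoT communication; data classification; density-based spatial clustering of applications with noise (DBSCAN)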
Over-sampling algorithm for imbalanced data classification (Cited by: 13)
8
Authors: XU Xiaolong, CHEN Wen, SUN Yanfei. Journal of Systems Engineering and Electronics [SCIE, EI, CSCD], 2019, Issue 6, pp. 1182-1191 (10 pages)
For imbalanced datasets, the focus of classification is to identify samples of the minority class. The performance of current data mining algorithms is not good enough for processing imbalanced datasets. The synthetic minority over-sampling technique (SMOTE) is specifically designed for learning from imbalanced datasets, generating synthetic minority class examples by interpolating between minority class examples nearby. However, the SMOTE encounters the overgeneralization problem. The density-based spatial clustering of applications with noise (DBSCAN) is not rigorous when dealing with the samples near the borderline. We optimize the DBSCAN algorithm for this problem to make clustering more reasonable. This paper integrates the optimized DBSCAN and SMOTE, and proposes a density-based synthetic minority over-sampling technique (DSMOTE). First, the optimized DBSCAN is used to divide the samples of the minority class into three groups, including core samples, borderline samples and noise samples, and then the noise samples of the minority class are removed to synthesize more effective samples. In order to make full use of the information of core samples and borderline samples, different strategies are used to over-sample core samples and borderline samples. Experiments show that DSMOTE can achieve better results compared with SMOTE and Borderline-SMOTE in terms of precision, recall and F-value.
Keywords: imbalanced data; density-based spatial clustering of applications with noise (DBSCAN); synthetic minority over-sampling technique (SMOTE); over-sampling
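The grouping step described in this abstract can be illustrated with a minimal sketch. It labels minority samples as core, border, or noise using the standard DBSCAN definitions (not the paper's optimized variant), drops the noise, and creates one synthetic sample by SMOTE-style interpolation. The function names, `eps`, `min_pts`, and the toy data are assumptions for illustration; the paper's separate over-sampling strategies for core and borderline samples are not given in the abstract and are not reproduced here.

```python
import math
import random


def label_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (standard DBSCAN roles)."""
    n = len(points)
    # Neighborhood of i: every point within eps of it, including i itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbors[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors[i]):
            labels.append("border")  # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels


def interpolate(a, b, rng):
    """SMOTE-style synthetic sample at a random position on the segment a-b."""
    t = rng.random()
    return tuple(x + t * (y - x) for x, y in zip(a, b))


# Hypothetical minority-class samples: a dense square, one fringe point, one stray.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.22, 0.1), (5.0, 5.0)]
labels = label_points(minority, eps=0.15, min_pts=4)

# Remove noise before synthesizing, as the abstract describes.
kept = [p for p, lab in zip(minority, labels) if lab != "noise"]
rng = random.Random(0)
synthetic = interpolate(kept[0], kept[3], rng)
print(labels)
print(synthetic)
```

In this toy run the stray point at (5.0, 5.0) is labeled noise and excluded, so no synthetic sample can be drawn toward it, which is the over-generalization the abstract says plain SMOTE suffers from.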
Wind Turbine Data Cleaning Method Using Self-Attention-Optimized Density Clustering (Cited by: 1)
9
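Authors: 张茹顶, 张铖, 潘钱宇, 李少帅, 孟井煜枫, 吴博阳. 《微特电机》 (Small & Special Electrical Machines), 2025, Issue 4, pp. 34-38 (5 pages)
Wind turbine supervisory control and data acquisition (SCADA) systems are often affected by a variety of factors that produce anomalous data. To address this, a density clustering model improved with a self-attention encoder is proposed. It combines the feature extraction capability of the self-attention encoder with the spatial properties of density clustering, and introduces relative position encoding and an optimized multi-head attention mechanism to improve the identification of anomalous SCADA data. Experimental results show that the proposed method outperforms traditional methods in both data cleaning effect and model accuracy: the anomalous-data removal rate reaches 26.58%, and when fitting the wind speed-power curve it gives the lowest mean absolute error and root mean square error and the highest coefficient of determination. Applying the cleaned SCADA data to turbine fault diagnosis raises fault identification accuracy to above 92%, improves the timeliness of fault warnings by 20%, and improves fault type classification accuracy by 30%. The method not only improves the operating efficiency and reliability of wind turbines but also provides relatively reliable data support for wind farm operation, management, and decision-making.
Keywords: self-attention encoder; density clustering algorithm; data cleaning; supervisory control and data acquisition (SCADA) system; wind turbine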
Blade Stall Diagnosis Method for Wind Turbines Based on Data Distribution
10
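Authors: 尹业峰, 张家友, 娄斌, 陈亚楠. 《控制与信息技术》 (Control and Information Technology), 2025, Issue 1, pp. 44-49 (6 pages)
As the key equipment for wind energy conversion, wind turbines must operate stably and reliably to secure energy supply and reduce maintenance costs. Fault diagnosis and early warning for wind turbines is a widely studied topic in the industry; blade stall faults in particular directly affect generation efficiency and safety. This paper proposes a method for diagnosing blade stall in wind turbines. It uses the DBSCAN spatial density clustering algorithm to make an initial selection of normal-operation data, derives a lower data boundary from the spatial distribution of that data, and diagnoses whether a blade has stalled against this lower boundary. The method was applied to blade stall diagnosis at several wind farms. Experimental results show that it identifies blade stall effectively, with an average true positive rate of 94.5% and a miss rate of 1.6%. The method offers a new approach to blade stall diagnosis and, through coordinated response with the main control system, can raise generation efficiency and thereby improve the generation performance, safety, and stability of wind turbines.
Keywords: wind turbine; blade stall; DBSCAN algorithm; data distribution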
REMUDA: A Practical Topology Control and Data Forwarding Mechanism for Wireless Sensor Networks
11
Authors: SUN Li-Min, YAN Ting-Xin, BI Yan-Zhong. 《自动化学报》 (Acta Automatica Sinica) [EI, CSCD, PKU Core], 2006, Issue 6, pp. 867-874 (8 pages)
In wireless sensor networks, topology control plays an important role for data forwarding efficiency in the data gathering applications. In this paper, we present a novel topology control and data forwarding mechanism called REMUDA, which is designed for a practical indoor parking lot management system. REMUDA forms a tree-based hierarchical network topology which brings as many nodes as possible to be leaf nodes and constructs a virtual cluster structure. Meanwhile, it takes the reliability, stability and path length into account in the tree construction process. Through an experiment in a network of 30 real sensor nodes, we evaluate the performance of REMUDA and compare it with LEPS which is also a practical routing protocol in TinyOS. Experiment results show that REMUDA can achieve better performance than LEPS.
Keywords: data forwarding mechanism; tree-based hierarchical topology; virtual cluster
Fuzzy Clustering Algorithm for Evolving Data Streams with Soft Constraints (Cited by: 1)
12
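Authors: 代少升, 边志奇, 袁中明. 《重庆邮电大学学报(自然科学版)》 (Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition)) [CSCD, PKU Core], 2024, Issue 2, pp. 287-298 (12 pages)
In multi-source partial discharge (PD) detection, different types of PD signals coexist and change continuously, which makes separating them more challenging; the same situation arises in many data stream clustering scenarios. To handle uneven density within clusters and overlapping boundaries between clusters while tracking data stream drift and evolution in a timely way, a real-time fuzzy clustering algorithm for data streams with soft constraints is proposed. The algorithm introduces two fuzzy soft constraints to describe uncertainty in micro-cluster distance and density, and uses thresholds to partition micro-clusters into core, border, and outlier micro-clusters. At cluster edges it uses fuzzy membership so that a micro-cluster may belong to different clusters, which preserves cluster integrity and improves clustering quality. A two-stage workflow and two time-window models give the algorithm the ability to adapt to changing data streams and lower time and space usage. Experiments on several datasets show that the algorithm improves clustering quality by 1% to 3% over algorithms of the same type and shortens mean running time by 5% to 20%; tests on a real hardware platform also confirm its cluster separation performance.
Keywords: data stream clustering; density clustering; fuzzy clustering; concept drift; partial discharge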
A New Integrated Fuzzifier Evaluation and Selection (NIFEs) Algorithm for Fuzzy Clustering
13
Authors: Chanpaul Jin Wang, Hua Fang, Sun Kim, Ann Moormann, Honggang Wang. Journal of Applied Mathematics and Physics, 2015, Issue 7, pp. 802-807 (6 pages)
Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achieving higher clustering accuracy, which few studies have investigated. Built upon two existing methods on selecting fuzzifier, we developed an integrated fuzzifier evaluation and selection algorithm and tested it using real datasets. Our findings indicate that the consistent optimal number of clusters can be learnt from testing different fuzzifiers for each dataset and the fuzzifier with the lowest value for this consistency should be selected for clustering. Our evaluation also shows that the fuzzifier impacts the clustering accuracy. For longitudinal data with missing values, m = 2 could be an empirical rule to start fuzzy clustering, and the best clustering accuracy was achieved for tested data, especially using our multiple-imputation based fuzzy clustering.
Keywords: fuzzifier; fuzzy C-means; multiple imputation-based fuzzy clustering (MIFuzzy); missing data; longitudinal data
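The effect of the fuzzifier m can be seen in the standard fuzzy C-means membership formula, in which a point's membership in cluster i is 1 / Σ_k (d_i / d_k)^(2/(m-1)), where d_i is its distance to center i. The sketch below is a generic illustration of that formula only, not the NIFEs algorithm; the function name and the example distances are assumptions.

```python
def fcm_memberships(distances, m=2.0):
    """Standard FCM memberships of one point, given its distance to each center.

    Assumes every distance is strictly positive and m > 1.
    """
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# A point at distance 1 from the first center and 3 from the second.
u_m2 = fcm_memberships([1.0, 3.0], m=2.0)  # exponent 2
u_m3 = fcm_memberships([1.0, 3.0], m=3.0)  # exponent 1
print(u_m2, u_m3)
```

With m = 2 the memberships come out near 0.9 and 0.1; with m = 3 they soften to 0.75 and 0.25, which is why the choice of fuzzifier can change the resulting partition.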
Comprehensive ADS-B Data Quality Evaluation Method Based on Clustering and AdaBoost (Cited by: 5)
14
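Authors: 张召悦, 阳颖. 《航空学报》 (Acta Aeronautica et Astronautica Sinica) [EI, CAS, CSCD, PKU Core], 2024, Issue 13, pp. 381-392 (12 pages)
To make better use of ADS-B data, and to address the inability of traditional methods to derive quality levels objectively and accurately during ADS-B data quality evaluation, an ADS-B data quality evaluation index system is constructed from an analysis of the data quality requirements arising from industry applications, transmitter performance, and data security, and a new data quality evaluation method based on the ensemble learning adaptive boosting algorithm (AdaBoost) is proposed. The method determines the best set of quality levels by K-means clustering, assigns data labels by scoring with the entropy weight method and the double base point method (TOPSIS), and trains and optimizes the evaluation model with AdaBoost. Taking data from Tianjin airport as an example, the experiment finds that ADS-B data quality is best divided into 5 levels, and the resulting evaluation model reaches an accuracy of 98.5%. This confirms that the method effectively avoids the influence of subjective factors on the evaluation result, yields the optimal division of quality levels, and improves the stability and precision of the evaluation results.
Keywords: ADS-B data quality; K-means clustering; entropy weight method; double base point method; TOPSIS; adaptive boosting algorithm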
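The scoring step named in this abstract, entropy weighting followed by TOPSIS, can be sketched in its textbook form. This is a generic illustration under stated assumptions, not the paper's indicator system: the function names and the small indicator matrix are hypothetical, all indicators are treated as larger-is-better, and the K-means and AdaBoost stages are omitted.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: columns with more uneven shares get larger weights.

    Assumes all entries are positive and at least one column is not constant.
    """
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        total = sum(row[j] for row in matrix)
        shares = [row[j] / total for row in matrix]
        entropy = -sum(p * math.log(p) for p in shares) / math.log(n)
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    return [d / s for d in divergences]


def topsis_scores(matrix, weights):
    """TOPSIS closeness in [0, 1], treating every column as larger-is-better."""
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(m)]
    worst = [min(row[j] for row in weighted) for j in range(m)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal point
        d_worst = math.dist(row, worst)  # distance to the anti-ideal point
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical indicator matrix: 3 records x 2 quality indicators.
records = [[0.9, 0.8], [0.5, 0.6], [0.2, 0.1]]
weights = entropy_weights(records)
scores = topsis_scores(records, weights)
print(weights, scores)
```

Because the weights come from the data's own entropy rather than from expert judgment, the resulting scores do not depend on a subjective weighting, which matches the motivation stated in the abstract.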
Knowledge Based Consolidation of UML Diagrams for Creation of Virtual Enterprise
15
Authors: Debasis Chanda, Dwijesh Dutta Majumder, Swapan Bhattacharya. Intelligent Information Management, 2010, Issue 3, pp. 159-177 (19 pages)
In this paper we address the problem related to determination of the most suitable candidates for an M&A (Merger & Acquisition) scenario of Banks/Financial Institutions. During the pre-merger period of an M&A, a number of candidates may be available to undergo the Merger/Acquisition, but all of them may not be suitable. The normal practice is to carry out a due diligence exercise to identify the candidates that should lead to optimum increase in shareholder value and customer satisfaction, post-merger. The due diligence ought to be able to determine those candidates that are unsuitable for merger, those candidates that are relatively suitable, and those that are most suitable. Towards achieving the above objective, we propose a Fuzzy Data Mining Framework wherein Fuzzy Cluster Analysis concept is used for advisability of merger of two banks and other Financial Institutions. Subsequently, we propose orchestration/composition of business processes of two banks into consolidated business process during Merger & Acquisition (M&A) scenario. Our paper discusses modeling of individual business process with UML, and the consolidation of the individual business process models by means of our proposed Knowledge Based approach.
Keywords: knowledge base; predicate calculus; service oriented architecture; UML; fuzzy data mining; cluster analysis
Block Point Cloud Denoising Method Based on Improved DBSCAN and Distance Consensus Evaluation (Cited by: 6)
16
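Authors: 葛程鹏, 赵东, 王蕊, 马庆华. 《系统仿真学报》 (Journal of System Simulation) [CAS, CSCD, PKU Core], 2024, Issue 8, pp. 1800-1809 (10 pages)
To remove noise points from point cloud data, a multi-scale point cloud denoising method based on an improved DBSCAN (density-based spatial clustering of applications with noise) algorithm is proposed. A statistical filter first pre-screens isolated outliers to remove large-scale noise from the point cloud. The DBSCAN algorithm is then optimized to reduce its time complexity and adjust its parameters adaptively; it divides the point cloud into normal, suspect, and abnormal clusters, and the abnormal clusters are removed immediately. Finally, a distance consensus evaluation method makes a fine-grained decision on the suspect clusters: each suspect point is judged abnormal or not from its distance to the surface fitted to its nearest normal points, which preserves the key features of the data and the sensitivity of the model. The method is used to denoise two hull block point clouds and is compared with other denoising algorithms. The results show that it has advantages in denoising efficiency and feature preservation and accurately retains the geometric properties of the point cloud data.
Keywords: point cloud denoising; point cloud data; DBSCAN (density-based spatial clustering of applications with noise) clustering; distance consensus evaluation; feature preservation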
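The statistical pre-filter mentioned in this abstract is not specified further there. One common form, shown below as an assumption rather than the paper's implementation, computes each point's mean distance to its k nearest neighbors and discards points whose value exceeds the cloud-wide mean by more than `alpha` standard deviations. The function name, `k`, `alpha`, and the toy cloud are hypothetical.

```python
import math
import statistics


def statistical_filter(points, k=2, alpha=1.0):
    """Keep points whose mean distance to their k nearest neighbors is not
    unusually large compared with the rest of the cloud."""
    mean_knn = []
    for i, p in enumerate(points):
        # Brute-force neighbor search; fine for a sketch, slow for real clouds.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn.append(sum(dists[:k]) / k)
    threshold = statistics.mean(mean_knn) + alpha * statistics.pstdev(mean_knn)
    return [p for p, d in zip(points, mean_knn) if d <= threshold]


# Hypothetical cloud: ten evenly spaced points plus one isolated outlier.
cloud = [(float(x), 0.0, 0.0) for x in range(10)] + [(100.0, 0.0, 0.0)]
filtered = statistical_filter(cloud, k=2, alpha=1.0)
print(len(cloud), len(filtered))
```

A filter of this kind only catches isolated, large-scale outliers; noise that sits close to the surface passes through, which is why the abstract follows it with the clustering and distance-based stages.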
On Density-Based Data Streams Clustering Algorithms: A Survey (Cited by: 10)
17
Authors: Amineh Amini, Teh Ying Wah, Hadi Saboohi. Journal of Computer Science & Technology [SCIE, EI, CSCD], 2014, Issue 1, pp. 116-141 (26 pages)
Clustering data streams has drawn lots of attention in the last few years due to their ever-growing presence. Data streams put additional challenges on clustering such as limited time and memory and one pass clustering. Furthermore, discovering clusters with arbitrary shapes is very important in data stream applications. Data streams are infinite and evolving over time, and we do not have any knowledge about the number of clusters. In a data stream environment due to various factors, some noise appears occasionally. Density-based method is a remarkable class in clustering data streams, which has the ability to discover arbitrary shape clusters and to detect noise. Furthermore, it does not need the number of clusters in advance. Due to data stream characteristics, the traditional density-based clustering is not applicable. Recently, a lot of density-based clustering algorithms are extended for data streams. The main idea in these algorithms is using density-based methods in the clustering process and at the same time overcoming the constraints, which are put out by data stream's nature. The purpose of this paper is to shed light on some algorithms in the literature on density-based clustering over data streams. We not only summarize the main density-based clustering algorithms on data streams, discuss their uniqueness and limitations, but also explain how they address the challenges in clustering data streams. Moreover, we investigate the evaluation metrics used in validating cluster quality and measuring algorithms' performance. It is hoped that this survey will serve as a steppingstone for researchers studying data streams clustering, particularly density-based algorithms.
Keywords: data stream; density-based clustering; grid-based clustering; micro-clustering
Layout Planning of Taxi Stands Based on Machine-Learning Spatial Clustering (Cited by: 2)
18
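Authors: 年光跃, 黄建云, 潘海啸. 《交通运输研究》 (Transport Research), 2024, Issue 1, pp. 10-17, 27 (9 pages)
To counter the negative impact that taxis stopping at will has on urban traffic, and to regulate taxi operations and improve operating conditions for taxis and boarding conditions for residents, a layout planning method for taxi stands is proposed that combines the spatial information of taxi trips with a machine learning algorithm. Taxi trip origins are first extracted from taxi GPS trajectory data. The HDBSCAN clustering algorithm then performs spatial density clustering on the origins, and the center point of each resulting cluster is taken as a candidate location for a taxi stand. Finally, to verify the feasibility and effectiveness of the method, a typical area in the central urban district of Chongqing with diverse land use types and high population density is selected as a case study. The results show that the 107 candidate points are concentrated in commercial centers and residential areas, which largely matches the spatial distribution of areas with high taxi demand; the coverage rate of the planned taxi stands within 300 m reaches 76.0%, and the uncovered areas are mainly urban green space and water bodies. The study shows that machine learning algorithms can plan taxi stand layouts efficiently, but during planning and implementation the siting of stands should also take into account the built-environment characteristics of the adjacent area.
Keywords: urban traffic; layout planning; spatial clustering; taxi stand; trajectory data; machine learning algorithm; HDBSCAN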
Heat Transfer Coefficient of High-Pressure Heaters Based on an Improved Density Clustering Method
19
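Authors: 钱虹, 王海心, 张栋良. 《热能动力工程》 (Journal of Engineering for Thermal Energy and Power) [CAS, CSCD, PKU Core], 2024, Issue 3, pp. 100-108 (9 pages)
To detect and handle abnormal operating economy of high-pressure heaters promptly, the heat transfer coefficient is used as a direct indicator of a heater's operating efficiency, and an online dynamic model of the heat transfer coefficient is derived with time-series data analysis. First, thermodynamic mechanism analysis identifies the main characteristic parameters affecting the heat transfer coefficient of the high-pressure heater, and a dynamic model based on these parameters is built. Second, a density clustering method improved with the dragonfly algorithm constructs an optimized clustering model with optimal neighborhood parameters, yielding a credible terminal temperature difference interval. A comparison of calculated results from a power plant over a period of time shows that the online dynamic model based on the improved density clustering method computes the heat transfer coefficient of the high-pressure heater with a mean squared error (MSE) as low as 0.0305%, which indicates that the model is effective and feasible.
Keywords: high-pressure heater; heat transfer coefficient; data processing; dragonfly algorithm; density clustering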
Automatic Recognition of Phrase Fields in Complex Sentences Based on a Rule Base and Cluster Analysis (Cited by: 9)
20
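Authors: 胡金柱, 俞小娟, 李琼, 周毕吉. 《华中师范大学学报(自然科学版)》 (Journal of Central China Normal University (Natural Sciences)) [CAS, CSCD], 2008, Issue 2, pp. 190-194 (5 pages)
Research on the hierarchical structure and hierarchical relations of complex sentences is a key challenge in moving Chinese information processing from the character and word level to the sentence level. Before the hierarchical division and relations of a complex sentence can be studied, the number of clauses in it must be determined, which requires excluding fields that are not complete clauses (called phrase fields in this paper). Drawing on relevant linguistic theory, a rule base is first built; cluster analysis is then introduced on top of it to classify phrase fields, and the automatic recognition rate for phrase fields finally reaches 92.1%.
Keywords: phrase field; rule base; cluster analysis; variable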