116 articles found
A Systematic Review of Greenhouse Gas Emissions Derived From Combined Sewer Overflows and Synergistic Control Strategies Toward Carbon Neutrality
1
Authors: Yilin Xu, Cheng Ye, Zuxin Xu, Wenhai Chu. 《Engineering》, 2025, No. 7, pp. 40-51 (12 pages)
Climate change is accelerating globally, raising significant concerns regarding the environmental risks associated with combined sewer overflows (CSOs). These rainfall events lead to the excessive discharge of multiple pollutants into natural waters. However, research on greenhouse gas (GHG) emissions from CSOs, which are crucial for carbon neutrality in urban water systems, remains fragmented. Using the life-cycle assessment method expansion approach, this study breaks down the formation and discharge processes of CSOs and uncovers the underlying mechanisms driving GHG emissions during each period. Given the complexity and uncertainty in the spatial distribution of GHG emissions from CSOs, the development of standard monitoring and estimation methods is vital. This study identifies the factors influencing GHG emissions within the urban drainage system (UDS) and defines the interactive GHG emission boundaries and accounting framework related to CSOs. This framework is expanded to consider the hybrid nature of urban engineering and hydraulic interactions during CSO events. Advanced modeling technologies have emerged as essential tools for predicting and managing GHG emissions from CSOs. This review promotes comprehensive data-driven methods for predicting GHG emissions from CSOs, fully considering the inherent heterogeneity of CSOs and the impact of multi-source contaminants discharged into aquatic environments. It emphasizes refining emission boundary definitions, novel accounting practices adapting data-driven methods, and comprehensive management strategies in line with the move toward carbon neutrality in the UDS. It advocates the adoption of solutions including advanced technologies and artificial intelligence methods to mitigate CSO-related GHG emissions, stressing the significance of integrating low-carbon solutions and a comprehensive data-driven management framework in future research directions.
Keywords: Combined sewer overflow; Greenhouse gas emission; Data-driven models; Urban water management; Integrated control strategy
Poisson Process and Its Application to the Storm Water Overflows (Cited by: 1)
2
Authors: Malick Baldeh, Chris Samba, Kenneth Tuffour, Assane Boya. 《Computational Water, Energy, and Environmental Engineering》, 2016, No. 2, pp. 47-53 (7 pages)
The homogeneous Poisson process is often used to describe event arrivals and has been applied in various areas. This study focuses on the arrival pattern of storm water overflows. A set of overflow data was obtained from the storm water pipeline of a municipality. The aim is to verify the overflow arrival pattern and check whether the Poisson process can be applied. The adopted method is an analysis of the inter-arrival times. An exponential distribution test is conducted on each annual data set as well as on the entire data set. The results show that all data sets follow the exponential distribution. With the Poisson process verified, specific examples are given to show how its properties can be used in storm water pipeline management. For data with various heterogeneities, the homogeneous Poisson process might not be verifiable or applicable. Under such circumstances, a non-homogeneous survival model can be used to simulate the arrival process.
Keywords: Storm Water Overflow; Poisson Process; Exponential Distribution; Weibull Distribution
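The inter-arrival check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say which exponential test was used, so the Kolmogorov-Smirnov distance here is an assumption, and the overflow times and the rate of 0.2 events per day are synthetic, hypothetical values.

```python
import math
import random

def inter_arrival_times(event_times):
    """Return the gaps between consecutive event times."""
    ordered = sorted(event_times)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def ks_statistic_exponential(gaps):
    """Kolmogorov-Smirnov distance between the empirical CDF of the gaps
    and an exponential CDF whose rate is estimated as 1 / mean(gaps)."""
    n = len(gaps)
    rate = n / sum(gaps)
    d = 0.0
    for i, x in enumerate(sorted(gaps), start=1):
        cdf = 1.0 - math.exp(-rate * x)
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Synthetic overflow times (hypothetical, in days) drawn from a
# homogeneous Poisson process with an assumed rate of 0.2 events/day.
rng = random.Random(0)
t, times = 0.0, []
for _ in range(500):
    t += rng.expovariate(0.2)
    times.append(t)

gaps = inter_arrival_times(times)
d_stat = ks_statistic_exponential(gaps)
print(f"KS distance to fitted exponential: {d_stat:.3f}")
```

A small distance is consistent with exponential inter-arrival times and hence with a homogeneous Poisson process. Because the rate is estimated from the same data, standard KS critical values are only approximate here.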
Arrival Analysis of Dry Weather Sanitary Sewer Overflows
3
Authors: Kenneth Tuffour, Chris Samba. 《Open Journal of Civil Engineering》, 2016, No. 3, pp. 462-468 (7 pages)
This study investigates arrivals of sanitary sewer overflows (SSOs) collected from a municipality. The data set consists of overflows recorded during dry weather from 2011 to 2014. Reliability analysis is conducted on each data set, using the Weibull distribution to evaluate it. The results show that the arrival of dry weather SSOs cannot simply be modeled with a Poisson process, which is characterized by a constant arrival rate. For the annual data sets, the 2-parameter Weibull distribution generally fits acceptably (except for the 2014 data). The shape parameters are close to 1 or slightly greater than 1, indicating a relatively constant or slightly increasing arrival rate. For the entire data set, the 3-parameter Weibull distribution fits the data well, and its shape parameter is also greater than 1. An increasing SSO arrival rate is therefore observed for this data set, and more effort is needed to maintain the sewer system.
Keywords: Poisson Process; Weibull Distribution; Sanitary Sewer Overflow; Shape parameter
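The link between the Weibull shape parameter and the arrival rate can be illustrated with the standard 2-parameter Weibull hazard function, h(t) = (k/λ)(t/λ)^(k-1). The sketch below is illustrative only; the shape and scale values are hypothetical and are not taken from the paper.

```python
def weibull_hazard(t, shape, scale):
    """Hazard (instantaneous arrival) rate of a 2-parameter Weibull
    distribution with the given shape k and scale lambda."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Hypothetical parameters: shape 1.0 reduces to the exponential case,
# shape 1.3 gives a rate that grows with elapsed time t.
for shape in (1.0, 1.3):
    rates = [weibull_hazard(t, shape, scale=30.0) for t in (10.0, 30.0, 90.0)]
    print(shape, [round(r, 4) for r in rates])
```

With a shape of exactly 1 the hazard is constant, which is the Poisson-process case; a shape above 1 gives a hazard that increases over time, matching the abstract's reading of an increasing SSO arrival rate.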
Erratum to: Flooding (or breaching) of inter-connected proglacial lakes by cascading overflow in the arid region of Western Mongolia (Mt. Tsambagarav, Mongolian Altai)
4
Authors: Otgonbayar DEMBEREL, Chinmay DASH, Battsetseg DUGERSUREN, Munkhbat BAYARMAA, Yeong Bae SEONG, Elora CHAKRABORTY, Batsuren DORJSUREN, Atul SINGH, Nemekhbayar GANHUYAG. 《Journal of Mountain Science》, 2025, No. 5, p. 1888 (1 page)
The author affiliation and the funding information in the Acknowledgement section of the online version of the original article were revised. One affiliation (the 8th affiliation) of the first author is added. The Acknowledgement section of the original article has been revised to: Acknowledgments: This research was funded by the National University of Mongolia under grant agreement P2023 (grant number P2023-4578) and supported by the Chey Institute for Advanced Studies "International Scholarship Exchange Fellowship for the academic year of 2024-2025", Republic of Korea, and the National University of Mongolia. We would like to acknowledge the National University of Mongolia and Soumik Das from the Center for the Study of Regional Development, Jawaharlal Nehru University, New Delhi-110067, for his valuable assistance in preparing the geological maps.
Keywords: Mongolia; proglacial lakes; Mongolian Altai; arid region; Tsambagarav; cascading overflow; geological maps; flood
Flood Overflows Jinshan Temple (Chinese Painting)
5
《Women of China》, 1999, No. 6, p. 30 (1 page)
The Chinese painting Flood Overflows Jinshan Temple draws its subject from a beautiful and well-known legend, The White Snake. In the tale, Jinshan was an islet in the Yangtze River of yesteryear. In order to save her husband, who was kept in a temple at the top of the isle, Bai Niangzi, the incarnation of the white snake, bravely fought Monk Fahai. She borrowed the Yangtze River's water to overcome Jinshan Temple and force Fahai to release her husband.
Keywords: Flood overflows; Jinshan Temple; Chinese Painting
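A Software Data Clustering Method Combining Code Snippets and Hybrid Topic Models (Cited by: 2)
6
Authors: 魏林林, 沈国华, 黄志球, 蔡梦男, 郭菲菲. 《计算机科学》 (CSCD, PKU Core), 2024, No. 6, pp. 44-51 (8 pages)
Document clustering with topic models is a common practice in many text mining tasks. Many studies apply topic-model clustering to data from software Q&A websites to analyze how different fields develop within the community. However, such software-related data often contain code snippets and have unevenly distributed text lengths, so modeling the text with a single traditional topic model tends to yield unstable clustering results. This paper proposes a clustering method that combines code snippets with hybrid topic models. Using Stack Overflow as the data source, it constructs a dataset of the 60 Python third-party libraries with the most questions on the platform. After modeling, the dataset is divided into six fields: network security, data analysis, artificial intelligence, text processing, software development, and system terminal. Experimental results show that, on both automatic and manual evaluation metrics, topic modeling on code snippets combined with text yields good cluster partition quality, and that combining multiple models improves the stability and accuracy of the clustering results to some extent.
Keywords: code snippets; topic model; Stack Overflow; Python; clustering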
Effects of retained dry material on the impact, overflow and landing dynamics
7
Authors: Jun Fang, Yifei Cui, Haiming Liu. 《Journal of Rock Mechanics and Geotechnical Engineering》 (SCIE, CSCD), 2024, No. 9, pp. 3629-3640 (12 pages)
During long-term operation, the performance of obstacles changes as material accumulates upslope of the obstacle. However, the effects of retained material on the impact, overflow and landing dynamics of granular flow have not yet been elucidated. To address this gap, physical flume tests and discrete element simulations are conducted over a range of normalized deposition height h0/H from 0 to 1, where h0 and H represent the deposition height and obstacle height, respectively. An analytical model is modified to evaluate the flow velocity and flow depth after interaction with the retained material, which are then used to calculate the peak impact force on the obstacle. Notably, the computed impact forces successfully predict the experimental results when a ≥ 25°. In addition, the results indicate that a higher h0/H leads to a lower dynamic impact force, a greater landing distance L, and a larger landing coefficient Cr, where Cr is the ratio of the slope-parallel component of landing velocity to the flow velocity just before landing. The existing overflow model underestimates the measured landing distance L by up to 30%, and is therefore insufficient for obstacle design when there is retained material. Moreover, the Cr recommended in current design practice is found to be nonconservative for estimating the landing velocity of geophysical flow. This study provides a scientific basis for designing obstacles with deposition.
Keywords: Granular flow; Obstacle; Deposition; Impact; Overflow; Landing
Optimized parameters of downhole all-metal PDM based on genetic algorithm
8
Authors: Jia-Xing Lu, Ling-Rong Kong, Yu Wang, Chao Feng, Yu-Lin Gao. 《Petroleum Science》 (SCIE, EI, CAS, CSCD), 2024, No. 4, pp. 2663-2676 (14 pages)
Currently, deep drilling operates under extreme conditions of high temperature and high pressure, demanding more from subterranean power motors. The all-metal positive displacement motor (PDM), known for its robust performance, is a critical choice for such drilling. The dimensions of the PDM are crucial for its performance output. To enhance this, optimization of the motor's profile using a genetic algorithm has been undertaken. The design process begins with the computation of the initial stator and rotor curves based on the equations for a screw cycloid. These curves are then refined using the least squares method for a precise fit. Following this, the PDM's mathematical model is optimized, and motor friction is assessed. The genetic algorithm process involves encoding variations and managing crossovers to optimize objective functions, including the isometric radius coefficient, eccentricity distance parameter, overflow area, and maximum slip speed. This optimization yields the ideal profile parameters that enhance the motor's output. Comparative analyses of the initial and optimized output characteristics were conducted, focusing on the effects of the isometric radius coefficient and overflow area on the motor's performance. Results indicate that the optimized motor's overflow area increased by 6.9%, while its rotational speed reduced by 6.58%. The torque, as tested by Infocus, saw a substantial improvement of 38.8%. This optimization provides a theoretical foundation for improving the output characteristics of all-metal PDMs and supports the ongoing development and research of PDM technology.
Keywords: Positive displacement motor; Genetic algorithm; Profile optimization; Matlab programming; Overflow area
Overflowing phenomenon during ultrasonic treatment in Al-Si alloys (Cited by: 5)
9
Authors: 张宇博, 卢一平, 接金川, 傅莹, 钟德水, 李廷举. 《Transactions of Nonferrous Metals Society of China》 (SCIE, EI, CAS, CSCD), 2013, No. 11, pp. 3242-3248 (7 pages)
At the late stage of solidification with ultrasonic treatment (UST) in Al-Si alloys, a portion of the semisolid overflows and climbs along the probe. This phenomenon and its influence on the solidification microstructure were investigated in order to better understand the mechanism of UST. The overflowing phenomenon is considered to occur due to changes of vibration and flow in the remaining semisolid. Because the overflowed portion comes from the region with intense UST effect and vibrates with the probe during solidification, great modification of primary and eutectic Si (about 10 μm in length) and refinement of primary α(Al) (about 70 μm in size) are observed in this portion.
Keywords: Al-Si alloy; ultrasonic treatment; overflowing phenomenon; solidification microstructure
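An Expert Recommendation Method for New Questions in Knowledge Communities Based on Question-Answer Semantic Matching (Cited by: 2)
10
Authors: 杜军威, 邹树林, 李浩杰, 江峰, 于旭, 胡强. 《电子学报》 (EI, CAS, CSCD, PKU Core), 2023, No. 7, pp. 1875-1888 (14 pages)
Traditional expert recommendation methods for knowledge communities rely on text similarity matching and build expert features from question or expert descriptions. These methods do not exploit the semantic matching relationship between questions and answers, so they struggle to fully capture experts' ability to answer questions, which hurts recommendation performance. This paper proposes an expert recommendation method for new questions in knowledge communities based on combined historical and current question-answer semantic matching (History-Now Semantics Expert RECommendation model, HNS-EREC). First, feedback evaluation and negative sampling are used to handle two kinds of imbalance in the dataset. Second, features of experts' question-answering ability are extracted from question-answer semantics. Finally, a History-Now joint expert recommendation model based on question-answer semantic matching is proposed, which enables joint semantic learning over an expert's historical and current question-answer pairs. Experimental results show that the proposed HNS-EREC method has a significant advantage over other methods in recommending experts for new questions.
Keywords: expert recommendation; knowledge community; imbalanced learning; question-answer semantics; Stack Overflow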
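A Large-Scale Empirical Study of Machine Learning-Related Questions on Stack Overflow (Cited by: 4)
11
Authors: 万志远, 陶嘉恒, 梁家坤, 才振功, 苌程, 乔林, 周巧妮. 《浙江大学学报(工学版)》 (EI, CAS, CSCD, PKU Core), 2019, No. 5, pp. 819-828 (10 pages)
To investigate the distribution and trends of machine learning-related topics, 60,028 machine learning-related question posts were extracted, using tag filtering, from more than 41.78 million posts on the Q&A website Stack Overflow. Analyzing the question posts and counting the discussion volume for each machine learning platform shows that Scikit-learn, TensorFlow, and Keras are the three most frequently discussed platforms, accounting for 58% of all discussion. To further analyze machine learning discussion topics, a latent Dirichlet allocation (LDA) topic model was trained, and an adaptive progressive search method for the number of LDA topics was proposed. Topic coherence scores were used to evaluate the output and obtain the optimal number of topics, revealing 9 discussion topics in 3 categories: code-related, model-related, and theory-related. The popularity and answering difficulty of each topic were analyzed based on the view and comment counts of its question posts.
Keywords: empirical study; machine learning; Stack Overflow; latent Dirichlet allocation (LDA); topic coherence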
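Experimental Study on Energy Dissipation of the Reservoir Overflow Dam for the Taishan Nuclear Power Plant Freshwater Source Project (Cited by: 2)
12
Authors: 黄智敏, 何小惠, 付波, 陈卓英, 钟勇明. 《水电能源科学》 (PKU Core), 2010, No. 8, pp. 76-79 (4 pages)
Taking the freshwater source project of the Taishan Nuclear Power Plant as an example, hydraulic model tests lead to a recommended combined energy dissipation scheme for the overflow dam: flaring gate piers, chamfered steps on the dam face, and an underflow stilling basin. The test results show that this scheme dissipates energy effectively, requires little construction work, and optimizes the layout and shape of the overflow dam.
Keywords: nuclear power plant; fresh water; water source project; reservoir; overflow dam; energy dissipation; energy dissipation effect; model test study; energy dissipation scheme; test results; project layout; stilling basin; flaring gate pier; engineering quantity; optimization; shape; hydraulics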
Probability mass first flush evaluation for combined sewer discharges (Cited by: 5)
13
Authors: Inhyeok Park, Hongmyeong Kim, Soo-Kwon Chae, Sungryong Ha. 《Journal of Environmental Sciences》 (SCIE, EI, CAS, CSCD), 2010, No. 6, pp. 915-922 (8 pages)
The Korean government has put considerable effort into constructing sanitation facilities for controlling non-point source pollution. The first flush phenomenon is a prime example of such pollution. However, to date, several serious problems have arisen in the operation and treatment effectiveness of these facilities due to unsuitable design flow volumes and pollution loads. It is difficult to assess the optimal flow volume and pollution mass when considering both monetary and temporal limitations. The objective of this article was to characterize the discharge of storm runoff pollution from urban catchments in Korea and to estimate the probability of mass first flush (MFFn) using the storm water management model and probability density functions. A review of the gauged storms for representativeness, using probability density functions of rainfall volumes over the last two years, found all of them to be valid representative precipitation events. Both the observed MFFn and the probability MFFn in BE-1 indicated similarly large magnitudes of first flush, with roughly 40% of the total pollution mass contained in the first 20% of the runoff. In the case of BE-2, however, there was a significant difference between the observed MFFn and the probability MFFn.
Keywords: first flush; combined sewer overflows (CSOs); probability mass first flush; storm water management model; best management practices
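The observed mass first flush ratio mentioned in this abstract can be sketched as below. The definition used is the common one from the first-flush literature, assumed here rather than confirmed by the abstract: MFFn is the fraction of total pollutant mass discharged in the first n% of runoff volume, divided by n/100. The cumulative series are hypothetical numbers chosen to mirror the "40% of mass in the first 20% of runoff" case.

```python
def mass_first_flush(cum_volume, cum_mass, n_percent):
    """MFF_n for 0 < n_percent <= 100: mass fraction discharged in the
    first n% of runoff volume, divided by n/100. Values above 1 suggest
    a first flush. Both inputs are cumulative series sampled at the
    same times, starting after time zero."""
    target = cum_volume[-1] * n_percent / 100.0
    total_mass = cum_mass[-1]
    prev_v, prev_m = 0.0, 0.0
    for v, m in zip(cum_volume, cum_mass):
        if v >= target:
            # Linear interpolation between the two bracketing samples.
            frac = (target - prev_v) / (v - prev_v) if v > prev_v else 0.0
            mass_at_target = prev_m + frac * (m - prev_m)
            return (mass_at_target / total_mass) / (n_percent / 100.0)
        prev_v, prev_m = v, m
    raise ValueError("n_percent must not exceed 100")

# Hypothetical storm: cumulative runoff volume (m^3) and pollutant mass (kg).
cum_volume = [10.0, 20.0, 50.0, 100.0]
cum_mass = [25.0, 40.0, 70.0, 100.0]
mff20 = mass_first_flush(cum_volume, cum_mass, 20)
print(f"MFF20 = {mff20:.2f}")
```

Under this definition, 40% of the mass in the first 20% of the volume corresponds to an MFF20 of 2.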
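Personalized Recommendation of Professional and Credible Answerers Based on Data Mining: The Case of the Stack Overflow Q&A Community (Cited by: 4)
14
Authors: 刘迎春, 朱旭, 谢年春, 李佳. 《现代教育技术》 (CSSCI, PKU Core), 2019, No. 5, pp. 78-84 (7 pages)
To address the fact that questions in Q&A communities often fail to receive timely and effective answers, this article takes the Stack Overflow Q&A community as an example. It first describes the collection and preprocessing of Q&A community data. It then mines learner information to identify three kinds of potential answerers: professional credible answerers, high-reputation answerers, and badge answerers. Finally, it implements recommendation for the three kinds of answerers and compares their performance. Experimental results show that, compared with high-reputation and badge answerer recommendation, professional credible answerer recommendation, which considers answer quality and professional relevance, achieves higher precision and recall and thus better recommendation performance. Personalized recommendation of professional credible answerers based on data mining can effectively ease information overload in Q&A communities and help build a more efficient online learning community.
Keywords: professional credibility; answerer recommendation; data mining; Stack Overflow Q&A community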
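The precision and recall used to compare the three recommenders can be computed as in the following sketch. The user IDs and the notion of "relevant" answerers (here, users who actually gave a good answer) are hypothetical; the abstract does not describe the exact evaluation protocol.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of a recommended-answerer list against the
    set of answerers considered relevant for the question."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 users recommended, 5 users actually relevant.
p, r = precision_recall(["u1", "u2", "u3", "u4"], ["u2", "u4", "u7", "u8", "u9"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision penalizes recommending users who do not answer well, while recall penalizes missing users who would have; reporting both, as the abstract does, guards against a recommender that looks good on only one of them.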
Method of integer overflow detection to avoid buffer overflow (Cited by: 3)
15
Authors: 张实睿, 许蕾, 徐宝文. 《Journal of Southeast University(English Edition)》 (EI, CAS), 2009, No. 2, pp. 219-223 (5 pages)
A simplified integer overflow detection method based on path relaxation is described for avoiding buffer overflow triggered by integer overflow. An integer overflow that affects the size of a dynamically allocated buffer is the kind most likely to trigger a buffer overflow. Based on this observation, the solution uses lightweight static program analysis to trace the key variables that determine the size of a dynamically allocated buffer, and it maintains the upper and lower bounds of these variables. After the constraint information for these traced variables is inserted into the original program, the method tests the program with test cases through path relaxation: it not only reports the errors revealed by the current runtime values of the traced variables in the test case, but also examines the errors that could occur along the same execution path for all possible values of the traced variables. The effectiveness of this method is demonstrated in a case study. Compared with traditional buffer overflow detection methods, this method reduces the burden of detection and improves efficiency.
Keywords: integer overflow; buffer overflow; path relaxation
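The failure mode this abstract targets, an integer overflow in a buffer size computation, can be illustrated with a small simulation. Python integers do not overflow, so the sketch masks to 32 bits to imitate C unsigned arithmetic. The function names and values are hypothetical, and the bound check shown is a generic guard, not the paper's path-relaxation technique.

```python
UINT32_MAX = 0xFFFFFFFF

def alloc_size_u32(count, elem_size):
    """Size a C program would compute with 32-bit unsigned arithmetic."""
    return (count * elem_size) & UINT32_MAX

def overflows_u32(count, elem_size):
    """Check the bound before multiplying, as a guard would."""
    return elem_size != 0 and count > UINT32_MAX // elem_size

count, elem_size = 0x40000001, 4              # hypothetical untrusted count
wrapped = alloc_size_u32(count, elem_size)    # what would be allocated
needed = count * elem_size                    # what the copy loop would write
print(f"allocated {wrapped} bytes, but {needed} bytes are needed")
print("guard triggers:", overflows_u32(count, elem_size))
```

The wrapped size is far smaller than the amount of data later written, which is exactly how an integer overflow turns into a buffer overflow. Tracking the upper bound of `count`, as the abstract describes, lets a tool flag the multiplication even when the test input itself does not overflow.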
Lossless Mapping from Semi-Structured Data to Structured Data (Cited by: 2)
16
Authors: 李文武, 金远平, 童咪娜. 《Journal of Southeast University(English Edition)》 (EI, CAS), 2002, No. 1, pp. 46-53 (8 pages)
Most semi-structured data have a certain structural regularity. Once stored as structured data in a relational database (RDB), they can be effectively managed by a database management system (DBMS). Some semi-structured data are difficult to transform due to their irregular structures. We design an efficient algorithm and data structure for ensuring lossless transformation. We put forward an approach of schema extraction through data mining, in which different kinds of elements are transformed separately and lossless mapping from semi-structured data to structured data can be achieved.
Keywords: semi-structured data; DTD; RDB; schema mapping; overflow data
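Analysis of Database-Related Topics Based on Stack Overflow (Cited by: 3)
17
Authors: 刘蕴涵, 沙朝锋, 牛军钰. 《计算机科学》 (CSCD, PKU Core), 2021, No. 6, pp. 48-56 (9 pages)
Although database management systems are relatively mature software systems, developers still run into various problems when using them for data management and data analysis, and so seek solutions on Q&A forums such as Stack Overflow. This paper collects 94,473 database-related questions from Stack Overflow and applies an LDA topic model to group them into 25 topics. The results show that developers' questions fall into topics such as "table", "SQL", and "SELECT". A study of the popularity and difficulty of the different database-related topics finds that questions on the "SQL" topic are relatively popular. In addition, the paper examines three different databases, MySQL, Oracle, and MongoDB, and analyzes the topic distribution of questions related to each system. The findings help in understanding the challenges database developers face, and can inform database system version updates, the design of database course content, and even research questions in the database field.
Keywords: Stack Overflow; database; LDA; topic modeling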
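Feature Analysis of Defective Code on Stack Overflow and Similar Defect Detection (Cited by: 2)
18
Authors: 亢振兴, 赵逢禹, 刘亚. 《小型微型计算机系统》 (CSCD, PKU Core), 2021, No. 3, pp. 661-665 (5 pages)
In current software defect review and defect prediction, researchers analyze source code but overlook the defect information attached to the code. By analyzing defect information, this paper finds that it is a valuable reference for detecting similar defects. Based on this idea, the paper analyzes information about defective code in the software defect community Stack Overflow and proposes a similar-defect detection method based on feature analysis of defective code. The method first performs LDA topic analysis on defect reports and classifies them into topics (categories), then counts them to identify high-frequency defect categories. Next, it extracts features from the defective code in the high-frequency categories. Finally, it builds a similar-defect detection model from these features. To validate the model, a diagnostic model is built and evaluated on data-operation defect data, and the experimental results show that the method detects similar defects in other code well.
Keywords: Stack Overflow; LDA; defective code features; feature similarity; similar defect detection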
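Hot Topic Discovery on the Stack Overflow Programming Website Based on the CBOW-LDA Topic Model (Cited by: 5)
19
Authors: 张景, 朱国宾. 《计算机科学》 (CSCD, PKU Core), 2018, No. 4, pp. 208-214 (7 pages)
Stack Overflow is a popular programming Q&A website based outside China. Semantic text mining of the question text in its programming question posts can reveal the programming hot spots that users care about. Because the short texts under study are high-dimensional and unevenly distributed, the extracted topics tend to be unclear. This paper proposes CBOW-LDA, a modeling method based on the LDA (Latent Dirichlet Allocation) topic model, which clusters similar words in the target corpus before topic modeling. This effectively reduces the input dimensionality and makes the topic distribution clearer. The POST dataset of question posts from Stack Overflow for 2010-2015 was collected for the experiments. With the same number of topics, perplexity, a standard measure of model performance in text modeling, was used to assess the algorithm across different dataset sizes. The results show that, compared with TF-LDA, an existing topic modeling method that quantifies words by term-frequency weights, CBOW-LDA has lower perplexity, about 4.87% lower on the experimental corpus, indicating better performance. CBOW-LDA was then used for hot-spot mining on Stack Overflow, with TF-LDA as a comparison. A manually annotated standard evaluation set was built to measure recall, precision, and F1 for the hot topics and hot search terms found by the two methods, and the results confirm that CBOW-LDA performs better. According to the experiments, Java is the hottest topic among the site's question posts, while C and JavaScript are the most frequently mentioned terms in users' questions.
Keywords: Stack Overflow; LDA-CBOW language model; topic discovery; hot topics; perplexity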
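Perplexity, the metric used in this abstract to compare CBOW-LDA with TF-LDA, is the exponential of the negative average log-likelihood per token. The sketch below uses that standard definition; the per-token probabilities and the two perplexity values in the usage lines are hypothetical, not results from the paper.

```python
import math

def perplexity(token_probs):
    """Perplexity of a model over a held-out token sequence, given the
    probability the model assigned to each token. Lower is better."""
    log_likelihood = sum(math.log(p) for p in token_probs)
    return math.exp(-log_likelihood / len(token_probs))

def relative_reduction(baseline, improved):
    """Fractional drop in perplexity relative to the baseline model."""
    return (baseline - improved) / baseline

# Hypothetical held-out tokens, each assigned probability 0.25.
ppl = perplexity([0.25] * 8)
print(f"perplexity = {ppl:.2f}")
print(f"reduction  = {relative_reduction(200.0, 190.0):.2%}")
```

A model that assigns every token probability 0.25 is as uncertain as a uniform choice among four words, which is why its perplexity is 4. A reduction of a few percent, as reported in the abstract, therefore means the model is slightly less "surprised" by held-out text.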
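A Word2Vec-Based Algorithm for Detecting Spelling Errors in Programming-Domain Words (Cited by: 4)
20
Authors: 刘峻松, 唐明靖, 薛岗, 杨成荣. 《计算机应用与软件》 (PKU Core), 2022, No. 3, pp. 277-284 (8 pages)
Stack Overflow is a Q&A community for computer programming whose text contains a great deal of valuable information to mine, but the many misspelled words it contains interfere with text analysis. To address this, an automatic word error detection and correction algorithm is proposed. Centered on semantic similarity computed with word vectors, it analyzes misspelled words and combines this with an improved edit distance algorithm to detect and correct errors in text automatically. Experimental results show that the algorithm can automatically detect and correct errors in highly specialized domain text of this kind and restores the standard wording of passages well.
Keywords: word vectors; edit distance; spelling correction; Word2Vec; Stack Overflow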
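The edit distance component mentioned in this abstract can be sketched with the classic Levenshtein algorithm. This is the textbook version, not the paper's improved variant, and it leaves out the Word2Vec semantic similarity step; the vocabulary and the `closest_word` helper are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

def closest_word(word, vocabulary):
    """Return the vocabulary entry with the smallest edit distance."""
    return min(vocabulary, key=lambda v: edit_distance(word, v))

vocab = ["java", "javascript", "python"]   # hypothetical domain vocabulary
print(closest_word("javascrpt", vocab))
```

Edit distance alone cannot tell a misspelling from a valid but rare identifier, which is presumably why the abstract pairs it with word-vector similarity.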