1,461 articles found
TBM big data preprocessing method in machine learning and its application to tunneling
1
Authors: Xinyue Zhang, Xiaoping Zhang, Quansheng Liu, Weiqiang Xie, Shaohui Tang, Zengmao Wang. Journal of Rock Mechanics and Geotechnical Engineering, 2025, Issue 8, pp. 4762-4783 (22 pages)
The big data generated by tunnel boring machines (TBMs) are widely used to reveal complex rock-machine interactions through machine learning (ML) algorithms, and data preprocessing plays a crucial role in improving ML accuracy. This study proposes a TBM big data preprocessing method for ML that emphasizes accurate division of the TBM tunneling cycle and optimized feature extraction. Its effectiveness was demonstrated by predicting TBM performance using data collected from a TBM water conveyance tunnel in China. First, the Score-Kneedle (S-K) method was proposed to divide a TBM tunneling cycle into five phases. On 500 TBM tunneling cycles, the S-K method accurately divided all five phases in 458 cycles (accuracy of 91.6%), which is superior to the conventional duration division method (accuracy of 74.2%). Additionally, the S-K method accurately divided the stable phase in 493 cycles (accuracy of 98.6%), which is superior to two state-of-the-art division methods, namely the histogram discriminant method (accuracy of 94.6%) and the cumulative sum change point detection method (accuracy of 92.8%). Second, features were extracted from the divided phases. Specifically, TBM tunneling resistances were extracted from the free rotating phase and the free advancing phase, and were subtracted from the total forces to represent the true rock-fragmentation forces. The secant slope and the mean value were extracted as features of the increasing phase and the stable phase, respectively. Finally, an ML model integrating a deep neural network and a genetic algorithm (GA-DNN) was established to learn the preprocessed data. The GA-DNN used 6 secant slope features extracted from the increasing phase to predict the mean field penetration index (FPI) and torque penetration index (TPI) in the stable phase, guiding TBM drivers to make better decisions in advance. The results indicate that the proposed method improves prediction accuracy significantly, raising the R² values of TPI and FPI on the test dataset from 0.7716 to 0.9178 and from 0.7479 to 0.8842, respectively.
Keywords: Tunnel boring machine; Big data preprocessing; Division of tunneling cycle; Tunneling resistance; Machine learning
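The feature-extraction step described in this abstract can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the sample thrust values, and the choice of the free advancing phase mean as the resistance estimate are all assumptions made for the example.

```python
from statistics import mean


def secant_slope(times, values):
    """Slope of the straight line joining the first and last samples of a phase."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired samples")
    duration = times[-1] - times[0]
    if duration == 0:
        raise ValueError("phase has zero duration")
    return (values[-1] - values[0]) / duration


def net_force(total, resistance):
    """Subtract a constant resistance estimate from every total-force sample."""
    return [f - resistance for f in total]


# Hypothetical thrust samples (kN) for one tunneling cycle.
free_advancing_thrust = [500.0, 510.0, 490.0]      # no rock contact yet
increasing_time = [0.0, 1.0, 2.0, 3.0]             # seconds
increasing_thrust = [600.0, 900.0, 1300.0, 1500.0]
stable_thrust = [1500.0, 1520.0, 1480.0]

# Assumption: resistance is estimated as the mean thrust while advancing freely.
resistance = mean(free_advancing_thrust)

# Feature of the increasing phase: secant slope of the net force.
slope_feature = secant_slope(increasing_time, net_force(increasing_thrust, resistance))

# Feature of the stable phase: mean of the net force.
stable_feature = mean(net_force(stable_thrust, resistance))
```

The abstract mentions six secant slope features; the same `secant_slope` helper could presumably be applied to each monitored channel in turn, but which channels those are is not stated in the abstract.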
Hybrid 1DCNN-Attention with Enhanced Data Preprocessing for Loan Approval Prediction
2
Authors: Yaru Liu, Huifang Feng. Journal of Computer and Communications, 2024, Issue 8, pp. 224-241 (18 pages)
To reduce the risk of non-performing loans and losses and to improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model combining a 1DCNN-attention network with enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing and a stack of multiple hybrid modules. First, the enhanced data preprocessing combines methods such as standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and imbalance but also removes redundant features while improving feature representation. Subsequently, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps to automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
Keywords: Loan approval prediction; Deep learning; One-dimensional convolutional neural network; Attention mechanism; Data preprocessing
Data preprocessing and preliminary results of the Moon-based Ultraviolet Telescope on the CE-3 lander (Cited by: 4)
3
Authors: Wei-Bin Wen, Fang Wang, Chun-Lai Li, Jing Wang, Li Cao, Jian-Jun Liu, Xu Tan, Yuan Xiao, Qiang Fu, Yan Su, Wei Zuo. Research in Astronomy and Astrophysics (indexed in SCIE, CAS, CSCD), 2014, Issue 12, pp. 1674-1681 (8 pages)
The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because of the absence of atmospheric disturbances and the slow rotation of the Moon, we can make long-term continuous observations of a series of important celestial objects in the near ultraviolet band (245-340 nm) and perform a sky survey of selected areas, which cannot be done from Earth. By analyzing image data from the MUVT we can find characteristic changes in celestial brightness with time, and by comparing them with a physical model we can deduce the radiation mechanism and physical properties of these celestial objects. In order to explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and that the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.
Keywords: Chang'e-3 mission; Moon-based Ultraviolet Telescope; Data preprocessing; Near ultraviolet band
Diabetes Type 2: Poincaré Data Preprocessing for Quantum Machine Learning (Cited by: 1)
4
Authors: Daniel Sierra-Sosa, Juan D. Arcila-Moreno, Begonya Garcia-Zapirain, Adel Elmaghraby. Computers, Materials & Continua (indexed in SCIE, EI), 2021, Issue 5, pp. 1849-1861 (13 pages)
Quantum Machine Learning (QML) techniques have recently been attracting massive interest. However, reported applications usually employ synthetic or well-known datasets. One of these techniques, based on a hybrid approach combining quantum and classical devices, is the Variational Quantum Classifier (VQC), whose development seems promising. Although VQC has been widely studied, implementations for "real-world" datasets are still challenging on Noisy Intermediate Scale Quantum (NISQ) devices. In this paper we propose a preprocessing pipeline based on Stokes parameters for data mapping. This pipeline enhances the prediction rates when applying VQC techniques, improving the feasibility of solving classification problems using NISQ devices. By including feature selection techniques and geometrical transformations, enhanced quantum state preparation is achieved. A representation based on the Stokes parameters on the Poincaré sphere is also possible for visualizing the data. Our results show that the proposed techniques improve the classification score for the incidence of acute comorbid diseases in Type 2 Diabetes Mellitus patients. We used the implementation of VQC available in IBM's Qiskit framework and obtained accuracies of 70% and 72% with two and three qubits, respectively.
Keywords: Quantum machine learning; Data preprocessing; Stokes parameters; Poincaré sphere
Power Data Preprocessing Method of Mountain Wind Farm Based on POT-DBSCAN (Cited by: 1)
5
Authors: Anfeng Zhu, Zhao Xiao, Qiancheng Zhao. Energy Engineering (indexed in EI), 2021, Issue 3, pp. 549-563 (15 pages)
Because wind speed and wind direction change frequently, wind turbine (WT) power prediction based on traditional data preprocessing methods has low accuracy. This paper proposes a data preprocessing method that combines POT with DBSCAN (POT-DBSCAN) to improve the prediction efficiency of the wind power prediction model. First, using WT data from normal operating conditions, a WT power prediction model is established by combining the Particle Swarm Optimization (PSO) algorithm with a BP neural network (PSO-BP). Second, the wind-power data obtained from the supervisory control and data acquisition (SCADA) system are preprocessed by the POT-DBSCAN method. Then, power prediction on the preprocessed data is carried out with the PSO-BP model. Finally, the necessity of preprocessing is verified by the evaluation indexes. The case analysis shows that prediction with POT-DBSCAN preprocessing is better than prediction with the quartile method. Therefore, this method can improve the accuracy of both the data and the prediction model.
Keywords: Wind turbine; SCADA data; Data preprocessing method; Power prediction
DATA PREPROCESSING AND RE KERNEL CLUSTERING FOR LETTER
6
Authors: Zhu Changming, Gao Daqi. Journal of Electronics (China), 2014, Issue 6, pp. 552-564 (13 pages)
Many classifiers and methods have been proposed for the letter recognition problem, and clustering is widely used among them. However, a single round of clustering is not adequate. Here, we adopt data preprocessing and a re kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re kernel clustering into Kernel Nearest Neighbor classification (KNN), the Radial Basis Function Neural Network (RBFNN), and the Support Vector Machine (SVM). Furthermore, we compare re kernel clustering with one-time kernel clustering, denoted kernel clustering for short. Experimental results validate that re kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
Keywords: Data preprocessing; Kernel clustering; Kernel Nearest Neighbor (KNN); Re kernel clustering
D-IMPACT: A Data Preprocessing Algorithm to Improve the Performance of Clustering
7
Authors: Vu Anh Tran, Osamu Hirose, Thammakorn Saethang, Lan Anh T. Nguyen, Xuan Tho Dang, Tu Kien T. Le, Duc Luu Ngo, Gavrilov Sergey, Mamoru Kubo, Yoichi Yamada, Kenji Satou. Journal of Software Engineering and Applications, 2014, Issue 8, pp. 639-654 (16 pages)
In this study, we propose a data preprocessing algorithm called D-IMPACT, inspired by the IMPACT clustering algorithm. D-IMPACT iteratively moves data points based on attraction and density to detect and remove noise and outliers and to separate clusters. Our experimental results on two-dimensional datasets and practical datasets show that this algorithm can produce new datasets on which the performance of the clustering algorithm is improved.
Keywords: Attraction; Clustering; Data preprocessing; Density; Shrinking
An improved deep learning model for soybean future price prediction with hybrid data preprocessing strategy
8
Authors: Dingya CHEN, Hui LIU, Yanfei LI, Zhu DUAN. Frontiers of Agricultural Science and Engineering, 2025, Issue 2, pp. 208-230 (23 pages)
The futures trading market is an important part of the financial markets, and soybeans are one of the most strategically important crops in the world. Predicting soybean futures prices is a challenging topic studied by many researchers. This paper proposes a novel hybrid soybean futures price prediction model with two stages: data preprocessing and deep learning prediction. In the data preprocessing stage, the futures price series is decomposed into subsequences using the ICEEMDAN (improved complete ensemble empirical mode decomposition with adaptive noise) method. The Lempel-Ziv complexity determination method is then used to identify and reconstruct high-frequency subsequences. Finally, the high-frequency component is decomposed a second time using variational mode decomposition optimized by the beluga whale optimization algorithm. In the deep learning prediction stage, a deep extreme learning machine optimized by the sparrow search algorithm predicts every subseries, and the predictions are recombined to obtain the final soybean futures price prediction. Experimental results on soybean futures markets in China, Italy, and the United States show that the proposed hybrid method provides superior prediction accuracy and robustness.
Keywords: Deep extreme learning machine; Hybrid data preprocessing; Optimization algorithm; Soybean future price prediction
Hybrid Teaching Reform and Practice in Big Data Collection and Preprocessing Courses Based on the Bosi Smart Learning Platform (Cited by: 1)
9
Authors: Yang Wang, Xuemei Wang, Wanyan Wang. Journal of Contemporary Educational Research, 2025, Issue 2, pp. 96-100 (5 pages)
This study examines the Big Data Collection and Preprocessing course at Anhui Institute of Information Engineering, implementing a hybrid teaching reform using the Bosi Smart Learning Platform. The proposed hybrid model follows a "three-stage" and "two-subject" framework, incorporating a structured design for teaching content and assessment methods before, during, and after class. Practical results indicate that this approach significantly enhances teaching effectiveness and improves students' learning autonomy.
Keywords: Big data collection and preprocessing; Bosi smart learning platform; Hybrid teaching; Teaching reform
Handling missing data in large-scale TBM datasets:Methods,strategies,and applications
10
Authors: Haohan Xiao, Ruilang Cao, Zuyu Chen, Chengyu Hong, Jun Wang, Min Yao, Litao Fan, Teng Luo. Intelligent Geoengineering, 2025, Issue 3, pp. 109-125 (17 pages)
Substantial advancements have been achieved in Tunnel Boring Machine (TBM) technology and monitoring systems, yet missing data impede accurate analysis and interpretation of TBM monitoring results. This study investigates the issue of missing data in extensive TBM datasets. Through a comprehensive literature review, we analyze the mechanism of missing TBM data and compare different imputation methods, including statistical analysis and machine learning algorithms. We also examine the impact of various missing patterns and rates on the efficacy of these methods. Finally, we propose a dynamic interpolation strategy tailored for TBM engineering sites. The results show that the K-Nearest Neighbors (KNN) and Random Forest (RF) algorithms achieve good interpolation results; that the interpolation quality of every method decreases as the missing rate increases; and that block missing is interpolated worst, followed by mixed missing, while sporadic missing is interpolated best. On-site application results validate the capability of the proposed strategy to interpolate missing values robustly, making it applicable in ML scenarios such as parameter optimization, attitude warning, and pressure prediction. These findings contribute to more efficient processing of missing TBM data and offer more effective support for large-scale TBM monitoring datasets.
Keywords: Tunnel boring machine (TBM); Missing data imputation; Machine learning (ML); Time series interpolation; Data preprocessing; Real-time data stream
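To make the KNN imputation idea named in this abstract concrete, here is a small standard-library sketch. It is an illustrative assumption, not the procedure used in the paper: the function name `knn_impute`, the choice of Euclidean distance over the observed columns, the use of only complete rows as neighbours, and the sample readings are all hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows.

    Distance is Euclidean and uses only the columns observed in the
    incomplete row, so a row can be compared even when values are missing.
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(other, row=row, observed=observed):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return filled


# Hypothetical (thrust, torque) readings; one torque value is missing.
readings = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [2.1, None],
    [10.0, 100.0],
]
imputed = knn_impute(readings, k=2)
```

Because neighbours are drawn only from complete rows, a long block of consecutive missing records leaves few nearby donors, which is consistent with the abstract's finding that block missing is the hardest pattern.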
Untargeted LC–MS Data Preprocessing in Metabolomics
11
Authors: He Tian, Bowen Li, Guanghou Shui. Journal of Analysis and Testing (indexed in EI), 2017, Issue 3, pp. 187-192 (6 pages)
Liquid chromatography–mass spectrometry (LC–MS) has enabled the detection of thousands of metabolite features from a single biological sample, producing large and complex datasets. One of the key issues in LC–MS-based metabolomics is the comprehensive and accurate analysis of this enormous amount of data. Many free data preprocessing tools, such as XCMS, MZmine, MAVEN, and MetaboAnalyst, as well as commercial software, have been developed to facilitate data processing. However, researchers are challenged by the unavoidable generation of numerous false-positive peaks and by human errors made while manually removing such false peaks. Even with continuous improvements in data processing tools, many mistakes can still be generated during data preprocessing. In addition, many data preprocessing software packages exist, and every tool has its own advantages and disadvantages. A researcher therefore needs to judge which software or tools best suit their vendor proprietary formats and the goal of downstream analysis. Here, we provide a brief introduction to the general steps of raw MS data processing and the properties of automated data processing tools. The characteristics of mainly free data preprocessing software are then summarized for researchers to consider when conducting metabolomics studies.
Keywords: Metabolomics; Data preprocessing; LC-MS; Free software/tools
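Machine-learning-based prediction of surrounding rock strength in TBM-excavated roadways in coal measure strata (Cited by: 3)
12
Authors: Ding Ziwei (丁自伟), Gao Chengdeng (高成登), Jing Boyu (景博宇), Huang Xing (黄兴), Liu Bin (刘滨), Hu Yang (胡阳), Sang Haomin (桑昊旻), Xu Bin (徐彬), Qin Lixue (秦立学). Journal of Xi'an University of Science and Technology (《西安科技大学学报》, PKU Core), 2025, Issue 1, pp. 49-60 (12 pages)
To study the mutual feedback between full-face tunnel boring machine (TBM) tunneling parameters and the rock mass mechanical parameters of coal measure strata, and to predict the strength characteristics of roadway surrounding rock accurately and in real time, the input feature parameters of the model were determined from field monitoring during TBM excavation and an analysis of the rock-machine feedback relationship, and a corresponding database was built. With gradient boosting decision tree (GBDT), random forest (RF), and support vector regression (SVR) as base learners and linear regression (LR) as the meta-learner, a prediction model based on the Stacking ensemble algorithm was proposed, and its predictive performance was compared with that of the single machine learning models. The results show that binary-state discrimination and box plots can effectively preprocess the raw data; that the main input feature parameters are cutterhead thrust F, cutterhead torque T, penetration FPI, cutterhead rotation speed RPM, and cutterhead vibration acceleration A; and that the Stacking model reaches a goodness of fit of 0.976 on the test set, with a mean squared error, mean absolute error, and mean absolute percentage error of only 0.031, 0.148, and 0.092, respectively. Compared with the other three models, it has the highest goodness of fit and the smallest error values, so the ensemble model has higher prediction accuracy and can effectively predict the point load strength of surrounding rock in coal mine TBM-excavated roadways. The study verifies the accuracy of the Stacking model and provides a scientific reference for controlling TBM tunneling parameters and adjusting roadway support parameters in coal mines.
Keywords: Coal mine full-face tunnel boring machine; TBM tunneling parameters; Stacking ensemble algorithm; Data preprocessing; Surrounding rock strength prediction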
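A survey of time series forecasting methods based on Transformer models (Cited by: 15)
13
Authors: Meng Xiangfu (孟祥福), Shi Haoyuan (石皓源). Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》, PKU Core), 2025, Issue 1, pp. 45-64 (20 pages)
Time series forecasting (TSF) predicts values and trends at future time points or over future periods by analyzing latent information in historical data, such as trend and seasonality. Time series data are generated by sensors and play an important role in finance, healthcare, energy, transportation, meteorology, and many other fields. With the development of Internet of Things sensors, massive time series data have become difficult to handle with traditional machine learning, whereas the Transformer performs well on many tasks in natural language processing and computer vision. Researchers have used Transformer models to capture long-term dependencies effectively, driving rapid progress in time series forecasting. This paper surveys Transformer-based time series forecasting methods: it traces the development of time series forecasting chronologically, systematically introduces time series preprocessing procedures and methods, and presents commonly used evaluation metrics and datasets. Organized by algorithmic framework, it explains how the various Transformer-based models are applied to TSF tasks and how they work. It compares the performance, strengths, and limitations of the models experimentally and analyzes and discusses the results. Finally, based on the challenges facing existing work on Transformer models for time series forecasting, it proposes future development trends for the field.
Keywords: Deep learning; Time series forecasting; Data preprocessing; Transformer model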
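A smart grid forecasting method based on data preprocessing and Bi-LSTM (Cited by: 3)
14
Authors: Li Yan (李岩), Liu Xinyue (刘鑫月), Qiao Junjie (乔俊杰), Wang Maotao (王毛桃), Liu Yifan (刘一帆), Qi Leijie (齐磊杰). Electrical Measurement & Instrumentation (《电测与仪表》, PKU Core), 2025, Issue 6, pp. 120-125 (6 pages)
Short-term forecasting plays an important role in smart grid construction and profoundly affects the intelligent upgrading of every stage of the grid: generation, transmission, transformation, distribution, and consumption. Short-term forecasting is generally based on measured system data, but sensor faults, data transmission errors, and similar causes degrade data quality and seriously impair forecasting accuracy. To build an accurate short-term forecasting model when data quality is impaired, a short-term forecasting framework combining data preprocessing with bi-directional long short-term memory (Bi-LSTM), named Bi-LSTM-DP (bi-directional long short-term memory data preprocessing), is proposed. In Bi-LSTM-DP, missing values in the collected data are first filled with the mean, the data are then denoised with a Savitzky-Golay filter, and finally Bi-LSTM extracts information from the time series to produce the short-term forecast. To evaluate the proposed method, measured public datasets are used to forecast wind power generation and load demand, and comparison with other reference methods demonstrates its effectiveness and robustness.
Keywords: Short-term forecasting; Data preprocessing; Bi-LSTM; Deep learning; Time series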
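The two preprocessing steps named in this abstract, mean filling followed by Savitzky-Golay denoising, can be sketched as below. The abstract does not state the filter's window length or polynomial order, so the 5-point quadratic kernel with coefficients (-3, 12, 17, 12, -3)/35 is an assumption, as are the function names and the choice to leave the two samples at each edge unsmoothed.

```python
def fill_missing_with_mean(series):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    mean_value = sum(observed) / len(observed)
    return [mean_value if v is None else v for v in series]


# 5-point quadratic Savitzky-Golay smoothing coefficients (window and order assumed).
SG_KERNEL = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol_smooth_5(series):
    """Smooth interior points with the 5-point kernel; edge points are kept as-is."""
    smoothed = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_KERNEL, window)) / SG_NORM
    return smoothed


# Hypothetical load readings with one gap.
raw = [1.0, None, 3.0]
filled = fill_missing_with_mean(raw)

# A quadratic-order filter reproduces an exactly quadratic signal.
parabola = [float(x * x) for x in range(7)]
smoothed_parabola = savgol_smooth_5(parabola)
```

In practice a library routine such as SciPy's `savgol_filter` would usually replace the hand-written kernel; the explicit version is shown only to make the arithmetic visible.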
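Attention-based dimensionality reduction of hyperspectral images for mildew spot recognition on paper cultural relics
15
Authors: Tang Bin (汤斌), He Yulong (贺渝龙), Tang Huan (唐欢), Long Zourong (龙邹荣), Wang Jianxu (王建旭), Tan Bowen (谭博文), Qin Dan (覃丹), Luo Xiling (罗希玲), Zhao Mingfu (赵明富). Spectroscopy and Spectral Analysis (《光谱学与光谱分析》; indexed in SCIE, EI, CAS, PKU Core), 2025, Issue 1, pp. 246-255 (10 pages)
As important carriers of cultural heritage, paper cultural relics record human history and culture across different periods, and during storage they are highly vulnerable to mold and other microorganisms. Mold accelerates the degradation of cellulose and produces mildew spots on the paper surface, and scattered spores spread widely with air flow, increasing the risk that other paper relics become moldy. Regular mildew spot inspection is therefore essential for understanding the condition of paper relics and for restoring them. Hyperspectral imaging is a non-contact, non-destructive detection technique that acquires spatial and spectral data simultaneously; combined with computer technology, it enables large-batch, real-time, non-destructive inspection of paper relics. Targeting Aspergillus niger, a widely occurring mold, an attention-based dimensionality reduction method for hyperspectral data is proposed that achieves adaptive preprocessing of redundant hyperspectral data collected from the mold. Twenty samples of Aspergillus niger mildew spots on paper relics, provided by the Chongqing China Three Gorges Museum, were collected. Analysis with ENVI software showed that in the 413-855 nm band the mean spectral curves of mildew-infected regions and healthy regions differ markedly in mean reflectance, and that in the 855-1021 nm band the mean spectral curves of mildew-infected regions and ink regions differ markedly in mean reflectance. The raw hyperspectral data were processed with the proposed method and with the traditional principal component analysis and independent component analysis preprocessing methods, and the results were compared on four classic semantic segmentation networks: U-Net, SegNet, DeepLabV3+, and PSPNet. The results show that data preprocessed by the proposed algorithm have a clear advantage on U-Net and SegNet: compared with principal component analysis and independent component analysis, mildew spot recognition accuracy improves substantially, reaching 89.49% and 88.46%, respectively. This verifies the effectiveness of the proposed algorithm and offers useful support and new ideas for cultural relic conservation.
Keywords: Hyperspectral data preprocessing; Mildew spot recognition; Paper cultural relics; Attention mechanism; Image segmentation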
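A short-term wind power forecasting method based on the VMD-Itransformer-MOSSA model
16
Authors: Zhang Wei (张伟), Gao Lu (高鹭), Qin Ling (秦岭), Li Wei (李伟). Computer Engineering and Design (《计算机工程与设计》, PKU Core), 2025, Issue 9, pp. 2690-2698 (9 pages)
To address the problem that small errors in weather forecasts cause large errors in wind power prediction, a Transformer model combining the VMD algorithm with MOSSA optimization is proposed for short-term wind power prediction. Variational mode decomposition is applied to the error between forecast wind speed and measured wind speed, and the decomposition results, together with other features from the weather forecast, serve as input to the improved Transformer model. An improved sparrow search algorithm (SSA) optimizes the key parameters of the correction model to raise prediction accuracy. Adding the predicted wind speed error to the forecast wind speed yields the corrected forecast wind speed, from which wind power is calculated. Simulation results show that the method is more accurate than the baseline models, verifying the effectiveness of the proposed improved combined model.
Keywords: Wind speed correction; Variational mode decomposition; Improved Transformer; Sparrow search algorithm; Short-term wind power; Data preprocessing; Weather forecast information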
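Research on a quality assessment method for electric drive system efficiency test data
17
Authors: Zou Xihong (邹喜红), Wang Xiaoli (王晓丽), Yuan Dongmei (袁冬梅), Zhou Qing (周擎), Xiong Feng (熊锋), Zhou Zhen (周振), Wang Wanying (王万英). Journal of Chongqing University of Technology (Natural Science) (《重庆理工大学学报(自然科学)》, PKU Core), 2025, Issue 11, pp. 30-38 (9 pages)
For the high-dimensional data produced by electric drive system efficiency tests, a method is proposed that uses data mining techniques to preprocess such data and assess their quality. First, an electric drive efficiency test bench was built to collect data and analyze the characteristics of efficiency test data. Second, combining the ideas of IQR and MAD, a probability-distribution-based denoising method for efficiency test data was designed. Next, an IPSO-DBSCAN model and an LOF-iForest model were built to cluster the efficiency data and to test for outliers inside and outside the clusters, thereby identifying outlier data. Finally, a multidimensional data quality assessment model was constructed. The results show that the method achieves preprocessing and multidimensional quality assessment of electric drive system efficiency test data and improves the accuracy and reliability of those data.
Keywords: Electric drive system; Test data; Data preprocessing; Outlier testing; Quality assessment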
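The abstract says the denoising step combines the ideas of IQR and MAD but does not say how. One plausible reading, offered purely as an assumption, is to flag a sample as noise only when both rules agree. The thresholds (1.5 times the IQR, modified z-score above 3.5), the function names, and the sample readings below are conventional or hypothetical choices, not values from the paper.

```python
from statistics import median, quantiles


def iqr_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: more than half the values equal the median.
        return [v != med for v in values]
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def flag_noise(values):
    """Assumed combination rule: a point is noise only if both tests flag it."""
    return [a and b for a, b in zip(iqr_flags(values), mad_flags(values))]


# Hypothetical efficiency-test readings with one obvious spike.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 25.0]
noisy_indices = [i for i, flagged in enumerate(flag_noise(readings)) if flagged]
```

Requiring agreement makes the filter conservative; taking the union of the two rules instead would remove more points at the cost of discarding some valid measurements.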
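Research on methods for applying OJ platform base data to the assessment of students' programming ability
18
Authors: Li Huanyu (李环宇), Shen Xiaoqian (申晓倩), Lin Xiaoxia (林晓霞), Liu Xinying (刘欣颖). Office Automation (《办公自动化》), 2025, Issue 7, pp. 4-7 (4 pages)
An online judge (OJ) is an online system that students use when learning to program. The data generated by an OJ platform are the base data for assessing students' programming ability. However, these data are large in volume, contain many irrelevant attributes, and have many missing values, so they are not suitable for assessing programming ability directly. To address this, the paper proposes a preprocessing method that processes the score sheets of students' programming exercises: mean imputation is used for the data of students marked "absent", and duplicate records are deleted; the data are then reduced and normalized, data of different types are standardized, and unnecessary attributes are removed. This improves data quality and reliability and lays a foundation for assessing students' programming ability.
Keywords: OJ platform; Ability assessment; Base data; Preprocessing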
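The pipeline in this abstract (remove duplicates, impute absent scores with the mean, normalize) might look like the sketch below. The record layout of `(student_id, score)`, the keep-first rule for duplicates, and min-max scaling as the normalization are assumptions; the abstract does not specify them.

```python
def preprocess_scores(records):
    """Deduplicate, mean-impute, and min-max normalize (student_id, score) records.

    A score of None stands for a student marked absent.
    Returns a dict mapping student_id to a score scaled into [0, 1].
    """
    # 1. Remove duplicates, keeping the first record seen for each student.
    first_seen = {}
    for student_id, score in records:
        if student_id not in first_seen:
            first_seen[student_id] = score

    # 2. Mean imputation for absent students.
    observed = [s for s in first_seen.values() if s is not None]
    if not observed:
        raise ValueError("no observed scores to impute from")
    mean_score = sum(observed) / len(observed)
    filled = {sid: mean_score if s is None else s for sid, s in first_seen.items()}

    # 3. Min-max normalization to [0, 1].
    low, high = min(filled.values()), max(filled.values())
    if high == low:
        return {sid: 0.0 for sid in filled}
    return {sid: (s - low) / (high - low) for sid, s in filled.items()}


# Hypothetical score sheet: s1 appears twice and s2 was absent.
score_sheet = [("s1", 60.0), ("s2", None), ("s1", 60.0), ("s3", 100.0)]
normalized = preprocess_scores(score_sheet)
```

Mean imputation places every absent student exactly at the class average, which shrinks the variance of the scores; that side effect is worth keeping in mind before using the result to rank students.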
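Research on identifying waste plastic materials based on near-infrared spectroscopy
19
Authors: Peng Binbin (彭斌彬), Zhang Chao (张潮), Guo Yakun (郭亚坤), Wu Yingqi (吴英琦). Laser Journal (《激光杂志》, PKU Core), 2025, Issue 2, pp. 210-217 (8 pages)
To address the difficulty of classifying waste plastics quickly and non-destructively, given the large volumes and many types involved in recycling, a plastic material identification method based on near-infrared spectroscopy is proposed. Near-infrared spectral data were collected with an infrared spectrometer for eight plastics: polyethylene terephthalate (PET), polyethylene (PE), nylon (PA), polycarbonate (PC), polypropylene (PP), polystyrene (PS), acrylonitrile butadiene styrene (ABS), and polyoxymethylene (POM). The data were preprocessed with Savitzky-Golay (S-G) convolution smoothing and the standard normal variate (SNV) transformation. Unsupervised principal component analysis and supervised linear discriminant analysis were each used to reduce the dimensionality of the spectral data, from 334 dimensions to 10 and 7 dimensions, respectively, and a plastic material identification model was then built with Mahalanobis distance discrimination. The experimental results show that preprocessing with S-G smoothing and SNV effectively improves identification accuracy; after dimensionality reduction of the preprocessed validation set, the two methods reach identification accuracies of 95.24% and 100%, respectively. Both methods can serve as a reference for research on identifying many kinds of waste plastics.
Keywords: Near-infrared spectroscopy; Plastic material identification; Data preprocessing; Principal component analysis; Linear discriminant analysis; Mahalanobis distance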
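The standard normal variate (SNV) transformation mentioned in this abstract centers and scales each spectrum by its own mean and standard deviation. The sketch below is a generic illustration with made-up absorbance values; the function name `snv` and the use of the sample standard deviation (n - 1 denominator) are assumptions, since the abstract gives no implementation details.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: subtract the spectrum's mean, divide by its std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("flat spectrum cannot be scaled")
    return [(x - mu) / sigma for x in spectrum]


# Hypothetical absorbance values for one sample.
base = [1.0, 2.0, 3.0]

# The same sample measured with a multiplicative gain and an additive offset,
# as might happen with a different scattering path length.
shifted = [2.0 * x + 5.0 for x in base]

snv_base = snv(base)
snv_shifted = snv(shifted)
```

Both spectra map to the same corrected curve, which is the property that makes SNV useful for removing scatter-induced baseline and scale differences before PCA or LDA.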
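Application of ChatGPT to preprocessing symptom information in rheumatology traditional Chinese medicine electronic medical records (Cited by: 4)
20
Authors: Yu Jinlong (喻金龙), Zhang Lei (张磊), Xu Ning (许宁), Fa Lifeng (法立峰), Yang Kuo (杨扩), Han Zhenzhen (韩真真), Guo Hongtao (郭洪涛). Chinese Archives of Traditional Chinese Medicine (《中华中医药学刊》, PKU Core), 2025, Issue 3, pp. 24-29, I0043 (7 pages)
Objective: Information extraction is an important technique in natural language processing. Drawing on ChatGPT's natural language processing capability, ChatGPT is used to extract symptom information from traditional Chinese medicine (TCM) electronic medical records in rheumatology. Methods: Through few-shot learning with the ChatGPT large model, information was extracted and normalized from the chief complaint and history of present illness, specialist examination, and tongue and pulse data in rheumatology electronic medical records. The ChatGLM large model was given the same few-shot learning and the same extraction tasks, and the correctness of the two models' results was compared. Results: On medical named entity recognition, ChatGPT after few-shot learning reached an accuracy of 98.7% on chief complaint symptom extraction, 92% on specialist examination extraction, and 98% on tongue and pulse extraction and normalization. ChatGLM scored 85.7%, 91%, and 98% on the same three tasks. The extraction results show that the two models have different shortcomings, but ChatGPT performs better overall. Conclusion: TCM symptom information preprocessing based on few-shot learning with ChatGPT was achieved; it is more convenient and efficient than traditional models and performs better overall than ChatGLM, offering a new approach for research on extracting information from TCM clinical text.
Keywords: ChatGPT; TCM electronic medical records; Symptoms; Data preprocessing; Information extraction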