3,836 articles found
A Convolutional Neural Network-Based Deep Support Vector Machine for Parkinson’s Disease Detection with Small-Scale and Imbalanced Datasets
1
Authors: Kwok Tai Chui, Varsha Arya, Brij B. Gupta, Miguel Torres-Ruiz, Razaz Waheeb Attar. 《Computers, Materials & Continua》, 2026, Issue 1, pp. 1410-1432 (23 pages)
Parkinson’s disease (PD) is a debilitating neurological disorder affecting over 10 million people worldwide. PD classification models using voice signals as input are common in the literature. Deep learning algorithms are believed to further enhance performance; nevertheless, applying them is challenging because PD datasets are small-scale and imbalanced. This paper proposes a convolutional neural network-based deep support vector machine (CNN-DSVM) that automates feature extraction using a CNN and extends the conventional SVM to a DSVM for better classification performance on small-scale PD datasets. A customized kernel function reduces classification bias towards the majority class (healthy candidates in this study). An improved generative adversarial network (IGAN) was designed to generate additional training data to enhance the model’s performance. In the performance evaluation, the proposed algorithm achieves a sensitivity of 97.6% and a specificity of 97.3%. The performance comparison is evaluated from five perspectives, including comparisons with different data generation algorithms, feature extraction techniques, kernel functions, and existing works. Results reveal the effectiveness of the IGAN algorithm, which improves the sensitivity and specificity by 4.05%–4.72% and 4.96%–5.86%, respectively; and the effectiveness of the CNN-DSVM algorithm, which improves the sensitivity by 1.24%–57.4% and specificity by 1.04%–163% and reduces biased detection towards the majority class. The ablation experiments confirm the effectiveness of individual components. Two future research directions are also suggested.
Keywords: Convolutional neural network; data generation; deep support vector machine; feature extraction; generative artificial intelligence; imbalanced dataset; medical diagnosis; Parkinson’s disease; small-scale dataset
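The sensitivity and specificity figures quoted in the abstract above are the usual per-class rates of a binary classifier. As a minimal sketch (not the authors' code), assuming labels where 1 marks a PD sample and 0 a healthy one, they can be computed from prediction counts as follows; the function name and the toy labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # share of true PD cases that are detected
    specificity = tn / (tn + fp)  # share of healthy cases correctly cleared
    return sensitivity, specificity


# Hypothetical toy labels: 1 = PD, 0 = healthy (the majority class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting both rates, rather than plain accuracy, is what exposes a classifier that leans towards the majority class.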
A standardized dataset of CO-TPD spectra on transition-metal single-crystal surfaces
2
Authors: YANG Lin, WU Jianghong, WANG He. 《燃料化学学报(中英文)》 (Journal of Fuel Chemistry and Technology), PKU Core, 2026, Issue 4, pp. 180-190 (11 pages)
Temperature-programmed desorption (TPD) is a fundamental technique in surface science and heterogeneous catalysis for characterizing adsorption behavior and for extracting key parameters such as adsorption energy. However, most existing TPD data is accessible only in the form of published images, without structured, quantitative datasets. This constrains its utility for rigorous quantitative analysis and computational modelling. Using carbon monoxide (CO), a widely adopted probe molecule, a curated and standardized CO-TPD dataset is constructed, encompassing 14 transition-metal single-crystal surfaces, including copper (Cu) and ruthenium (Ru). By systematically extracting numerical data points from published spectra and applying normalization, essential spectral features such as peak shape are fully preserved. The dataset also documents relevant experimental parameters, including heating rates, and was developed using a standardized protocol for data collection and quality control. This resource serves as both a reference library to support the deconvolution of TPD spectra from complex catalysts and an experimental benchmark for calibrating parameters in theoretical models. By providing a reliable and accessible data foundation, this work advances the microscopic understanding and the rational design of catalyst active centers.
Keywords: CO-TPD; standardized dataset; transition metal; single-crystal surfaces
Layered Feature Engineering for E-Commerce Purchase Prediction: A Hierarchical Evaluation on Taobao User Behavior Datasets
3
Authors: Liqiu Suo, Lin Xia, Yoona Chung, Eunchan Kim. 《Computers, Materials & Continua》, 2026, Issue 4, pp. 1865-1889 (25 pages)
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features. This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers: Basic, Conversion & Stability (efficiency and volatility across actions), and Advanced Interactions & Activity (cross-behavior synergies and intensity). Using real Taobao (Alibaba’s primary e-commerce platform) logs (57,976 records for 10,203 users; 25 November–03 December 2017), we conducted a hierarchical, layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution. Across logistic regression (LR), decision tree, random forest, XGBoost, and CatBoost models with stratified 5-fold cross-validation, the performance improved monotonically from Basic to Conversion & Stability to Advanced features. With LR, F1 increased from 0.613 (Basic) to 0.962 (Advanced); boosted models achieved high discrimination (0.995 AUC score) and an F1 score up to 0.983. Calibration and precision–recall analyses indicated strong ranking quality, while potential dataset and period biases are acknowledged given the short (9-day) window. By making feature contributions measurable and reproducible, the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioral modeling. The code and processed artifacts are publicly available, and future work will extend the validation to longer, seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
Keywords: Hierarchical feature engineering; purchase prediction; user behavior dataset; feature importance; e-commerce platform; Taobao
Detection Method for Bolt Loosening of Fan Base through Bayesian Learning with Small Dataset: A Real-World Application
4
Authors: Zhongyun Tang, Hanyi Xu, Haiyang Hu. 《Computers, Materials & Continua》, 2026, Issue 2, pp. 550-578 (29 pages)
With the deep integration of smart manufacturing and IoT technologies, higher demands are placed on the intelligence and real-time performance of industrial equipment fault detection. For industrial fans, base bolt loosening faults are difficult to identify through conventional spectrum analysis, and the extreme scarcity of fault data leads to limited training datasets, making traditional deep learning methods inaccurate in fault identification and incapable of detecting loosening severity. This paper applies Bayesian learning, training on a small fault dataset collected from axial-flow fans in actual factory operation to obtain a posterior distribution. The method comprises specific data processing approaches and a Bayesian Convolutional Neural Network (BCNN) configuration, and it effectively improves the model’s generalization ability. Experimental results demonstrate high detection accuracy and alignment with real-world applications, offering practical significance and reference value for industrial fan bolt loosening detection under data-limited conditions.
Keywords: Bolt loosening detection; industrial small dataset; Bayesian learning; interpretability; real-world application
Fine-Med-Mental-T&P: a dual-track approach for high-quality instructional datasets of mental disorders in traditional Chinese medicine
5
Authors: Yanbai Wei, Xiaoshuo Jing, Junfeng Yan. 《Digital Chinese Medicine》, 2026, Issue 1, pp. 31-42 (12 pages)
Objective: To investigate methods for constructing a high-quality instructional dataset for traditional Chinese medicine (TCM) mental disorders and to validate its efficacy. Methods: We proposed the Fine-Med-Mental-T&P methodology for constructing high-quality instruction datasets in TCM mental disorders. This approach integrates theoretical knowledge and practical case studies through a dual-track strategy. (i) Theoretical track: textbooks and guidelines on TCM mental disorders were manually segmented. Initial responses were generated using DeepSeek-V3, followed by refinement by the Qwen3-32B model to align the expression with human preferences. A screening algorithm was then applied to select 16,000 high-quality instruction pairs. (ii) Practical track: starting from over 600 real clinical case seeds, diagnostic and therapeutic instruction pairs were generated using DeepSeek-V3 and subsequently screened through manual evaluation, resulting in 4,000 high-quality practice-oriented instruction pairs. The integration of both tracks yielded the Med-Mental-Instruct-T&P dataset, comprising a total of 20,000 instruction pairs. To validate the dataset’s effectiveness, three experimental evaluations (both manual and automated) were conducted: (i) comparative studies to compare the performance of models fine-tuned on different datasets; (ii) benchmarking against mainstream TCM-specific large language models (LLMs); (iii) a data ablation study to investigate the relationship between data volume and model performance. Results: Experimental results demonstrate the superior performance of the T&P-model fine-tuned on the Med-Mental-Instruct-T&P dataset. In the comparative study, the T&P-model significantly outperformed the baseline models trained solely on self-generated or purely human-curated baseline data. This superiority was evident in both automated metrics (ROUGE-L > 0.55) and expert manual evaluations (scoring above 7/10 for accuracy). In benchmark comparisons, the T&P-model also excelled against existing mainstream TCM LLMs (e.g., HuatuoGPT and ZuoyiGPT). It showed particularly strong capabilities in handling diverse clinical presentations, including challenging disorders such as insomnia and coma, showcasing its robustness and versatility. Data ablation studies showed that T&P-model performance had an overall upward trend with minor fluctuations when training data increased from 10% to 50%; beyond 50%, performance improvement slowed significantly, with metrics plateauing and approaching a saturation point.
Keywords: Mental disorder; Traditional Chinese medicine (TCM); Instruction dataset construction; Instruction tuning; Large language model
Efficient Dataset Generation for Stacked Meat Products Instance Segmentation in Food Automation
6
Authors: Hoang Minh Pham, Anh Dong Le, Pablo Malvido-Fresnillo, Saigopal Vasudevan, José L. Martínez Lastra. 《IEEE/CAA Journal of Automatica Sinica》, 2026, Issue 1, pp. 224-226 (3 pages)
Dear Editor, This letter presents techniques to simplify dataset generation for instance segmentation of raw meat products, a critical step toward automating food production lines. Accurate segmentation is essential for addressing challenges such as occlusions, indistinct edges, and stacked configurations, which demand large, diverse datasets. To meet these demands, we propose two complementary approaches: a semi-automatic annotation interface using tools like the segment anything model (SAM) and GrabCut, and a synthetic data generation pipeline leveraging 3D-scanned models. These methods reduce reliance on real meat, mitigate food waste, and improve scalability. Experimental results demonstrate that incorporating synthetic data enhances segmentation model performance and, when combined with real data, further boosts accuracy, paving the way for more efficient automation in the food industry.
Keywords: dataset generation; segment anything model (SAM); food automation; raw meat products; automating food production lines; accurate instance segmentation; stacked meat products; semi-automatic annotation
BWRadarDataset-1.0: A Multi-Band, Multi-Modal Radar Detection and Perception Dataset
7
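Authors: 张转花, 靳俊峰, 常沛, 何洋洋, 汪振亚, 侯其立, 李玉景, 郝慧军, 曾怡, 夏勇, 商国军, 许涛, 任伟杰, 雷鸣, 王歆远, 寿博, 邓丽颖, 任乐乐, 窦曼莉, 杨利红, 张琦珺, 李伟, 牛蕾, 林晓斌, 张志成. 《雷达科学与技术》 (Radar Science and Technology), PKU Core, 2026, Issue 1, pp. 1-14 (14 pages)
Amid the rapid development of radar detection and perception technology, high-quality datasets play an important role in algorithm innovation, model training, and performance validation. Data-driven methods such as deep learning have become key to improving radar performance in core tasks including detection, tracking, recognition, jamming, and synthetic aperture radar (SAR) imaging. However, most existing datasets are generated by simulation and differ from the real electromagnetic environment, which limits their generalization. In addition, existing datasets each target a single function, for example only detection or only SAR, and lack a systematic design, so they cannot readily support integrated research on detection and perception processing. To fill this gap, this paper releases a complete, integrated dataset for radar detection, tracking, and recognition. The dataset is drawn from typical measured scenarios and covers multi-band, multi-modal data for signal processing, target tracking, fine-grained recognition, compound jamming, and high-resolution SAR imagery, faithfully reflecting the propagation characteristics of radar signals and the characteristics of targets in complex environments. The paper further extracts and analyzes the key features of the dataset systematically, providing standardized feature inputs for algorithm research and performance evaluation across tasks and a solid foundation for research on intelligent radar signal and information processing.
Keywords: radar detection; public dataset; feature extraction; target detection; target tracking; target recognition; active jamming; SAR imagery; feature analysis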
UD-TN: A comprehensive ultrasound dataset for benign and malignant thyroid nodule classification
8
Authors: Jialin Zhu, Xuzhou Fu, Zhiqiang Liu, Luchen Chang, Xuewei Li, Jie Gao, Ruiguo Yu, Xi Wei. 《Intelligent Oncology》, 2025, Issue 2, pp. 176-187 (12 pages)
The automatic classification of thyroid nodules in ultrasound images is a critical research focus in medical imaging. However, publicly available thyroid ultrasound datasets remain scarce. In this study, we developed the Ultrasound Dataset for Thyroid Nodules (UD-TN), a comprehensive dataset containing 10,495 labeled images classified as benign or malignant based on pathology-confirmed results. To establish a benchmark, we proposed the Thyroid Ultrasound Image Neural Network (ThyUNet), a deep learning model designed for accurate nodule classification. By incorporating high-resolution feature enhancement, instance normalization, and dilated convolutions into residual blocks, ThyUNet excels in extracting fine-grained features, particularly for small nodules. Experimental results demonstrate that ThyUNet achieves state-of-the-art performance, with an accuracy of 89.7%, a sensitivity of 0.879, and a specificity of 0.910 on the testing set. These results surpass those of other advanced architectures, highlighting the model’s effectiveness. UD-TN and ThyUNet contribute significantly to advancing intelligent medical diagnostics. Dataset details and access instructions are available at https://github.com/18811755633/Sample-of-UD-TN.
Keywords: Ultrasound dataset; Deep learning; Nodule classification; Medical imaging dataset
Impact of Dataset Size on Machine Learning Regression Accuracy in Solar Power Prediction (Cited by: 1)
9
Authors: S. M. Rezaul Karim, Md. Shouquat Hossain, Khadiza Akter, Debasish Sarker, Md. Moniul Kabir, Mamdouh Assad. 《Energy Engineering》, 2025, Issue 8, pp. 3041-3054 (14 pages)
Understanding how dataset size influences regression models can help improve the accuracy of solar power forecasts and make the most of renewable energy systems. This research explores the influence of dataset size on the accuracy and reliability of regression models for solar power prediction, contributing to better forecasting methods. The study analyzes data from two solar panels, aSiMicro03036 and aSiTandem72-46, over 7, 14, 17, 21, 28, and 38 days, with each dataset comprising five independent parameters and one dependent parameter, and split 80–20 for training and testing. Results indicate that Random Forest consistently outperforms other models, achieving the highest correlation coefficient of 0.9822 and the lowest Mean Absolute Error (MAE) of 2.0544 on the aSiTandem72-46 panel with 21 days of data. For the aSiMicro03036 panel, the best MAE of 4.2978 was reached by the k-Nearest Neighbor (k-NN) algorithm, configured as the instance-based learner IBk in Weka and trained on 17 days of data. Regression performance for most models (excluding IBk) stabilizes at 14 days or more. Compared to the 7-day dataset, increasing to 21 days reduced the MAE by around 20% and improved correlation coefficients by around 2.1%, highlighting the value of moderate dataset expansion. These findings suggest that datasets spanning 17 to 21 days, with 80% used for training, can significantly enhance the predictive accuracy of solar power generation models.
Keywords: Correlation coefficients; dataset size; machine learning; mean absolute error; regression; solar power prediction
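The abstract above compares models by correlation coefficient and mean absolute error (MAE) on an 80–20 train/test split. The sketch below shows those two metrics and a simple split in plain Python. It is not the Weka workflow used in the study: the sample values are made up, and splitting in row order is an assumption, since the abstract does not say how rows were assigned.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation_coefficient(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def split_80_20(rows):
    """First 80% of rows for training, the remaining 20% for testing."""
    cut = len(rows) * 4 // 5
    return rows[:cut], rows[cut:]


# Hypothetical measured vs. predicted power values.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
mae = mean_absolute_error(actual, predicted)
r = correlation_coefficient(actual, predicted)
train, test = split_80_20(list(range(10)))
print(f"MAE={mae:.4f}, r={r:.4f}, train={len(train)}, test={len(test)}")
```

The two metrics answer different questions: MAE is in the units of the target, while the correlation coefficient only measures how well predictions track the trend.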
High-resolution Simulation Dataset of Hourly PM2.5 Chemical Composition in China (CAQRA-aerosol) from 2013 to 2020 (Cited by: 1)
10
Authors: Lei KONG, Xiao TANG, Jiang ZHU, Zifa WANG, Bing LIU, Yuanyuan ZHU, Lili ZHU, Duohong CHEN, Ke HU, Huangjian WU, Qian WU, Jin SHEN, Yele SUN, Zirui LIU, Jinyuan XIN, Dongsheng JI, Mei ZHENG. 《Advances in Atmospheric Sciences》, 2025, Issue 4, pp. 697-712 (16 pages)
Scientific knowledge on the chemical composition of fine particulate matter (PM2.5) is essential for properly assessing its health and climate effects, and for decision-makers to develop efficient mitigation strategies. A high-resolution PM2.5 chemical composition dataset (CAQRA-aerosol) is developed in this study, which provides hourly maps of organic carbon, black carbon, ammonium, nitrate, and sulfate in China from 2013 to 2020 with a horizontal resolution of 15 km. This paper describes the method, access, and validation results of this dataset. It shows that CAQRA-aerosol has good consistency with observations and achieves higher or comparable accuracy relative to previous PM2.5 composition datasets. Based on CAQRA-aerosol, spatiotemporal changes of different PM2.5 components were investigated from a national viewpoint, highlighting that nitrate changed differently from the other components. The estimated annual rate of change of population-weighted nitrate concentrations is 0.23 μg m⁻³ yr⁻¹ from 2015 to 2020, compared with −0.19 to −1.1 μg m⁻³ yr⁻¹ for the other components. The whole dataset is freely available from the China Air Pollution Data Center (https://doi.org/10.12423/capdb_PKU.2023.DA).
Keywords: PM2.5 composition dataset; black carbon; organic carbon; ammonium; nitrate; sulfate
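The abstract above reports annual rates of population-weighted concentrations. A minimal sketch of both steps follows, assuming the common definitions: a population-weighted mean over grid cells, and a least-squares slope over yearly values. The function names and all numbers are hypothetical and are not taken from CAQRA-aerosol.

```python
def population_weighted_mean(concentrations, populations):
    """Mean concentration over grid cells, weighted by each cell's population."""
    total_population = sum(populations)
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


def annual_rate(years, values):
    """Least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    num = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    den = sum((y - mean_year) ** 2 for y in years)
    return num / den


# Hypothetical example: two grid cells, the second three times more populous.
pw_mean = population_weighted_mean([10.0, 20.0], [1, 3])
# Hypothetical yearly population-weighted values.
rate = annual_rate([2015, 2016, 2017], [1.0, 1.5, 2.0])
print(pw_mean, rate)
```

Weighting by population shifts the average towards the air that most people actually breathe, which is why it can differ markedly from a plain area mean.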
MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification With Zoom-Free Remote Sensing Imagery (Cited by: 1)
11
Authors: Yansheng Li, Yuning Wu, Gong Cheng, Chao Tao, Bo Dang, Yu Wang, Jiahao Zhang, Chuge Zhang, Yiting Liu, Xu Tang, Jiayi Ma, Yongjun Zhang. 《IEEE/CAA Journal of Automatica Sinica》, 2025, Issue 5, pp. 1004-1023 (20 pages)
Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications. However, existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples. This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios. To address this limitation, we introduce the million-scale fine-grained geospatial scene classification dataset (MEET), which contains over 1.03 million zoom-free remote sensing scene samples, manually annotated into 80 fine-grained categories. In MEET, each scene sample follows a scene-in-scene layout, where the central scene serves as the reference, and auxiliary scenes provide crucial spatial context for fine-grained classification. Moreover, to tackle the emerging challenge of scene-in-scene classification, we present the context-aware transformer (CAT), a model specifically designed for this task. CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes. Based on MEET, we establish a comprehensive benchmark for fine-grained geospatial scene classification, evaluating CAT against 11 competitive baselines. The results demonstrate that CAT significantly outperforms these baselines, achieving a 1.88% higher balanced accuracy (BA) with the Swin-Large backbone, and a notable 7.87% improvement with the Swin-Huge backbone. Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in urban functional zone mapping. The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.
Keywords: Fine-grained geospatial scene classification (FGSC); million-scale dataset; remote sensing imagery (RSI); scene-in-scene; transformer
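Balanced accuracy (BA), the headline metric in the abstract above, is commonly defined as the mean of per-class recall, which keeps rare classes from being swamped by frequent ones. The sketch below assumes that definition (the abstract does not spell it out) and uses hypothetical class labels.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels: class "a" is twice as frequent as class "b".
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "a"]
ba = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy={ba:.3f}, plain accuracy={plain:.3f}")
```

Here recall is 3/4 for "a" and 1/2 for "b", so BA is lower than plain accuracy because the weaker minority class counts equally.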
A dataset for the structure and electrochemical performance of hard carbon as anodes for sodium-ion batteries
12
Authors: HOU Wei-yan, YI Zong-lin, JIA Wan-ru, YU Hong-tao, DAI Li-qin, YANG Jun-jie, CHEN Jing-peng, XIE Li-jing, SU Fang-yuan, CHEN Cheng-meng. 《新型炭材料(中英文)》 (New Carbon Materials), PKU Core, 2025, Issue 5, pp. 1193-1200 (8 pages)
This dataset collects, compares, and contrasts the capacities and structures of a series of hard carbon materials, and then searches for correlations between structure and electrochemical performance. The capacity data of the hard carbons were obtained by charge/discharge tests, and the materials were characterized by XRD, gas adsorption, true density tests, and SAXS. In particular, fitting the SAXS data gave a series of structural parameters that characterize the materials well. The related test details are given with the structural data of the hard carbons and the electrochemical performance of the sodium-ion batteries.
Keywords: Hard carbon; Sodium-ion battery; SAXS; Structural characterization; dataset
An Intrusion Detection System Based on HiTar-2024 Dataset Generation from LOG Files for Smart Industrial Internet-of-Things Environment
13
Authors: Tarak Dhaouadi, Hichem Mrabet, Adeeb Alhomoud, Abderrazak Jemai. 《Computers, Materials & Continua》, 2025, Issue 3, pp. 4535-4554 (20 pages)
The increasing adoption of Industrial Internet of Things (IIoT) systems in smart manufacturing is raising the number of cyberattacks and heightening the need for effective intrusion detection systems (IDS). However, existing datasets for IDS training often lack relevance to modern IIoT environments, limiting their applicability for research and development. To address this gap, this paper introduces the HiTar-2024 dataset, specifically designed for IIoT systems, which an IDS can use to detect imminent threats. HiTar-2024 was generated using the AREZZO simulator, which replicates realistic smart manufacturing scenarios. The generated dataset includes five distinct classes: Normal, Probing, Remote to Local (R2L), User to Root (U2R), and Denial of Service (DoS). Comprehensive experiments with popular Machine Learning (ML) classifiers, including BayesNet, Logistic, IBk, Multiclass, PART, and J48, demonstrate high accuracy, precision, recall, and F1-scores, exceeding 0.99 across all ML metrics. This result is attributed to a rigorous process that includes data pre-processing, feature extraction, correction of the class imbalance problem, and the use of a test option for model robustness. The approach emphasizes meticulous dataset construction through a complete dataset generation process, a careful labelling algorithm, and a sophisticated evaluation method, providing valuable insights to reinforce IIoT system security. Finally, the HiTar-2024 dataset is compared with other similar datasets in the literature, considering several factors such as data format, feature extraction tools, number of features, attack categories, number of instances, and ML metrics.
Keywords: Intrusion detection system; industrial IoT; machine learning; security; cyber-attacks; dataset
A New Dataset for Network Flooding Attacks in SDN-Based IoT Environments
14
Authors: Nader Karmous, Wadii Jlassi, Mohamed Ould-Elhassen Aoueileyine, Imen Filali, Ridha Bouallegue. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 12, pp. 4363-4393 (31 pages)
This paper introduces a robust Distributed Denial-of-Service (DDoS) attack detection framework tailored for Software-Defined Networking (SDN) based Internet of Things (IoT) environments, built upon a novel, synthetic multi-vector dataset generated in a Mininet-Ryu testbed using real-time flow-based labeling. The proposed model is based on the XGBoost algorithm, optimized with Principal Component Analysis for dimensionality reduction. It uses lightweight flow-level features extracted from OpenFlow statistics to classify attacks across critical IoT protocols including TCP, UDP, HTTP, MQTT, and CoAP, which keeps computational overhead low and processing fast. Performance was rigorously evaluated using key metrics, including Accuracy, Precision, Recall, F1-Score, False Alarm Rate (FAR), AUC-ROC, and Detection Time. Experimental results demonstrate the model’s high performance, achieving an accuracy of 98.93% and a low FAR of 0.86%, with a rapid median detection time of 1.02 s. This efficiency validates its superiority in meeting critical Key Performance Indicators, such as latency and high throughput, necessary for time-sensitive SDN-IoT systems. Furthermore, the model’s robustness and its statistically significant outperformance of baseline models such as Random Forest, k-Nearest Neighbors, and Gradient Boosting Machine were validated with the Wilcoxon signed-rank test and confirmed by successful deployment in a real SDN testbed for live traffic detection and mitigation.
Keywords: Cybersecurity; SDN; IoT; ML; AI; dataset; software defined networking; flooding; DDoS attacks; threat; Wilcoxon
DCS-SOCP-SVM: A Novel Integrated Sampling and Classification Algorithm for Imbalanced Datasets
15
Authors: Xuewen Mu, Bingcong Zhao. 《Computers, Materials & Continua》, 2025, Issue 5, pp. 2143-2159 (17 pages)
When dealing with imbalanced datasets, the traditional support vector machine (SVM) tends to produce a classification hyperplane that is biased towards the majority class, resulting in poor robustness. This paper proposes a high-performance classification algorithm specifically designed for imbalanced datasets. The proposed method first uses a biased second-order cone programming support vector machine (B-SOCP-SVM) to identify the support vectors (SVs) and non-support vectors (NSVs) in the imbalanced data. Then, it applies the synthetic minority over-sampling technique (SV-SMOTE) to oversample the support vectors of the minority class and uses the random under-sampling technique (NSV-RUS) multiple times to undersample the non-support vectors of the majority class. Combining the resulting minority-class dataset with multiple majority-class datasets yields multiple new balanced datasets. Finally, SOCP-SVM is used to classify each dataset, and the final result is obtained through the integrated algorithm. Experimental results demonstrate that the proposed method performs excellently on imbalanced datasets.
Keywords: DCS-SOCP-SVM; imbalanced datasets; sampling method; ensemble method; integrated algorithm
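The resampling stage described in the abstract above (oversample minority-class points with SMOTE, undersample the majority class several times, pair the results into multiple balanced sets) can be sketched as below. This is an illustration under stated assumptions, not the DCS-SOCP-SVM implementation: the B-SOCP-SVM step that separates support vectors from non-support vectors is not reproduced, so the inputs are simply taken as already selected. The function names, the value of k, and the doubling of the minority class are hypothetical choices.

```python
import math
import random


def smote(minority, n_new, k, rng):
    """Generate n_new synthetic points by interpolating towards near neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the chosen point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        u = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + u * (o - b) for b, o in zip(base, other)))
    return synthetic


def random_undersample(majority, n_keep, rng):
    """Keep n_keep majority-class points chosen without replacement."""
    return rng.sample(majority, n_keep)


def build_balanced_sets(minority, majority, n_sets, k=2, seed=0):
    """Pair one oversampled minority set with n_sets majority subsamples."""
    rng = random.Random(seed)
    # Assumption: double the minority class; the paper's ratio is not given here.
    oversampled = minority + smote(minority, len(minority), k, rng)
    return [
        (oversampled, random_undersample(majority, len(oversampled), rng))
        for _ in range(n_sets)
    ]


# Hypothetical 2-D points: 3 minority samples, 10 majority samples.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(float(i), float(i % 3) + 5.0) for i in range(10)]
balanced = build_balanced_sets(minority, majority, n_sets=3)
print([(len(mino), len(majo)) for mino, majo in balanced])
```

Each balanced pair would then be passed to its own classifier, with the per-set predictions combined by an ensemble rule such as majority voting; that voting rule is an assumption, since the abstract only says the results are integrated.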
A Comprehensive Review of Face Detection Techniques for Occluded Faces: Methods, Datasets, and Open Challenges
16
Authors: Thaer Thaher, Majdi Mafarja, Muhammed Saffarini, Abdul Hakim H. M. Mohamed, Ayman A. El-Saleh. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 6, pp. 2615-2673 (59 pages)
Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks, sunglasses, and other obstructions. Addressing this issue is crucial for applications such as surveillance, biometric authentication, and human-computer interaction. This paper provides a comprehensive review of face detection techniques developed to handle occluded faces. Studies are categorized into four main approaches: feature-based, machine learning-based, deep learning-based, and hybrid methods. We analyzed state-of-the-art studies within each category, examining their methodologies, strengths, and limitations based on widely used benchmark datasets, highlighting their adaptability to partial and severe occlusions. The review also identifies key challenges, including dataset diversity, model generalization, and computational efficiency. Our findings reveal that deep learning methods dominate recent studies, benefiting from their ability to extract hierarchical features and handle complex occlusion patterns. More recently, researchers have increasingly explored Transformer-based architectures, such as Vision Transformer (ViT) and Swin Transformer, to further improve detection robustness under challenging occlusion scenarios. In addition, hybrid approaches, which aim to combine traditional and modern techniques, are emerging as a promising direction for improving robustness. This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world, occlusion-prone environments. Further improvements and the proposal of broader datasets are required to develop more scalable, robust, and efficient models that can handle complex occlusions in real-world scenarios.
Keywords: Occluded face detection; feature-based; deep learning; machine learning; hybrid approaches; datasets
Impact of climate changes on Arizona State precipitation patterns using high-resolution climatic gridded datasets
17
作者 Hayder H.Kareem Shahla Abdulqader Nassrullah 《Journal of Groundwater Science and Engineering》 2025年第1期34-46,共13页
Climate change significantly affects the environment, ecosystems, communities, and economies. These impacts often result in both rapid and gradual changes in water resources, environmental conditions, and weather patterns. A geographical study was conducted in Arizona State, USA, to examine monthly precipitation concentration rates over time. This analysis used a high-resolution 0.5°×0.5° grid of monthly precipitation data from 1961 to 2022, provided by the Climatic Research Unit. The study aimed to analyze how climatic changes affected the first and last five years of each decade, as well as the entire decade, during the specified period. GIS was used to meet the objectives of this study. Arizona experienced 51–568 mm, 67–560 mm, 63–622 mm, and 52–590 mm of rainfall in the sixth, seventh, eighth, and ninth decades of the second millennium, respectively. Both the first and second five-year periods of each decade showed acceptable rainfall amounts despite fluctuations. However, rainfall decreased in the first and second decades of the third millennium and in the first two years of the third decade. Rainfall amounts dropped to 42–472 mm, 55–469 mm, and 74–498 mm, respectively, indicating a downward trend in precipitation. The central part of the state received the highest rainfall, while the eastern and western regions (spanning north to south) had significantly less. Over the decades of the third millennium, the average annual rainfall every five years was relatively low, showing a declining trend due to severe climate changes, generally ranging between 35 mm and 498 mm. The central regions consistently received more rainfall than the eastern and western outskirts. Arizona is currently experiencing a decrease in rainfall due to climate change, a situation that could deteriorate further. This highlights the need to optimize the use of existing rainfall and explore alternative water sources.
Keywords: Spatial Analysis; Climate Impact; Precipitation Rates; CRU dataset; GIS; Arizona State USA
A Comprehensive Brain MRI and Neurodevelopmental Dataset in Children with Tetralogy of Fallot
18
Authors: Yang Xu, Yaqi Zhang, Meijiao Zhu, Pengcheng Xue, Siyu Ma, Di Yu, Liang Hu, Yuxi Zhang, Wei Peng, Jirong Qi, Xuyun Wen, Ming Yang, Xuming Mo. Congenital Heart Disease, 2025, No. 5, pp. 559-570 (12 pages)
Background: The life-course management of children with tetralogy of Fallot (TOF) has focused on demonstrating brain structural alterations, developmental trajectories, and cognition-related changes that unfold over time. Methods: We introduce a magnetic resonance imaging (MRI) dataset comprising TOF children who underwent brain MRI scanning and cross-sectional neurocognitive follow-up. The dataset includes brain three-dimensional T1-weighted imaging (3D-T1WI), three-dimensional T2-weighted imaging (3D-T2WI), and neurodevelopmental evaluations using the Wechsler Preschool and Primary Scale of Intelligence–Fourth Edition (WPPSI-IV). Results: Thirty-one children with TOF (age range: 4–33 months; 18 males) were recruited and completed corrective surgery at the Children’s Hospital of Nanjing Medical University, Nanjing, China. Aiming to promote the neurodevelopmental outcomes in children with TOF, we have meticulously curated a comprehensive dataset designed to dissect the complex interplay among risk factors, neuroimaging findings, and adverse neurodevelopmental outcomes. Conclusion: This article aims to introduce our open-source dataset on neurodevelopment in children with TOF, which covers the data types, data acquisition and processing methods, the procedure for accessing the data, and related publications.
Keywords: Tetralogy of Fallot; neurodevelopment; dataset; congenital heart disease
A large-scale,high-quality dataset for lithology identification:Construction and applications
19
Authors: Jia-Yu Li, Ji-Zhou Tang, Xian-Zheng Zhao, Bo Fan, Wen-Ya Jiang, Shun-Yao Song, Jian-Bing Li, Kai-Da Chen, Zheng-Guang Zhao. Petroleum Science, 2025, No. 8, pp. 3207-3228 (22 pages)
Lithology identification is a critical aspect of geoenergy exploration, including geothermal energy development, gas hydrate extraction, and gas storage. In recent years, artificial intelligence techniques based on drill core images have made significant strides in lithology identification, achieving high accuracy. However, the current demand for advanced lithology identification models remains unmet due to the lack of high-quality drill core image datasets. This study successfully constructs and publicly releases the first open-source Drill Core Image Dataset (DCID), addressing the need for large-scale, high-quality datasets in lithology characterization tasks within geological engineering and establishing a standard dataset for model evaluation. DCID consists of 35 lithology categories and a total of 98,000 high-resolution images (512×512 pixels), making it the most comprehensive drill core image dataset in terms of lithology categories, image quantity, and resolution. This study also provides lithology identification accuracy benchmarks for popular convolutional neural networks (CNNs) such as VGG, ResNet, DenseNet, MobileNet, as well as for the Vision Transformer (ViT) and MLP-Mixer, based on DCID. Additionally, the sensitivity of model performance to various parameters and image resolution is evaluated. In response to real-world challenges, we propose a real-world data augmentation (RWDA) method, leveraging slightly defective images from DCID to enhance model robustness. The study also explores the impact of real-world lighting conditions on the performance of lithology identification models. Finally, we demonstrate how to rapidly evaluate model performance across multiple dimensions using low-resolution datasets, advancing the application and development of new lithology identification models for geoenergy exploration.
Keywords: Geoenergy exploration; Lithology identification; Lithology dataset; Artificial intelligence; Deep learning; Drill core
Standardizing Healthcare Datasets in China:Challenges and Strategies
20
Authors: Zheng-Yong Hu, Xiao-Lei Xiu, Jing-Yu Zhang, Wan-Fei Hu, Si-Zhu Wu. Chinese Medical Sciences Journal, 2025, No. 4, pp. 253-267, I0001 (16 pages)
Standardized datasets are foundational to healthcare informatization by enhancing data quality and unleashing the value of data elements. Using bibliometrics and content analysis, this study examines China's healthcare dataset standards from 2011 to 2025. It analyzes their evolution across types, applications, institutions, and themes, highlighting key achievements including substantial growth in quantity, optimized typology, expansion into innovative application scenarios such as health decision support, and broadened institutional involvement. The study also identifies critical challenges, including imbalanced development, insufficient quality control, and a lack of essential metadata (such as authoritative data element mappings and privacy annotations), which hampers the delivery of intelligent services. To address these challenges, the study proposes a multi-faceted strategy focused on optimizing the standard system's architecture, enhancing quality and implementation, and advancing both data governance (through authoritative tracing and privacy protection) and intelligent service provision. These strategies aim to promote the application of dataset standards, thereby fostering and securing the development of new productive forces in healthcare.
Keywords: healthcare dataset standards data standardization data management