4,598 articles found
A Convolutional Neural Network-Based Deep Support Vector Machine for Parkinson’s Disease Detection with Small-Scale and Imbalanced Datasets
1
Authors: Kwok Tai Chui, Varsha Arya, Brij B. Gupta, Miguel Torres-Ruiz, Razaz Waheeb Attar
Source: Computers, Materials & Continua, 2026, No. 1, pp. 1410-1432 (23 pages)
Parkinson’s disease (PD) is a debilitating neurological disorder affecting over 10 million people worldwide. PD classification models using voice signals as input are common in the literature. It is believed that using deep learning algorithms further enhances performance; nevertheless, it is challenging due to the nature of small-scale and imbalanced PD datasets. This paper proposes a convolutional neural network-based deep support vector machine (CNN-DSVM) to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets. A customized kernel function reduces the impact of biased classification towards the majority class (healthy candidates in our consideration). An improved generative adversarial network (IGAN) was designed to generate additional training data to enhance the model’s performance. For performance evaluation, the proposed algorithm achieves a sensitivity of 97.6% and a specificity of 97.3%. The performance comparison is evaluated from five perspectives, including comparisons with different data generation algorithms, feature extraction techniques, kernel functions, and existing works. Results reveal the effectiveness of the IGAN algorithm, which improves the sensitivity and specificity by 4.05%–4.72% and 4.96%–5.86%, respectively, and the effectiveness of the CNN-DSVM algorithm, which improves the sensitivity by 1.24%–57.4% and specificity by 1.04%–163% and reduces biased detection towards the majority class. The ablation experiments confirm the effectiveness of individual components. Two future research directions have also been suggested.
Keywords: Convolutional neural network; data generation; deep support vector machine; feature extraction; generative artificial intelligence; imbalanced dataset; medical diagnosis; Parkinson’s disease; small-scale dataset
Layered Feature Engineering for E-Commerce Purchase Prediction: A Hierarchical Evaluation on Taobao User Behavior Datasets
2
Authors: Liqiu Suo, Lin Xia, Yoona Chung, Eunchan Kim
Source: Computers, Materials & Continua, 2026, No. 4, pp. 1865-1889 (25 pages)
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features. This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers: Basic, Conversion & Stability (efficiency and volatility across actions), and Advanced Interactions & Activity (cross-behavior synergies and intensity). Using real Taobao (Alibaba’s primary e-commerce platform) logs (57,976 records for 10,203 users; 25 November–03 December 2017), we conducted a hierarchical, layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution. Across logistic regression (LR), decision tree, random forest, XGBoost, and CatBoost models with stratified 5-fold cross-validation, the performance improved monotonically from Basic to Conversion & Stability to Advanced features. With LR, F1 increased from 0.613 (Basic) to 0.962 (Advanced); boosted models achieved high discrimination (0.995 AUC score) and an F1 score up to 0.983. Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short (9-day) window. By making feature contributions measurable and reproducible, the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioral modeling. The code and processed artifacts are publicly available, and future work will extend the validation to longer, seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
Keywords: Hierarchical feature engineering; purchase prediction; user behavior dataset; feature importance; e-commerce platform; Taobao
Fine-Med-Mental-T&P: a dual-track approach for high-quality instructional datasets of mental disorders in traditional Chinese medicine
3
Authors: Yanbai Wei, Xiaoshuo Jing, Junfeng Yan
Source: Digital Chinese Medicine, 2026, No. 1, pp. 31-42 (12 pages)
Objective: To investigate methods for constructing a high-quality instructional dataset for traditional Chinese medicine (TCM) mental disorders and to validate its efficacy. Methods: We proposed the Fine-Med-Mental-T&P methodology for constructing high-quality instruction datasets in TCM mental disorders. This approach integrates theoretical knowledge and practical case studies through a dual-track strategy. (i) Theoretical track: textbooks and guidelines on TCM mental disorders were manually segmented. Initial responses were generated using DeepSeek-V3, followed by refinement by the Qwen3-32B model to align the expression with human preferences. A screening algorithm was then applied to select 16,000 high-quality instruction pairs. (ii) Practical track: starting from over 600 real clinical case seeds, diagnostic and therapeutic instruction pairs were generated using DeepSeek-V3 and subsequently screened through manual evaluation, resulting in 4,000 high-quality practice-oriented instruction pairs. The integration of both tracks yielded the Med-Mental-Instruct-T&P dataset, comprising a total of 20,000 instruction pairs. To validate the dataset’s effectiveness, three experimental evaluations (both manual and automated) were conducted: (i) comparative studies to compare the performance of models fine-tuned on different datasets; (ii) benchmarking to compare against mainstream TCM-specific large language models (LLMs); (iii) a data ablation study to investigate the relationship between data volume and model performance. Results: Experimental results demonstrate the superior performance of the T&P-model fine-tuned on the Med-Mental-Instruct-T&P dataset. In the comparative study, the T&P-model significantly outperformed the baseline models trained solely on self-generated or purely human-curated baseline data. This superiority was evident in both automated metrics (ROUGE-L > 0.55) and expert manual evaluations (scoring above 7/10 on accuracy). In benchmark comparisons, the T&P-model also excelled against existing mainstream TCM LLMs (e.g., HuatuoGPT and ZuoyiGPT). It showed particularly strong capabilities in handling diverse clinical presentations, including challenging disorders such as insomnia and coma, showcasing its robustness and versatility. Data ablation studies showed that T&P-model performance had an overall upward trend with minor fluctuations when training data increased from 10% to 50%; beyond 50%, performance improvement slowed significantly, with metrics plateauing and approaching a saturation point.
Keywords: Mental disorder; Traditional Chinese medicine (TCM); Instruction dataset construction; Instruction tuning; Large language model
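High-Precision Junction-to-Case Thermal Resistance Measurement Technique for Power SMDs Based on TDIM
4
Authors: 吴玉强, 郑花, 马凤丽, 侯杰, 许为新, 郭美洋
Source: 半导体技术 (PKU Core Journal), 2025, No. 12, pp. 1237-1243 (7 pages)
In conventional thermal resistance measurement of power surface-mount devices (SMDs), the heat-dissipation substrate is coplanar with the electrical terminals, which causes short circuits, and the thermocouple method introduces measurement errors. To address both problems, a high-precision junction-to-case thermal resistance (R_θJC) measurement technique based on the transient dual interface method (TDIM) is proposed. A dedicated fixture with a raised copper-plate boss and an insulating positioning plate effectively prevents electrical short circuits. Replacing the thermocouple method with TDIM eliminates the heat "wicking" effect and the error caused by the temperature-sensing position. With a TO-277 packaged Schottky diode as the test device, the measured R_θJC is 0.302 K/W, which deviates by only 0.67% from the typical datasheet value (0.30 K/W). Performing both thermal characterizations at the same junction temperature (423.15 K) significantly suppresses temperature-dependent errors. The technique provides a reliable measurement solution for the thermal management design of power SMDs and is suitable for engineering adoption.
Keywords: transient dual interface method (TDIM); surface-mount device (SMD); junction-to-case thermal resistance; dedicated fixture; thermal characterization; thermal management; one-dimensional heat flow path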
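The 0.67% figure quoted in the abstract above can be checked with a short calculation. This is only a sketch of the arithmetic: the two resistance values come from the abstract, while the helper name `relative_error_percent` is hypothetical and the relative-error definition (deviation divided by the datasheet value) is an assumption about how the authors computed it.

```python
def relative_error_percent(measured, reference):
    """Relative deviation of a measurement from a reference value, in percent."""
    return abs(measured - reference) / reference * 100.0


# Values quoted in the abstract (K/W); the formula itself is an assumption.
r_measured = 0.302   # measured junction-to-case thermal resistance
r_datasheet = 0.30   # typical datasheet value

error = relative_error_percent(r_measured, r_datasheet)
print(f"{error:.2f}%")  # prints 0.67%
```

Under that assumed definition the calculation reproduces the reported deviation.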
Standardizing Healthcare Datasets in China: Challenges and Strategies
5
Authors: Zheng-Yong Hu, Xiao-Lei Xiu, Jing-Yu Zhang, Wan-Fei Hu, Si-Zhu Wu
Source: Chinese Medical Sciences Journal, 2025, No. 4, pp. 253-267, I0001 (16 pages)
Standardized datasets are foundational to healthcare informatization by enhancing data quality and unleashing the value of data elements. Using bibliometrics and content analysis, this study examines China's healthcare dataset standards from 2011 to 2025. It analyzes their evolution across types, applications, institutions, and themes, highlighting key achievements including substantial growth in quantity, optimized typology, expansion into innovative application scenarios such as health decision support, and broadened institutional involvement. The study also identifies critical challenges, including imbalanced development, insufficient quality control, and a lack of essential metadata (such as authoritative data element mappings and privacy annotations), which hampers the delivery of intelligent services. To address these challenges, the study proposes a multi-faceted strategy focused on optimizing the standard system's architecture, enhancing quality and implementation, and advancing both data governance (through authoritative tracing and privacy protection) and intelligent service provision. These strategies aim to promote the application of dataset standards, thereby fostering and securing the development of new productive forces in healthcare.
Keywords: healthcare dataset standards; data standardization; data management
DCS-SOCP-SVM: A Novel Integrated Sampling and Classification Algorithm for Imbalanced Datasets
6
Authors: Xuewen Mu, Bingcong Zhao
Source: Computers, Materials & Continua, 2025, No. 5, pp. 2143-2159 (17 pages)
When dealing with imbalanced datasets, the traditional support vector machine (SVM) tends to produce a classification hyperplane that is biased towards the majority class and exhibits poor robustness. This paper proposes a high-performance classification algorithm specifically designed for imbalanced datasets. The proposed method first uses a biased second-order cone programming support vector machine (B-SOCP-SVM) to identify the support vectors (SVs) and non-support vectors (NSVs) in the imbalanced data. Then, it applies the synthetic minority over-sampling technique (SV-SMOTE) to oversample the support vectors of the minority class and uses the random under-sampling technique (NSV-RUS) multiple times to undersample the non-support vectors of the majority class. Combining the resulting minority-class dataset with the multiple majority-class datasets yields multiple new balanced datasets. Finally, SOCP-SVM is used to classify each dataset, and the final result is obtained through the integrated algorithm. Experimental results demonstrate that the proposed method performs excellently on imbalanced datasets.
Keywords: DCS-SOCP-SVM; imbalanced datasets; sampling method; ensemble method; integrated algorithm
Development and validation of AI delineation of the thoracic RTOG organs at risk with deep learning on multi-institutional datasets
7
Authors: Xianghua Ye, Dazhou Guo, Lujun Zhao, Congying Xie, Dandan Zheng, Haihua Yang, Xiangzhi Zhu, Xin Sun, Pingping Dong, Huanhuan Li, Weiwei Kong, Jianzhong Cao, Honglei Chen, Juntao Ran, Kai Ren, Hongxin Su, Hao Hu, Cuimeng Tian, Tianlu Wang, Qiang Zeng, Xiao Hu, Ping Peng, Junhua Zhang, Li Zhang, Tingting Zhang, Lue Zhou, Wenchao Guo, Zhanghexuan Ji, Puyang Wang, Hua Zhang, Jiali Liu, Le Lu, Senxiang Yan, Dakai Jin, Feng-Ming (Spring) Kong
Source: Intelligent Oncology, 2025, No. 1, pp. 61-71 (11 pages)
Introduction: Accurate contouring of thoracic organs at risk (OARs) is essential for minimizing complications in radiation treatment. Manual contouring of thoracic OARs is not only time-consuming but also prone to substantial user variation. To enhance efficiency and consistency, we developed a unified deep learning (DL) OAR contouring model, DeepOAR, that was trained using multiple partially labeled datasets for segmenting a comprehensive set of thoracic OARs following the Radiation Therapy Oncology Group (RTOG)-guided OAR atlas. This DL model supports the segmentation of six required and eight optional OARs guided by the NRG-RTOG 1106 trial, providing precise and reproducible OAR contours that are ready to be used in radiotherapy practice. Materials and methods: Following the OAR contouring recommendation of the NRG-RTOG 1106 trial, we collected and curated three private datasets and two public datasets, comprising a total of 531 patients with partially annotated thoracic OARs. These partially annotated datasets were utilized to develop DeepOAR, which consisted of a shared encoder and 14 separate decoders, with each decoder dedicated to one specific OAR. For model training, we utilized all patients from the two public datasets and 75% of the patients from the private datasets. We reserved the remaining 25% of the private datasets for independent testing. A multi-user study involving 21 radiation oncologists was conducted on 40 randomly selected patients from the independent testing dataset to evaluate the clinical applicability of DeepOAR. The Dice coefficient score (DSC) and average surface distance (ASD) were computed to evaluate the quantitative delineation performance of the model. Results: DeepOAR outperformed nnUNet (the benchmark medical segmentation model) across all 14 OARs, achieving mean DSC and ASD values of 88.4% and 1.0 mm, respectively, in the independent testing set. Multi-user validation demonstrated that 89.7% of DeepOAR-generated OARs were clinically acceptable or required only minor revisions. A comparison using two randomly selected patients showed that the delineation variability of DeepOAR was significantly smaller than the inter-user variation among radiation oncologists. Human editing of DeepOAR’s predictions could further improve OAR delineation accuracy by an average 3% increase in DSC and 40% reduction in ASD while significantly reducing the workload of radiation oncologists for contouring 14 thoracic OARs by an average of 77.0%. Conclusion: We developed DeepOAR, a DL-based unified contouring model trained using multiple partially labeled datasets, to delineate a comprehensive set of 14 thoracic OARs following the RTOG-guided OAR atlas. Both qualitative and quantitative results demonstrated the strong clinical applicability of DeepOAR for the OAR delineation process in thoracic cancer radiotherapy workflows, along with improved efficiency, comprehensiveness, and quality.
Keywords: NRG-RTOG 1106; OAR segmentation; Deep learning; Partially labeled datasets
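The Dice coefficient score (DSC) that the abstract above uses to measure contour overlap can be illustrated with a minimal sketch. It assumes each contour is available as a set of voxel coordinates; the function name `dice_coefficient` and the toy masks are hypothetical and not taken from the paper, and the average surface distance (ASD) is not covered here.

```python
def dice_coefficient(pred, truth):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two sets of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


# Toy 2-D masks (hypothetical data): four voxels each, two of them shared.
predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (1, 1), (0, 2), (1, 2)}

dsc = dice_coefficient(predicted, reference)
print(dsc)  # prints 0.5
```

A DSC of 1.0 means the two contours coincide exactly and 0.0 means they do not overlap at all, so the reported mean of 88.4% corresponds to 0.884 on this scale.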
A Comprehensive Review of Face Detection Techniques for Occluded Faces: Methods, Datasets, and Open Challenges
8
Authors: Thaer Thaher, Majdi Mafarja, Muhammed Saffarini, Abdul Hakim H.M. Mohamed, Ayman A. El-Saleh
Source: Computer Modeling in Engineering & Sciences, 2025, No. 6, pp. 2615-2673 (59 pages)
Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks, sunglasses, and other obstructions. Addressing this issue is crucial for applications such as surveillance, biometric authentication, and human-computer interaction. This paper provides a comprehensive review of face detection techniques developed to handle occluded faces. Studies are categorized into four main approaches: feature-based, machine learning-based, deep learning-based, and hybrid methods. We analyzed state-of-the-art studies within each category, examining their methodologies, strengths, and limitations based on widely used benchmark datasets, highlighting their adaptability to partial and severe occlusions. The review also identifies key challenges, including dataset diversity, model generalization, and computational efficiency. Our findings reveal that deep learning methods dominate recent studies, benefiting from their ability to extract hierarchical features and handle complex occlusion patterns. More recently, researchers have increasingly explored Transformer-based architectures, such as Vision Transformer (ViT) and Swin Transformer, to further improve detection robustness under challenging occlusion scenarios. In addition, hybrid approaches, which aim to combine traditional and modern techniques, are emerging as a promising direction for improving robustness. This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world, occlusion-prone environments. Further improvements and the proposal of broader datasets are required to develop more scalable, robust, and efficient models that can handle complex occlusions in real-world scenarios.
Keywords: Occluded face detection; feature-based; deep learning; machine learning; hybrid approaches; datasets
Impact of climate changes on Arizona State precipitation patterns using high-resolution climatic gridded datasets
9
Authors: Hayder H. Kareem, Shahla Abdulqader Nassrullah
Source: Journal of Groundwater Science and Engineering, 2025, No. 1, pp. 34-46 (13 pages)
Climate change significantly affects the environment, ecosystems, communities, and economies. These impacts often result in quick and gradual changes in water resources, environmental conditions, and weather patterns. A geographical study was conducted in Arizona State, USA, to examine monthly precipitation concentration rates over time. This analysis used a high-resolution 0.50 × 0.50 grid of monthly precipitation data from 1961 to 2022, provided by the Climatic Research Unit. The study aimed to analyze how climatic changes affected the first and last five years of each decade, as well as the entire decade, during the specified period. GIS was used to meet the objectives of this study. Arizona experienced 51–568 mm, 67–560 mm, 63–622 mm, and 52–590 mm of rainfall in the sixth, seventh, eighth, and ninth decades of the second millennium, respectively. Both the first and second five-year periods of each decade showed acceptable rainfall amounts despite fluctuations. However, rainfall decreased in the first and second decades of the third millennium and in the first two years of the third decade. Rainfall amounts dropped to 42–472 mm, 55–469 mm, and 74–498 mm, respectively, indicating a downward trend in precipitation. The central part of the state received the highest rainfall, while the eastern and western regions (spanning north to south) had significantly less. Over the decades of the third millennium, the average annual rainfall every five years was relatively low, showing a declining trend due to severe climate changes, generally ranging between 35 mm and 498 mm. The central regions consistently received more rainfall than the eastern and western outskirts. Arizona is currently experiencing a decrease in rainfall due to climate change, a situation that could deteriorate further. This highlights the need to optimize the use of existing rainfall and explore alternative water sources.
Keywords: Spatial Analysis; Climate Impact; Precipitation Rates; CRU Dataset; GIS; Arizona State; USA
The Development of Artificial Intelligence: Toward Consistency in the Logical Structures of Datasets, AI Models, Model Building, and Hardware?
10
Authors: Li Guo, Jinghai Li
Source: Engineering, 2025, No. 7, pp. 13-17 (5 pages)
The aim of this article is to explore potential directions for the development of artificial intelligence (AI). It points out that, while current AI can handle the statistical properties of complex systems, it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns. The article also discusses a potential path of AI development in the engineering domain. Based on the existing understanding of the principles of multilevel complexity, this article suggests that consistency among the logical structures of datasets, AI models, model-building software, and hardware will be an important AI development direction and is worthy of careful consideration.
Keywords: consistency; datasets; model building; AI models; artificial intelligence (AI); explore potential directions; hardware
A Comprehensive Review of Face Detection/Recognition Algorithms and Competitive Datasets to Optimize Machine Vision
11
Authors: Mahmood Ul Haq, Muhammad Athar Javed Sethi, Sadique Ahmad, Naveed Ahmad, Muhammad Shahid Anwar, Alpamis Kutlimuratov
Source: Computers, Materials & Continua, 2025, No. 7, pp. 1-24 (24 pages)
Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite the significant advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting conditions, occlusion, and diverse facial postures. In such scenarios, human perception is still well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, with a detailed survey of advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and restrictions in the application context. Additionally, we examine various datasets used for face detection and recognition, focusing on their task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying the research gap in the field of face detection and recognition and outlining potential directions for future research.
Keywords: Face recognition algorithms; face detection techniques; face recognition/detection datasets
A critical evaluation of deep-learning based phylogenetic inference programs using simulated datasets
12
Authors: Yixiao Zhu, Yonglin Li, Chuhao Li, Xing-Xing Shen, Xiaofan Zhou
Source: Journal of Genetics and Genomics, 2025, No. 5, pp. 714-717 (4 pages)
Inferring phylogenetic trees from molecular sequences is a cornerstone of evolutionary biology. Many standard phylogenetic methods (such as maximum-likelihood [ML]) rely on explicit models of sequence evolution and thus often suffer from model misspecification or inadequacy. The rising deep learning (DL) techniques offer a powerful alternative. Deep learning employs multi-layered artificial neural networks to progressively transform input data into more abstract and complex representations. DL methods can autonomously uncover meaningful patterns from data, thereby bypassing potential biases introduced by predefined features (Franklin, 2005; Murphy, 2012). Recent efforts have aimed to apply deep neural networks (DNNs) to phylogenetics, with a growing number of applications in tree reconstruction (Suvorov et al., 2020; Zou et al., 2020; Nesterenko et al., 2022; Smith and Hahn, 2023; Wang et al., 2023), substitution model selection (Abadi et al., 2020; Burgstaller-Muehlbacher et al., 2023), and diversification rate inference (Voznica et al., 2022; Lajaaiti et al., 2023; Lambert et al., 2023). In phylogenetic tree reconstruction, PhyDL (Zou et al., 2020) and Tree_learning (Suvorov et al., 2020) are two notable DNN-based programs designed to infer unrooted quartet trees directly from alignments of four amino acid (AA) and DNA sequences, respectively.
Keywords: phylogenetic inference; explicit models; sequence evolution; deep learning (DL) techniques; molecular sequences; simulated datasets; phylogenetic methods; evolutionary biology
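The Basic Connotation and Construction Requirements of High-Quality Archival Datasets
13
Authors: 加小双, 郭若涵, 刘力超
Source: 北京档案 (PKU Core Journal), 2026, No. 3, pp. 5-11 (7 pages)
With the advancement of the "AI + Archives" initiative, the value of archival data has become increasingly prominent. Clarifying the basic connotation of high-quality archival datasets and the requirements for building them has become a foundational question that the archival field urgently needs to answer. The paper argues that a high-quality archival dataset is a highly trustworthy form of data collection that takes lawfully archived archival data as its object and is produced through archival data governance and standardized processing. It is not only an important foundation for deepening intelligent applications in the archival field, but also a key path for archival work to activate the latent value of data and extend the boundaries of use in the AI era. Accordingly, the construction of high-quality archival datasets should establish systematic norms covering controlled use, coordinated responsibilities, data boundaries, and continuous management, so that the trustworthiness of archival data and its readiness for intelligent applications improve together.
Keywords: high-quality archival dataset; archival data; high-quality dataset; "AI + Archives"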
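BHDSI: A Remote Sensing Building Height Dataset for Deep Learning
14
Authors: 王浩, 马遥, 曹昌昊, 宁晓刚, 张翰超, 张瑞倩
Source: 遥感学报 (PKU Core Journal), 2026, No. 2, pp. 445-457 (13 pages)
Estimating building height from optical and SAR remote sensing imagery is important for understanding urban morphology and optimizing existing urban space. However, existing datasets have several limitations: the small number of samples cannot meet the needs of deep learning-based remote sensing information extraction; the areas covered are limited and do not provide sufficient geographic diversity or representative spatial features; and large-scale building height datasets for China are particularly scarce. In addition, limited open availability restricts their use and validation in broader research. To address these problems, this paper constructs BHDSI (Building Height Estimation Dataset Based on Sentinel Imagery), a deep learning-oriented building height dataset based on Sentinel imagery. The dataset covers the central urban areas of 62 Chinese cities with 5,606 samples spanning urban and rural scenes, and is currently the building height dataset with the largest coverage in China. It contains Sentinel-1 and Sentinel-2 imagery together with ground-truth building heights. The sample size is 256 × 256, which offers an important complementary option for building height estimation research relative to datasets with 64 × 64 samples. Compared with other datasets, BHDSI offers a large number of samples, wide coverage, accessibility, and a reasonable distribution of building heights, and thus better meets the training needs of deep learning networks. On this basis, the paper evaluates BHDSI and other similar datasets using the same deep learning network, compares the performance of several networks on the building height regression task when trained on BHDSI, and analyzes the strengths and weaknesses of each network. The results show that, compared with other datasets, BHDSI performs better on the building height regression task. Further analysis finds that, with BHDSI, estimation accuracy is relatively higher in areas with lower building heights. In addition, using a U-Net decoder to train the building height estimation network achieves higher accuracy. In summary, BHDSI provides important support for future research on building height estimation.
Keywords: Sentinel imagery; building height; dataset; deep learning; convolutional neural network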
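Dataset of Document Delivery Records of the Fujian University Digital Library (FULink) Consortium, 2012-2023
15
Authors: 戴晓翔, 詹庆东
Source: 图书馆杂志 (PKU Core Journal), 2026, No. 3, pp. 66-71 (6 pages)
Document delivery records of the FULink consortium for 2012-2023 were collected through the document supply system of the Fujian University Digital Library (FULink). After joining data across multiple tables and applying irreversible encryption to readers' private data, 4,075,485 consortium document delivery records were produced. The dataset can serve as an empirical research object for user retention and is also an important reference for other library consortia.
Keywords: document delivery; FULink; university digital library consortium; dataset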
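A Bayesian Network Parameter Learning Algorithm Based on Knowledge Distillation
16
Authors: 郭文强, 张琦, 侯勇严, 冯宽平, 郭志高, 刘佳乐
Source: 陕西科技大学学报 (PKU Core Journal), 2026, No. 2, pp. 187-196 (10 pages)
Parameter learning for Bayesian networks (BNs) on small datasets usually relies on domain knowledge in the form of qualitative parameter constraints. However, how to obtain such knowledge is unclear, so it is hard to exploit fully, which in turn degrades parameter learning accuracy. This paper therefore explores a knowledge extraction method based on knowledge distillation and builds on it a BN parameter learning algorithm (Knowledge Distillation BN, KD-BN). The algorithm extracts parameter ordering relations from the parameter space of a teacher model and converts them into distilled knowledge, namely intra-distribution constraints (IDC) and cross-distribution constraints (CDC), which bound the feasible parameter region of the student model. A dynamic fusion strategy then combines the candidate parameters generated under the constraints with the student model's estimates from observed data to obtain the final BN parameters. Experiments show that, on small datasets, KD-BN achieves higher parameter learning accuracy than the conventional MLE, MAP, and QMAP methods. The algorithm has been successfully applied to a real bearing fault diagnosis scenario, providing a solution for BN parameter learning on small datasets.
Keywords: Bayesian network; small dataset; knowledge distillation; parameter learning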
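A Large-Scale Chinese Data Evaluation Benchmark for Face Video Forgery Detection
17
Authors: 贝毅君, 娄恒瑞, 高克威, 宋杰, 王蕊, 金苍宏, 雷杰, 宋明黎, 胡秉德, 冯尊磊
Source: 中国图象图形学报 (PKU Core Journal), 2026, No. 1, pp. 82-98 (17 pages)
Objective: Highly realistic forged face videos produced by generative AI (artificial intelligence generated content, AIGC) can deceive human visual perception, and current evaluation systems for face forgery detection algorithms have not been validated for effectiveness and applicability on Chinese data. This work aims to build a quantitative evaluation benchmark for Chinese scenarios to drive the iteration of forgery detection technology. Methods: The CHN-DF (Chinese-deepfake) dataset of large-scale Chinese forged face videos is proposed, with a detailed description of the full construction pipeline covering data collection, forged sample generation, and quality assessment. Multi-dimensional experiments verify the complexity of the dataset, accounting for factors such as coverage of cross-modal forgery techniques and completeness of environmental interference factors, and a systematic evaluation benchmark based on deep detection models is established. Results: The released dataset is the first Chinese face video anti-forgery dataset worldwide and contains 434,727 samples. Experiments show that it is hard to discriminate: across 16 evaluated models, including state-of-the-art (SOTA) and mainstream detectors, accuracy stays below 85% for visual-only detection and below 70% for combined audio-visual detection. The benchmark covers visual and audio modalities. In cross-domain generalization tests, model accuracy fluctuates by 19.6% on average, clearly exposing the practical limitations of existing algorithms. Conclusion: The Chinese anti-forgery benchmark fills a gap in the field. Systematic experiments clarify how dataset characteristics relate to algorithm performance, and key directions such as strengthening model robustness and improving cross-modal generalization are proposed, providing data support and practical guidance for quantitative evaluation in Chinese scenarios and for real-world deployment of face video forgery detection. The CHN-DF dataset is available at https://doi.org/10.57760/sciencedb.j00240.00067 and https://github.com/HengruiLou/CHN-DF.
Keywords: deepfake; forged face video; face anti-forgery evaluation benchmark; Chinese dataset; multimodal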
Impacts of random negative training datasets on machine learning-based geologic hazard susceptibility assessment
18
Authors: Hao Cheng, Wei Hong, Zhen-kai Zhang, Zeng-lin Hong, Zi-yao Wang, Yu-xuan Dong
Source: China Geology, 2025, No. 4, pp. 676-690 (15 pages)
This study investigated the impacts of random negative training datasets (NTDs) on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau, northern Shaanxi Province, China. Based on 40 randomly generated NTDs, the study developed models for the geologic hazard susceptibility assessment using the random forest algorithm and evaluated their performances using the area under the receiver operating characteristic curve (AUC). Specifically, the means and standard deviations of the AUC values from all models were utilized to assess the overall spatial correlation between the conditioning factors and the susceptibility assessment, as well as the uncertainty introduced by the NTDs. A risk and return methodology was then employed to quantify and mitigate the uncertainty, with log odds ratios used to characterize the susceptibility assessment levels. The risk and return values were calculated based on the standard deviations and means of the log odds ratios of various locations. After the mean log odds ratios were converted into probability values, the final susceptibility map was plotted, which accounts for the uncertainty induced by random NTDs. The results indicate that the AUC values of the models ranged from 0.810 to 0.963, with an average of 0.852 and a standard deviation of 0.035, indicating encouraging prediction effects and certain uncertainty. The risk and return analysis reveals that low-risk and high-return areas correspond to lower standard deviations and higher means across multiple model-derived assessments. Overall, this study introduces a new framework for quantifying the uncertainty of multiple training and evaluation models, aimed at improving their robustness and reliability. Additionally, by identifying low-risk and high-return areas, resource allocation for geologic hazard prevention and control can be optimized, thus ensuring that limited resources are directed toward the most effective prevention and control measures.
Keywords: Landslides; Debris flows; Collapses; Ground fissures; Geologic hazard prevention and control engineering; Geologic hazard susceptibility assessment; Negative training dataset; Average spatial correlation; Random forest algorithm; Risk and return analysis; Geological survey engineering; Loess Plateau area
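The risk and return step described in the abstract above can be sketched for a single map location. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `log_odds` and `risk_return` are hypothetical, the clipping constant `eps` is added only to avoid taking the logarithm of zero, and the abstract does not say whether a population or sample standard deviation was used (the population form is assumed here).

```python
import math
from statistics import mean, pstdev


def log_odds(p, eps=1e-6):
    """Log odds ratio ln(p / (1 - p)); p is clipped away from 0 and 1 (assumption)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def risk_return(probabilities):
    """Summarize one location's susceptibility across several models.

    return value = mean of the log odds, risk = standard deviation of the
    log odds, probability = mean log odds mapped back through the logistic
    function.
    """
    logits = [log_odds(p) for p in probabilities]
    ret = mean(logits)
    risk = pstdev(logits)  # population standard deviation (assumed)
    probability = 1.0 / (1.0 + math.exp(-ret))
    return ret, risk, probability


# Hypothetical susceptibility probabilities from models trained on different NTDs.
stable_site = risk_return([0.80, 0.90, 0.85])   # models agree: lower risk
unstable_site = risk_return([0.50, 0.99, 0.70])  # models disagree: higher risk
print(stable_site)
print(unstable_site)
```

In this reading, a location whose models agree gets a small risk value, and a location with a high mean log odds gets a high return, which matches the abstract's description of low-risk, high-return areas.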
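Imbalanced Bearing Fault Diagnosis Based on Improved WGAN-GP and ConvNext 1D
19
Authors: 曹菁菁, 肖景昌, 张艳伟, 赵强伟, 苏越
Source: 机床与液压 (PKU Core Journal), 2026, No. 5, pp. 203-210 (8 pages)
To address data imbalance in rolling bearing fault diagnosis, a fault diagnosis method based on an improved Wasserstein generative adversarial network with gradient penalty (WGAN-GP) and ConvNext 1D is proposed. Wavelet packet decomposition is used to obtain wavelet packet energy-domain features, and statistical features are added as model input. The improved WGAN-GP performs data augmentation to generate a balanced dataset. Finally, a ConvNext 1D model designed with recursive downsampling performs fault classification, using the GELU activation function to improve feature representation. On the Case Western Reserve University (CWRU) and Southeast University (SEU) datasets, several imbalance-ratio scenarios are set up and the method is compared with five other deep models (such as Transformer, TCN, and WDCNN), using accuracy (ACC), AUC, and F1 score as metrics. On the CWRU dataset, the proposed method achieves the best performance under all three imbalanced datasets, with a highest ACC of 0.972, AUC of 0.987, and F1 score of 0.981, outperforming the comparison models; the confusion matrix shows lower recognition rates for a few fault classes, though all exceed 95%, and 100% accuracy for the remaining classes. On the SEU dataset, the highest ACC is 0.943, AUC 0.978, and F1 score 0.954, again better than the other methods, but the confusion matrix shows weak recognition of inner-race faults, with accuracy below 85%, while the other classes are recognized well. t-SNE visualization of the generated samples shows clearer clustering of fault classes in the balanced dataset, supporting the effectiveness of the WGAN-GP-generated samples. Compared with other classic models, the proposed method improves recognition of minority-class samples and achieves higher accuracy on imbalanced fault classification. The method based on improved WGAN-GP and ConvNext 1D effectively mitigates bearing fault data imbalance.
Keywords: rolling bearing; fault diagnosis; imbalanced dataset; deep learning; generative adversarial network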
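Occluded Traffic Sign Detection Algorithm Based on Improved RT-DETR
20
Authors: 于天河, 杨壮壮, 胡金帅, 常梦瑶, 王文龙
Source: 工程科学学报 (PKU Core Journal), 2026, No. 2, pp. 393-408 (16 pages)
Traffic sign detection suffers from small target sizes and low detection accuracy, and conventional detectors often fail to recognize signs accurately when they are photographed from a distance or heavily occluded. This paper proposes a traffic sign detection algorithm based on an improved RT-DETR. First, given the scarcity of datasets of occluded traffic signs, a dataset of traffic signs under occlusion is built. Second, a dilated reparameterization block is introduced into the inverted residual mobile block to construct a lightweight composite dilated residual block, which replaces the BasicBlock in the original backbone network and strengthens feature extraction. Finally, the loss function of RT-DETR is optimized with a proposed DS-IoU joint loss function to speed up model convergence. Experiments show that the improved algorithm reaches an mAP of 94.2% on the self-built dataset, an increase of 4.7% over the original algorithm, and mAPs of 92.8% and 91.7% on the public TT100K and CCTSDB2021 datasets, increases of 3.1% and 2.4% respectively. Params and GFLOPs are reduced by 26.0% and 12.5% relative to the original algorithm. The proposed improvements substantially reduce computation and parameter count while effectively improving detection accuracy for occluded traffic signs.
Keywords: traffic sign detection; RT-DETR; occlusion dataset; lightweight; joint loss function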