8 articles found
Bilingual phrase induction with local hard negative sampling
1
Authors: Hailong Cao, Hualin Miao, Weixuan Wang, Liangyou Li, Wei Peng, Tiejun Zhao. Source: CAAI Transactions on Intelligence Technology, 2025, Issue 1, pp. 147-159 (13 pages).
Bilingual lexicon induction focuses on learning word translation pairs, also known as bitexts, from monolingual corpora by establishing a mapping between the source and target embedding spaces. Despite recent advancements, bilingual lexicon induction is limited to inducing bitexts consisting of individual words, lacking the ability to handle semantics-rich phrases. To bridge this gap and support downstream cross-lingual tasks, it is practical to develop a method for bilingual phrase induction that extracts bilingual phrase pairs from monolingual corpora without relying on cross-lingual knowledge. In this paper, the authors propose a novel phrase embedding training method based on the skip-gram structure. Specifically, a local hard negative sampling strategy that utilises negative samples of central tokens in sliding windows to enhance phrase embedding learning is introduced. The proposed method achieves competitive or superior performance compared to baseline approaches, with exceptional results recorded for distant languages. Additionally, we develop a phrase representation learning method that leverages multilingual pre-trained language models (mPLMs). These mPLMs-based representations can be combined with the above-mentioned static phrase embeddings to further improve the accuracy of the bilingual phrase induction task. We manually construct a dataset of bilingual phrase pairs and integrate it with MUSE to facilitate the bilingual phrase induction task.
Keywords: artificial intelligence; local hard negative sampling; natural language processing; phrase embedding; pre-trained language models
PNSS: Unknown Face Presentation Attack Detection with Pseudo Negative Sample Synthesis
2
Authors: Hongyang Wang, Yichen Shi, Jun Feng, Zitong Yu, Zhuofu Tao. Source: Computers, Materials & Continua, 2025, Issue 5, pp. 3097-3112 (16 pages).
Face Presentation Attack Detection (fPAD) plays a vital role in securing face recognition systems against various presentation attacks. While supervised learning-based methods demonstrate effectiveness, they are prone to overfitting to known attack types and struggle to generalize to novel attack scenarios. Recent studies have explored formulating fPAD as an anomaly detection problem or one-class classification task, enabling the training of generalized models for unknown attack detection. However, conventional anomaly detection approaches encounter difficulties in precisely delineating the boundary between bonafide samples and unknown attacks. To address this challenge, we propose a novel framework focusing on unknown attack detection using exclusively bonafide facial data during training. The core innovation lies in our pseudo-negative sample synthesis (PNSS) strategy, which facilitates learning of compact decision boundaries between bonafide faces and potential attack variations. Specifically, PNSS generates synthetic negative samples within low-likelihood regions of the bonafide feature space to represent diverse unknown attack patterns. To overcome the inherent imbalance between positive and synthetic negative samples during iterative training, we implement a dual-loss mechanism combining focal loss for classification optimization with pairwise confusion loss as a regularizer. This architecture effectively mitigates model bias towards bonafide samples while maintaining discriminative power. Comprehensive evaluations across three benchmark datasets validate the framework’s superior performance. Notably, our PNSS achieves 8%–18% average classification error rate (ACER) reduction compared with state-of-the-art one-class fPAD methods in cross-dataset evaluations on Idiap Replay-Attack and MSU-MFSD datasets.
Keywords: face presentation attack detection; pseudo negative sample; anomaly detection; one-class classification
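The abstract above names focal loss as the classification term of its dual-loss mechanism but gives no formula. As a minimal sketch, here is the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from common practice, not values reported for PNSS, and the pairwise confusion regularizer is not shown.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction (illustrative sketch).

    p: predicted probability of the positive class, 0 < p < 1
    y: true label, 1 (positive) or 0 (negative)
    gamma, alpha: assumed defaults, not taken from the listed paper
    """
    # Probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t)^gamma factor down-weights well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.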
A physics-informed machine learning solution for landslide susceptibility mapping based on three-dimensional slope stability evaluation
3
Authors: WANG Yun-hao, WANG Lu-qi, ZHANG Wen-gang, LIU Song-lin, SUN Wei-xin, HONG Li, ZHU Zheng-wei. Source: Journal of Central South University, 2024, Issue 11, pp. 3838-3853 (16 pages). Indexed in: CSCD.
Landslide susceptibility mapping is a crucial tool for disaster prevention and management. The performance of conventional data-driven models is greatly influenced by the quality of the sample data. The random selection of negative samples results in a lack of interpretability throughout the assessment process. To address this limitation and construct a high-quality negative sample database, this study introduces a physics-informed machine learning approach, combining the random forest model with Scoops 3D, to optimize the negative sample selection strategy and assess the landslide susceptibility of the study area. Scoops 3D is employed to determine the factor of safety value using Bishop’s simplified method. Instead of conventional random selection, negative samples are extracted from areas with a high factor of safety value. Subsequently, the results of the conventional random forest model and the physics-informed data-driven model are analyzed and discussed, focusing on model performance and prediction uncertainty. In comparison to conventional methods, the physics-informed model, set with a safety area threshold of 3, demonstrates a noteworthy improvement of 36.7% in the mean AUC value, coupled with reduced prediction uncertainty. It is evident that the choice of the safety area threshold affects both prediction uncertainty and model performance.
Keywords: machine learning; physics-informed model; negative samples selection; interpretability; landslide susceptibility mapping
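The selection rule described in the abstract above (draw negative samples only from areas with a high factor of safety instead of at random) can be sketched as follows. The cell representation with keys `id`, `fs`, and `is_landslide`, the function name, and the use of `>=` at the threshold are hypothetical choices for illustration. Only the threshold value of 3 comes from the abstract.

```python
import random


def select_negative_cells(cells, fs_threshold=3.0, n_samples=2, seed=0):
    """Sample non-landslide cells whose factor of safety meets the threshold.

    cells: list of dicts with hypothetical keys 'id', 'fs', 'is_landslide'
    fs_threshold: safety area threshold (3 in the abstract; '>=' is assumed)
    """
    # Keep only stable, non-landslide cells as candidate negatives.
    candidates = [
        c for c in cells
        if not c["is_landslide"] and c["fs"] >= fs_threshold
    ]
    # Seeded generator so the draw is reproducible.
    rng = random.Random(seed)
    k = min(n_samples, len(candidates))
    return rng.sample(candidates, k)
```

The selected cells would then be labeled as negatives and passed, together with the landslide inventory, to whatever classifier is used (a random forest in the abstract).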
False Negative Sample Detection for Graph Contrastive Learning (Cited by: 2)
4
Authors: Binbin Zhang, Li Wang. Source: Tsinghua Science and Technology, 2024, Issue 2, pp. 529-542 (14 pages). Indexed in: SCIE, EI, CAS, CSCD.
Recently, self-supervised learning has shown great potential in Graph Neural Networks (GNNs) through contrastive learning, which aims to learn discriminative features for each node without label information. The key to graph contrastive learning is data augmentation. The anchor node regards its augmented samples as positive samples, and the rest of the samples are regarded as negative samples, some of which may actually be positive samples. We call these mislabeled samples “false negative” samples; they seriously affect the final learning effect. Since such semantically similar samples are ubiquitous in the graph, the problem of false negative samples is very significant. To address this issue, the paper proposes a novel model, False negative sample Detection for Graph Contrastive Learning (FD4GCL), which uses attribute and structure awareness to detect false negative samples. Experimental results on seven datasets show that FD4GCL outperforms the state-of-the-art baselines and even exceeds several supervised methods.
Keywords: graph representation learning; contrastive learning; false negative sample detection
Two-Stage Negative Adaptive Cluster Sampling
5
Authors: R. V. Latpate, J. K. Kshirsagar. Source: Communications in Mathematics and Statistics, 2020, Issue 1, pp. 1-21 (21 pages). Indexed in: SCIE.
If the population is rare and clustered, then simple random sampling gives a poor estimate of the population total. For such populations, adaptive cluster sampling is useful, but it loses control over the final sample size. Hence, the cost of sampling increases substantially. To overcome this problem, surveyors often use auxiliary information that is easy to obtain and inexpensive. An attempt is made to control the final sample size through the auxiliary information. In this article, we propose a two-stage negative adaptive cluster sampling design. It is a new design that combines two-stage sampling and negative adaptive cluster sampling designs. In this design, we consider an auxiliary variable which is highly negatively correlated with the variable of interest, and the auxiliary information is completely known. In the first stage of this design, an initial random sample is drawn by using the auxiliary information. Further, using Thompson’s (J Am Stat Assoc 85:1050-1059, 1990) adaptive procedure, networks in the population are discovered. These networks serve as the primary-stage units (PSUs). In the second stage, random samples of unequal sizes are drawn from the PSUs to get the secondary-stage units (SSUs). The values of the auxiliary variable and the variable of interest are recorded for these SSUs. A regression estimator is proposed to estimate the population total of the variable of interest. A new estimator, the Composite Horvitz-Thompson (CHT)-type estimator, is also proposed. It is based only on the information on the variable of interest. Variances of the above two estimators, along with their unbiased estimators, are derived. Using the proposed methodology, a sample survey was conducted in the Western Ghats of Maharashtra, India. The performance of these estimators and of the methodology is compared with other existing methods. A cost-benefit analysis is given.
Keywords: adaptive cluster sampling; two-stage cluster sampling; negative adaptive cluster sampling; two-stage NACS; regression estimator
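The abstract above proposes a Composite Horvitz-Thompson (CHT)-type estimator but does not state its formula. For orientation only, the sketch below implements the classical Horvitz-Thompson estimator of a population total, which weights each sampled value by the inverse of its inclusion probability. It is not the composite estimator from the article, and the function name and argument layout are assumptions.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Classical Horvitz-Thompson estimate of a population total.

    values: observed values y_i of the variable of interest in the sample
    inclusion_probs: inclusion probability pi_i of each sampled unit
    Returns sum(y_i / pi_i). This is the textbook estimator, not the
    composite (CHT) variant named in the abstract.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(not 0.0 < p <= 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Units that were unlikely to be sampled stand in for more of the population.
    return sum(y / p for y, p in zip(values, inclusion_probs))
```

For example, a unit observed with value 10 and inclusion probability 0.5 contributes 20 to the estimated total.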
Open-Set Face Verification Algorithm Using Competitive Negative Samples
6
Authors: YANG Qiong, DING Xiao-qing. Source: Frontiers of Electrical and Electronic Engineering in China, 2006, Issue 1, pp. 20-25 (6 pages). Indexed in: CSCD.
A novel face verification algorithm using competitive negative samples is proposed. In the algorithm, the tested face is matched not only against the claimed client face but also against competitive negative samples, and all the matching scores are combined to make a final decision. Based on the algorithm, three schemes are designed: the closest-negative-sample scheme, the all-negative-sample scheme, and the closest-few-negative-sample scheme. They are tested and compared with the traditional similarity-based verification approach on several databases with different features and classifiers. Experiments demonstrate that the three schemes reduce the verification error rate by 25.15%, 30.24%, and 30.97% on average, respectively.
Keywords: image recognition; competitive negative samples; open-set face verification
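The abstract above names three schemes for combining the client matching score with scores against competitive negative samples, but it does not say how the scores are combined. The sketch below is one plausible reading, offered purely as an assumption: the client score is compared with the mean score of the selected negatives, and the claim is accepted when the margin exceeds a threshold. The function name, the scheme labels, `k`, and the margin rule are all hypothetical.

```python
def verify(client_score, negative_scores, scheme="closest", k=3, threshold=0.0):
    """Accept or reject an identity claim using competitive negative samples.

    client_score: similarity between the tested face and the claimed client
    negative_scores: similarities between the tested face and negative samples
    scheme: 'closest', 'closest_few' (top k), or 'all' (hypothetical labels)
    The margin-based combination rule is an assumption, not the paper's.
    """
    if not negative_scores:
        raise ValueError("at least one negative sample score is required")
    # Highest scores first: these are the most competitive negatives.
    ranked = sorted(negative_scores, reverse=True)
    if scheme == "closest":
        rivals = ranked[:1]
    elif scheme == "closest_few":
        rivals = ranked[:k]
    elif scheme == "all":
        rivals = ranked
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Accept only if the client score beats the rivals by more than the threshold.
    margin = client_score - sum(rivals) / len(rivals)
    return margin > threshold
```

Under this reading, the traditional similarity-based approach corresponds to thresholding `client_score` alone, while the schemes differ only in which negatives form the comparison set.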
Multiplex Networks and Pan-Cancer Multiomics-Based Driver Gene Identification Using Graph Neural Networks
7
Authors: Xingyi Li, Junming Li, Jun Hao, Xingyu Liao, Min Li, Xuequn Shang. Source: Big Data Mining and Analytics, 2024, Issue 4, pp. 1262-1272 (11 pages). Indexed in: CSCD.
Identifying cancer driver genes has paramount significance in elucidating the intricate mechanisms underlying cancer development, progression, and therapeutic interventions. Abundant omics data and interactome networks provided by numerous extensive databases enable the application of graph deep learning techniques that incorporate network structures into the deep learning framework. However, most existing models primarily focus on a single network, inevitably neglecting the incompleteness and noise of interactions. Moreover, samples with imbalanced classes in driver gene identification hamper the performance of models. To address this, we propose a novel deep learning framework, MMGN, which integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN not only enhances gene feature learning based on mutual information and a consensus regularizer, but also achieves balanced classes of positive and negative samples for model training. The reliability of MMGN has been verified by the Area Under the Receiver Operating Characteristic curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). We believe MMGN has the potential to provide new prospects in precision oncology and may find broader applications in predicting biomarkers for other intricate diseases.
Keywords: cancer driver gene; multiplex networks; pan-cancer multiomics data; graph neural networks; negative sample inference
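The abstract above reports AUROC as one of its evaluation metrics. As a minimal sketch of what that number measures, AUROC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, with ties counted as one half. The function name and input layout are assumptions; the quadratic pairwise loop is chosen for readability, not speed.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative sketch).

    labels: iterable of 1 (positive) or 0 (negative)
    scores: predicted scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For larger datasets a rank-based computation or an established library routine would normally be preferred over this O(n²) loop.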
A physics-informed data-driven model for landslide susceptibility assessment in the Three Gorges Reservoir area (Cited by: 10)
8
Authors: Songlin Liu, Luqi Wang, Wengang Zhang, Weixin Sun, Jie Fu, Ting Xiao, Zhenwei Dai. Source: Geoscience Frontiers, 2023, Issue 5, pp. 1-16 (16 pages). Indexed in: SCIE, CAS, CSCD.
Landslide susceptibility mapping is a crucial tool for analyzing geohazards in a region. Recent publications have popularized data-driven models, particularly machine learning-based methods, owing to their strong capability in dealing with complex nonlinear problems. However, a significant proportion of these models have neglected qualitative aspects during analysis, resulting in a lack of interpretability throughout the process and causing inaccuracies in the negative sample extraction. In this study, Scoops 3D was employed as a physics-informed tool to qualitatively assess slope stability in the study area (the Hubei Province section of the Three Gorges Reservoir Area). The non-landslide samples were extracted based on the calculated factor of safety (FS). Subsequently, the random forest algorithm was employed for data-driven landslide susceptibility analysis, with the area under the receiver operating characteristic curve (AUC) serving as the model evaluation index. Compared to the benchmark model (i.e., the standard method of utilizing the pure random forest algorithm), the proposed method’s AUC value improved by 20.1%, validating the effectiveness of the dual-driven method (physics-informed data-driven).
Keywords: machine learning; physics-informed; negative sample extraction; interpretability; dual-driven