8 articles found
1. A Top-down Method of Extraction Entity Relationship Triples and Obtaining Annotated Data
Authors: Zhiqiang Hu, Zheng Ma, Jun Shi, Zhipeng Li, Xun Shao, Yangzhao Yang, Yong Liao, Zhenyuan Gao, Jie Zhang
Source: Journal of Quantum Computing, 2022, Issue 1, pp. 13-22 (10 pages)
Abstract: The extraction of entity relationship triples is essential for building a knowledge graph (KG). Meanwhile, most entity relationship extraction algorithms are data-driven, especially the currently popular deep learning algorithms. Therefore, obtaining a large number of accurate triples is the key both to building a good KG and to training a good entity relationship extraction algorithm. Because of business requirements, this KG's application field is fixed, and the experts' opinions must also be satisfied. Considering these factors, we adopt the top-down method, which determines the data schema first and then fills in the specific data according to the schema. The design of the data schema is the top-level design of a KG, and determining the data schema according to the characteristics of the KG is equivalent to determining the scope of data collection and the mode of data organization. This method is generally suitable for constructing a domain KG. This article proposes a fast and efficient method for extracting the triples of a top-down KG in social media with the help of the structured data in the information box on the right side of the related encyclopedia webpage. In addition, based on the obtained triples, a data labeling method is proposed to obtain sufficient high-quality training data for training various Natural Language Processing (NLP) information extraction algorithms.
Keywords: entity relationship triples; knowledge graph; top-down; social media; data labeling
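The top-down idea in the abstract above (fix the schema first, then fill it from infobox fields, then reuse the triples to label text) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the schema contents, the function names `extract_triples` and `label_sentence`, and the simple co-occurrence rule used for labeling are all hypothetical, since the abstract does not specify them.

```python
# Hypothetical schema: relation name -> (head entity type, tail entity type).
# The real schema would come from domain experts, as the abstract describes.
SCHEMA = {
    "founder": ("Company", "Person"),
    "headquarters": ("Company", "City"),
}


def extract_triples(entity, infobox, schema=SCHEMA):
    """Keep only the infobox fields whose key is a relation defined in the schema."""
    triples = []
    for field, value in infobox.items():
        if field in schema and value:
            triples.append((entity, field, value))
    return triples


def label_sentence(sentence, triples):
    """Assumed labeling rule: tag a sentence with every triple whose
    head and tail strings both occur in it."""
    return [t for t in triples if t[0] in sentence and t[2] in sentence]


if __name__ == "__main__":
    infobox = {
        "founder": "Alice Example",
        "headquarters": "Springfield",
        "website": "example.org",  # not in the schema, so it is ignored
    }
    triples = extract_triples("ExampleCorp", infobox)
    print(triples)
    print(label_sentence("ExampleCorp was started by Alice Example.", triples))
```

Filtering by schema keys is what makes the approach top-down: fields outside the agreed schema never become triples, whatever the infobox contains.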
2. Efficient XML Query and Update Processing Using A Novel Prime-Based Middle Fraction Labeling Scheme (cited by: 2)
Authors: Zunyue Qin, Yong Tang, Feiyi Tang, Jing Xiao, Changqin Huang, Hongzhi Xu
Source: China Communications (SCIE, CSCD), 2017, Issue 3, pp. 145-157 (13 pages)
Abstract: XML data can be represented by a tree or graph, and query processing for XML data requires the structural information among nodes. Designing an efficient labeling scheme for the nodes of Order-Sensitive XML trees is one of the important methods for managing XML data well. Previous labeling schemes such as region and prefix often sacrifice updating performance and suffer from increasing labeling space when new nodes are inserted. To overcome these limitations, in this paper we propose a new labeling idea of separating structure from order. Based on this idea, a novel Prime-based Middle Fraction Labeling Scheme (PMFLS) is designed, in which a series of algorithms are proposed to obtain the structural relationships among nodes and to support updates. PMFLS combines the advantages of both prefix and region schemes, with the structural information and sequential information expressed separately. PMFLS also supports Order-Sensitive updates without relabeling or recalculation, and its labeling space is stable. Experiments and analysis on several benchmarks are conducted, and the results show that PMFLS is efficient in handling updates and also significantly improves the performance of query processing with good scalability.
Keywords: XML data; structure information; order information; information separation; PMFLS labeling scheme
3. Photogrammetry engaged automated image labeling approach
Authors: Jonathan Boyack, Jongseong Brad Choi
Source: Visual Informatics, 2025, Issue 2, pp. 76-86 (11 pages)
Abstract: Deep learning models require many instances of training data to be able to accurately detect the desired object. However, the labeling of images is currently conducted manually due to the inclusion of irrelevant scenes in the original images, especially for data collected in a dynamic environment such as drone imagery. In this work, we developed an automated extraction of training data sets using photogrammetry. This approach works with continuous and arbitrary collection of visual data, such as video, encompassing a stationary object. A dense point cloud was first generated to estimate the geometric relationship between individual images using a structure-from-motion (SfM) technique, followed by user-designated regions of interest (ROIs) that are automatically extracted from the original images. An orthophoto mosaic of the façade plane of the building shown in the point cloud was created to ease the user's selection of an intended labeling region of the object, which is a one-time process. We verified this method by using the ROIs extracted from a previously obtained dataset to train and test a convolutional neural network modeled to detect damage locations. The method put forward in this work allows a relatively small amount of labeling to generate a large amount of training data. We successfully demonstrate the capabilities of the technique with a dataset previously collected by a drone from an abandoned building in which many of the glass windows have been damaged.
Keywords: Photogrammetry; Deep learning; Computer vision; Structure-from-motion; Orthophoto; ROI; Data labeling; Visual inspection
4. Pressure swing adsorption process modeling using physics-informed machine learning with transfer learning and labeled data
Authors: Zhiqiang Wu, Yunquan Chen, Bingjian Zhang, Jingzheng Ren, Qinglin Chen, Huan Wang, Chang He
Source: Green Chemical Engineering, 2025, Issue 2, pp. 233-248 (16 pages)
Abstract: Pressure swing adsorption (PSA) modeling remains a challenging task since it exhibits strong dynamic and cyclic behavior. This study presents a systematic physics-informed machine learning method that integrates transfer learning and labeled data to construct a spatiotemporal model of the PSA process. To approximate the latent solutions of partial differential equations (PDEs) in the specific steps of pressurization, adsorption, heavy reflux, counter-current depressurization, and light reflux, the system's network representation is decomposed into five lightweight sub-networks. On this basis, we propose a parameter-based transfer learning (TL) approach combined with domain decomposition to address the long-term integration of periodic PDEs and expedite the network training process. Moreover, to tackle challenges related to sharp adsorption fronts, our method allows for the inclusion of a specified amount of labeled data at the boundaries and/or within the system in the loss function. The results show that the proposed method closely matches the outcomes achieved through the conventional numerical method, effectively simulating all steps and cyclic behavior within the PSA processes.
Keywords: Physics-informed machine learning; Pressure swing adsorption; Transfer learning; Labeled data; Partial differential equations
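The abstract above mentions adding labeled data to the loss function but does not give its form. A generic physics-informed loss with a data term, written here as an assumption rather than the authors' formulation, might look like the following. The symbols are illustrative and not taken from the paper: u_θ is the network, 𝒩 the PDE residual operator, ℬ the boundary or initial condition operator, N_r, N_b, N_d the numbers of collocation, boundary, and labeled points, and λ_r, λ_b, λ_d weighting factors.

```latex
\mathcal{L}(\theta)
  = \frac{\lambda_r}{N_r} \sum_{i=1}^{N_r}
      \bigl| \mathcal{N}[u_\theta](x_i^{r}, t_i^{r}) \bigr|^2
  + \frac{\lambda_b}{N_b} \sum_{j=1}^{N_b}
      \bigl| \mathcal{B}[u_\theta](x_j^{b}, t_j^{b}) \bigr|^2
  + \frac{\lambda_d}{N_d} \sum_{k=1}^{N_d}
      \bigl| u_\theta(x_k^{d}, t_k^{d}) - u_k^{\mathrm{obs}} \bigr|^2
```

The first two terms enforce the physics without any measurements. The third pulls the network toward known values u_k^obs at the labeled points, which is where extra supervision near a sharp front would enter in a formulation of this kind.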
5. Classification framework and semantic labeling for Big Earth Data
Authors: Juanle Wang, Kun Bu, Dongmei Yan, Jingyue Wang, Bowen Duan, Min Zhang, Guojin He
Source: Big Earth Data (EI, CSCD), 2023, Issue 3, pp. 886-903 (18 pages)
Abstract: Big Earth Data refers to the multidimensional integration and association of scientific data, including geography, resources, environment, ecology, and biology. An effective data classification system and label management strategy are important foundations for long-term management of data resources. The objective of this study was to construct a classification system and realize multidimensional semantic data label management for the Big Earth Data Science Engineering Program (CASEarth). This study constructed two sets of classification and coding systems that realize classification by mapping to each other, namely the geosphere-level and Sustainable Development Goals (SDGs) indicator classifications. This technique was based on natural language processing technology and solved problems with subject-word segmentation, weight calculation, and dynamic matching. A prototype system for classification and label management was constructed based on more than 1,100 existing CASEarth datasets. Furthermore, we expect our study to provide the methodology and technical support for user-oriented classification and label management services for Big Earth Data.
Keywords: Big Earth data; CASEarth scientific engineering; data classification; data labeling; data management
6. Geostatistical semi-supervised learning for spatial prediction
Authors: Francky Fouedjio, Hassan Talebi
Source: Artificial Intelligence in Geosciences, 2022, Issue 1, pp. 162-178 (17 pages)
Abstract: Geoscientists are increasingly tasked with spatially predicting a target variable in the presence of auxiliary information using supervised machine learning algorithms. Typically, the target variable is observed at a few sampling locations due to the relatively time-consuming and costly process of obtaining measurements. In contrast, auxiliary variables are often exhaustively observed within the region under study through the increasing development of remote sensing platforms and sensor networks. Supervised machine learning methods do not fully leverage this large amount of auxiliary spatial data. Indeed, in these methods, the training dataset includes only labeled data locations (where both target and auxiliary variables were measured). At the same time, unlabeled data locations (where auxiliary variables were measured but not the target variable) are not considered during the model training phase. Consequently, only a limited amount of auxiliary spatial data is utilized during the model training stage. As an alternative to supervised learning, semi-supervised learning, which learns from labeled as well as unlabeled data, can be used to address this problem. However, conventional semi-supervised learning techniques do not account for the specificities of spatial data. This paper introduces a spatial semi-supervised learning framework where geostatistics and machine learning are combined to harness a large amount of unlabeled spatial data in combination with a typically smaller set of labeled spatial data. The main idea consists of leveraging the target variable's spatial autocorrelation to generate pseudo labels at unlabeled data points that are geographically close to labeled data points. This is achieved through geostatistical conditional simulation, where an ensemble of pseudo labels is generated to account for the uncertainty in the pseudo labeling process. The observed labels are augmented by this ensemble of pseudo labels to create an ensemble of pseudo training datasets. A supervised machine learning model is then trained on each pseudo training dataset, followed by an aggregation of trained models. The proposed geostatistical semi-supervised learning method is applied to synthetic and real-world spatial datasets. Its predictive performance is compared with some classical supervised and semi-supervised machine learning methods. The results indicate that it can effectively leverage a large amount of unlabeled spatial data to improve the target variable's spatial prediction.
Keywords: Labeled spatial data; Unlabeled spatial data; Spatial autocorrelation; Pseudo labeling; Spatial prediction
7. WSDSum: Unsupervised Extractive Summarization Based on Word Weight Fusion and Document Dynamic Comparison
Authors: Yukun Cao, Yuanmin Liu, Ming Chen, Jingjing Li, Tianhao Wang
Source: 国际计算机前沿大会会议论文集 (conference proceedings), 2024, Issue 3, pp. 108-122 (15 pages)
Abstract: Unsupervised extractive summarization aims to pinpoint representative sentences from raw text without relying on labeled summary data, capturing the overall content. Many prevalent methods prioritize the significance of sentences within a document, potentially overlooking the importance of varying keywords within a sentence. Moreover, many methods confine the summarization to information present only in the current document, potentially omitting crucial details essential for comprehensive document understanding. To tackle these challenges, this paper introduces WSDSum, an algorithm rooted in word weight fusion and dynamic document comparison. The algorithm employs two distinct word weight assessment methods to gauge the significance of words in a sentence and then combines their outcomes to evaluate word importance within a sentence more effectively. Furthermore, this paper proposes a dynamic document comparison approach that enhances the diversity of the generated summaries by creating positive examples from intra-document sentences and contrasting them with inter-document sentence counterexamples. This is achieved by leveraging a cosine annealing strategy to facilitate dynamic temperature comparisons with other documents. Experimental evaluations on three public datasets indicate that WSDSum outperforms traditional methods.
Keywords: Unsupervised extractive summarization; Labeled summary data; Cosine annealing strategy; Dynamic temperature comparisons; Summarization diversity
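The abstract above names a cosine annealing strategy for a dynamic temperature but gives no formula. The standard cosine annealing schedule, applied here to a contrastive temperature, is sketched below as an assumption about what such a schedule could look like. The function name and the default bounds `t_max` and `t_min` are hypothetical and not taken from the paper.

```python
import math


def cosine_annealed_temperature(step, total_steps, t_max=1.0, t_min=0.1):
    """Standard cosine annealing: decay smoothly from t_max to t_min.

    step is clamped to [0, total_steps], so values outside the range
    return the corresponding endpoint temperature.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return t_min + 0.5 * (t_max - t_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_annealed_temperature(step, 100))
```

In a contrastive objective, similarity scores are divided by the temperature before the softmax, so a decaying temperature sharpens the contrast between positive and negative sentences as training proceeds.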
8. A Threshold-Control Generative Adversarial Network Method for Intelligent Fault Diagnosis (cited by: 4)
Authors: Xinyu Li, Sican Cao, Liang Gao, Long Wen
Source: Complex System Modeling and Simulation, 2021, Issue 1, pp. 55-64 (10 pages)
Abstract: Fault diagnosis plays an increasingly vital role in guaranteeing machine reliability in industrial enterprises. Among all the solutions, deep learning (DL) methods have become more popular for their ability to extract features from raw historical data. However, the performance of DL relies on a huge amount of labeled data, which is costly to obtain in the real world because data are usually labeled by hand. To obtain good performance with limited labeled data, this research proposes a threshold-control generative adversarial network (TCGAN) method. Firstly, the 1D vibration signals are converted into 2D images, which are used as the input of TCGAN. Secondly, TCGAN generates pseudo data with a distribution similar to that of the limited labeled data. With pseudo data generation, the training dataset can be enlarged, and the increase in labeled data can further improve the performance of TCGAN on fault diagnosis. Thirdly, to mitigate the instability of the generated data, a threshold control is presented to adjust the relationship between the discriminator and the generator dynamically and automatically. The proposed TCGAN is validated on the datasets from Case Western Reserve University and a Self-Priming Centrifugal Pump. The prediction accuracies with limited labeled data reached 99.96% and 99.898%, which are even better than those of other methods tested on the fully labeled datasets.
Keywords: generative adversarial network; limited labeled data; discriminator; fault diagnosis
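The first step in the abstract above, converting a 1D vibration signal into a 2D image, can be illustrated with one common approach: take a window of side × side samples, scale it to 0..255, and fold it row by row into a square grid. This is a sketch under that assumption. The abstract does not say which conversion the authors use, and the function name `signal_to_image` is hypothetical.

```python
def signal_to_image(signal, side):
    """Fold the first side*side samples into a side x side grayscale grid.

    Samples are min-max scaled to integers in 0..255. A constant window
    maps to all zeros because the span falls back to 1.0.
    """
    n = side * side
    if len(signal) < n:
        raise ValueError("signal is shorter than side * side samples")
    window = signal[:n]
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0
    pixels = [round(255 * (x - lo) / span) for x in window]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    print(signal_to_image([0, 1, 2, 3], 2))
```

Row-by-row folding keeps neighboring samples adjacent within a row, which lets a convolutional discriminator pick up local patterns in the signal.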