By modeling the spatiotemporal data of the power grid, it is possible to better understand its operational status, identify potential issues and risks, and take timely measures to adjust and optimize the system. Compared to the bus-branch model, the node-breaker model describes grid components at finer granularity and can dynamically reflect changes in equipment status, thus improving the efficiency of grid dispatching and operation. This paper proposes a spatiotemporal data modeling method based on a graph database. It elaborates on constructing graph nodes, graph ontology models, and graph entity models from grid dispatch data, and describes the construction of the spatiotemporal node-breaker graph model and its transformation to the bus-branch model. Subsequently, by integrating spatiotemporal data attributes into the pre-built static grid graph model, a spatiotemporal evolving graph of the power grid is constructed. Furthermore, the concept of the “Power Grid One Graph” and its requirements in modern power systems are elucidated. Leveraging the constructed spatiotemporal node-breaker graph model and graph computing technology, the paper explores the feasibility of grid situational awareness. Finally, typical applications in an operational provincial grid are showcased, and potential scenarios of the proposed spatiotemporal graph model are discussed.
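As a rough illustration of the node-breaker to bus-branch transformation mentioned in this abstract: connectivity nodes joined by closed switching devices can be merged into a single bus. The sketch below does this with a union-find structure. The node names, the breaker tuples, and the function `merge_buses` are hypothetical; the abstract does not describe the paper's algorithm at this level of detail.

```python
def merge_buses(nodes, breakers):
    """Group connectivity nodes into buses (illustrative only).

    nodes: iterable of node names (hypothetical identifiers).
    breakers: iterable of (node_a, node_b, is_closed) tuples.
    Returns a sorted list of sorted node groups, one group per bus.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, shortening the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, is_closed in breakers:
        if is_closed:  # only a closed breaker joins its two nodes electrically
            parent[find(a)] = find(b)

    groups = {}
    for n in parent:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Four nodes in a row; the middle breaker is open, so two buses result.
buses = merge_buses(
    ["n1", "n2", "n3", "n4"],
    [("n1", "n2", True), ("n2", "n3", False), ("n3", "n4", True)],
)
print(buses)
```

Re-running the merge whenever a breaker status changes is one simple way to keep a bus-branch view in step with equipment status; an incremental update would be needed for large grids.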
With the development of the Semantic Web, the number of ontologies is growing exponentially and the semantic relationships between ontologies are becoming increasingly complex, so understanding the true semantics of specific terms or concepts in an ontology is crucial for the matching task. At present, the main challenges facing ontology matching based on representation learning are how to improve the embedding quality of ontology knowledge and how to integrate multiple ontology features efficiently. Therefore, we propose an Ontology Matching Method Based on the Gated Graph Attention Model (OM-GGAT). Firstly, the semantic knowledge related to concepts in the ontology is encoded into vectors using the OWL2Vec* method, and the relevant path information from the root node to the concept is embedded to better capture the meaning of the concept itself and the relationships between concepts. Secondly, the ontology is transformed into the corresponding graph structure according to its semantic relations. Then, when extracting the features of the ontology graph nodes, an attention mechanism assigns a different attention weight to each node adjacent to the central concept. Finally, gated networks are designed to fuse the semantic and structural embedding representations efficiently. To verify the effectiveness of the proposed method, comparative experiments on matching tasks were carried out on public datasets. The results show that the OM-GGAT model can effectively improve the efficiency of ontology matching.
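The gated fusion step can be pictured with a generic gating unit: a sigmoid gate computed from both embeddings decides, per dimension, how much of the semantic versus the structural representation to keep. This is a minimal sketch of that general idea, not the OM-GGAT architecture itself; the function name, the weight layout, and the toy values are all assumptions.

```python
import math


def gated_fusion(semantic, structural, gate_weights, gate_bias):
    """Fuse two equal-length embeddings as g * semantic + (1 - g) * structural.

    gate_weights[i] holds the weights applied to the concatenation
    [semantic; structural] to produce the gate for output dimension i
    (an assumed layout, chosen for illustration).
    """
    concat = list(semantic) + list(structural)
    fused = []
    for i, (s, t) in enumerate(zip(semantic, structural)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = 1.0 / (1.0 + math.exp(-z))  # sigmoid gate in (0, 1)
        fused.append(g * s + (1.0 - g) * t)
    return fused


# With all-zero gate parameters the gate is 0.5, so the result is a plain average.
fused = gated_fusion([1.0, 0.0], [0.0, 1.0], [[0.0] * 4, [0.0] * 4], [0.0, 0.0])
print(fused)
```

In a trained model the gate parameters would be learned, letting each dimension lean toward whichever embedding is more informative.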
To increase the efficiency and reliability of the thermodynamic analysis of hydraulic systems, a method based on the pseudo-bond graph is introduced. According to their working mechanisms, hydraulic components can be separated into two categories: capacitive components and resistive components. The thermal-hydraulic pseudo-bond graphs of the capacitive C element and the resistive R element were then developed, based on the conservation of mass and energy. Subsequently, the connection rule for the pseudo-bond graph elements and the method for constructing the complete thermal-hydraulic system model were proposed. On the basis of heat transfer analysis of a typical hydraulic circuit containing a piston pump, the lumped parameter mathematical model of the system was given. The good agreement between the simulation results and experimental data demonstrates the validity of the modeling method.
The telecommunications industry is becoming increasingly aware of potential subscriber churn as a result of the growing popularity of smartphones in the mobile Internet era, the rapid development of telecommunications services, the implementation of the number portability policy, and the intensifying competition among operators. At the same time, users' consumption preferences and choices are evolving. Because keeping existing customers is far less expensive than acquiring new ones, accurate churn prediction models are needed to predict churn tendency. However, conventional or learning-based algorithms look only at a single subscriber's data; they cannot take into account changes in a subscriber's subscription, and they ignore the coupling and correlation between features. Additionally, current churn prediction models have a high computational burden, a fuzzy weight distribution, and significant resource and economic costs. The network-based prediction algorithms currently in use primarily consider the private information that users share through text and pictures, ignoring the reference value supplied by other users on the same package. This work proposes a user churn prediction model based on a Graph Attention Convolutional Neural Network (GAT-CNN) to address these issues. The main contributions of this paper are as follows. First, we present a three-tiered hierarchical cloud-edge cooperative framework that increases the volume of user feature input by means of two aggregations across the device, edge, and cloud layers. Second, we extend the use of users' own data by introducing self-attention and graph convolution models to track the relative changes of both users and packages simultaneously. Lastly, we build an integrated offline-online system for churn prediction based on the strengths of the two models, and we experimentally validate the efficacy of cloud-side collaborative training and inference. In summary, the churn prediction model presented in this paper can effectively address the drawbacks of conventional algorithms and offer telecom operators crucial decision support in developing subscriber retention strategies and cutting operational expenses.
Recommendation Information Systems (RIS) are pivotal in helping users swiftly locate desired content from the vast amount of information available on the Internet. Graph Convolution Network (GCN) algorithms have been employed to implement RIS efficiently. However, the GCN algorithm is limited in how far its performance can be improved because of the embedding value-vanishing problem that occurs during learning. To address this issue, we propose a Weighted Forwarding method using the GCN (WF-GCN) algorithm. The proposed method multiplies the embedding results by a different weight for each hop layer during graph learning. Because WF-GCN adjusts the weights of each hop layer before forwarding to the next, nodes with many neighbors achieve higher embedding values. This approach facilitates the learning of more hop layers within the GCN framework. The efficacy of WF-GCN was demonstrated on various datasets. On the MovieLens dataset, applying WF-GCN to LightGCN resulted in significant performance improvements, with recall and NDCG increasing by up to 163.64% and 132.04%, respectively. Similarly, on the Last.FM dataset, LightGCN enhanced with WF-GCN showed substantial improvements, with recall and NDCG rising by up to 174.40% and 169.95%, respectively. Furthermore, applying WF-GCN to Self-supervised Graph Learning (SGL) and Simple Graph Contrastive Learning (SimGCL) also yielded notable gains in both recall and NDCG across these datasets.
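To make the idea of per-hop weighting concrete, the sketch below runs a LightGCN-style propagation (symmetric degree normalization, no feature transform) and multiplies each hop's output by a layer weight before it is forwarded to the next hop. The abstract does not say how WF-GCN chooses its weights, so the `layer_weights` argument, the toy graph, and the final averaging over layers are assumptions made for illustration.

```python
import math


def propagate(adj, emb, layer_weights):
    """Weighted multi-hop propagation on an undirected graph.

    adj: dict mapping node -> list of neighbor nodes.
    emb: dict mapping node -> list of floats (initial embeddings).
    layer_weights: one multiplier per hop (assumed, not from the paper).
    Returns the mean embedding over the initial layer and all hops.
    """
    layers = [emb]
    current = emb
    for w in layer_weights:
        nxt = {}
        for u, nbrs in adj.items():
            acc = [0.0] * len(current[u])
            for v in nbrs:
                # Symmetric normalization as used in LightGCN.
                norm = 1.0 / math.sqrt(len(adj[u]) * len(adj[v]))
                for d, x in enumerate(current[v]):
                    acc[d] += norm * x
            # Scale this hop's result before forwarding it to the next hop.
            nxt[u] = [w * x for x in acc]
        layers.append(nxt)
        current = nxt
    return {
        u: [sum(layer[u][d] for layer in layers) / len(layers)
            for d in range(len(emb[u]))]
        for u in adj
    }


# One user and one item connected by a single edge, one hop with weight 2.
result = propagate({"u": ["i"], "i": ["u"]}, {"u": [1.0], "i": [3.0]}, [2.0])
print(result)
```

A weight above 1 counteracts the shrinking of embedding magnitudes across hops, which is the value-vanishing effect the abstract refers to.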
The global clustering of inventive talent shapes innovation capacity and drives economic growth. For China, this process is especially crucial in sustaining its development momentum. This paper draws on data from the EPO Worldwide Patent Statistical Database (PATSTAT) to extract global inventive talent mobility information and analyzes the spatial structural evolution of the global inventive talent flow network. The study finds that this network is undergoing a multi-polar transformation, characterized by the rising importance of a few central countries (such as the United States, Germany, and China) and the increasing marginalization of many peripheral countries. In response to this phenomenon, the paper constructs an endogenous migration model and tests it empirically using the Temporal Exponential Random Graph Model (TERGM). The results reveal several endogenous mechanisms driving global inventive talent flows, including reciprocity, path dependence, convergence effects, transitivity, and cyclic structures, all of which contribute to the network's multi-polar trend. In addition, differences in regional industrial structures significantly influence talent mobility choices and are a decisive factor in the formation of poles within the multi-polar landscape. Based on these findings, the paper suggests fostering two-way channels for talent exchange between China and other global innovation hubs in order to enhance international collaboration and knowledge flow. Migration costs and institutional barriers faced by R&D personnel should be reduced, thereby encouraging greater mobility of high-skilled talent. Furthermore, the government is advised to use regional strengths in high-tech industries as a lever to capture competitive advantages in emerging technologies and products, ultimately strengthening the country's position in the global innovation landscape.
Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detection efficiency. Additionally, this technology provides developers with a means to quickly identify errors, thereby improving software robustness and overall quality. However, current research in software defect prediction often relies on a single data source or fails to adequately account for the characteristics of multiple coexisting data sources. This may overlook the differences and potential value of the various data sources, affecting the accuracy and generalization performance of prediction results. To address this issue, this study proposes a multivariate heterogeneous hybrid deep learning algorithm for defect prediction (DP-MHHDL). Initially, the Abstract Syntax Tree (AST), Code Dependency Network (CDN), and static code quality metrics are extracted from source code files and used as inputs to ensure data diversity. Subsequently, for the three types of heterogeneous data, the study employs a graph convolutional network optimization model based on adjacency and spatial topologies, a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) hybrid neural network model, and a TabNet model to extract data features. These features are then concatenated and processed through a fully connected neural network for defect prediction. Finally, the proposed framework is evaluated using ten projects from the PROMISE defect repository, and performance is assessed with three metrics: F1, area under the curve (AUC), and Matthews correlation coefficient (MCC). The experimental results demonstrate that the proposed algorithm outperforms existing methods, offering a novel solution for software defect prediction.
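For readers unfamiliar with two of the evaluation metrics named in this abstract, F1 and MCC can both be computed directly from the confusion-matrix counts of a binary defect classifier. The sketch below uses the standard definitions; the counts in the example are invented and are not results from the paper.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for 100 source files (not taken from the paper).
f1 = f1_score(tp=40, fp=5, fn=10)
m = mcc(tp=40, tn=45, fp=5, fn=10)
print(round(f1, 4), round(m, 4))
```

Unlike F1, MCC also accounts for true negatives, which is why it is often reported alongside F1 on imbalanced defect datasets.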
As one of the main atmospheric pollutants, PM2.5 severely affects human health and has received widespread attention in recent years. Predicting variations in PM2.5 concentrations with high accuracy is an important topic. The PM2.5 monitoring stations in Xinjiang Uygur Autonomous Region, China, are unevenly distributed, which makes it challenging to conduct comprehensive analyses and predictions. This study therefore addresses both this limitation and the poor generalization ability of PM2.5 concentration prediction models across different monitoring stations. We chose the northern slope of the Tianshan Mountains as the study area and January to December 2019 as the research period. On the basis of data from 21 PM2.5 monitoring stations as well as meteorological data (temperature, instantaneous wind speed, and pressure), we developed an improved model, namely GCN-TCN-AR (where GCN is the graph convolution network, TCN is the temporal convolutional network, and AR is the autoregression), for predicting PM2.5 concentrations on the northern slope of the Tianshan Mountains. The GCN-TCN-AR model is composed of an improved GCN model, a TCN model, and an AR model. The results revealed that the R² values of the GCN-TCN-AR predictions at the four monitoring stations (Urumqi, Wujiaqu, Shihezi, and Changji) were 0.93, 0.91, 0.93, and 0.92, respectively, and the RMSE (root mean square error) values were 6.85, 7.52, 7.01, and 7.28 μg/m³, respectively. The performance of the GCN-TCN-AR model was also compared with that of existing models, including GCN-TCN, GCN, TCN, Support Vector Regression (SVR), and AR. GCN-TCN-AR outperformed these models, with high prediction accuracy and good stability, making it especially suitable for predicting PM2.5 concentrations. This study also revealed significant spatiotemporal variations in PM2.5 concentrations. First, PM2.5 concentrations exhibited clear seasonal fluctuations, with higher levels typically observed in winter and differences between months. Second, the spatial distribution analysis revealed that cities such as Urumqi and Wujiaqu have high PM2.5 concentrations, with a noticeable geographical clustering of pollution. Understanding the variations in PM2.5 concentrations is highly important for the sustainable development of the ecological environment in arid areas.
Building façades can feature different patterns depending on the architectural style, functionality, and size of the buildings; therefore, reconstructing these façades can be complicated. In particular, when semantic façades are reconstructed from point cloud data, uneven point density and noise make it difficult to accurately determine the façade structure. When investigating façade layouts, Gestalt principles can be applied to cluster visually similar floors and façade elements, allowing for a more intuitive interpretation of façade structures. We propose a novel model for describing façade structures, namely the layout graph model, which involves a compound graph with two structure levels. In the proposed model, similar façade elements such as windows are first grouped into clusters. A down-layout graph is then formed using each cluster as a node and the combined intra- and inter-cluster spacings as the edges. Second, a top-layout graph is formed by clustering similar floors. By extracting relevant parameters from this model, we transform semantic façade reconstruction into an optimization problem solved using simulated annealing coupled with Gibbs sampling. Multiple façade point cloud data with different features were selected from three datasets to verify the effectiveness of this method. The experimental results show that the proposed method achieves an average accuracy of 86.35%. Owing to its flexibility, the proposed layout graph model can deal with different types of façades and qualities of point cloud data, enabling a more robust and accurate reconstruction of façade models.
Word sense disambiguation (WSD) is a fundamental and significant task in natural language processing, which directly affects the performance of upper-level applications. However, WSD is very challenging because of the knowledge bottleneck, i.e., it is hard to acquire abundant disambiguation knowledge, especially in Chinese. To solve this problem, this paper proposes a graph-based Chinese WSD method with multi-knowledge integration. In particular, a graph model is designed that combines various Chinese and English knowledge resources through word sense mapping. Firstly, the content words in an ambiguous Chinese sentence are extracted and mapped to English words with BabelNet. Then, English word similarity is computed based on English word embeddings and a knowledge base, while Chinese word similarity is evaluated with Chinese word embeddings and HowNet, respectively. The weights of the three kinds of word similarity are optimized with a simulated annealing algorithm to obtain their overall similarities, which are used to construct a disambiguation graph. The graph scoring algorithm evaluates the importance of each word sense node and judges the right senses of the ambiguous words. Extensive experimental results on the SemEval dataset show that the proposed WSD method significantly outperforms the baselines.
Automation is now considered vital in most fields, since computing methods play a significant role in facilitating work such as automatic text summarization. Most of the computing methods used in real systems are based on graph models, which are characterized by their simplicity and stability. This paper therefore proposes an improved extractive text summarization algorithm based on both topic and graph models. The methodology consists of two stages. First, the well-known TextRank algorithm is analyzed and its shortcomings are investigated. Then, an improved method is proposed with a new computational model of sentence weights. Experiments were carried out on the standard DUC2004 and DUC2006 datasets, where the proposed improved graph model algorithm, TG-SMR (Topic Graph-Summarizer), is compared with four other text summarization methods. The experimental results show that the proposed TG-SMR algorithm achieves higher ROUGE scores. It is expected that the TG-SMR algorithm will open new directions for improving performance on ROUGE evaluation indicators.
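For context, the baseline that this abstract starts from can be sketched in a few lines: TextRank builds a graph whose nodes are sentences, weights edges by word overlap, and ranks sentences with a PageRank-style iteration. The code below is a minimal sketch of that baseline only, assuming whitespace tokenization and a fixed number of iterations; it is not the TG-SMR sentence-weight model, which the abstract does not specify.

```python
import math


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a basic TextRank iteration (illustrative)."""
    words = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Edge weight: shared words, normalized by the log sentence lengths.
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and len(words[i]) > 1 and len(words[j]) > 1:
                overlap = len(words[i] & words[j])
                sim[i][j] = overlap / (
                    math.log(len(words[i])) + math.log(len(words[j]))
                )
    out_weight = [sum(row) for row in sim]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


scores = textrank([
    "graph models rank sentences",
    "graph models score sentences and topics",
    "topics guide the summary",
])
print(scores)
```

The middle sentence shares words with both of the others, so it ends up with the highest score; an extractive summarizer would pick the top-scoring sentences.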
With increasingly complex website structures and continuously advancing web technologies, accurately recognizing user clicks from massive HTTP data, which is critical for web usage mining, is becoming more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm on real massive data: a 228.7 GB dataset collected from a mobile core network that covers more than three million users. The experimental results demonstrate that the proposed algorithm achieves higher accuracy than previous methods.
Network modeling is an important approach to analyzing complex systems in many fields. Recently, a new series of methods has emerged that uses the Kronecker product and similar tools to model real systems. One such approach is the multiplicative attribute graph (MAG) model, which generates networks based on categorical attributes of nodes. In this paper we extend this model to a continuous one, give an overview of its properties, and discuss some special cases related to real-world networks, as well as the influence of the attribute distribution and the affinity function, respectively.
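In the discrete MAG model as it is usually described, each node carries a vector of categorical attributes, each attribute has an affinity matrix, and the probability of an edge between two nodes is the product of the affinity entries selected by their attribute values. The sketch below illustrates that discrete baseline with binary attributes; the function names and the affinity values are illustrative assumptions, and the continuous extension proposed in the paper is not shown.

```python
import random


def edge_probability(attrs_u, attrs_v, affinities):
    """Product over attributes of affinity[k][a_k(u)][a_k(v)]."""
    p = 1.0
    for a_u, a_v, theta in zip(attrs_u, attrs_v, affinities):
        p *= theta[a_u][a_v]
    return p


def sample_graph(attrs, affinities, seed=0):
    """Draw each directed edge independently with its MAG probability."""
    rng = random.Random(seed)
    edges = []
    for u in range(len(attrs)):
        for v in range(len(attrs)):
            if u != v and rng.random() < edge_probability(
                attrs[u], attrs[v], affinities
            ):
                edges.append((u, v))
    return edges


# Two binary attributes sharing one (made-up) affinity matrix.
theta = [[0.9, 0.5], [0.5, 0.1]]
p = edge_probability([0, 0], [0, 1], [theta, theta])
print(p)
print(sample_graph([[0, 0], [0, 1], [1, 1]], [theta, theta]))
```

Here the two nodes agree on the first attribute (entry 0.9) and differ on the second (entry 0.5), so the edge probability is their product.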
To construct a highly efficient text clustering algorithm, the multilevel graph model and the refinement algorithm used in the uncoarsening phase are discussed. The model is applied to text clustering, and the refinement algorithm is applied to improve the performance of the clustering algorithm. The experimental results demonstrate that the multilevel graph text clustering algorithm is feasible. Key words: text clustering; multilevel coarsen graph model; refinement algorithm; high-dimensional clustering. CLC number: TP301. Foundation item: Supported by the National Natural Science Foundation of China (60173051). Biography: CHEN Jian-bin (1970-), male, Associate professor, Ph.D., research direction: data mining.
Due to limitations in geometric representation and semantic description, the current pedestrian route analysis models are inadequate. To express the geometry of geographic entities in a micro-spatial environment accurately, the concept of a grid is presented, and grid-based methods for modeling geospatial objects are described. The semantic constitution of a building environment and the methods for modeling rooms, corridors, and staircases with grid objects are described. Based on the topology relationship between grid objects, a grid-based graph for a building environment is presented, and the corresponding route algorithm for pedestrians is proposed. The main advantages of the graph model proposed in this paper are as follows: 1) consideration of both semantic and geometric information, 2) consideration of the need for accurate geometric representation of the micro-spatial environment and the efficiency of pedestrian route analysis, 3) applicability of the graph model to route analysis in both static and dynamic environments, and 4) ability of the multi-hierarchical route analysis to integrate the multiple levels of pedestrian decision characteristics, from the high to the low, to determine the optimal path.
With the rapid growth of web-based documents, the need for automatic document clustering and text summarization has increased. Document summarization, which involves extracting the essential content with appropriate information, removing unnecessary data, and presenting the result in a cohesive and coherent manner, is a particularly challenging task. In this research, a novel intelligent model for document clustering is designed using a graph model and fuzzy-based association rule generation (gFAR). Initially, the graph model is used to map the relationships among the (multi-source) data; document clustering is then established by generating association rules using the fuzzy concept. The method eliminates redundancy by mapping relevant documents with the graph model, and it reduces time consumption and improves accuracy through fuzzy association rule generation. The framework is provided in an interpretable way for document clustering. It iteratively reduces the error rate during relationship mapping among the data (clusters) with the assistance of weighted document content. The model also represents the significance of data features with class discrimination, which helps in measuring the significance of features during the clustering process. The simulation is carried out in the MATLAB 2016b environment and evaluated with empirical standards such as Relative Risk Patterns (RRP), ROUGE score, and Discrimination Information Measure (DMI). The DailyMail and DUC 2004 datasets are used to obtain the empirical results. The proposed gFAR model gives a better trade-off than various prevailing approaches.
The Markov model is usually selected as the base model of user action in intrusion detection systems (IDS). However, the performance of the IDS depends on the state space of the Markov model and degrades as the dimension of that space grows. Here, the Markov Graph Model (MGM) is proposed to handle this issue. The specification of the model is described, and several methods for probability computation with MGM are presented. Based on MGM, algorithms for building the user model and predicting user actions are presented. The performance of these algorithms, including computational complexity, prediction accuracy, and the storage requirement of MGM, is analyzed.
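The base model that this abstract starts from, a first-order Markov chain over user actions, can be sketched as follows: transition probabilities are estimated from an observed action sequence, and the most probable successor is taken as the prediction. This is a minimal sketch of the plain Markov baseline, not of MGM itself; the action names and the function names are invented for illustration.

```python
from collections import Counter, defaultdict


def build_model(actions):
    """Estimate first-order transition probabilities from a sequence."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(actions, actions[1:]):
        counts[cur][nxt] += 1
    return {
        state: {t: c / sum(ctr.values()) for t, c in ctr.items()}
        for state, ctr in counts.items()
    }


def predict_next(model, current):
    """Return the most probable next action, or None for an unseen state."""
    if current not in model:
        return None
    return max(model[current], key=model[current].get)


# Hypothetical shell session used as training data.
model = build_model(["login", "ls", "cd", "ls", "cat", "ls", "cd"])
print(model["ls"])
print(predict_next(model, "ls"))
```

The table stores one row per observed state, which is why storage and lookup cost grow with the size of the state space, the limitation the abstract sets out to address.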
Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus, while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method that takes the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named the reordering graph to represent all phrase segmentations of a parallel sentence; the effect of the adjacent phrase number is then quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
Based on the HS 4-digit code trade data in UN Comtrade from 1995 to 2020, this paper analyzes the characteristics of the evolution of the global PG trade network using the complex network approach and analyzes the changes in its resilience at the overall and country levels, respectively. The results show that: (1) The scale of the global PG trade network tends to expand and its connections gradually tighten, shifting from a “supply-oriented” to a “supply-and-demand” pattern, in which the U.S., Russia, Qatar, and Australia have gradually replaced Canada, Japan, and Russia in the core trade positions, while OPEC countries such as Qatar, Algeria, and Kuwait mainly rely on PG exports to occupy the core of global supply, and the trade status of other countries has been dynamically alternating and evolving. (2) The resilience of the global PG trade network is lower than that of a random network and decreases non-linearly as more countries are disrupted. Moreover, the impact of the U.S. is more significant than that of other countries. Simulations using the exponential random graph model (ERGM) revealed that national GDP, institutional quality, common borders, and the RTA network are the determinants of PG trade network formation, and that the positive impact of these four factors not only varies significantly across regions and stages but also increases with national network status.
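One common way to probe network resilience under disruption, given here only as an illustration, is to remove a set of nodes and measure what fraction of the original nodes remains in the largest connected component. The abstract does not state which resilience measure the paper uses, so the metric, the function name, and the toy network below are assumptions.

```python
from collections import deque


def largest_component_fraction(nodes, edges, removed=()):
    """Share of the original nodes left in the largest connected component."""
    removed = set(removed)
    alive = [n for n in nodes if n not in removed]
    adj = {n: set() for n in alive}
    for a, b in edges:
        if a in adj and b in adj:  # drop edges touching removed nodes
            adj[a].add(b)
            adj[b].add(a)
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over one component
            n = queue.popleft()
            size += 1
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    queue.append(m)
        best = max(best, size)
    return best / len(nodes) if nodes else 0.0


# Toy trade network: one hub exporter linked to four partners.
nodes = ["hub", "a", "b", "c", "d"]
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
print(largest_component_fraction(nodes, edges))
print(largest_component_fraction(nodes, edges, removed=["hub"]))
print(largest_component_fraction(nodes, edges, removed=["d"]))
```

Removing the hub fragments the toy network far more than removing a peripheral node, which mirrors the kind of country-level comparison described in the abstract.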
Transmission line (TL) parameter identification (PI) plays an essential role in the transmission system. Existing PI methods usually have two limitations: (1) they model only a single TL and cannot consider the topological connection of multiple branches for simultaneous identification; (2) they ignore transient bad data, and the random selection of terminal section data may distort the PI result and have serious consequences. Therefore, a multi-task PI model that considers the spatial constraints of multiple TLs and massive electrical section data is proposed in this paper. The Graph Attention Network module is used to represent a single TL as a node and calculate its influence coefficient in the transmission network. A multi-task strategy with hard parameter sharing is used to identify the conductance of multiple branches simultaneously. Experiments show that the method has good accuracy and robustness. Owing to the consideration of spatial constraints, the method can also obtain more accurate conductance values under different training and testing conditions.
Funding (power grid spatiotemporal graph modeling study): Supported by the Project of China Southern Power Grid Digital Grid Research Institute Co., Ltd. (210002KK52222026).
Funding: Supported by the National Natural Science Foundation of China (grant numbers 62267005 and 42365008) and the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing.
Abstract: With the development of the Semantic Web, the number of ontologies grows exponentially and the semantic relationships between ontologies become more and more complex. Understanding the true semantics of specific terms or concepts in an ontology is therefore crucial for the matching task. At present, the main challenges facing ontology matching tasks based on representation learning methods are how to improve the embedding quality of ontology knowledge and how to integrate multiple features of ontology efficiently. Therefore, we propose an Ontology Matching Method Based on the Gated Graph Attention Model (OM-GGAT). Firstly, the semantic knowledge related to concepts in the ontology is encoded into vectors using the OWL2Vec* method, and the relevant path information from the root node to the concept is embedded to better understand the true meaning of the concept itself and the relationship between concepts. Secondly, the ontology is transformed into the corresponding graph structure according to the semantic relation. Then, when extracting the features of the ontology graph nodes, different attention weights are assigned to each adjacent node of the central concept by means of an attention mechanism. Finally, gated networks are designed to further fuse semantic and structural embedding representations efficiently. To verify the effectiveness of the proposed method, comparative experiments on matching tasks were carried out on public datasets. The results show that the OM-GGAT model can effectively improve the efficiency of ontology matching.
Funding: Project (51175518) supported by the National Natural Science Foundation of China.
Abstract: To increase the efficiency and reliability of the thermodynamic analysis of hydraulic systems, a method based on the pseudo-bond graph is introduced. According to their working mechanisms, hydraulic components can be separated into two categories: capacitive components and resistive components. Then, the thermal-hydraulic pseudo-bond graphs of the capacitive C element and the resistive R element were developed, based on the conservation of mass and energy. Subsequently, the connection rule for the pseudo-bond graph elements and the method to construct the complete thermal-hydraulic system model were proposed. On the basis of heat transfer analysis of a typical hydraulic circuit containing a piston pump, the lumped parameter mathematical model of the system was given. The good agreement between the simulation results and experimental data demonstrates the validity of the modeling method.
Funding: Supported by the National Key R&D Program of China (No. 2022YFB3104500), the Natural Science Foundation of Jiangsu Province (No. BK20222013), and the Scientific Research Foundation of Nanjing Institute of Technology (No. 3534113223036).
Abstract: The telecommunications industry is becoming increasingly aware of potential subscriber churn as a result of the growing popularity of smartphones in the mobile Internet era, the quick development of telecommunications services, the implementation of the number portability policy, and the intensifying competition among operators. At the same time, users' consumption preferences and choices are evolving. Excellent churn prediction models must be created in order to accurately predict the churn tendency, since keeping existing customers is far less expensive than acquiring new ones. However, conventional or learning-based algorithms can only go so far into a single subscriber's data; they cannot take into consideration changes in a subscriber's subscription, and they ignore the coupling and correlation between various features. Additionally, the current churn prediction models have a high computational burden, a fuzzy weight distribution, and significant resource and economic costs. The prediction algorithms involving network models currently in use primarily take into account the private information shared between users through text and pictures, ignoring the reference value supplied by other users with the same package. This work proposes a user churn prediction model based on a Graph Attention Convolutional Neural Network (GAT-CNN) to address the aforementioned issues. The main contributions of this paper are as follows. First, we present a three-tiered hierarchical cloud-edge cooperative framework that increases the volume of user feature input by means of two aggregations at the device, edge, and cloud layers. Second, we extend the use of users' own data by introducing self-attention and graph convolution models to track the relative changes of both users and packages simultaneously. Lastly, we build an integrated offline-online system for churn prediction based on the strengths of the two models, and we experimentally validate the efficacy of cloud-side collaborative training and inference. In summary, the churn prediction model based on the Graph Attention Convolutional Neural Network presented in this paper can effectively address the drawbacks of conventional algorithms and offer telecom operators crucial decision support in developing subscriber retention strategies and cutting operational expenses.
Funding: This work was supported by the Kyonggi University Research Grant 2022.
Abstract: Recommendation Information Systems (RIS) are pivotal in helping users swiftly locate desired content from the vast amount of information available on the Internet. Graph Convolution Network (GCN) algorithms have been employed to implement the RIS efficiently. However, the GCN algorithm faces limitations in terms of performance enhancement owing to the embedding value-vanishing problem that occurs during the learning process. To address this issue, we propose a Weighted Forwarding method using the GCN (WF-GCN) algorithm. The proposed method involves multiplying the embedding results with different weights for each hop layer during graph learning. By applying the WF-GCN algorithm, which adjusts weights for each hop layer before forwarding to the next, nodes with many neighbors achieve higher embedding values. This approach facilitates the learning of more hop layers within the GCN framework. The efficacy of the WF-GCN was demonstrated through its application to various datasets. In the MovieLens dataset, the implementation of WF-GCN in LightGCN resulted in significant performance improvements, with recall and NDCG increasing by up to +163.64% and +132.04%, respectively. Similarly, in the Last.FM dataset, LightGCN enhanced with WF-GCN showed substantial improvements, with the recall and NDCG metrics rising by up to +174.40% and +169.95%, respectively. Furthermore, the application of WF-GCN to Self-supervised Graph Learning (SGL) and Simple Graph Contrastive Learning (SimGCL) also demonstrated notable enhancements in both recall and NDCG across these datasets.
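The idea of weighting each hop layer before forwarding can be illustrated with a minimal sketch. It assumes a LightGCN-style propagation over a symmetric-normalized adjacency matrix; the function names, the toy graph, and the weight values are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of per-hop weighted forwarding on a LightGCN-style
# propagation. Names, graph, and weights are illustrative assumptions.

def normalized_adjacency(edges, n):
    """Return D^-1/2 A D^-1/2 for an undirected graph as a dense list of lists."""
    adj = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    deg = [sum(row) for row in adj]
    return [
        [adj[i][j] / ((deg[i] * deg[j]) ** 0.5) if adj[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def propagate(norm_adj, emb):
    """One parameter-free propagation step: multiply embeddings by norm_adj."""
    n, d = len(emb), len(emb[0])
    return [
        [sum(norm_adj[i][k] * emb[k][c] for k in range(n)) for c in range(d)]
        for i in range(n)
    ]


def weighted_forward(norm_adj, emb0, hop_weights):
    """Scale each hop's output by its weight before forwarding it to the
    next hop, then average all layers (including the input layer)."""
    layers = [emb0]
    current = emb0
    for w in hop_weights:
        current = [[w * x for x in row] for row in propagate(norm_adj, current)]
        layers.append(current)
    n, d = len(emb0), len(emb0[0])
    return [
        [sum(layer[i][c] for layer in layers) / len(layers) for c in range(d)]
        for i in range(n)
    ]
```

On a three-node path graph with unit embeddings, the middle node (two neighbors) ends up with a larger value than the end nodes, and raising the hop weight above 1 increases it further, which is the qualitative behavior the abstract describes.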
Funding: Supported by the Major Project of the National Social Science Fund of China, titled “Design Path Selection for the Mechanism of New and Old Growth Driver Conversion” (Grant No. 18ZDA077), and by the Joint Special Major Research Project of the Yangtze River Delta Economics and Social Development Research Center at Nanjing University and the Collaborative Innovation Center for China Economy (CICCE), titled “Practicing Innovation in China’s Development Economics for the Yangtze River Delta: From Industrial Clusters to Technological Clusters” (Grant No. CYD2022006).
Abstract: The global clustering of inventive talent shapes innovation capacity and drives economic growth. For China, this process is especially crucial in sustaining its development momentum. This paper draws on data from the EPO Worldwide Patent Statistical Database (PATSTAT) to extract global inventive talent mobility information and analyzes the spatial structural evolution of the global inventive talent flow network. The study finds that this network is undergoing a multi-polar transformation, characterized by the rising importance of a few central countries (such as the United States, Germany, and China) and the increasing marginalization of many peripheral countries. In response to this typical phenomenon, the paper constructs an endogenous migration model and conducts empirical testing using the Temporal Exponential Random Graph Model (TERGM). The results reveal several endogenous mechanisms driving global inventive talent flows, including reciprocity, path dependence, convergence effects, transitivity, and cyclic structures, all of which contribute to the network’s multi-polar trend. In addition, differences in regional industrial structures significantly influence talent mobility choices and are a decisive factor in the formation of poles within the multi-polar landscape. Based on these findings, it is suggested that efforts be made to foster two-way channels for talent exchange between China and other global innovation hubs, in order to enhance international collaboration and knowledge flow. Efforts should also aim to reduce the migration costs and institutional barriers faced by R&D personnel, thereby encouraging greater mobility of high-skilled talent. Furthermore, the government is advised to strategically leverage regional strengths in high-tech industries to capture competitive advantages in emerging technologies and products, ultimately strengthening the country’s position in the global innovation landscape.
Abstract: Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detection efficiency. Additionally, this technology provides developers with a means to quickly identify errors, thereby improving software robustness and overall quality. However, current research in software defect prediction often faces challenges, such as relying on a single data source or failing to adequately account for the characteristics of multiple coexisting data sources. This approach may overlook the differences and potential value of various data sources, affecting the accuracy and generalization performance of prediction results. To address this issue, this study proposes a multivariate heterogeneous hybrid deep learning algorithm for defect prediction (DP-MHHDL). Initially, Abstract Syntax Tree (AST), Code Dependency Network (CDN), and code static quality metrics are extracted from source code files and used as inputs to ensure data diversity. Subsequently, for the three types of heterogeneous data, the study employs a graph convolutional network optimization model based on adjacency and spatial topologies, a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) hybrid neural network model, and a TabNet model to extract data features. These features are then concatenated and processed through a fully connected neural network for defect prediction. Finally, the proposed framework is evaluated using ten projects from the PROMISE defect repository, and performance is assessed with three metrics: F1, Area Under the Curve (AUC), and Matthews Correlation Coefficient (MCC). The experimental results demonstrate that the proposed algorithm outperforms existing methods, offering a novel solution for software defect prediction.
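Two of the three metrics named above, F1 and MCC, follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function names and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, and AUC is omitted because it needs ranked scores rather than counts.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (an assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For example, with 40 true positives, 45 true negatives, 5 false positives, and 10 false negatives, `f1_score(40, 5, 10)` evaluates to 80/95 and `mcc(40, 45, 5, 10)` to 1750 divided by the square root of 6,187,500.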
Funding: Supported by the Program of Support Xinjiang by Technology (2024E02028, B2-2024-0359), the Xinjiang Tianchi Talent Program of 2024, the Foundation of Chinese Academy of Sciences (B2-2023-0239), and the Youth Foundation of Shandong Natural Science (ZR2023QD070).
Abstract: As one of the main atmospheric pollutants, PM2.5 severely affects human health and has received widespread attention in recent years. How to predict the variations of PM2.5 concentrations with high accuracy is an important topic. The PM2.5 monitoring stations in Xinjiang Uygur Autonomous Region, China, are unevenly distributed, which makes it challenging to conduct comprehensive analyses and predictions. Therefore, this study primarily addresses this limitation and the poor generalization ability of PM2.5 concentration prediction models across different monitoring stations. We chose the northern slope of the Tianshan Mountains as the study area and January to December 2019 as the research period. On the basis of data from 21 PM2.5 monitoring stations as well as meteorological data (temperature, instantaneous wind speed, and pressure), we developed an improved model, namely GCN-TCN-AR (where GCN is the graph convolution network, TCN is the temporal convolutional network, and AR is the autoregression), for predicting PM2.5 concentrations on the northern slope of the Tianshan Mountains. The GCN-TCN-AR model is composed of an improved GCN model, a TCN model, and an AR model. The results revealed that the R² values predicted by the GCN-TCN-AR model at the four monitoring stations (Urumqi, Wujiaqu, Shihezi, and Changji) were 0.93, 0.91, 0.93, and 0.92, respectively, and the RMSE (root mean square error) values were 6.85, 7.52, 7.01, and 7.28 μg/m³, respectively. The performance of the GCN-TCN-AR model was also compared with that of other current models, including the GCN-TCN, GCN, TCN, Support Vector Regression (SVR), and AR. The GCN-TCN-AR outperformed these models, with high prediction accuracy and good stability, making it especially suitable for the prediction of PM2.5 concentrations. This study revealed significant spatiotemporal variations of PM2.5 concentrations. First, the PM2.5 concentrations exhibited clear seasonal fluctuations, with higher levels typically observed in winter and differences between months. Second, the spatial distribution analysis revealed that cities such as Urumqi and Wujiaqu have high PM2.5 concentrations, with a noticeable geographical clustering of pollution. Understanding the variations in PM2.5 concentrations is highly important for the sustainable development of the ecological environment in arid areas.
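The two evaluation measures reported above, R² and RMSE, can be computed as in the following minimal sketch. It uses the standard definitions; the function names and the toy values in the usage note are illustrative assumptions, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the observed values are not all identical (SS_tot > 0)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

With observed values `[1.0, 2.0, 3.0]` and predictions `[1.0, 2.0, 4.0]`, the residual sum of squares is 1 and the total sum of squares is 2, so `r_squared` gives 0.5 and `rmse` gives the square root of 1/3.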
Funding: This work is supported by the National Natural Science Foundation of China [grant number 41771484].
Abstract: Building façades can feature different patterns depending on the architectural style, functionality, and size of the buildings; therefore, reconstructing these façades can be complicated. In particular, when semantic façades are reconstructed from point cloud data, uneven point density and noise make it difficult to accurately determine the façade structure. When investigating façade layouts, Gestalt principles can be applied to cluster visually similar floors and façade elements, allowing for a more intuitive interpretation of façade structures. We propose a novel model for describing façade structures, namely the layout graph model, which involves a compound graph with two structure levels. In the proposed model, similar façade elements such as windows are first grouped into clusters. A down-layout graph is then formed using each cluster as a node and combining intra- and inter-cluster spacings as the edges. Second, a top-layout graph is formed by clustering similar floors. By extracting relevant parameters from this model, we transform semantic façade reconstruction into an optimization problem solved with simulated annealing coupled with Gibbs sampling. Multiple façade point cloud data with different features were selected from three datasets to verify the effectiveness of this method. The experimental results show that the proposed method achieves an average accuracy of 86.35%. Owing to its flexibility, the proposed layout graph model can deal with different types of façades and qualities of point cloud data, enabling a more robust and accurate reconstruction of façade models.
Funding: The research work is supported by the National Key R&D Program of China under Grant No. 2018YFC0831704, the National Natural Science Foundation of China under Grant No. 61502259, the Natural Science Foundation of Shandong Province under Grant No. ZR2017MF056, and the Taishan Scholar Program of Shandong Province in China (directed by Prof. Yinglong Wang).
Abstract: Word sense disambiguation (WSD) is a fundamental but significant task in natural language processing, which directly affects the performance of upper applications. However, WSD is very challenging due to the problem of the knowledge bottleneck, i.e., it is hard to acquire abundant disambiguation knowledge, especially in Chinese. To solve this problem, this paper proposes a graph-based Chinese WSD method with multi-knowledge integration. In particular, a graph model combining various Chinese and English knowledge resources by word sense mapping is designed. Firstly, the content words in a Chinese ambiguous sentence are extracted and mapped to English words with BabelNet. Then, English word similarity is computed based on English word embeddings and a knowledge base. Chinese word similarity is evaluated with Chinese word embeddings and HowNet, respectively. The weights of the three kinds of word similarity are optimized with a simulated annealing algorithm so as to obtain their overall similarities, which are utilized to construct a disambiguation graph. The graph scoring algorithm evaluates the importance of each word sense node and judges the right senses of the ambiguous words. Extensive experimental results on the SemEval dataset show that our proposed WSD method significantly outperforms the baselines.
Abstract: Recently, automation has come to be considered vital in most fields, since computing methods have a significant role in facilitating work such as automatic text summarization. However, most of the computing methods that are used in real systems are based on graph models, which are characterized by their simplicity and stability. Thus, this paper proposes an improved extractive text summarization algorithm based on both topic and graph models. The methodology of this work consists of two stages. First, the well-known TextRank algorithm is analyzed and its shortcomings are investigated. Then, an improved method is proposed with a new computational model of sentence weights. Finally, experiments on the standard DUC2004 and DUC2006 datasets compare the proposed improved graph model algorithm, TG-SMR (Topic Graph-Summarizer), with four other text summarization methods. The experimental results show that the proposed TG-SMR algorithm achieves higher ROUGE scores. It is foreseen that the TG-SMR algorithm will open a new horizon concerning performance on ROUGE evaluation indicators.
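For orientation, the baseline TextRank procedure that the abstract analyzes can be sketched as follows. This is a simplified illustration of plain TextRank, not the TG-SMR algorithm: sentences are tokenized by whitespace, similarity is computed over sets of distinct words, and the damping factor and iteration count are assumed defaults.

```python
import math


def similarity(s1, s2):
    """Word-overlap similarity normalized by log sentence lengths.

    Simplification: uses sets of lowercase whitespace tokens; sentences
    with fewer than two distinct words get similarity 0.0."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if len(w1) < 2 or len(w2) < 2:
        return 0.0
    return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by iterating the weighted PageRank update on the
    sentence-similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores
```

An extractive summary would then take the highest-scoring sentences in document order. A sentence that shares no words with any other receives only the base score `1 - damping`.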
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2013RC0114 and the 111 Project of China under Grant No. B08004.
Abstract: With increasingly complex website structures and continuously advancing web technologies, accurate recognition of user clicks from massive HTTP data, which is critical for web usage mining, becomes more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm to distinguish user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm with real massive data. The dataset, collected from a mobile core network, is 228.7 GB in size and covers more than three million users. The experimental results demonstrate that the proposed algorithm can achieve higher accuracy than previous methods.
Funding: Supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (No. LZ12F02003).
Abstract: Network modeling is an important approach to analyzing complex systems in many fields. Recently, a new series of methods has emerged that use the Kronecker product and similar tools to model real systems. One such approach is the multiplicative attribute graph (MAG) model, which generates networks based on categorical attributes of nodes. In this paper, we extend this model to a continuous one, give an overview of its properties, and discuss some special cases related to real-world networks, as well as the influence of the attribute distribution and the affinity function, respectively.
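As background for the discrete model being extended, the sketch below shows how a multiplicative attribute graph assigns edge probabilities: each node carries a vector of categorical attributes, each attribute has an affinity matrix, and the probability of an edge is the product of the affinity entries selected by the two nodes' attribute values. The function names, the binary attributes, and the affinity values are illustrative assumptions; the continuous extension proposed in the paper is not shown.

```python
import random


def edge_probability(attrs_u, attrs_v, affinities):
    """Product over attributes of affinity[attribute of u][attribute of v]."""
    p = 1.0
    for a_u, a_v, theta in zip(attrs_u, attrs_v, affinities):
        p *= theta[a_u][a_v]
    return p


def sample_mag_graph(node_attrs, affinities, seed=0):
    """Sample a directed graph: each ordered pair (u, v), u != v, becomes an
    edge independently with its MAG edge probability."""
    rng = random.Random(seed)
    n = len(node_attrs)
    edges = []
    for u in range(n):
        for v in range(n):
            if u == v:
                continue
            p = edge_probability(node_attrs[u], node_attrs[v], affinities)
            if rng.random() < p:
                edges.append((u, v))
    return edges
```

With the assumed affinity matrix `[[0.9, 0.5], [0.5, 0.1]]` shared by two binary attributes, two nodes with attributes `[0, 0]` link with probability 0.9 × 0.9 = 0.81, while a node with `[0, 1]` links to a node with `[1, 1]` with probability 0.5 × 0.1 = 0.05.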
Abstract: To construct a highly efficient text clustering algorithm, the multilevel graph model and the refinement algorithm used in the uncoarsening phase are discussed. The model is applied to text clustering. The performance of the clustering algorithm is improved by applying the refinement algorithm. The experimental results demonstrate that the multilevel graph text clustering algorithm is feasible. Key words: text clustering; multilevel coarsen graph model; refinement algorithm; high-dimensional clustering. CLC number: TP301. Foundation item: Supported by the National Natural Science Foundation of China (60173051). Biography: CHEN Jian-bin (1970-), male, Associate professor, Ph.D., research direction: data mining.
Funding: Supported by the National Natural Science Foundation of China (Nos. 41571387, 41201375 and 41501440), the Tianjin Research Program of Application Foundation and Advanced Technology (No. 14JCQNJC07900), the Tianjin Science and Technology Planning Project (Nos. 15ZCZDSF00390 and 14TXGCCX00015), and the Opening Fund of Tianjin Engineering Research Center of Geospatial Information Technology, "Modeling and analysis of path graph in 3D indoor spatial environment".
Abstract: Due to limitations in geometric representation and semantic description, the current pedestrian route analysis models are inadequate. To express the geometry of geographic entities in a micro-spatial environment accurately, the concept of a grid is presented, and grid-based methods for modeling geospatial objects are described. The semantic constitution of a building environment and the methods for modeling rooms, corridors, and staircases with grid objects are described. Based on the topological relationships between grid objects, a grid-based graph for a building environment is presented, and the corresponding route algorithm for pedestrians is proposed. The main advantages of the graph model proposed in this paper are as follows: 1) consideration of both semantic and geometric information, 2) consideration of the need for accurate geometric representation of the micro-spatial environment and the efficiency of pedestrian route analysis, 3) applicability of the graph model to route analysis in both static and dynamic environments, and 4) ability of the multi-hierarchical route analysis to integrate the multiple levels of pedestrian decision characteristics, from the high to the low, to determine the optimal path.
Abstract: With the wider growth of web-based documents, the necessity of automatic document clustering and text summarization has increased. Here, document summarization, which involves extracting the essential content with appropriate information, removing unnecessary data, and presenting the data in a cohesive and coherent manner, is considered a most challenging task. In this research, a novel intelligent model for document clustering is designed with a graph model and Fuzzy-based association rule generation (gFAR). Initially, the graph model is used to map the relationships among the (multi-source) data, followed by the establishment of document clustering with the generation of association rules using the fuzzy concept. This method benefits redundancy elimination by mapping the relevant documents using the graph model, and it reduces time consumption and improves accuracy using fuzzy association rule generation. This framework is provided in an interpretable way for document clustering. It iteratively reduces the error rate during relationship mapping among the data (clusters) with the assistance of weighted document content. Also, this model represents the significance of data features with class discrimination. It is also helpful in measuring the significance of the features during the data clustering process. The simulation is done in the MATLAB 2016b environment and evaluated with empirical standards such as Relative Risk Patterns (RRP), ROUGE score, and Discrimination Information Measure (DMI). Here, the DailyMail and DUC 2004 datasets are used to extract the empirical results. The proposed gFAR model gives a better trade-off when compared with various prevailing approaches.
Abstract: The Markov model is usually selected as the base model of user actions in intrusion detection systems (IDS). However, the performance of the IDS depends on the state space of the Markov model, and it degrades as the space dimension grows. Here, the Markov Graph Model (MGM) is proposed to handle this issue. The specification of the model is described, and several methods for probability computation with MGM are also presented. Based on MGM, algorithms for building the user model and predicting user actions are presented. The performance of these algorithms, including the computational complexity, prediction accuracy, and storage requirement of MGM, is analyzed.
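The plain Markov baseline mentioned at the start of the abstract can be sketched as a first-order transition table learned from action sequences. This illustrates only the conventional baseline, not the MGM itself; the function names and the shell-command actions in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Estimate first-order transition probabilities P(next | current)
    from a list of observed action sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        state: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for state, ctr in counts.items()
    }


def predict_next(transitions, state):
    """Return the most probable next action, or None for an unseen state."""
    options = transitions.get(state)
    if not options:
        return None
    return max(options, key=options.get)
```

Trained on the sequences `["login", "ls", "cd", "ls"]` and `["login", "ls", "cat"]`, the table predicts `"ls"` after `"login"`. The number of table rows grows with the number of distinct states, which is the scaling concern the abstract raises.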
Funding: Supported by the National Natural Science Foundation of China (No. 61303082) and the Research Fund for the Doctoral Program of Higher Education of China (No. 20120121120046).
Abstract: Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus, while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method to take the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named a reordering graph to represent all phrase segmentations of a parallel sentence. The effect of the adjacent phrase number can then be quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
Funding: Funded by the National Natural Science Foundation of China Projects (grant number 71703128) and the Anhui Provincial Higher Education Research Key Project (grant number 2024AH052139).
Abstract: Based on the HS 4-digit code trade data in UN Comtrade from 1995 to 2020, this paper analyzes the characteristics of the evolution of the global PG trade network using the complex network approach and analyzes the changes in its resilience at the overall and country levels, respectively. The results show that: (1) The scale of the global PG trade network tends to expand, and its connections gradually tighten, shifting from a “supply-oriented” to a “supply-and-demand” pattern, in which the U.S., Russia, Qatar, and Australia have gradually replaced Canada, Japan, and Russia in the core trade positions, while OPEC countries such as Qatar, Algeria, and Kuwait mainly rely on PG exports to occupy the core of the global supply, and the trade status of other countries has been dynamically alternating and evolving. (2) The resilience of the global PG trade network is lower than that of the random network and decreases non-linearly as more countries are disrupted. Moreover, the impact of the U.S. is more significant than that of other countries. Simulations using the exponential random graph model (ERGM) revealed that national GDP, institutional quality, common border, and RTA network are the determinants of PG trade network formation, and the positive impact of the four factors not only varies significantly across regions and stages, but also increases with national network status.
Funding: Supported by the National Natural Science Foundation of PR China (42075130) and the Postgraduate Research and Innovation Project of Jiangsu Province (1534052101133).
Abstract: The transmission line (TL) parameter identification (PI) method plays an essential role in the transmission system. The existing PI methods usually have two limitations: (1) These methods only model a single TL and cannot consider the topology connection of multiple branches for simultaneous identification. (2) Transient bad data is ignored by these methods, and the random selection of terminal section data may distort the PI and have serious consequences. Therefore, a multi-task PI model considering multiple TLs’ spatial constraints and massive electrical section data is proposed in this paper. The Graph Attention Network module is used to represent each TL as a node and calculate its influence coefficient in the transmission network. A multi-task strategy of Hard Parameter Sharing is used to identify the conductance of multiple branches simultaneously. Experiments show that the method has good accuracy and robustness. Owing to the consideration of spatial constraints, the method can also obtain more accurate conductance values under different training and testing conditions.