Digital watermarking must balance imperceptibility, robustness, complexity, and security. To address the challenge of computational efficiency in trellis-based informed embedding, we propose a modified watermarking framework that integrates fuzzy c-means (FCM) clustering into the generation of block codewords for labeling trellis arcs. The system incorporates a parallel trellis structure, controllable embedding parameters, and a novel informed embedding algorithm with reduced complexity. Two types of embedding schemes, memoryless and memory-based, are designed to flexibly trade off imperceptibility against robustness. Experimental results demonstrate that the proposed method outperforms existing approaches in bit error rate (BER) and computational complexity under various attacks, including additive noise, filtering, JPEG compression, cropping, and rotation. The integration of FCM enhances robustness by increasing the codeword distance while preserving perceptual quality. Overall, the proposed framework is suitable for real-time and secure watermarking applications.
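As a rough illustration of the fuzzy c-means (FCM) step named in this abstract, the sketch below runs plain FCM on a handful of vectors. It is a generic textbook version under stated assumptions, not the authors' codeword generator: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the random initialization are all illustrative choices, and how the resulting centers would label trellis arcs is not covered here.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Plain fuzzy c-means on a list of equal-length float vectors.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which points[i] belongs to cluster k. Each row sums to 1.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u^m.
        centers = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            wsum = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ])
        # Membership update from the distances to every center.
        for i, p in enumerate(points):
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            for k in range(c):
                u[i][k] = 1.0 / sum(
                    (dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists
                )
    return centers, u
```

Unlike hard k-means, every point keeps a graded membership in every cluster, which is the property that distinguishes FCM.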
An efficient information hiding scheme should achieve both a high embedding capacity in the cover image and a high visual quality of the stego image; the latter is also referred to as embedding efficiency. This paper studies information hiding for gray-scale digital images, with particular attention to improving embedding capacity and embedding efficiency. To that end, two information hiding algorithms are proposed: the high-capacity information hiding algorithm (HCIH), which achieves a high embedding rate, and the high-quality information hiding algorithm (HQIH), which achieves high embedding efficiency. Simulation experiments show that the proposed algorithms achieve better performance.
Electronic medical records (EMRs) contain rich biomedical information and have great potential in disease diagnosis and biomedical research. However, EMR information is usually in the form of unstructured text, which increases the cost of use and hinders its application. In this work, an effective named entity recognition (NER) method is presented for information extraction from Chinese EMRs. It uses word-embedding-bootstrapped deep active learning to promote the acquisition of medical information from Chinese EMRs and to release its value. Deep active learning with a bi-directional long short-term memory network followed by a conditional random field (Bi-LSTM+CRF) is used to capture the characteristics of different information from the labeled corpus, and the continuous bag-of-words and skip-gram word embedding models are combined in this model to capture the text features of Chinese EMRs from the unlabeled corpus. To evaluate the method, NER tasks on Chinese EMRs with “medical history” content were used. Experimental results show that the word-embedding-bootstrapped deep active learning method using an unlabeled medical corpus achieves better performance than the other models.
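The two word embedding models mentioned in this abstract differ mainly in how training examples are cut from unlabeled text. The sketch below only builds those examples; it does not train embeddings. The function names and the `window` parameter are hypothetical, and the tokenization of Chinese EMR text is assumed to have been done already.

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: skip-gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window=2):
    """(context_list, center) examples: CBOW predicts the center from its context."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples
```

Because both functions need only raw token sequences, they can be fed from an unlabeled corpus, which is what makes such embeddings usable for bootstrapping a supervised tagger.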
For networking of big data applications, an essential issue is how to represent networks in vector space for further mining and analysis tasks, e.g., node classification, clustering, link prediction, and visualization. Most existing studies on this subject concentrate on monoplex networks, which consider a single type of relation among nodes. However, numerous real-world networks are naturally composed of multiple layers with different relation types; such a network is called a multiplex network. The majority of existing multiplex network embedding methods either overlook node attributes, resort to node labels for training, or underutilize the underlying information shared across multiple layers. In this paper, we propose Multiplex Network Infomax (MNI), an unsupervised embedding framework that represents the information of multiple layers in a unified embedding space. More specifically, we aim to maximize the mutual information between the unified embedding and the node embeddings of each layer. On the basis of this framework, we present an unsupervised network embedding method for attributed multiplex networks. Experimental results show that our method achieves competitive performance not only on node-related tasks, such as node classification, clustering, and similarity search, but also on a typical edge-related task, i.e., link prediction, at times even outperforming relevant supervised methods even though MNI is fully unsupervised.
In view of the lack of research in China on information models for tufting carpet machines, an information modeling method based on the Object Linking and Embedding for Process Control Unified Architecture (OPC UA) framework is proposed to solve the “information island” problem caused by the differing data interfaces of heterogeneous equipment and systems in a tufting carpet machine workshop. An information model of the tufting carpet machine is established by analyzing the system architecture, workshop equipment composition, and information flow of the workshop, in combination with the OPC UA information modeling specification. Subsequently, the OPC UA protocol is used to instantiate and map the information model, and an OPC UA server is developed. Finally, the practicability of the tufting carpet machine information model under the OPC UA framework and the feasibility of interconnecting heterogeneous devices in the tufting carpet machine digital workshop are verified. On this basis, cloud and remote access to the underlying device data is realized. Applying this information model and information integration scheme in actual production puts OPC UA technology into practice in the digital workshop of the tufting carpet machine.
A heterogeneous information network, which is composed of various types of nodes and edges, has a complex structure and rich information content, and is widely used in social networks, academic networks, e-commerce, and other fields. Link prediction, a key task for revealing unobserved relationships in the network, is of great significance in heterogeneous information networks. This paper reviews the application of representation-based learning methods to link prediction in heterogeneous information networks. It introduces the basic concepts of heterogeneous information networks and the theoretical basis of representation learning, and discusses in detail the specific application of deep learning models to node embedding learning and link prediction. The effectiveness and superiority of these methods are demonstrated experimentally on multiple real data sets.
At present, digital image information fusion suffers from low data cleaning accuracy and frequent omission of repeated data, resulting in unsatisfactory information fusion. To address this, a visualized multicomponent information fusion method for big data based on radar maps is proposed in this paper. The data model of the perceptual digital image is constructed using linear regression analysis. The ID tag of the collected image data is taken as the Transaction Identification (TID) and compared: if the TIDs of two data items are the same, repeated data detection is carried out. After the test, the data set is processed multiple times following the same procedure to improve the precision of data cleaning and reduce omissions. Based on the radar images, hierarchical visualization of the processed multi-level information fusion is realized. The experiments show that the method can clean redundant data accurately and achieve efficient fusion of the multi-level information of big data in digital images.
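The TID comparison described in this abstract can be pictured with a small sketch. The abstract does not say how two items with the same TID are judged to be repeats, so the version below assumes a simple equality check on the payload; the function name `remove_repeated` and the `(tid, payload)` record layout are hypothetical.

```python
def remove_repeated(records):
    """Keep the first record for each (TID, payload); drop exact repeats.

    `records` is a list of (tid, payload) tuples. Only records that share
    a TID are compared, and the comparison (payload equality) is an
    assumption made for illustration.
    """
    seen = {}      # tid -> payloads already kept under that TID
    cleaned = []
    for tid, payload in records:
        kept = seen.setdefault(tid, [])
        if payload in kept:
            continue   # same TID and same content: treat as a repeat
        kept.append(payload)
        cleaned.append((tid, payload))
    return cleaned
```

Grouping by TID first means the more expensive content comparison only runs on candidates that could plausibly be duplicates.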
An information hiding algorithm is proposed that hides information by embedding secret data into the palette of bitmap resources of portable executable (PE) files. This algorithm offers higher security than some traditional ones because it integrates the secret data with the bitmap resources. By analyzing how an operating system parses bitmap resources and the layer of resource data in PE files, a safe and practical solution is presented to two problems that arise during data embedding: bitmap resources being parsed incorrectly and other resource data being confused. The feasibility and effectiveness of the proposed algorithm are confirmed through computer experiments.
This paper proposes a new technique for embedding depth maps into the corresponding 2-dimensional (2D) images. Since a 2D image and its depth map are integrated into one image format, they can be treated as if they were a single 2D image. This can halve the amount of data in 3D images and simplify sending them through networks, because synchronization between the images for the left and right eyes becomes unnecessary. We embed depth maps in the quantized discrete cosine transform (DCT) data of 2D images. The key question for this technique is whether the depth maps can be embedded into 2D images without perceivably deteriorating their quality. We try to reduce the deterioration by compressing the depth map data using the differences from the adjacent pixel to the left. We assume that there is at most one non-zero pixel on one horizontal line in the DCT block because the depth map values change abruptly. We conduct an experiment to evaluate the quality of the 2D images embedded with depth maps and find that satisfactory quality can be achieved.
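The left-neighbor differencing mentioned in this abstract can be sketched on a single row of depth values. This is a generic delta encoding shown for illustration only; the function names are hypothetical, and the DCT-domain embedding itself is not shown.

```python
def encode_row(row):
    """Store the first value, then each pixel's difference from its left neighbour."""
    if not row:
        return []
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def decode_row(diffs):
    """Invert encode_row with a running sum."""
    out, acc = [], 0
    for d in diffs:
        acc += d
        out.append(acc)
    return out
```

For a row such as `[10, 10, 10, 40, 40]`, which is flat except for one abrupt depth edge, the encoded form is `[10, 0, 0, 30, 0]`: after the leading value only one difference is non-zero, which is the kind of sparsity the abstract's assumption relies on.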
The spectrum access approach and the power allocation scheme are important techniques in a cognitive radio (CR) system: they not only affect the communication performance of the CR user (secondary user, SU) but also play a decisive role in protecting the primary user (PU). In this study, we propose a power allocation scheme for the SU based on sensing the status of the PU in a single-input single-output (SISO) CR network. Instead of the conventional binary primary transmit power strategy, in which the sensed PU is either present or absent, we consider a more practical scenario in which the PU transmits with multiple power levels and the quantized side information is known by the SU in advance as a primary quantized codebook. The secondary power allocation scheme, which maximizes the average throughput under the rate loss constraint (RLC) of the PU, is parameterized by the sensing results for the PU, the primary quantized codebook, and the channel state information (CSI) of the SU. Furthermore, the Differential Evolution (DE) algorithm is used to solve this non-convex power allocation problem. Simulation results show the performance and effectiveness of the proposed scheme under more practical communication conditions.
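For readers unfamiliar with Differential Evolution, the sketch below shows the classic DE/rand/1/bin loop on an arbitrary box-constrained objective. It is a generic minimizer, not the paper's formulation: the throughput objective and the rate loss constraint are not modeled, and the function name, population size, `F`, and `CR` values are illustrative assumptions. A maximization problem such as average throughput would be passed in negated.

```python
import random

def differential_evolution(f, bounds, pop_size=20, gens=200, F=0.5, CR=0.9, seed=0):
    """Minimise f over the box `bounds` with the DE/rand/1/bin scheme.

    `bounds` is a list of (low, high) pairs, one per dimension.
    Returns (best_vector, best_cost). Defaults are illustrative only.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = f(trial)
            if trial_cost <= cost[i]:  # greedy selection keeps the better one
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

DE needs only function evaluations, no gradients or convexity, which is why it is a common choice for non-convex allocation problems of this kind.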
Real-world complex networks are inherently heterogeneous; they have different types of nodes, attributes, and relationships. In recent years, various methods have been proposed to automatically learn how to encode the structural and semantic information contained in heterogeneous information networks (HINs) into low-dimensional embeddings; this task is called heterogeneous network embedding (HNE). Efficient HNE techniques can benefit various HIN-based machine learning tasks such as node classification, recommender systems, and information retrieval. Here, we provide a comprehensive survey of key advancements in the area of HNE. First, we define an encoder-decoder-based HNE model taxonomy. Then, we systematically overview, compare, and summarize various state-of-the-art HNE models and analyze the advantages and disadvantages of various model categories to identify more potentially competitive HNE frameworks. We also summarize the application fields, benchmark datasets, open source tools, and performance evaluation in the HNE area. Finally, we discuss open issues and suggest promising future directions. We anticipate that this survey will provide deep insights into research in the field of HNE.
This paper presents an approach to hiding secret speech information in a code excited linear prediction (CELP)-based speech coding scheme by adopting an analysis-by-synthesis (ABS)-based algorithm for speech information hiding and extraction, for the purpose of secure speech communication. The secret speech is coded with 2.4 kb/s mixed excitation linear prediction (MELP) and embedded in CELP-type public speech. The ABS algorithm uses the speech synthesizer in the speech coder. Speech embedding and coding are synchronous, i.e., the public and secret speech information data are fused. An experiment embedding 2.4 kb/s MELP secret speech in G.728-coded public speech transmitted over the public switched telephone network (PSTN) shows that the proposed approach satisfies the requirements of information hiding, meets the speech quality constraints of secure communication, and achieves a high average hiding capacity of 3.2 kb/s with excellent speech quality while complicating speaker recognition.
Aspect-based sentiment analysis aims to detect and classify sentiment polarities as negative, positive, or neutral while associating them with the aspects identified in the corresponding context. Prior methodologies widely use either word embeddings or tree-based representations. However, using deep features such as word embeddings and tree-based dependencies separately has become a significant cause of information loss. Generally, word embedding preserves the syntactic and semantic relations between pairs of terms in a sentence, while the tree-based structure conserves the grammatical and logical dependencies of the context. In addition, the sentence-oriented word position is a critical factor that influences the contextual information of a targeted sentence, so knowledge of the position of words in a sentence is considered significant. In this study, we propose to use word embedding, tree-based representation, and contextual position information in combination and evaluate whether their combination improves the effectiveness of the results. Their joint use enhances the accurate identification and extraction of targeted aspect terms, which also influences their classification. We propose a method named Attention-Based Multi-Channel Convolutional Neural Network (Att-MC-CNN) that jointly uses these three deep features: word embedding, tree-based structure, and contextual position information. These three features are fed to a Multi-Channel Convolutional Neural Network (MC-CNN) that identifies and extracts the potential terms and classifies their polarities. In addition, these terms are further filtered with the attention mechanism, which determines the most significant words. The empirical analysis demonstrates the effectiveness of the proposed approach compared with existing techniques when evaluated on standard datasets. The experimental results show that our approach outperforms them in the F1 measure, reaching an overall 94% in identifying aspects and 92% in sentiment classification.
With the deployment of modern infrastructure for public transportation, several studies have analyzed the movement patterns of people using smart card data and have characterized different areas. In this paper, we propose the “movement purpose hypothesis” that each movement arises from two causes: where the person is and what the person wants to do at a given moment. We formulate this hypothesis as a synthesis model in which two network graphs generate a movement network graph. We then develop two novel embedding models to assess the hypothesis and demonstrate that the models obtain a vector representation of a geospatial area from the movement patterns of people in large-scale smart card data. We conducted an experiment using smart card data for a large railroad network in the Kansai region of Japan and obtained a vector representation of each railroad station and each purpose using the developed embedding models. The results show that network embedding methods are suitable for large-scale movement data and that the developed models perform better than existing embedding methods in multi-label classification of train stations on the purpose-of-use data set. Our proposed models can contribute to the prediction of people flows by discovering underlying representations of geospatial areas from mobility data.
Because of the heterogeneity of nodes and edges, heterogeneous network embedding, which embeds highly coupled networks into a set of low-dimensional vectors, is a very challenging task. Existing models learn embedding vectors either only for nodes or only for edges. These two kinds of embedding learning are rarely performed in the same model, and both overlook the internal correlation between nodes and edges. To solve these problems, a node and edge joint embedding model for Heterogeneous Information Networks (HINs), called NEJE, is proposed. The NEJE model can better capture the latent structural and semantic information of an HIN through two joint learning strategies: type-level joint learning and element-level joint learning. First, node-type-aware structure learning and edge-type-aware semantic learning are performed sequentially on the original network and its line graph to obtain the initial embeddings of nodes and edges. Then, to optimize performance, type-level joint learning is performed by alternately training the node embedding on the original network and the edge embedding on the line graph. Finally, a new homogeneous network is constructed from the original heterogeneous network, and a graph attention model is applied to the new network to perform element-level joint learning. Experiments on three tasks and five public datasets show that the NEJE model improves performance by about 2.83% over other models, and by 6.42% on average for the node clustering task on the Digital Bibliography & Library Project (DBLP) dataset.
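The line graph this abstract relies on is a standard construction: every edge of the original network becomes a node, and two such nodes are linked when the original edges share an endpoint. The sketch below builds it for a small undirected graph. Node and edge types, which NEJE takes into account, are ignored here, and the function name `line_graph` is hypothetical.

```python
from itertools import combinations

def line_graph(edges):
    """Line graph of an undirected graph given as a list of (u, v) edges.

    Returns (nodes, links): `nodes` are the original edges with their
    endpoints sorted, and `links` joins two of them when they share an
    endpoint. Type information is deliberately left out of this sketch.
    """
    nodes = [tuple(sorted(e)) for e in edges]
    links = [
        (e1, e2)
        for e1, e2 in combinations(nodes, 2)
        if set(e1) & set(e2)
    ]
    return nodes, links
```

Running any node embedding method on the line graph yields one vector per original edge, which is how an edge embedding can be learned with node-oriented machinery.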
Funding (FCM-based trellis watermarking): National Science and Technology Council, Taiwan, under grant numbers NSTC 114-2221-E-167-005-MY3 and NSTC 113-2221-E-167-006-.
Funding (NER on Chinese EMRs): Artificial Intelligence Innovation and Development Project of Shanghai Municipal Commission of Economy and Information (No. 2019-RGZN-01081).
Funding (Multiplex Network Infomax): This work was supported by the National Natural Science Foundation of China (NSFC) under Grant U19B2004, in part by the National Key R&D Program of China under Grant 2022YFB2901202, in part by the Open Funding Projects of the State Key Laboratory of Communication Content Cognition (No. 20K05 and No. A02107), and in part by the Special Fund for Science and Technology of Guangdong Province under Grant 2019SDR002.
Funding (link prediction in heterogeneous information networks): Science and Technology Research Project of Jiangxi Provincial Department of Education (Project Nos. GJJ211348, GJJ211347, and GJJ2201056).
Funding (radar-map information fusion): 2018 National Grade Innovation and Entrepreneurship Training Program for College Students, China (No. 201811562005); Research Project of Gansu University, China (No. 2016A-105); Innovation and Entrepreneurship Education Project of Gansu Province in 2019, China (No. 2019024).
Funding (information hiding in PE bitmap palettes): Applied Basic Research Programs of Sichuan Province under Grant No. 2010JY0001; Fundamental Research Funds for the Central Universities under Grant No. ZYGX2010J068.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61571209).
Abstract: The spectrum access approach and the power allocation scheme are important techniques in a cognitive radio (CR) system: they not only affect the communication performance of the CR user (secondary user, SU) but also play a decisive role in protecting the primary user (PU). In this study, we propose a power allocation scheme for the SU based on status sensing of the PU in a single-input single-output (SISO) CR network. Instead of the conventional binary primary transmit power strategy, in which the sensed PU is either present or absent, we consider a more practical scenario in which the PU transmits at multiple power levels and quantized side information is known by the SU in advance as a primary quantized codebook. The secondary power allocation scheme, which maximizes the average throughput under the rate loss constraint (RLC) of the PU, is parameterized by the sensing results for the PU, the primary quantized codebook, and the channel state information (CSI) of the SU. Furthermore, the Differential Evolution (DE) algorithm is used to solve this non-convex power allocation problem. Simulation results show the performance and effectiveness of the proposed scheme under more practical communication conditions.
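For readers unfamiliar with Differential Evolution, a minimal sketch of the common DE/rand/1/bin variant follows. It is not the paper's formulation: the two-channel gains, the total power budget, and the penalty weight are toy assumptions, and the objective is a simple penalized throughput rather than the RLC-constrained problem described in the abstract.

```python
import math
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           f=0.5, cr=0.9, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the box
                else:
                    v = pop[i][j]
                trial.append(v)
            s = objective(trial)
            if s <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Toy assumptions: two channels with fixed gains and a total power budget
GAINS = [1.0, 0.5]
BUDGET = 2.0


def throughput(p):
    return sum(math.log2(1.0 + g * x) for g, x in zip(GAINS, p))


def penalized_cost(p):
    # Negative throughput plus a penalty for exceeding the power budget
    violation = max(0.0, sum(p) - BUDGET)
    return -throughput(p) + 100.0 * violation


best_p, best_cost = differential_evolution(penalized_cost, [(0.0, BUDGET)] * len(GAINS))
```

DE needs only objective evaluations, not gradients, which is why it is a natural fit for non-convex allocation problems of the kind the abstract describes.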
Funding: Supported by the National Key Research and Development Plan of China (2017YFB0503700, 2016YFB0501801), the National Natural Science Foundation of China (61170026, 62173157), the Thirteen Five-Year Research Planning Project of National Language Committee (No. YB135-149), and the Fundamental Research Funds for the Central Universities (Nos. CCNU20QN022, CCNU20QN021, CCNU20ZT012).
Abstract: Real-world complex networks are inherently heterogeneous; they have different types of nodes, attributes, and relationships. In recent years, various methods have been proposed to automatically learn how to encode the structural and semantic information contained in heterogeneous information networks (HINs) into low-dimensional embeddings; this task is called heterogeneous network embedding (HNE). Efficient HNE techniques can benefit various HIN-based machine learning tasks such as node classification, recommender systems, and information retrieval. Here, we provide a comprehensive survey of key advancements in the area of HNE. First, we define an encoder-decoder-based HNE model taxonomy. Then, we systematically overview, compare, and summarize various state-of-the-art HNE models and analyze the advantages and disadvantages of various model categories to identify more potentially competitive HNE frameworks. We also summarize the application fields, benchmark datasets, open source tools, and performance evaluation in the HNE area. Finally, we discuss open issues and suggest promising future directions. We anticipate that this survey will provide deep insights into research in the field of HNE.
Abstract: This paper presents an approach to hiding secret speech information in a code excited linear prediction (CELP)-based speech coding scheme, adopting an analysis-by-synthesis (ABS)-based algorithm for speech information hiding and extraction for the purpose of secure speech communication. The secret speech is coded in 2.4 Kb/s mixed excitation linear prediction (MELP) and embedded in CELP-type public speech. The ABS algorithm uses the speech synthesizer in the speech coder. Speech embedding and coding are synchronous, i.e., the public and secret speech information data are fused. An experiment embedding 2.4 Kb/s MELP secret speech in G.728-coded public speech transmitted via the public switched telephone network (PSTN) shows that the proposed approach satisfies the requirements of information hiding, meets the speech quality constraints of secure communication, and achieves a high average hiding capacity of 3.2 Kb/s with excellent speech quality while complicating speaker recognition.
Funding: Supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. 3418].
Abstract: Aspect-based sentiment analysis aims to detect sentiment polarities, classify them as negative, positive, or neutral, and associate them with the aspects identified in the corresponding context. Prior methodologies widely use either word embedding or tree-based representations. However, using these deep features separately has become a significant cause of information loss. Generally, word embedding preserves the syntactic and semantic relations between pairs of terms in a sentence, while the tree-based structure preserves the grammatical and logical dependencies of the context. In addition, the position of a word within a sentence is a critical factor that influences the contextual information of the targeted sentence, so knowledge of word position information is considered significant. In this study, we propose to use word embedding, tree-based representation, and contextual position information in combination and evaluate whether the combination improves the effectiveness of the results. Their joint use enhances the accurate identification and extraction of targeted aspect terms, which also influences their classification. We propose a method named Attention Based Multi-Channel Convolutional Neural Network (Att-MC-CNN) that jointly uses these three deep features. They are fed to a Multi-Channel Convolutional Neural Network (MC-CNN) that identifies and extracts the potential terms and classifies their polarities. These terms are further filtered by an attention mechanism, which determines the most significant words. Empirical analysis on standard datasets shows the effectiveness of the proposed approach compared with existing techniques. The experimental results show that our approach outperforms them in the F1 measure, achieving 94% overall in identifying aspects and 92% in sentiment classification.
Abstract: With the deployment of modern infrastructure for public transportation, several studies have analyzed the movement patterns of people using smart card data and have characterized different areas. In this paper, we propose the “movement purpose hypothesis”: each movement arises from two causes, where the person is and what the person wants to do at a given moment. We formulate this hypothesis as a synthesis model in which two network graphs generate a movement network graph. We then develop two novel embedding models to assess the hypothesis and demonstrate that the models obtain a vector representation of a geospatial area from the movement patterns of people in large-scale smart card data. We conducted an experiment using smart card data for a large railroad network in the Kansai region of Japan and obtained a vector representation of each railroad station and each purpose using the developed embedding models. The results show that network embedding methods are suitable for large-scale movement data, and that the developed models perform better than existing embedding methods in multi-label classification of train stations on the purpose-of-use data set. Our proposed models can contribute to the prediction of people flows by discovering underlying representations of geospatial areas from mobility data.
Funding: Supported by the National Natural Science Foundation of China (No. 62103143), the Hunan Province Key Research and Development Program (No. 2022WK2006), the Special Project for the Construction of Innovative Provinces in Hunan (Nos. 2020TP2018 and 2019GK4030), the Young Backbone Teacher of Hunan Province (No. 2022101), and the Scientific Research Fund of Hunan Provincial Education Department (No. 22B0471).
Abstract: Due to the heterogeneity of nodes and edges, heterogeneous network embedding, which embeds highly coupled networks into a set of low-dimensional vectors, is a very challenging task. Existing models learn embedding vectors either only for nodes or only for edges. These two kinds of embedding learning are rarely performed in the same model, and both overlook the internal correlation between nodes and edges. To solve these problems, a node and edge joint embedding model called NEJE is proposed for Heterogeneous Information Networks (HINs). The NEJE model can better capture the latent structural and semantic information of an HIN through two joint learning strategies: type-level joint learning and element-level joint learning. First, node-type-aware structure learning and edge-type-aware semantic learning are performed sequentially on the original network and its line graph to obtain the initial embeddings of nodes and edges. Then, to optimize performance, type-level joint learning is performed by alternately training node embedding on the original network and edge embedding on the line graph. Finally, a new homogeneous network is constructed from the original heterogeneous network, and a graph attention model is applied to the new network to perform element-level joint learning. Experiments on three tasks and five public datasets show that the performance of the NEJE model improves by about 2.83% over other models, and even improves by 6.42% on average for the node clustering task on the Digital Bibliography & Library Project (DBLP) dataset.
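The line graph on which edge embeddings are learned can be built as in the sketch below, using the standard definition for undirected graphs: each original edge becomes a node, and two such nodes are linked when the original edges share an endpoint. The function name and the small author/paper/venue example are illustrative assumptions; node and edge types, which NEJE relies on, are omitted.

```python
from itertools import combinations


def line_graph(edges):
    """Return an adjacency dict of the line graph of an undirected edge list."""
    lg = {e: set() for e in edges}
    for e1, e2 in combinations(edges, 2):
        if set(e1) & set(e2):  # the two edges share at least one endpoint
            lg[e1].add(e2)
            lg[e2].add(e1)
    return lg


# Hypothetical bibliographic network: authors a1, a2; papers p1, p2; venue v1
edges = [("a1", "p1"), ("a2", "p1"), ("p1", "v1"), ("a1", "p2")]
lg = line_graph(edges)
```

In this example the three edges incident to `p1` become mutually adjacent in the line graph, while `("a1", "p2")` is adjacent only to `("a1", "p1")` through the shared author.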