Reliable traffic flow prediction is crucial for mitigating urban congestion. This paper proposes the Attention-based spatiotemporal Interactive Dynamic Graph Convolutional Network (AIDGCN), a novel architecture integrating an Interactive Dynamic Graph Convolution Network (IDGCN) with Temporal Multi-Head Trend-Aware Attention. Its core innovation lies in IDGCN, which splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs, and in a novel attention mechanism that incorporates convolutional operations to capture essential local traffic trends, addressing a critical gap in standard attention for continuous data. For 15- and 60-min forecasting on METR-LA, AIDGCN achieves MAEs of 0.75% and 0.39%, and RMSEs of 1.32% and 0.14%, respectively. In the 60-min long-term forecasting of the PEMS-BAY dataset, AIDGCN outperforms the MRA-BGCN method by 6.28%, 4.93%, and 7.17% in terms of MAE, RMSE, and MAPE, respectively. Experimental results demonstrate the superiority of the proposed model over state-of-the-art methods.
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. A comparative analysis against baseline models indicates that DMST-GNODE performs best across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model’s advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These results indicate that DMST-GNODE achieves higher accuracy and lower errors than the baseline models across different time intervals and datasets.
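The error metrics named in the traffic-forecasting abstracts above (MAE, RMSE, MAPE) can be sketched as follows. This is a minimal illustration using their standard definitions; the function names and the toy arrays are assumptions for the example and are not taken from any of the papers.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the residuals."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root Mean Square Error: penalises large residuals more than MAE."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; assumes no zero targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Hypothetical observed and predicted flows, for illustration only.
observed = np.array([10.0, 20.0, 30.0])
predicted = np.array([12.0, 18.0, 33.0])

print(mae(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE on the same data, which is why the two are usually reported together.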
Due to self-occlusion and the high degree of freedom of the hand, estimating 3D hand pose from a single RGB image is a very challenging problem. Graph convolutional networks (GCNs) use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression. However, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Recently, hypergraph convolutional networks (HGCNs) have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges; therefore, this paper proposes a framework for 3D hand pose estimation based on HGCN, which can better extract correlated relationships between adjacent and non-adjacent hand joints. To overcome the shortcomings of predefined hypergraph structures, a dynamic hypergraph convolutional network is proposed, in which hyperedges are constructed dynamically based on hand joint feature similarity. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution is proposed. The proposed method is evaluated on publicly available benchmark datasets. Qualitative and quantitative experimental results both show that the proposed HGCN and its improved variants outperform GCN for 3D hand pose estimation and achieve state-of-the-art performance compared with existing methods.
Traffic flow prediction is a crucial element of intelligent transportation systems. However, accurate traffic flow prediction is quite challenging because of its highly nonlinear, complex, and dynamic characteristics. To address the difficulties in simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow, as well as the high time complexity of existing models, a multi-head flow attention-based local-global dynamic hypergraph convolution (MFA-LGDHC) prediction model is proposed, which consists of a multi-head flow attention (MHFA) mechanism, a graph convolution network (GCN), and local-global dynamic hypergraph convolution (LGHC). MHFA is utilized to extract the time dependency of traffic flow and reduce the time complexity of the model. GCN is employed to capture the spatial dependency of traffic flow. LGHC utilizes down-sampling convolution and isometric convolution to capture the local and global spatial dependencies of traffic flow, and dynamic hypergraph convolution is used to model the dynamic higher-order relationships of the traffic road network. Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance.
BACKGROUND: The accurate classification of focal liver lesions (FLLs) is essential to properly guide treatment options and predict prognosis. Dynamic contrast-enhanced computed tomography (DCE-CT) is still the cornerstone in the exact classification of FLLs due to its noninvasive nature, high scanning speed, and high-density resolution. Since their recent development, convolutional neural network-based deep learning techniques have been recognized to have high potential for image recognition tasks. AIM: To develop and evaluate an automated multiphase convolutional dense network (MP-CDN) to classify FLLs on multiphase CT. METHODS: A total of 517 FLLs scanned on a 320-detector CT scanner using a four-phase DCE-CT imaging protocol (including precontrast phase, arterial phase, portal venous phase, and delayed phase) from 2012 to 2017 were retrospectively enrolled. FLLs were classified into four categories: category A, hepatocellular carcinoma (HCC); category B, liver metastases; category C, benign non-inflammatory FLLs including hemangiomas, focal nodular hyperplasias and adenomas; and category D, hepatic abscesses. Each category was split into a training set and a test set in an approximate 8:2 ratio. An MP-CDN classifier with a sequential input of the four-phase CT images was developed to automatically classify FLLs. The classification performance of the model was evaluated on the test set; the accuracy and specificity were calculated from the confusion matrix, and the area under the receiver operating characteristic curve (AUC) was calculated from the SoftMax probability output by the last layer of the MP-CDN. RESULTS: A total of 410 FLLs were used for training and 107 FLLs were used for testing. The mean classification accuracy on the test set was 81.3% (87/107). The accuracy/specificity of distinguishing each category from the others were 0.916/0.964, 0.925/0.905, 0.860/0.918, and 0.925/0.963 for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. The AUC (95% confidence interval) for differentiating each category from the others was 0.92 (0.837-0.992), 0.99 (0.967-1.00), 0.88 (0.795-0.955) and 0.96 (0.914-0.996) for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. CONCLUSION: MP-CDN accurately classified FLLs detected on four-phase CT as HCC, metastases, benign non-inflammatory FLLs and hepatic abscesses and may assist radiologists in identifying the different types of FLLs.
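The step "accuracy and specificity were calculated from the confusion matrix" can be sketched in a one-vs-rest form as below. This is only an illustrative sketch: the helper names and the 3x3 matrix are hypothetical and are not the study's data; rows are assumed to be true classes and columns predicted classes.

```python
import numpy as np

def overall_accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    return float(np.trace(cm) / cm.sum())

def one_vs_rest_metrics(cm, k):
    """Accuracy and specificity for class k against all other classes.

    cm[i, j] counts samples whose true class is i and predicted class is j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = cm[k, k]
    fn = cm[k].sum() - tp        # class-k samples predicted as something else
    fp = cm[:, k].sum() - tp     # other samples predicted as class k
    tn = cm.sum() - tp - fn - fp
    accuracy = (tp + tn) / cm.sum()
    specificity = tn / (tn + fp)
    return float(accuracy), float(specificity)

# Hypothetical confusion matrix for three classes (NOT the paper's data).
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 2, 8]]

print(overall_accuracy(toy_cm))          # 23 correct out of 30
print(one_vs_rest_metrics(toy_cm, 0))    # accuracy and specificity for class 0
```

The reported overall figure follows the same arithmetic: 87 correct out of 107 test lesions gives 87/107, or about 81.3%.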
Pneumonia is a serious disease that can be fatal, particularly among children and the elderly. The accuracy of pneumonia diagnosis can be improved by combining artificial-intelligence technology with X-ray imaging. This study proposes X-ODFCANet, which addresses the issues of low accuracy and excessive parameters in existing deep-learning-based pneumonia-classification methods. This network incorporates a feature coordination attention module and an omni-dimensional dynamic convolution (ODConv) module, leveraging the residual module for feature extraction from X-ray images. The feature coordination attention module utilizes two one-dimensional feature encoding processes to aggregate feature information from different spatial directions. Additionally, the ODConv module extracts and fuses feature information in four dimensions: the spatial dimension of the convolution kernel, input and output channel quantities, and convolution kernel quantity. The experimental results demonstrate that the proposed method effectively improves the accuracy of pneumonia classification, which is 3.77% higher than that of ResNet18. The model has 4.45M parameters, a reduction of approximately 2.5 times. The code is available at https://github.com/limuni/X ODFCA NET.
Hyperspectral image (HSI) classification is crucial for numerous remote sensing applications. Traditional deep learning methods may miss pixel relationships and context, leading to inefficiencies. This paper introduces the spectral band graph convolutional and attention-enhanced CNN joint network (SGCCN), a novel approach that harnesses the power of spectral band graph convolutions for capturing long-range relationships, utilizes local perception of attention-enhanced multi-level convolutions for local spatial features, and employs a dynamic attention mechanism to enhance feature extraction. The SGCCN integrates spectral and spatial features through a self-attention fusion network, significantly improving classification accuracy and efficiency. The proposed method outperforms existing techniques, demonstrating its effectiveness in handling the challenges associated with HSI data.
Traditional Computational Fluid Dynamics (CFD) simulations are computationally expensive when applied to complex fluid–structure interaction problems and often struggle to capture the essential flow features governing vortex-induced vibrations (VIV) of floating structures. To overcome these limitations, this study develops a hybrid framework that integrates high-fidelity CFD modeling with deep learning techniques to enhance the accuracy and efficiency of VIV response prediction. First, an unstructured finite-volume fluid–structure coupling model is established to generate high-resolution flow field data and extract multi-component time-series feature tensors. These tensors serve as inputs to a Squeeze-and-Excitation Convolutional Neural Network (SE-CNN), which models the nonlinear coupling between flow disturbances and structural responses. The SE-CNN architecture incorporates an attention-based weighting mechanism through an embedded Squeeze-and-Excitation module, dynamically optimizing channel feature importance and improving sensitivity to critical flow characteristics. During training, multidimensional inputs, including pressure, velocity gradient, and displacement sequences, are used to capture the full complexity of fluid–structure interactions. Results demonstrate that the proposed method achieves a maximum amplitude prediction error of only 2.9% and a main frequency deviation below 0.03 Hz, outperforming conventional CNN models by reducing amplitude prediction error from 3.2% to 1.9%. The approach is validated using a representative semi-submersible platform, confirming its robustness across varying damping conditions and flow velocities.
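The channel-weighting idea behind a Squeeze-and-Excitation module can be sketched in a few lines of NumPy. This is a generic sketch of the standard squeeze (global average pooling) and excitation (two small dense layers ending in a sigmoid) steps, not the SE-CNN of the paper; the function name, the shapes, the reduction ratio, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def squeeze_excite(features, w1, w2):
    """Reweight the channels of a (channels, length) feature map.

    w1 has shape (channels // r, channels) and w2 has shape
    (channels, channels // r), where r is an assumed reduction ratio.
    """
    squeezed = features.mean(axis=1)                  # squeeze: one value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)           # bottleneck layer with ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate in (0, 1)
    return features * weights[:, None], weights       # scale each channel by its gate

# Hypothetical input: 4 channels (e.g. pressure, velocity gradient, ...), 16 time steps.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 16))
w1 = rng.normal(scale=0.1, size=(2, 4))   # reduction ratio r = 2 (assumed)
w2 = rng.normal(scale=0.1, size=(4, 2))

reweighted, gates = squeeze_excite(features, w1, w2)
print(gates)
```

In a trained network the two weight matrices are learned, so channels carrying the most informative signal receive gates closer to 1 while less useful channels are damped.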
The impact dynamics of flexible solids is important in engineering practice. Obtaining the dynamic response is a challenging task that is usually achieved by numerical methods. The objectives of the study are twofold. Firstly, the discrete singular convolution (DSC) is used for the first time to analyze impact dynamics. Secondly, the efficiency of various numerical methods for dynamic analysis is explored via an example of a flexible rod hit by a rigid ball. Three numerical methods, namely the conventional finite element (FE) method, the DSC algorithm, and the spectral finite element (SFE) method, and one proposed modeling strategy, the improved spectral finite element (ISFE) method, are involved. Numerical results are compared with the known analytical solutions to show their efficiency. It is demonstrated that the proposed ISFE modeling strategy with a proper length of conventional FE yields the most accurate contact stress among the four investigated models. It is also found that the DSC algorithm is an alternative method for collision problems.
When evaluating the seismic safety and reliability of complex engineering structures, it is critical to reasonably consider the randomness and multi-dimensional nature of ground motions. To this end, a modeling strategy for multi-dimensional stochastic earthquake ground motions is proposed in this study. This improved seismic model has several merits that enable it to better support seismic analyses of structures. First, the ground motion model is compatible with the design response spectrum. Second, the evolutionary power spectrum involved in the model and the design response spectrum are constructed with sufficient consideration of the correlation between different seismic components. Third, the random function-based dimension-reduction representation is applied, by which the seismic model is established with three elementary random variables. Numerical simulations of multi-dimensional stochastic ground motions in a specific design scenario indicate the effectiveness of the proposed modeling strategy. Moreover, the multi-dimensional seismic response and the global reliability of a high-rise frame-core tube structure are discussed in detail to further illustrate the engineering applicability of the proposed method. The analytical investigations demonstrate that the suggested stochastic model of multi-dimensional ground motion is suitable for accurate seismic response analysis and dynamic reliability assessment of complex engineering structures in performance-based seismic design.
Integrated water and fertilizer management is important for promoting sustainable development of facility agriculture, and biochar plays an important role in guaranteeing food production, as well as alleviating water shortages and the overuse of fertilizers. The field experiment had twelve treatments and a control (CK) trial, including two irrigation amounts (I1, 100% ETm; I2, 60% ETm; where ETm is the maximum evapotranspiration), two nitrogen applications (N1, 360 kg ha^(−1); N2, 120 kg ha^(−1)) and three biochar application levels (B1, 60 t ha^(−1); B2, 30 t ha^(−1) and B3, 0 t ha^(−1)). A multi-objective synergistic irrigation-nitrogen-biochar application system for improving tomato yield, quality, water and nitrogen use efficiency, and greenhouse gas emissions was developed by integrating experimentation and optimization. First, a coupled irrigation-nitrogen-biochar plot experiment was arranged. Then, tomato yield and fruit quality parameters were determined experimentally to establish the response relationships between irrigation-nitrogen-biochar dosage and yield, comprehensive quality of tomatoes (TCQ), irrigation water use efficiency (IWUE), partial factor productivity of nitrogen (PFPN), and net greenhouse gas emissions (NGE). Finally, a multi-objective dynamic optimization regulation model of irrigation-nitrogen-biochar resource allocation at different growth stages of tomato was constructed and solved by the fuzzy programming method. The results showed that applying irrigation and nitrogen together with biochar increased yield, IWUE and PFPN, while it had an inhibitory effect on NGE. In addition, the optimal allocation amounts of water and fertilizer differed between scenarios. The yield of the S1 scenario increased by 8.31% compared to the B1I1N2 treatment; TCQ of the S2 scenario increased by 5.14% compared to the B2I2N1 treatment; IWUE of the S3 scenario increased by 10.01% compared to the B1I2N2 treatment; PFPN of the S4 scenario increased by 9.35% compared to the B1I1N2 treatment; and NGE of the S5 scenario decreased by 11.23% compared to the B2I1N1 treatment. The optimization model showed that the coordination of multiple objectives considering yield, TCQ, IWUE, PFPN, and NGE increased on average from 4.44 to 69.02% compared to each treatment when the irrigation-nitrogen-biochar dosage was 205.18 mm, 186 kg ha^(−1) and 43.31 t ha^(−1), respectively. This study provides a guiding basis for the sustainable management of water and fertilizer in greenhouse tomato production under drip irrigation fertilization conditions.
Venancio-Filho et al. developed an elegant matrix formulation for dynamic analysis in the frequency domain (FD), but its convergence, causality and extended period need further refinement. In the present paper, it is argued that: (1) under reasonable assumptions (approximating the frequency response function by the discrete Fourier transform of the discretized unitary impulse response function), the matrix formulation by FD is equivalent to a circular convolution; (2) to avoid wraparound interference, the excitation vector and impulse response must be padded with enough zeros; (3) provided that the zero-padding requirement is satisfied, the convergence and accuracy of direct time domain analysis, which is equivalent to that by FD, are guaranteed by the numerical integration scheme; (4) the imaginary part of the computed response approaches zero because of the continuity of the impulse response functions.
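Points (1) and (2) above can be checked numerically: multiplying discrete Fourier transforms yields a circular convolution, and padding both sequences with enough zeros removes the wraparound so the result equals the ordinary (linear) convolution. The sketch below uses NumPy's FFT; the function names and the short excitation and impulse-response vectors are hypothetical values chosen for illustration, not data from the paper.

```python
import numpy as np

def circular_convolve(x, h):
    """Circular convolution of two equal-length sequences via the DFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def padded_fd_response(load, impulse_response):
    """Zero-pad both sequences to length N + M - 1, then convolve circularly.

    With this much padding the wrapped-around terms multiply zeros,
    so the result matches the linear convolution.
    """
    n = len(load) + len(impulse_response) - 1
    x = np.concatenate([load, np.zeros(n - len(load))])
    h = np.concatenate([impulse_response, np.zeros(n - len(impulse_response))])
    return circular_convolve(x, h)

# Hypothetical discretized excitation and unit impulse response.
load = np.array([1.0, 2.0, 0.5, -1.0])
h = np.array([0.5, 0.3, 0.1, 0.05])

linear = np.convolve(load, h)            # reference time-domain convolution
padded = padded_fd_response(load, h)     # FD result with zero padding
unpadded = circular_convolve(load, h)    # FD result without padding (wraps around)

print(np.allclose(padded, linear))
print(unpadded[0], linear[0])            # first samples differ due to wraparound
```

Without padding, the tail of the response is folded back onto the first samples, which is the wraparound interference the abstract refers to.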
To effectively detect private information that may be leaked through social networks and avoid unnecessary harm to users, this paper takes microblogs as the research object to study the detection of privacy disclosure in social networks. First, we perform fast privacy leak detection on the text about to be published, based on the fastText model. When the text to be published contains certain private information, we fully consider the aggregation effect of the private information leaked through different channels, and establish a convolutional neural network model based on multi-dimensional features (MF-CNN) to detect privacy disclosure comprehensively and accurately. The experimental results show that the proposed method has a higher accuracy of privacy disclosure detection and can meet the real-time requirements of detection.
Latent information is difficult to obtain from the text in speech synthesis. Studies show that features from speech can provide more information to help text encoding. In the field of speech encoding, much work has been conducted on two aspects. The first is to encode speech frame by frame. The second is to encode the whole speech into a vector. However, the scale in both aspects is fixed, so encoding speech with an adjustable scale to obtain more latent information is worthy of investigation. Current alignment approaches only support frame-by-frame encoding and speech-to-vector encoding, and it remains a challenge to propose a new alignment approach that supports adjustable-scale speech encoding. This paper presents a dynamic speech encoder with a new alignment approach in conjunction with frame-by-frame encoding and speech-to-vector encoding. The speech feature from our model achieves three functions. First, the speech feature can reconstruct the original speech while the length of the speech feature is equal to the text length. Second, our model can obtain text embedding from speech, and the encoded speech feature is similar to the text embedding result. Finally, it can transfer the style of synthesized speech and make it more similar to the given reference speech.
Low dynamic range (LDR) images captured by consumer cameras have a limited luminance range. As the conventional method for generating high dynamic range (HDR) images involves merging multiple-exposure LDR images of the same scene (assuming a stationary scene), we introduce a learning-based model for single-image HDR reconstruction. An input LDR image is sequentially segmented into local region maps based on the cumulative histogram of the input brightness distribution. Using the local region maps, SParam-Net estimates the parameters of an inverse tone mapping function to generate a pseudo-HDR image. We process the segmented region maps as the input sequences of a long short-term memory network. Finally, a fast super-resolution convolutional neural network is used for HDR image reconstruction. The proposed method was trained and tested on datasets including HDR-Real, LDR-HDR-pair, and HDR-Eye. The experimental results revealed that HDR images can be generated more reliably than with contemporary end-to-end approaches.
Deep neural network-based relation extraction research has made significant progress in recent years, and it provides data support for many downstream natural language processing tasks such as knowledge graph construction, sentiment analysis and question-answering systems. However, previous studies ignored much unused structural information in sentences that could enhance the performance of the relation extraction task. Moreover, most existing dependency-based models utilize self-attention to distinguish the importance of context, which can hardly deal with multiple-structure information. To efficiently leverage multiple-structure information, this paper proposes a dynamic structure attention mechanism model based on textual structure information, which deeply integrates word embedding, named entity recognition labels, part of speech, dependency tree and dependency type into a graph convolutional network. Specifically, our model extracts text features of different structures from the input sentence. The Textual Structure information Graph Convolutional Network employs the dynamic structure attention mechanism to learn multi-structure attention, effectively distinguishing important contextual features in various structural information. In addition, multi-structure weights are carefully designed as a merging mechanism across the different structure attentions to dynamically adjust the final attention. This paper combines these features and trains a graph convolutional network for relation extraction. We experiment on supervised relation extraction datasets including SemEval 2010 Task 8, TACRED, TACREV, and Re-TACRED, and the results significantly outperform previous methods.
In this work, a three-dimensional (3D) convolutional neural network (CNN) model based on image slices of various normal and pathological vocal folds is proposed for accurate and efficient prediction of glottal flows. The 3D CNN model is composed of a feature extraction block and a regression block. The feature extraction block is capable of learning low-dimensional features from the high-dimensional image data of the glottal shape, and the regression block is employed to flatten the output from the feature extraction block and obtain the desired glottal flow data. The input image data is the condensed set of 2D image slices captured in the axial plane of the 3D vocal folds, where these glottal shapes are synthesized based on the equations of normal vibration modes. The output flow data is the corresponding flow rate, averaged glottal pressure and nodal pressure distributions over the glottal surface. The 3D CNN model is built to establish the mapping between the input image data and output flow data. The ground-truth flow variables of each glottal shape in the training and test datasets are obtained by a high-fidelity sharp-interface immersed-boundary solver. The proposed model is trained to predict the flow variables of interest for glottal shapes in the test set. The present 3D CNN model is more efficient than traditional Computational Fluid Dynamics (CFD) models while retaining accuracy, and more powerful than previous data-driven prediction models because it provides more details of the glottal flow. The prediction performance of the trained 3D CNN model in accuracy and efficiency indicates that this model could be promising for future clinical applications.
Much attention has been given to the Internet of Things (IoT) by citizens, industries, governments, and universities for applications like smart buildings, environmental monitoring, health care and so on. With IoT, network connectivity is facilitated between smart devices from anyplace and at any time. IoT-based health monitoring systems are gaining popularity and acceptance for continuous monitoring and for detecting health abnormalities from the data collected. Electrocardiographic (ECG) signals are widely used for heart disease detection. A novel method is proposed in this work for ECG monitoring using IoT techniques. A two-stage approach is employed. In the first stage, a routing protocol based on Dynamic Source Routing (DSR) and Routing by Energy and Link quality (REL) for an IoT healthcare platform is proposed for efficient data collection; the second stage performs classification of ECG for arrhythmia. Furthermore, this work evaluates Support Vector Machine (SVM), Artificial Neural Network (ANN), and Convolution Neural Network (CNN)-based approaches for ECG signal classification. Deep-ECG uses a deep CNN to extract critical features and then compares them through simple and fast distance functions in order to obtain an efficient classification of heart abnormalities. For the identification of abnormal data, this work proposes techniques for the classification of ECG data obtained from mobile watch users. For experimental verification of the proposed methods, the Massachusetts Institute of Technology/Beth Israel Hospital (MIT/BIH) Arrhythmia Database was used for evaluation. Results confirm the presented method’s superior performance with regard to classification accuracy. The CNN achieved an accuracy of 91.92%, which is 4.98% higher than the SVM and 2.68% higher than the ANN.
Classic Graph Convolutional Networks (GCNs) often learn node representation holistically, which ignores the distinct impacts from different neighbors when aggregating their features to update a node’s representation. Disentangled GCNs have been proposed to divide each node’s representation into several feature units. However, current disentangling methods do not try to figure out how many inherent factors the model should assign to help extract the best representation of each node. This paper therefore proposes D^(2)-GCN to provide dynamic disentanglement in GCNs and present the most appropriate factorization of each node’s mixed features. The convergence of the proposed method is proved both theoretically and experimentally. Experiments on real-world datasets show that D^(2)-GCN outperforms the baseline models in node classification on both single- and multi-label tasks.
With the rapid advancement of virtual reality, dynamic gesture recognition technology has become an indispensable and critical technique for users to achieve human–computer interaction in virtual environments. The recognition of dynamic gestures is a challenging task due to the high degree of freedom and the influence of individual differences and changes in gesture space. To solve the problem of low recognition accuracy of existing networks, an improved dynamic gesture recognition algorithm based on the ResNeXt architecture is proposed. The algorithm employs three-dimensional convolution techniques to effectively capture the spatiotemporal features intrinsic to dynamic gestures. Additionally, to enhance the model’s focus and improve its accuracy in identifying dynamic gestures, a lightweight convolutional attention mechanism is introduced. This mechanism not only augments the model’s precision but also facilitates faster convergence during the training phase. To further optimize the performance of the model, a deep attention submodule is added to the convolutional attention mechanism module to strengthen the network’s capability in temporal feature extraction. Empirical evaluations on the EgoGesture and NvGesture datasets show that the accuracy of the proposed model in dynamic gesture recognition reaches 95.03% and 86.21%, respectively. When operating in RGB mode, the accuracy reached 93.49% and 80.22%, respectively. These results underscore the effectiveness of the proposed algorithm in recognizing dynamic gestures with high accuracy, showcasing its potential for applications in advanced human–computer interaction systems.
Abstract: Reliable traffic flow prediction is crucial for mitigating urban congestion. This paper proposes the Attention-based spatiotemporal Interactive Dynamic Graph Convolutional Network (AIDGCN), a novel architecture integrating an Interactive Dynamic Graph Convolution Network (IDGCN) with Temporal Multi-Head Trend-Aware Attention. Its core innovation lies in IDGCN, which splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs, and in a novel attention mechanism that incorporates convolutional operations to capture essential local traffic trends, addressing a critical gap in standard attention for continuous data. For 15- and 60-min forecasting on METR-LA, AIDGCN achieves MAEs of 0.75% and 0.39%, and RMSEs of 1.32% and 0.14%, respectively. In the 60-min long-term forecasting of the PEMS-BAY dataset, AIDGCN outperforms the MRA-BGCN method by 6.28%, 4.93%, and 7.17% in terms of MAE, RMSE, and MAPE, respectively. Experimental results demonstrate the superiority of the proposed model over state-of-the-art methods.
Abstract: The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model’s advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
Funding: The National Key Research and Development Program of China (No. 2021ZD0111902) and the National Natural Science Foundation of China (Nos. 62172022 and U21B2038).
Abstract: Due to self-occlusion and the high degree of freedom, estimating 3D hand pose from a single RGB image is a very challenging problem. Graph convolutional networks (GCNs) use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression. However, GCNs cannot effectively describe the relationships between non-adjacent hand joints. Recently, hypergraph convolutional networks (HGCNs) have received much attention because they can describe multi-dimensional relationships between nodes through hyperedges; therefore, this paper proposes a framework for 3D hand pose estimation based on HGCN, which can better extract correlated relationships between adjacent and non-adjacent hand joints. To overcome the shortcomings of predefined hypergraph structures, a dynamic hypergraph convolutional network is proposed, in which hyperedges are constructed dynamically based on hand joint feature similarity. To better explore the local semantic relationships between nodes, a semantic dynamic hypergraph convolution is proposed. The proposed method is evaluated on publicly available benchmark datasets. Both qualitative and quantitative experimental results show that the proposed HGCN and the improved methods for 3D hand pose estimation are better than GCN and achieve state-of-the-art performance compared with existing methods.
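To make the idea of building hyperedges from feature similarity concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes, purely for illustration, that each joint spawns one hyperedge containing its k nearest joints in feature space, and it uses the widely cited normalized hypergraph convolution X' = Dv^(-1/2) H De^(-1) H^T Dv^(-1/2) X Θ with unit hyperedge weights. The function names `knn_incidence` and `hypergraph_conv`, the 21-joint hand, and all dimensions are hypothetical choices; the abstract does not state which similarity rule or propagation formula the paper uses.

```python
import numpy as np

def knn_incidence(x, k):
    """Build an incidence matrix H (nodes x hyperedges) from features.

    Assumption for illustration: hyperedge e contains the k nodes whose
    features are closest (Euclidean) to node e, including node e itself.
    """
    n = x.shape[0]
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    members = np.argsort(dist, axis=1)[:, :k]      # (n, k) member indices
    H = np.zeros((n, n))
    H[members, np.arange(n)[:, None]] = 1.0        # H[node, hyperedge] = 1
    return H

def hypergraph_conv(x, H, theta):
    """One normalized hypergraph convolution with unit hyperedge weights."""
    dv = H.sum(axis=1)                             # node degrees
    de = H.sum(axis=0)                             # hyperedge degrees
    dv_inv_sqrt = 1.0 / np.sqrt(dv)
    left = (dv_inv_sqrt[:, None] * H) / de[None, :]    # Dv^-1/2 H De^-1
    right = H.T * dv_inv_sqrt[None, :]                 # H^T Dv^-1/2
    return (left @ right) @ x @ theta

rng = np.random.default_rng(0)
joints = rng.normal(size=(21, 16))     # hypothetical: 21 joints, 16-d features
theta = rng.normal(size=(16, 8))       # hypothetical learnable projection
H = knn_incidence(joints, k=4)
out = hypergraph_conv(joints, H, theta)
```

Because `H` is recomputed from the current features, the hypergraph changes whenever the features change, which is the sense in which such a structure is dynamic rather than predefined.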
Funding: Supported by the Key R&D Program of Gansu Province (No. 23YFGA0063), the Key Talent Project of Gansu Province (Nos. 2024RCXM57 and 2024RCXM22), and the Major Science and Technology Special Program of Gansu Province (No. 25ZYJA037).
Abstract: Traffic flow prediction is a crucial element of intelligent transportation systems. However, accurate traffic flow prediction is quite challenging because of its highly nonlinear, complex, and dynamic characteristics. To address the difficulties in simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow, as well as the high time complexity of existing models, a multi-head flow attention-based local-global dynamic hypergraph convolution (MFA-LGDHC) prediction model is proposed, which consists of a multi-head flow attention (MHFA) mechanism, a graph convolution network (GCN), and local-global dynamic hypergraph convolution (LGHC). MHFA is utilized to extract the time dependency of traffic flow and reduce the time complexity of the model. GCN is employed to capture the spatial dependency of traffic flow. LGHC utilizes down-sampling convolution and isometric convolution to capture the local and global spatial dependencies of traffic flow, and dynamic hypergraph convolution is used to model the dynamic higher-order relationships of the traffic road network. Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance.
Funding: Supported by National Natural Science Foundation of China, No. 91959118; Science and Technology Program of Guangzhou, China, No. 201704020016; SKY Radiology Department International Medical Research Foundation of China, No. Z-2014-07-1912-15; Clinical Research Foundation of the 3rd Affiliated Hospital of Sun Yat-Sen University, No. YHJH201901.
Abstract: BACKGROUND The accurate classification of focal liver lesions (FLLs) is essential to properly guide treatment options and predict prognosis. Dynamic contrast-enhanced computed tomography (DCE-CT) is still the cornerstone in the exact classification of FLLs due to its noninvasive nature, high scanning speed, and high-density resolution. Since their recent development, convolutional neural network-based deep learning techniques have been recognized to have high potential for image recognition tasks. AIM To develop and evaluate an automated multiphase convolutional dense network (MP-CDN) to classify FLLs on multiphase CT. METHODS A total of 517 FLLs scanned on a 320-detector CT scanner using a four-phase DCE-CT imaging protocol (including precontrast phase, arterial phase, portal venous phase, and delayed phase) from 2012 to 2017 were retrospectively enrolled. FLLs were classified into four categories: category A, hepatocellular carcinoma (HCC); category B, liver metastases; category C, benign non-inflammatory FLLs including hemangiomas, focal nodular hyperplasias and adenomas; and category D, hepatic abscesses. Each category was split into a training set and a test set in an approximate 8:2 ratio. An MP-CDN classifier with a sequential input of the four-phase CT images was developed to automatically classify FLLs. The classification performance of the model was evaluated on the test set; the accuracy and specificity were calculated from the confusion matrix, and the area under the receiver operating characteristic curve (AUC) was calculated from the SoftMax probability output by the last layer of the MP-CDN. RESULTS A total of 410 FLLs were used for training and 107 FLLs were used for testing. The mean classification accuracy on the test set was 81.3% (87/107). The accuracy/specificity of distinguishing each category from the others were 0.916/0.964, 0.925/0.905, 0.860/0.918, and 0.925/0.963 for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. The AUC (95% confidence interval) for differentiating each category from the others was 0.92 (0.837-0.992), 0.99 (0.967-1.00), 0.88 (0.795-0.955) and 0.96 (0.914-0.996) for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. CONCLUSION MP-CDN accurately classified FLLs detected on four-phase CT as HCC, metastases, benign non-inflammatory FLLs and hepatic abscesses and may assist radiologists in identifying the different types of FLLs.
Funding: Supported in part by the Key Research and Development Program of Shaanxi Province of China, No. 2024GX-YBXM-149, and in part by the National Natural Science Foundation of China, No. 62071381.
Abstract: Pneumonia is a serious disease that can be fatal, particularly among children and the elderly. The accuracy of pneumonia diagnosis can be improved by combining artificial-intelligence technology with X-ray imaging. This study proposes X-ODFCANet, which addresses the issues of low accuracy and excessive parameters in existing deep-learning-based pneumonia-classification methods. This network incorporates a feature coordination attention module and an omni-dimensional dynamic convolution (ODConv) module, leveraging the residual module for feature extraction from X-ray images. The feature coordination attention module utilizes two one-dimensional feature encoding processes to aggregate feature information from different spatial directions. Additionally, the ODConv module extracts and fuses feature information in four dimensions: the spatial dimension of the convolution kernel, the input and output channel quantities, and the convolution kernel quantity. The experimental results demonstrate that the proposed method can effectively improve the accuracy of pneumonia classification, which is 3.77% higher than that of ResNet18. The model has 4.45M parameters, a reduction by a factor of approximately 2.5. The code is available at https://github.com/limuni/X ODFCA NET.
Funding: Supported in part by the National Natural Science Foundation of China (No. 61801214) and the Postgraduate Research Practice Innovation Program of NUAA (No. xcxjh20231504).
Abstract: Hyperspectral image (HSI) classification is crucial for numerous remote sensing applications. Traditional deep learning methods may miss pixel relationships and context, leading to inefficiencies. This paper introduces the spectral band graph convolutional and attention-enhanced CNN joint network (SGCCN), a novel approach that uses spectral band graph convolutions to capture long-range relationships, uses the local perception of attention-enhanced multi-level convolutions to extract local spatial features, and employs a dynamic attention mechanism to enhance feature extraction. The SGCCN integrates spectral and spatial features through a self-attention fusion network, significantly improving classification accuracy and efficiency. The proposed method outperforms existing techniques, demonstrating its effectiveness in handling the challenges associated with HSI data.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 52301320) and the Natural Science Foundation of Fujian Province (No. 2023J01790).
Abstract: Traditional Computational Fluid Dynamics (CFD) simulations are computationally expensive when applied to complex fluid–structure interaction problems and often struggle to capture the essential flow features governing vortex-induced vibrations (VIV) of floating structures. To overcome these limitations, this study develops a hybrid framework that integrates high-fidelity CFD modeling with deep learning techniques to enhance the accuracy and efficiency of VIV response prediction. First, an unstructured finite-volume fluid–structure coupling model is established to generate high-resolution flow field data and extract multi-component time-series feature tensors. These tensors serve as inputs to a Squeeze-and-Excitation Convolutional Neural Network (SE-CNN), which models the nonlinear coupling between flow disturbances and structural responses. The SE-CNN architecture incorporates an attention-based weighting mechanism through an embedded Squeeze-and-Excitation module, dynamically optimizing channel feature importance and improving sensitivity to critical flow characteristics. During training, multidimensional inputs, including pressure, velocity gradient, and displacement sequences, are used to capture the full complexity of fluid–structure interactions. Results demonstrate that the proposed method achieves a maximum amplitude prediction error of only 2.9% and a main frequency deviation below 0.03 Hz, outperforming conventional CNN models by reducing amplitude prediction error from 3.2% to 1.9%. The approach is validated using a representative semi-submersible platform, confirming its robustness across varying damping conditions and flow velocities.
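As a rough illustration of the channel re-weighting that a Squeeze-and-Excitation module performs, the sketch below implements the generic SE operation in NumPy: global average pooling per channel (squeeze), a two-layer bottleneck with ReLU and sigmoid (excitation), and channel-wise rescaling. It is not the paper's network. The function name `se_block`, the three example channels (standing in for pressure, velocity gradient, and displacement sequences), the bottleneck width, and the random weights are all assumptions made for the example.

```python
import numpy as np

def se_block(x, w1, w2):
    """Generic Squeeze-and-Excitation on a (channels, length) feature map.

    w1: (bottleneck, channels) and w2: (channels, bottleneck) are the
    weights of the two fully connected layers (biases omitted for brevity).
    Returns the re-weighted features and the per-channel gates.
    """
    z = x.mean(axis=1)                         # squeeze: one scalar per channel
    s = np.maximum(w1 @ z, 0.0)                # excitation, layer 1 + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))     # excitation, layer 2 + sigmoid
    return x * gate[:, None], gate             # scale each channel by its gate

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 64))           # hypothetical: 3 channels, 64 time steps
w1 = 0.1 * rng.normal(size=(2, 3))     # hypothetical bottleneck of width 2
w2 = 0.1 * rng.normal(size=(3, 2))
y, gate = se_block(x, w1, w2)
```

In a trained network `w1` and `w2` are learned, so the gates come to emphasize the channels that matter most for the prediction target; here they are random and only demonstrate the data flow.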
Funding: Supported by the National Natural Science Foundation of China (50830201) and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
Abstract: The impact dynamics of flexible solids is important in engineering practice. Obtaining the dynamic response is a challenging task and is usually achieved by numerical methods. The objectives of the study are twofold. Firstly, the discrete singular convolution (DSC) is used for the first time to analyze impact dynamics. Secondly, the efficiency of various numerical methods for dynamic analysis is explored via an example of a flexible rod hit by a rigid ball. Three numerical methods, namely the conventional finite element (FE) method, the DSC algorithm, and the spectral finite element (SFE) method, and one proposed modeling strategy, the improved spectral finite element (ISFE) method, are involved. Numerical results are compared with the known analytical solutions to show their efficiency. It is demonstrated that the proposed ISFE modeling strategy with a proper length of conventional FE yields the most accurate contact stress among the four investigated models. It is also found that the DSC algorithm is an alternative method for collision problems.
Funding: National Natural Science Foundation of China under Grant Nos. 51978543, 52108444, and 51778343; Plan of Outstanding Young and Middle-aged Scientific and Technological Innovation Team in the Universities of Hubei Province with Project No. T2020010; Natural Science Foundation of Hebei Province under Grant No. E2021512001.
Abstract: When evaluating the seismic safety and reliability of complex engineering structures, a critical problem is how to reasonably consider the randomness and multi-dimensional nature of ground motions. To this end, a modeling strategy for multi-dimensional stochastic earthquakes is proposed in this study. This improved seismic model has several merits that enable it to better support seismic analyses of structures. Firstly, the ground motion model is compatible with the design response spectrum. Secondly, the evolutionary power spectrum involved in the model and the design response spectrum are constructed with sufficient consideration of the correlation between different seismic components. Thirdly, the random function-based dimension-reduction representation is applied, by which the seismic model is established with three elementary random variables. Numerical simulations of multi-dimensional stochastic ground motions in a specific design scenario indicate the effectiveness of the proposed modeling strategy. Moreover, the multi-dimensional seismic response and the global reliability of a high-rise frame-core tube structure are discussed in detail to further illustrate the engineering applicability of the proposed method. The analytical investigations demonstrate that the suggested stochastic model of multi-dimensional ground motion can be used for accurate seismic response analysis and dynamic reliability assessment of complex engineering structures in performance-based seismic design.
Funding: Supported by the National Natural Science Foundation of China (52222902 and 52079029).
Abstract: Integrated water and fertilizer management is important for promoting sustainable development of facility agriculture, and biochar plays an important role in guaranteeing food production, as well as alleviating water shortages and the overuse of fertilizers. The field experiment had twelve treatments and a control (CK) trial, including two irrigation amounts (I1, 100% ETm; I2, 60% ETm; where ETm is the maximum evapotranspiration), two nitrogen applications (N1, 360 kg ha^(−1); N2, 120 kg ha^(−1)) and three biochar application levels (B1, 60 t ha^(−1); B2, 30 t ha^(−1); and B3, 0 t ha^(−1)). A multi-objective synergistic irrigation-nitrogen-biochar application system for improving tomato yield, quality, water and nitrogen use efficiency, and greenhouse emissions was developed by integrating the techniques of experimentation and optimization. First, a coupled irrigation-nitrogen-biochar plot experiment was arranged. Then, tomato yield and fruit quality parameters were determined experimentally to establish the response relationships between irrigation-nitrogen-biochar dosage and yield, comprehensive quality of tomatoes (TCQ), irrigation water use efficiency (IWUE), partial factor productivity of nitrogen (PFPN), and net greenhouse gas emissions (NGE). Finally, a multi-objective dynamic optimization regulation model of irrigation-nitrogen-biochar resource allocation at different growth stages of tomato was constructed and solved by the fuzzy programming method. The results showed that the application of irrigation and nitrogen to biochar promoted increases in yield, IWUE and PFPN, while it had an inhibitory effect on NGE. In addition, the optimal allocation amounts of water and fertilizer were different under different scenarios. The yield of the S1 scenario increased by 8.31% compared to the B1I1N2 treatment; TCQ of the S2 scenario increased by 5.14% compared to the B2I2N1 treatment; IWUE of the S3 scenario increased by 10.01% compared to the B1I2N2 treatment; PFPN of the S4 scenario increased by 9.35% compared to the B1I1N2 treatment; and NGE of the S5 scenario decreased by 11.23% compared to the B2I1N1 treatment. The optimization model showed that the coordination of multiple objectives considering yield, TCQ, IWUE, PFPN, and NGE increased on average by 4.44 to 69.02% compared to each treatment when the irrigation-nitrogen-biochar dosage was 205.18 mm, 186 kg ha^(−1) and 43.31 t ha^(−1), respectively. This study provides a guiding basis for the sustainable management of water and fertilizer in greenhouse tomato production under drip irrigation fertilization conditions.
Abstract: Venancio-Filho et al. developed an elegant matrix formulation for dynamic analysis in the frequency domain (FD), but its convergence, causality and extended period need further refinement. In the present paper, it is argued that: (1) under reasonable assumptions (approximating the frequency response function by the discrete Fourier transform of the discretized unit impulse response function), the matrix formulation by FD is equivalent to a circular convolution; (2) to avoid wraparound interference, the excitation vector and impulse response must be padded with enough zeros; (3) provided that the zero-padding requirement is satisfied, the convergence and accuracy of direct time domain analysis, which is equivalent to that by FD, are guaranteed by the numerical integration scheme; (4) the imaginary part of the computed response approaches zero because of the continuity of the impulse response functions.
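Points (1) and (2) can be checked numerically with a short NumPy sketch. It is an illustration under stated assumptions, not the paper's matrix formulation: the damped single-degree-of-freedom oscillator, its parameters, the sine excitation, and the helper name `fd_response` are all hypothetical. The sketch multiplies the DFTs of the sampled impulse response and excitation, with and without zero padding, and compares both against the direct (linear) time-domain convolution.

```python
import numpy as np

def fd_response(h, p, pad=True):
    """Response via the frequency domain: IDFT of DFT(h) * DFT(p).

    With pad=True both sequences are zero-padded to len(h) + len(p) - 1,
    so the circular convolution implied by the DFT equals the linear one.
    """
    n = len(h) + len(p) - 1 if pad else max(len(h), len(p))
    H = np.fft.fft(h, n)          # np.fft.fft zero-pads its input to length n
    P = np.fft.fft(p, n)
    return np.fft.ifft(H * P)

dt = 0.01
t = np.arange(0.0, 2.0, dt)
wn, zeta = 2.0 * np.pi * 2.0, 0.05            # hypothetical oscillator
wd = wn * np.sqrt(1.0 - zeta**2)
h = np.exp(-zeta * wn * t) * np.sin(wd * t) / wd * dt   # sampled impulse response
p = np.sin(2.0 * np.pi * 1.5 * t)                       # hypothetical excitation

direct = np.convolve(h, p)                    # linear time-domain convolution
padded = fd_response(h, p, pad=True)
unpadded = fd_response(h, p, pad=False)

# Without padding, the tail of the linear result wraps around onto the start.
wrapped = direct[:len(t)] + np.append(direct[len(t):], 0.0)
print(np.allclose(padded.real, direct))       # padded FD matches time domain
print(np.allclose(unpadded.real, wrapped))    # unpadded FD shows wraparound
```

The impulse response here has not decayed to zero within the record, so the unpadded result is visibly contaminated, which is the wraparound interference that the zero-padding requirement removes.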
Funding: This work was supported by the National Natural Science Foundation of China (No. 61672101), the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDDXN004), and the Key Lab of Information Network Security, Ministry of Public Security, China (No. C18601).
Abstract: To effectively detect private information that may be leaked through social networks and avoid unnecessary harm to users, this paper takes microblogs as the research object to study the detection of privacy disclosure in social networks. First, we perform fast privacy leak detection on the currently published text based on the fastText model. When the text to be published contains certain private information, we fully consider the aggregation effect of the private information leaked through different channels and establish a convolutional neural network model based on multi-dimensional features (MF-CNN) to detect privacy disclosure comprehensively and accurately. The experimental results show that the proposed method achieves higher accuracy in privacy disclosure detection and can meet the real-time requirements of detection.
Funding: Supported by the National Key R&D Program of China (2020AAA0107901).
Abstract: Latent information is difficult to obtain from the text in speech synthesis. Studies show that features from speech can provide more information to help text encoding. In the field of speech encoding, much work has been conducted on two aspects: the first is to encode speech frame by frame, and the second is to encode the whole speech into a vector. However, the scale in both aspects is fixed, so encoding speech with an adjustable scale to obtain more latent information is worthy of investigation. Current alignment approaches only support frame-by-frame encoding and speech-to-vector encoding, and it remains a challenge to propose a new alignment approach that supports adjustable-scale speech encoding. This paper presents a dynamic speech encoder with a new alignment approach in conjunction with frame-by-frame encoding and speech-to-vector encoding. The speech feature from our model achieves three functions. First, the speech feature can reconstruct the original speech while the length of the speech feature is equal to the text length. Second, our model can obtain text embedding from speech, and the encoded speech feature is similar to the text embedding result. Finally, it can transfer the style of synthesized speech and make it more similar to the given reference speech.
Funding: This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07049932).
Abstract: Low dynamic range (LDR) images captured by consumer cameras have a limited luminance range. As the conventional method for generating high dynamic range (HDR) images involves merging multiple-exposure LDR images of the same scene (assuming a stationary scene), we introduce a learning-based model for single-image HDR reconstruction. An input LDR image is sequentially segmented into local region maps based on the cumulative histogram of the input brightness distribution. Using the local region maps, SParam-Net estimates the parameters of an inverse tone mapping function to generate a pseudo-HDR image. We process the segmented region maps as input sequences to a long short-term memory network. Finally, a fast super-resolution convolutional neural network is used for HDR image reconstruction. The proposed method was trained and tested on datasets including HDR-Real, LDR-HDR-pair, and HDR-Eye. The experimental results revealed that HDR images can be generated more reliably than with contemporary end-to-end approaches.
Abstract: Deep neural network-based relation extraction research has made significant progress in recent years, and it provides data support for many downstream natural language processing tasks such as building knowledge graphs, sentiment analysis and question-answering systems. However, previous studies ignored much unused structural information in sentences that could enhance the performance of the relation extraction task. Moreover, most existing dependency-based models utilize self-attention to distinguish the importance of context, which can hardly deal with multiple-structure information. To efficiently leverage multiple structure information, this paper proposes a dynamic structure attention mechanism model based on textual structure information, which deeply integrates word embedding, named entity recognition labels, part of speech, dependency tree and dependency type into a graph convolutional network. Specifically, our model extracts text features of different structures from the input sentence. The Textual Structure information Graph Convolutional Network employs the dynamic structure attention mechanism to learn multi-structure attention, effectively distinguishing important contextual features in various structural information. In addition, multi-structure weights are carefully designed as a merging mechanism in the different structure attention to dynamically adjust the final attention. This paper combines these features and trains a graph convolutional network for relation extraction. We experiment on supervised relation extraction datasets including SemEval 2010 Task 8, TACRED, TACREV, and Re-TACRED, and the results significantly outperform those of previous methods.
Funding: Supported by the Open Project of Key Laboratory of Computational Aerodynamics, AVIC Aerodynamics Research Institute (Grant No. YL2022XFX0409).
Abstract: In this work, a three-dimensional (3D) convolutional neural network (CNN) model based on image slices of various normal and pathological vocal folds is proposed for accurate and efficient prediction of glottal flows. The 3D CNN model is composed of a feature extraction block and a regression block. The feature extraction block is capable of learning low-dimensional features from the high-dimensional image data of the glottal shape, and the regression block is employed to flatten the output from the feature extraction block and obtain the desired glottal flow data. The input image data is the condensed set of 2D image slices captured in the axial plane of the 3D vocal folds, where these glottal shapes are synthesized based on the equations of normal vibration modes. The output flow data is the corresponding flow rate, averaged glottal pressure and nodal pressure distributions over the glottal surface. The 3D CNN model is built to establish the mapping between the input image data and output flow data. The ground-truth flow variables of each glottal shape in the training and test datasets are obtained by a high-fidelity sharp-interface immersed-boundary solver. The proposed model is trained to predict the concerned flow variables for glottal shapes in the test set. The present 3D CNN model is more efficient than traditional Computational Fluid Dynamics (CFD) models while the accuracy can still be retained, and more powerful than previous data-driven prediction models because more details of the glottal flow can be provided. The prediction performance of the trained 3D CNN model in accuracy and efficiency indicates that this model could be promising for future clinical applications.
Abstract: Much attention has been given to the Internet of Things (IoT) by citizens, industries, governments, and universities for applications such as smart buildings, environmental monitoring, and health care. With IoT, network connectivity is facilitated between smart devices from anyplace and at any time. IoT-based health monitoring systems are gaining popularity and acceptance for continuous monitoring and detection of health abnormalities from the collected data. Electrocardiographic (ECG) signals are widely used for heart disease detection. A novel method is proposed in this work for ECG monitoring using IoT techniques. A two-stage approach is employed. In the first stage, a routing protocol based on Dynamic Source Routing (DSR) and Routing by Energy and Link quality (REL) is proposed for efficient data collection on the IoT healthcare platform; in the second stage, ECG signals are classified for arrhythmia. Furthermore, this work evaluates Support Vector Machine (SVM), Artificial Neural Network (ANN), and Convolutional Neural Network (CNN)-based approaches for ECG signal classification. Deep-ECG uses a deep CNN to extract critical features and then compares them through simple and fast distance functions in order to obtain an efficient classification of heart abnormalities. For the identification of abnormal data, this work proposes techniques for the classification of ECG data obtained from mobile watch users. For experimental verification of the proposed methods, the Massachusetts Institute of Technology–Beth Israel Hospital (MIT-BIH) Arrhythmia Database was used. Results confirm the presented method’s superior classification accuracy. The CNN achieved an accuracy of 91.92%, which is 4.98% higher than the SVM and 2.68% higher than the ANN.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62141214 and 62272171).
Abstract: Classic Graph Convolutional Networks (GCNs) often learn node representations holistically, which ignores the distinct impacts of different neighbors when aggregating their features to update a node’s representation. Disentangled GCNs have been proposed to divide each node’s representation into several feature units. However, current disentangling methods do not try to determine how many inherent factors the model should assign to extract the best representation of each node. This paper therefore proposes D^(2)-GCN to provide dynamic disentanglement in GCNs and present the most appropriate factorization of each node’s mixed features. The convergence of the proposed method is proved both theoretically and experimentally. Experiments on real-world datasets show that D^(2)-GCN outperforms the baseline models on node classification in both single- and multi-label tasks.
Abstract: With the rapid advancement of virtual reality, dynamic gesture recognition technology has become an indispensable and critical technique for users to achieve human–computer interaction in virtual environments. The recognition of dynamic gestures is a challenging task due to the high degree of freedom and the influence of individual differences and the change of gesture space. To solve the problem of low recognition accuracy of existing networks, an improved dynamic gesture recognition algorithm based on the ResNeXt architecture is proposed. The algorithm employs three-dimensional convolution techniques to effectively capture the spatiotemporal features intrinsic to dynamic gestures. Additionally, to enhance the model’s focus and improve its accuracy in identifying dynamic gestures, a lightweight convolutional attention mechanism is introduced. This mechanism not only augments the model’s precision but also facilitates faster convergence during the training phase. In order to further optimize the performance of the model, a deep attention submodule is added to the convolutional attention mechanism module to strengthen the network’s capability in temporal feature extraction. Empirical evaluations on the EgoGesture and NvGesture datasets show that the accuracy of the proposed model in dynamic gesture recognition reaches 95.03% and 86.21%, respectively. When operating in RGB mode, the accuracy reached 93.49% and 80.22%, respectively. These results underscore the effectiveness of the proposed algorithm in recognizing dynamic gestures with high accuracy, showcasing its potential for applications in advanced human–computer interaction systems.