Reliable traffic flow prediction is crucial for mitigating urban congestion.This paper proposes Attentionbased spatiotemporal Interactive Dynamic Graph Convolutional Network(AIDGCN),a novel architecture integrating In...Reliable traffic flow prediction is crucial for mitigating urban congestion.This paper proposes Attentionbased spatiotemporal Interactive Dynamic Graph Convolutional Network(AIDGCN),a novel architecture integrating Interactive Dynamic Graph Convolution Network(IDGCN)with Temporal Multi-Head Trend-Aware Attention.Its core innovation lies in IDGCN,which uniquely splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs,and a novel attention mechanism incorporating convolutional operations to capture essential local traffic trends—addressing a critical gap in standard attention for continuous data.For 15-and 60-min forecasting on METR-LA,AIDGCN achieves MAEs of 0.75%and 0.39%,and RMSEs of 1.32%and 0.14%,respectively.In the 60-min long-term forecasting of the PEMS-BAY dataset,the AIDGCN out-performs the MRA-BGCN method by 6.28%,4.93%,and 7.17%in terms of MAE,RMSE,and MAPE,respectively.Experimental results demonstrate the superiority of our pro-posed model over state-of-the-art methods.展开更多
With the rapid development of the Artificial Intelligence of Things(AIoT),convolutional neural networks(CNNs)have demonstrated potential and remarkable performance in AIoT applications due to their excellent performan...With the rapid development of the Artificial Intelligence of Things(AIoT),convolutional neural networks(CNNs)have demonstrated potential and remarkable performance in AIoT applications due to their excellent performance in various inference tasks.However,the users have concerns about privacy leakage for the use of AI and the performance and efficiency of computing on resource-constrained IoT edge devices.Therefore,this paper proposes an efficient privacy-preserving CNN framework(i.e.,EPPA)based on the Fully Homomorphic Encryption(FHE)scheme for AIoT application scenarios.In the plaintext domain,we verify schemes with different activation structures to determine the actual activation functions applicable to the corresponding ciphertext domain.Within the encryption domain,we integrate batch normalization(BN)into the convolutional layers to simplify the computation process.For nonlinear activation functions,we use composite polynomials for approximate calculation.Regarding the noise accumulation caused by homomorphic multiplication operations,we realize the refreshment of ciphertext noise through minimal“decryption-encryption”interactions,instead of adopting bootstrapping operations.Additionally,in practical implementation,we convert three-dimensional convolution into two-dimensional convolution to reduce the amount of computation in the encryption domain.Finally,we conduct extensive experiments on four IoT datasets,different CNN architectures,and two platforms with different resource configurations to evaluate the performance of EPPA in detail.展开更多
With the increasing complexity of industrial automation,planetary gearboxes play a vital role in largescale equipment transmission systems,directly impacting operational efficiency and safety.Traditional maintenance s...With the increasing complexity of industrial automation,planetary gearboxes play a vital role in largescale equipment transmission systems,directly impacting operational efficiency and safety.Traditional maintenance strategies often struggle to accurately predict the degradation process of equipment,leading to excessive maintenance costs or potential failure risks.However,existing prediction methods based on statistical models are difficult to adapt to nonlinear degradation processes.To address these challenges,this study proposes a novel condition-based maintenance framework for planetary gearboxes.A comprehensive full-lifecycle degradation experiment was conducted to collect raw vibration signals,which were then processed using a temporal convolutional network autoencoder with multi-scale perception capability to extract deep temporal degradation features,enabling the collaborative extraction of longperiod meshing frequencies and short-term impact features from the vibration signals.Kernel principal component analysis was employed to fuse and normalize these features,enhancing the characterization of degradation progression.A nonlinear Wiener process was used to model the degradation trajectory,with a threshold decay function introduced to dynamically adjust maintenance strategies,and model parameters optimized through maximum likelihood estimation.Meanwhile,the maintenance strategy was optimized to minimize costs per unit time,determining the optimal maintenance timing and preventive maintenance threshold.The comprehensive indicator of degradation trends extracted by this method reaches 0.756,which is 41.2%higher than that of traditional time-domain features;the dynamic threshold strategy reduces the maintenance cost per unit time to 55.56,which is 8.9%better than that of the static threshold optimization.Experimental results demonstrate significant reductions in maintenance costs while enhancing system reliability and safety.This study realizes the organic integration of deep learning and reliability theory in the maintenance of planetary gearboxes,provides an interpretable solution for the predictive maintenance of complex mechanical systems,and promotes the development of condition-based maintenance strategies for planetary gearboxes.展开更多
Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using d...Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.展开更多
Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained promine...Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained prominence as a central focus of research in the field of fault diagnosis by strong fault feature extraction ability and end-to-end fault diagnosis efficiency.Recently,utilizing the respective advantages of convolution neural network(CNN)and Transformer in local and global feature extraction,research on cooperating the two have demonstrated promise in the field of fault diagnosis.However,the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the cooperative model.This complexity results in high computational costs and limited industrial applicability.To tackle the above challenges,this paper proposes a lightweight CNN-Transformer named as SEFormer for rotating machinery fault diagnosis.First,a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals.Then,an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective.Finally,experimental results on the planetary gearbox dataset and themotor roller bearing dataset prove that the proposed framework can balance the advantages of robustness,generalization and lightweight compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer.This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.展开更多
In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accurac...In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accuracy significantly declines when the data is occluded.To enhance the accuracy of gait emotion recognition under occlusion,this paper proposes a Multi-scale Suppression Graph ConvolutionalNetwork(MS-GCN).TheMS-GCN consists of three main components:Joint Interpolation Module(JI Moudle),Multi-scale Temporal Convolution Network(MS-TCN),and Suppression Graph Convolutional Network(SGCN).The JI Module completes the spatially occluded skeletal joints using the(K-Nearest Neighbors)KNN interpolation method.The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait,compensating for the temporal occlusion of gait information.The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features,thereby reducing the negative impact of occlusion on emotion recognition results.The proposed method is evaluated on two comprehensive datasets:Emotion-Gait,containing 4227 real gaits from sources like BML,ICT-Pollick,and ELMD,and 1000 synthetic gaits generated using STEP-Gen technology,and ELMB,consisting of 3924 gaits,with 1835 labeled with emotions such as“Happy,”“Sad,”“Angry,”and“Neutral.”On the standard datasets Emotion-Gait and ELMB,the proposed method achieved accuracies of 0.900 and 0.896,respectively,attaining performance comparable to other state-ofthe-artmethods.Furthermore,on occlusion datasets,the proposedmethod significantly mitigates the performance degradation caused by occlusion compared to other methods,the accuracy is significantly higher than that of other methods.展开更多
Severe ground-level ozone(O_(3))pollution over major Chinese cities has become one of the most challenging problems,which have deleterious effects on human health and the sustainability of society.This study explored ...Severe ground-level ozone(O_(3))pollution over major Chinese cities has become one of the most challenging problems,which have deleterious effects on human health and the sustainability of society.This study explored the spatiotemporal distribution characteristics of ground-level O_(3) and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021.Then,a high-performance convolutional neural network(CNN)model was established by expanding the moment and the concentration variations to general factors.Finally,the response mechanism of O_(3) to the variation with crucial influencing factors is explored by controlling variables and interpolating target variables.The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the northern and lower in the southern.When the wind direction(WD)ranges from east to southwest and the wind speed(WS)ranges between 2 and 3 m/sec,higher O_(3) concentration prone to occur.At different temperatures(T),the O_(3) concentration showed a trend of first increasing and subsequently decreasing with increasing NO_(2) concentration,peaks at the NO_(2) concentration around 0.02mg/m^(3).The sensitivity of NO_(2) to O_(3) formation is not easily affected by temperature,barometric pressure and dew point temperature.Additionally,there is a minimum IRNO_(2) at each temperature when the NO_(2) concentration is 0.03 mg/m^(3),and this minimum IRNO_(2) decreases with increasing temperature.The study explores the response mechanism of O_(3) with the change of driving variables,which can provide a scientific foundation and methodological support for the targeted management of O_(3) pollution.展开更多
The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential.The study employs a novel training set inco...The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential.The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model,enabling effective identification of small-scale strike-slip faults through seismic data interpretation.Based on the CNN faults,we analyze the distribution patterns of small-scale strike-slip faults.The small-scale strike-slip faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging 200–5000 m.The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member.The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics,separated by a low-brittleness layer.The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared to the upper member,while the low-brittleness layer exerts restrictive effects on vertical fault propagation.Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults demonstrate larger sizes than those controlled by individual faults.All fracture-vug system sizes show positive correlations with the vertical extents of associated small-scale strike-slip faults,particularly intersection and approaching fracture-vug systems exhibit accelerated size increases proportional to the vertical extents.展开更多
Convolutional neural network(CNN)with the encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability but it faces limitations in capturing the global featu...Convolutional neural network(CNN)with the encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability but it faces limitations in capturing the global feature.The transformer can extract the global information well but adapting it to small medical datasets is challenging and its computational complexity can be heavy.In this work,a serial and parallel network is proposed for the accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels.The core components of the proposed method include the cross window self-attention based transformer(CWST)and multi-scale local enhanced(MLE)modules.The CWST module enhances the global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows.The MLE module selectively fuses features by computing the voxel attention between different branch features,and uses convolution to strengthen the dense local information.The experiments on the prostate,atrium,and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as dice similarity coefficient,Intersection over Union,95%Hausdorff distance and average symmetric surface distance.展开更多
With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. The classification method of encrypted traffic based...With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. The classification method of encrypted traffic based on GNN can deal with encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on GCN, called Flow Mapping Graph (FMG). FMG establishes sequential edges between vertexes by the arrival order of packets and establishes jump-order edges between vertexes by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packet but also strengthens the relationship between the client or server packets. According to FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encryption stream classification problem is transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. By comparing the performance of TMC-GCN with other classical models in four public datasets, including CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp, the effectiveness of the FMG algorithm is verified. The experimental results show that the accuracy rate of the TMC-GCN model is 96.13%, the recall rate is 95.04%, and the F1 rate is 94.54%.展开更多
The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to u...The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.展开更多
Micro-expressions(ME)recognition is a complex task that requires advanced techniques to extract informative features fromfacial expressions.Numerous deep neural networks(DNNs)with convolutional structures have been pr...Micro-expressions(ME)recognition is a complex task that requires advanced techniques to extract informative features fromfacial expressions.Numerous deep neural networks(DNNs)with convolutional structures have been proposed.However,unlike DNNs,shallow convolutional neural networks often outperform deeper models in mitigating overfitting,particularly with small datasets.Still,many of these methods rely on a single feature for recognition,resulting in an insufficient ability to extract highly effective features.To address this limitation,in this paper,an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm(IDSSCNN-XgBoost)is introduced for ME Recognition.The proposed method utilizes a dual-stream architecture where motion vectors(temporal features)are extracted using Optical Flow TV-L1 and amplify subtle changes(spatial features)via EulerianVideoMagnification(EVM).These features are processed by IDSSCNN,with an attention mechanism applied to refine the extracted effective features.The outputs are then fused,concatenated,and classified using the XgBoost algorithm.This comprehensive approach significantly improves recognition accuracy by leveraging the strengths of both temporal and spatial information,supported by the robust classification power of XgBoost.The proposed method is evaluated on three publicly available ME databases named Chinese Academy of Sciences Micro-expression Database(CASMEII),Spontaneous Micro-Expression Database(SMICHS),and Spontaneous Actions and Micro-Movements(SAMM).Experimental results indicate that the proposed model can achieve outstanding results compared to recent models.The accuracy results are 79.01%,69.22%,and 68.99%on CASMEII,SMIC-HS,and SAMM,and the F1-score are 75.47%,68.91%,and 63.84%,respectively.The proposed method has the advantage of operational efficiency and less computational time.展开更多
Human disturbance activities is one of the main reasons for inducing geohazards.Ecological impact assessment metrics of roads are inconsistent criteria and multiple.From the perspective of visual observation,the envir...Human disturbance activities is one of the main reasons for inducing geohazards.Ecological impact assessment metrics of roads are inconsistent criteria and multiple.From the perspective of visual observation,the environment damage can be shown through detecting the uncovered area of vegetation in the images along road.To realize this,an end-to-end environment damage detection model based on convolutional neural network is proposed.A 50-layer residual network is used to extract feature map.The initial parameters are optimized by transfer learning.An example is shown by this method.The dataset including cliff and landslide damage are collected by us along road in Shennongjia national forest park.Results show 0.4703 average precision(AP)rating for cliff damage and 0.4809 average precision(AP)rating for landslide damage.Compared with YOLOv3,our model shows a better accuracy in cliff and landslide detection although a certain amount of speed is sacrificed.展开更多
Landslide susceptibility mapping(LSM)plays a crucial role in assessing geological risks.The current LSM techniques face a significant challenge in achieving accurate results due to uncertainties associated with region...Landslide susceptibility mapping(LSM)plays a crucial role in assessing geological risks.The current LSM techniques face a significant challenge in achieving accurate results due to uncertainties associated with regional-scale geotechnical parameters.To explore rainfall-induced LSM,this study proposes a hybrid model that combines the physically-based probabilistic model(PPM)with convolutional neural network(CNN).The PPM is capable of effectively capturing the spatial distribution of landslides by incorporating the probability of failure(POF)considering the slope stability mechanism under rainfall conditions.This significantly characterizes the variation of POF caused by parameter uncertainties.CNN was used as a binary classifier to capture the spatial and channel correlation between landslide conditioning factors and the probability of landslide occurrence.OpenCV image enhancement technique was utilized to extract non-landslide points based on the POF of landslides.The proposed model comprehensively considers physical mechanics when selecting non-landslide samples,effectively filtering out samples that do not adhere to physical principles and reduce the risk of overfitting.The results indicate that the proposed PPM-CNN hybrid model presents a higher prediction accuracy,with an area under the curve(AUC)value of 0.85 based on the landslide case of the Niangniangba area of Gansu Province,China compared with the individual CNN model(AUC=0.61)and the PPM(AUC=0.74).This model can also consider the statistical correlation and non-normal probability distributions of model parameters.These results offer practical guidance for future research on rainfall-induced LSM at the regional scale.展开更多
An evolution inequality of Sobolev type involving a nonlinear convolution term is considered.By using the nonlinear capacity method and the contradiction argument,the non-existence of the nontrivial local weak solutio...An evolution inequality of Sobolev type involving a nonlinear convolution term is considered.By using the nonlinear capacity method and the contradiction argument,the non-existence of the nontrivial local weak solution is proved.展开更多
The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and hist...The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.展开更多
Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relation...Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression.However,GCNs cannot effectively describe the relationships between non-adjacent hand joints.Recently,hypergraph convolutional networks(HGCNs)have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges;therefore,this paper proposes a framework for 3D hand pose estimation based on HGCN,which can better extract correlated relationships between adjacent and non-adjacent hand joints.To overcome the shortcomings of predefined hypergraph structures,a kind of dynamic hypergraph convolutional network is proposed,in which hyperedges are constructed dynamically based on hand joint feature similarity.To better explore the local semantic relationships between nodes,a kind of semantic dynamic hypergraph convolution is proposed.The proposed method is evaluated on publicly available benchmark datasets.Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN,and achieve state-of-the-art performance compared with existing methods.展开更多
Aspect-oriented sentiment analysis is a meticulous sentiment analysis task that aims to analyse the sentiment polarity of specific aspects. Most of the current research builds graph convolutional networks based on dep...Aspect-oriented sentiment analysis is a meticulous sentiment analysis task that aims to analyse the sentiment polarity of specific aspects. Most of the current research builds graph convolutional networks based on dependent syntactic trees, which improves the classification performance of the models to some extent. However, the technical limitations of dependent syntactic trees can introduce considerable noise into the model. Meanwhile, it is difficult for a single graph convolutional network to aggregate both semantic and syntactic structural information of nodes, which affects the final sentence classification. To cope with the above problems, this paper proposes a bi-channel graph convolutional network model. The model introduces a phrase structure tree and transforms it into a hierarchical phrase matrix. The adjacency matrix of the dependent syntactic tree and the hierarchical phrase matrix are combined as the initial matrix of the graph convolutional network to enhance the syntactic information. The semantic information feature representations of the sentences are obtained by the graph convolutional network with a multi-head attention mechanism and fused to achieve complementary learning of dual-channel features. Experimental results show that the model performs well and improves the accuracy of sentiment classification on three public benchmark datasets, namely Rest14, Lap14 and Twitter.展开更多
During its growth stage,the plant is exposed to various diseases.Detection and early detection of crop diseases is amajor challenge in the horticulture industry.Crop infections can harmtotal crop yield and reduce farm...During its growth stage,the plant is exposed to various diseases.Detection and early detection of crop diseases is amajor challenge in the horticulture industry.Crop infections can harmtotal crop yield and reduce farmers’income if not identified early.Today’s approved method involves a professional plant pathologist to diagnose the disease by visual inspection of the afflicted plant leaves.This is an excellent use case for Community Assessment and Treatment Services(CATS)due to the lengthy manual disease diagnosis process and the accuracy of identification is directly proportional to the skills of pathologists.An alternative to conventional Machine Learning(ML)methods,which require manual identification of parameters for exact results,is to develop a prototype that can be classified without pre-processing.To automatically diagnose tomato leaf disease,this research proposes a hybrid model using the Convolutional Auto-Encoders(CAE)network and the CNN-based deep learning architecture of DenseNet.To date,none of the modern systems described in this paper have a combined model based on DenseNet,CAE,and ConvolutionalNeuralNetwork(CNN)todiagnose the ailments of tomato leaves automatically.Themodelswere trained on a dataset obtained from the Plant Village repository.The dataset consisted of 9920 tomato leaves,and the model-tomodel accuracy ratio was 98.35%.Unlike other approaches discussed in this paper,this hybrid strategy requires fewer training components.Therefore,the training time to classify plant diseases with the trained algorithm,as well as the training time to automatically detect the ailments of tomato leaves,is significantly reduced.展开更多
Aiming at the problem of insufficient feature extraction in single scale neural network model and the problem that convolutional neural network cannot process sequential tasks in the classification of EEG signals in d...Aiming at the problem of insufficient feature extraction in single scale neural network model and the problem that convolutional neural network cannot process sequential tasks in the classification of EEG signals in depression,a hybrid model(BFTCNet)of dualbranch convolutional neural network(Bi_CNN)and temporal convolutional network(TCN)based on feature recalibration(FR)was proposed to classify EEG signals of depressed patients and healthy controls.Firstly,Bi_CNN module was used to extract the mixed EEG features between different frequency bands and different channels.Secondly,FR module was used to enhance the features extracted by Bi_CNN.Finally,TCN with dilated causal convolution was used for the sequence learning to capture the temporal dependency between features.In this study,128 EEG channels of resting-state(closed-eye)EEG data from the public dataset MODMA were used as experimental data,including 29 healthy controls and 24 depression patients.The performance of the model was evaluated by the 10-fold cross validation method.The proposed BFTCNet achieves a classification accuracy of 95.98%,F1 score value of 95.47%,sensitivity and specificity of 94.21%and 97.50%,respectively.Compared with the single-scale network model EEGNet-8,2,the classification accuracy and F1 value are improved by 1.5%and 1.48%,respectively.Meanwhile,the ablation experiment proved that each sub-module had its contribution to the improvement of the model’s classification ability.展开更多
文摘Reliable traffic flow prediction is crucial for mitigating urban congestion.This paper proposes Attentionbased spatiotemporal Interactive Dynamic Graph Convolutional Network(AIDGCN),a novel architecture integrating Interactive Dynamic Graph Convolution Network(IDGCN)with Temporal Multi-Head Trend-Aware Attention.Its core innovation lies in IDGCN,which uniquely splits sequences into symmetric intervals for interactive feature sharing via dynamic graphs,and a novel attention mechanism incorporating convolutional operations to capture essential local traffic trends—addressing a critical gap in standard attention for continuous data.For 15-and 60-min forecasting on METR-LA,AIDGCN achieves MAEs of 0.75%and 0.39%,and RMSEs of 1.32%and 0.14%,respectively.In the 60-min long-term forecasting of the PEMS-BAY dataset,the AIDGCN out-performs the MRA-BGCN method by 6.28%,4.93%,and 7.17%in terms of MAE,RMSE,and MAPE,respectively.Experimental results demonstrate the superiority of our pro-posed model over state-of-the-art methods.
基金supported by the Natural Science Foundation of China No.62362008the Major Scientific and Technological Special Project of Guizhou Province([2024]014).
文摘With the rapid development of the Artificial Intelligence of Things(AIoT),convolutional neural networks(CNNs)have demonstrated potential and remarkable performance in AIoT applications due to their excellent performance in various inference tasks.However,the users have concerns about privacy leakage for the use of AI and the performance and efficiency of computing on resource-constrained IoT edge devices.Therefore,this paper proposes an efficient privacy-preserving CNN framework(i.e.,EPPA)based on the Fully Homomorphic Encryption(FHE)scheme for AIoT application scenarios.In the plaintext domain,we verify schemes with different activation structures to determine the actual activation functions applicable to the corresponding ciphertext domain.Within the encryption domain,we integrate batch normalization(BN)into the convolutional layers to simplify the computation process.For nonlinear activation functions,we use composite polynomials for approximate calculation.Regarding the noise accumulation caused by homomorphic multiplication operations,we realize the refreshment of ciphertext noise through minimal“decryption-encryption”interactions,instead of adopting bootstrapping operations.Additionally,in practical implementation,we convert three-dimensional convolution into two-dimensional convolution to reduce the amount of computation in the encryption domain.Finally,we conduct extensive experiments on four IoT datasets,different CNN architectures,and two platforms with different resource configurations to evaluate the performance of EPPA in detail.
基金funded by scientific research projects under Grant JY2024B011.
文摘With the increasing complexity of industrial automation,planetary gearboxes play a vital role in largescale equipment transmission systems,directly impacting operational efficiency and safety.Traditional maintenance strategies often struggle to accurately predict the degradation process of equipment,leading to excessive maintenance costs or potential failure risks.However,existing prediction methods based on statistical models are difficult to adapt to nonlinear degradation processes.To address these challenges,this study proposes a novel condition-based maintenance framework for planetary gearboxes.A comprehensive full-lifecycle degradation experiment was conducted to collect raw vibration signals,which were then processed using a temporal convolutional network autoencoder with multi-scale perception capability to extract deep temporal degradation features,enabling the collaborative extraction of longperiod meshing frequencies and short-term impact features from the vibration signals.Kernel principal component analysis was employed to fuse and normalize these features,enhancing the characterization of degradation progression.A nonlinear Wiener process was used to model the degradation trajectory,with a threshold decay function introduced to dynamically adjust maintenance strategies,and model parameters optimized through maximum likelihood estimation.Meanwhile,the maintenance strategy was optimized to minimize costs per unit time,determining the optimal maintenance timing and preventive maintenance threshold.The comprehensive indicator of degradation trends extracted by this method reaches 0.756,which is 41.2%higher than that of traditional time-domain features;the dynamic threshold strategy reduces the maintenance cost per unit time to 55.56,which is 8.9%better than that of the static threshold optimization.Experimental results demonstrate significant reductions in maintenance costs while enhancing system reliability and safety.This study realizes the organic integration of deep learning and reliability theory in the maintenance of planetary gearboxes,provides an interpretable solution for the predictive maintenance of complex mechanical systems,and promotes the development of condition-based maintenance strategies for planetary gearboxes.
基金The work described in this paper was fully supported by a grant from Hong Kong Metropolitan University(RIF/2021/05).
文摘Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.
基金supported by the National Natural Science Foundation of China(No.52277055).
文摘Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals,which has certain limitations.Conversely,deep learning techniques have gained prominence as a central focus of research in the field of fault diagnosis by strong fault feature extraction ability and end-to-end fault diagnosis efficiency.Recently,utilizing the respective advantages of convolution neural network(CNN)and Transformer in local and global feature extraction,research on cooperating the two have demonstrated promise in the field of fault diagnosis.However,the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the cooperative model.This complexity results in high computational costs and limited industrial applicability.To tackle the above challenges,this paper proposes a lightweight CNN-Transformer named as SEFormer for rotating machinery fault diagnosis.First,a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals.Then,an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective.Finally,experimental results on the planetary gearbox dataset and themotor roller bearing dataset prove that the proposed framework can balance the advantages of robustness,generalization and lightweight compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer.This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
基金supported by the National Natural Science Foundation of China(62272049,62236006,62172045)the Key Projects of Beijing Union University(ZKZD202301).
文摘In recent years,gait-based emotion recognition has been widely applied in the field of computer vision.However,existing gait emotion recognition methods typically rely on complete human skeleton data,and their accuracy significantly declines when the data is occluded.To enhance the accuracy of gait emotion recognition under occlusion,this paper proposes a Multi-scale Suppression Graph ConvolutionalNetwork(MS-GCN).TheMS-GCN consists of three main components:Joint Interpolation Module(JI Moudle),Multi-scale Temporal Convolution Network(MS-TCN),and Suppression Graph Convolutional Network(SGCN).The JI Module completes the spatially occluded skeletal joints using the(K-Nearest Neighbors)KNN interpolation method.The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait,compensating for the temporal occlusion of gait information.The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features,thereby reducing the negative impact of occlusion on emotion recognition results.The proposed method is evaluated on two comprehensive datasets:Emotion-Gait,containing 4227 real gaits from sources like BML,ICT-Pollick,and ELMD,and 1000 synthetic gaits generated using STEP-Gen technology,and ELMB,consisting of 3924 gaits,with 1835 labeled with emotions such as“Happy,”“Sad,”“Angry,”and“Neutral.”On the standard datasets Emotion-Gait and ELMB,the proposed method achieved accuracies of 0.900 and 0.896,respectively,attaining performance comparable to other state-ofthe-artmethods.Furthermore,on occlusion datasets,the proposedmethod significantly mitigates the performance degradation caused by occlusion compared to other methods,the accuracy is significantly higher than that of other methods.
基金supported by the National Key Research and Development Program of China (Nos.2022YFC3702000 and 2022YFC3703500)the Key R&D Project of Zhejiang Province (No.2022C03146).
文摘Severe ground-level ozone(O_(3))pollution over major Chinese cities has become one of the most challenging problems,which have deleterious effects on human health and the sustainability of society.This study explored the spatiotemporal distribution characteristics of ground-level O_(3) and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021.Then,a high-performance convolutional neural network(CNN)model was established by expanding the moment and the concentration variations to general factors.Finally,the response mechanism of O_(3) to the variation with crucial influencing factors is explored by controlling variables and interpolating target variables.The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the northern and lower in the southern.When the wind direction(WD)ranges from east to southwest and the wind speed(WS)ranges between 2 and 3 m/sec,higher O_(3) concentration prone to occur.At different temperatures(T),the O_(3) concentration showed a trend of first increasing and subsequently decreasing with increasing NO_(2) concentration,peaks at the NO_(2) concentration around 0.02mg/m^(3).The sensitivity of NO_(2) to O_(3) formation is not easily affected by temperature,barometric pressure and dew point temperature.Additionally,there is a minimum IRNO_(2) at each temperature when the NO_(2) concentration is 0.03 mg/m^(3),and this minimum IRNO_(2) decreases with increasing temperature.The study explores the response mechanism of O_(3) with the change of driving variables,which can provide a scientific foundation and methodological support for the targeted management of O_(3) pollution.
基金supported by the National Natural Science Foundation of China(No.U21B2062).
文摘The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential.The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model,enabling effective identification of small-scale strike-slip faults through seismic data interpretation.Based on the CNN faults,we analyze the distribution patterns of small-scale strike-slip faults.The small-scale strike-slip faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging 200–5000 m.The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member.The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics,separated by a low-brittleness layer.The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared to the upper member,while the low-brittleness layer exerts restrictive effects on vertical fault propagation.Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults demonstrate larger sizes than those controlled by individual faults.All fracture-vug system sizes show positive correlations with the vertical extents of associated small-scale strike-slip faults,particularly intersection and approaching fracture-vug systems exhibit accelerated size increases proportional to the vertical extents.
基金National Key Research and Development Program of China,Grant/Award Number:2018YFE0206900China Postdoctoral Science Foundation,Grant/Award Number:2023M731204+2 种基金The Open Project of Key Laboratory for Quality Evaluation of Ultrasound Surgical Equipment of National Medical Products Administration,Grant/Award Number:SMDTKL-2023-1-01The Hubei Province Key Research and Development Project,Grant/Award Number:2023BCB007CAAI-Huawei MindSpore Open Fund。
文摘Convolutional neural network(CNN)with the encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability but it faces limitations in capturing the global feature.The transformer can extract the global information well but adapting it to small medical datasets is challenging and its computational complexity can be heavy.In this work,a serial and parallel network is proposed for the accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels.The core components of the proposed method include the cross window self-attention based transformer(CWST)and multi-scale local enhanced(MLE)modules.The CWST module enhances the global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows.The MLE module selectively fuses features by computing the voxel attention between different branch features,and uses convolution to strengthen the dense local information.The experiments on the prostate,atrium,and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as dice similarity coefficient,Intersection over Union,95%Hausdorff distance and average symmetric surface distance.
基金supported by the National Key Research and Development Program of China No.2023YFA1009500.
文摘With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. The classification method of encrypted traffic based on GNN can deal with encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on GCN, called Flow Mapping Graph (FMG). FMG establishes sequential edges between vertexes by the arrival order of packets and establishes jump-order edges between vertexes by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packet but also strengthens the relationship between the client or server packets. According to FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encryption stream classification problem is transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. By comparing the performance of TMC-GCN with other classical models in four public datasets, including CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp, the effectiveness of the FMG algorithm is verified. The experimental results show that the accuracy rate of the TMC-GCN model is 96.13%, the recall rate is 95.04%, and the F1 rate is 94.54%.
文摘The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
基金supported by the Key Research and Development Program of Jiangsu Province under Grant BE2022059-3,CTBC Bank through the Industry-Academia Cooperation Project,as well as by the Ministry of Science and Technology of Taiwan through Grants MOST-108-2218-E-002-055,MOST-109-2223-E-009-002-MY3,MOST-109-2218-E-009-025,and MOST431109-2218-E-002-015.
文摘Micro-expressions(ME)recognition is a complex task that requires advanced techniques to extract informative features fromfacial expressions.Numerous deep neural networks(DNNs)with convolutional structures have been proposed.However,unlike DNNs,shallow convolutional neural networks often outperform deeper models in mitigating overfitting,particularly with small datasets.Still,many of these methods rely on a single feature for recognition,resulting in an insufficient ability to extract highly effective features.To address this limitation,in this paper,an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm(IDSSCNN-XgBoost)is introduced for ME Recognition.The proposed method utilizes a dual-stream architecture where motion vectors(temporal features)are extracted using Optical Flow TV-L1 and amplify subtle changes(spatial features)via EulerianVideoMagnification(EVM).These features are processed by IDSSCNN,with an attention mechanism applied to refine the extracted effective features.The outputs are then fused,concatenated,and classified using the XgBoost algorithm.This comprehensive approach significantly improves recognition accuracy by leveraging the strengths of both temporal and spatial information,supported by the robust classification power of XgBoost.The proposed method is evaluated on three publicly available ME databases named Chinese Academy of Sciences Micro-expression Database(CASMEII),Spontaneous Micro-Expression Database(SMICHS),and Spontaneous Actions and Micro-Movements(SAMM).Experimental results indicate that the proposed model can achieve outstanding results compared to recent models.The accuracy results are 79.01%,69.22%,and 68.99%on CASMEII,SMIC-HS,and SAMM,and the F1-score are 75.47%,68.91%,and 63.84%,respectively.The proposed method has the advantage of operational efficiency and less computational time.
文摘Human disturbance activities is one of the main reasons for inducing geohazards.Ecological impact assessment metrics of roads are inconsistent criteria and multiple.From the perspective of visual observation,the environment damage can be shown through detecting the uncovered area of vegetation in the images along road.To realize this,an end-to-end environment damage detection model based on convolutional neural network is proposed.A 50-layer residual network is used to extract feature map.The initial parameters are optimized by transfer learning.An example is shown by this method.The dataset including cliff and landslide damage are collected by us along road in Shennongjia national forest park.Results show 0.4703 average precision(AP)rating for cliff damage and 0.4809 average precision(AP)rating for landslide damage.Compared with YOLOv3,our model shows a better accuracy in cliff and landslide detection although a certain amount of speed is sacrificed.
基金funding support from the National Natural Science Foundation of China(Grant Nos.U22A20594,52079045)Hong-Zhi Cui acknowledges the financial support of the China Scholarship Council(Grant No.CSC:202206710014)for his research at Universitat Politecnica de Catalunya,Barcelona.
文摘Landslide susceptibility mapping(LSM)plays a crucial role in assessing geological risks.The current LSM techniques face a significant challenge in achieving accurate results due to uncertainties associated with regional-scale geotechnical parameters.To explore rainfall-induced LSM,this study proposes a hybrid model that combines the physically-based probabilistic model(PPM)with convolutional neural network(CNN).The PPM is capable of effectively capturing the spatial distribution of landslides by incorporating the probability of failure(POF)considering the slope stability mechanism under rainfall conditions.This significantly characterizes the variation of POF caused by parameter uncertainties.CNN was used as a binary classifier to capture the spatial and channel correlation between landslide conditioning factors and the probability of landslide occurrence.OpenCV image enhancement technique was utilized to extract non-landslide points based on the POF of landslides.The proposed model comprehensively considers physical mechanics when selecting non-landslide samples,effectively filtering out samples that do not adhere to physical principles and reduce the risk of overfitting.The results indicate that the proposed PPM-CNN hybrid model presents a higher prediction accuracy,with an area under the curve(AUC)value of 0.85 based on the landslide case of the Niangniangba area of Gansu Province,China compared with the individual CNN model(AUC=0.61)and the PPM(AUC=0.74).This model can also consider the statistical correlation and non-normal probability distributions of model parameters.These results offer practical guidance for future research on rainfall-induced LSM at the regional scale.
基金Supported by Scientific Research Fund of Hunan Provincial Education Departmen(t23A0361)。
文摘An evolution inequality of Sobolev type involving a nonlinear convolution term is considered.By using the nonlinear capacity method and the contradiction argument,the non-existence of the nontrivial local weak solution is proved.
文摘The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
基金the National Key Research and Development Program of China(No.2021ZD0111902)the National Natural Science Foundation of China(Nos.62172022 and U21B2038)。
文摘Due to self-occlusion and high degree of freedom,estimating 3D hand pose from a single RGB image is a great challenging problem.Graph convolutional networks(GCNs)use graphs to describe the physical connection relationships between hand joints and improve the accuracy of 3D hand pose regression.However,GCNs cannot effectively describe the relationships between non-adjacent hand joints.Recently,hypergraph convolutional networks(HGCNs)have received much attention as they can describe multi-dimensional relationships between nodes through hyperedges;therefore,this paper proposes a framework for 3D hand pose estimation based on HGCN,which can better extract correlated relationships between adjacent and non-adjacent hand joints.To overcome the shortcomings of predefined hypergraph structures,a kind of dynamic hypergraph convolutional network is proposed,in which hyperedges are constructed dynamically based on hand joint feature similarity.To better explore the local semantic relationships between nodes,a kind of semantic dynamic hypergraph convolution is proposed.The proposed method is evaluated on publicly available benchmark datasets.Qualitative and quantitative experimental results both show that the proposed HGCN and improved methods for 3D hand pose estimation are better than GCN,and achieve state-of-the-art performance compared with existing methods.
文摘Aspect-oriented sentiment analysis is a meticulous sentiment analysis task that aims to analyse the sentiment polarity of specific aspects. Most of the current research builds graph convolutional networks based on dependent syntactic trees, which improves the classification performance of the models to some extent. However, the technical limitations of dependent syntactic trees can introduce considerable noise into the model. Meanwhile, it is difficult for a single graph convolutional network to aggregate both semantic and syntactic structural information of nodes, which affects the final sentence classification. To cope with the above problems, this paper proposes a bi-channel graph convolutional network model. The model introduces a phrase structure tree and transforms it into a hierarchical phrase matrix. The adjacency matrix of the dependent syntactic tree and the hierarchical phrase matrix are combined as the initial matrix of the graph convolutional network to enhance the syntactic information. The semantic information feature representations of the sentences are obtained by the graph convolutional network with a multi-head attention mechanism and fused to achieve complementary learning of dual-channel features. Experimental results show that the model performs well and improves the accuracy of sentiment classification on three public benchmark datasets, namely Rest14, Lap14 and Twitter.
基金funded by UKRI EPSRC Grant EP/W020408/1 Project SPRITE+2:The Security,Privacy,Identity,and Trust Engagement Network plus(phase 2)for this studyfunded by PhD project RS718 on Explainable AI through the UKRI EPSRC Grant-funded Doctoral Training Centre at Swansea University.
文摘During its growth stage,the plant is exposed to various diseases.Detection and early detection of crop diseases is amajor challenge in the horticulture industry.Crop infections can harmtotal crop yield and reduce farmers’income if not identified early.Today’s approved method involves a professional plant pathologist to diagnose the disease by visual inspection of the afflicted plant leaves.This is an excellent use case for Community Assessment and Treatment Services(CATS)due to the lengthy manual disease diagnosis process and the accuracy of identification is directly proportional to the skills of pathologists.An alternative to conventional Machine Learning(ML)methods,which require manual identification of parameters for exact results,is to develop a prototype that can be classified without pre-processing.To automatically diagnose tomato leaf disease,this research proposes a hybrid model using the Convolutional Auto-Encoders(CAE)network and the CNN-based deep learning architecture of DenseNet.To date,none of the modern systems described in this paper have a combined model based on DenseNet,CAE,and ConvolutionalNeuralNetwork(CNN)todiagnose the ailments of tomato leaves automatically.Themodelswere trained on a dataset obtained from the Plant Village repository.The dataset consisted of 9920 tomato leaves,and the model-tomodel accuracy ratio was 98.35%.Unlike other approaches discussed in this paper,this hybrid strategy requires fewer training components.Therefore,the training time to classify plant diseases with the trained algorithm,as well as the training time to automatically detect the ailments of tomato leaves,is significantly reduced.
基金supported by Natural Science Foundation of Gansu Province(No.21JR11RA062)University Innovation Fund of Gansu Province(No.2022A-047).
文摘Aiming at the problem of insufficient feature extraction in single scale neural network model and the problem that convolutional neural network cannot process sequential tasks in the classification of EEG signals in depression,a hybrid model(BFTCNet)of dualbranch convolutional neural network(Bi_CNN)and temporal convolutional network(TCN)based on feature recalibration(FR)was proposed to classify EEG signals of depressed patients and healthy controls.Firstly,Bi_CNN module was used to extract the mixed EEG features between different frequency bands and different channels.Secondly,FR module was used to enhance the features extracted by Bi_CNN.Finally,TCN with dilated causal convolution was used for the sequence learning to capture the temporal dependency between features.In this study,128 EEG channels of resting-state(closed-eye)EEG data from the public dataset MODMA were used as experimental data,including 29 healthy controls and 24 depression patients.The performance of the model was evaluated by the 10-fold cross validation method.The proposed BFTCNet achieves a classification accuracy of 95.98%,F1 score value of 95.47%,sensitivity and specificity of 94.21%and 97.50%,respectively.Compared with the single-scale network model EEGNet-8,2,the classification accuracy and F1 value are improved by 1.5%and 1.48%,respectively.Meanwhile,the ablation experiment proved that each sub-module had its contribution to the improvement of the model’s classification ability.