Convolutional neural network (CNN) with the encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability, but it faces limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels. The core components of the proposed method include the cross window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing the voxel attention between different branch features, and uses convolution to strengthen the dense local information. The experiments on the prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as Dice similarity coefficient, Intersection over Union, 95% Hausdorff distance, and average symmetric surface distance.
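The two overlap metrics named in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function name `dice_and_iou`, the toy volumes, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    union = total - intersection
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * intersection / total if total > 0 else 1.0
    iou = intersection / union if union > 0 else 1.0
    return float(dice), float(iou)

# Toy 3D volumes (hypothetical data, 4x4x4 voxels).
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True     # 8 predicted voxels
target[1:3, 1:3, 1:4] = True   # 12 ground-truth voxels, 8 of them shared

dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

Dice weights the intersection twice against the summed mask sizes, so for the same pair of masks it is never lower than IoU.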
Currently, most trains are equipped with dedicated cameras for capturing pantograph videos. Pantographs are core to the high-speed-railway pantograph-catenary system, and their failure directly affects the normal operation of high-speed trains. However, given the complex and variable real-world operational conditions of high-speed railways, there is no real-time and robust pantograph fault-detection method capable of handling large volumes of surveillance video. Hence, it is of paramount importance to maintain real-time monitoring and analysis of pantographs. Our study presents a real-time intelligent detection technology for identifying faults in high-speed railway pantographs, utilizing a fusion of self-attention and convolution features. We investigated lightweight multi-scale feature-extraction and fault-detection models based on deep learning to detect pantograph anomalies. Compared with traditional methods, this approach achieves high recall and accuracy in pantograph recognition, accurately pinpointing issues like discharge sparks, pantograph horns, and carbon pantograph-slide malfunctions. After experimentation and validation with actual surveillance videos of electric multiple-unit trains, our algorithmic model demonstrates real-time, high-accuracy performance even under complex operational conditions.
Accurate traffic flow prediction has a profound impact on modern traffic management. Traffic flow has complex spatial-temporal correlations and periodicity, which pose difficulties for precise prediction. To address this problem, a Multi-head Self-attention and Spatial-Temporal Graph Convolutional Network (MSSTGCN) for multiscale traffic flow prediction is proposed. Firstly, to capture the hidden periodicity of traffic flow, traffic flow is divided into three kinds of periods, including hourly, daily, and weekly data. Secondly, a graph attention residual layer is constructed to learn the global spatial features across regions. Local spatial-temporal dependence is captured by using a T-GCN module. Thirdly, a transformer layer is introduced to learn the long-term dependence in time. A position embedding mechanism is introduced to label position information for all traffic sequences. Thus, this multi-head self-attention mechanism can recognize the sequence order and allocate weights for different time nodes. Experimental results on four real-world datasets show that the MSSTGCN performs better than the baseline methods and can be successfully adapted to traffic prediction tasks.
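The combination of position embedding and multi-head self-attention described in the abstract above can be sketched as follows. This is an illustrative NumPy sketch, not the MSSTGCN implementation: the sinusoidal encoding, the random projection matrices, the omission of an output projection, and all names are assumptions.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Sinusoidal position encoding (assumed here; the abstract does not name the scheme)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angle[:, 0::2])   # even feature indices
    enc[:, 1::2] = np.cos(angle[:, 1::2])   # odd feature indices
    return enc

def multi_head_self_attention(x, w_q, w_k, w_v, num_heads):
    """Scaled dot-product self-attention over time steps, without an output projection."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(t):
        # (seq, d_model) -> (heads, seq, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    scores -= scores.max(axis=-1, keepdims=True)           # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)         # softmax over time steps
    out = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out, weights

rng = np.random.default_rng(0)
seq_len, d_model = 12, 8                      # 12 time steps, 8 features (toy sizes)
flow = rng.normal(size=(seq_len, d_model))    # hypothetical embedded traffic sequence
x = flow + sinusoidal_positions(seq_len, d_model)
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = multi_head_self_attention(x, w_q, w_k, w_v, num_heads=2)
print(out.shape, weights.shape)
```

Each row of `weights` is the set of weights one time step allocates to all others. Without the added position encoding, the attention output would be unaffected by the order of the input sequence beyond a matching permutation.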
Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals, which has certain limitations. Conversely, deep learning techniques have gained prominence as a central focus of research in the field of fault diagnosis owing to their strong fault feature extraction ability and end-to-end fault diagnosis efficiency. Recently, by utilizing the respective advantages of the convolutional neural network (CNN) and the Transformer in local and global feature extraction, research on combining the two has demonstrated promise in the field of fault diagnosis. However, the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the combined model. This complexity results in high computational costs and limited industrial applicability. To tackle the above challenges, this paper proposes a lightweight CNN-Transformer named SEFormer for rotating machinery fault diagnosis. First, a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals. Then, an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective. Finally, experimental results on the planetary gearbox dataset and the motor roller bearing dataset show that the proposed framework balances robustness, generalization, and lightweight design compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer. This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
With the rapid development of the Artificial Intelligence of Things (AIoT), convolutional neural networks (CNNs) have demonstrated potential and remarkable performance in AIoT applications due to their excellent performance in various inference tasks. However, users have concerns about privacy leakage from the use of AI and about the performance and efficiency of computing on resource-constrained IoT edge devices. Therefore, this paper proposes an efficient privacy-preserving CNN framework (i.e., EPPA) based on the Fully Homomorphic Encryption (FHE) scheme for AIoT application scenarios. In the plaintext domain, we verify schemes with different activation structures to determine the actual activation functions applicable to the corresponding ciphertext domain. Within the encryption domain, we integrate batch normalization (BN) into the convolutional layers to simplify the computation process. For nonlinear activation functions, we use composite polynomials for approximate calculation. Regarding the noise accumulation caused by homomorphic multiplication operations, we realize the refreshment of ciphertext noise through minimal “decryption-encryption” interactions, instead of adopting bootstrapping operations. Additionally, in practical implementation, we convert three-dimensional convolution into two-dimensional convolution to reduce the amount of computation in the encryption domain. Finally, we conduct extensive experiments on four IoT datasets, different CNN architectures, and two platforms with different resource configurations to evaluate the performance of EPPA in detail.
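One common way to integrate batch normalization into a convolutional layer, which may or may not match what EPPA does, is to fold the inference-time BN statistics into the layer's weights and bias so that a single affine operation remains. The plaintext sketch below uses a 1×1 convolution for brevity; the helper names and toy shapes are assumptions.

```python
import numpy as np

def fold_batchnorm(weight, bias, gamma, beta, running_mean, running_var, eps=1e-5):
    """Fold inference-time BN into the preceding layer's weight and bias."""
    scale = gamma / np.sqrt(running_var + eps)      # one factor per output channel
    folded_weight = weight * scale[:, None]
    folded_bias = (bias - running_mean) * scale + beta
    return folded_weight, folded_bias

def conv1x1(x, weight, bias):
    """1x1 convolution on a (C_in, H, W) feature map with weight of shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 5, 5))                      # hypothetical input feature map
weight, bias = rng.normal(size=(4, 3)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4)

# Reference path: convolution followed by a separate BN step.
y = conv1x1(x, weight, bias)
reference = (gamma[:, None, None] * (y - mean[:, None, None])
             / np.sqrt(var[:, None, None] + 1e-5) + beta[:, None, None])

# Folded path: one convolution with adjusted parameters.
folded_weight, folded_bias = fold_batchnorm(weight, bias, gamma, beta, mean, var)
folded = conv1x1(x, folded_weight, folded_bias)
print(np.allclose(reference, folded))
```

Because the fold happens on plaintext model parameters before inference, it removes the separate normalization step from the encrypted computation.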
With the increasing complexity of industrial automation, planetary gearboxes play a vital role in large-scale equipment transmission systems, directly impacting operational efficiency and safety. Traditional maintenance strategies often struggle to accurately predict the degradation process of equipment, leading to excessive maintenance costs or potential failure risks. Moreover, existing prediction methods based on statistical models struggle to adapt to nonlinear degradation processes. To address these challenges, this study proposes a novel condition-based maintenance framework for planetary gearboxes. A comprehensive full-lifecycle degradation experiment was conducted to collect raw vibration signals, which were then processed using a temporal convolutional network autoencoder with multi-scale perception capability to extract deep temporal degradation features, enabling the collaborative extraction of long-period meshing frequencies and short-term impact features from the vibration signals. Kernel principal component analysis was employed to fuse and normalize these features, enhancing the characterization of degradation progression. A nonlinear Wiener process was used to model the degradation trajectory, with a threshold decay function introduced to dynamically adjust maintenance strategies, and model parameters optimized through maximum likelihood estimation. Meanwhile, the maintenance strategy was optimized to minimize costs per unit time, determining the optimal maintenance timing and preventive maintenance threshold. The comprehensive indicator of degradation trends extracted by this method reaches 0.756, which is 41.2% higher than that of traditional time-domain features; the dynamic threshold strategy reduces the maintenance cost per unit time to 55.56, which is 8.9% better than that of the static threshold optimization. Experimental results demonstrate significant reductions in maintenance costs while enhancing system reliability and safety. This study realizes the organic integration of deep learning and reliability theory in the maintenance of planetary gearboxes, provides an interpretable solution for the predictive maintenance of complex mechanical systems, and promotes the development of condition-based maintenance strategies for planetary gearboxes.
Parkinson’s disease (PD) is a debilitating neurological disorder affecting over 10 million people worldwide. PD classification models using voice signals as input are common in the literature. It is believed that using deep learning algorithms further enhances performance; nevertheless, it is challenging due to the nature of small-scale and imbalanced PD datasets. This paper proposes a convolutional neural network-based deep support vector machine (CNN-DSVM) to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets. A customized kernel function reduces the impact of biased classification towards the majority class (healthy candidates in our consideration). An improved generative adversarial network (IGAN) was designed to generate additional training data to enhance the model’s performance. For performance evaluation, the proposed algorithm achieves a sensitivity of 97.6% and a specificity of 97.3%. The performance comparison is evaluated from five perspectives, including comparisons with different data generation algorithms, feature extraction techniques, kernel functions, and existing works. Results reveal the effectiveness of the IGAN algorithm, which improves the sensitivity and specificity by 4.05%–4.72% and 4.96%–5.86%, respectively; and the effectiveness of the CNN-DSVM algorithm, which improves the sensitivity by 1.24%–57.4% and specificity by 1.04%–163% and reduces biased detection towards the majority class. The ablation experiments confirm the effectiveness of individual components. Two future research directions have also been suggested.
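Sensitivity and specificity, the two figures reported in the abstract above, are computed per class, which is why they are preferred over plain accuracy on imbalanced data. The sketch below shows the arithmetic on made-up labels; the function name and the toy data are assumptions, with 1 standing for a PD sample and 0 for a healthy one.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical imbalanced toy set: 4 PD samples, 6 healthy samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

A classifier biased towards the majority (healthy) class would keep specificity high while sensitivity drops, which overall accuracy alone would hide.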
Lightweight convolutional neural networks (CNNs) have simple structures but struggle to comprehensively and accurately extract important semantic information from images. While attention mechanisms can enhance CNNs by learning distinctive representations, most existing spatial and hybrid attention methods focus on local regions with extensive parameters, making them unsuitable for lightweight CNNs. In this paper, we propose a self-attention mechanism tailored for lightweight networks, namely the brief self-attention module (BSAM). BSAM consists of the brief spatial attention (BSA) and advanced channel attention blocks. Unlike conventional self-attention methods with many parameters, our BSA block improves the performance of lightweight networks by effectively learning global semantic representations. Moreover, BSAM can be seamlessly integrated into lightweight CNNs for end-to-end training, maintaining the network’s lightweight and mobile characteristics. We validate the effectiveness of the proposed method on image classification tasks using the Food-101, Caltech-256, and Mini-ImageNet datasets.
Industrial Internet of Things (IIoT) is a pervasive network of interlinked smart devices that provide a variety of intelligent computing services in industrial environments. Several IIoT nodes handle confidential data (such as medical, transportation, and military data), which are reachable targets for hostile intruders due to their openness and varied structure. Intrusion Detection Systems (IDS) based on Machine Learning (ML) and Deep Learning (DL) techniques have received significant attention. However, existing ML and DL-based IDS still face a number of obstacles that must be overcome. For instance, the existing DL approaches necessitate a substantial quantity of data for effective performance, which is not feasible to run on low-power and low-memory devices. Imbalanced and scarce data potentially lead to low performance on existing IDS. This paper proposes a self-attention convolutional neural network (SACNN) architecture for the detection of malicious activity in IIoT networks and an appropriate feature extraction method to extract the most significant features. The proposed architecture has a self-attention layer to calculate the input attention and convolutional neural network (CNN) layers to process the assigned attention features for prediction. The performance evaluation of the proposed SACNN architecture has been done with the Edge-IIoTset and X-IIoTID datasets. These datasets encompass the behaviours of contemporary IIoT communication protocols, the operations of state-of-the-art devices, various attack types, and diverse attack scenarios.
Traditional deep learning-based intrusion detection methods face problems such as insufficient cloud storage, data privacy leaks, high communication costs, and unsatisfactory detection and false positive rates. To address existing issues in intrusion detection, this paper presents a novel approach called CS-FL, which combines Federated Learning and a Self-Attention Fusion Convolutional Neural Network. Federated Learning is a new distributed computing model that enables individual training of client data without uploading local data to a central server. At the same time, local training results are uploaded and integrated across all participating clients to produce a global model. The sharing model reduces communication costs, protects data privacy, and solves problems such as insufficient cloud storage and “data islands” for each client. In the proposed method, a hybrid model is formed by integrating the self-attention and similar parts of the Convolutional Neural Network in the local data processing. This approach not only enhances the performance of the hybrid model but also reduces computational overhead compared to pure hybrid neural networks. Results from experiments on the NSL-KDD dataset show that the proposed method outperforms other intrusion detection techniques, resulting in a significant improvement in performance. This demonstrates the effectiveness of the proposed approach in improving intrusion detection accuracy.
Located in northern China, the Hetao Plain is an important agro-economic zone and population centre. The deterioration of local groundwater quality has had a serious impact on human health and economic development. Nowadays, groundwater vulnerability assessment (GVA) has become an essential task to identify the current status and development trend of groundwater quality. In this study, the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models are integrated to realize the spatio-temporal prediction of regional groundwater vulnerability by introducing the self-attention mechanism. The study first builds the CNN-LSTM model with a self-attention (SA) mechanism and evaluates the prediction accuracy of the model for groundwater vulnerability compared to other common machine learning models such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). The results indicate that the CNN-LSTM model outperforms these models, demonstrating its significance in groundwater vulnerability assessment. The predictions indicate an increased risk of groundwater vulnerability in the study area over the coming years. This increase can be attributed to the synergistic impact of global climate anomalies and intensified local human activities. Moreover, the overall groundwater vulnerability risk in the entire region has increased, evident from both the notably high value and standard deviation. This suggests that the spatial variability of groundwater vulnerability in the area is expected to expand in the future due to the sustained progression of climate change and human activities. The model can be optimized for diverse applications across regional environmental assessment, pollution prediction, and risk statistics. This study holds particular significance for ecological protection and groundwater resource management.
The Ultra-Wideband (UWB) Location-Based Service is receiving more and more attention due to its high ranging accuracy and good time resolution. However, Non-Line-of-Sight (NLOS) propagation may reduce the ranging accuracy of UWB localization systems in indoor environments. It is therefore important to identify LOS and NLOS propagation before taking proper measures to improve the UWB localization accuracy. In this paper, a deep learning-based UWB NLOS/LOS classification algorithm called FCN-Attention is proposed. The proposed FCN-Attention algorithm utilizes a Fully Convolutional Network (FCN) for improving feature extraction ability and a self-attention mechanism for enhancing feature description from the data to improve the classification accuracy. The proposed algorithm is evaluated using an open-source dataset, a locally collected dataset, and a mixed dataset created from these two datasets. The experimental results show that the proposed FCN-Attention algorithm achieves classification accuracy of 88.24% on the open-source dataset, 100% on the locally collected dataset, and 92.01% on the mixed dataset, which is better than the results from the other NLOS/LOS classification algorithms evaluated in this paper in most scenarios.
In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy significantly declines when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: Joint Interpolation Module (JI Module), Multi-scale Temporal Convolution Network (MS-TCN), and Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources like BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology, and ELMB, consisting of 3924 gaits, with 1835 labeled with emotions such as “Happy,” “Sad,” “Angry,” and “Neutral.” On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, with accuracy significantly higher than that of other methods.
Severe ground-level ozone (O₃) pollution over major Chinese cities has become one of the most challenging problems, with deleterious effects on human health and the sustainability of society. This study explored the spatiotemporal distribution characteristics of ground-level O₃ and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021. Then, a high-performance convolutional neural network (CNN) model was established by expanding the moment and the concentration variations to general factors. Finally, the response mechanism of O₃ to variations in crucial influencing factors is explored by controlling variables and interpolating target variables. The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the north and lower in the south. When the wind direction (WD) ranges from east to southwest and the wind speed (WS) ranges between 2 and 3 m/s, higher O₃ concentrations are prone to occur. At different temperatures (T), the O₃ concentration showed a trend of first increasing and subsequently decreasing with increasing NO₂ concentration, peaking at an NO₂ concentration of around 0.02 mg/m³. The sensitivity of NO₂ to O₃ formation is not easily affected by temperature, barometric pressure, and dew point temperature. Additionally, there is a minimum IRNO₂ at each temperature when the NO₂ concentration is 0.03 mg/m³, and this minimum IRNO₂ decreases with increasing temperature. The study explores the response mechanism of O₃ to changes in driving variables, which can provide a scientific foundation and methodological support for the targeted management of O₃ pollution.
The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential. The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model, enabling effective identification of small-scale strike-slip faults through seismic data interpretation. Based on the CNN faults, we analyze the distribution patterns of small-scale strike-slip faults. The small-scale strike-slip faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging from 200 to 5000 m. The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member. The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics, separated by a low-brittleness layer. The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared to the Upper Member, while the low-brittleness layer exerts restrictive effects on vertical fault propagation. Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults demonstrate larger sizes than those controlled by individual faults. All fracture-vug system sizes show positive correlations with the vertical extents of associated small-scale strike-slip faults; in particular, intersection and approaching fracture-vug systems exhibit accelerated size increases proportional to the vertical extents.
With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. The classification method of encrypted traffic based on GNN can deal with encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on GCN, called Flow Mapping Graph (FMG). FMG establishes sequential edges between vertexes by the arrival order of packets and establishes jump-order edges between vertexes by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packet but also strengthens the relationship between the client or server packets. According to FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encryption stream classification problem is transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. By comparing the performance of TMC-GCN with other classical models on four public datasets, including CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp, the effectiveness of the FMG algorithm is verified. The experimental results show that the accuracy of the TMC-GCN model is 96.13%, the recall is 95.04%, and the F1 score is 94.54%.
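The two edge types in the abstract above can be sketched on a toy flow. The abstract does not give the exact construction, so this is only one possible reading: a burst is taken to be a maximal run of consecutive packets with the same direction, and a jump-order edge links the first packet of a burst to the first packet of the next burst in the same direction. The function name and the ±1 direction encoding are likewise assumptions.

```python
def build_flow_graph(directions):
    """Build edge lists from packet directions given in arrival order.

    directions: sequence of +1 (client to server) or -1 (server to client).
    Returns (sequential_edges, jump_edges) as lists of (src, dst) packet indices.
    """
    n = len(directions)
    # Sequential edges follow the arrival order of packets.
    sequential = [(i, i + 1) for i in range(n - 1)]

    # Group packets into bursts: maximal runs with the same direction.
    bursts = []
    for i, d in enumerate(directions):
        if bursts and directions[bursts[-1][-1]] == d:
            bursts[-1].append(i)
        else:
            bursts.append([i])

    # With two directions, bursts alternate, so burst b and burst b + 2
    # always share a direction (assumed jump-order rule).
    jump = [(bursts[b][0], bursts[b + 2][0]) for b in range(len(bursts) - 2)]
    return sequential, jump

# Hypothetical flow of 7 packets.
directions = [1, 1, -1, -1, -1, 1, -1]
sequential_edges, jump_edges = build_flow_graph(directions)
print(sequential_edges)
print(jump_edges)
```

Under this reading, the jump-order edges give same-direction packets a direct path in the graph even when packets from the other side arrive in between.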
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model’s advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
Micro-expression (ME) recognition is a complex task that requires advanced techniques to extract informative features from facial expressions. Numerous deep neural networks (DNNs) with convolutional structures have been proposed. However, unlike DNNs, shallow convolutional neural networks often outperform deeper models in mitigating overfitting, particularly with small datasets. Still, many of these methods rely on a single feature for recognition, resulting in an insufficient ability to extract highly effective features. To address this limitation, in this paper, an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm (IDSSCNN-XgBoost) is introduced for ME recognition. The proposed method utilizes a dual-stream architecture where motion vectors (temporal features) are extracted using Optical Flow TV-L1 and subtle changes (spatial features) are amplified via Eulerian Video Magnification (EVM). These features are processed by IDSSCNN, with an attention mechanism applied to refine the extracted effective features. The outputs are then fused, concatenated, and classified using the XgBoost algorithm. This comprehensive approach significantly improves recognition accuracy by leveraging the strengths of both temporal and spatial information, supported by the robust classification power of XgBoost. The proposed method is evaluated on three publicly available ME databases named Chinese Academy of Sciences Micro-expression Database (CASMEII), Spontaneous Micro-Expression Database (SMIC-HS), and Spontaneous Actions and Micro-Movements (SAMM). Experimental results indicate that the proposed model can achieve outstanding results compared to recent models. The accuracy results are 79.01%, 69.22%, and 68.99% on CASMEII, SMIC-HS, and SAMM, and the F1-scores are 75.47%, 68.91%, and 63.84%, respectively. The proposed method has the advantage of operational efficiency and less computational time.
For image compressed sensing reconstruction, most algorithms reconstruct image blocks one by one and stack many convolutional layers, which usually leads to obvious block effects, high computational complexity, and long reconstruction time. An image compressed sensing reconstruction network based on a self-attention mechanism (SAMNet) was proposed. For the compressed sampling, a self-attention convolution was designed, which was conducive to capturing richer features, so that the compressed sensing measurements retained more image structure information. For the reconstruction, a self-attention mechanism was introduced into the convolutional neural network. A reconstruction network including residual blocks, a bottleneck transformer (BoTNet), and dense blocks was proposed, which strengthened the transfer of image features and dramatically reduced the number of parameters. On the Set5 dataset, when the measurement rates are 0.01, 0.04, 0.10, and 0.25, the average peak signal-to-noise ratio (PSNR) of SAMNet is improved by 1.27, 1.23, 0.50, and 0.15 dB, respectively, compared to CSNet+. The running time of reconstructing a 256×256 image is reduced by 0.1473, 0.1789, 0.2310, and 0.2524 s compared to ReconNet. Experimental results showed that SAMNet improved the quality of reconstructed images and reduced the reconstruction time.
Human disturbance activities are one of the main causes of geohazards. Ecological impact assessment metrics for roads are numerous and follow inconsistent criteria. From the perspective of visual observation, environmental damage can be shown by detecting the areas not covered by vegetation in images taken along roads. To realize this, an end-to-end environmental damage detection model based on a convolutional neural network is proposed. A 50-layer residual network is used to extract the feature map. The initial parameters are optimized by transfer learning. The method is demonstrated with an example. The dataset, which includes cliff and landslide damage, was collected by the authors along roads in Shennongjia National Forest Park. Results show an average precision (AP) of 0.4703 for cliff damage and 0.4809 for landslide damage. Compared with YOLOv3, the model shows better accuracy in cliff and landslide detection, although a certain amount of speed is sacrificed.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2018YFE0206900; China Postdoctoral Science Foundation, Grant/Award Number: 2023M731204; The Open Project of Key Laboratory for Quality Evaluation of Ultrasound Surgical Equipment of National Medical Products Administration, Grant/Award Number: SMDTKL-2023-1-01; The Hubei Province Key Research and Development Project, Grant/Award Number: 2023BCB007; CAAI-Huawei MindSpore Open Fund.
Abstract: The convolutional neural network (CNN) with an encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability, but it faces limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels. The core components of the proposed method include the cross window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing the voxel attention between different branch features, and uses convolution to strengthen the dense local information. The experiments on the prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as Dice similarity coefficient, Intersection over Union, 95% Hausdorff distance, and average symmetric surface distance.
Funding: Supported by the National Key R&D Program of China (No. 2022YFB4301102).
Abstract: Currently, most trains are equipped with dedicated cameras for capturing pantograph videos. Pantographs are core to the high-speed-railway pantograph-catenary system, and their failure directly affects the normal operation of high-speed trains. However, given the complex and variable real-world operational conditions of high-speed railways, there is no real-time and robust pantograph fault-detection method capable of handling large volumes of surveillance video. Hence, it is of paramount importance to maintain real-time monitoring and analysis of pantographs. Our study presents a real-time intelligent detection technology for identifying faults in high-speed railway pantographs, utilizing a fusion of self-attention and convolution features. We investigated lightweight multi-scale feature-extraction and fault-detection models based on deep learning to detect pantograph anomalies. Compared with traditional methods, this approach achieves high recall and accuracy in pantograph recognition, accurately pinpointing issues like discharge sparks, pantograph horns, and carbon pantograph-slide malfunctions. After experimentation and validation with actual surveillance videos of electric multiple-unit trains, our algorithmic model demonstrates real-time, high-accuracy performance even under complex operational conditions.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62472149, 62376089, 62202147) and the Hubei Provincial Science and Technology Plan Project (2023BCB04100).
Abstract: Accurate traffic flow prediction has a profound impact on modern traffic management. Traffic flow has complex spatial-temporal correlations and periodicity, which pose difficulties for precise prediction. To address this problem, a Multi-head Self-attention and Spatial-Temporal Graph Convolutional Network (MSSTGCN) for multiscale traffic flow prediction is proposed. Firstly, to capture the hidden periodicity of traffic flow, traffic flow is divided into three kinds of periods, including hourly, daily, and weekly data. Secondly, a graph attention residual layer is constructed to learn the global spatial features across regions. Local spatial-temporal dependence is captured by using a T-GCN module. Thirdly, a transformer layer is introduced to learn the long-term dependence in time. A position embedding mechanism is introduced to label position information for all traffic sequences. Thus, the multi-head self-attention mechanism can recognize the sequence order and allocate weights for different time nodes. Experimental results on four real-world datasets show that the MSSTGCN performs better than the baseline methods and can be successfully adapted to traffic prediction tasks.
Funding: Supported by the National Natural Science Foundation of China (No. 52277055).
Abstract: Traditional data-driven fault diagnosis methods depend on expert experience to manually extract effective fault features of signals, which has certain limitations. Conversely, deep learning techniques have become a central focus of research in the field of fault diagnosis owing to their strong fault feature extraction ability and end-to-end fault diagnosis efficiency. Recently, research on combining the convolutional neural network (CNN) and the Transformer, which exploits their respective advantages in local and global feature extraction, has shown promise in the field of fault diagnosis. However, the cross-channel convolution mechanism in CNN and the self-attention calculations in Transformer contribute to excessive complexity in the combined model. This complexity results in high computational costs and limited industrial applicability. To tackle these challenges, this paper proposes a lightweight CNN-Transformer named SEFormer for rotating machinery fault diagnosis. First, a separable multiscale depthwise convolution block is designed to extract and integrate multiscale feature information from different channel dimensions of vibration signals. Then, an efficient self-attention block is developed to capture critical fine-grained features of the signal from a global perspective. Finally, experimental results on the planetary gearbox dataset and the motor roller bearing dataset show that the proposed framework can balance robustness, generalization, and lightweight design compared to recent state-of-the-art fault diagnosis models based on CNN and Transformer. This study presents a feasible strategy for developing a lightweight rotating machinery fault diagnosis framework aimed at economical deployment.
Funding: Supported by the Natural Science Foundation of China (No. 62362008) and the Major Scientific and Technological Special Project of Guizhou Province ([2024]014).
Abstract: With the rapid development of the Artificial Intelligence of Things (AIoT), convolutional neural networks (CNNs) have demonstrated potential and remarkable performance in AIoT applications due to their excellent performance in various inference tasks. However, users have concerns about privacy leakage in the use of AI and about the performance and efficiency of computing on resource-constrained IoT edge devices. Therefore, this paper proposes an efficient privacy-preserving CNN framework (i.e., EPPA) based on the Fully Homomorphic Encryption (FHE) scheme for AIoT application scenarios. In the plaintext domain, we verify schemes with different activation structures to determine the actual activation functions applicable to the corresponding ciphertext domain. Within the encryption domain, we integrate batch normalization (BN) into the convolutional layers to simplify the computation process. For nonlinear activation functions, we use composite polynomials for approximate calculation. Regarding the noise accumulation caused by homomorphic multiplication operations, we refresh the ciphertext noise through minimal "decryption-encryption" interactions instead of adopting bootstrapping operations. Additionally, in practical implementation, we convert three-dimensional convolution into two-dimensional convolution to reduce the amount of computation in the encryption domain. Finally, we conduct extensive experiments on four IoT datasets, different CNN architectures, and two platforms with different resource configurations to evaluate the performance of EPPA in detail.
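The abstract above does not give formulas for integrating batch normalization into the convolutional layers. A minimal sketch of the standard inference-time folding, assuming a frozen BN layer with fixed statistics (mean, var) and learned parameters (gamma, beta), is shown below in plain Python on a single-channel 1D convolution. All function names are hypothetical and are not taken from EPPA; the point is only that convolution followed by BN equals one convolution with rescaled weights and a shifted bias, which removes a separate multiplication step.

```python
import math

def conv1d(x, w, b):
    """Valid 1D cross-correlation for a single output channel."""
    k = len(w)
    return [sum(w[t] * x[i + t] for t in range(k)) + b
            for i in range(len(x) - k + 1)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization with fixed statistics."""
    scale = gamma / math.sqrt(var + eps)
    return [(v - mean) * scale + beta for v in y]

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN into the convolution: returns rescaled weights and a new bias."""
    scale = gamma / math.sqrt(var + eps)
    folded_w = [wi * scale for wi in w]
    folded_b = (b - mean) * scale + beta
    return folded_w, folded_b

# Illustrative values only (assumptions, not from the paper).
x = [1.0, 2.0, -1.0, 0.5, 3.0]
w, b = [0.5, -1.0, 2.0], 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 0.8

reference = batch_norm(conv1d(x, w, b), gamma, beta, mean, var)
fw, fb = fold_bn(w, b, gamma, beta, mean, var)
folded = conv1d(x, fw, fb)
```

Here `folded` matches `reference` up to floating-point rounding, so the BN layer can be dropped once the weights are folded.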
Funding: Funded by scientific research projects under Grant JY2024B011.
Abstract: With the increasing complexity of industrial automation, planetary gearboxes play a vital role in large-scale equipment transmission systems, directly impacting operational efficiency and safety. Traditional maintenance strategies often struggle to accurately predict the degradation process of equipment, leading to excessive maintenance costs or potential failure risks. Moreover, existing prediction methods based on statistical models are difficult to adapt to nonlinear degradation processes. To address these challenges, this study proposes a novel condition-based maintenance framework for planetary gearboxes. A comprehensive full-lifecycle degradation experiment was conducted to collect raw vibration signals, which were then processed using a temporal convolutional network autoencoder with multi-scale perception capability to extract deep temporal degradation features, enabling the collaborative extraction of long-period meshing frequencies and short-term impact features from the vibration signals. Kernel principal component analysis was employed to fuse and normalize these features, enhancing the characterization of degradation progression. A nonlinear Wiener process was used to model the degradation trajectory, with a threshold decay function introduced to dynamically adjust maintenance strategies, and model parameters were optimized through maximum likelihood estimation. Meanwhile, the maintenance strategy was optimized to minimize costs per unit time, determining the optimal maintenance timing and preventive maintenance threshold. The comprehensive indicator of degradation trends extracted by this method reaches 0.756, which is 41.2% higher than that of traditional time-domain features; the dynamic threshold strategy reduces the maintenance cost per unit time to 55.56, which is 8.9% better than that of the static threshold optimization. Experimental results demonstrate significant reductions in maintenance costs while enhancing system reliability and safety. This study realizes the organic integration of deep learning and reliability theory in the maintenance of planetary gearboxes, provides an interpretable solution for the predictive maintenance of complex mechanical systems, and promotes the development of condition-based maintenance strategies for planetary gearboxes.
Funding: The work described in this paper was fully supported by a grant from Hong Kong Metropolitan University (RIF/2021/05).
Abstract: Parkinson's disease (PD) is a debilitating neurological disorder affecting over 10 million people worldwide. PD classification models using voice signals as input are common in the literature. It is believed that using deep learning algorithms further enhances performance; nevertheless, it is challenging due to the small-scale and imbalanced nature of PD datasets. This paper proposes a convolutional neural network-based deep support vector machine (CNN-DSVM) to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance on small-scale PD datasets. A customized kernel function reduces the impact of biased classification towards the majority class (healthy candidates in our consideration). An improved generative adversarial network (IGAN) was designed to generate additional training data to enhance the model's performance. For performance evaluation, the proposed algorithm achieves a sensitivity of 97.6% and a specificity of 97.3%. The performance comparison is evaluated from five perspectives, including comparisons with different data generation algorithms, feature extraction techniques, kernel functions, and existing works. Results reveal the effectiveness of the IGAN algorithm, which improves the sensitivity and specificity by 4.05%–4.72% and 4.96%–5.86%, respectively, and the effectiveness of the CNN-DSVM algorithm, which improves the sensitivity by 1.24%–57.4% and specificity by 1.04%–163% and reduces biased detection towards the majority class. The ablation experiments confirm the effectiveness of individual components. Two future research directions have also been suggested.
Abstract: Lightweight convolutional neural networks (CNNs) have simple structures but struggle to comprehensively and accurately extract important semantic information from images. While attention mechanisms can enhance CNNs by learning distinctive representations, most existing spatial and hybrid attention methods focus on local regions with extensive parameters, making them unsuitable for lightweight CNNs. In this paper, we propose a self-attention mechanism tailored for lightweight networks, namely the brief self-attention module (BSAM). BSAM consists of the brief spatial attention (BSA) and advanced channel attention blocks. Unlike conventional self-attention methods with many parameters, our BSA block improves the performance of lightweight networks by effectively learning global semantic representations. Moreover, BSAM can be seamlessly integrated into lightweight CNNs for end-to-end training, maintaining the network's lightweight and mobile characteristics. We validate the effectiveness of the proposed method on image classification tasks using the Food-101, Caltech-256, and Mini-ImageNet datasets.
Funding: Deputy for Research and Innovation-Ministry of Education, Kingdom of Saudi Arabia, Grant/Award Number: NU/IFC/02/SERC/-/31; Institutional Funding Committee at Najran University, Kingdom of Saudi Arabia.
Abstract: The Industrial Internet of Things (IIoT) is a pervasive network of interlinked smart devices that provide a variety of intelligent computing services in industrial environments. Several IIoT nodes handle confidential data (such as medical, transportation, military, etc.) and are reachable targets for hostile intruders due to their openness and varied structure. Intrusion Detection Systems (IDS) based on Machine Learning (ML) and Deep Learning (DL) techniques have received significant attention. However, existing ML and DL-based IDS still face a number of obstacles that must be overcome. For instance, the existing DL approaches necessitate a substantial quantity of data for effective performance, which is not feasible to run on low-power and low-memory devices. Imbalanced and scarce data potentially lead to low performance in existing IDS. This paper proposes a self-attention convolutional neural network (SACNN) architecture for the detection of malicious activity in IIoT networks and an appropriate feature extraction method to extract the most significant features. The proposed architecture has a self-attention layer to calculate the input attention and convolutional neural network (CNN) layers to process the assigned attention features for prediction. The performance of the proposed SACNN architecture has been evaluated on the Edge-IIoTset and X-IIoTID datasets. These datasets encompass the behaviours of contemporary IIoT communication protocols, the operations of state-of-the-art devices, various attack types, and diverse attack scenarios.
Funding: Sponsored by the National Natural Science Foundation of China under Grants 62271264, 61972207, and 42175194, and the Project through the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institution.
Abstract: Traditional deep-learning-based intrusion detection methods face problems such as insufficient cloud storage, data privacy leaks, high communication costs, unsatisfactory detection rates, and high false positive rates. To address these issues, this paper presents a novel approach called CS-FL, which combines Federated Learning and a Self-Attention Fusion Convolutional Neural Network. Federated Learning is a new distributed computing model that enables individual training of client data without uploading local data to a central server. At the same time, local training results are uploaded and integrated across all participating clients to produce a global model. The shared model reduces communication costs, protects data privacy, and solves problems such as insufficient cloud storage and "data islands" for each client. In the proposed method, a hybrid model is formed by integrating the self-attention and similar parts of the Convolutional Neural Network in the local data processing. This approach not only enhances the performance of the hybrid model but also reduces computational overhead compared to pure hybrid neural networks. Results from experiments on the NSL-KDD dataset show that the proposed method outperforms other intrusion detection techniques, resulting in a significant improvement in performance. This demonstrates the effectiveness of the proposed approach in improving intrusion detection accuracy.
Funding: Supported by the National Key Research and Development Program of China (No. 2021YFA0715900).
Abstract: Located in northern China, the Hetao Plain is an important agro-economic zone and population centre. The deterioration of local groundwater quality has had a serious impact on human health and economic development. Groundwater vulnerability assessment (GVA) has therefore become an essential task for identifying the current status and development trend of groundwater quality. In this study, the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models are integrated to realize the spatio-temporal prediction of regional groundwater vulnerability by introducing the self-attention mechanism. The study first builds the CNN-LSTM model with a self-attention (SA) mechanism and evaluates its prediction accuracy for groundwater vulnerability against other common machine learning models such as Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost). The results indicate that the CNN-LSTM model outperforms these models, demonstrating its significance in groundwater vulnerability assessment. The predictions indicate an increased risk of groundwater vulnerability in the study area over the coming years. This increase can be attributed to the synergistic impact of global climate anomalies and intensified local human activities. Moreover, the overall groundwater vulnerability risk in the entire region has increased, as is evident from both the notably high value and the standard deviation. This suggests that the spatial variability of groundwater vulnerability in the area is expected to expand in the future due to the sustained progression of climate change and human activities. The model can be optimized for diverse applications across regional environmental assessment, pollution prediction, and risk statistics. This study holds particular significance for ecological protection and groundwater resource management.
Funding: Supported by the National Key Research and Development Program of China [grant No. 2016YFB0502200], the Postdoctoral Research Foundation of China [grant No. 2020M682480], and the Fundamental Research Funds for the Central Universities [grant No. 2042021kf0009].
Abstract: The Ultra-Wideband (UWB) Location-Based Service is receiving more and more attention due to its high ranging accuracy and good time resolution. However, Non-Line-of-Sight (NLOS) propagation may reduce the ranging accuracy of a UWB localization system in indoor environments. It is therefore important to identify LOS and NLOS propagation before taking proper measures to improve the UWB localization accuracy. In this paper, a deep learning-based UWB NLOS/LOS classification algorithm called FCN-Attention is proposed. The proposed FCN-Attention algorithm utilizes a Fully Convolutional Network (FCN) to improve feature extraction ability and a self-attention mechanism to enhance feature description from the data, thereby improving the classification accuracy. The proposed algorithm is evaluated using an open-source dataset, a locally collected dataset, and a mixed dataset created from these two datasets. The experimental results show that the proposed FCN-Attention algorithm achieves a classification accuracy of 88.24% on the open-source dataset, 100% on the locally collected dataset, and 92.01% on the mixed dataset, which is better than the results from the other NLOS/LOS classification algorithms evaluated in this paper in most scenarios.
Funding: Supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Abstract: In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy significantly declines when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: the Joint Interpolation Module (JI Module), the Multi-scale Temporal Convolution Network (MS-TCN), and the Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources like BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology, and ELMB, consisting of 3924 gaits, with 1835 labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, achieving significantly higher accuracy than other methods.
Funding: Supported by the National Key Research and Development Program of China (Nos. 2022YFC3702000 and 2022YFC3703500) and the Key R&D Project of Zhejiang Province (No. 2022C03146).
Abstract: Severe ground-level ozone (O₃) pollution over major Chinese cities has become one of the most challenging problems, with deleterious effects on human health and the sustainability of society. This study explored the spatiotemporal distribution characteristics of ground-level O₃ and its precursors based on conventional pollutant and meteorological monitoring data in Zhejiang Province from 2016 to 2021. Then, a high-performance convolutional neural network (CNN) model was established by expanding the moment and the concentration variations to general factors. Finally, the response mechanism of O₃ to variations in crucial influencing factors was explored by controlling variables and interpolating target variables. The results indicated that the annual average MDA8-90th concentrations in Zhejiang Province are higher in the north and lower in the south. When the wind direction (WD) ranges from east to southwest and the wind speed (WS) ranges between 2 and 3 m/s, higher O₃ concentrations are prone to occur. At different temperatures (T), the O₃ concentration first increases and subsequently decreases with increasing NO₂ concentration, peaking at an NO₂ concentration of around 0.02 mg/m³. The sensitivity of NO₂ to O₃ formation is not easily affected by temperature, barometric pressure, and dew point temperature. Additionally, there is a minimum IRNO₂ at each temperature when the NO₂ concentration is 0.03 mg/m³, and this minimum IRNO₂ decreases with increasing temperature. The study explores the response mechanism of O₃ to changes in driving variables, which can provide a scientific foundation and methodological support for the targeted management of O₃ pollution.
Funding: Supported by the National Natural Science Foundation of China (No. U21B2062).
Abstract: The isolated fracture-vug systems controlled by small-scale strike-slip faults within ultra-deep carbonate rocks of the Tarim Basin exhibit significant exploration potential. The study employs a novel training set incorporating innovative fault labels to train a U-Net-structured CNN model, enabling effective identification of small-scale strike-slip faults through seismic data interpretation. Based on the CNN faults, we analyze the distribution patterns of small-scale strike-slip faults. The small-scale strike-slip faults can be categorized into NNW-trending and NE-trending groups with strike lengths ranging from 200 to 5000 m. The development intensity of small-scale strike-slip faults in the Lower Yingshan Member notably exceeds that in the Upper Member. The Lower and Upper Yingshan members are two distinct mechanical layers with contrasting brittleness characteristics, separated by a low-brittleness layer. The superior brittleness of the Lower Yingshan Member enhances the development intensity of small-scale strike-slip faults compared to the Upper Member, while the low-brittleness layer exerts restrictive effects on vertical fault propagation. Fracture-vug systems formed by interactions of two or more small-scale strike-slip faults demonstrate larger sizes than those controlled by individual faults. All fracture-vug system sizes show positive correlations with the vertical extents of associated small-scale strike-slip faults; in particular, intersection and approaching fracture-vug systems exhibit accelerated size increases proportional to the vertical extents.
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFA1009500).
Abstract: With the emphasis on user privacy and communication security, encrypted traffic has increased dramatically, which brings great challenges to traffic classification. Classification methods for encrypted traffic based on GNNs can deal with encrypted traffic well. However, existing GNN-based approaches ignore the relationship between client or server packets. In this paper, we design a network traffic topology based on GCN, called Flow Mapping Graph (FMG). FMG establishes sequential edges between vertices by the arrival order of packets and establishes jump-order edges between vertices by connecting packets in different bursts with the same direction. It not only reflects the time characteristics of the packets but also strengthens the relationship between the client or server packets. According to FMG, a Traffic Mapping Classification model (TMC-GCN) is designed, which can automatically capture and learn the characteristics and structure information of the top vertex in FMG. The TMC-GCN model is used to classify the encrypted traffic. The encrypted traffic classification problem is transformed into a graph classification problem, which can effectively deal with data from different data sources and application scenarios. By comparing the performance of TMC-GCN with other classical models on four public datasets, including CICIOT2023, ISCXVPN2016, CICAAGM2017, and GraphDapp, the effectiveness of the FMG algorithm is verified. The experimental results show that the accuracy of the TMC-GCN model is 96.13%, the recall is 95.04%, and the F1 score is 94.54%.
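The two edge types described in the abstract above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' FMG construction: a flow is represented as a list of packet directions in arrival order (+1 for client to server, -1 for server to client), a burst is taken to be a maximal run of consecutive packets with the same direction, and a jump-order edge is assumed to link the last packet of a burst to the first packet of the next burst with the same direction. The function name `build_fmg_edges` is hypothetical.

```python
def build_fmg_edges(directions):
    """Return (sequential_edges, jump_edges) for one flow.

    directions: list of +1 / -1 values, one per packet, in arrival order.
    """
    n = len(directions)

    # Sequential edges follow the arrival order of packets.
    sequential = [(i, i + 1) for i in range(n - 1)]

    # Split the flow into bursts: maximal runs with one direction.
    bursts = []
    for i, d in enumerate(directions):
        if bursts and directions[bursts[-1][-1]] == d:
            bursts[-1].append(i)
        else:
            bursts.append([i])

    # Jump-order edges (assumed rule): last packet of the previous
    # same-direction burst -> first packet of the current burst.
    jump = []
    last_burst_by_direction = {}
    for burst in bursts:
        d = directions[burst[0]]
        if d in last_burst_by_direction:
            jump.append((last_burst_by_direction[d][-1], burst[0]))
        last_burst_by_direction[d] = burst

    return sequential, jump

# Example flow: two client packets, one server, one client, two server.
seq_edges, jump_edges = build_fmg_edges([1, 1, -1, 1, -1, -1])
```

For this example flow the bursts are packets {0, 1}, {2}, {3}, and {4, 5}, so the jump-order edges skip over the intervening burst of the opposite direction.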
Abstract: The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
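For readers comparing the error figures quoted above, the two error metrics have standard definitions, sketched here in plain Python. The sample values are made up for illustration and are not taken from the BKK, PeMS08, or Los_Loop datasets; the abstract does not define its "accuracy" measure, so it is not reproduced here.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared difference."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical observed and predicted traffic values.
observed = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.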
Funding: Supported by the Key Research and Development Program of Jiangsu Province under Grant BE2022059-3, by CTBC Bank through the Industry-Academia Cooperation Project, and by the Ministry of Science and Technology of Taiwan through Grants MOST-108-2218-E-002-055, MOST-109-2223-E-009-002-MY3, MOST-109-2218-E-009-025, and MOST431109-2218-E-002-015.
Abstract: Micro-expression (ME) recognition is a complex task that requires advanced techniques to extract informative features from facial expressions. Numerous deep neural networks (DNNs) with convolutional structures have been proposed. However, shallow convolutional neural networks often outperform deeper models in mitigating overfitting, particularly with small datasets. Still, many of these methods rely on a single feature for recognition, which limits their ability to extract highly effective features. To address this limitation, this paper introduces an Improved Dual-stream Shallow Convolutional Neural Network based on an Extreme Gradient Boosting Algorithm (IDSSCNN-XgBoost) for ME recognition. The proposed method uses a dual-stream architecture in which motion vectors (temporal features) are extracted using TV-L1 optical flow and subtle changes (spatial features) are amplified via Eulerian Video Magnification (EVM). These features are processed by IDSSCNN, with an attention mechanism applied to refine the extracted effective features. The outputs are then fused, concatenated, and classified using the XgBoost algorithm. This approach significantly improves recognition accuracy by leveraging both temporal and spatial information, supported by the robust classification power of XgBoost. The proposed method is evaluated on three publicly available ME databases: the Chinese Academy of Sciences Micro-expression Database (CASMEII), the Spontaneous Micro-Expression Database (SMIC-HS), and Spontaneous Actions and Micro-Movements (SAMM). Experimental results indicate that the proposed model achieves outstanding results compared with recent models. The accuracies are 79.01%, 69.22%, and 68.99% on CASMEII, SMIC-HS, and SAMM, and the F1-scores are 75.47%, 68.91%, and 63.84%, respectively. The proposed method also offers operational efficiency and lower computational time.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61261016, 61661025) and the Science and Technology Plan of Gansu Province (No. 20JR10RA273).
Abstract: For image compressed sensing reconstruction, most algorithms reconstruct image blocks one by one and stack many convolutional layers, which usually leads to obvious block effects, high computational complexity, and long reconstruction time. An image compressed sensing reconstruction network based on a self-attention mechanism (SAMNet) was proposed. For compressed sampling, a self-attention convolution was designed, which is conducive to capturing richer features so that the compressed sensing measurements retain more image structure information. For reconstruction, a self-attention mechanism was introduced into the convolutional neural network. A reconstruction network including residual blocks, a bottleneck transformer (BoTNet), and dense blocks was proposed, which strengthened the transfer of image features and reduced the number of parameters dramatically. On the Set5 dataset, at measurement rates of 0.01, 0.04, 0.10, and 0.25, the average peak signal-to-noise ratio (PSNR) of SAMNet is improved by 1.27, 1.23, 0.50, and 0.15 dB, respectively, compared with CSNet+. The running time for reconstructing a 256×256 image is reduced by 0.1473, 0.1789, 0.2310, and 0.2524 s compared with ReconNet. Experimental results showed that SAMNet improved the quality of reconstructed images and reduced the reconstruction time.
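The PSNR figures quoted above are differences in decibels between two reconstructions of the same image. A minimal sketch of the standard PSNR definition follows, assuming 8-bit images with a peak value of 255; the function name `psnr` and the tiny sample images are illustrative and not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as lists of rows of pixel intensities.
    max_value is the peak intensity (255 assumed for 8-bit images).
    """
    ref = [v for row in reference for v in row]
    rec = [v for row in reconstructed for v in row]

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 2x2 grayscale images.
    original = [[0, 255], [128, 64]]
    restored = [[0, 250], [130, 64]]
    print("PSNR (dB):", psnr(original, restored))
```

Because PSNR is logarithmic in the mean squared error, a gain of about 1 dB corresponds to roughly a 20% reduction in mean squared error.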
Abstract: Human disturbance is one of the main causes of geohazards. Metrics for assessing the ecological impact of roads are numerous and follow inconsistent criteria. From the perspective of visual observation, environmental damage can be revealed by detecting areas without vegetation cover in images taken along roads. To this end, an end-to-end environmental damage detection model based on a convolutional neural network is proposed. A 50-layer residual network is used to extract the feature map. The initial parameters are optimized by transfer learning. The method is demonstrated on an example. The dataset, which covers cliff and landslide damage, was collected by the authors along roads in Shennongjia National Forest Park. Results show an average precision (AP) of 0.4703 for cliff damage and 0.4809 for landslide damage. Compared with YOLOv3, the model achieves better accuracy in cliff and landslide detection, although some speed is sacrificed.