The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. A comparative analysis against baseline models indicates that DMST-GNODE performs best across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend at 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise the model’s advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These results indicate that DMST-GNODE achieves higher accuracy and lower errors than the baseline models across different time intervals and datasets.
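The two error metrics named in this abstract can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the paper; the abstract does not define its "accuracy" measure, so it is left out here.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted traffic values (illustration only).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares each error before averaging, it penalises large deviations more heavily than MAE, which is one reason traffic forecasting studies commonly report both.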
Deep convolutional neural networks have made great progress in semantic segmentation. Because the geometry of the convolution kernel is fixed, standard convolutional neural networks have limited ability to model geometric transformations. Therefore, a deformable convolution is introduced to enhance the adaptability of convolutional networks to spatial transformation. In addition, because of the pooling layers in the network architecture, deep convolutional neural networks cannot adequately segment local objects at the output layer. To overcome this shortcoming, the coarse segmentation predictions of the output layer are processed by fully connected conditional random fields to improve segmentation quality. The proposed method can easily be trained end-to-end using standard backpropagation. Finally, the proposed method is tested on the ISPRS dataset. The results show that it can effectively overcome the influence of the complex structure of the segmented objects and obtain state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
In the textile industry, the presence of defects on the surface of fabric is an essential factor in determining fabric quality. Therefore, identifying fabric defects forms a crucial part of the fabric production process. Traditional fabric defect detection algorithms can only detect specific materials and specific fabric defect types; in addition, their detection efficiency is low, and their detection results are relatively poor. Deep learning-based methods have many advantages in fabric defect detection; however, such methods are less effective in identifying multi-scale fabric defects and defects with complex shapes. Therefore, we propose an effective algorithm, namely multi-layer feature extraction combined with deformable convolution (MFDC), for fabric defect detection. In MFDC, multi-layer feature extraction is used to fuse the underlying location features with high-level classification features through a horizontally connected top-down architecture to improve the detection of multi-scale fabric defects. On this basis, a deformable convolution is added to address the algorithm’s weak detection of irregularly shaped fabric defects. In this approach, RoI Align and Cascade-RCNN are integrated to enhance the adaptability of the algorithm to materials with complex patterned backgrounds. The experimental results show that the MFDC algorithm achieves good detection results for both multi-scale fabric defects and defects with complex shapes, at the expense of a small increase in detection time.
Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale ECG data and rare arrhythmia events in imbalanced datasets. These methods neither perform multi-perspective learning of temporal signals and ECG images nor fully extract the latent information within the data, and so fall short of the accuracy required by clinicians. Therefore, this paper proposes a hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the ECG data and improving the model’s ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.
Traffic flow prediction is a crucial element of intelligent transportation systems. However, accurate traffic flow prediction is quite challenging because of its highly nonlinear, complex, and dynamic characteristics. To address the difficulties in simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow, as well as the high time complexity of existing models, a multi-head flow attention-based local-global dynamic hypergraph convolution (MFA-LGDHC) prediction model is proposed, which consists of a multi-head flow attention (MHFA) mechanism, a graph convolution network (GCN), and local-global dynamic hypergraph convolution (LGHC). MHFA is used to extract the temporal dependency of traffic flow and reduce the time complexity of the model. GCN is employed to capture the spatial dependency of traffic flow. LGHC uses down-sampling convolution and isometric convolution to capture the local and global spatial dependencies of traffic flow, and dynamic hypergraph convolution is used to model the dynamic higher-order relationships of the traffic road network. Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance.
This paper presents CW-HRNet, a high-resolution, lightweight crack segmentation network designed to address challenges in complex scenes with slender, deformable, and blurred crack structures. The model incorporates two key modules: Constrained Deformable Convolution (CDC), which stabilizes geometric alignment by applying a tanh limiter and a learnable scaling factor to the predicted offsets, and the Wavelet Frequency Enhancement Module (WFEM), which decomposes features using Haar wavelets to preserve low-frequency structures while enhancing high-frequency boundaries and textures. Evaluations on the CrackSeg9k benchmark demonstrate CW-HRNet’s superior performance, achieving 82.39% mIoU with only 7.49M parameters and 10.34 GFLOPs, outperforming HrSegNet-B48 by 1.83% in segmentation accuracy with minimal complexity overhead. The model also shows strong cross-dataset generalization, achieving 60.01% mIoU and 66.22% F1 on Asphalt3k without fine-tuning. These results highlight CW-HRNet’s favorable accuracy-efficiency trade-off for real-world crack segmentation tasks.
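The two mechanisms named in this abstract can be sketched in a few lines. The abstract gives no formulas, so the form `scale * tanh(raw)` for the limiter and the one-dimensional, single-level Haar step are assumptions made for illustration; `constrain_offsets` and `haar_step` are hypothetical names, not the authors' code.

```python
import math


def constrain_offsets(raw_offsets, scale):
    """Bound each predicted offset to [-scale, scale] with a tanh limiter.

    Assumed form: offset = scale * tanh(raw). In a network, `scale`
    would be a learnable parameter; here it is a plain float.
    """
    return [scale * math.tanh(raw) for raw in raw_offsets]


def haar_step(signal):
    """One level of a 1D orthonormal Haar transform.

    Returns (low, high): pairwise scaled sums carry the low-frequency
    structure, pairwise scaled differences carry the high-frequency detail.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / root2 for a, b in pairs]
    high = [(a - b) / root2 for a, b in pairs]
    return low, high


# Large raw offsets saturate near +/- scale instead of growing unbounded.
print(constrain_offsets([0.0, 0.5, 100.0, -100.0], scale=2.0))
print(haar_step([1.0, 1.0, 3.0, 5.0]))
```

The limiter keeps sampling locations within a bounded neighbourhood of the regular grid, which is one plausible reading of how the constraint "stabilizes geometric alignment".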
In this paper, an improved spatio-temporal alignment measurement method is presented to address the inertial matching measurement of hull deformation when time delay and a large misalignment angle coexist. A large misalignment angle and time delay often occur simultaneously and pose great challenges to the accurate measurement of hull deformation in space and time. The proposed method combines coarse alignment under a large misalignment angle with time delay estimation based on inertial measurement unit modeling to establish a new spatio-temporally aligned hull deformation measurement model. In addition, a two-step loop control is designed so that the spatio-temporal alignment method accurately describes both the dynamic and the static deformation angles. The experiments illustrate that the proposed method can effectively measure the hull deformation angle when time delay and a large misalignment angle coexist.
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend MSFCN to three dimensions using a three-dimensional (3D) CNN, capable of harnessing each land cover category’s time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Deformable medical image registration plays a vital role in medical image applications, such as placing different temporal images at the same time point or different modality images into the same coordinate system. Various strategies have been developed to satisfy the increasing needs of deformable medical image registration. One popular registration method is estimating the displacement field by computing the optical flow between two images. The motion field (flow field) is computed based on either gray values or handcrafted descriptors such as the scale-invariant feature transform (SIFT). These methods assume that illumination is constant between images. However, medical images may not always satisfy this assumption. In this study, we propose a metric learning-based motion estimation method called Siamese Flow for deformable medical image registration. We train metric learners using a Siamese network, which produces an image patch descriptor that guarantees a smaller feature distance between two similar anatomical structures and a larger feature distance between two dissimilar anatomical structures. In the proposed registration framework, the flow field is computed based on such features and is close to the real deformation field due to the excellent feature representation ability of the Siamese network. Experimental results demonstrate that the proposed method outperforms Demons, SIFT Flow, Elastix, and VoxelMorph in registration accuracy and robustness, particularly with large deformations.
Health monitoring of electro-mechanical actuators (EMAs) is critical to ensuring the safety of airplanes. It is difficult or even impossible to collect enough labeled failure or degradation data from an actual EMA. The autoencoder based on reconstruction loss is a popular model that can carry out anomaly detection using only normal training data, but it fails to capture spatio-temporal information from the multivariate time series signals of multiple monitoring sensors. To mine this spatio-temporal information, this paper proposes an attention graph stacked autoencoder for EMA anomaly detection. Firstly, attention graph convolution is introduced into the autoencoder to convolve temporal information from neighbor features into current features based on different attention weights. Secondly, a stacked autoencoder is applied to mine spatial information from the newly aggregated temporal features. Finally, based on the benchmark reconstruction loss of normal training data, different health thresholds calculated from several statistical indicators are used to carry out anomaly detection on new testing data. In comparison with the traditional stacked autoencoder, the proposed model obtains a higher fault detection rate and a lower false alarm rate in the EMA anomaly detection experiment.
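The final thresholding step described in this abstract can be sketched as follows. The abstract does not say which statistical indicators are used, so the mean-plus-k-standard-deviations rule and the maximum normal loss below are assumptions, and the loss values are invented for illustration.

```python
import statistics


def health_thresholds(normal_losses, k=3.0):
    """Derive candidate thresholds from reconstruction losses on normal data.

    Both indicators are assumed examples, not the paper's choices:
    mean + k * std and the largest loss observed on normal data.
    """
    mean = statistics.fmean(normal_losses)
    std = statistics.pstdev(normal_losses)
    return {"mean_plus_k_std": mean + k * std, "max": max(normal_losses)}


def flag_anomalies(test_losses, threshold):
    """Mark a sample as anomalous when its loss exceeds the threshold."""
    return [loss > threshold for loss in test_losses]


# Hypothetical reconstruction losses (illustration only).
normal = [0.10, 0.12, 0.11, 0.09, 0.13]
thresholds = health_thresholds(normal)
print(flag_anomalies([0.12, 0.50], thresholds["mean_plus_k_std"]))
```

A looser threshold lowers the false alarm rate at the cost of the fault detection rate, which is the trade-off the abstract reports on.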
The intensive application of deep learning in medical image processing has advanced research on automatic retinal vessel segmentation. To overcome the limitation that traditional U-shaped vessel segmentation networks fail to sufficiently extract features from fundus images, we propose a novel network (DSeU-net) based on deformable convolution and a squeeze-excitation residual module. The deformable convolution is used to dynamically adjust the receptive field for feature extraction of retinal vessels, and the squeeze-excitation residual module is used to scale the weights of the low-level features so that the network efficiently learns the complex relationships between the different feature layers. We validate DSeU-net on three public retinal vessel segmentation datasets, DRIVE, CHASEDB1, and STARE, and the experimental results demonstrate the satisfactory segmentation performance of the network.
Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondences. Second, we utilize sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that the proposed DSD-MatchingNet achieves better performance on the image matching benchmark as well as on the visual localization benchmark. Specifically, our method achieved 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
With the advent of deep learning, various deep neural network architectures have been proposed to capture the complex spatio-temporal dependencies in traffic data. This paper introduces a novel Deep Bi-directional Adaptive Gating Graph Convolutional Network (DBAG-GCN) model for spatio-temporal traffic forecasting. The proposed model uses graph convolutional networks to capture the spatial dependencies in the road network topology and incorporates bi-directional gating mechanisms to control the information flow adaptively. Furthermore, we introduce a multi-scale temporal convolution module to capture multi-scale temporal dynamics and a contextual attention mechanism to integrate external factors such as weather conditions and event information. Extensive experiments on real-world traffic datasets demonstrate the superior performance of DBAG-GCN compared to state-of-the-art baselines, achieving significant improvements in prediction accuracy and computational efficiency. The DBAG-GCN model provides a powerful and flexible framework for spatio-temporal traffic forecasting, paving the way for intelligent transportation management and urban planning.
Two rock samples drilled from a typical ultra-deep hydrocarbon reservoir in the Tarim Basin are selected to conduct in-situ stress-loading micro-focus CT scanning experiments. Gray images of the rock microstructure at different stress loading stages are obtained. The U-Net fully convolutional neural network is used to achieve fine semantic segmentation of rock skeleton, pore space, and micro-fractures based on CT slice images of the deep rocks. Three-dimensional digital rock models of the deformed multi-scale fractured-porous media at different stress loading stages are then reconstructed, and the equivalent fracture-pore network models are finally extracted to explore the underlying mechanisms of gas-water two-phase flow at the pore scale. Results indicate that, during in-situ stress loading, both deep rocks experienced three stages: linear elastic deformation, nonlinear plastic deformation, and shear failure. The micro-mechanical behavior greatly affects the dynamic deformation of rock microstructure and gas-water two-phase flow. In the linear elastic deformation stage, as the in-situ stress increases, both deep rocks are gradually compacted, leading to decreases in average pore radius, pore throat ratio, tortuosity, and water-phase relative permeability, while the coordination number remains nearly unchanged. In the plastic deformation stage, the combined influence of rock compaction and the existence of micro-fractures strongly affects pore-throat topological properties and gas-water relative permeability. In the shear failure stage, due to the generation and propagation of micro-fractures inside the deep rock, the topological connectivity improves, fluid flow paths increase, and flow conductivity is promoted, leading to sharp increases in average pore radius and coordination number, rapid decreases in pore throat ratio and tortuosity, and a remarkable improvement in the relative permeability of both the gas phase and the water phase.
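To address the poor segmentation accuracy and robustness of existing methods, caused by the complex shapes and textures of moso bamboo forests in UAV imagery, a moso bamboo forest segmentation network based on cross-domain adaptation and offset guidance, BFSNet, is proposed. With Baishanzu National Park as the study area, UAV images of the surrounding moso bamboo forests were captured to build a dataset. To strengthen the model’s feature extraction capability, a cross-domain adaptation module is proposed that exploits the strong feature extraction ability of the source model and combines it with self-learning to extract features suited to the moso bamboo forest segmentation task, so that the two complement each other. To improve the model’s ability to recognise and localise moso bamboo forests of different shapes, an offset guidance module built on deformable convolution introduces learnable offset parameters to adapt to bamboo forest targets of different shapes. BFSNet was trained and tested on the DeepGlobe Land Cover Classification Challenge dataset and a self-built dataset, and compared with several mainstream image segmentation methods. The results show that BFSNet achieved the best performance on all four metrics (intersection over union, Dice coefficient, precision, and recall), obtaining intersection over union values of 76.04% and 71.93%, respectively. Compared with several mainstream image segmentation models, BFSNet delivered the best moso bamboo forest segmentation, and its precise modelling of bamboo forest shape allows it to handle bamboo forests of different forms effectively.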
Extracting implicit anomaly information through deformation monitoring data mining is highly significant for determining dam safety status. As intelligent singular value diagnostic methods for concrete dam deformation monitoring, shallow neural network models suffer from local optima and overfitting, and require manual feature extraction. To obtain an intelligent singular value diagnosis model that can be used for dam safety monitoring, a convolutional neural network (CNN) model, which has the advantages of deep learning (DL) such as automatic feature extraction, good model fitting, and strong generalizability, was trained in this study. An engineering example shows that the predicted result of the intelligent singular value diagnostic method based on CNN is highly compatible with the confusion matrix, with a precision of 92.41%, receiver operating characteristic (ROC) coordinates of (0.03, 0.97), an area-under-curve (AUC) value of 0.99, and an F1-score of 0.91. Moreover, the CNN model performs better than models based on decision tree (DT) and k-nearest neighbor (KNN) methods. Therefore, the intelligent singular value diagnostic method based on CNN is simple to operate, highly intelligent, and highly reliable, and it has high potential for application in engineering.
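The evaluation figures quoted in this abstract (precision, an ROC coordinate, F1-score) all derive from confusion matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and are not the paper's data, and `binary_metrics` is an illustrative name.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion matrix counts.

    Assumes every denominator is non-zero (at least one predicted
    positive, one actual positive, and one actual negative).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also the true positive rate (ROC y-axis)
    false_positive_rate = fp / (fp + tn)  # ROC x-axis
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
        "f1": f1,
    }


# Hypothetical counts (illustration only).
print(binary_metrics(tp=90, fp=10, fn=10, tn=90))
```

An ROC coordinate such as (0.03, 0.97) is conventionally read as (false positive rate, true positive rate) at a single operating point.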
Existing learning-based super-resolution (SR) reconstruction algorithms are mainly designed for single images and ignore the spatio-temporal relationship between video frames. To bring the advantages of learning-based algorithms to video SR, a novel video SR reconstruction algorithm based on a deep convolutional neural network (CNN) and spatio-temporal similarity (STCNN-SR) is proposed in this paper. It is a deep learning method for video SR reconstruction that considers not only the mapping relationship between associated low-resolution (LR) and high-resolution (HR) image blocks, but also the spatio-temporal non-local complementary and redundant information between adjacent LR video frames. Reconstruction speed is improved noticeably by the pre-trained end-to-end reconstruction coefficients. Moreover, the performance of video SR is further improved by an optimization process based on spatio-temporal similarity. Experimental results demonstrate that the proposed algorithm achieves competitive SR quality in both subjective and objective evaluations when compared with other state-of-the-art algorithms.
Funding: National Key Research and Development Program of China (No. 2017YFC0405806).
Funding: Supported in part by the National Science Foundation of China under Grant 62001236 and in part by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 20KJA520003.
Funding: Supported by the Henan Province Science and Technology Research Project (242102211046), the Key Scientific Research Project of Higher Education Institutions in Henan Province (25A520039), the Natural Science Foundation project of Zhongyuan Institute of Technology (K2025YB011), and the Zhongyuan University of Technology Graduate Education and Teaching Reform Research Project (JG202424).
Funding: Supported by the Key R&D Program of Gansu Province (No. 23YFGA0063), the Key Talent Project of Gansu Province (No. 2024RCXM57, 2024RCXM22), and the Major Science and Technology Special Program of Gansu Province (No. 25ZYJA037).
文摘Traffic flow prediction is a crucial element of intelligent transportation systems.However,accu-rate traffic flow prediction is quite challenging because of its highly nonlinear,complex,and dynam-ic characteristics.To address the difficulties in simultaneously capturing local and global dynamic spatiotemporal correlations in traffic flow,as well as the high time complexity of existing models,a multi-head flow attention-based local-global dynamic hypergraph convolution(MFA-LGDHC)pre-diction model is proposed.which consists of multi-head flow attention(MHFA)mechanism,graph convolution network(GCN),and local-global dynamic hypergraph convolution(LGHC).MHFA is utilized to extract the time dependency of traffic flow and reduce the time complexity of the model.GCN is employed to catch the spatial dependency of traffic flow.LGHC utilizes down-sampling con-volution and isometric convolution to capture the local and global spatial dependencies of traffic flow.And dynamic hypergraph convolution is used to model the dynamic higher-order relationships of the traffic road network.Experimental results indicate that the MFA-LGDHC model outperforms current popular baseline models and exhibits good prediction performance.
Abstract: This paper presents CW-HRNet, a high-resolution, lightweight crack segmentation network designed to address challenges in complex scenes with slender, deformable, and blurred crack structures. The model incorporates two key modules: Constrained Deformable Convolution (CDC), which stabilizes geometric alignment by applying a tanh limiter and a learnable scaling factor to the predicted offsets, and the Wavelet Frequency Enhancement Module (WFEM), which decomposes features using Haar wavelets to preserve low-frequency structures while enhancing high-frequency boundaries and textures. Evaluations on the CrackSeg9k benchmark demonstrate CW-HRNet’s superior performance, achieving 82.39% mIoU with only 7.49M parameters and 10.34 GFLOPs, outperforming HrSegNet-B48 by 1.83% in segmentation accuracy with minimal complexity overhead. The model also shows strong cross-dataset generalization, achieving 60.01% mIoU and 66.22% F1 on Asphalt3k without fine-tuning. These results highlight CW-HRNet’s favorable accuracy-efficiency trade-off for real-world crack segmentation tasks.
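The offset constraint described for CDC can be illustrated with a minimal sketch. It assumes the limiter takes the form `scale * tanh(raw_offset)`; the abstract does not give the exact formula, so this form, the function name `constrain_offsets`, and the sample values are hypothetical, and a fixed number stands in for the learnable scaling factor.

```python
import math

def constrain_offsets(raw_offsets, scale):
    """Keep each offset within [-|scale|, |scale|] using scale * tanh(raw).

    Assumed form of the limiter; 'scale' stands in for the learnable
    scaling factor mentioned in the abstract.
    """
    return [scale * math.tanh(o) for o in raw_offsets]

# Hypothetical raw offsets, including two extreme predictions.
raw = [-50.0, -1.0, 0.0, 0.5, 50.0]
bounded = constrain_offsets(raw, scale=2.0)
print(bounded)
```

Under this assumption, arbitrarily large raw predictions saturate at the scale value, so sampling locations cannot drift far from the regular convolution grid, while small offsets pass through almost unchanged.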
Funding: Supported by the Beijing Institute of Technology Research Fund Program for Young Scholars (2020X04104).
Abstract: In this paper, an improved spatio-temporal alignment measurement method is presented to address the inertial matching measurement of hull deformation when time delay and a large misalignment angle coexist. A large misalignment angle and time delay often occur simultaneously and pose great challenges to the accurate measurement of hull deformation in space and time. The proposed method combines coarse alignment under a large misalignment angle with time delay estimation based on inertial measurement unit modeling to establish a new spatiotemporally aligned hull deformation measurement model. In addition, a two-step loop control is designed to ensure that the time-space alignment method accurately describes both the dynamic and the static deformation angles. The experiments show that the proposed method can effectively measure the hull deformation angle when time delay and a large misalignment angle coexist.
Funding: Supported by the National Natural Science Foundation of China [grant number 41671452].
Abstract: Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend MSFCN to three dimensions using a three-dimensional (3D) CNN, which can harness each land cover category’s time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Funding: This study was supported in part by the Sichuan Science and Technology Program (2019YFH0085, 2019ZDZX0005, 2019YFG0196) and in part by the Foundation of Chengdu University of Information Technology (No. KYTZ202008).
Abstract: Deformable medical image registration plays a vital role in medical image applications, such as placing different temporal images at the same time point or different modality images into the same coordinate system. Various strategies have been developed to satisfy the increasing needs of deformable medical image registration. One popular registration method is estimating the displacement field by computing the optical flow between two images. The motion field (flow field) is computed based on either gray values or handcrafted descriptors such as the scale-invariant feature transform (SIFT). These methods assume that illumination is constant between images. However, medical images may not always satisfy this assumption. In this study, we propose a metric learning-based motion estimation method called Siamese Flow for deformable medical image registration. We train metric learners using a Siamese network, which produces an image patch descriptor that guarantees a smaller feature distance between two similar anatomical structures and a larger feature distance between two dissimilar anatomical structures. In the proposed registration framework, the flow field is computed based on such features and is close to the real deformation field owing to the excellent feature representation ability of the Siamese network. Experimental results demonstrate that the proposed method outperforms the Demons, SIFT Flow, Elastix, and VoxelMorph networks in registration accuracy and robustness, particularly with large deformations.
Funding: Supported by the National Natural Science Foundation of China (Nos. 52075349 and 62303335), the Postdoctoral Researcher Program of China (No. GZC20231779), and the Natural Science Foundation of Sichuan Province (No. 2022NSFSC1942).
Abstract: Health monitoring of the electro-mechanical actuator (EMA) is critical to ensuring the safety of airplanes. It is difficult or even impossible to collect enough labeled failure or degradation data from actual EMAs. The autoencoder based on reconstruction loss is a popular model that can carry out anomaly detection using only normal training data, but it fails to capture spatio-temporal information from the multivariate time series signals of multiple monitoring sensors. To mine this spatio-temporal information, this paper proposes an attention graph stacked autoencoder for EMA anomaly detection. Firstly, attention graph convolution is introduced into the autoencoder to convolve temporal information from neighbor features into current features based on different attention weights. Secondly, a stacked autoencoder is applied to mine spatial information from the newly aggregated temporal features. Finally, based on the benchmark reconstruction loss of normal training data, different health thresholds calculated from several statistical indicators are used to carry out anomaly detection on new testing data. Compared with the traditional stacked autoencoder, the proposed model obtains a higher fault detection rate and a lower false alarm rate in the EMA anomaly detection experiment.
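The final thresholding step can be sketched as follows. The abstract does not say which statistical indicators are used, so the mean plus k standard deviations rule, the function names, and the error values below are all assumptions chosen for illustration.

```python
import statistics

def health_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * population std of reconstruction errors
    measured on normal training data (assumed indicator, not the paper's)."""
    mu = statistics.fmean(normal_errors)
    sigma = statistics.pstdev(normal_errors)
    return mu + k * sigma

def detect_anomalies(test_errors, threshold):
    """Flag a sample as anomalous when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in test_errors]

# Hypothetical reconstruction errors; real values would come from the autoencoder.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08]
threshold = health_threshold(normal_errors)
flags = detect_anomalies([0.11, 0.45, 0.09], threshold)
print(threshold, flags)
```

Because the threshold is fitted on normal data only, no labeled failure data are needed, which matches the motivation stated in the abstract. A percentile of the normal errors would be another plausible indicator.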
Funding: Beijing Natural Science Foundation (No. IS23112); Beijing Institute of Technology Research Fund Program for Young Scholars (No. 6120220236).
Abstract: The intensive application of deep learning in medical image processing has advanced research on automatic retinal vessel segmentation. To overcome the limitation that traditional U-shaped vessel segmentation networks fail to extract features from fundus images sufficiently, we propose a novel network (DSeU-net) based on deformable convolution and a squeeze-excitation residual module. The deformable convolution is used to dynamically adjust the receptive field for feature extraction of retinal vessels, and the squeeze-excitation residual module is used to scale the weights of the low-level features so that the network efficiently learns the complex relationships among the different feature layers. We validate DSeU-net on three public retinal vessel segmentation datasets, DRIVE, CHASEDB1, and STARE, and the experimental results demonstrate the satisfactory segmentation performance of the network.
Funding: Supported by the National Natural Science Foundation of China under Grants 61872241, 62077037, and 62272298, and in part by the Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Abstract: Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondences. Second, we use sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that DSD-MatchingNet achieves better performance on the image matching benchmark as well as on the visual localization benchmark. Specifically, our method achieved 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62202247 and 62306073) and the National Key Research and Development Program of China (No. 2022ZD0115303).
Abstract: With the advent of deep learning, various deep neural network architectures have been proposed to capture the complex spatio-temporal dependencies in traffic data. This paper introduces a novel Deep Bi-directional Adaptive Gating Graph Convolutional Network (DBAG-GCN) model for spatio-temporal traffic forecasting. The proposed model uses graph convolutional networks to capture the spatial dependencies in the road network topology and incorporates bi-directional gating mechanisms to control the information flow adaptively. Furthermore, we introduce a multi-scale temporal convolution module to capture multi-scale temporal dynamics and a contextual attention mechanism to integrate external factors such as weather conditions and event information. Extensive experiments on real-world traffic datasets demonstrate the superior performance of DBAG-GCN compared to state-of-the-art baselines, with significant improvements in prediction accuracy and computational efficiency. The DBAG-GCN model provides a powerful and flexible framework for spatio-temporal traffic forecasting, paving the way for intelligent transportation management and urban planning.
Funding: Supported by the National Natural Science Foundation of China (No. 52174043), the Beijing Natural Science Foundation (No. 3242019), the CNPC Innovation Foundation (No. 2022DQ02-0208), and the State Key Laboratory of Deep Oil and Gas (No. SKLD0G2024-KFZD-06).
Abstract: Two rock samples drilled from a typical ultra-deep hydrocarbon reservoir in the Tarim Basin are selected for in-situ stress-loading micro-focus CT scanning experiments. Gray-scale images of the rock microstructure at different stress loading stages are obtained. The U-Net fully convolutional neural network is used to achieve fine semantic segmentation of the rock skeleton, pore space, and micro-fractures based on CT slice images of the deep rocks. Three-dimensional digital rock models of the deformed multiscale fractured-porous media at different stress loading stages are then reconstructed, and the equivalent fracture-pore network models are finally extracted to explore the underlying mechanisms of gas-water two-phase flow at the pore scale. Results indicate that, during in-situ stress loading, both deep rocks experience three stages: linear elastic deformation, nonlinear plastic deformation, and shear failure. The micro-mechanical behavior greatly affects the dynamic deformation of the rock microstructure and the gas-water two-phase flow. In the linear elastic deformation stage, as the in-situ stress increases, both deep rocks are gradually compacted, leading to decreases in average pore radius, pore throat ratio, tortuosity, and water-phase relative permeability, while the coordination number remains nearly unchanged. In the plastic deformation stage, the synergistic influence of rock compaction and the existence of micro-fractures typically exerts a great effect on pore-throat topological properties and gas-water relative permeability. In the shear failure stage, owing to the generation and propagation of micro-fractures inside the deep rock, the topological connectivity improves, fluid flow paths increase, and flow conductivity is promoted, leading to sharp increases in average pore radius and coordination number, rapid decreases in pore throat ratio and tortuosity, and a remarkable improvement in the relative permeability of both the gas phase and the water phase.
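Abstract: Existing methods show poor segmentation accuracy and robustness on moso bamboo forests because of the complex shapes and textures these forests present in unmanned aerial vehicle (UAV) imagery. To address this problem, BFSNet, a moso bamboo forest segmentation network that applies cross-domain adaptation and offset guidance, is proposed. Taking Baishanzu National Park as the study area, a dataset was built from UAV images of the surrounding moso bamboo forests. To strengthen the model's feature extraction capability, a cross-domain adaptation module is proposed that exploits the strong feature extraction capability of the source model and combines it with self-learning to extract features suited to the moso bamboo forest segmentation task, so that the two complement each other. To improve the model's ability to recognize and localize moso bamboo forests of different shapes, an offset guidance module based on deformable convolution introduces learnable offset parameters that adapt to bamboo forest targets of different shapes. BFSNet was trained and tested on the DeepGlobe Land Cover Classification Challenge dataset and on the self-built dataset, and was compared with several mainstream image segmentation methods. The results show that BFSNet achieved the best performance on all four metrics (intersection over union, Dice coefficient, precision, and recall), with intersection-over-union values of 76.04% and 71.93%, respectively. Compared with several mainstream image segmentation models, BFSNet delivers the best segmentation of moso bamboo forests, and its ability to model bamboo forest shapes precisely allows it to handle bamboo forests of different forms effectively.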
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51579207), the Open Foundation of State Key Laboratory Base of Eco-Hydraulic Engineering in Arid Area (Grant No. 2016ZZKT-8), and the Key Projects of Natural Science Basic Research Program of Shaanxi Province (Grant No. 2018JZ5010).
Abstract: Extracting implicit anomaly information through deformation monitoring data mining is highly significant for determining dam safety status. When used as intelligent singular value diagnostic methods for concrete dam deformation monitoring, shallow neural network models suffer from local optima and overfitting and require manual feature extraction. To obtain an intelligent singular value diagnosis model that can be used for dam safety monitoring, a convolutional neural network (CNN) model, which offers the advantages of deep learning (DL) such as automatic feature extraction, good model fitting, and strong generalizability, was trained in this study. An engineering example shows that the predicted result of the intelligent singular value diagnostic method based on CNN is highly compatible with the confusion matrix, with a precision of 92.41%, receiver operating characteristic (ROC) coordinates of (0.03, 0.97), an area-under-curve (AUC) value of 0.99, and an F1-score of 0.91. Moreover, the CNN model performs better than models based on decision tree (DT) and k-nearest neighbor (KNN) methods. Therefore, the intelligent singular value diagnostic method based on CNN is simple to operate, highly intelligent, and highly reliable, and it has high potential for application in engineering.
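For readers unfamiliar with the reported metrics, the sketch below shows how precision, an ROC operating point (false positive rate, true positive rate), and the F1-score follow from a binary confusion matrix under their standard definitions. The counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # true positive rate (ROC y-coordinate)
    fpr = fp / (fp + tn)               # false positive rate (ROC x-coordinate)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "roc_point": (fpr, recall), "f1": f1}

# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=10, fn=10, tn=890)
print(metrics)
```

The AUC is not computed here because it requires the classifier's scores across all thresholds, not a single confusion matrix.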
Funding: Supported by the National Natural Science Foundation of China (61320106006, 61532006, 61502042).
Abstract: Existing learning-based super-resolution (SR) reconstruction algorithms are mainly designed for single images and ignore the spatio-temporal relationship between video frames. To bring the advantages of learning-based algorithms to the video SR field, a novel video SR reconstruction algorithm based on a deep convolutional neural network (CNN) and spatio-temporal similarity (STCNN-SR) is proposed in this paper. It is a deep learning method for video SR reconstruction that considers not only the mapping relationship between associated low-resolution (LR) and high-resolution (HR) image blocks, but also the spatio-temporal non-local complementary and redundant information between adjacent LR video frames. The reconstruction speed is improved markedly by the pre-trained end-to-end reconstruction coefficients. Moreover, the performance of video SR is further improved by an optimization process based on spatio-temporal similarity. Experimental results demonstrate that the proposed algorithm achieves competitive SR quality in both subjective and objective evaluations when compared with other state-of-the-art algorithms.