Objective: N6-methyladenosine (m6A), the most prevalent epigenetic modification in eukaryotic RNA, plays a pivotal role in regulating cellular differentiation and developmental processes, with its dysregulation implicated in diverse pathological conditions. Accurate prediction of m6A sites is critical for elucidating their regulatory mechanisms and informing drug development. However, traditional experimental methods are time-consuming and costly. Although various computational approaches have been proposed, challenges remain in feature learning, predictive accuracy, and generalization. Here, we present m6A-PSRA, a dual-branch residual-network-based predictor that fully exploits RNA sequence information to enhance prediction performance and model generalization. Methods: m6A-PSRA adopts a parallel dual-branch network architecture to comprehensively extract RNA sequence features via two independent pathways. The first branch applies one-hot encoding to transform the RNA sequence into a numerical matrix while strictly preserving positional information and sequence continuity. This ensures that the biological context conveyed by nucleotide order is retained. A bidirectional long short-term memory network (BiLSTM) then processes the encoded matrix, capturing both forward and backward dependencies between bases to resolve contextual correlations. The second branch employs a k-mer tokenization strategy (k=3), decomposing the sequence into overlapping 3-mer subsequences to capture local sequence patterns. A pre-trained Doc2vec model maps these subsequences into fixed-dimensional vectors, reducing feature dimensionality while extracting latent global semantic information via context learning. Both branches integrate residual networks (ResNet) and a self-attention mechanism: ResNet mitigates vanishing gradients through skip connections, preserving feature integrity, while self-attention adaptively assigns weights to focus on sequence regions most relevant to methylation prediction. This synergy enhances both feature learning and generalization capability. Results: Across 11 tissues from humans, mice, and rats, m6A-PSRA consistently outperformed existing methods in accuracy (ACC) and area under the curve (AUC), achieving >90% ACC and >95% AUC in every tissue tested, indicating strong cross-species and cross-tissue adaptability. Validation on independent datasets, including three human cell lines (MOLM1, HEK293, A549) and a long-sequence dataset (m6A_IND, 1001 nt), confirmed stable performance across varied biological contexts and sequence lengths. Ablation studies demonstrated that the dual-branch architecture, residual network, and self-attention mechanism each contribute critically to performance, with their combination reducing interference between pathways. Motif analysis revealed an enrichment of m6A sites in guanine (G) and cytosine (C), consistent with known regulatory patterns, supporting the model’s biological plausibility. Conclusion: m6A-PSRA effectively captures RNA sequence features, achieving high prediction accuracy and robust generalization across tissues and species, providing an efficient computational tool for m6A methylation site prediction.
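The two input encodings described in the m6A-PSRA Methods (one-hot encoding and overlapping 3-mer tokenization) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `one_hot_encode` and `kmer_tokens`, the A/C/G/U column order, and the all-zero treatment of unknown symbols are assumptions, and the BiLSTM, Doc2vec, ResNet, and self-attention stages are omitted.

```python
# Illustrative sketch only; names and conventions below are assumptions,
# not taken from the m6A-PSRA paper.
NUCLEOTIDES = "ACGU"  # assumed column order of the one-hot matrix


def one_hot_encode(seq):
    """Return one row per base, with a single 1 in that base's column."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    rows = []
    for base in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        if base in index:  # unknown symbols (e.g. N) stay all-zero (assumption)
            row[index[base]] = 1
        rows.append(row)
    return rows


def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers using a stride of 1."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


example = "GGACU"
print(one_hot_encode(example))  # 5 rows x 4 columns, row order follows the sequence
print(kmer_tokens(example))     # overlapping 3-mers
```

A sequence of length L yields L one-hot rows, so position is preserved, and L - k + 1 overlapping k-mers, which would then serve as the "words" passed to an embedding model such as Doc2vec.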
Bioluminescent tomography (BLT) is a noninvasive imaging technology that uses optical methods to study physiological and pathological processes at the cellular and molecular levels. It is a powerful tool for early diagnosis and treatment of tumors, as well as drug development. However, the simplified optical transmission models and the ill-posed inverse reconstruction limit its wide application. The development of deep learning has provided new potential for extending the applications of optical BLT. Researchers have introduced various methods such as neural networks and self-attention mechanisms to improve reconstruction accuracy. Despite these efforts, weak energy points around the reconstructed light source center still impair reconstruction accuracy. In this study, we propose a dual-branch network based on a combination of attention mechanism and fully connected layers (FC-AM) to reduce centroid error and improve reconstruction performance. The network architecture consists of a fully connected (FC) subnetwork and an attention mechanism-based dual-branch (AMDB) subnetwork. The FC subnetwork processes the input data. The AMDB subnetwork performs deep feature extraction and captures feature information from different perspectives in parallel. Each branch of the AMDB subnetwork is composed of four AM subnets, which extract features through multilayer linear transformations and attention mechanisms. The outputs of the AMDB are combined through feature fusion to produce the final result. Numerical simulations and experimental results demonstrate that the FC-AM network significantly improves BLT reconstruction performance compared to existing methods (KNN_LC and AMLC networks), offering enhanced stability and accuracy.
Image inpainting refers to synthesizing missing content in an image based on known information to restore occluded or damaged regions, which is a typical manifestation of this trend. With the increasing complexity of image inpainting tasks and the growth of data scale, existing deep learning methods still have some limitations. For example, they lack the ability to capture long-range dependencies, and their performance in handling multi-scale image structures is suboptimal. To solve this problem, the paper proposes an image inpainting method based on a parallel dual-branch learnable Transformer network. The encoder of the proposed model generator consists of a dual-branch parallel structure with stacked CNN blocks and Transformer blocks, aiming to extract global and local feature information from images. Furthermore, a dual-branch fusion module is adopted to combine the features obtained from both branches. Additionally, a gated full-scale skip connection module is proposed to further enhance the coherence of the inpainting results and alleviate information loss. Finally, experimental results on three public datasets demonstrate the superior performance of the proposed method.
Brain tumor classification is crucial for personalized treatment planning. Although deep learning-based Artificial Intelligence (AI) models can automatically analyze tumor images, fine details of small tumor regions may be overlooked during global feature extraction. Therefore, we propose a brain tumor Magnetic Resonance Imaging (MRI) classification model based on a global-local parallel dual-branch structure. The global branch employs ResNet50 with Multi-Head Self-Attention (MHSA) to capture global contextual information from whole brain images, while the local branch utilizes VGG16 to extract fine-grained features from segmented brain tumor regions. The features from both branches are processed through a purpose-designed attention-enhanced feature fusion module to filter and integrate important features. Additionally, to address sample imbalance in the dataset, we introduce a category attention block to improve the recognition of minority classes. Experimental results indicate that our method achieved a classification accuracy of 98.04% and a micro-average Area Under the Curve (AUC) of 0.989 in the classification of three types of brain tumors, surpassing several existing pre-trained Convolutional Neural Network (CNN) models. Additionally, feature interpretability analysis validated the effectiveness of the proposed model. This suggests that the method holds significant potential for brain tumor image classification.
Gaze estimation, a crucial non-verbal communication cue, has achieved remarkable progress through convolutional neural networks. However, accurate gaze prediction in unconstrained environments, particularly under extreme head poses, partial occlusions, and abnormal lighting, remains challenging. Existing models often struggle to focus effectively on discriminative ocular features, leading to suboptimal performance. To address these limitations, this paper proposes dual-branch gaze estimation with Gaussian mixture distribution heatmaps and dynamic adaptive loss function (DMGDL), a novel dual-branch gaze estimation algorithm. By introducing Gaussian mixture distribution heatmaps centered on pupil positions as spatial attention guides, the model is enabled to prioritize ocular regions. Additionally, a dual-branch network architecture is designed to separately extract features for yaw and pitch angles, enhancing flexibility and mitigating cross-angle interference. A dynamic adaptive loss function is further formulated to address discontinuities in angle estimation, improving robustness and convergence stability. Experimental evaluations on three benchmark datasets demonstrate that DMGDL outperforms state-of-the-art methods, achieving a mean angular error of 3.98° on the Max Planck Institute for Informatics face gaze (MPIIFaceGaze) dataset, 10.21° on the physically unconstrained gaze estimation in the wild (Gaze360) dataset, and 6.14° on the real-time eye gaze estimation in natural environments (RT-Gene) dataset, exhibiting superior generalization and robustness.
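The gaze results above are reported as mean angular error in degrees. A minimal sketch of how such a metric can be computed from yaw and pitch predictions is shown below. It assumes angles in radians and one commonly used conversion from (yaw, pitch) to a unit gaze vector; the function names and the sign convention are assumptions, and the benchmark datasets may define their axes differently.

```python
import math


def gaze_vector(yaw, pitch):
    """Unit gaze direction from yaw and pitch in radians (assumed convention)."""
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(pred, true):
    """Angle in degrees between two gaze directions given as (yaw, pitch)."""
    a = gaze_vector(*pred)
    b = gaze_vector(*true)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding error
    return math.degrees(math.acos(dot))


def mean_angular_error_deg(preds, trues):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, trues)]
    return sum(errors) / len(errors)
```

Because the error is measured between 3D directions, yaw and pitch errors are combined into a single angle rather than averaged separately.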
A novel dual-branch decoding fusion convolutional neural network model (DDFNet) specifically designed for real-time salient object detection (SOD) on steel surfaces is proposed. DDFNet is based on a standard encoder-decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhance accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
By introducing the dimension-splitting (DS) method into the variational multiscale interpolating element-free Galerkin (VMIEFG) method, a dimension-splitting variational multiscale interpolating element-free Galerkin (DS-VMIEFG) method is proposed for three-dimensional (3D) singularly perturbed convection-diffusion (SPCD) problems. In the DS-VMIEFG method, the 3D problem is decomposed into a series of 2D problems by the DS method, and the discrete equations on each 2D splitting surface are obtained by the VMIEFG method. The improved interpolation-type moving least squares (IIMLS) method is used to construct shape functions in the weak form and to combine the 2D discrete equations into a global system of discrete equations for the 3D SPCD problem. A numerical example verifies the effectiveness of the proposed method for 3D SPCD problems. The numerical solution gradually converges to the analytical solution as the number of nodes increases. For extremely small singular diffusion coefficients, the numerical solution avoids numerical oscillation and remains computationally stable.
Sputum smear tests are critical for the diagnosis of respiratory diseases. Automatic segmentation of bacteria from sputum smear images is important for improving diagnostic efficiency. However, this remains a challenging task owing to the high interclass similarity among different categories of bacteria and the low contrast of the bacterial edges. To explore more levels of global pattern features to promote the distinguishing ability of bacterial categories and maintain sufficient local fine-grained features to ensure accurate localization of ambiguous bacteria simultaneously, we propose a novel dual-branch deformable cross-attention fusion network (DB-DCAFN) for accurate bacterial segmentation. Specifically, we first designed a dual-branch encoder consisting of multiple convolution and transformer blocks in parallel to simultaneously extract multilevel local and global features. We then designed a sparse and deformable cross-attention module to capture the semantic dependencies between local and global features, which can bridge the semantic gap and fuse features effectively. Furthermore, we designed a feature assignment fusion module to enhance meaningful features using an adaptive feature weighting strategy to obtain more accurate segmentation. We conducted extensive experiments to evaluate the effectiveness of DB-DCAFN on a clinical dataset comprising three bacterial categories: Acinetobacter baumannii, Klebsiella pneumoniae, and Pseudomonas aeruginosa. The experimental results demonstrate that the proposed DB-DCAFN outperforms other state-of-the-art methods and is effective at segmenting bacteria from sputum smear images.
To address insufficient feature extraction in single-scale neural network models and the inability of convolutional neural networks to handle sequential tasks in the classification of EEG signals in depression, a hybrid model (BFTCNet) combining a dual-branch convolutional neural network (Bi_CNN) and a temporal convolutional network (TCN) based on feature recalibration (FR) was proposed to classify EEG signals of depressed patients and healthy controls. First, the Bi_CNN module was used to extract mixed EEG features across different frequency bands and different channels. Second, the FR module was used to enhance the features extracted by Bi_CNN. Finally, a TCN with dilated causal convolution was used for sequence learning to capture the temporal dependency between features. In this study, 128-channel resting-state (closed-eye) EEG data from the public dataset MODMA were used as experimental data, including 29 healthy controls and 24 depression patients. The performance of the model was evaluated by 10-fold cross-validation. The proposed BFTCNet achieves a classification accuracy of 95.98%, an F1 score of 95.47%, and sensitivity and specificity of 94.21% and 97.50%, respectively. Compared with the single-scale network model EEGNet-8,2, the classification accuracy and F1 score are improved by 1.5% and 1.48%, respectively. Meanwhile, the ablation experiment showed that each sub-module contributed to the improvement of the model’s classification ability.
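The dilated causal convolution used by temporal convolutional networks can be illustrated with a short sketch. This is a generic single-channel version in plain Python, not the BFTCNet implementation; the function name, the zero padding on the left, and the example weights are assumptions chosen for illustration.

```python
def dilated_causal_conv1d(x, weights, dilation=1):
    """Compute y[t] = sum_j weights[j] * x[t - j * dilation].

    Samples before t = 0 are treated as zeros, so each output depends only
    on the current and past inputs (causality).
    """
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(weights):
            src = t - j * dilation
            if src >= 0:  # never read future samples; implicit left zero-padding
                total += w * x[src]
        out.append(total)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
# Kernel of size 2 with dilation 2: each output adds the sample two steps back.
print(dilated_causal_conv1d(signal, [1.0, 1.0], dilation=2))
# prints [1.0, 2.0, 4.0, 6.0, 8.0]
```

With kernel size K and dilation d, one layer covers (K - 1) * d + 1 time steps, which is why stacking layers with growing dilation lets a TCN capture long temporal dependencies with few parameters.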
Extracting useful details from images is essential for Internet of Things projects. However, in real life, external conditions such as bad weather can occlude key target information and distort images, making key information difficult to extract, affecting the judgment of the real situation in Internet of Things processes, and causing system decision-making errors and accidents. In this paper, we mainly address the occlusion of images by rain, removing the rain streaks to obtain a clear, rain-free image. To this end, the single image deraining algorithm is studied, and a dual-branch network structure based on the attention module and the convolutional neural network (CNN) module is proposed to accomplish the task of rain removal. To achieve high-quality rain removal from a single image, we apply the spatial attention module, channel attention module, and CNN module to the network structure, and build the network using an encoder-decoder structure. In the experiment, with the structural similarity (SSIM) and the peak signal-to-noise ratio (PSNR) as evaluation indexes, the training and testing results on the rain removal dataset show that the proposed structure performs well on the single image deraining task.
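Of the two evaluation indexes named above, PSNR has a simple closed form: PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the restored image and the reference. The sketch below computes it for grayscale images stored as nested lists; the function names, the 8-bit peak value of 255, and the toy pixel values are assumptions for illustration, and SSIM is not covered.

```python
import math


def mse(img_a, img_b):
    """Mean squared error between two equally sized grayscale images."""
    diffs = [
        (a - b) ** 2
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)


def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(img_a, img_b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Hypothetical 2x2 images: the derained output differs in one pixel by 16.
reference = [[0, 0], [0, 0]]
derained = [[0, 0], [0, 16]]
print(round(psnr(reference, derained), 2))
```

Higher PSNR means the derained image is closer to the rain-free reference; in the toy case the MSE is 64, giving a PSNR of roughly 30 dB.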
In nature, the most likely reason the helix was chosen as the basic structure of life's molecules is that it is the simplest chiral three-dimensional structure. The conversion from chemical molecules to double-helical molecules is completed by a topological effect, which is the simplest way to form a helix and requires no external power; moreover, the energy of the double helix has a fixed driving direction [1]. The dual-branch loop helix (II), the transition state of the double helix, has many uses: for example, it can turn into a double helix, or it can break into two fragments, a and b, which can construct more complicated structures. The dual-branch loop helix (II) can therefore provide a special "building block" for assembling biomembranes and other life molecules.
Monocular 6D pose estimation is a functional task in the field of computer vision and robotics. In recent years, 2D-3D correspondence-based methods have achieved improved performance in multiview and depth data-based scenes. However, for monocular 6D pose estimation, these methods are affected by the prediction results of the 2D-3D correspondences and the robustness of the perspective-n-point (PnP) algorithm, and a gap remains between their results and the expected estimation quality. To obtain a more effective feature representation, edge enhancement is proposed to increase the shape information of the object, based on an analysis of the influence of inaccurate 2D-3D matching on 6D pose regression and a comparison of the effectiveness of intermediate representations. Furthermore, although the transformation matrix from 3D model points to 2D pixel points is composed of rotation and translation matrices, the two variables are essentially different, and the same network cannot be used for both in the regression process. Therefore, to improve the effectiveness of the PnP algorithm, this paper designs a dual-branch PnP network to predict rotation and translation information. Finally, the proposed method is verified on the public LM, LM-O, and YCB-Video datasets. The ADD(-S) values of the proposed method are 94.2 and 62.84 on the LM and LM-O datasets, respectively. The AUC of ADD(-S) on YCB-Video is 81.1. These experimental results show that the performance of the proposed method is superior to that of similar methods.
Manhole cover defect recognition is of significant practical importance as it can accurately identify damaged or missing covers, enabling timely replacement and maintenance. Traditional manhole cover detection techniques primarily focus on detecting the presence of covers rather than classifying the types of defects. However, manhole cover defects exhibit small inter-class feature differences and large intra-class feature variations, which makes their recognition challenging. To improve the classification of manhole cover defect types, we propose a Progressive Dual-Branch Feature Fusion Network (PDBFFN). The baseline backbone network adopts a multi-stage hierarchical architecture design using ResNet50 as the visual feature extractor, from which both local and global information is obtained. Additionally, a Feature Enhancement Module (FEM) and a Fusion Module (FM) are introduced to enhance the network’s ability to learn critical features. Experimental results demonstrate that our model achieves a classification accuracy of 82.6% on a manhole cover defect dataset, outperforming several state-of-the-art fine-grained image classification models.
To enhance the accuracy of deep learning methods based on reconstruction discrepancy in satellite anomaly detection tasks, this study proposes a dual-branch reconstruction model (DBRM) and designs a comprehensive satellite anomaly detection framework around this model. First, we introduce the temporal-channel mixer (TC-Mixer) module, which mainly comprises a self-attention layer for capturing long-range temporal dependencies in telemetry data and two types of feed-forward networks (FFN) for extracting complex patterns in the temporal and channel dimensions of telemetry data. This design endows the TC-Mixer module with robust capabilities for extracting complicated dependencies in telemetry data. Second, with the TC-Mixer module as the main component, we designed the DBRM. This model utilizes a shared latent representation layer, allowing the regeneration branch and forecasting branch of the DBRM to share most of the feature extraction network architecture. This approach significantly enhances the model’s regression accuracy while reducing computational complexity. Third, using the DBRM as the core network model, we devised a comprehensive satellite anomaly detection framework. This includes an anomaly criterion that considers the reconstruction discrepancy of both the regeneration and forecasting branches, the peak-over-threshold (POT) method for anomaly thresholding, and an MIC-based feature engineering method, among other components. Finally, we conducted comparative experiments with several state-of-the-art anomaly detection algorithms on two public and one private satellite anomaly detection datasets. The experimental results validate the effectiveness and superiority of our proposed method.
Tactile sensing gives robots abilities such as object recognition, fine manipulation, and natural interaction. However, in real scenarios, robotic tactile recognition of similar objects still suffers from low efficiency and accuracy, resulting from a lack of high-performance sensors and intelligent recognition algorithms. In this paper, a flexible sensor combining a pyramidal microstructure with a gradient conformal ionic gel coating was demonstrated, exhibiting excellent signal-to-noise ratio (48 dB), low detection limit (1 Pa), high sensitivity (92.96 kPa^(-1)), fast response time (55 ms), and outstanding stability over 15,000 compression-release cycles. Furthermore, a Pressure-Slip Dual-Branch Convolutional Neural Network (PSNet) architecture was proposed to separately extract hardness and texture features and perform feature fusion. In tactile experiments on different kinds of leaves, a recognition rate of 97.16% was achieved, surpassing that of recognition by human hands (72.5%). These results show great potential for broad application in bionic robots, intelligent prostheses, and precise human-computer interaction.
Funding (m6A-PSRA): supported by grants from the National Natural Science Foundation of China (12361104), Yunnan Fundamental Research Projects (202301AT070016, 202401AT070036), the Youth Talent Program of Xingdian Talent Support Plan (XDYC-QNRC-2022-0514), the Yunnan Province International Joint Laboratory for Intelligent Integration and Application of Ethnic Multilingualism (202403AP140014), and the Open Research Fund of Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University (SMDAYB2023004).
Funding (FC-AM bioluminescent tomography): supported by the National Natural Science Foundation of China (62101439) and the Key Research and Development Program of Shaanxi (2023-YBSF-289).
Funding (dual-branch Transformer image inpainting): supported by the Scientific Research Fund of Hunan Provincial Natural Science Foundation under Grant 20231J60257; the Hunan Provincial Engineering Research Center for Intelligent Rehabilitation Robotics and Assistive Equipment under Grant 2025SH501; and Inha University and Design of a Conflict Detection and Validation Tool under Grant HX2024123.
Abstract: Brain tumor classification is crucial for personalized treatment planning. Although deep learning-based Artificial Intelligence (AI) models can automatically analyze tumor images, fine details of small tumor regions may be overlooked during global feature extraction. Therefore, we propose a brain tumor Magnetic Resonance Imaging (MRI) classification model based on a global-local parallel dual-branch structure. The global branch employs ResNet50 with Multi-Head Self-Attention (MHSA) to capture global contextual information from whole brain images, while the local branch uses VGG16 to extract fine-grained features from segmented brain tumor regions. The features from both branches are passed through an attention-enhanced feature fusion module designed to filter and integrate important features. Additionally, to address sample imbalance in the dataset, we introduce a category attention block to improve the recognition of minority classes. Experimental results indicate that our method achieved a classification accuracy of 98.04% and a micro-average Area Under the Curve (AUC) of 0.989 in the classification of three types of brain tumors, surpassing several existing pre-trained Convolutional Neural Network (CNN) models. Feature interpretability analysis further validated the effectiveness of the proposed model. This suggests that the method holds significant potential for brain tumor image classification.
Funding: Supported by the Key Project of the National Language Commission (No. ZDI145-110); the Academic Research Projects of Beijing Union University (No. ZK20202514); the Key Laboratory Project (No. YYZN-2024-6); and the Project for the Construction and Support of High-Level Innovative Teams in Beijing Municipal Institutions (No. BPHR20220121).
Abstract: Estimating gaze, a crucial non-verbal communication cue, has progressed remarkably through convolutional neural networks. However, accurate gaze prediction in unconstrained environments, particularly under extreme head poses, partial occlusions, and abnormal lighting, remains challenging. Existing models often struggle to focus effectively on discriminative ocular features, leading to suboptimal performance. To address these limitations, this paper proposes dual-branch gaze estimation with Gaussian mixture distribution heatmaps and dynamic adaptive loss function (DMGDL), a novel dual-branch gaze estimation algorithm. By introducing Gaussian mixture distribution heatmaps centered on pupil positions as spatial attention guides, the model is able to prioritize ocular regions. Additionally, a dual-branch network architecture is designed to extract features for yaw and pitch angles separately, enhancing flexibility and mitigating cross-angle interference. A dynamic adaptive loss function is further formulated to address discontinuities in angle estimation, improving robustness and convergence stability. Experimental evaluations on three benchmark datasets demonstrate that DMGDL outperforms state-of-the-art methods, achieving a mean angular error of 3.98° on the Max-Planck institute for informatics face gaze (MPIIFaceGaze) dataset, 10.21° on the physically unconstrained gaze estimation in the wild (Gaze360) dataset, and 6.14° on the real-time eye gaze estimation in natural environments (RT-Gene) dataset, exhibiting superior generalization and robustness.
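The mean angular error reported above can be illustrated with a minimal sketch. It assumes gaze is given as (yaw, pitch) in radians and uses one common convention for converting angles to a 3D direction; the function names and the sign convention are illustrative assumptions, not taken from the DMGDL paper.

```python
import math

def gaze_vector(yaw, pitch):
    """Unit gaze direction from yaw and pitch in radians.

    The sign convention here is an assumption; datasets differ.
    """
    return (
        -math.cos(pitch) * math.sin(yaw),
        -math.sin(pitch),
        -math.cos(pitch) * math.cos(yaw),
    )

def angular_error_deg(pred, truth):
    """Angle in degrees between two gaze directions given as (yaw, pitch)."""
    a = gaze_vector(*pred)
    b = gaze_vector(*truth)
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard acos against rounding drift
    return math.degrees(math.acos(dot))

def mean_angular_error_deg(preds, truths):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, truths)]
    return sum(errors) / len(errors)
```

Because both vectors are unit length, the dot product is the cosine of the angle between them, so the error does not depend on how the difference is split between yaw and pitch.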
Funding: Supported in part by the National Key R&D Program of China (Grant No. 2023YFB3307604); the Shanxi Province Basic Research Program Youth Science Research Project (Grant Nos. 202303021212054 and 202303021212046); the Key Projects Supported by Hebei Natural Science Foundation (Grant No. E2024203125); the National Science Foundation of China (Grant No. 52105391); the Hebei Provincial Science and Technology Major Project (Grant No. 23280101Z); and the National Key Laboratory of Metal Forming Technology and Heavy Equipment Open Fund (Grant No. S2308100.W17).
Abstract: A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder-decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Second, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Third, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Funding: Supported by the Natural Science Foundation of Zhejiang Province, China (Grant Nos. LY20A010021, LY19A010002, LY20G030025) and the Natural Science Foundation of Ningbo City, China (Grant Nos. 2021J147, 2021J235).
Abstract: By introducing the dimension splitting (DS) method into the variational multiscale interpolating element-free Galerkin (VMIEFG) method, a dimension-splitting variational multiscale interpolating element-free Galerkin (DS-VMIEFG) method is proposed for three-dimensional (3D) singularly perturbed convection-diffusion (SPCD) problems. In the DS-VMIEFG method, the 3D problem is decomposed into a series of 2D problems by the DS method, and the discrete equations on each 2D splitting surface are obtained by the VMIEFG method. The improved interpolation-type moving least squares (IIMLS) method is used to construct shape functions in the weak form and to combine the 2D discrete equations into a global system of discrete equations for the 3D SPCD problem. A numerical example verifies the effectiveness of the proposed method for 3D SPCD problems. The numerical solution gradually converges to the analytical solution as the number of nodes increases. For extremely small singular diffusion coefficients, the numerical solution avoids numerical oscillation and remains computationally stable.
Funding: Supported by the Natural Science Foundation of Shandong Province, No. ZR2021MH213, and in part by the Suzhou Science and Technology Bureau, No. SJC2021023.
Abstract: Sputum smear tests are critical for the diagnosis of respiratory diseases. Automatic segmentation of bacteria from sputum smear images is important for improving diagnostic efficiency. However, this remains a challenging task owing to the high interclass similarity among different categories of bacteria and the low contrast of the bacterial edges. To explore more levels of global pattern features, which help distinguish bacterial categories, while maintaining sufficient local fine-grained features to localize ambiguous bacteria accurately, we propose a novel dual-branch deformable cross-attention fusion network (DB-DCAFN) for accurate bacterial segmentation. Specifically, we first designed a dual-branch encoder consisting of multiple convolution and transformer blocks in parallel to simultaneously extract multilevel local and global features. We then designed a sparse and deformable cross-attention module to capture the semantic dependencies between local and global features, which can bridge the semantic gap and fuse features effectively. Furthermore, we designed a feature assignment fusion module to enhance meaningful features using an adaptive feature weighting strategy to obtain more accurate segmentation. We conducted extensive experiments to evaluate the effectiveness of DB-DCAFN on a clinical dataset comprising three bacterial categories: Acinetobacter baumannii, Klebsiella pneumoniae, and Pseudomonas aeruginosa. The experimental results demonstrate that the proposed DB-DCAFN outperforms other state-of-the-art methods and is effective at segmenting bacteria from sputum smear images.
Funding: Supported by the Natural Science Foundation of Gansu Province (No. 21JR11RA062) and the University Innovation Fund of Gansu Province (No. 2022A-047).
Abstract: To address insufficient feature extraction in single-scale neural network models and the inability of convolutional neural networks to handle sequential tasks when classifying EEG signals in depression, a hybrid model (BFTCNet) combining a dual-branch convolutional neural network (Bi_CNN) and a temporal convolutional network (TCN) based on feature recalibration (FR) is proposed to classify EEG signals of depressed patients and healthy controls. First, the Bi_CNN module extracts mixed EEG features across different frequency bands and different channels. Second, the FR module enhances the features extracted by Bi_CNN. Finally, a TCN with dilated causal convolution performs sequence learning to capture the temporal dependency between features. In this study, 128-channel resting-state (closed-eye) EEG data from the public dataset MODMA were used as experimental data, covering 29 healthy controls and 24 depression patients. The performance of the model was evaluated by 10-fold cross-validation. The proposed BFTCNet achieves a classification accuracy of 95.98%, an F1 score of 95.47%, and sensitivity and specificity of 94.21% and 97.50%, respectively. Compared with the single-scale network model EEGNet-8,2, the classification accuracy and F1 score are improved by 1.5% and 1.48%, respectively. The ablation experiment further showed that each sub-module contributes to the model's classification ability.
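For readers less familiar with the four reported metrics, the following sketch shows how accuracy, sensitivity, specificity, and F1 score follow from binary confusion-matrix counts. The function name and the choice of depression as the positive class are illustrative assumptions; the paper's own evaluation code is not shown in the abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and F1 from confusion counts.

    Assumes the positive class is 'depression' and that both classes
    and at least one positive prediction are present (no zero division).
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }
```

Under 10-fold cross-validation, these values would typically be computed per fold and then averaged, although the abstract does not state which aggregation was used.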
Funding: Supported by the National Natural Science Foundation of China (No. 62001272).
Abstract: Extracting useful details from images is essential for Internet of Things (IoT) projects. However, in real life, external conditions such as bad weather can occlude key target information and distort images, which hinders the extraction of key information, affects judgments about the real situation in IoT processes, and causes system decision-making errors and accidents. This paper addresses rain occlusion in images: removing rain streaks to obtain a clear, rain-free image. To this end, single image deraining is studied, and a dual-branch network structure based on an attention module and a convolutional neural network (CNN) module is proposed to accomplish rain removal. To achieve high-quality rain removal from a single image, we apply a spatial attention module, a channel attention module, and a CNN module to the network structure, and build the network using an encoder-decoder structure. In experiments, with structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) as evaluation metrics, training and testing results on the rain removal dataset show that the proposed structure performs well on the single image deraining task.
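As a brief illustration of one of the two evaluation metrics named above, PSNR can be computed from the mean squared error between a rain-free reference and the derained output. This is a minimal sketch assuming 8-bit pixel values supplied as flat sequences; the function name and the flat-list input are assumptions made for the example, not the paper's code.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR indicates a restored image closer to the reference. SSIM additionally compares local luminance, contrast, and structure, and is not sketched here.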
Abstract: In nature, the most likely reason that the helix was chosen as the basic structure of life molecules is that it is the simplest chiral three-dimensional structure. The conversion of chemical molecules into double-helical molecules is completed by the topology effect, which is the simplest way to form a helix; no external power is needed, and the energy of the double helix has a fixed drive direction [1]. The dual-branch loop helix (II), the transition state of the double helix, has many uses: for example, it can be turned into a double helix, or it can be broken into two fragments, a and b, which can construct more complicated structures. The dual-branch loop helix (II) can therefore provide a special "building block" for assembling biomembranes and other life molecules.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61871196 and 62001176), the Natural Science Foundation of Fujian Province of China (Nos. 2019J01082 and 2020J01085), and the Promotion Program for Young and Middle-aged Teachers in Science and Technology Research of Huaqiao University (ZQN-YX601).
Abstract: Monocular 6D pose estimation is a functional task in the field of computer vision and robotics. In recent years, 2D-3D correspondence-based methods have achieved improved performance in multiview and depth data-based scenes. However, for monocular 6D pose estimation, these methods are affected by the prediction results of the 2D-3D correspondences and the robustness of the perspective-n-point (PnP) algorithm, and their results still fall short of the expected estimation quality. To obtain a more effective feature representation, edge enhancement is proposed to increase the shape information of the object, based on an analysis of how inaccurate 2D-3D matching affects 6D pose regression and a comparison of the effectiveness of intermediate representations. Furthermore, although the transformation matrix from 3D model points to 2D pixel points is composed of rotation and translation matrices, the two variables are essentially different, and the same network cannot be used for both in the regression process. Therefore, to improve the effectiveness of the PnP algorithm, this paper designs a dual-branch PnP network to predict rotation and translation information. Finally, the proposed method is verified on the public LM, LM-O, and YCB-Video datasets. The ADD(-S) values of the proposed method are 94.2 and 62.84 on the LM and LM-O datasets, respectively. The AUC of ADD(-S) on YCB-Video is 81.1. These experimental results show that the performance of the proposed method is superior to that of similar methods.
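The ADD score behind the numbers above is commonly defined as the mean distance between model points transformed by the ground-truth pose and by the predicted pose. The sketch below illustrates only the non-symmetric ADD variant under that common definition; the function names, plain-list matrix layout, and absence of any accuracy threshold are assumptions for the example, not the paper's implementation.

```python
import math

def transform(points, rotation, translation):
    """Apply a 3x3 rotation (row-major nested lists) and a translation to 3D points."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]

def add_metric(points, r_gt, t_gt, r_pred, t_pred):
    """Mean distance between model points under ground-truth and predicted poses."""
    gt = transform(points, r_gt, t_gt)
    pred = transform(points, r_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)
```

For symmetric objects, the ADD-S variant instead takes, for each ground-truth point, the distance to the closest predicted point. Separating `r_pred` and `t_pred` in the signature mirrors the idea of predicting rotation and translation in separate branches.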
Abstract: Manhole cover defect recognition is of significant practical importance as it can accurately identify damaged or missing covers, enabling timely replacement and maintenance. Traditional manhole cover detection techniques primarily focus on detecting the presence of covers rather than classifying the types of defects. However, manhole cover defects exhibit small inter-class feature differences and large intra-class feature variations, which makes their recognition challenging. To improve the classification of manhole cover defect types, we propose a Progressive Dual-Branch Feature Fusion Network (PDBFFN). The baseline backbone network adopts a multi-stage hierarchical architecture using ResNet50 as the visual feature extractor, from which both local and global information is obtained. Additionally, a Feature Enhancement Module (FEM) and a Fusion Module (FM) are introduced to enhance the network's ability to learn critical features. Experimental results demonstrate that our model achieves a classification accuracy of 82.6% on a manhole cover defect dataset, outperforming several state-of-the-art fine-grained image classification models.
Funding: Supported by the Science Center Program of the National Natural Science Foundation of China (Grant No. 62188101); SiYuan Collaborative Innovation Alliance of Artificial Intelligence Science and Technology (Grant No. HTKJ2023SY502003); Heilongjiang Touyan Team; Guangdong Major Project of Basic and Applied Basic Research (Grant No. 2019B030302001); and Shanghai Aerospace Science and Technology Innovation Foundation (Grant No. SAST2021-033).
Abstract: To enhance the accuracy of deep learning methods based on reconstruction discrepancy in satellite anomaly detection tasks, this study proposes a dual-branch reconstruction model (DBRM) and designs a comprehensive satellite anomaly detection framework around this model. First, we introduce the temporal-channel mixer (TC-Mixer) module, which mainly comprises a self-attention layer for capturing long-range temporal dependencies in telemetry data and two types of feed-forward networks (FFN) for extracting complex patterns in the temporal and channel dimensions of telemetry data. This design gives the TC-Mixer module robust capabilities for extracting complicated dependencies in telemetry data. Second, with the TC-Mixer module as the main component, we designed the DBRM. This model uses a shared latent representation layer, allowing the regeneration branch and forecasting branch of the DBRM to share most of the feature extraction network architecture. This approach significantly enhances the model's regression accuracy while reducing computational complexity. Third, using the DBRM as the core network model, we devised a comprehensive satellite anomaly detection framework. This includes an anomaly criterion that considers the reconstruction discrepancy of both the regeneration and forecasting branches, the peak-over-threshold (POT) method for anomaly thresholding, and an MIC-based feature engineering method, among other components. Finally, we conducted comparative experiments with several SOTA anomaly detection algorithms on two public satellite anomaly detection datasets and one private dataset. The experimental results validate the effectiveness and superiority of our proposed method.
Funding: Supported by the Open Project of the State Key Laboratory of Trauma and Chemical Poisoning (SKL202102); the Key R&D and Transformation of Science and Technology Projects in Tibet Autonomous Region (XZ2022RH001); Chongqing Talents Program (CQYC2020030146); the Project of Chongqing Science and Technology Bureau (cstc2021ycjh-bgzxm0345); Chongqing Bayu Scholar Program (DP2020036); and Chongqing Entrepreneurship and Innovation Support Program for Overseas Students Returning to China.
Abstract: Tactile sensing gives robots abilities such as object recognition, fine manipulation, and natural interaction. However, in real scenarios, robotic tactile recognition of similar objects still suffers from low efficiency and accuracy, owing to a lack of high-performance sensors and intelligent recognition algorithms. In this paper, a flexible sensor combining a pyramidal microstructure with a gradient conformal ionic gel coating is demonstrated, exhibiting an excellent signal-to-noise ratio (48 dB), a low detection limit (1 Pa), high sensitivity (92.96 kPa^(-1)), a fast response time (55 ms), and outstanding stability over 15,000 compression-release cycles. Furthermore, a Pressure-Slip Dual-Branch Convolutional Neural Network (PSNet) architecture is proposed to extract hardness and texture features separately and then fuse them. In tactile experiments on different kinds of leaves, a recognition rate of 97.16% was achieved, surpassing that of human hand recognition (72.5%). These results show great potential for broad application in bionic robots, intelligent prostheses, and precise human-computer interaction.