Journal Articles
36 articles found
Multi-Head Attention Enhanced Parallel Dilated Convolution and Residual Learning for Network Traffic Anomaly Detection
1
Authors: Guorong Qi, Jian Mao, Kai Huang, Zhengxian You, Jinliang Lin. Computers, Materials & Continua, 2025, No. 2, pp. 2159-2176 (18 pages)
Abnormal network traffic, as a frequent security risk, requires a series of techniques to categorize and detect it. Existing network traffic anomaly detection still faces challenges: the inability to fully extract local and global features, as well as the lack of effective mechanisms to capture complex interactions between features. Additionally, when increasing the receptive field to obtain deeper feature representations, the reliance on increasing network depth leads to a significant increase in computational resource consumption, affecting the efficiency and performance of detection. Based on these issues, firstly, this paper proposes a network traffic anomaly detection model based on parallel dilated convolution and residual learning (Res-PDC). To better explore the interactive relationships between features, the traffic samples are converted into two-dimensional matrices. A module combining parallel dilated convolutions and residual learning (res-pdc) was designed to extract local and global features of traffic at different scales. By utilizing res-pdc modules with different dilation rates, we can effectively capture spatial features at different scales and explore feature dependencies spanning wider regions without increasing computational resources. Secondly, to focus and integrate the information in different feature subspaces, and further enhance and extract the interactions among the features, multi-head attention is added to Res-PDC, resulting in the final model: multi-head attention enhanced parallel dilated convolution and residual learning (MHA-Res-PDC) for network traffic anomaly detection. Finally, comparisons with other machine learning and deep learning algorithms are conducted on the NSL-KDD and CIC-IDS-2018 datasets. The experimental results demonstrate that the proposed method can effectively improve the detection performance.
Keywords: network traffic anomaly detection; multi-head attention; parallel dilated convolution; residual learning
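To make the parallel-dilated-convolution idea in the entry above concrete, here is a minimal PyTorch sketch of a residual block with parallel dilated branches. It is an illustration only, not the authors' released Res-PDC code; the class name, channel counts, and dilation rates (1, 2, 4) are assumptions.

```python
# Illustrative sketch (not the authors' code): a residual block whose parallel
# dilated 3x3 branches see different receptive fields, fused by a 1x1 convolution.
import torch
import torch.nn as nn

class ParallelDilatedResidualBlock(nn.Module):
    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        # 1x1 convolution fuses the concatenated multi-scale branches
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(multi_scale)   # residual connection

# Example: feature maps from an initial convolution over 2-D traffic matrices
x = torch.randn(8, 32, 16, 16)
block = ParallelDilatedResidualBlock(32)
print(block(x).shape)   # torch.Size([8, 32, 16, 16])
```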
An improved deep dilated convolutional neural network for seismic facies interpretation (Cited by: 1)
2
Authors: Na-Xia Yang, Guo-Fa Li, Ting-Hui Li, Dong-Feng Zhao, Wei-Wei Gu. Petroleum Science (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 1569-1583 (15 pages)
With the successful application and breakthrough of deep learning technology in image segmentation, there has been continuous development in the field of seismic facies interpretation using convolutional neural networks. These intelligent and automated methods significantly reduce manual labor, particularly in the laborious task of manually labeling seismic facies. However, the extensive demand for training data imposes limitations on their wider application. To overcome this challenge, we adopt the UNet architecture as the foundational network structure for seismic facies classification, which has demonstrated effective segmentation results even with small-sample training data. Additionally, we integrate spatial pyramid pooling and dilated convolution modules into the network architecture to enhance the perception of spatial information across a broader range. The seismic facies classification test on the public data from the F3 block verifies the superior performance of our proposed improved network structure in delineating seismic facies boundaries. Comparative analysis against the traditional UNet model reveals that our method achieves more accurate predictive classification results, as evidenced by various evaluation metrics for image segmentation. Notably, the classification accuracy reaches an impressive 96%. Furthermore, the results of seismic facies classification in the seismic slice dimension provide further confirmation of the superior performance of our proposed method, which accurately defines the range of different seismic facies. This approach holds significant potential for analyzing geological patterns and extracting valuable depositional information.
Keywords: seismic facies interpretation; dilated convolution; spatial pyramid pooling; internal feature maps; compound loss function
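Since the entry above relies on dilated convolutions to widen the network's perception of spatial information, a small worked calculation helps show why: stacking dilated 3x3 convolutions grows the receptive field much faster than plain 3x3 convolutions at the same parameter cost. The helper below is my own illustration, not code from the paper.

```python
# Receptive-field calculator for a stack of convolutions (my own illustration).
def receptive_field(layers):
    """layers: list of (kernel_size, stride, dilation) tuples, applied in order."""
    rf, jump = 1, 1
    for k, s, d in layers:
        effective_k = d * (k - 1) + 1          # dilation "inflates" the kernel
        rf += (effective_k - 1) * jump
        jump *= s
    return rf

# Three plain 3x3 convs vs. three dilated 3x3 convs (dilation rates 1, 2, 4):
print(receptive_field([(3, 1, 1)] * 3))                    # 7
print(receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)]))  # 15
```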
DcNet: Dilated Convolutional Neural Networks for Side-Scan Sonar Image Semantic Segmentation (Cited by: 2)
3
Authors: ZHAO Xiaohong, QIN Rixia, ZHANG Qilei, YU Fei, WANG Qi, HE Bo. Journal of Ocean University of China (SCIE, CAS, CSCD), 2021, No. 5, pp. 1089-1096 (8 pages)
In ocean explorations, side-scan sonar (SSS) plays a very important role and can quickly depict seabed topography. Assembling the SSS to an autonomous underwater vehicle (AUV) and performing semantic segmentation of an SSS image in real time can realize online submarine geomorphology or target recognition, which is conducive to submarine detection. However, because of the complexity of the marine environment, various noises in the ocean pollute the sonar image, which also encounters the intensity inhomogeneity problem. In this paper, we propose a novel neural network architecture named dilated convolutional neural network (DcNet) that can run in real time while addressing the above-mentioned issues and providing accurate semantic segmentation. The proposed architecture presents an encoder-decoder network to gradually reduce the spatial dimension of the input image and recover the details of the target, respectively. The core of our network is a novel block connection named DCblock, which mainly uses dilated convolution and depthwise separable convolution between the encoder and decoder to attain more context while still retaining high accuracy. Furthermore, our proposed method performs a super-resolution reconstruction to enlarge the dataset with high-quality images. We compared our network to other common semantic segmentation networks performed on an NVIDIA Jetson TX2 using our sonar image datasets. Experimental results show that while the inference speed of the proposed network significantly outperforms state-of-the-art architectures, the accuracy of our method is still comparable, which indicates its potential applications not only in AUVs equipped with SSS but also in marine exploration.
Keywords: side-scan sonar (SSS); semantic segmentation; dilated convolutions; super-resolution
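The DCblock described above combines dilated and depthwise separable convolutions; a hedged PyTorch sketch of one such dilated depthwise-separable layer is shown below. Class and variable names are illustrative, not taken from DcNet.

```python
# Sketch of a dilated depthwise-separable convolution: a per-channel dilated 3x3
# filter followed by a 1x1 channel-mixing convolution (assumed structure, not DcNet's code).
import torch
import torch.nn as nn

class DilatedDepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, dilation: int = 2):
        super().__init__()
        # Depthwise: one dilated 3x3 filter per input channel (groups=in_ch)
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=dilation, dilation=dilation,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 64, 128, 128)      # e.g. an encoder feature map of a sonar image
print(DilatedDepthwiseSeparableConv(64, 128)(x).shape)  # torch.Size([1, 128, 128, 128])
```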
Long Text Classification Algorithm Using a Hybrid Model of Bidirectional Encoder Representation from Transformers-Hierarchical Attention Networks-Dilated Convolutions Network (Cited by: 1)
4
Authors: ZHAO Yuanyuan, GAO Shining, LIU Yang, GONG Xiaohui. Journal of Donghua University (English Edition) (CAS), 2021, No. 4, pp. 341-350 (10 pages)
Text format information makes up most of the resources on the Internet, which puts forward higher and higher requirements for the accuracy of text classification. Therefore, in this manuscript, firstly, we design a hybrid model of bidirectional encoder representation from transformers-hierarchical attention networks-dilated convolutions networks (BERT_HAN_DCN), which is based on the BERT pre-trained model with its superior ability to extract features. The advantages of the HAN model and DCN model are taken into account, which helps gain abundant semantic information, fusing context semantic features and hierarchical characteristics. Secondly, the traditional softmax algorithm increases the learning difficulty of the same kind of samples, making it more difficult to distinguish similar features. Based on this, AM-softmax is introduced to replace the traditional softmax. Finally, the fused model is validated; it shows superior performance in accuracy and F1-score on two datasets compared with general single models such as HAN and DCN based on the BERT pre-trained model. Besides, the improved AM-softmax network model is superior to the general softmax network model.
Keywords: long text classification; dilated convolution; BERT; fusing context semantic features; hierarchical characteristics; BERT_HAN_DCN; AM-softmax
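The AM-softmax substitution mentioned above can be illustrated with a short sketch: cosine logits with an additive margin m subtracted from the target class, scaled by s, and fed to ordinary cross-entropy. The hyperparameter values and names below are assumptions for illustration, not the paper's settings.

```python
# Minimal AM-Softmax (additive-margin softmax) sketch, a drop-in replacement for
# a plain softmax classification head; s and m are illustrative hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AMSoftmaxLoss(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int, s: float = 30.0, m: float = 0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.s, self.m = s, m

    def forward(self, features, labels):
        # Cosine similarity between L2-normalised features and class weights
        cos = F.linear(F.normalize(features), F.normalize(self.weight))
        # Subtract the margin m only from the target-class logit
        margin = F.one_hot(labels, cos.size(1)).float() * self.m
        return F.cross_entropy(self.s * (cos - margin), labels)

loss_fn = AMSoftmaxLoss(feat_dim=768, num_classes=10)
feats, labels = torch.randn(4, 768), torch.randint(0, 10, (4,))
print(loss_fn(feats, labels))
```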
Multi-Scale Dilated Convolutional Neural Network for Hyperspectral Image Classification
5
Authors: Shanshan Zheng, Wen Liu, Rui Shan, Jingyi Zhao, Guoqian Jiang, Zhi Zhang. Journal of Harbin Institute of Technology (New Series) (CAS), 2021, No. 4, pp. 25-32 (8 pages)
Aiming at the problem of image information loss, dilated convolution is introduced and a novel multi-scale dilated convolutional neural network (MDCNN) is proposed. Dilated convolution can aggregate image multi-scale information without reducing the resolution. The first layer of the network uses a spectral convolution step to reduce dimensionality. Then the multi-scale aggregation extracts multi-scale features by applying dilated convolution and shortcut connections. The extracted features, which represent properties of the data, are fed through Softmax to predict the samples. MDCNN achieved overall accuracies of 99.58% and 99.92% on two public datasets, Indian Pines and Pavia University. Compared with four other existing models, the results illustrate that MDCNN can extract better discriminative features and achieve higher classification performance.
Keywords: multi-scale aggregation; dilated convolution; hyperspectral image classification (HSIC); shortcut connection
Channel-Attention DenseNet with Dilated Convolutions for MRI Brain Tumor Classification
6
Authors: Abdu Salam, Mohammad Abrar, Raja Waseem Anwer, Farhan Amin, Faizan Ullah, Isabel de la Torre, Gerardo Mendez Mezquita, Henry Fabian Gongora. Computer Modeling in Engineering & Sciences, 2025, No. 11, pp. 2457-2479 (23 pages)
Brain tumors pose significant diagnostic challenges due to their diverse types and complex anatomical locations. With the rise of precision image-based diagnostic tools, driven by advancements in artificial intelligence (AI) and deep learning, there is potential to improve diagnostic accuracy, especially with Magnetic Resonance Imaging (MRI). However, traditional state-of-the-art models lack the sensitivity essential for reliable tumor identification and segmentation. Thus, our research aims to enhance brain tumor diagnosis in MRI by proposing an advanced model. The proposed model incorporates dilated convolutions to optimize brain tumor segmentation and classification. The proposed model is first trained and later evaluated using the BraTS 2020 dataset. In our proposed model, preprocessing consists of normalization, noise reduction, and data augmentation to improve model robustness. The attention mechanism and dilated convolutions were introduced to increase the model's focus on critical regions and capture finer spatial details without compromising image resolution. We performed experiments to measure efficiency using various metrics, including accuracy, sensitivity, specificity, and the area under the ROC curve (AUC-ROC). The proposed model achieved a high accuracy of 94%, a sensitivity of 93%, a specificity of 92%, and an AUC-ROC of 0.98, outperforming traditional diagnostic models in brain tumor detection. The proposed model accurately identifies tumor regions, while dilated convolutions enhance the segmentation accuracy, especially for complex tumor structures. The proposed model demonstrates significant potential for clinical application, providing reliable and precise brain tumor detection in MRI.
Keywords: artificial intelligence; MRI analysis; deep learning; dilated convolution; DenseNet; brain tumor detection; brain tumor segmentation
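As a rough illustration of the channel-attention idea named in this entry, the following squeeze-and-excitation style gate reweights feature channels; it is an assumed generic form, not the paper's exact module.

```python
# Generic squeeze-and-excitation style channel attention (assumption, not the paper's module):
# global-average-pool each channel, pass through a small bottleneck, and gate the channels.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # squeeze: one value per channel
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                 # excitation: weights in (0, 1)
        )

    def forward(self, x):
        return x * self.gate(x)                           # reweight feature channels

x = torch.randn(2, 64, 120, 120)   # e.g. a DenseNet feature map from an MRI slice
print(ChannelAttention(64)(x).shape)  # torch.Size([2, 64, 120, 120])
```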
Magnetic Resonance Imaging Reconstruction Based on Butterfly Dilated Geometric Distillation
7
Authors: DUO Lin, XU Boyu, REN Yong, YANG Xin. Journal of Shanghai Jiaotong University (Science), 2025, No. 3, pp. 590-599 (10 pages)
In order to improve the reconstruction accuracy of magnetic resonance imaging (MRI), an accurate natural image compressed sensing (CS) reconstruction network is proposed, which combines the advantages of model-based and deep learning-based CS-MRI methods. In theory, enhancing geometric texture details in linear reconstruction is possible. First, the optimization problem is decomposed into two sub-problems: linear approximation and geometric compensation. The linear approximation problem is handled by a data consistency module. Since this processing loses texture details, a neural network layer that explicitly combines image and frequency feature representations is proposed, named the butterfly dilated geometric distillation network. The network introduces the idea of the butterfly operation, skillfully integrating features of the image domain and frequency domain, and avoids the loss of texture details that occurs when features are extracted in a single domain. Finally, a channel feature fusion module is designed by combining a channel attention mechanism and dilated convolution. The channel attention makes the final output feature map focus on the more important parts, thus improving the feature representation ability. The dilated convolution enlarges the receptive field, thereby obtaining denser image feature data. The experimental results show that the peak signal-to-noise ratio of the network is 5.43 dB, 5.24 dB, and 3.89 dB higher than that of the ISTA-Net+, FISTA, and DGDN networks on the brain dataset with a Cartesian sampling mask CS ratio of 10%.
Keywords: butterfly; geometric distillation; dilated convolution; channel attention; image reconstruction
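The data consistency module mentioned above is a standard ingredient of unrolled CS-MRI networks; a minimal sketch is given below, assuming a single-coil setting with a random sampling mask for illustration. Function and variable names are my own, not the paper's.

```python
# Hedged sketch of a k-space data consistency step: re-impose the measured k-space
# samples on the current image estimate, keeping the network's values elsewhere.
import torch

def data_consistency(image, sampled_kspace, mask):
    """image: (H, W) complex tensor; sampled_kspace: measured k-space (zero off-mask);
    mask: boolean (H, W) sampling mask."""
    kspace = torch.fft.fft2(image)
    kspace = torch.where(mask, sampled_kspace, kspace)  # keep measured frequencies
    return torch.fft.ifft2(kspace)

H, W = 128, 128
mask = torch.rand(H, W) < 0.10                          # ~10% sampling ratio (illustrative)
ground_truth = torch.randn(H, W, dtype=torch.complex64)
full_k = torch.fft.fft2(ground_truth)
measured = torch.where(mask, full_k, torch.zeros_like(full_k))  # zero-filled undersampling
estimate = torch.zeros(H, W, dtype=torch.complex64)     # a crude initial reconstruction
print(data_consistency(estimate, measured, mask).abs().mean())
```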
Hard-rock tunnel lithology identification using multiscale dilated convolutional attention network based on tunnel face images (Cited by: 1)
8
Authors: Wenjun ZHANG, Wuqi ZHANG, Gaole ZHANG, Jun HUANG, Minggeng LI, Xiaohui WANG, Fei YE, Xiaoming GUAN. Frontiers of Structural and Civil Engineering (SCIE, EI, CSCD), 2023, No. 12, pp. 1796-1812 (17 pages)
For real-time classification of rock masses in hard-rock tunnels, quick determination of the rock lithology on the tunnel face during construction is essential. Motivated by current breakthroughs in artificial intelligence technology in machine vision, a new automatic detection approach for classifying tunnel lithology based on tunnel face images was developed. The method benefits from residual learning for training a deep convolutional neural network (DCNN), and a multi-scale dilated convolutional attention block is proposed. The block with different dilation rates can provide various receptive fields, and thus it can extract multi-scale features. Moreover, the attention mechanism is utilized to select the salient features adaptively and further improve the performance of the model. In this study, an initial image data set made up of photographs of tunnel faces consisting of basalt, granite, siltstone, and tuff was first collected. After classifying and enhancing the training, validation, and testing data sets, a new image data set was generated. A comparison of the experimental findings demonstrated that the suggested approach outperforms previous classifiers in terms of various indicators, including accuracy, precision, recall, F1-score, and computing time. Finally, a visualization analysis was performed to explain the process of the network in the classification of tunnel lithology through feature extraction. Overall, this study demonstrates the potential of using artificial intelligence methods for in situ rock lithology classification utilizing geological images of the tunnel face.
Keywords: hard-rock tunnel face; intelligent lithology identification; multi-scale dilated convolutional attention network; image classification; deep learning
Convolution-Transformer for Image Feature Extraction (Cited by: 2)
9
Authors: Lirong Yin, Lei Wang, Siyu Lu, Ruiyang Wang, Youshuai Yang, Bo Yang, Shan Liu, Ahmed AlSanad, Salman A. AlQahtani, Zhengtong Yin, Xiaolu Li, Xiaobing Chen, Wenfeng Zheng. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 10, pp. 87-106 (20 pages)
This study addresses the limitations of Transformer models in image feature extraction, particularly their lack of inductive bias for visual structures. Compared to Convolutional Neural Networks (CNNs), Transformers are more sensitive to different hyperparameters of optimizers, which leads to a lack of stability and slow convergence. To tackle these challenges, we propose the Convolution-based Efficient Transformer Image Feature Extraction Network (CEFormer) as an enhancement of the Transformer architecture. Our model incorporates E-Attention, depthwise separable convolution, and dilated convolution to introduce crucial inductive biases, such as translation invariance, locality, and scale invariance, into the Transformer framework. Additionally, we implement a lightweight convolution module to process the input images, resulting in faster convergence and improved stability. This results in an efficient convolution-combined Transformer image feature extraction network. Experimental results on the ImageNet1k Top-1 dataset demonstrate that the proposed network achieves better accuracy while maintaining high computational speed. It achieves up to 85.0% accuracy across various model sizes on image classification, outperforming various baseline models. When integrated into the Mask Region-based Convolutional Neural Network (R-CNN) framework as a backbone network, CEFormer outperforms other models and achieves the highest mean Average Precision (mAP) scores. This research presents a significant advancement in Transformer-based image feature extraction, balancing performance and computational efficiency.
Keywords: Transformer; E-Attention; depth convolution; dilated convolution; CEFormer
DTCC: Multi-level dilated convolution with transformer for weakly-supervised crowd counting
10
Authors: Zhuangzhuang Miao, Yong Zhang, Yuan Peng, Haocheng Peng, Baocai Yin. Computational Visual Media (SCIE, EI, CSCD), 2023, No. 4, pp. 859-873 (15 pages)
Crowd counting provides an important foundation for public security and urban management. Due to the existence of small targets and large density variations in crowd images, crowd counting is a challenging task. Mainstream methods usually apply convolutional neural networks (CNNs) to regress a density map, which requires annotations of individual persons and counts. Weakly-supervised methods can avoid detailed labeling and only require counts as annotations of images, but existing methods fail to achieve satisfactory performance because a global perspective field and multi-level information are usually ignored. We propose a weakly-supervised method, DTCC, which effectively combines multi-level dilated convolution and transformer methods to realize end-to-end crowd counting. Its main components include a recursive swin transformer and a multi-level dilated convolution regression head. The recursive swin transformer combines a pyramid visual transformer with a fine-tuned recursive pyramid structure to capture deep multi-level crowd features, including global features. The multi-level dilated convolution regression head includes multi-level dilated convolution and a linear regression head for the feature extraction module. This module can capture both low- and high-level features simultaneously to enhance the receptive field. In addition, two regression head fusion mechanisms realize dynamic and mean fusion counting. Experiments on four well-known benchmark crowd counting datasets (UCF_CC_50, ShanghaiTech, UCF_QNRF, and JHU-Crowd++) show that DTCC achieves results superior to other weakly-supervised methods and comparable to fully-supervised methods.
Keywords: crowd counting; Transformer; dilated convolution; global perspective field; pyramid
TSCND: Temporal Subsequence-Based Convolutional Network with Difference for Time Series Forecasting
11
Authors: Haoran Huang, Weiting Chen, Zheming Fan. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 3665-3681 (17 pages)
Time series forecasting plays an important role in various fields, such as energy, finance, transport, and weather. Temporal convolutional networks (TCNs) based on dilated causal convolution have been widely used in time series forecasting. However, two problems weaken the performance of TCNs. One is that in dilated causal convolution, causal convolution leads to the receptive fields of outputs being concentrated in the earlier part of the input sequence, whereas recent input information is severely lost. The other is that the distribution shift problem in time series has not been adequately solved. To address the first problem, we propose a subsequence-based dilated convolution method (SDC). By using multiple convolutional filters to convolve elements of neighboring subsequences, the method extracts temporal features from a growing receptive field via a growing subsequence rather than a single element. Ultimately, the receptive field of each output element can cover the whole input sequence. To address the second problem, we propose a difference and compensation method (DCM). The method reduces the discrepancies between and within the input sequences by difference operations and then compensates the outputs for the information lost due to difference operations. Based on SDC and DCM, we further construct a temporal subsequence-based convolutional network with difference (TSCND) for time series forecasting. The experimental results show that TSCND can reduce prediction mean squared error by 7.3% and save runtime, compared with state-of-the-art models and vanilla TCN.
Keywords: difference; data prediction; time series; temporal convolutional network; dilated convolution
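For readers unfamiliar with the dilated causal convolutions that TCNs (and the SDC variant above) build on, here is a minimal PyTorch sketch: left-only padding keeps the convolution causal while dilation widens the receptive field. It illustrates the vanilla operation, not the paper's subsequence-based method.

```python
# A minimal dilated *causal* 1-D convolution: pad only on the past side so each
# output never sees future time steps; dilation stretches the kernel over time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalConv1d(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 2):
        super().__init__()
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        x = F.pad(x, (self.left_pad, 0))       # pad only on the left (past) side
        return self.conv(x)

x = torch.randn(4, 8, 96)                      # e.g. 96 past time steps
print(DilatedCausalConv1d(8)(x).shape)         # torch.Size([4, 8, 96])
```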
A Lightweight Convolutional Neural Network with Hierarchical Multi-Scale Feature Fusion for Image Classification (Cited by: 2)
12
Authors: Adama Dembele, Ronald Waweru Mwangi, Ananda Omutokoh Kube. Journal of Computer and Communications, 2024, No. 2, pp. 173-200 (28 pages)
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in the DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
Keywords: MobileNet; image classification; lightweight convolutional neural network; depthwise dilated separable convolution; hierarchical multi-scale feature fusion
A Novel Self-Supervised Learning Network for Binocular Disparity Estimation (Cited by: 1)
13
Authors: Jiawei Tian, Yu Zhou, Xiaobing Chen, Salman A. AlQahtani, Hongrong Chen, Bo Yang, Siyu Lu, Wenfeng Zheng. Computer Modeling in Engineering & Sciences (SCIE, EI), 2025, No. 1, pp. 209-229 (21 pages)
Two-dimensional endoscopic images are susceptible to interferences such as specular reflections and monotonous texture illumination, hindering accurate three-dimensional lesion reconstruction by surgical robots. This study proposes a novel end-to-end disparity estimation model to address these challenges. Our approach combines a Pseudo-Siamese neural network architecture with pyramid dilated convolutions, integrating multi-scale image information to enhance robustness against lighting interferences. This study introduces a Pseudo-Siamese structure-based disparity regression model that simplifies left-right image comparison, improving accuracy and efficiency. The model was evaluated using a dataset of stereo endoscopic videos captured by the Da Vinci surgical robot, comprising simulated silicone heart sequences and real heart video data. Experimental results demonstrate significant improvement in the network's resistance to lighting interference without substantially increasing parameters. Moreover, the model exhibited faster convergence during training, contributing to overall performance enhancement. This study advances endoscopic image processing accuracy and has potential implications for surgical robot applications in complex environments.
Keywords: parallax estimation; parallax regression model; self-supervised learning; Pseudo-Siamese neural network; pyramid dilated convolution; binocular disparity estimation
3DMAU-Net: liver segmentation network based on 3D U-Net
14
Authors: ZHU Dong, MA Tianyi, YANG Mengzhu, LI Guoqiang, HU Shunbo, WANG Yongfang. Optoelectronics Letters, 2025, No. 6, pp. 370-377 (8 pages)
Considering that the three-dimensional (3D) U-Net lacks sufficient local feature extraction for image features and pays insufficient attention to the fusion of high- and low-level features, we propose a new model called 3DMAU-Net based on the 3D U-Net architecture for liver region segmentation. Our model replaces the last two layers of the 3D U-Net with a sliding window-based multilayer perceptron (SMLP), enabling better extraction of local image features. We also design a high- and low-level feature fusion dilated convolution block that focuses on local features and better supplements the surrounding information of the target region. This block is embedded in the entire encoding process, ensuring that the overall network is not simply downsampling. Before each feature extraction, the input features are processed by the dilated convolution block. We validate our experiments on the liver tumor segmentation challenge 2017 (Lits2017) dataset, and our model achieves a Dice coefficient of 0.95, an improvement of 0.015 compared to the 3D U-Net model. Furthermore, we compare our results with other segmentation methods, and our model consistently outperforms them.
Keywords: dilated convolution block; multilayer perceptron; liver region segmentation; feature extraction; liver segmentation; sliding window; local image features; image features
TSMS-InceptionNeXt: A Framework for Image-Based Combustion State Recognition in Counterflow Burners via Feature Extraction Optimization
15
Authors: Huiling Yu, Xibei Jia, Yongfeng Niu, Yizhuo Zhang. Computers, Materials & Continua, 2025, No. 6, pp. 4329-4352 (24 pages)
The counterflow burner is a combustion device used for research on combustion. Utilizing deep convolutional models to identify the combustion state of a counterflow burner from visible flame images facilitates the optimization of the combustion process and enhances combustion efficiency. Among existing deep convolutional models, InceptionNeXt is a deep learning architecture that integrates the ideas of the Inception series and ConvNeXt. It has garnered significant attention for its computational efficiency, remarkable model accuracy, and exceptional feature extraction capabilities. However, since this model still has limitations in the combustion state recognition task, we propose a Triple-Scale Multi-Stage InceptionNeXt (TSMS-InceptionNeXt) combustion state recognition method based on feature extraction optimization. First, to address the InceptionNeXt model's limited ability to capture dynamic features in flame images, we introduce Triplet Attention, which applies attention to the width, height, and Red Green Blue (RGB) dimensions of the flame images to enhance its ability to model dynamic features. Secondly, to address the issue of key information loss in the Inception deep convolution layers, we propose a Similarity-based Feature Concentration (SimC) mechanism to enhance the model's capability to concentrate on critical features. Next, to address the insufficient receptive field of the model, we propose a Multi-Scale Dilated Channel Parallel Integration (MDCPI) mechanism to enhance the model's ability to extract multi-scale contextual information. Finally, to address the issue of the model's Multi-Layer Perceptron Head (MlpHead) neglecting channel interactions, we propose a Channel Shuffle-Guided Channel-Spatial Attention (ShuffleCS) mechanism, which integrates information from different channels to further enhance the representational power of the input features. To validate the effectiveness of the method, experiments are conducted on the counterflow burner flame visible-light image dataset. The experimental results show that the TSMS-InceptionNeXt model achieved an accuracy of 85.71% on the dataset, improving by 2.38% over the baseline model. It achieved accuracy improvements of 10.47%, 4.76%, 11.19%, and 9.28% compared to the Reparameterized Visual Geometry Group (RepVGG), Squeeze-enhanced Axial Transformer (SeaFormer), Simplified Graph Transformers (SGFormer), and VanillaNet models, respectively, effectively enhancing the recognition performance for combustion states in counterflow burners.
Keywords: counterflow burner; combustion state recognition; InceptionNeXt; dilated convolution; channel shuffling
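The channel-shuffle operation underlying the ShuffleCS mechanism above is easy to state in code; the sketch below follows the generic ShuffleNet-style formulation and is only an assumption about the building block, not the paper's exact wiring.

```python
# Generic channel shuffle: split channels into groups, transpose the group and
# per-group axes, and flatten, so subsequent group-wise operations mix information.
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w)   # split channels into groups
    x = x.transpose(1, 2).contiguous()         # swap group and per-group axes
    return x.view(b, c, h, w)                  # flatten back: channels interleaved

x = torch.arange(8).float().view(1, 8, 1, 1)
print(channel_shuffle(x, groups=2).flatten().tolist())  # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]
```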
Prediction of turned surface roughness based on GADF of multi-channel signal fusion and MA-ResNet
16
Authors: SHI Lichen, LIU Tengfei, WANG Haitao. Journal of Measurement Science and Instrumentation, 2025, No. 2, pp. 302-312 (11 pages)
In order to achieve high-precision online prediction of surface roughness during the turning process and improve cutting quality, a prediction method for turned surface roughness based on the Gramian angular difference field (GADF) of multi-channel signal fusion and a multi-scale attention residual network (MA-ResNet) was proposed. Firstly, the multi-channel vibration signals were subdivided into various frequency bands using wavelet packet decomposition, and the sensitive channels were selected for signal fusion by correlation analysis between the signals of the various frequency bands and the surface roughness. Then the fused signals were converted into images using GADF image encoding. Finally, the images were input into a residual network model combining parallel dilated convolution and an attention module for training, and the effectiveness of the model was verified. The proposed method has a root mean square error of 0.0187, a mean absolute error of 0.0143, and a coefficient of determination of 0.8694 in predicting the surface roughness, which is close to the actual value. Therefore, the proposed method has good engineering significance for high-precision prediction and is conducive to online monitoring of surface quality during workpiece processing.
Keywords: signal fusion; Gramian angular difference field; dilated convolution; residual network; roughness prediction
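The GADF encoding used above maps a 1-D signal to a 2-D image that a CNN such as MA-ResNet can consume. A small NumPy sketch of the standard formulation, GADF[i, j] = sin(phi_i - phi_j) with phi = arccos of the rescaled signal, is shown below; the synthetic vibration signal is purely illustrative.

```python
# Gramian angular difference field (GADF): rescale the signal to [-1, 1], map each
# sample to an angle, and build the pairwise sine-of-difference matrix.
import numpy as np

def gadf(signal: np.ndarray) -> np.ndarray:
    x = np.asarray(signal, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # min-max rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    return np.sin(phi[:, None] - phi[None, :])        # GADF[i, j] = sin(phi_i - phi_j)

vibration = np.sin(np.linspace(0, 8 * np.pi, 64)) + 0.1 * np.random.randn(64)
image = gadf(vibration)
print(image.shape)    # (64, 64), ready to feed a 2-D CNN
```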
1D-CNN: Speech Emotion Recognition System Using a Stacked Network with Dilated CNN Features (Cited by: 6)
17
Authors: Mustaqeem, Soonil Kwon. Computers, Materials & Continua (SCIE, EI), 2021, No. 6, pp. 4039-4059 (21 pages)
Emotion recognition from speech data is an active and emerging area of research that plays an important role in numerous applications, such as robotics, virtual reality, behavior assessments, and emergency call centers. Recently, researchers have developed many techniques in this field in order to improve accuracy by utilizing several deep learning approaches, but the recognition rate is still not convincing. Our main aim is to develop a new technique that increases the recognition rate at reasonable computational cost. In this paper, we suggest a new technique, a one-dimensional dilated convolutional neural network (1D-DCNN) for speech emotion recognition (SER), that utilizes hierarchical feature learning blocks (HFLBs) with a bi-directional gated recurrent unit (BiGRU). We designed a one-dimensional CNN network to enhance the speech signals using spectral analysis and to extract hidden patterns from the speech signals, which are fed into a stacked one-dimensional dilated network, the HFLBs. Each HFLB contains one dilated convolution layer (DCL), one batch normalization (BN) layer, and one leaky ReLU layer in order to extract the emotional features using a hierarchical correlation strategy. Furthermore, the learned emotional features are fed into a BiGRU in order to adjust the global weights and recognize temporal cues. The final state of the deep BiGRU is passed to a softmax classifier in order to produce the probabilities of the emotions. The proposed model was evaluated on three benchmark datasets, IEMOCAP, EMO-DB, and RAVDESS, achieving 72.75%, 91.14%, and 78.01% accuracy, respectively.
Keywords: affective computing; one-dimensional dilated convolutional neural network; emotion recognition; gated recurrent unit; raw audio clips
Advanced Face Mask Detection Model Using Hybrid Dilation Convolution Based Method (Cited by: 1)
18
Authors: Shaohan Wang, Xiangyu Wang, Xin Guo. Journal of Software Engineering and Applications, 2023, No. 1, pp. 1-19 (19 pages)
A face-mask object detection model incorporating a hybrid dilation convolutional network, termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF), is proposed in this paper. Furthermore, a lightweight face-mask dataset named Light Masked Face Dataset (LMFD) and a medium-sized face-mask dataset named Masked Face Dataset (MFD), with data augmentation methods applied, are also constructed in this paper. The hybrid dilation convolutional network is able to expand the perception of the convolutional kernel without concern about the discontinuity of image information during the convolution process. On the two datasets constructed above, the trained models are significantly optimized in terms of detection performance, training time, and other related metrics. Using the MFD dataset of 55,905 images, the RHF model requires roughly 10 hours less training time than ResNet50 while achieving better detection results with an mAP of 93.45%.
Keywords: face mask detection; object detection; hybrid dilation convolution; computer vision
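Hybrid dilation typically means stacking convolutions whose dilation rates vary (for example 1, 2, 3) so the combined receptive field has no gaps (the "gridding" effect that a constant rate produces). The sketch below shows that generic pattern; it is not the RHF detector itself, and the rates and channel counts are assumptions.

```python
# Generic hybrid-dilation stack: successive 3x3 convolutions with varying dilation
# rates so the combined sampling pattern covers the receptive field without holes.
import torch
import torch.nn as nn

def hybrid_dilation_stack(channels: int, rates=(1, 2, 3)) -> nn.Sequential:
    layers = []
    for r in rates:
        layers += [nn.Conv2d(channels, channels, kernel_size=3,
                             padding=r, dilation=r, bias=False),
                   nn.BatchNorm2d(channels),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

x = torch.randn(1, 32, 64, 64)
print(hybrid_dilation_stack(32)(x).shape)   # torch.Size([1, 32, 64, 64])
```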
Attention-relation network for mobile phone screen defect classification via a few samples (Cited by: 2)
19
Authors: Jiao Mao, Guoliang Xu, Lijun He, Jiangtao Luo. Digital Communications and Networks (SCIE, CSCD), 2024, No. 4, pp. 1113-1120 (8 pages)
How to use a few defect samples to complete defect classification is a key challenge in the production of mobile phone screens. An attention-relation network for mobile phone screen defect classification is proposed in this paper. The architecture of the attention-relation network contains two modules: a feature extraction module and a feature metric module. Different from other few-shot models, an attention mechanism is applied to metric learning in our model to measure the distance between features, so as to pay attention to the correlation between features and suppress unwanted information. Besides, we combine dilated convolution and skip connections to extract more feature information for follow-up processing. We validate the attention-relation network on the mobile phone screen defect dataset. The experimental results show that the classification accuracy of the attention-relation network is 0.9486 under the 5-way 1-shot training strategy and 0.9039 under the 5-way 5-shot setting. It achieves excellent classification of mobile phone screen defects and outperforms other methods by a clear margin.
Keywords: mobile phone screen defects; a few samples; relation network; attention mechanism; dilated convolution
Chinese named entity recognition with multi-network fusion of multi-scale lexical information (Cited by: 1)
20
Authors: Yan Guo, Hong-Chen Liu, Fu-Jiang Liu, Wei-Hua Lin, Quan-Sen Shao, Jun-Shun Su. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 4, pp. 53-80 (28 pages)
Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today's Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as parts of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone, because the intricate lexical structure often impacts performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character-word conditional random field (CRF) model (BCWC). It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying expansion rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments are conducted on three public datasets, demonstrating that the proposed method outperforms recent advanced baselines. BCWC is capable of addressing the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential in applications of more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Keywords: bi-directional long short-term memory (BiLSTM); Chinese named entity recognition (CNER); iterated dilated convolutional neural network (IDCNN); multi-network integration; multi-scale lexical features
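An iterated dilated CNN (IDCNN) of the kind cited above applies a small stack of 1-D dilated convolutions repeatedly, with shared weights, to cover long contexts cheaply. The block below is a simplified sketch under that assumption; the dimensions and dilation rates are illustrative, not the paper's configuration.

```python
# Simplified IDCNN block for sequence labelling: a short stack of 1-D dilated
# convolutions (dilations 1, 1, 2) reapplied several times with shared weights.
import torch
import torch.nn as nn

class IDCNNBlock(nn.Module):
    def __init__(self, hidden: int, dilations=(1, 1, 2), iterations: int = 4):
        super().__init__()
        self.iterations = iterations
        self.stack = nn.ModuleList([
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])

    def forward(self, x):                 # x: (batch, hidden, seq_len), e.g. token embeddings
        for _ in range(self.iterations):  # the same convolutions are reused each iteration
            for conv in self.stack:
                x = torch.relu(conv(x))
        return x

tokens = torch.randn(2, 256, 50)          # 50 characters, 256-dim embeddings
print(IDCNNBlock(256)(tokens).shape)      # torch.Size([2, 256, 50])
```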