Funding: Support provided by the Deanship of Scientific Research at King Saud University through Research Group No. RG-1435-050.
Abstract: With the rapid spread of coronavirus disease 2019 (COVID-19) worldwide, establishing an accurate and fast process to diagnose the disease is important. The routine real-time reverse transcription-polymerase chain reaction (rRT-PCR) test currently in use does not provide such accuracy or speed in the screening process. Deep learning techniques are among the good choices for an accurate and fast COVID-19 screening test. In this study, a new convolutional neural network (CNN) framework for COVID-19 detection using computed tomography (CT) images is proposed. The EfficientNet architecture is applied as the backbone of the proposed network, in which feature maps at different scales are extracted from the input CT scan images. In addition, atrous convolution at different rates is applied to these multi-scale feature maps to generate denser features, which facilitates obtaining COVID-19 findings in CT scan images. The proposed framework is evaluated on a public CT dataset containing 2482 CT scan images from patients of both classes (i.e., COVID-19 and non-COVID-19). To augment the dataset with additional training examples, adversarial example generation is performed. The proposed system outperforms state-of-the-art methods, with values exceeding 99.10% on several metrics, including accuracy, precision, recall, and F1. The proposed system also exhibits good robustness: when trained on a small portion of the data (20%), it achieves an accuracy of 96.16%.
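The atrous (dilated) convolution named in the abstract above can be illustrated with a small, framework-free sketch. It is not the authors' implementation: the 1D setting, the function names `atrous_conv1d` and `multi_rate_features`, the kernel, and the input row are all assumptions chosen for illustration. The point it shows is that the same kernel, applied at a larger rate, covers a wider receptive field without extra weights.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D atrous (dilated) cross-correlation.

    Kernel taps are spaced `rate` samples apart, so the receptive field
    grows to (len(kernel) - 1) * rate + 1 without adding weights.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_rate_features(signal, kernel, rates):
    """Apply the same kernel at several dilation rates (one output per rate)."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


if __name__ == "__main__":
    row = [1, 2, 3, 4, 5, 6, 7]   # hypothetical row of a feature map
    edge = [1, 0, -1]             # hypothetical 3-tap kernel
    for rate, out in multi_rate_features(row, edge, rates=(1, 2)).items():
        print(rate, out)
```

With rate 1 the kernel compares samples two positions apart; with rate 2 it compares samples four positions apart, which is why the output is shorter in valid mode.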
Funding: Supported by the Ministry of Science and Technology (MOST) (Grant Nos. MOST 108-2221-E-150-022-MY3 and MOST 110-2634-F-019-002) and the National Taiwan Ocean University, China.
Abstract: A system for classifying four basic table tennis strokes using wearable devices and deep learning networks is proposed in this study. The wearable device consisted of a six-axis sensor, a Raspberry Pi 3, and a power bank. Multiple kernel sizes were used in a convolutional neural network (CNN) to evaluate their performance in extracting features. Moreover, a multi-scale CNN with two kernel sizes was used to perform feature fusion at different scales by concatenation. The CNN recognized the four table tennis strokes. Experimental data were obtained from 20 research participants who wore sensors on the back of their hands while performing the four strokes in a laboratory environment. The data were collected to verify the performance of the proposed models for wearable devices. Finally, the sensor and multi-scale CNN designed in this study achieved accuracy and F1 scores of 99.58% and 99.16%, respectively, for the four strokes. The accuracy under five-fold cross-validation was 99.87%. This result also shows that the multi-scale CNN has better robustness after five-fold cross-validation.
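The concatenation-based fusion of two kernel sizes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's network: the kernels are fixed rather than learned, the input is a single hypothetical sensor axis, and the names `conv1d_same` and `multiscale_concat` are invented for this example.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def multiscale_concat(signal, kernels):
    """Run one branch per kernel size and stack the branches as channels."""
    return [conv1d_same(signal, k) for k in kernels]


if __name__ == "__main__":
    axis = [0, 0, 1, 0, 0]                   # hypothetical single-axis reading
    small, large = [1, 1, 1], [1, 1, 1, 1, 1]  # hypothetical 3-tap and 5-tap kernels
    for channel in multiscale_concat(axis, [small, large]):
        print(channel)
```

Because both branches use "same" padding, their outputs have equal length and can be stacked channel-wise, which is what makes concatenation a convenient fusion choice.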
Funding: Supported by the Henan Province Key R&D Project under Grant 241111210400; the Henan Provincial Science and Technology Research Project under Grants 252102211047, 252102211062, 252102211055 and 232102210069; the Jiangsu Provincial Scheme Double Initiative Plan JSS-CBS20230474; the XJTLU RDF-21-02-008; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; and the Higher Education Teaching Reform Research and Practice Project of Henan Province under Grant 2024SJGLX0126.
Abstract: Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which integrates the advantages of convolutional neural networks and Transformers, balances computational costs, and achieves high-performance building change detection. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to enhance the distinction between true changes and background noise. CSWR integrates the Haar-based Discrete Wavelet Transform with multi-head cross-attention mechanisms, enabling cross-scale feature fusion while significantly improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets demonstrate MewCDNet's superiority over comparison methods: it achieves F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. Furthermore, MewCDNet exhibits optimal performance on the multi-class SYSU dataset (F1: 82.71%), highlighting its exceptional generalization capability.
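For readers unfamiliar with the Haar-based Discrete Wavelet Transform mentioned above, a single-level 1D version can be sketched in a few lines. This shows only the standard orthonormal Haar step, not the CSWR module; the function names and the sample input are assumptions for illustration.

```python
import math


def haar_dwt(signal):
    """Single-level orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail): pairwise scaled sums and differences.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, recovering the original samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


if __name__ == "__main__":
    row = [4.0, 4.0, 2.0, 0.0]   # hypothetical row crossing an edge
    approx, detail = haar_dwt(row)
    print(approx, detail)
    print(haar_idwt(approx, detail))
```

The detail coefficients are zero where neighbouring samples agree and non-zero where they differ, which is why wavelet detail bands are often used as edge cues.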
Funding: Supported by Western Research Interdisciplinary Initiative R6259A03.
Abstract: Rock fracture mechanisms can be inferred from moment tensors (MT) inverted from microseismic events. However, MT can only be inverted for events whose waveforms are acquired across a network of sensors. This is limiting for underground mines, where the microseismic stations often lack azimuthal coverage. Thus, there is a need for a method to invert fracture mechanisms using waveforms acquired by a sparse microseismic network. Here, we present a novel multi-scale framework to classify whether a rock crack contracts or dilates based on a single waveform. The framework consists of a deep learning model that is initially trained on more than 2,400,000 manually labelled field-scale seismic and microseismic waveforms acquired across 692 stations. Transfer learning is then applied to fine-tune the model on more than 300,000 MT-labelled lab-scale acoustic emission waveforms from 39 individual experiments instrumented with different sensor layouts, loading, and rock types in training. The optimal model achieves an F-score of over 86% on unseen waveforms at both the lab and field scales. This model outperforms existing empirical methods in classifying rock fracture mechanisms monitored by a sparse microseismic network. This facilitates rapid assessment of, and early warning against, various rock engineering hazards such as induced earthquakes and rock bursts.
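Several abstracts in this listing report F-scores, so a short worked computation may help. The sketch below uses the standard F-beta definition for a binary task. The abstract does not state which beta was used, so `beta=1.0` is an assumption, and the labels and predictions are invented toy data, not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true positives, false positives and false negatives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts (beta=1 gives the usual F1)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical labels for five waveforms.
    truth = ["dilate", "dilate", "contract", "dilate", "contract"]
    preds = ["dilate", "contract", "contract", "dilate", "dilate"]
    tp, fp, fn = confusion_counts(truth, preds, positive="dilate")
    print(tp, fp, fn, f_score(tp, fp, fn))
```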
Funding: Supported by the National Natural Science Foundation of China (No. 61862037) and the Lanzhou Jiaotong University Tianyou Innovation Team Project (No. TY202002).
Abstract: To address redundant feature information, insignificant differences in feature representation, and low recognition accuracy for fine-grained images, an MSFResNet network model based on ResNeXt50 is proposed that fuses multi-scale feature information. Firstly, a multi-scale feature extraction module is designed to obtain multi-scale information from feature images using convolution kernels of different scales. Meanwhile, a channel attention mechanism is used to increase the network's acquisition of global information. Secondly, the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network, thus reducing the loss of texture details in the deep feature images and improving the network's generalization ability and recognition accuracy. Finally, the validity of the MSFResNet model is verified on public datasets, and the model is applied to wild mushroom identification. Experimental results show that, compared with the ResNeXt50 network model, the accuracy of the MSFResNet model is improved by 6.01% on the public FGVC-Aircraft dataset. It achieves 99.13% classification accuracy on the wild mushroom dataset, which is 0.47% higher than ResNeXt50. Furthermore, the heat map results show that the MSFResNet model significantly reduces the interference of background information, making the network focus on the location of the main body of the wild mushroom, which can effectively improve the accuracy of wild mushroom identification.
Funding: The National Natural Science Foundation of China (No. 51675098).
Abstract: To address the difficulty of fault identification caused by manual extraction of fault features of rotating machinery, a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed, based on the standard convolutional auto-encoder. In this model, parallel convolutional and deconvolutional kernels of different scales are used to extract features from the input signal and reconstruct it; the feature map extracted by the multi-scale convolutional kernels is then used as the input of the classifier; finally, the parameters of the whole model are fine-tuned using labeled data. Experiments on one set of simulated fault data and two sets of rolling bearing fault data are conducted to validate the proposed method. The results show that the model achieves 99.75%, 99.3% and 100% diagnostic accuracy, respectively. In addition, the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with those of traditional machine learning, convolutional neural networks and a traditional convolutional auto-encoder. The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
Funding: This research was supported by the National Natural Science Foundation of China (No. 62276086), the National Key R&D Program of China (No. 2022YFD2000100), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LTGN23D010002).
Abstract: Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked and need more delicate operations in intelligent picking machines. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet), using a dataset from a tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, in which the proportion of small to medium-sized targets is 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing tea shoots, outperforming the other models. Through ablation experiments, we found that ResNet50, the PointRend strategy, and the Feature Pyramid Network (FPN) architecture improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrate that the proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform shoot segmentation in unstructured environments, distinguishing individual tea shoots and completely extracting shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
Abstract: <div style="text-align:justify;"> In the frequency division duplex (FDD) mode of a massive MIMO system, the system needs to perform coding using channel state information (CSI) to obtain performance gains. However, the number of antennas at the base station has greatly increased, resulting in a rapid increase in the overhead for the user terminal to feed back CSI to the base station. In this article, we propose a method based on a multi-task CNN to achieve compression and reconstruction of channel state information through a multi-scale and multi-channel convolutional neural network. We also introduce a dynamic learning rate model to improve the accuracy of channel state information reconstruction. The simulation results show that, compared with the original CsiNet and other work, the proposed CSI feedback network has better reconstruction performance. </div>
Funding: Supported by the National Natural Science Foundation of China (Nos. 42371449, 41801386).
Abstract: Change detection (CD) plays a crucial role in numerous fields, where both convolutional neural networks (CNNs) and Transformers have demonstrated exceptional performance. However, CNNs suffer from limited receptive fields, hindering their ability to capture global features, while Transformers are constrained by high computational complexity. Recently, the Mamba architecture, which is based on state space models (SSMs), has shown powerful global modeling capabilities while achieving linear computational complexity. Although some researchers have incorporated Mamba into CD tasks, existing Mamba-based remote sensing CD methods struggle to perceive the inherent locality of changed regions when flattening and scanning remote sensing images, which limits the extraction of change features. To address these issues, we propose a novel Mamba-based CD method, termed the difference feature fusion Mamba model (DFFMamba), which mitigates the loss of feature locality caused by traditional Mamba-style scanning. Specifically, two distinct difference feature extraction modules are designed: difference Mamba (DMamba) and local difference Mamba (LDMamba). DMamba extracts difference features by calculating the difference in coefficient matrices between the state-space equations of the bi-temporal features. Building upon DMamba, LDMamba adds a locally adaptive state-space scanning (LASS) strategy to enhance feature locality so as to accurately extract difference features. Additionally, a fusion Mamba (FMamba) module is proposed, which employs a spatial-channel token modeling SSM (SCTMS) unit to integrate multi-dimensional spatio-temporal interactions of change features, thereby capturing their dependencies across both spatial and channel dimensions. To verify the effectiveness of the proposed DFFMamba, extensive experiments are conducted on three datasets: WHU-CD, LEVIR-CD, and CLCD. The results demonstrate that DFFMamba significantly outperforms state-of-the-art CD methods, achieving intersection over union (IoU) scores of 90.67%, 85.04%, and 66.56% on the three datasets, respectively.
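The intersection over union (IoU) metric reported above can be computed for binary change masks as sketched here. The masks are invented toy data, the function name `iou` is illustrative, and returning 1.0 when both masks are empty is a convention assumed for this example rather than something stated in the abstract.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (1 = changed pixel).
    """
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]   # hypothetical predicted change mask
    reference = [[1, 0], [1, 0]]   # hypothetical ground-truth mask
    print(iou(predicted, reference))
```

In this toy case one pixel is marked as changed in both masks and three pixels are marked in at least one, so the score is one third.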