This paper introduces a novel method for medical image retrieval and classification by integrating a multi-scale encoding mechanism with Vision Transformer (ViT) architectures and a dynamic multi-loss function. The multi-scale encoding significantly enhances the model’s ability to capture both fine-grained and global features, while the dynamic loss function adapts during training to optimize classification accuracy and retrieval performance. Our approach was evaluated on the ISIC-2018 and ChestX-ray14 datasets, yielding notable improvements. Specifically, on the ISIC-2018 dataset, our method achieves an F1-Score improvement of +4.84% compared to the standard ViT, with a precision increase of +5.46% for melanoma (MEL). On the ChestX-ray14 dataset, the method delivers an F1-Score improvement of 5.3% over the conventional ViT, with precision gains of +5.0% for pneumonia (PNEU) and +5.4% for fibrosis (FIB). Experimental results demonstrate that our approach outperforms traditional CNN-based models and existing ViT variants, particularly in retrieving relevant medical cases and enhancing diagnostic accuracy. These findings highlight the potential of the proposed method for large-scale medical image analysis, offering improved tools for clinical decision-making through superior classification and case comparison.
The capacity to diagnose faults in rolling bearings is of significant practical importance to ensure the normal operation of the equipment. Frequency-domain features can effectively enhance the identification of fault modes. However, existing methods often suffer from insufficient frequency-domain representation in practical applications, which greatly affects diagnostic performance. Therefore, this paper proposes a rolling bearing fault diagnosis method based on a Multi-Scale Fusion Network (MSFN) using the Time-Division Fourier Transform (TDFT). The method constructs multi-scale channels to extract time-domain and frequency-domain features of the signal in parallel. A multi-level, multi-scale filter-based approach is designed to extract frequency-domain features in a segmented manner. A cross-attention mechanism is introduced to facilitate the fusion of the extracted time-frequency domain features. The performance of the proposed method is validated using the CWRU and Ottawa datasets. The results show that the average accuracy of MSFN under complex noisy signals is 97.75% and 94.41%. The average accuracy under variable load conditions is 98.68%. This demonstrates its significant application potential compared to existing methods.
Segmentation of the retinal vessels in the fundus is crucial for diagnosing ocular diseases. Retinal vessel images often suffer from category imbalance and large scale variations. This ultimately results in incomplete vessel segmentation and poor continuity. In this study, we propose CT-MFENet to address the aforementioned issues. First, the use of a context transformer (CT) allows for the integration of contextual feature information, which helps establish the connection between pixels and solve the problem of incomplete vessel continuity. Second, multi-scale dense residual networks are used instead of a traditional CNN to address the issue of inadequate local feature extraction when the model encounters vessels at multiple scales. In the decoding stage, we introduce a local-global fusion module. It enhances the localization of vascular information and reduces the semantic gap between high- and low-level features. To address the class imbalance in retinal images, we propose a hybrid loss function that enhances the segmentation ability of the model for topological structures. We conducted experiments on the publicly available DRIVE, CHASEDB1, STARE, and IOSTAR datasets. The experimental results show that our CT-MFENet performs better than most existing methods, including the baseline U-Net.
Convolutional neural networks (CNNs) with an encoder-decoder structure are popular in medical image segmentation due to their excellent local feature extraction ability, but they face limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels. The core components of the proposed method include the cross window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing the voxel attention between different branch features, and uses convolution to strengthen the dense local information. The experiments on the prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as Dice similarity coefficient, Intersection over Union, 95% Hausdorff distance, and average symmetric surface distance.
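The overlap indexes named in this abstract can be illustrated with a minimal sketch. The Python snippet below computes the Dice similarity coefficient and Intersection over Union for two binary masks flattened to 0/1 lists. The function name `dice_and_iou`, the toy masks, and the convention for two empty masks are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equal-length binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Count voxels that are foreground in both masks, and in each mask separately.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + target_size)
    iou = intersection / union
    return dice, iou


# Hypothetical toy masks (six voxels each), not data from the paper.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank a pair of segmentations the same way; reporting both, as the abstract does, mainly eases comparison with other papers.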
Noise has traditionally been suppressed or eliminated in seismic data sets by the use of Fourier filters and, to a lesser degree, nonlinear statistical filters. Although these methods are quite useful under specific conditions, they may produce undesirable effects for data with a low signal-to-noise ratio. In this paper, a new method, the multi-scale ridgelet transform, is used in light of the theory of the ridgelet transform. We employ the wavelet transform to perform sub-band decomposition of the signals and then apply non-linear thresholding in the ridgelet domain to every block. In other words, the method is based on the idea of partitioning: at a sufficiently fine scale, a curved singularity looks straight, so the ridgelet transform can work well in such cases. Applications to both synthetic data and actual seismic data from the Sichuan basin, South China, show that the new method eliminates the noise portion of the signal more efficiently and retains a greater amount of geologic data than other methods; the quality and continuity of seismic events, as well as the quality of the section, are clearly improved.
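The "decompose, threshold, reconstruct" pattern described in this abstract can be sketched in a few lines. The example below is deliberately simplified: it substitutes a one-level orthonormal Haar transform on a 1D signal for the paper's wavelet sub-band decomposition and ridgelet-domain processing, so it shows only the thresholding idea. All function names and the threshold value are assumptions for illustration.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t; coefficients with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward exactly (up to floating-point rounding)."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def denoise(signal, t):
    """Threshold only the detail band, then reconstruct the signal."""
    approx, detail = haar_forward(signal)
    detail = [soft_threshold(d, t) for d in detail]
    return haar_inverse(approx, detail)


# Hypothetical toy trace: a small wiggle (4.0 vs 4.2) sits on a step.
trace = [4.0, 4.2, 1.0, 1.0]
cleaned = denoise(trace, t=1.0)
print([round(v, 3) for v in cleaned])
```

Small detail coefficients are treated as noise and removed, while the coarse band that carries the step survives. A ridgelet-domain version would apply the same shrinkage to ridgelet coefficients of each block instead of Haar details.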
Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot. However, features of different sizes and different directions are not sufficiently captured when extracting features from lung X-ray images. A pneumonia classification model based on multi-scale directional feature enhancement, MSD-Net, is proposed in this paper. The main innovations are as follows. Firstly, the Multi-scale Residual Feature Extraction Module (MRFEM) is designed to effectively extract multi-scale features. The MRFEM uses dilated convolutions with different expansion rates to increase the receptive field and extract multi-scale features effectively. Secondly, the Multi-scale Directional Feature Perception Module (MDFPM) is designed, which uses a three-branch structure of convolutions of different sizes to transmit directional features layer by layer, and focuses on the target region to enhance the feature information. Thirdly, the Axial Compression Former Module (ACFM) is designed to perform global calculations to enhance the perception of global features in different directions. To verify the effectiveness of MSD-Net, comparative experiments and ablation experiments are carried out. On the COVID-19 RADIOGRAPHY DATABASE, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.76%, 95.57%, 95.52%, 95.52%, and 98.51%, respectively. On the chest X-ray dataset, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.78%, 95.22%, 96.49%, 95.58%, and 98.11%, respectively. This model effectively improves the accuracy of lung image recognition and provides an important clinical reference for computer-aided diagnosis of pneumonia.
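For readers less familiar with the five scores reported here, the sketch below shows how they follow from the counts of a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical; the abstract does not say how the paper aggregates the scores across classes, so this illustrates the definitions only, not the authors' evaluation protocol.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute Accuracy, Recall, Precision, F1 and Specificity from confusion counts.

    Assumes every denominator is non-zero, i.e. each class and each
    prediction type occurs at least once.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)          # sensitivity: share of true positives found
    precision = tp / (tp + fp)       # share of positive predictions that are correct
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)     # share of true negatives found
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, not results from the paper.
scores = binary_metrics(tp=90, fp=10, fn=10, tn=190)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because negatives usually outnumber positives in screening data, Accuracy and Specificity can both be high while Recall lags, which is one reason abstracts like this one list all five values.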
Infrared-visible image fusion plays an important role in multi-source data fusion, which has the advantage of integrating useful information from multi-source sensors. However, there are still challenges in target enhancement and visual improvement. To deal with these problems, a sub-regional infrared-visible image fusion method (SRF) is proposed. First, morphology and threshold segmentation are applied to extract targets of interest in infrared images. Second, the infrared background is reconstructed based on the extracted targets and the visible image. Finally, target and background regions are fused using a multi-scale transform. Experimental results are obtained using public data for comparison and evaluation, which demonstrate that the proposed SRF has potential benefits over other methods.
BACKGROUND: Recent studies have focused on various methods of wavelet transformation for electroencephalogram (EEG) signals. However, there are very few studies reporting characteristics of multi-scale phase waves during epileptic discharge. OBJECTIVE: To extract multi-scale phase average waveforms from childhood absence epilepsy EEG signals between time and frequency domains using wavelet transformation, to compare EEG signals of absence seizure with pre-epileptic seizure and normal children, and to quantify multi-scale phase average waveforms from childhood absence epilepsy EEG signals. DESIGN, TIME AND SETTING: The case-comparative experiment was performed at the Department of Neuroelectrophysiology, Tianjin Medical University from August 2002 to May 2005. PARTICIPANTS: A total of 15 patients with childhood absence epilepsy from the General Hospital of Tianjin Medical University were enrolled in the study. The patients were not administered anti-epileptic drugs or sedatives prior to EEG testing. In addition, 12 healthy, age- and gender-matched children were also enrolled. METHODS: EEG signals were tested on 15 patients with childhood absence epilepsy and 12 normal children. Epileptic discharge signals during clinical and subclinical seizures were collected 10 and 20 times, respectively. The collected EEG signals were treated with wavelet transformation to extract multi-scale characteristics during absence epilepsy seizure using a conditional sampling method. Multi-scale phase average waveforms were collected using a conditional phase averaging technique. Amplitudes of the phase average waveforms from EEG signals of epilepsy seizure, subclinical epileptic discharge, and normal children were compared and statistically analyzed in the first half-cycle. MAIN OUTCOME MEASURES: Multi-scale wavelet coefficients and the evolution of EEG signals were observed during childhood absence epilepsy seizures using wavelet transformation. Multi-scale phase average waveforms from EEG signals were observed using a conditional sampling method and phase averaging technique. RESULTS: Multi-scale characteristics of EEG signals demonstrated that 12-scale (3 Hz) rhythmical activity was significantly enhanced during childhood absence epilepsy seizure and co-existed with background structure (<1 Hz, low frequency discharge). The phase average wave exhibited opposed phase abnormal rhythm at 3 Hz. Prior to childhood absence epilepsy seizure, EEG detected opposed abnormal α rhythm and 3 Hz composition, which were not detected with traditional EEG. Compared to EEG signals from normal children, epileptic discharges from clinical and subclinical childhood absence epilepsy seizures were positive and amplitude was significantly greater (P < 0.05). CONCLUSION: Wavelet transformation was used to analyze EEG signals from childhood absence epilepsy to obtain multi-scale quantitative characteristics and phase average waveforms. Multi-scale wavelet coefficients of EEG signals correlated with childhood absence epilepsy seizure, and multi-scale waveforms prior to epilepsy seizure were similar to characteristics during the onset period. Compared to normal children, EEG signals during epilepsy seizure exhibited an opposed phase model.
The high-frequency components in the traditional multi-scale transform method are approximately sparse and can represent different detail information. In the low-frequency component, however, very few coefficients lie near zero, so low-frequency image information cannot be represented sparsely. The low-frequency component contains the main energy of the image and depicts the profile of the image. Direct fusion of the low-frequency component is not conducive to obtaining a highly accurate fusion result. Therefore, this paper presents an infrared and visible image fusion method combining the multi-scale and top-hat transforms. On one hand, the new top-hat transform can effectively extract the salient features of the low-frequency component. On the other hand, the multi-scale transform can extract high-frequency detail information at multiple scales and from diverse directions. The combination of the two methods is conducive to the acquisition of more characteristics and more accurate fusion results. For the low-frequency component, a new type of top-hat transform is used to extract low-frequency features, and then different fusion rules are applied to fuse the low-frequency features and the low-frequency background; for the high-frequency components, the product of characteristics method is used to integrate the high-frequency detail information. Experimental results show that the proposed algorithm can obtain more detailed information and clearer infrared target fusion results than the traditional multi-scale transform methods. Compared with the state-of-the-art fusion methods based on sparse representation, the proposed algorithm is simple and efficacious, and the time consumption is significantly reduced.
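The abstract does not specify its "new type" of top-hat transform, so the sketch below shows only the classical white top-hat (signal minus its morphological opening) on a 1D intensity row, to illustrate how a top-hat separates bright salient features from a smooth background. The flat window, the border handling (windows clipped at the edges), and all function names are assumptions for illustration.

```python
def _window_op(values, half_width, op):
    """Apply op (min or max) over a flat sliding window, clipped at the borders."""
    n = len(values)
    return [
        op(values[max(0, i - half_width): min(n, i + half_width + 1)])
        for i in range(n)
    ]


def erode(values, half_width):
    """Grayscale erosion: local minimum."""
    return _window_op(values, half_width, min)


def dilate(values, half_width):
    """Grayscale dilation: local maximum."""
    return _window_op(values, half_width, max)


def opening(values, half_width):
    """Erosion followed by dilation; removes bright structures narrower than the window."""
    return dilate(erode(values, half_width), half_width)


def white_top_hat(values, half_width):
    """Classical white top-hat: what the opening removed, i.e. small bright features."""
    return [v - o for v, o in zip(values, opening(values, half_width))]


# Hypothetical intensity row: flat background with one bright spike.
row = [1, 1, 1, 5, 1, 1, 1]
features = white_top_hat(row, half_width=1)
background = opening(row, half_width=1)
print(features)
print(background)
```

In a fusion pipeline of the kind described, the extracted features and the remaining background could then be fused with different rules, which is the split the abstract refers to as low-frequency features and low-frequency background.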
Recently, there has been a widespread application of deep learning in object detection with Synthetic Aperture Radar (SAR). The current algorithms based on Convolutional Neural Networks (CNN) often achieve good accuracy at the expense of more complex model structures and large parameter counts, which poses a great challenge for real-time and accurate detection of multi-scale targets. To address these problems, we propose a lightweight real-time SAR ship object detector based on detection transformer (LSD-DETR) in this study. First, a lightweight backbone network, LCNet, containing a stem module and an inverted residual structure is constructed to balance the inference speed and detection accuracy of the model. Second, we design a transformer encoder with Cascaded Group Attention (CGA Encoder) to enrich the feature information of small targets in SAR images, which makes detection of small-sized ships more precise. Third, an efficient cross-scale feature fusion pyramid module (C3Het-FPN) is proposed through the lightweight units (C3Het) and the introduction of the weighted bidirectional feature pyramid (BiFPN) structure, which realizes the adaptive fusion of multi-scale features with fewer parameters. Ablation experiments and comparative experiments demonstrate the effectiveness of LSD-DETR. LSD-DETR has 8.8 M parameters (only 20.6% of DETR), the model’s FPS reaches 43.1, and the average detection accuracy mAP50 on the SSDD and HRSID datasets reaches 97.3% and 93.4%, respectively. Compared to advanced methods, LSD-DETR can attain superior precision with fewer parameters, which enables accurate real-time object detection of multi-scale ships in SAR images.
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model’s robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. The processing speed and detection accuracy of face detection have been improved continuously. This paper describes a novel method of fast face detection with a multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times that of a conventional method, which adopts HOG features combined with an SVM classifier, without accuracy degradation.
Cardiovascular diseases are the world’s leading cause of death; therefore, cardiac health has been a fascinating topic for decades. The electrocardiogram (ECG) signal is a comprehensive non-invasive method for determining cardiac health. Various health practitioners use the ECG signal to ascertain critical information about the human heart. In this article, swarm intelligence approaches are used in the biomedical signal processing sector to enhance adaptive hybrid filters and empirical wavelet transforms (EWTs). At first, white Gaussian noise is added to the input ECG signal, which is then passed to the EWT. The ECG signals are denoised by the proposed adaptive hybrid filter. The honey badger optimization (HBO) algorithm is utilized to optimize the EWT window function and the adaptive hybrid filter weight parameters. The proposed approach is simulated in MATLAB 2018a using the MIT-BIH dataset with white Gaussian, electromyogram, and electrode motion artifact noises. The HBO approach is compared with recursive least square-based adaptive filter, multichannel least mean square, and discrete wavelet transform methods in order to show the efficiency of the proposed adaptive hybrid filter. The experimental results show that the HBO approach supported by EWT and the adaptive hybrid filter can be employed efficiently for cardiovascular signal denoising.
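The first step of this pipeline, adding white Gaussian noise to a clean signal before denoising, can be sketched as follows. The example scales the noise to a target signal-to-noise ratio and then measures the SNR that was actually obtained. The sine wave is a hypothetical stand-in for an ECG trace, and the function names, the 10 dB target, and the seed are assumptions for illustration; the abstract does not state the noise levels used in the paper.

```python
import math
import random


def mean_power(samples):
    """Average power of a sequence (mean of squared samples)."""
    return sum(x * x for x in samples) / len(samples)


def add_white_gaussian_noise(signal, snr_db, rng):
    """Return signal plus zero-mean Gaussian noise scaled to the target SNR in dB."""
    noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(noise))


# Hypothetical test signal: 5000 samples of a 5 Hz sine sampled at 1 kHz.
rng = random.Random(0)  # fixed seed so the run is repeatable
clean = [math.sin(2.0 * math.pi * 5.0 * k / 1000.0) for k in range(5000)]
noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=rng)
print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

The same `measured_snr_db` helper can be applied to a denoised output to report the SNR improvement, which is a common way to compare a denoising filter against baselines such as those listed in the abstract.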
Funding (ViT medical image retrieval paper): Funded by the Deanship of Research and Graduate Studies at King Khalid University through small group research under grant number RGP1/278/45.
Funding (rolling bearing MSFN paper): Fully supported by the Frontier Exploration Projects of Longmen Laboratory (No. LMQYTSKT034) and the Key Research and Development and Promotion of Special (Science and Technology) Project of Henan Province, China (No. 252102210158).
Funding (CT-MFENet retinal vessel paper): The National Natural Science Foundation of China (No. 62266025).
Funding (CWST/MLE 3D segmentation paper): National Key Research and Development Program of China, Grant/Award Number: 2018YFE0206900; China Postdoctoral Science Foundation, Grant/Award Number: 2023M731204; The Open Project of Key Laboratory for Quality Evaluation of Ultrasound Surgical Equipment of National Medical Products Administration, Grant/Award Number: SMDTKL-2023-1-01; The Hubei Province Key Research and Development Project, Grant/Award Number: 2023BCB007; CAAI-Huawei MindSpore Open Fund.
Funding (seismic ridgelet denoising paper): Supported by the China Petrochemical key project during the 11th Five-Year Plan and by the Doctorate Fund of the Ministry of Education of China (No. 20050491504).
Funding (MSD-Net pneumonia paper): Supported in part by the National Natural Science Foundation of China (Grant No. 62062003) and the Natural Science Foundation of Ningxia (Grant No. 2023AAC03293).
Funding (SRF infrared-visible fusion paper): Supported by the China Postdoctoral Science Foundation Funded Project (No. 2021M690385) and the National Natural Science Foundation of China (No. 62101045).
Funding: National Natural Science Foundation of China, No. 60703045
Abstract: BACKGROUND: Recent studies have focused on various methods of wavelet transformation for electroencephalogram (EEG) signals. However, very few studies have reported the characteristics of multi-scale phase waves during epileptic discharge. OBJECTIVE: To extract multi-scale phase average waveforms from childhood absence epilepsy EEG signals in the time and frequency domains using wavelet transformation, to compare EEG signals during absence seizures with those before seizures and those of normal children, and to quantify the multi-scale phase average waveforms. DESIGN, TIME AND SETTING: The case-comparative experiment was performed at the Department of Neuroelectrophysiology, Tianjin Medical University from August 2002 to May 2005. PARTICIPANTS: A total of 15 patients with childhood absence epilepsy from the General Hospital of Tianjin Medical University were enrolled in the study. The patients were not administered anti-epileptic drugs or sedatives prior to EEG testing. In addition, 12 healthy, age- and gender-matched children were also enrolled. METHODS: EEG signals were recorded from 15 patients with childhood absence epilepsy and 12 normal children. Epileptic discharge signals during clinical and subclinical seizures were collected 10 and 20 times, respectively. The collected EEG signals were processed with wavelet transformation to extract multi-scale characteristics during absence seizures using a conditional sampling method. Multi-scale phase average waveforms were collected using a conditional phase averaging technique. The first half-cycle amplitudes of the phase average waveforms from EEG signals of epileptic seizures, subclinical epileptic discharges, and normal children were compared and statistically analyzed. MAIN OUTCOME MEASURES: Multi-scale wavelet coefficients and the evolution of EEG signals were observed during childhood absence epilepsy seizures using wavelet transformation. 
Multi-scale phase average waveforms from EEG signals were observed using a conditional sampling method and a phase averaging technique. RESULTS: Multi-scale characteristics of EEG signals demonstrated that 12-scale (3 Hz) rhythmical activity was significantly enhanced during childhood absence epilepsy seizure and co-existed with background structure (<1 Hz, low-frequency discharge). The phase average wave exhibited an opposed-phase abnormal rhythm at 3 Hz. Prior to childhood absence epilepsy seizure, the multi-scale analysis detected an opposed abnormal α rhythm and a 3 Hz component, which were not detected with traditional EEG. Compared to EEG signals from normal children, epileptic discharges from clinical and subclinical childhood absence epilepsy seizures were positive and their amplitude was significantly greater (P < 0.05). CONCLUSION: Wavelet transformation was used to analyze EEG signals from childhood absence epilepsy to obtain multi-scale quantitative characteristics and phase average waveforms. Multi-scale wavelet coefficients of EEG signals correlated with childhood absence epilepsy seizure, and multi-scale waveforms prior to epilepsy seizure were similar to characteristics during the onset period. Compared to normal children, EEG signals during epilepsy seizure exhibited an opposed phase model.
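The abstract above relies on a wavelet transformation to separate an EEG signal into scales, but it does not name the wavelet. As a minimal sketch of multi-scale decomposition, the example below uses the orthonormal Haar wavelet, which is an assumption made purely for simplicity; the function names and the toy signal are likewise hypothetical, and the conditional sampling and phase averaging steps of the study are not reproduced.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform (even-length input)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2
              for i in range(0, len(signal), 2)]
    return approx, detail


def haar_multiscale(signal: Sequence[float],
                    levels: int) -> Tuple[List[float], List[List[float]]]:
    """Return (coarsest approximation, [detail at scale 1, scale 2, ...])."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        # Each pass halves the length and moves to a coarser scale.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    toy = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # stand-in for EEG samples
    coarse, scales = haar_multiscale(toy, levels=2)
    print(coarse, scales)
```

Because the transform is orthonormal, the energy of the input equals the summed energy of the coefficients, so the detail coefficients at each scale can be read as that scale's share of the signal's energy.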
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 61402368), the Aerospace Support Fund, China (Grant No. 2017-HT-XGD), and the Aerospace Science and Technology Innovation Foundation, China (Grant No. 2017 ZD 53047).
Abstract: In traditional multi-scale transform methods, the high-frequency components are approximately sparse and can represent different detail information. In the low-frequency component, however, very few coefficients are near zero, so the low-frequency image information cannot be represented sparsely. The low-frequency component contains the main energy of the image and depicts its profile, so fusing it directly is not conducive to a highly accurate fusion result. Therefore, this paper presents an infrared and visible image fusion method combining the multi-scale and top-hat transforms. On one hand, the new top-hat transform can effectively extract the salient features of the low-frequency component. On the other hand, the multi-scale transform can extract high-frequency detail information at multiple scales and in diverse directions. Combining the two methods helps acquire more characteristics and more accurate fusion results. For the low-frequency component, a new type of top-hat transform is used to extract low-frequency features, and different fusion rules are then applied to fuse the low-frequency features and the low-frequency background; for the high-frequency components, the product of characteristics method is used to integrate the high-frequency detail information. Experimental results show that the proposed algorithm obtains more detailed information and clearer infrared targets in the fusion results than traditional multi-scale transform methods. Compared with state-of-the-art fusion methods based on sparse representation, the proposed algorithm is simple and effective, and its time consumption is significantly lower.
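The abstract above uses a new type of top-hat transform to pull salient features out of the low-frequency component. That variant is not specified here, so the sketch below shows only the classical white top-hat (signal minus its morphological opening) on a 1-D signal with a flat structuring element. The 1-D setting, the function names, and the window width are assumptions for illustration; the paper operates on images.

```python
from typing import Callable, List, Sequence


def _sliding(signal: Sequence[float], width: int,
             reducer: Callable[[Sequence[float]], float]) -> List[float]:
    """Apply min or max over a centered window, clipped at the borders."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    n = len(signal)
    return [reducer(signal[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def erode(signal: Sequence[float], width: int) -> List[float]:
    """Grayscale erosion with a flat structuring element."""
    return _sliding(signal, width, min)


def dilate(signal: Sequence[float], width: int) -> List[float]:
    """Grayscale dilation with a flat structuring element."""
    return _sliding(signal, width, max)


def opening(signal: Sequence[float], width: int) -> List[float]:
    """Erosion followed by dilation: removes bright details narrower than width."""
    return dilate(erode(signal, width), width)


def white_top_hat(signal: Sequence[float], width: int) -> List[float]:
    """Bright details smaller than the structuring element."""
    return [x - o for x, o in zip(signal, opening(signal, width))]


if __name__ == "__main__":
    row = [1, 1, 1, 5, 1, 1, 1]  # flat background with one narrow bright spot
    print(white_top_hat(row, width=3))
```

The opening keeps the smooth background and discards bright structures narrower than the window, so the difference isolates those structures, which is the sense in which a top-hat transform extracts salient features from a slowly varying component.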
Funding: National Natural Science Foundation of China (No. U24A20589); National Key Research and Development Program of China (No. 2023YFB3905504); Innovation Team of the Ministry of Education of China (No. 8091B042227); Innovation Group of Sichuan Natural Science Foundation (No. 2023NSFSC1974).
Abstract: Recently, deep learning has been widely applied to object detection in Synthetic Aperture Radar (SAR) images. Current algorithms based on Convolutional Neural Networks (CNN) often achieve good accuracy at the expense of more complex model structures and a huge number of parameters, which poses a great challenge for real-time and accurate detection of multi-scale targets. To address these problems, we propose a lightweight real-time SAR ship object detector based on the detection transformer (LSD-DETR). First, a lightweight backbone network, LCNet, containing a stem module and an inverted residual structure is constructed to balance the inference speed and detection accuracy of the model. Second, we design a transformer encoder with Cascaded Group Attention (CGA Encoder) to enrich the feature information of small targets in SAR images, which makes the detection of small ships more precise. Third, an efficient cross-scale feature fusion pyramid module (C3Het-FPN) is built from lightweight units (C3Het) and the weighted bidirectional feature pyramid (BiFPN) structure, which realizes adaptive fusion of multi-scale features with fewer parameters. Ablation and comparative experiments demonstrate the effectiveness of LSD-DETR. LSD-DETR has 8.8 M parameters (only 20.6% of DETR), runs at 43.1 FPS, and reaches an average detection accuracy (mAP50) of 97.3% and 93.4% on the SSDD and HRSID datasets, respectively. Compared to advanced methods, LSD-DETR attains superior precision with fewer parameters, which enables accurate real-time detection of multi-scale ships in SAR images.
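The abstract above mentions adaptive fusion of multi-scale features through a weighted BiFPN structure. A common form of that weighting is the fast normalized fusion described for BiFPN in the EfficientDet paper, where each input gets a non-negative learnable weight and the weights are normalized by their sum. Whether LSD-DETR uses exactly this rule is an assumption; the function name, the flattened feature lists, and the example weights are hypothetical.

```python
from typing import List, Sequence


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """Fuse same-shaped feature vectors: sum(w_i * f_i) / (eps + sum(w_i)).

    Weights are clipped at zero (ReLU) so each input contributes
    non-negatively; eps avoids division by zero.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("feature maps must share one shape (resize first)")
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]  # stand-in for a flattened fine-scale feature map
    deep = [3.0, 4.0]     # stand-in for an upsampled coarse-scale feature map
    print(fast_normalized_fusion([shallow, deep], weights=[1.0, 1.0], eps=0.0))
```

In a trained network the weights would be learned parameters, so the model can decide per fusion node how much each scale contributes; here they are fixed numbers only to keep the sketch runnable.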
Abstract: Recent years have seen a surge of interest in object detection in remote sensing images for applications such as surveillance and management. However, challenges such as small objects, scale variation, and closely packed objects hinder accurate detection, and motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and replaces the original prediction heads with transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Abstract: Face detection is applied to many tasks such as autofocus control, surveillance, user interfaces, and face recognition. The processing speed and detection accuracy of face detection have improved continuously. This paper describes a novel method of fast face detection with a multi-scale window search that requires no image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel values. These characteristics enable the multi-scale window search without image resizing. Experimental results show that our method is 3.66 times faster than a conventional method that combines HOG features with an SVM classifier, without accuracy degradation.
Abstract: Cardiovascular diseases are the world's leading cause of death; cardiac health has therefore been a major research topic for decades. The electrocardiogram (ECG) signal is a comprehensive non-invasive means of assessing cardiac health, and health practitioners use it to ascertain critical information about the human heart. In this article, swarm intelligence approaches are used in biomedical signal processing to enhance adaptive hybrid filters and empirical wavelet transforms (EWTs). First, white Gaussian noise is added to the input ECG signal, which is then passed to the EWT. The ECG signals are denoised by the proposed adaptive hybrid filter. The honey badger optimization (HBO) algorithm is used to optimize the EWT window function and the weight parameters of the adaptive hybrid filter. The proposed approach is simulated in MATLAB 2018a using the MIT-BIH dataset with white Gaussian, electromyogram, and electrode motion artifact noises. The HBO approach is compared with a recursive least squares-based adaptive filter, multichannel least mean squares, and discrete wavelet transform methods to show the efficiency of the proposed adaptive hybrid filter. The experimental results show that the HBO approach supported by EWT and the adaptive hybrid filter can be employed efficiently for cardiovascular signal denoising.
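The abstract above begins its pipeline by adding white Gaussian noise to a clean ECG signal before denoising. The abstract gives no noise levels, so the sketch below assumes the common convention of scaling the noise to a target signal-to-noise ratio in decibels and then measuring the SNR actually obtained. The function names, the 10 dB target, and the sine wave standing in for an ECG trace are all assumptions for illustration.

```python
import math
import random
from typing import List, Sequence


def signal_power(x: Sequence[float]) -> float:
    """Mean squared amplitude."""
    return sum(v * v for v in x) / len(x)


def add_white_gaussian_noise(signal: Sequence[float], target_snr_db: float,
                             seed: int = 0) -> List[float]:
    """Add zero-mean Gaussian noise scaled so that the expected SNR is target_snr_db."""
    rng = random.Random(seed)  # seeded for reproducibility
    noise_power = signal_power(signal) / (10.0 ** (target_snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean: Sequence[float], noisy: Sequence[float]) -> float:
    """SNR in dB, taking (noisy - clean) as the noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


if __name__ == "__main__":
    # A sine wave stands in for a clean ECG trace.
    clean = [math.sin(2.0 * math.pi * i / 50.0) for i in range(5000)]
    noisy = add_white_gaussian_noise(clean, target_snr_db=10.0, seed=1)
    print(round(measured_snr_db(clean, noisy), 2))
```

The same `measured_snr_db` helper could be applied to a denoised output to quantify how much of the added noise a filter removes, which is one common way such methods are compared.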