Reversible data hiding (RDH) embeds secret data while allowing the cover image to be recovered completely, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used in multi-stego images provides good image quality but often results in low embedding capacity. To address this limitation, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order. Four secret bits are embedded into each block's maximum pixel value, while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is also applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data. Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches, advancing the field of reversible steganography.
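For orientation, the classic single-image PVO rule that schemes of this kind build on can be sketched as follows. This is not the proposed triple-stego scheme: the abstract does not give its four-bit and three-bit embedding rules or its threshold, so the sketch shows only the textbook max-side rule, which hides at most one bit per block. The function names are illustrative, and overflow at pixel value 255 is ignored.

```python
def _two_largest(block):
    """Return indices of the largest and second-largest pixels (ties broken by position)."""
    order = sorted(range(len(block)), key=lambda i: (block[i], i))
    return order[-1], order[-2]


def pvo_embed_max(block, bit):
    """Embed one bit on the max side of a block; returns (stego_block, bit_was_used)."""
    i_max, i_second = _two_largest(block)
    diff = block[i_max] - block[i_second]
    stego = list(block)
    if diff == 1:
        # Embeddable case: the bit is carried by the difference (1 -> 1 or 2).
        stego[i_max] += bit
        return stego, True
    if diff > 1:
        # Non-embeddable case: shift by one so decoding stays unambiguous.
        stego[i_max] += 1
    # diff == 0 leaves the block unchanged.
    return stego, False


def pvo_extract_max(stego):
    """Recover (original_block, bit_or_None) from a stego block."""
    i_max, i_second = _two_largest(stego)
    diff = stego[i_max] - stego[i_second]
    block = list(stego)
    if diff == 1:
        return block, 0
    if diff == 2:
        block[i_max] -= 1
        return block, 1
    if diff > 2:
        block[i_max] -= 1  # undo the shift
    return block, None
```

Because only the maximum is increased, the pixel ordering inside the block is preserved, which is what lets the decoder find the same pixel again and restore the cover block exactly.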
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β = 0.5), Visual Geometry Group (VGG) perceptual loss (α = 1), and adversarial loss (γ = 0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
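The loss terms named above can be illustrated with a minimal sketch. The Charbonnier loss is taken in its usual form, the mean of sqrt((x − y)² + ε²); the value of ε, the function names, and the use of fixed rather than dynamically adjusted weights are assumptions made for illustration, and the perceptual and adversarial terms are passed in as precomputed numbers rather than computed from networks.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty over paired pixel values (eps is an assumed constant)."""
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def joint_loss(charbonnier, perceptual, adversarial, alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum using the weights quoted in the abstract (held fixed here)."""
    return alpha * perceptual + beta * charbonnier + gamma * adversarial
```

For large errors the Charbonnier penalty grows roughly linearly, like L1, while near zero it stays smooth, like L2, which is the usual argument for its robustness to outliers.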
Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer's diagnosis from MRI scans. Image enhancement is performed using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To make classification more robust, the design integrates class weighting and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model achieves 98.88% accuracy and an MCC score of 0.9614. We also performed a 10-fold cross-validation experiment, obtaining an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC score of 0.9429 across folds. The proposed framework outperforms state-of-the-art models with a 98% weighted F1-score while reducing misdiagnoses for every AD stage. Confusion matrix analysis shows that the model clearly separates the stages of AD progression. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer's diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
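Since the abstract reports MCC for a multi-stage classifier, a short sketch of how a multiclass MCC can be computed from a confusion matrix may help. It uses the standard multiclass generalization of the coefficient; the function name and the row/column convention are assumptions, and nothing here is taken from the authors' code.

```python
import math


def mcc_from_confusion(cm):
    """Multiclass MCC; cm[i][j] counts samples of true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    true_counts = [sum(cm[i]) for i in range(k)]                       # row sums
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column sums
    numerator = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    # Degenerate case (e.g. everything predicted as one class): report 0 by convention.
    return numerator / denominator if denominator else 0.0
```

Unlike accuracy, the coefficient stays low when a model does well only on the majority class, which is why it is a common choice for imbalanced data.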
The presence of a positive deep surgical margin in tongue squamous cell carcinoma (TSCC) significantly elevates the risk of local recurrence. Therefore, a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection. In this study, we integrate Raman imaging technology with an artificial intelligence (AI) generative model, proposing an innovative approach for intraoperative margin status diagnosis. This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images, which are then transformed into hematoxylin-eosin (H&E)-stained histopathological images using an AI generative model for histopathological diagnosis. The generated H&E-stained images clearly illustrate the tissue's pathological conditions. Independently reviewed by three pathologists, the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%. Notably, it outperforms current clinical practices, especially in TSCC with positive lymph node metastasis or moderately differentiated grades. This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations, promising a versatile diagnostic tool beyond TSCC.
The application of deep learning to target detection in aerial images captured by Unmanned Aerial Vehicles (UAVs) has emerged as a prominent research focus. Due to the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, the You Only Look Once v8 (YOLOv8) model is modified in four respects: the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening. The resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection metrics of the proposed model improve significantly without increasing the number of model parameters and with only limited growth in computation. Moreover, the model performs best among the detection models compared, demonstrating its advancement within this category of tasks.
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may leave some information uncaptured, a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Objective: Predicting response before neoadjuvant chemotherapy (NAC) begins is crucial for personalizing treatment plans for patients with locally advanced breast cancer. We aim to develop a multi-task model that uses multiscale whole slide image (WSI) features to predict the response to breast cancer NAC in a more fine-grained manner. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of a weakly-supervised deep learning-based multi-task model (DLMM) that predicts treatment response and pathological complete response (pCR) to NAC. Our approach models two-by-two feature interactions across scales by concatenating single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent performance in predicting treatment response, with areas under the receiver operating characteristic curve (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806–0.933] in the internal testing set and 0.841 (95% CI: 0.814–0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763–0.964) in the internal testing set and 0.821 (95% CI: 0.763–0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754–0.903) and 0.821 (95% CI: 0.692–0.949) for treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P < 0.05). Heatmaps were employed to interpret the basis of the model's decisions. Furthermore, exploration of the biological basis showed that high DLMM scores were associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM is a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
The precise identification of quartz minerals is crucial in mineralogy and geology due to their widespread occurrence and industrial significance. Traditional methods of quartz identification in thin sections are labor-intensive and require significant expertise, and are often complicated by the coexistence of other minerals. This study presents a novel approach that combines deep learning techniques with hyperspectral imaging to automate the identification of quartz minerals. The four deep learning models used (PSPNet, U-Net, FPN, and LinkNet) deliver significant gains in efficiency and accuracy. Among these models, PSPNet exhibited superior performance, achieving the highest intersection over union (IoU) scores and demonstrating exceptional reliability in segmenting quartz minerals, even in complex scenarios. The study involved a comprehensive dataset of 120 thin sections, encompassing 2470 hyperspectral images prepared from 20 rock samples. Expert-reviewed masks were used for model training, ensuring robust segmentation results. This automated approach not only expedites the recognition process but also enhances reliability, providing a valuable tool for geologists and advancing the field of mineralogical analysis.
A novel method is developed that uses fractional-frequency-based multirange rulers to precisely locate passive inter-modulation (PIM) sources within radio frequency (RF) cables. The proposed method employs a set of fractional frequencies to create multiple measuring rulers with different metric ranges, which determine the tens, ones, tenths, and hundredths digits of the distance. Among these rulers, the one with the lowest frequency determines the maximum metric range, while the one with the highest frequency determines the highest achievable accuracy of the positioning system. For all rulers, the metric accuracy is uniquely determined by the phase accuracy of the detected PIM signals. With the all-phase Fourier transform method, the phases of the PIM signals at all fractional frequencies maintain almost the same accuracy, approximately 1° (about 1/360 of a wavelength in positioning accuracy) at a signal-to-noise ratio (SNR) of 10 dB. Numerical simulations verify the effectiveness of the proposed method, which improves the positioning accuracy of cable PIM to the millimeter level with the highest fractional frequency operating at 200 MHz.
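The multirange-ruler idea can be illustrated with a small sketch in which each ruler's phase gives the distance modulo its wavelength, and each coarser estimate resolves the whole-wavelength ambiguity of the next finer ruler. The decade-spaced wavelengths, the one-way phase model, and the function names are assumptions chosen for illustration; they are not the paper's frequencies or signal model.

```python
def phase_deg(distance, wavelength):
    """Simulated phase reading: fractional part of distance / wavelength, in degrees."""
    return (distance % wavelength) / wavelength * 360.0


def resolve_distance(phases_deg, wavelengths):
    """Combine rulers ordered from longest to shortest wavelength."""
    # The longest ruler sets the unambiguous range and gives the first coarse estimate.
    estimate = phases_deg[0] / 360.0 * wavelengths[0]
    for phi, lam in zip(phases_deg[1:], wavelengths[1:]):
        fraction = phi / 360.0 * lam                  # position within one wavelength
        cycles = round((estimate - fraction) / lam)   # whole wavelengths, from the coarser ruler
        estimate = cycles * lam + fraction
    return estimate
```

With a 1° phase error on every ruler, the final error is about 1/360 of the shortest wavelength. As a rough check of the scale, and assuming free-space propagation, 200 MHz corresponds to a wavelength of about 1.5 m, so 1° is about 4 mm; the slower propagation in a real cable would make this smaller.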
Lower back pain is one of the most common medical problems in the world and affects a large share of the population. Because it produces a detailed view of the soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging (MRI) is considered the most effective method for imaging the spine. The semantic segmentation of vertebrae plays a major role in the diagnosis of lumbar diseases. It is difficult to semantically separate the vertebrae in MRI images from the variety of surrounding tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture that handles the challenges of medical image analysis tasks and achieves high segmentation accuracy. This work proposes a modified U-Net architecture, named MU-Net, which includes a Meijering convolutional layer that incorporates the Meijering filter, to perform the semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work was carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset, publicly available from Mendeley Data. On this dataset, the proposed MU-Net model achieves a pixel accuracy (PA) of 98.79%, a dice similarity coefficient (DSC) of 98.66%, a Jaccard coefficient of 97.36%, and a mean Intersection over Union (mean IoU) of 92.55%.
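The overlap metrics quoted above can be made concrete with a minimal sketch for a single pair of binary masks. The function name and the flattened 0/1 input format are assumptions; the paper's multi-class averaging scheme is not described in the abstract and is not reproduced here.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for equal-length sequences of 0/1 labels (flattened masks)."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    # Two empty masks are treated as a perfect match.
    dice = 2 * intersection / (pred_size + truth_size) if pred_size + truth_size else 1.0
    jaccard = intersection / union if union else 1.0
    return dice, jaccard
```

For a single mask pair the two metrics are tied by J = D / (2 − D). Applying this to the reported DSC of 98.66% gives roughly 97.36%, in line with the reported Jaccard coefficient, although the identity need not hold exactly for values averaged over many images or classes.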
Brain tumor segmentation is critical in clinical diagnosis and treatment planning. Existing methods for brain tumor segmentation with missing modalities often struggle when several modalities are missing at once, a common scenario in real-world clinical settings. These methods primarily handle a single missing modality at a time, making them insufficiently robust to incomplete data containing various combinations of missing modalities. Additionally, most existing methods rely on single models, which may limit their performance and increase the risk of overfitting the training data. This work proposes a novel method called the ensemble adversarial co-training neural network (EACNet) for accurate brain tumor segmentation from multi-modal magnetic resonance imaging (MRI) scans with multiple missing modalities. The proposed method consists of three key modules: (1) an ensemble of pre-trained models, which captures diverse feature representations from the MRI data; (2) adversarial learning, a competitive training approach in which a generator model creates realistic missing data while sub-networks acting as discriminators learn to distinguish real data from the generated "fake" data; and (3) a co-training framework, which uses the information extracted by the multimodal path (trained on complete scans) to guide learning in the path handling missing modalities. Through these co-training interactions, the model can compensate for missing information by exploiting the relationships between the available modalities and the tumor segmentation task. EACNet was evaluated on the BraTS2018 and BraTS2020 challenge datasets and achieved state-of-the-art and competitive performance, respectively. Notably, the dice similarity coefficient (DSC) for the whole tumor (WT) reached 89.27%, surpassing the performance of existing methods. The analysis suggests that the ensemble approach offers potential benefits and that adversarial co-training contributes to the increased robustness and accuracy of EACNet. The experimental results show that EACNet is a promising candidate for real-world clinical segmentation of brain tumors in MRI scans with missing modalities.
With the increasing demand for indoor localization, Wi-Fi-based indoor localization has gained wide attention because Wi-Fi is convenient to access. In this paper, we propose a new multi-feature fusion convolutional neural network (CNN) based on channel state information (CSI) images. Each CSI image is constructed from the amplitude and angle-of-arrival information of CSI collected at known points, so it carries more feature information. Moreover, the global mean filtering (GMC) algorithm with median filtering proposed in this paper is used to filter the CSI images and reduce their noise, yielding clearer images for network training. To extract more features from the CSI images, the traditional single-channel network is extended with a two-channel design that extracts feature information between adjacent subcarriers. Experimental evaluation in a typical indoor environment shows that the proposed method achieves good localization performance.
Osteosarcomas are malignant neoplasms derived from undifferentiated osteogenic mesenchymal cells. They cause severe and permanent damage to human tissue and have a high mortality rate. The condition can occur in any bone; however, it most often affects the long bones of the arms and legs. Prompt identification and intervention are essential for improving patient survival. However, the intricate composition and erratic placement of osteosarcoma make it difficult for clinicians to determine the extent of the affected area accurately. There is a pressing need for an algorithm that can automatically detect bone tumors with high accuracy. Therefore, in this study, we propose a novel feature extractor framework combined with a supervised three-class XGBoost algorithm for the detection of osteosarcoma in whole slide histopathology images. This method allows for quicker and more effective data analysis. The first step involves preprocessing the imbalanced histopathology dataset, followed by augmentation and balancing using two techniques: SMOTE and ADASYN. Next, the feature extraction framework is used to extract features, which are then fed into the supervised three-class XGBoost algorithm for classification into three categories: non-tumor, viable tumor, and non-viable tumor. The experimental findings indicate that the proposed model is more efficient, more accurate, and more lightweight than other current models for osteosarcoma detection.
In digital signal processing, image enhancement and image denoising are challenging tasks because pixel quality must be preserved. Several approaches, from conventional methods to deep learning, are used to address them, but they still face challenges such as computational requirements, overfitting, and poor generalization. To resolve such issues, optimization algorithms provide greater control and transparency in designing digital filters for image enhancement and denoising. Therefore, this paper presents a novel denoising approach for medical applications using an Optimized Learning-based Multi-level discrete Wavelet Cascaded Convolutional Neural Network (OLMWCNN). In this approach, the optimal filter parameters are identified to preserve image quality after denoising. The performance and efficiency of the OLMWCNN filter are evaluated, demonstrating significant progress in denoising medical images while overcoming the limitations of conventional methods.
AIM: To identify an effective contrast enhancement method for retinal images that supports effective segmentation of retinal features. METHODS: A novel image preprocessing method using neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) was proposed to improve retinal image contrast, aid the accurate identification of retinal disorders, and improve the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to denoise the images without compromising important retinal structures. NICLAHE was inspired by the classical CLAHE algorithm but improves on it by selecting the clip limits and tile sizes dynamically, relative to the pixel values in an image, rather than using fixed values. It was evaluated on the DRIVE and high-resolution fundus (HRF) datasets using conventional quality measures. RESULTS: The proposed preprocessing technique was applied to two retinal image databases, DRIVE and HRF, and assessed with four quality metrics: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique outperformed traditional enhancement methods on both datasets. To assess the compatibility of the method with automated diagnosis, a deep learning framework, ResNet, was applied to the segmentation of retinal blood vessels, and performance was analysed using sensitivity, specificity, precision, and accuracy. NICLAHE-enhanced images outperformed those from traditional techniques on both datasets, with improved accuracy. CONCLUSION: NICLAHE provides better results than traditional methods, with less error and improved contrast-related values. Segmentation on the enhanced images, measured by sensitivity, specificity, precision, and accuracy, also yields better results on both datasets.
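Three of the quality measures named above have simple closed forms, sketched here for flattened lists of pixel intensities. The function names, the 8-bit peak value of 255, and the definition of RMS contrast as the standard deviation of intensities are common conventions assumed for illustration; the abstract does not state which exact definitions the authors used.

```python
import math


def rmse(reference, processed):
    """Root mean square error between two equal-length pixel sequences."""
    squared = [(r - p) ** 2 for r, p in zip(reference, processed)]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    error = rmse(reference, processed)
    return math.inf if error == 0 else 20.0 * math.log10(peak / error)


def rms_contrast(pixels):
    """RMS contrast, taken here as the standard deviation of pixel intensities."""
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((v - mean) ** 2 for v in pixels) / len(pixels))
```

Lower RMSE and higher PSNR indicate that an enhanced image stays closer to its reference, while higher RMS contrast indicates a wider spread of intensities.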
The variety and complexity of lung diseases make it challenging for computer-aided diagnosis to segment lung lesions accurately in computed tomography (CT) images. This study integrates transfer learning with an attention mechanism to construct a deep learning model that can automatically detect COVID-19 pneumonia on lung CT images. VGG16 pre-trained on ImageNet serves as the encoder, and the decoder follows the U-Net structure. An attention module is incorporated at each concatenation step, allowing the model to concentrate on critical information and identify the crucial components efficiently. The public COVID-19-CT-Seg-Benchmark dataset was used for experiments, and the highest scores for Dice, F1, and Accuracy were 0.9071, 0.9076, and 0.9965, respectively. Generalization performance was also assessed, with Dice, F1, and Accuracy all exceeding 0.8. The experimental findings indicate the feasibility of the segmentation network proposed in this study.
Agromyzid leafminers cause significant economic losses in both vegetable and horticultural crops, and precise assessments of pesticide needs must be based on the extent of leaf damage. Traditionally, surveyors estimate the damage by visually comparing the proportion of damaged to intact leaf area, a method that lacks objectivity, precision, and reliable data traceability. To address these issues, an advanced survey system that combines augmented reality (AR) glasses with a camera and an artificial intelligence (AI) algorithm was developed in this study to objectively and accurately assess leafminer damage in the field. By wearing AR glasses equipped with a voice-controlled camera, surveyors can easily flatten damaged leaves by hand and capture images for analysis. This method can provide a precise and reliable diagnosis of leafminer damage levels, which in turn supports the implementation of scientifically grounded and targeted pest management strategies. To calculate the leafminer damage level, the DeepLab-Leafminer model was proposed to precisely segment the leafminer-damaged regions and the intact leaf region. The integration of an edge-aware module and a Canny loss function into the DeepLabv3+ model enhanced the DeepLab-Leafminer model's ability to accurately segment the edges of leafminer-damaged regions, which often have irregular shapes. Compared with state-of-the-art segmentation models, the DeepLab-Leafminer model achieved superior segmentation performance, with an Intersection over Union (IoU) of 81.23% and an F1 score of 87.92% on leafminer-damaged leaves. The test results revealed a 92.38% diagnosis accuracy for leafminer damage levels based on the DeepLab-Leafminer model. A mobile application and a web platform were developed to display the diagnostic results of leafminer damage levels to surveyors. This system provides surveyors with an advanced, user-friendly, and accurate tool for assessing agromyzid leafminer damage in agricultural fields using wearable AR glasses and an AI model. The method can also be used to automatically diagnose pest and disease damage levels in other crops based on leaf images.
This paper proposes a subpixel transformation method to correct Keystone and Smile distortions in fiber spectral images from the Fiber Arrayed Solar Optical Telescope. These distortions affect the spatial and spectral positions, degrading resolution and accuracy. To correct Keystone distortion, we use a local summation and peak-finding method to locate central horizontal coordinates, calculate shifting values, and straighten the curves. For Smile distortion, we use quartic polynomial fitting based on absorption lines at different wavelengths. This technique preserves subpixel components, redistributes pixel values, and interpolates non-fiber portions, rectifying the spectra for accurate analysis. The method can also be applied to other astronomical projects, such as the Large Sky Area Multi-Object Fiber Spectroscopic Telescope, enhancing the accuracy and reliability of spectral data in various astronomical studies.
Studying auroral morphology helps us understand the physical processes in space and the mechanisms behind auroral patterns. Auroral arcs are the brightest and most prominent auroral patterns. Because the edges of auroral shapes are difficult to define precisely, auroral arc skeleton extraction is a promising alternative representation for studying auroral morphology, since skeletons capture the key morphological features of complex auroral shapes. Transformer models relate overall morphology to local detail more effectively when processing image data, so we propose a Transformer-based method for auroral arc skeleton extraction. Combined with ridge-guided annotation on all-sky images, a Transformer-based skeleton extractor is trained and used to estimate the number of auroral arcs. Experiments demonstrate that the Transformer-based model captures the structural information and local details of auroral arcs more effectively, making it suitable for complex auroral morphologies.
In recent years, camouflage technology has evolved from single-spectral-band applications to multifunctional and multispectral implementations. Hyperspectral imaging has emerged as a powerful technique for target identification due to its capacity to capture both spectral and spatial information. The advancement of imaging spectroscopy technology has significantly enhanced reconnaissance capabilities, offering substantial advantages in camouflaged target classification and detection. However, the increasing spectral similarity between camouflaged targets and their backgrounds has significantly compromised detection performance in specific scenarios. Conventional feature extraction methods are often limited to single, shallow spectral or spatial features, failing to extract deep features and consequently yielding suboptimal classification accuracy. To address these limitations, this study proposes an innovative 3D-2D convolutional neural network architecture incorporating depthwise separable convolution (DSC) and attention mechanisms (AM). The framework first applies dimensionality reduction to hyperspectral images and extracts preliminary spectral-spatial features. It then employs an alternating combination of 3D and 2D convolutions for deep feature extraction. For target classification, the LogSoftmax function is used. The integration of depthwise separable convolution not only enhances classification accuracy but also substantially reduces the number of model parameters. Furthermore, the attention mechanisms significantly improve the network's ability to represent multidimensional features. Extensive experiments were conducted on a custom land-based hyperspectral image dataset. The results demonstrate high classification accuracy: 98.74% for grassland camouflage, 99.13% for dead leaf camouflage, and 98.94% for wild grass camouflage. Comparative analysis shows that the proposed framework outperforms the compared methods in classification accuracy and robustness for camouflaged target classification.
Funding: Funded by the University of Transport and Communications (UTC) under grant number T2025-CN-004.
Abstract: Reversible data hiding (RDH) enables secret data embedding while preserving complete cover image recovery, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used in multi-stego images provides good image quality but often results in low embedding capacity. To address this limitation, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order. Four secret bits are embedded into each block's maximum pixel value, while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data. Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches.
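The 14-bit figure follows from the per-pixel bit budget stated in the abstract. The sketch below only tallies that budget; it does not implement the embedding itself. The function names are hypothetical, and because the abstract does not say which pixel difference is compared with the threshold, the threshold test is passed in as a boolean per side.

```python
# Bit budget per block as stated in the abstract: 4 bits on each extreme
# pixel (maximum and minimum), plus 3 bits on each second-extreme pixel
# when the threshold condition holds on that side.
BITS_EXTREME = 4
BITS_SECOND = 3


def block_capacity_bits(max_side_condition: bool, min_side_condition: bool) -> int:
    """Return the number of secret bits one block can carry.

    The two flags are assumptions standing in for the paper's threshold
    test on the maximum side and the minimum side of the sorted block.
    """
    bits = 2 * BITS_EXTREME  # maximum pixel + minimum pixel
    if max_side_condition:
        bits += BITS_SECOND  # second-largest pixel
    if min_side_condition:
        bits += BITS_SECOND  # second-smallest pixel
    return bits


def image_capacity_bits(condition_flags) -> int:
    """Sum the capacity over blocks given (max_side, min_side) flag pairs."""
    return sum(block_capacity_bits(a, b) for a, b in condition_flags)


print(block_capacity_bits(True, True))  # 14
```

Under this reading a block carries between 8 and 14 bits, so the achievable payload of an image depends on how many blocks satisfy the threshold condition.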
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03), the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04), and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of the L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to balance pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural and urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
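The weighted objective can be sketched as follows. This is a minimal illustration, assuming the common Charbonnier form sqrt(d² + ε²) averaged over pixels and a fixed weighted sum of the three terms; the value of ε, the function names, and the treatment of the weights as constants (the abstract describes them as dynamic) are assumptions, not details taken from the paper.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over two flat pixel lists.

    eps is an assumed smoothing constant; the abstract does not give one.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def total_loss(charbonnier, perceptual, adversarial,
               beta=0.5, alpha=1.0, gamma=0.1):
    """Weighted sum using the weights quoted in the abstract.

    The perceptual and adversarial terms are taken as precomputed scalars,
    since they depend on a VGG network and a discriminator.
    """
    return beta * charbonnier + alpha * perceptual + gamma * adversarial


# The penalty behaves like |d| for large errors and stays smooth near zero.
print(charbonnier_loss([3.0], [0.0]))
print(total_loss(charbonnier=2.0, perceptual=1.0, adversarial=10.0))
```

For errors much larger than ε the penalty grows linearly, like L1, which is why it is less sensitive to outlier pixels than L2.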
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2025-02-01295.
Abstract: Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer's diagnosis from MRI scans. Image enhancement is performed using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To improve classification robustness, the design integrates class weighting and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model achieves 98.88% accuracy and an MCC of 0.9614. A 10-fold cross-validation experiment yielded an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC of 0.9429 across folds. The proposed framework outperforms state-of-the-art models with a 98% weighted F1-score while reducing misdiagnoses for every AD stage. Confusion matrix analysis shows that the model clearly separates the stages of AD progression. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer's diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
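For readers unfamiliar with MCC, the sketch below computes it in its standard binary form from confusion-matrix counts. The study classifies several AD stages, so it presumably uses a multi-class generalization; the binary version here is a simplification, and the function name and the zero-denominator convention are assumptions.

```python
import math


def matthews_corrcoef_binary(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention for
    degenerate cases such as a classifier that predicts a single class.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Perfect agreement gives 1, perfect disagreement gives -1,
# and chance-level prediction gives 0.
print(matthews_corrcoef_binary(tp=50, tn=50, fp=0, fn=0))
print(matthews_corrcoef_binary(tp=25, tn=25, fp=25, fn=25))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which makes it more informative on imbalanced classes such as those mentioned in the abstract.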
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 82272955 and 22203057) and the Natural Science Foundation of Fujian Province (Grant No. 2021J011361).
Abstract: The presence of a positive deep surgical margin in tongue squamous cell carcinoma (TSCC) significantly elevates the risk of local recurrence. Therefore, a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection. In this study, we integrate Raman imaging technology with an artificial intelligence (AI) generative model, proposing an innovative approach for intraoperative margin status diagnosis. This method uses Raman imaging to swiftly and non-invasively capture tissue Raman images, which are then transformed into hematoxylin-eosin (H&E)-stained histopathological images using an AI generative model for histopathological diagnosis. The generated H&E-stained images clearly illustrate the tissue's pathological conditions. Independently reviewed by three pathologists, the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%. Notably, it outperforms current clinical practices, especially in TSCC with positive lymph node metastasis or moderately differentiated grades. This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations, promising a versatile diagnostic tool beyond TSCC.
Abstract: The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles (UAVs) has emerged as a prominent research focus. Due to the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, a You Only Look Once v8 (YOLOv8) model is modified in four aspects: the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening. The resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection metrics of the proposed model are significantly improved without increasing the number of model parameters and with only limited growth in computation. Moreover, the model achieves the best performance among the compared detection models.
Funding: Supported by the Henan Province Key Research and Development Project (231111211300), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Province Key Research and Development Project (231111110100), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. Because this extraction may leave some information uncaptured, a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the National Natural Science Foundation of China (No. 82371933), the National Natural Science Foundation of Shandong Province of China (No. ZR2021MH120), the Taishan Scholars Project (No. tsqn202211378), and the Shandong Provincial Natural Science Foundation for Excellent Young Scholars (No. ZR2024YQ075).
Abstract: Objective: Early prediction of response to neoadjuvant chemotherapy (NAC) before treatment is crucial for personalized treatment planning in patients with locally advanced breast cancer. We aim to develop a multi-task model using multiscale whole slide image (WSI) features to predict the response to breast cancer NAC at a finer level. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of a weakly supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pCR to NAC. Our approach models pairwise feature interactions across scales by concatenation fusion of single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent performance in predicting treatment response, with areas under the receiver operating characteristic curve (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806–0.933] in the internal testing set and 0.841 (95% CI: 0.814–0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763–0.964) in the internal testing set and 0.821 (95% CI: 0.763–0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754–0.903) and 0.821 (95% CI: 0.692–0.949) for treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P<0.05). Heatmaps were employed to interpret the basis of the model's decisions. Furthermore, exploration of the biological basis showed that high DLMM scores were associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM is a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
Abstract: The precise identification of quartz minerals is crucial in mineralogy and geology due to their widespread occurrence and industrial significance. Traditional methods of quartz identification in thin sections are labor-intensive and require significant expertise, and are often complicated by the coexistence of other minerals. This study presents a novel approach that combines deep learning techniques with hyperspectral imaging to automate the identification of quartz minerals. Four advanced deep learning models (PSPNet, U-Net, FPN, and LinkNet) were used, yielding significant gains in efficiency and accuracy. Among these models, PSPNet exhibited superior performance, achieving the highest intersection over union (IoU) scores and demonstrating exceptional reliability in segmenting quartz minerals, even in complex scenarios. The study involved a comprehensive dataset of 120 thin sections, encompassing 2470 hyperspectral images prepared from 20 rock samples. Expert-reviewed masks were used for model training, ensuring robust segmentation results. This automated approach not only expedites the recognition process but also enhances reliability, providing a valuable tool for geologists and advancing the field of mineralogical analysis.
Abstract: A novel method is developed that uses fractional-frequency-based multirange rulers to precisely locate passive inter-modulation (PIM) sources within radio frequency (RF) cables. The proposed method employs a set of fractional frequencies to create multiple measuring rulers with different metric ranges, which determine the tens, ones, tenths, and hundredths digits of the distance. Among these rulers, the one with the lowest frequency determines the maximum metric range, while the one with the highest frequency determines the highest achievable accuracy of the positioning system. For all rulers, the metric accuracy is uniquely determined by the phase accuracy of the detected PIM signals. With the all-phase Fourier transform method, the phases of the PIM signals at all fractional frequencies maintain almost the same accuracy, approximately 1° (about 1/360 of a wavelength in positioning accuracy) at a signal-to-noise ratio (SNR) of 10 dB. Numerical simulations verify the effectiveness of the proposed method, improving the positioning accuracy of cable PIM to the millimeter level with the highest fractional frequency operating at 200 MHz.
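The relation between phase accuracy and positioning accuracy quoted above can be checked with a short calculation. This is a rough sketch, assuming resolution = wavelength × phase error / 360 as the abstract states; the propagation velocity inside the cable is not given, so `velocity_factor` is a hypothetical parameter, as is the function name.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, in vacuum


def ruler_resolution_m(freq_hz: float,
                       phase_error_deg: float = 1.0,
                       velocity_factor: float = 1.0) -> float:
    """Distance uncertainty of one ruler for a given phase uncertainty.

    velocity_factor scales the vacuum speed of light to the (assumed)
    propagation speed in the cable; 1.0 means no slowdown.
    """
    wavelength = velocity_factor * SPEED_OF_LIGHT / freq_hz
    return wavelength * phase_error_deg / 360.0


# At 200 MHz and 1 degree of phase error the result is a few millimeters,
# consistent with the millimeter-level accuracy reported in the abstract.
print(ruler_resolution_m(200e6) * 1000, "mm")
print(ruler_resolution_m(200e6, velocity_factor=0.7) * 1000, "mm")
```

The same formula shows why a lower-frequency ruler covers a longer range at coarser resolution: its wavelength is longer, so one degree of phase corresponds to a larger distance.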
Abstract: Lower back pain is one of the most common medical problems in the world and affects a large share of the population. Because it produces a detailed view of the soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging (MRI) is considered the most effective method for imaging the spine. The semantic segmentation of vertebrae plays a major role in the diagnosis of lumbar diseases. It is difficult to semantically separate the vertebrae in MRI images from the variety of surrounding tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture that handles the challenges of medical image analysis and achieves high segmentation accuracy. This work proposes a modified U-Net architecture, MU-Net, built around a Meijering convolutional layer that incorporates the Meijering filter, to perform semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work was carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset, publicly available from Mendeley Data. On this dataset, the proposed MU-Net model achieves 98.79% pixel accuracy (PA), a 98.66% dice similarity coefficient (DSC), a 97.36% Jaccard coefficient, and a 92.55% mean Intersection over Union (mean IoU).
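The overlap metrics reported above have simple definitions, sketched here for a single binary mask. The study segments several vertebra classes, so its figures are presumably aggregated per class; this single-class version, the function names, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
def pixel_accuracy(pred, truth) -> float:
    """Fraction of pixels where the predicted label equals the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def dice_coefficient(pred, truth) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks given as 0/1 lists."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    size_sum = sum(pred) + sum(truth)
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


def jaccard_index(pred, truth) -> float:
    """Jaccard (IoU) = |A ∩ B| / |A ∪ B| for binary masks given as 0/1 lists."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return 1.0 if union == 0 else intersection / union


pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(pixel_accuracy(pred, truth))    # 0.75
print(dice_coefficient(pred, truth))
print(jaccard_index(pred, truth))     # 0.5
```

For a single mask the two overlap scores are tied by J = D / (2 − D), so Dice is never lower than Jaccard; mean IoU averages the Jaccard index over classes, which is why it can differ from a pooled Jaccard value.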
Funding: Supported by the Gansu Natural Science Foundation Programme (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and Gansu Provincial Education, Science and Technology Innovation and Industry (No. 2021CYZC-04).
Abstract: Brain tumor segmentation is critical in clinical diagnosis and treatment planning. Existing methods for brain tumor segmentation with missing modalities often struggle when several modalities are missing, a common scenario in real-world clinical settings. These methods primarily handle a single missing modality at a time, making them insufficiently robust for incomplete data containing various combinations of missing modalities. Additionally, most existing methods rely on single models, which may limit their performance and increase the risk of overfitting the training data. This work proposes a novel method called the ensemble adversarial co-training neural network (EACNet) for accurate brain tumor segmentation from multi-modal magnetic resonance imaging (MRI) scans with multiple missing modalities. The proposed method consists of three key modules: an ensemble of pre-trained models, which captures diverse feature representations from the MRI data; adversarial learning, a competitive training approach in which a generator model creates realistic missing data while sub-networks acting as discriminators learn to distinguish real data from the generated "fake" data; and a co-training framework, which uses the information extracted by the multimodal path (trained on complete scans) to guide learning in the path handling missing modalities. Through co-training interactions, the model can compensate for missing information by exploiting the relationships between the available modalities and the tumor segmentation task. EACNet was evaluated on the BraTS2018 and BraTS2020 challenge datasets and achieved state-of-the-art and competitive performance, respectively. Notably, the dice similarity coefficient (DSC) for the whole tumor (WT) reached 89.27%, surpassing existing methods. The analysis suggests that the ensemble approach offers potential benefits and that adversarial co-training contributes to the robustness and accuracy of EACNet for brain tumor segmentation of MRI scans with missing modalities. The experimental results show that EACNet is a promising candidate for real-world clinical applications.
Funding: Supported by the Natural Science Foundation of Hunan Province under Grant No. 2021JJ31142.
Abstract: With the increasing demand for indoor localization, Wi-Fi-based indoor positioning has gained wide attention due to its ease of access. In this paper, we propose a new multi-feature fusion convolutional neural network (CNN) based on channel state information (CSI) images. The new CSI image carries more feature information because it combines the amplitude and angle-of-arrival information of the CSI collected at known points. Moreover, the global mean filtering (GMC) algorithm with median filtering proposed in this paper is used to filter the CSI images and reduce their noise, yielding clearer images for network training. To extract more features from the CSI images, the traditional single-channel network is extended with a two-channel design that extracts feature information between adjacent subcarriers. Experimental evaluation in a typical indoor environment shows that the proposed method achieves good localization performance.
Abstract: Osteosarcomas are malignant neoplasms derived from undifferentiated osteogenic mesenchymal cells. They cause severe and permanent damage to human tissue and have a high mortality rate. The condition can occur in any bone; however, it most often affects the long bones of the arms and legs. Early identification and prompt intervention are essential for improving patient survival. However, the intricate composition and erratic placement of osteosarcoma make it difficult for clinicians to accurately determine the extent of the affected area. There is a pressing need for an algorithm that can automatically detect bone tumors with high accuracy. Therefore, in this study, we propose a novel feature extractor framework combined with a supervised three-class XGBoost algorithm for the detection of osteosarcoma in whole slide histopathology images. This method allows for quicker and more effective data analysis. The first step involves preprocessing the imbalanced histopathology dataset, followed by augmentation and balancing using two techniques: SMOTE and ADASYN. Next, the feature extraction framework extracts features, which are then fed into the supervised three-class XGBoost algorithm for classification into three categories: non-tumor, viable tumor, and non-viable tumor. The experimental findings indicate that the proposed model is more efficient, more accurate, and more lightweight than other current models for osteosarcoma detection.
Abstract: In digital signal processing, image enhancement and image denoising are challenging tasks because pixel quality must be preserved. Several approaches, from conventional methods to deep learning, are used to address them, but they still face challenges such as high computational requirements, overfitting, and poor generalization. Optimization algorithms provide greater control and transparency in designing digital filters for image enhancement and denoising. Therefore, this paper presents a novel denoising approach for medical applications using an Optimized Learning-based Multi-level discrete Wavelet Cascaded Convolutional Neural Network (OLMWCNN). In this approach, the optimal filter parameters are identified to preserve image quality after denoising. The performance and efficiency of the OLMWCNN filter are evaluated, demonstrating significant progress in denoising medical images while overcoming the limitations of conventional methods.
Abstract: AIM: To find an effective contrast enhancement method for retinal images that supports effective segmentation of retinal features. METHODS: A novel image preprocessing method using neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) was proposed to improve retinal image contrast, aid in the accurate identification of retinal disorders, and improve the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to denoise the images without compromising important retinal structures. The NICLAHE algorithm was inspired by the classical CLAHE algorithm but enhances it by selecting the clip limits and tile sizes dynamically, relative to the pixel values in an image, instead of using fixed values. It was evaluated on the Drive and high-resolution fundus (HRF) datasets using conventional quality measures. RESULTS: The proposed preprocessing technique was applied to two retinal image databases, Drive and HRF, with four quality metrics: root mean square error (RMSE), peak signal to noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique outperformed the traditional enhancement methods on both datasets. To assess the compatibility of the method with automated diagnosis, a deep learning framework named ResNet was applied to the segmentation of retinal blood vessels. Sensitivity, specificity, precision, and accuracy were used to analyse the performance. NICLAHE-enhanced images outperformed the traditional techniques on both datasets with improved accuracy. CONCLUSION: NICLAHE provides better results than traditional methods, with less error and improved contrast-related values. Segmentation on the enhanced images also yields better sensitivity, specificity, precision, and accuracy on both datasets.
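Three of the quality metrics named above have standard definitions, sketched below for flat lists of 8-bit pixel values. The exact variants used in the study are not stated, so the normalization in `rms_contrast`, the 255 peak value, and all function names are assumptions; "overall contrast" is omitted because the abstract does not define it.

```python
import math


def rmse(reference, test) -> float:
    """Root mean square error between two equally sized pixel lists."""
    squared = sum((r - t) ** 2 for r, t in zip(reference, test))
    return math.sqrt(squared / len(reference))


def psnr(reference, test, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = rmse(reference, test)
    if error == 0:
        return math.inf
    return 20.0 * math.log10(max_value / error)


def rms_contrast(pixels, max_value: float = 255.0) -> float:
    """Standard deviation of intensities after scaling them to [0, 1]."""
    scaled = [p / max_value for p in pixels]
    mean = sum(scaled) / len(scaled)
    return math.sqrt(sum((v - mean) ** 2 for v in scaled) / len(scaled))


original = [0, 0, 0, 0]
enhanced = [10, 10, 10, 10]
print(rmse(original, enhanced))   # 10.0
print(psnr(original, enhanced))
print(rms_contrast([0, 255]))     # 0.5
```

Lower RMSE and higher PSNR indicate that enhancement introduced less distortion, while a higher RMS contrast indicates a wider spread of intensities.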
Funding: Supported by the Natural Science Foundation of Zhejiang Province (No. LQ20F020024).
Abstract: The many intricate varieties of lung disease make it challenging for computer-aided diagnosis to segment lung lesions correctly in computed tomography (CT) images. This study integrates transfer learning with an attention mechanism to construct a deep learning model that automatically detects novel coronavirus pneumonia (COVID-19) on lung CT images. VGG16 pre-trained on ImageNet serves as the encoder, and the decoder follows the U-Net structure. An attention module is incorporated at each concatenation step, allowing the model to concentrate on critical information and identify the crucial components efficiently. The public COVID-19-CT-Seg-Benchmark dataset was used for the experiments, and the highest scores for Dice, F1, and Accuracy were 0.9071, 0.9076, and 0.9965, respectively. Generalization performance was assessed as well, with Dice, F1, and Accuracy all above 0.8. The experimental findings indicate the feasibility of the proposed segmentation network.
Funding: Supported by the National Key R&D Program of China (2021YFC2600400 and 2023YFC2605200), the National Key Research Program of China (2021YFD1401100), and the "San Nong Jiu Fang" Sciences and Technologies Cooperation Project of Zhejiang Province, China (2024SNJF010).
Abstract: Agromyzid leafminers cause significant economic losses in both vegetable and horticultural crops, and precise assessments of pesticide needs must be based on the extent of leaf damage. Traditionally, surveyors estimate the damage by visually comparing the proportion of damaged to intact leaf area, a method that lacks objectivity, precision, and reliable data traceability. To address these issues, this study developed an advanced survey system that combines augmented reality (AR) glasses with a camera and an artificial intelligence (AI) algorithm to objectively and accurately assess leafminer damage in the field. Wearing AR glasses equipped with a voice-controlled camera, surveyors can flatten damaged leaves by hand and capture images for analysis. This method provides a precise and reliable diagnosis of leafminer damage levels, which in turn supports scientifically grounded and targeted pest management strategies. To calculate the leafminer damage level, the DeepLab-Leafminer model was proposed to precisely segment the leafminer-damaged regions and the intact leaf region. Integrating an edge-aware module and a Canny loss function into the DeepLabv3+ model enhanced the DeepLab-Leafminer model's ability to accurately segment the edges of leafminer-damaged regions, which often have irregular shapes. Compared with state-of-the-art segmentation models, the DeepLab-Leafminer model achieved superior segmentation performance, with an Intersection over Union (IoU) of 81.23% and an F1 score of 87.92% on leafminer-damaged leaves. The test results showed a 92.38% diagnosis accuracy for leafminer damage levels based on the DeepLab-Leafminer model. A mobile application and a web platform were developed to display the diagnostic results to surveyors. This system provides surveyors with an advanced, user-friendly, and accurate tool for assessing agromyzid leafminer damage in agricultural fields using wearable AR glasses and an AI model. The method can also be used to automatically diagnose pest and disease damage levels in other crops based on leaf images.
Funding: This work was supported by the Astronomy Joint Research Fund under cooperative agreements between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS) (project numbers U2031132 and U1931206).
Abstract: This paper proposes a subpixel transformation method to correct Keystone and Smile distortions in fiber spectral images from the Fiber Arrayed Solar Optical Telescope. These distortions affect the spatial and spectral positions, degrading resolution and accuracy. To correct Keystone distortion, we use a local summation and peak-finding method to locate the central horizontal coordinates, calculate shifting values, and straighten the curves. For Smile distortion, we use quartic polynomial fitting based on absorption lines at different wavelengths. This technique preserves subpixel components, redistributes pixel values, and interpolates non-fiber portions, rectifying the spectra for accurate analysis. The method can also be applied to other astronomical projects such as the Large Sky Area Multi-Object Fiber Spectroscopic Telescope, enhancing the accuracy and reliability of spectral data in various astronomical studies.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 41874173).
Abstract: Studying the varied morphology of auroras helps us understand physical processes in space and the mechanisms behind these patterns. Auroral arcs are the brightest and most prominent auroral patterns. Because the edges of auroral shapes are difficult to define precisely, auroral arc skeleton extraction offers an alternative representation for studying auroral morphology, since skeletons capture the key morphological features of complex auroral shapes. Transformer models better capture the relationship between overall morphology and local details when processing image data, so we propose a Transformer-based method for auroral arc skeleton extraction. Combined with ridge-guided annotation on all-sky images, a Transformer-based skeleton extractor is trained and used to estimate the number of auroral arcs. Experiments demonstrate that the Transformer-based model captures the structural information and local details of auroral arcs more effectively, making it suitable for complex auroral morphologies.
Abstract: In recent years, camouflage technology has evolved from single-spectral-band applications to multifunctional and multispectral implementations. Hyperspectral imaging has emerged as a powerful technique for target identification due to its capacity to capture both spectral and spatial information. The advancement of imaging spectroscopy technology has significantly enhanced reconnaissance capabilities, offering substantial advantages in camouflaged target classification and detection. However, the increasing spectral similarity between camouflaged targets and their backgrounds has significantly compromised detection performance in specific scenarios. Conventional feature extraction methods are often limited to single, shallow spectral or spatial features, failing to extract deep features and consequently yielding suboptimal classification accuracy. To address these limitations, this study proposes a 3D-2D convolutional neural network architecture incorporating depthwise separable convolution (DSC) and attention mechanisms (AM). The framework first applies dimensionality reduction to hyperspectral images and extracts preliminary spectral-spatial features. It then employs an alternating combination of 3D and 2D convolutions for deep feature extraction. For target classification, the LogSoftmax function is used. The integration of depthwise separable convolution not only enhances classification accuracy but also substantially reduces the number of model parameters. Furthermore, the attention mechanisms significantly improve the network's ability to represent multidimensional features. Extensive experiments were conducted on a custom land-based hyperspectral image dataset. The results show classification accuracies of 98.74% for grassland camouflage, 99.13% for dead leaf camouflage, and 98.94% for wild grass camouflage. Comparative analysis shows that the proposed framework achieves higher classification accuracy and robustness than the compared methods for camouflage target classification.
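The parameter saving from depthwise separable convolution and the LogSoftmax output can both be illustrated with a short calculation. This is a generic sketch for a 2D layer without bias terms; the kernel size and channel counts in the example are illustrative assumptions, not the layer sizes of the proposed network, and the function names are hypothetical.

```python
import math


def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2D convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    return kernel * kernel * c_in + c_in * c_out


def log_softmax(logits):
    """Numerically stable log-softmax: x_i - log(sum_j exp(x_j))."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum_exp for x in logits]


# Example layer: 3x3 kernel, 64 input channels, 128 output channels (assumed sizes).
print(standard_conv_params(3, 64, 128))        # 73728
print(depthwise_separable_params(3, 64, 128))  # 8768
print(log_softmax([2.0, 1.0, 0.1]))
```

In this example the separable layer needs roughly one eighth of the weights, which is the kind of reduction the abstract attributes to DSC; exponentiating the log-softmax outputs recovers class probabilities that sum to one.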