Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of...Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of complex diseases,with some even achieving clinical translation.Changes in the overall size,shape,boundary,and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity.However,the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference,including overlapping organoids,bubbles,dust particles,and cell fragments.This paper introduces the precision organoid segmentation technique(POST),which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions.Unlike existing methods,POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging.Furthermore,it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments.POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.展开更多
Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation...Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation.The objective of this review is to evaluate the advances,relevances,and limitations of GANs in medical imaging.An organised literature review was conducted following the guidelines of PRISMA(Preferred Reporting Items for Systematic Reviews and Meta-Analyses).The literature considered included peer-reviewed papers published between 2020 and 2025 across databases including PubMed,IEEE Xplore,and Scopus.The studies related to applications of GAN architectures in medical imaging with reported experimental outcomes and published in English in reputable journals and conferences were considered for the review.Thesis,white papers,communication letters,and non-English articles were not included for the same.CLAIM based quality assessment criteria were applied to the included studies to assess the quality.The study classifies diverse GAN architectures,summarizing their clinical applications,technical performances,and their implementation hardships.Key findings reveal the increasing applications of GANs for enhancing diagnostic accuracy,reducing data scarcity through synthetic data generation,and supporting modality translation.However,concerns such as limited generalizability,lack of clinical validation,and regulatory constraints persist.This review provides a comprehensive study of the prevailing scenario of GANs in medical imaging and highlights crucial research gaps and future directions.Though GANs hold transformative capability for medical imaging,their integration into clinical use demands further validation,interpretability,and regulatory alignment.展开更多
Unmanned aerial vehicle(UAV)-borne gamma-ray spectrum survey plays a crucial role in geological mapping,radioactive mineral exploration,and environmental monitoring.However,raw data are often compromised by flight and...Unmanned aerial vehicle(UAV)-borne gamma-ray spectrum survey plays a crucial role in geological mapping,radioactive mineral exploration,and environmental monitoring.However,raw data are often compromised by flight and instrument background noise,as well as detector resolution limitations,which affect the accuracy of geological interpretations.This study aims to explore the application of the Real-ESRGAN algorithm in the super-resolution reconstruction of UAV-borne gamma-ray spectrum images to enhance spatial resolution and the quality of geological feature visualization.We conducted super-resolution reconstruction experiments with 2×,4×and 6×magnification using the Real-ESRGAN algorithm,comparing the results with three other mainstream algorithms(SRCNN,SRGAN,FSRCNN)to verify the superiority in image quality.The experimental results indicate that Real-ESRGAN achieved a structural similarity index(SSIM)value of 0.950 at 2×magnification,significantly higher than the other algorithms,demonstrating its advantage in detail preservation.Furthermore,Real-ESRGAN effectively reduced ringing and overshoot artifacts,enhancing the clarity of geological structures and mineral deposit sites,thus providing high-quality visual information for geological exploration.展开更多
With the rapid development of transportation infrastructure,ensuring road safety through timely and accurate highway inspection has become increasingly critical.Traditional manual inspection methods are not only time-...With the rapid development of transportation infrastructure,ensuring road safety through timely and accurate highway inspection has become increasingly critical.Traditional manual inspection methods are not only time-consuming and labor-intensive,but they also struggle to provide consistent,high-precision detection and realtime monitoring of pavement surface defects.To overcome these limitations,we propose an Automatic Recognition of PavementDefect(ARPD)algorithm,which leverages unmanned aerial vehicle(UAV)-based aerial imagery to automate the inspection process.The ARPD framework incorporates a backbone network based on the Selective State Space Model(S3M),which is designed to capture long-range temporal dependencies.This enables effective modeling of dynamic correlations among redundant and often repetitive structures commonly found in road imagery.Furthermore,a neck structure based on Semantics and Detail Infusion(SDI)is introduced to guide cross-scale feature fusion.The SDI module enhances the integration of low-level spatial details with high-level semantic cues,thereby improving feature expressiveness and defect localization accuracy.Experimental evaluations demonstrate that theARPDalgorithm achieves a mean average precision(mAP)of 86.1%on a custom-labeled pavement defect dataset,outperforming the state-of-the-art YOLOv11 segmentation model.The algorithm also maintains strong generalization ability on public datasets.These results confirm that ARPD is well-suited for diverse real-world applications in intelligent,large-scale highway defect monitoring and maintenance planning.展开更多
Background:Diabetic macular edema is a prevalent retinal condition and a leading cause of visual impairment among diabetic patients’Early detection of affected areas is beneficial for effective diagnosis and treatmen...Background:Diabetic macular edema is a prevalent retinal condition and a leading cause of visual impairment among diabetic patients’Early detection of affected areas is beneficial for effective diagnosis and treatment.Traditionally,diagnosis relies on optical coherence tomography imaging technology interpreted by ophthalmologists.However,this manual image interpretation is often slow and subjective.Therefore,developing automated segmentation for macular edema images is essential to enhance to improve the diagnosis efficiency and accuracy.Methods:In order to improve clinical diagnostic efficiency and accuracy,we proposed a SegNet network structure integrated with a convolutional block attention module(CBAM).This network introduces a multi-scale input module,the CBAM attention mechanism,and jump connection.The multi-scale input module enhances the network’s perceptual capabilities,while the lightweight CBAM effectively fuses relevant features across channels and spatial dimensions,allowing for better learning of varying information levels.Results:Experimental results demonstrate that the proposed network achieves an IoU of 80.127%and an accuracy of 99.162%.Compared to the traditional segmentation network,this model has fewer parameters,faster training and testing speed,and superior performance on semantic segmentation tasks,indicating its highly practical applicability.Conclusion:The C-SegNet proposed in this study enables accurate segmentation of Diabetic macular edema lesion images,which facilitates quicker diagnosis for healthcare professionals.展开更多
Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi...Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi-stego images provides good image quality but often results in low embedding capability.To address these challenges,this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image.The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order.Four secret bits are embedded into each block’s maximum pixel value,while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold.A similar embedding strategy is also applied to the minimum side of the block,including the second-smallest pixel value.This design enables each block to embed up to 14 bits of secret data.Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches,advancing the field of reversible steganography.展开更多
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods ex...Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods exhibit deficiencies in detail recovery and noise suppression,particularly when processing complex landscapes(e.g.,forests,farmlands),leading to artifacts and spectral distortions that limit practical utility.To address this,we propose an enhanced Super-Resolution Generative Adversarial Network(SRGAN)framework featuring three key innovations:(1)Replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing;(2)A multi-loss joint optimization strategy dynamically weighting Charbonnier loss(β=0.5),Visual Geometry Group(VGG)perceptual loss(α=1),and adversarial loss(γ=0.1)to synergize pixel-level accuracy and perceptual quality;(3)A multi-scale residual network(MSRN)capturing cross-scale texture features(e.g.,forest canopies,mountain contours).Validated on Sentinel-2(10 m)and SPOT-6/7(2.5 m)datasets covering 904 km2 in Motuo County,Xizang,our method outperforms the SRGAN baseline(SR4RS)with Peak Signal-to-Noise Ratio(PSNR)gains of 0.29 dB and Structural Similarity Index(SSIM)improvements of 3.08%on forest imagery.Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity(LPIPS)increases.The method significantly improves noise robustness and edge retention in complex geomorphology,demonstrating 18%faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring.Future work will integrate spectral constraints and lightweight architectures.展开更多
Recent studies indicate that millions of individuals suffer from renal diseases,with renal carcinoma,a type of kidney cancer,emerging as both a chronic illness and a significant cause of mortality.Magnetic Resonance I...Recent studies indicate that millions of individuals suffer from renal diseases,with renal carcinoma,a type of kidney cancer,emerging as both a chronic illness and a significant cause of mortality.Magnetic Resonance Imaging(MRI)and Computed Tomography(CT)have become essential tools for diagnosing and assessing kidney disorders.However,accurate analysis of thesemedical images is critical for detecting and evaluating tumor severity.This study introduces an integrated hybrid framework that combines three complementary deep learning models for kidney tumor segmentation from MRI images.The proposed framework fuses a customized U-Net and Mask R-CNN using a weighted scheme to achieve semantic and instance-level segmentation.The fused outputs are further refined through edge detection using Stochastic FeatureMapping Neural Networks(SFMNN),while volumetric consistency is ensured through Improved Mini-Batch K-Means(IMBKM)clustering integrated with an Encoder-Decoder Convolutional Neural Network(EDCNN).The outputs of these three stages are combined through a weighted fusion mechanism,with optimal weights determined empirically.Experiments on MRI scans from the TCGA-KIRC dataset demonstrate that the proposed hybrid framework significantly outperforms standalone models,achieving a Dice Score of 92.5%,an IoU of 87.8%,a Precision of 93.1%,a Recall of 90.8%,and a Hausdorff Distance of 2.8 mm.These findings validate that the weighted integration of complementary architectures effectively overcomes key limitations in kidney tumor segmentation,leading to improved diagnostic accuracy and robustness in medical image analysis.展开更多
Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often stru...Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often struggle with low-contrast MRI images,class imbalance,and suboptimal feature extraction.This paper develops a Hybrid DL system that unites MobileNetV2 with adaptive classification methods to boost Alzheimer’s diagnosis by processing MRI scans.Image enhancement is done using Contrast-Limited Adaptive Histogram Equalization(CLAHE)and Enhanced Super-Resolution Generative Adversarial Networks(ESRGAN).A classification robustness enhancement system integrates class weighting techniques and a Matthews Correlation Coefficient(MCC)-based evaluation method into the design.The trained and validated model gives a 98.88%accuracy rate and 0.9614 MCC score.We also performed a 10-fold cross-validation experiment with an average accuracy of 96.52%(±1.51),a loss of 0.1671,and an MCC score of 0.9429 across folds.The proposed framework outperforms the state-of-the-art models with a 98%weighted F1-score while decreasing misdiagnosis results for every AD stage.The model demonstrates apparent separation abilities between AD progression stages according to the results of the confusion matrix analysis.These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis,contributing to improved computer-aided diagnosis(CAD)systems in clinical practice.展开更多
AIM:To find the effective contrast enhancement method on retinal images for effective segmentation of retinal features.METHODS:A novel image preprocessing method that used neighbourhood-based improved contrast limited...AIM:To find the effective contrast enhancement method on retinal images for effective segmentation of retinal features.METHODS:A novel image preprocessing method that used neighbourhood-based improved contrast limited adaptive histogram equalization(NICLAHE)to improve retinal image contrast was suggested to aid in the accurate identification of retinal disorders and improve the visibility of fine retinal structures.Additionally,a minimal-order filter was applied to effectively denoise the images without compromising important retinal structures.The novel NICLAHE algorithm was inspired by the classical CLAHE algorithm,but enhanced it by selecting the clip limits and tile sized in a dynamical manner relative to the pixel values in an image as opposed to using fixed values.It was evaluated on the Drive and high-resolution fundus(HRF)datasets on conventional quality measures.RESULTS:The new proposed preprocessing technique was applied to two retinal image databases,Drive and HRF,with four quality metrics being,root mean square error(RMSE),peak signal to noise ratio(PSNR),root mean square contrast(RMSC),and overall contrast.The technique performed superiorly on both the data sets as compared to the traditional enhancement methods.In order to assess the compatibility of the method with automated diagnosis,a deep learning framework named ResNet was applied in the segmentation of retinal blood vessels.Sensitivity,specificity,precision and accuracy were used to analyse the performance.NICLAHE–enhanced images outperformed the traditional techniques on both the datasets with improved accuracy.CONCLUSION:NICLAHE provides better results than traditional methods with less error and improved contrastrelated values.These enhanced images are subsequently measured by sensitivity,specificity,precision,and accuracy,which yield a better result in both datasets.展开更多
The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is i...The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection.In this study,we integrate Raman imaging technology with an artificial intelligence(AI)generative model,proposing an innovative approach for intraoperative margin status diagnosis.This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images,which are then transformed into hematoxylin-eosin(H&E)-stained histopathological images using an AI generative model for histopathological diagnosis.The generated H&E-stained images clearly illustrate the tissue’s pathological conditions.Independently reviewed by three pathologists,the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%.Notably,it outperforms current clinical practices,especially in TSCC with positive lymph node metastasis or moderately differentiated grades.This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations,promising a versatile diagnostic tool beyond TSCC.展开更多
The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photograp...The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photographed objects,coupled with complex shooting environments,existing models often struggle to achieve accurate real-time target detection.In this paper,a You Only Look Once v8(YOLOv8)model is modified from four aspects:the detection head,the up-sampling module,the feature extraction module,and the parameter optimization of positive sample screening,and the YOLO-S3DT model is proposed to improve the performance of the model for detecting small targets in aerial images.Experimental results show that all detection indexes of the proposed model are significantly improved without increasing the number of model parameters and with the limited growth of computation.Moreover,this model also has the best performance compared to other detecting models,demonstrating its advancement within this category of tasks.展开更多
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-a...Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-aided diagnosis system is highly required.Ultrasound is an important examination technique for breast cancer diagnosis due to its low cost.Recently,many learning-based techniques have been introduced to classify breast cancer using breast ultrasound imaging dataset(BUSI)datasets;however,the manual handling is not an easy process and time consuming.The authors propose an EfficientNet-integrated ResNet deep network and XAI-based framework for accurately classifying breast cancer(malignant and benign).In the initial step,data augmentation is performed to increase the number of training samples.For this purpose,three-pixel flip mathematical equations are introduced:horizontal,vertical,and 90°.Later,two pretrained deep learning models were employed,skipped some layers,and fine-tuned.Both fine-tuned models are later trained using a deep transfer learning process and extracted features from the deeper layer.Explainable artificial intelligence-based analysed the performance of trained models.After that,a new feature selection technique is proposed based on the cuckoo search algorithm called cuckoo search controlled standard error mean.This technique selects the best features and fuses using a new parallel zeropadding maximum correlated coefficient features.In the end,the selection algorithm is applied again to the fused feature vector and classified using machine learning algorithms.The experimental process of the proposed framework is conducted on a publicly available BUSI and obtained 98.4%and 98%accuracy in two different experiments.Comparing the proposed framework is also conducted with recent techniques and shows improved accuracy.In addition,the proposed framework was executed less than the original deep learning models.展开更多
Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale who...Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale whole slide images(WSIs)features to predict the response to breast cancer NAC more finely.Methods:This work collected 1,670 whole slide images for training and validation sets,internal testing sets,external testing sets,and prospective testing sets of the weakly-supervised deep learning-based multi-task model(DLMM)in predicting treatment response and pCR to NAC.Our approach models two-by-two feature interactions across scales by employing concatenate fusion of single-scale feature representations,and controls the expressiveness of each representation via a gating-based attention mechanism.Results:In the retrospective analysis,DLMM exhibited excellent predictive performance for the prediction of treatment response,with area under the receiver operating characteristic curves(AUCs)of 0.869[95%confidence interval(95%CI):0.806−0.933]in the internal testing set and 0.841(95%CI:0.814−0.867)in the external testing sets.For the pCR prediction task,DLMM reached AUCs of 0.865(95%CI:0.763−0.964)in the internal testing and 0.821(95%CI:0.763−0.878)in the pooled external testing set.In the prospective testing study,DLMM also demonstrated favorable predictive performance,with AUCs of 0.829(95%CI:0.754−0.903)and 0.821(95%CI:0.692−0.949)in treatment response and pCR prediction,respectively.DLMM significantly outperformed the baseline models in all testing sets(P<0.05).Heatmaps were employed to interpret the decision-making basis of the model.Furthermore,it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment during biological basis exploration.Conclusions:The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.展开更多
Kinship verification is a key biometric recognition task that determines biological relationships based on physical features.Traditional methods predominantly use facial recognition,leveraging established techniques a...Kinship verification is a key biometric recognition task that determines biological relationships based on physical features.Traditional methods predominantly use facial recognition,leveraging established techniques and extensive datasets.However,recent research has highlighted ear recognition as a promising alternative,offering advantages in robustness against variations in facial expressions,aging,and occlusions.Despite its potential,a significant challenge in ear-based kinship verification is the lack of large-scale datasets necessary for training deep learning models effectively.To address this challenge,we introduce the EarKinshipVN dataset,a novel and extensive collection of ear images designed specifically for kinship verification.This dataset consists of 4876 high-resolution color images from 157 multiracial families across different regions,forming 73,220 kinship pairs.EarKinshipVN,a diverse and large-scale dataset,advances kinship verification research using ear features.Furthermore,we propose the Mixer Attention Inception(MAI)model,an improved architecture that enhances feature extraction and classification accuracy.The MAI model fuses Inceptionv4 and MLP Mixer,integrating four attention mechanisms to enhance spatial and channel-wise feature representation.Experimental results demonstrate that MAI significantly outperforms traditional backbone architectures.It achieves an accuracy of 98.71%,surpassing Vision Transformer models while reducing computational complexity by up to 95%in parameter usage.These findings suggest that ear-based kinship verification,combined with an optimized deep learning model and a comprehensive dataset,holds significant promise for biometric applications.展开更多
The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy,which is critical for obtaining a profound understanding of the Earth's composition and ensuring ef...The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy,which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation utilization of its resources.However,the existing methods for classifying mineral particles do not fully utilize these multi-modal features,thereby limiting the classification accuracy.Furthermore,when conventional multi-modal image classification methods are applied to planepolarized and cross-polarized sequence images of mineral particles,they encounter issues such as information loss,misaligned features,and challenges in spatiotemporal feature extraction.To address these challenges,we propose a multi-modal mineral particle polarization image classification network(MMGC-Net)for precise mineral particle classification.Initially,MMGC-Net employs a two-dimensional(2D)backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment.Subsequently,a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images.Ultimately,the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision.Quantitative and qualitative experimental results indicate that when compared with the current state-of-the-art multi-modal image classification methods,MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics.It also demonstrates better stability than the existing models.展开更多
The precise identification of quartz minerals is crucial in mineralogy and geology due to their widespread occurrence and industrial significance.Traditional methods of quartz identification in thin sections are labor...The precise identification of quartz minerals is crucial in mineralogy and geology due to their widespread occurrence and industrial significance.Traditional methods of quartz identification in thin sections are labor-intensive and require significant expertise,often complicated by the coexistence of other minerals.This study presents a novel approach leveraging deep learning techniques combined with hyperspectral imaging to automate the identification process of quartz minerals.The utilizied four advanced deep learning models—PSPNet,U-Net,FPN,and LinkNet—has significant advancements in efficiency and accuracy.Among these models,PSPNet exhibited superior performance,achieving the highest intersection over union(IoU)scores and demonstrating exceptional reliability in segmenting quartz minerals,even in complex scenarios.The study involved a comprehensive dataset of 120 thin sections,encompassing 2470 hyperspectral images prepared from 20 rock samples.Expert-reviewed masks were used for model training,ensuring robust segmentation results.This automated approach not only expedites the recognition process but also enhances reliability,providing a valuable tool for geologists and advancing the field of mineralogical analysis.展开更多
A novel method is developed by utilizing the fractional frequency based multirange rulers to precisely position the passive inter-modulation(PIM)sources within radio frequency(RF)cables.The proposed method employs a s...A novel method is developed by utilizing the fractional frequency based multirange rulers to precisely position the passive inter-modulation(PIM)sources within radio frequency(RF)cables.The proposed method employs a set of fractional frequencies to create multiple measuring rulers with different metric ranges to determine the values of the tens,ones,tenths,and hundredths digits of the distance.Among these rulers,the one with the lowest frequency determines the maximum metric range,while the one with the highest frequency decides the highest achievable accuracy of the position system.For all rulers,the metric accuracy is uniquely determined by the phase accuracy of the detected PIM signals.With the all-phase Fourier transform method,the phases of the PIM signals at all fractional frequencies maintain almost the same accuracy,approximately 1°(about 1/360 wavelength in the positioning accuracy)at the signal-to-noise ratio(SNR)of 10 d B.Numerical simulations verify the effectiveness of the proposed method,improving the positioning accuracy of the cable PIM up to a millimeter level with the highest fractional frequency operating at 200 MHz.展开更多
基金supported by the National Key R&D Program of China(No.2022YFC2504403)the National Natural Science Foundation of China(No.62172202)+1 种基金the Experiment Project of China Manned Space Program(No.HYZHXM01019)the Fundamental Research Funds for the Central Universities from Southeast University(No.3207032101C3)。
文摘Organoids possess immense potential for unraveling the intricate functions of human tissues and facilitating preclinical disease treatment.Their applications span from high-throughput drug screening to the modeling of complex diseases,with some even achieving clinical translation.Changes in the overall size,shape,boundary,and other morphological features of organoids provide a noninvasive method for assessing organoid drug sensitivity.However,the precise segmentation of organoids in bright-field microscopy images is made difficult by the complexity of the organoid morphology and interference,including overlapping organoids,bubbles,dust particles,and cell fragments.This paper introduces the precision organoid segmentation technique(POST),which is a deep-learning algorithm for segmenting challenging organoids under simple bright-field imaging conditions.Unlike existing methods,POST accurately segments each organoid and eliminates various artifacts encountered during organoid culturing and imaging.Furthermore,it is sensitive to and aligns with measurements of organoid activity in drug sensitivity experiments.POST is expected to be a valuable tool for drug screening using organoids owing to its capability of automatically and rapidly eliminating interfering substances and thereby streamlining the organoid analysis and drug screening process.
基金supported by Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/540/46.
文摘Over the years,Generative Adversarial Networks(GANs)have revolutionized the medical imaging industry for applications such as image synthesis,denoising,super resolution,data augmentation,and cross-modality translation.The objective of this review is to evaluate the advances,relevances,and limitations of GANs in medical imaging.An organised literature review was conducted following the guidelines of PRISMA(Preferred Reporting Items for Systematic Reviews and Meta-Analyses).The literature considered included peer-reviewed papers published between 2020 and 2025 across databases including PubMed,IEEE Xplore,and Scopus.The studies related to applications of GAN architectures in medical imaging with reported experimental outcomes and published in English in reputable journals and conferences were considered for the review.Thesis,white papers,communication letters,and non-English articles were not included for the same.CLAIM based quality assessment criteria were applied to the included studies to assess the quality.The study classifies diverse GAN architectures,summarizing their clinical applications,technical performances,and their implementation hardships.Key findings reveal the increasing applications of GANs for enhancing diagnostic accuracy,reducing data scarcity through synthetic data generation,and supporting modality translation.However,concerns such as limited generalizability,lack of clinical validation,and regulatory constraints persist.This review provides a comprehensive study of the prevailing scenario of GANs in medical imaging and highlights crucial research gaps and future directions.Though GANs hold transformative capability for medical imaging,their integration into clinical use demands further validation,interpretability,and regulatory alignment.
基金supported by the National Natural Science Foundation of China(Nos.12205044 and 12265003)2024 Jiangxi Province Civil-Military Integration Research Institute‘BeiDou+’Project Subtopic(No.2024JXRH0Y06).
文摘Unmanned aerial vehicle(UAV)-borne gamma-ray spectrum survey plays a crucial role in geological mapping,radioactive mineral exploration,and environmental monitoring.However,raw data are often compromised by flight and instrument background noise,as well as detector resolution limitations,which affect the accuracy of geological interpretations.This study aims to explore the application of the Real-ESRGAN algorithm in the super-resolution reconstruction of UAV-borne gamma-ray spectrum images to enhance spatial resolution and the quality of geological feature visualization.We conducted super-resolution reconstruction experiments with 2×,4×and 6×magnification using the Real-ESRGAN algorithm,comparing the results with three other mainstream algorithms(SRCNN,SRGAN,FSRCNN)to verify the superiority in image quality.The experimental results indicate that Real-ESRGAN achieved a structural similarity index(SSIM)value of 0.950 at 2×magnification,significantly higher than the other algorithms,demonstrating its advantage in detail preservation.Furthermore,Real-ESRGAN effectively reduced ringing and overshoot artifacts,enhancing the clarity of geological structures and mineral deposit sites,thus providing high-quality visual information for geological exploration.
基金supported in part by the Technical Service for the Development and Application of an Intelligent Visual Management Platformfor Expressway Construction Progress Based on BIM Technology(grant NO.JKYZLX-2023-09)in partby the Technical Service for the Development of an Early Warning Model in the Research and Application of Key Technologies for Tunnel Operation Safety Monitoring and Early Warning Based on Digital Twin(grant NO.JK-S02-ZNGS-202412-JISHU-FA-0035)sponsored by Yunnan Transportation Science Research Institute Co.,Ltd.
文摘With the rapid development of transportation infrastructure,ensuring road safety through timely and accurate highway inspection has become increasingly critical.Traditional manual inspection methods are not only time-consuming and labor-intensive,but they also struggle to provide consistent,high-precision detection and realtime monitoring of pavement surface defects.To overcome these limitations,we propose an Automatic Recognition of PavementDefect(ARPD)algorithm,which leverages unmanned aerial vehicle(UAV)-based aerial imagery to automate the inspection process.The ARPD framework incorporates a backbone network based on the Selective State Space Model(S3M),which is designed to capture long-range temporal dependencies.This enables effective modeling of dynamic correlations among redundant and often repetitive structures commonly found in road imagery.Furthermore,a neck structure based on Semantics and Detail Infusion(SDI)is introduced to guide cross-scale feature fusion.The SDI module enhances the integration of low-level spatial details with high-level semantic cues,thereby improving feature expressiveness and defect localization accuracy.Experimental evaluations demonstrate that theARPDalgorithm achieves a mean average precision(mAP)of 86.1%on a custom-labeled pavement defect dataset,outperforming the state-of-the-art YOLOv11 segmentation model.The algorithm also maintains strong generalization ability on public datasets.These results confirm that ARPD is well-suited for diverse real-world applications in intelligent,large-scale highway defect monitoring and maintenance planning.
基金supported by the Guangdong Pharmaceutical University 2024 Higher Education Research Projects(GKP202403,GMP202402)the Guangdong Pharmaceutical University College Students’Innovation and Entrepreneurship Training Programs(Grant No.202504302033,202504302034,202504302036,and 202504302244).
文摘Background:Diabetic macular edema is a prevalent retinal condition and a leading cause of visual impairment among diabetic patients’Early detection of affected areas is beneficial for effective diagnosis and treatment.Traditionally,diagnosis relies on optical coherence tomography imaging technology interpreted by ophthalmologists.However,this manual image interpretation is often slow and subjective.Therefore,developing automated segmentation for macular edema images is essential to enhance to improve the diagnosis efficiency and accuracy.Methods:In order to improve clinical diagnostic efficiency and accuracy,we proposed a SegNet network structure integrated with a convolutional block attention module(CBAM).This network introduces a multi-scale input module,the CBAM attention mechanism,and jump connection.The multi-scale input module enhances the network’s perceptual capabilities,while the lightweight CBAM effectively fuses relevant features across channels and spatial dimensions,allowing for better learning of varying information levels.Results:Experimental results demonstrate that the proposed network achieves an IoU of 80.127%and an accuracy of 99.162%.Compared to the traditional segmentation network,this model has fewer parameters,faster training and testing speed,and superior performance on semantic segmentation tasks,indicating its highly practical applicability.Conclusion:The C-SegNet proposed in this study enables accurate segmentation of Diabetic macular edema lesion images,which facilitates quicker diagnosis for healthcare professionals.
基金funded by University of Transport and Communications(UTC)under grant number T2025-CN-004.
文摘Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi-stego images provides good image quality but often results in low embedding capability.To address these challenges,this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image.The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order.Four secret bits are embedded into each block’s maximum pixel value,while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold.A similar embedding strategy is also applied to the minimum side of the block,including the second-smallest pixel value.This design enables each block to embed up to 14 bits of secret data.Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches,advancing the field of reversible steganography.
基金This study was supported by:Inner Mongolia Academy of Forestry Sciences Open Research Project(Grant No.KF2024MS03)The Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences(Grant No.2024NLTS04)The Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University(Grant No.X202410022268).
文摘Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods exhibit deficiencies in detail recovery and noise suppression,particularly when processing complex landscapes(e.g.,forests,farmlands),leading to artifacts and spectral distortions that limit practical utility.To address this,we propose an enhanced Super-Resolution Generative Adversarial Network(SRGAN)framework featuring three key innovations:(1)Replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing;(2)A multi-loss joint optimization strategy dynamically weighting Charbonnier loss(β=0.5),Visual Geometry Group(VGG)perceptual loss(α=1),and adversarial loss(γ=0.1)to synergize pixel-level accuracy and perceptual quality;(3)A multi-scale residual network(MSRN)capturing cross-scale texture features(e.g.,forest canopies,mountain contours).Validated on Sentinel-2(10 m)and SPOT-6/7(2.5 m)datasets covering 904 km2 in Motuo County,Xizang,our method outperforms the SRGAN baseline(SR4RS)with Peak Signal-to-Noise Ratio(PSNR)gains of 0.29 dB and Structural Similarity Index(SSIM)improvements of 3.08%on forest imagery.Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity(LPIPS)increases.The method significantly improves noise robustness and edge retention in complex geomorphology,demonstrating 18%faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring.Future work will integrate spectral constraints and lightweight architectures.
基金funded by the Ongoing Research Funding Program-Research Chairs(ORF-RC-2025-2400),King Saud University,Riyadh,Saudi Arabia。
文摘Recent studies indicate that millions of individuals suffer from renal diseases,with renal carcinoma,a type of kidney cancer,emerging as both a chronic illness and a significant cause of mortality.Magnetic Resonance Imaging(MRI)and Computed Tomography(CT)have become essential tools for diagnosing and assessing kidney disorders.However,accurate analysis of thesemedical images is critical for detecting and evaluating tumor severity.This study introduces an integrated hybrid framework that combines three complementary deep learning models for kidney tumor segmentation from MRI images.The proposed framework fuses a customized U-Net and Mask R-CNN using a weighted scheme to achieve semantic and instance-level segmentation.The fused outputs are further refined through edge detection using Stochastic FeatureMapping Neural Networks(SFMNN),while volumetric consistency is ensured through Improved Mini-Batch K-Means(IMBKM)clustering integrated with an Encoder-Decoder Convolutional Neural Network(EDCNN).The outputs of these three stages are combined through a weighted fusion mechanism,with optimal weights determined empirically.Experiments on MRI scans from the TCGA-KIRC dataset demonstrate that the proposed hybrid framework significantly outperforms standalone models,achieving a Dice Score of 92.5%,an IoU of 87.8%,a Precision of 93.1%,a Recall of 90.8%,and a Hausdorff Distance of 2.8 mm.These findings validate that the weighted integration of complementary architectures effectively overcomes key limitations in kidney tumor segmentation,leading to improved diagnostic accuracy and robustness in medical image analysis.
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No.(DGSSR-2025-02-01295).
文摘Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often struggle with low-contrast MRI images,class imbalance,and suboptimal feature extraction.This paper develops a Hybrid DL system that unites MobileNetV2 with adaptive classification methods to boost Alzheimer’s diagnosis by processing MRI scans.Image enhancement is done using Contrast-Limited Adaptive Histogram Equalization(CLAHE)and Enhanced Super-Resolution Generative Adversarial Networks(ESRGAN).A classification robustness enhancement system integrates class weighting techniques and a Matthews Correlation Coefficient(MCC)-based evaluation method into the design.The trained and validated model gives a 98.88%accuracy rate and 0.9614 MCC score.We also performed a 10-fold cross-validation experiment with an average accuracy of 96.52%(±1.51),a loss of 0.1671,and an MCC score of 0.9429 across folds.The proposed framework outperforms the state-of-the-art models with a 98%weighted F1-score while decreasing misdiagnosis results for every AD stage.The model demonstrates apparent separation abilities between AD progression stages according to the results of the confusion matrix analysis.These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis,contributing to improved computer-aided diagnosis(CAD)systems in clinical practice.
文摘AIM:To find the effective contrast enhancement method on retinal images for effective segmentation of retinal features.METHODS:A novel image preprocessing method that used neighbourhood-based improved contrast limited adaptive histogram equalization(NICLAHE)to improve retinal image contrast was suggested to aid in the accurate identification of retinal disorders and improve the visibility of fine retinal structures.Additionally,a minimal-order filter was applied to effectively denoise the images without compromising important retinal structures.The novel NICLAHE algorithm was inspired by the classical CLAHE algorithm,but enhanced it by selecting the clip limits and tile sized in a dynamical manner relative to the pixel values in an image as opposed to using fixed values.It was evaluated on the Drive and high-resolution fundus(HRF)datasets on conventional quality measures.RESULTS:The new proposed preprocessing technique was applied to two retinal image databases,Drive and HRF,with four quality metrics being,root mean square error(RMSE),peak signal to noise ratio(PSNR),root mean square contrast(RMSC),and overall contrast.The technique performed superiorly on both the data sets as compared to the traditional enhancement methods.In order to assess the compatibility of the method with automated diagnosis,a deep learning framework named ResNet was applied in the segmentation of retinal blood vessels.Sensitivity,specificity,precision and accuracy were used to analyse the performance.NICLAHE–enhanced images outperformed the traditional techniques on both the datasets with improved accuracy.CONCLUSION:NICLAHE provides better results than traditional methods with less error and improved contrastrelated values.These enhanced images are subsequently measured by sensitivity,specificity,precision,and accuracy,which yield a better result in both datasets.
基金supported by the National Natural Science Foundation of China(Grant Nos.82272955 and 22203057)the Natural Science Foundation of Fujian Province(Grant No.2021J011361).
文摘The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection.In this study,we integrate Raman imaging technology with an artificial intelligence(AI)generative model,proposing an innovative approach for intraoperative margin status diagnosis.This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images,which are then transformed into hematoxylin-eosin(H&E)-stained histopathological images using an AI generative model for histopathological diagnosis.The generated H&E-stained images clearly illustrate the tissue’s pathological conditions.Independently reviewed by three pathologists,the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%.Notably,it outperforms current clinical practices,especially in TSCC with positive lymph node metastasis or moderately differentiated grades.This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations,promising a versatile diagnostic tool beyond TSCC.
文摘The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles(UAV)has emerged as a prominent research focus.Due to the considerable distance between UAVs and the photographed objects,coupled with complex shooting environments,existing models often struggle to achieve accurate real-time target detection.In this paper,a You Only Look Once v8(YOLOv8)model is modified from four aspects:the detection head,the up-sampling module,the feature extraction module,and the parameter optimization of positive sample screening,and the YOLO-S3DT model is proposed to improve the performance of the model for detecting small targets in aerial images.Experimental results show that all detection indexes of the proposed model are significantly improved without increasing the number of model parameters and with the limited growth of computation.Moreover,this model also has the best performance compared to other detecting models,demonstrating its advancement within this category of tasks.
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
文摘Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-aided diagnosis system is highly required.Ultrasound is an important examination technique for breast cancer diagnosis due to its low cost.Recently,many learning-based techniques have been introduced to classify breast cancer using breast ultrasound imaging dataset(BUSI)datasets;however,the manual handling is not an easy process and time consuming.The authors propose an EfficientNet-integrated ResNet deep network and XAI-based framework for accurately classifying breast cancer(malignant and benign).In the initial step,data augmentation is performed to increase the number of training samples.For this purpose,three-pixel flip mathematical equations are introduced:horizontal,vertical,and 90°.Later,two pretrained deep learning models were employed,skipped some layers,and fine-tuned.Both fine-tuned models are later trained using a deep transfer learning process and extracted features from the deeper layer.Explainable artificial intelligence-based analysed the performance of trained models.After that,a new feature selection technique is proposed based on the cuckoo search algorithm called cuckoo search controlled standard error mean.This technique selects the best features and fuses using a new parallel zeropadding maximum correlated coefficient features.In the end,the selection algorithm is applied again to the fused feature vector and classified using machine learning algorithms.The experimental process of the proposed framework is conducted on a publicly available BUSI and obtained 98.4%and 98%accuracy in two different experiments.Comparing the proposed framework is also conducted with recent techniques and shows improved accuracy.In addition,the proposed framework was executed less than the original deep learning models.
基金supported by the National Natural Science Foundation of China(No.82371933)the National Natural Science Foundation of Shandong Province of China(No.ZR2021MH120)+1 种基金the Taishan Scholars Project(No.tsqn202211378)the Shandong Provincial Natural Science Foundation for Excellent Young Scholars(No.ZR2024YQ075).
文摘Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale whole slide images(WSIs)features to predict the response to breast cancer NAC more finely.Methods:This work collected 1,670 whole slide images for training and validation sets,internal testing sets,external testing sets,and prospective testing sets of the weakly-supervised deep learning-based multi-task model(DLMM)in predicting treatment response and pCR to NAC.Our approach models two-by-two feature interactions across scales by employing concatenate fusion of single-scale feature representations,and controls the expressiveness of each representation via a gating-based attention mechanism.Results:In the retrospective analysis,DLMM exhibited excellent predictive performance for the prediction of treatment response,with area under the receiver operating characteristic curves(AUCs)of 0.869[95%confidence interval(95%CI):0.806−0.933]in the internal testing set and 0.841(95%CI:0.814−0.867)in the external testing sets.For the pCR prediction task,DLMM reached AUCs of 0.865(95%CI:0.763−0.964)in the internal testing and 0.821(95%CI:0.763−0.878)in the pooled external testing set.In the prospective testing study,DLMM also demonstrated favorable predictive performance,with AUCs of 0.829(95%CI:0.754−0.903)and 0.821(95%CI:0.692−0.949)in treatment response and pCR prediction,respectively.DLMM significantly outperformed the baseline models in all testing sets(P<0.05).Heatmaps were employed to interpret the decision-making basis of the model.Furthermore,it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment during biological basis exploration.Conclusions:The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
文摘Kinship verification is a key biometric recognition task that determines biological relationships based on physical features.Traditional methods predominantly use facial recognition,leveraging established techniques and extensive datasets.However,recent research has highlighted ear recognition as a promising alternative,offering advantages in robustness against variations in facial expressions,aging,and occlusions.Despite its potential,a significant challenge in ear-based kinship verification is the lack of large-scale datasets necessary for training deep learning models effectively.To address this challenge,we introduce the EarKinshipVN dataset,a novel and extensive collection of ear images designed specifically for kinship verification.This dataset consists of 4876 high-resolution color images from 157 multiracial families across different regions,forming 73,220 kinship pairs.EarKinshipVN,a diverse and large-scale dataset,advances kinship verification research using ear features.Furthermore,we propose the Mixer Attention Inception(MAI)model,an improved architecture that enhances feature extraction and classification accuracy.The MAI model fuses Inceptionv4 and MLP Mixer,integrating four attention mechanisms to enhance spatial and channel-wise feature representation.Experimental results demonstrate that MAI significantly outperforms traditional backbone architectures.It achieves an accuracy of 98.71%,surpassing Vision Transformer models while reducing computational complexity by up to 95%in parameter usage.These findings suggest that ear-based kinship verification,combined with an optimized deep learning model and a comprehensive dataset,holds significant promise for biometric applications.
基金supported by the National Natural Science Foundation of China(Grant Nos.62071315 and 62271336).
文摘The multi-modal characteristics of mineral particles play a pivotal role in enhancing the classification accuracy,which is critical for obtaining a profound understanding of the Earth's composition and ensuring effective exploitation utilization of its resources.However,the existing methods for classifying mineral particles do not fully utilize these multi-modal features,thereby limiting the classification accuracy.Furthermore,when conventional multi-modal image classification methods are applied to planepolarized and cross-polarized sequence images of mineral particles,they encounter issues such as information loss,misaligned features,and challenges in spatiotemporal feature extraction.To address these challenges,we propose a multi-modal mineral particle polarization image classification network(MMGC-Net)for precise mineral particle classification.Initially,MMGC-Net employs a two-dimensional(2D)backbone network with shared parameters to extract features from two types of polarized images to ensure feature alignment.Subsequently,a cross-polarized intra-modal feature fusion module is designed to refine the spatiotemporal features from the extracted features of the cross-polarized sequence images.Ultimately,the inter-modal feature fusion module integrates the two types of modal features to enhance the classification precision.Quantitative and qualitative experimental results indicate that when compared with the current state-of-the-art multi-modal image classification methods,MMGC-Net demonstrates marked superiority in terms of mineral particle multi-modal feature learning and four classification evaluation metrics.It also demonstrates better stability than the existing models.
文摘The precise identification of quartz minerals is crucial in mineralogy and geology due to their widespread occurrence and industrial significance.Traditional methods of quartz identification in thin sections are labor-intensive and require significant expertise,often complicated by the coexistence of other minerals.This study presents a novel approach leveraging deep learning techniques combined with hyperspectral imaging to automate the identification process of quartz minerals.The utilizied four advanced deep learning models—PSPNet,U-Net,FPN,and LinkNet—has significant advancements in efficiency and accuracy.Among these models,PSPNet exhibited superior performance,achieving the highest intersection over union(IoU)scores and demonstrating exceptional reliability in segmenting quartz minerals,even in complex scenarios.The study involved a comprehensive dataset of 120 thin sections,encompassing 2470 hyperspectral images prepared from 20 rock samples.Expert-reviewed masks were used for model training,ensuring robust segmentation results.This automated approach not only expedites the recognition process but also enhances reliability,providing a valuable tool for geologists and advancing the field of mineralogical analysis.
文摘A novel method is developed by utilizing the fractional frequency based multirange rulers to precisely position the passive inter-modulation(PIM)sources within radio frequency(RF)cables.The proposed method employs a set of fractional frequencies to create multiple measuring rulers with different metric ranges to determine the values of the tens,ones,tenths,and hundredths digits of the distance.Among these rulers,the one with the lowest frequency determines the maximum metric range,while the one with the highest frequency decides the highest achievable accuracy of the position system.For all rulers,the metric accuracy is uniquely determined by the phase accuracy of the detected PIM signals.With the all-phase Fourier transform method,the phases of the PIM signals at all fractional frequencies maintain almost the same accuracy,approximately 1°(about 1/360 wavelength in the positioning accuracy)at the signal-to-noise ratio(SNR)of 10 d B.Numerical simulations verify the effectiveness of the proposed method,improving the positioning accuracy of the cable PIM up to a millimeter level with the highest fractional frequency operating at 200 MHz.