The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study focuses primarily on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, an improvement over the baseline results on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
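The abstract names Equalize as the most helpful augmentation without defining it. The sketch below assumes it refers to standard histogram equalization of an 8-bit grayscale image; the function name `equalize` and the flat-list image layout are illustrative choices, and the augmentation library used in the study may differ in detail.

```python
def equalize(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray levels in [0, levels - 1].

    Assumption: plain global histogram equalization; not taken from the paper.
    """
    if not pixels:
        return []
    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Build the cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return list(pixels)
    # Map the cumulative distribution onto the full output range.
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


if __name__ == "__main__":
    # A low-contrast patch is stretched to span the full 0..255 range.
    print(equalize([50, 50, 51, 52]))
```

Applied to thin-section images, a transform of this kind spreads closely spaced intensities over the whole range, which is one plausible reason it helps a CNN separate textures.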
Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: the median filtering algorithm is used to denoise the original infrared image, and an image segmentation algorithm is then applied to divide the image.
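The two steps named above can be sketched as follows. The abstract does not say which segmentation algorithm is used, so a fixed intensity threshold stands in for it here as an assumption; `median_filter3`, `hot_mask`, and the threshold value are hypothetical names and parameters.

```python
from statistics import median


def median_filter3(img):
    """Apply a 3x3 median filter; border pixels use only the neighbors that exist."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


def hot_mask(img, threshold):
    """Stand-in segmentation: mark pixels at or above the threshold as hot (1)."""
    return [[1 if v >= threshold else 0 for v in row] for row in img]


if __name__ == "__main__":
    # A single noisy pixel is removed before segmentation.
    noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
    print(hot_mask(median_filter3(noisy), threshold=100))
```

Filtering before thresholding keeps isolated sensor spikes from being reported as hot spots, which is the usual motivation for this ordering.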
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which are used to extract low-frequency and high-frequency information from the image. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
In today's digital era, the rapid evolution of image editing technologies has brought about a significant simplification of image manipulation. Unfortunately, this progress has also given rise to the misuse of manipulated images across various domains. One of the pressing challenges stemming from this advancement is the increasing difficulty in discerning between unaltered and manipulated images. This paper offers a comprehensive survey of existing methodologies for detecting image tampering, shedding light on the diverse approaches employed in the field of contemporary image forensics. The methods used to identify image forgery can be broadly classified into two primary categories: classical machine learning techniques, heavily reliant on manually crafted features, and deep learning methods. Additionally, this paper explores recent developments in image forensics, placing particular emphasis on the detection of counterfeit colorization. Image colorization involves predicting colors for grayscale images, thereby enhancing their visual appeal. The advancements in colorization techniques have reached a level where distinguishing between authentic and forged images with the naked eye has become an exceptionally challenging task. This paper serves as an in-depth exploration of the intricacies of image forensics in the modern age, with a specific focus on the detection of colorization forgery, presenting a comprehensive overview of methodologies in this critical field.
Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of the four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. The survey draws on recent advancements, such as Transformer models and cutting-edge technologies like Augmented Reality (AR), to discuss potential research directions and conclude the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
When an image is rendered with the Monte Carlo method, the visual noise differs between areas of different light intensity. However, existing denoising algorithms have limited denoising performance under complex lighting conditions and tend to lose detail. We therefore propose a rendered image denoising method with filtering guided by lighting information. First, we design an image segmentation algorithm based on lighting information to segment the image into different illumination areas. Then, we establish a parameter prediction model guided by lighting information for filtering (PGLF) to predict the filtering parameters of the different illumination areas. For each illumination area, we use these filtering parameters to construct an area filter, and the filters are guided by the lighting information to perform sub-area filtering. Finally, the filtering results are fused with auxiliary features to output denoised images, improving the overall denoising effect. On physically based rendering tool (PBRT) scenes and the Tungsten dataset, the experimental results show that, compared with other guided filtering denoising methods, our method improves the peak signal-to-noise ratio (PSNR) by 4.2164 dB on average and the structural similarity index (SSIM) by 7.8% on average. This shows that our method can better reduce the noise in complex lighting scenes and improve the image quality.
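To make the PSNR figure concrete, the sketch below computes PSNR from the standard definition, 10·log10(peak² / MSE). The function name `psnr` and the flat-list image layout are illustrative, and the default peak value of 255 assumes 8-bit images, which the abstract does not state.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(test):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error between the reference and the test image.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    clean = [100, 120, 140, 160]
    denoised = [101, 119, 142, 158]
    print(f"{psnr(clean, denoised):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of 4.2164 dB on a given image corresponds to an MSE that is lower by a factor of about 2.6 at a fixed peak value.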
In the field of image forensics, image tampering detection is a critical and challenging task. Traditional methods based on manually designed feature extraction typically focus on a specific type of tampering operation, which limits their effectiveness in complex scenarios involving multiple forms of tampering. Although deep learning-based methods offer the advantage of automatic feature learning, current approaches still require further improvements in terms of detection accuracy and computational efficiency. To address these challenges, this study applies the UNet 3+ model to image tampering detection and proposes a hybrid framework, referred to as DDT-Net (Deep Detail Tracking Network), which integrates deep learning with traditional detection techniques. In contrast to traditional additive methods, this approach applies a multiplicative fusion technique during downsampling, effectively combining the deep learning feature maps at each layer with those generated by the Bayar noise stream. This design enables noise residual features to guide the learning of semantic features more precisely and efficiently, thus facilitating comprehensive feature-level interaction. Furthermore, by leveraging the complementary strengths of deep networks in capturing large-scale semantic manipulations and traditional algorithms' proficiency in detecting fine-grained local traces, the method significantly enhances the accuracy and robustness of tampered region detection. Compared with other approaches, the proposed method achieves an F1 score improvement exceeding 30% on the DEFACTO and DIS25k datasets. In addition, it has been extensively validated on other datasets, including CASIA and DIS25k. Experimental results demonstrate that this method achieves outstanding performance across various types of image tampering detection tasks.
Medical institutions frequently utilize cloud servers for storing digital medical imaging data, aiming to lower both storage and computational expenses. Nevertheless, the reliability of cloud servers as third-party providers is not always guaranteed. To safeguard against the exposure and misuse of personal privacy information, and to achieve secure and efficient retrieval, this paper proposes a secure medical image retrieval method based on a multi-attention mechanism and triplet deep hashing (abbreviated as MATDH). Specifically, the method first utilizes a contrast-limited adaptive histogram equalization method applicable to color images to enhance chest X-ray images. Next, a designed multi-attention mechanism focuses on important local features during the feature extraction stage. Moreover, a triplet loss function is utilized to learn discriminative hash codes and construct a compact and efficient triplet deep hashing scheme. Finally, upsampling is used to restore the original resolution of the images during retrieval, thereby enabling more accurate matching. To ensure the security of medical image data, a lightweight image encryption method based on frequency domain encryption is designed to encrypt the chest X-ray images. The experimental findings indicate that, in comparison to various advanced image retrieval techniques, the proposed approach improves the precision of feature extraction and retrieval on the COVIDx dataset. Additionally, it offers enhanced protection for the confidentiality of medical images stored in cloud settings and demonstrates strong practicality.
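The triplet-hashing idea can be illustrated with a minimal sketch. It assumes the common margin-based triplet loss on real-valued codes that are later binarized by sign; the margin value, the distance measure, and all function names are assumptions rather than details taken from MATDH.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two real-valued code vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the positive is closer to the anchor than the negative by `margin`."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def to_hash(code):
    """Binarize a relaxed code by sign to obtain the stored hash bits."""
    return [1 if v >= 0 else 0 for v in code]


def hamming(a, b):
    """Number of differing bits; used to rank images at retrieval time."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    anchor, same_class, other_class = [0.9, -0.8], [0.8, -0.9], [-0.9, 0.7]
    print(triplet_loss(anchor, same_class, other_class))
    print(hamming(to_hash(anchor), to_hash(other_class)))
```

Training pushes codes of the same class together and codes of different classes apart, so that a cheap Hamming-distance comparison over short binary codes is enough at query time.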
Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method that selects the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution. Next, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
In the field of image processing, the analysis of Synthetic Aperture Radar (SAR) images is crucial due to its broad range of applications. However, SAR images are often affected by coherent speckle noise, which significantly degrades image quality. Traditional denoising methods, typically based on filter techniques, often face challenges related to inefficiency and limited adaptability. To address these limitations, this study proposes a novel SAR image denoising algorithm based on an enhanced residual network architecture, with the objective of enhancing the utility of SAR imagery in complex electromagnetic environments. The proposed algorithm integrates residual network modules, which directly process the noisy input images to generate denoised outputs. This approach not only reduces computational complexity but also mitigates the difficulties associated with model training. By combining the Transformer module with the residual block, the algorithm enhances the network's ability to extract global features, offering superior feature extraction capabilities compared to CNN-based residual modules. Additionally, the algorithm employs the adaptive activation function Meta-ACON, which dynamically adjusts the activation patterns of neurons, thereby improving the network's feature extraction efficiency. The effectiveness of the proposed denoising method is empirically validated using real SAR images from the RSOD dataset. The proposed algorithm exhibits remarkable performance in terms of EPI, SSIM, and ENL, while achieving a substantial enhancement in PSNR when compared to traditional and deep learning-based algorithms. The PSNR is improved more than twofold. Moreover, the evaluation on the MSTAR SAR dataset substantiates the algorithm's robustness and applicability in SAR denoising tasks, with a PSNR of 25.2021 being attained. These findings underscore the efficacy of the proposed algorithm in mitigating speckle noise while preserving critical features in SAR imagery, thereby enhancing its quality and usability in practical scenarios.
Image-based similar trademark retrieval is a time-consuming and labor-intensive task in the trademark examination process. This paper aims to support trademark examiners by training Deep Convolutional Neural Network (DCNN) models for effective Trademark Image Retrieval (TIR). To achieve this goal, we first develop a novel labeling method that automatically generates hundreds of thousands of labeled similar and dissimilar trademark image pairs using accompanying data fields such as citation lists, Vienna classification (VC) codes, and trademark ownership information. This approach eliminates the need for manual labeling and provides a large-scale dataset suitable for training deep learning models. We then train DCNN models based on Siamese and Triplet architectures, evaluating various feature extractors to determine the most effective configuration. Furthermore, we present an Adapted Contrastive Loss Function (ACLF) for the trademark retrieval task, specifically engineered to mitigate the influence of noisy labels found in automatically created datasets. Experimental results indicate that our proposed model (Efficient-Net_v21_Siamese) performs best at both True Negative Rate (TNR) threshold levels, TNR 0.9 and TNR 0.95, with respective True Positive Rates (TPRs) of 77.7% and 70.8% and accuracies of 83.9% and 80.4%. Additionally, when tested on the public trademark dataset METU_v2, our model achieves a normalized average rank (NAR) of 0.0169, outperforming the current state-of-the-art (SOTA) model. Based on these findings, we estimate that considering only approximately 10% of the returned trademarks would be sufficient, significantly reducing the review time. The paper therefore highlights the potential of utilizing national trademark data to enhance the accuracy and efficiency of trademark retrieval systems, ultimately supporting trademark examiners in their evaluation tasks.
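The reported TPR values are measured at fixed TNR operating points. One way such a figure can be computed is sketched below: pick the similarity threshold that rejects at least the target fraction of dissimilar pairs, then count how many similar pairs still pass. The function name, the score convention (higher means more similar), and the toy scores are assumptions, not the paper's evaluation code.

```python
import math


def tpr_at_tnr(pos_scores, neg_scores, target_tnr):
    """Return (threshold, achieved TNR, TPR) for similarity scores.

    A pair is predicted "similar" when its score is strictly above the threshold.
    """
    neg_sorted = sorted(neg_scores)
    n = len(neg_sorted)
    # Smallest number of dissimilar pairs that must be rejected to reach the target;
    # the small epsilon guards against floating-point overshoot in the product.
    k = max(0, math.ceil(target_tnr * n - 1e-9))
    threshold = neg_sorted[k - 1] if k > 0 else -math.inf
    tnr = sum(s <= threshold for s in neg_scores) / n
    tpr = sum(s > threshold for s in pos_scores) / len(pos_scores)
    return threshold, tnr, tpr


if __name__ == "__main__":
    similar = [0.85, 0.95, 0.5, 0.99]          # toy scores for similar pairs
    dissimilar = [i / 10 for i in range(10)]   # toy scores for dissimilar pairs
    print(tpr_at_tnr(similar, dissimilar, target_tnr=0.9))
```

Fixing the TNR first makes models comparable at the same false-alarm rate, which matters when an examiner's review time depends on how many dissimilar marks are returned.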
Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Brain tumor segmentation is critical in clinical diagnosis and treatment planning. Existing methods for brain tumor segmentation with missing modalities often struggle when dealing with multiple missing modalities, a common scenario in real-world clinical settings. These methods primarily focus on handling a single missing modality at a time, making them insufficiently robust for the additional complexity encountered with incomplete data containing various missing modality combinations. Additionally, most existing methods rely on single models, which may limit their performance and increase the risk of overfitting the training data. This work proposes a novel method called the ensemble adversarial co-training neural network (EACNet) for accurate brain tumor segmentation from multi-modal magnetic resonance imaging (MRI) scans with multiple missing modalities. The proposed method consists of three key modules: an ensemble of pre-trained models, which captures diverse feature representations from the MRI data; adversarial learning, which leverages a competitive training approach in which a generator model creates realistic missing data while sub-networks acting as discriminators learn to distinguish real data from the generated "fake" data; and a co-training framework, which utilizes the information extracted by the multimodal path (trained on complete scans) to guide the learning process in the path handling missing modalities. Through these co-training interactions, the model can potentially compensate for missing information by exploiting the relationships between available modalities and the tumor segmentation task. EACNet was evaluated on the BraTS2018 and BraTS2020 challenge datasets and achieved state-of-the-art and competitive performance, respectively. Notably, the dice similarity coefficient (DSC) for the whole tumor (WT) reached 89.27%, surpassing the performance of existing methods. The analysis suggests that the ensemble approach offers potential benefits and that the adversarial co-training contributes to the increased robustness and accuracy of EACNet for brain tumor segmentation of MRI scans with missing modalities. The experimental results show that EACNet is promising for this task and is a better candidate for real-world clinical applications.
BACKGROUND Optical coherence tomography (OCT) enables high-resolution, non-invasive visualization of retinal structures. Recent evidence suggests that retinal layer alterations may reflect central nervous system changes associated with psychiatric disorders such as schizophrenia (SZ). AIM To develop an advanced deep learning model to classify OCT images and distinguish patients with SZ from healthy controls using retinal biomarkers. METHODS A novel convolutional neural network, Self-AttentionNeXt, was designed by integrating grouped self-attention mechanisms, residual and inverted bottleneck blocks, and a final 1×1 convolution for feature refinement. The model was trained and tested on both a custom OCT dataset collected from patients with SZ and a publicly available OCT dataset (OCT2017). RESULTS Self-AttentionNeXt achieved 97.0% accuracy on the collected SZ OCT dataset and over 95% accuracy on the public OCT2017 dataset. Gradient-weighted class activation mapping visualizations confirmed the model's attention to clinically relevant retinal regions, suggesting effective feature localization. CONCLUSION Self-AttentionNeXt effectively combines transformer-inspired attention mechanisms with a convolutional neural network architecture to support the early and accurate detection of SZ using OCT images. This approach offers a promising direction for artificial intelligence-assisted psychiatric diagnostics and clinical decision support.
Content-Based Image Retrieval (CBIR) and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare, security, and various other domains. An image retrieval system relies mainly on the efficiency and accuracy of its classification models. This research addresses the challenge of enhancing the image retrieval system by developing a novel approach, the EfficientNet-Convolutional Neural Network (EffNet-CNN). The key objective is to evaluate the proposed EffNet-CNN model's performance in image classification, image mining, and CBIR. The novelty of the proposed model lies in the integration of different techniques and modifications. The model includes the Mahalanobis distance metric for feature matching, which enhances the similarity measurements. It extends the EfficientNet architecture by incorporating additional convolutional layers, batch normalization, dropout, and pooling layers for improved hierarchical feature extraction. The work also includes systematic hyperparameter optimization using SGD, performance evaluation on three datasets, and data normalization to improve feature representations. The EffNet-CNN is assessed using precision, accuracy, F-measure, and recall metrics across the MS-COCO, CIFAR-10, and CIFAR-100 datasets. The model achieved accuracy values ranging from 90.60% to 95.90% for the MS-COCO dataset, 96.8% to 98.3% for the CIFAR-10 dataset, and 92.9% to 98.6% for the CIFAR-100 dataset. A comparison of the EffNet-CNN model's results with other models reveals the proposed model's superior performance. The results highlight the potential of the proposed EffNet-CNN model for image classification and its usefulness in image mining and CBIR.
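The Mahalanobis distance used for feature matching is d(x, y) = sqrt((x − y)ᵀ Σ⁻¹ (x − y)), where Σ is a covariance matrix of the features. The sketch below is a dependency-free illustration of that formula; how the paper estimates Σ is not stated in the abstract, so the covariance matrices here are made-up inputs and the function names are hypothetical.

```python
import math


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, y, cov):
    """Mahalanobis distance between feature vectors x and y under covariance cov."""
    diff = [a - b for a, b in zip(x, y)]
    # Solving cov @ z = diff avoids forming the inverse explicitly.
    z = solve(cov, diff)
    return math.sqrt(sum(d * v for d, v in zip(diff, z)))


if __name__ == "__main__":
    # With an identity covariance the result equals the Euclidean distance.
    print(mahalanobis([0.0, 0.0], [3.0, 4.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Compared with Euclidean distance, this measure down-weights directions in which the features vary a lot, which is the usual argument for using it when feature dimensions have different scales or are correlated.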
Ensuring information security in the quantum era is a growing challenge due to advancements in cryptographic attacks and the emergence of quantum computing. To address these concerns, this paper presents the mathematical and computer modeling of a novel two-dimensional (2D) chaotic system for secure key generation in quantum image encryption (QIE). The proposed map employs trigonometric perturbations in conjunction with rational-saturation functions and is hence named the Trigonometric-Rational-Saturation (TRS) map. Through rigorous mathematical analysis and computational simulations, the map is extensively evaluated for bifurcation behaviour, chaotic trajectories, and Lyapunov exponents. The security evaluation validates the map's non-linearity, unpredictability, and sensitive dependence on initial conditions. In addition, the proposed TRS map has been tested by integrating it into a QIE scheme. The scheme first quantum-encodes the classical image using the Novel Enhanced Quantum Representation (NEQR) technique; the TRS map is then used to generate a secure diffusion key, which is XOR-ed with the quantum-ready image to obtain the encrypted image. The security evaluation of the QIE scheme demonstrates superior security of the encrypted images against statistical attacks and differential attacks. The encrypted images exhibit zero correlation and maximum entropy, and demonstrate strong resilience, with a Number of Pixels Change Rate (NPCR) of 99.62% and a Unified Average Changing Intensity (UACI) of 33.47%. The results validate the effectiveness of the TRS-based quantum encryption scheme in securing digital images against emerging quantum threats, making it suitable for secure image encryption in IoT and edge-based applications.
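NPCR and UACI compare two cipher images whose plain images differ in a single pixel. The sketch below follows the standard definitions: NPCR is the percentage of pixel positions that differ, and UACI is the mean absolute intensity difference as a percentage of the peak value. The function name and the 8-bit peak of 255 are assumptions; the paper's own evaluation code is not shown in the abstract.

```python
def npcr_uaci(cipher_a, cipher_b, peak=255):
    """Return (NPCR %, UACI %) for two equally sized flat lists of pixel values."""
    if not cipher_a or len(cipher_a) != len(cipher_b):
        raise ValueError("cipher images must be non-empty and the same size")
    n = len(cipher_a)
    # NPCR: share of positions where the two cipher images differ.
    changed = sum(a != b for a, b in zip(cipher_a, cipher_b))
    # UACI: average absolute difference, normalized by the peak intensity.
    intensity = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * changed / n, 100.0 * intensity / (peak * n)


if __name__ == "__main__":
    c1 = [12, 200, 37, 90]
    c2 = [12, 15, 240, 91]
    print(npcr_uaci(c1, c2))
```

For two independent, uniformly random 8-bit images the expected values are about 99.61% for NPCR and 33.46% for UACI, so reported figures of 99.62% and 33.47% sit close to that ideal.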
Osteosarcomas are malignant neoplasms derived from undifferentiated osteogenic mesenchymal cells. They cause severe and permanent damage to human tissue and have a high mortality rate. The condition can occur in any bone; however, it often affects the long bones of the arms and legs. Prompt identification and intervention are essential for extending patient survival. However, the intricate composition and erratic placement of osteosarcoma make it difficult for clinicians to accurately determine the extent of the affected area. There is a pressing need for an algorithm that can automatically detect bone tumors with high accuracy. Therefore, in this study, we propose a novel feature extractor framework combined with a supervised three-class XGBoost algorithm for the detection of osteosarcoma in whole slide histopathology images. This method allows for quicker and more effective data analysis. The first step involves preprocessing the imbalanced histopathology dataset, followed by augmentation and balancing using two techniques: SMOTE and ADASYN. Next, a unique feature extraction framework is used to extract features, which are then input into the supervised three-class XGBoost algorithm for classification into three categories: non-tumor, viable tumor, and non-viable tumor. The experimental findings indicate that the proposed model exhibits superior efficiency, accuracy, and a more lightweight design in comparison to other current models for osteosarcoma detection.
Agromyzid leafminers cause significant economic losses in both vegetable and horticultural crops, and precise assessments of pesticide needs must be based on the extent of leaf damage. Traditionally, surveyors estimate the damage by visually comparing the proportion of damaged to intact leaf area, a method that lacks objectivity, precision, and reliable data traceability. To address these issues, an advanced survey system that combines augmented reality (AR) glasses with a camera and an artificial intelligence (AI) algorithm was developed in this study to objectively and accurately assess leafminer damage in the field. By wearing AR glasses equipped with a voice-controlled camera, surveyors can easily flatten damaged leaves by hand and capture images for analysis. This method can provide a precise and reliable diagnosis of leafminer damage levels, which in turn supports the implementation of scientifically grounded and targeted pest management strategies. To calculate the leafminer damage level, the DeepLab-Leafminer model was proposed to precisely segment the leafminer-damaged regions and the intact leaf region. The integration of an edge-aware module and a Canny loss function into the DeepLabv3+ model enhanced the DeepLab-Leafminer model's capability to accurately segment the edges of leafminer-damaged regions, which often exhibit irregular shapes. Compared with state-of-the-art segmentation models, the DeepLab-Leafminer model achieved superior segmentation performance, with an Intersection over Union (IoU) of 81.23% and an F1 score of 87.92% on leafminer-damaged leaves. The test results revealed a 92.38% diagnosis accuracy of leafminer damage levels based on the DeepLab-Leafminer model. A mobile application and a web platform were developed to assist surveyors in displaying the diagnostic results of leafminer damage levels. This system provides surveyors with an advanced, user-friendly, and accurate tool for assessing agromyzid leafminer damage in agricultural fields using wearable AR glasses and an AI model. This method can also be utilized to automatically diagnose pest and disease damage levels in other crops based on leaf images.
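The two segmentation metrics quoted in the leafminer abstract can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1 and uses the standard pixel-wise definitions; the function name `iou_and_f1` and the toy masks are hypothetical, and the paper's own evaluation protocol (for example, how scores are averaged over images) is not stated in the abstract.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # damaged pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted damaged, actually intact
            elif t and not p:
                fn += 1          # missed damaged pixel
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy masks, purely illustrative (1 = leafminer-damaged pixel).
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
iou, f1 = iou_and_f1(pred_mask, true_mask)
print(f"IoU={iou:.4f}  F1={f1:.4f}")
```

For a single pair of masks the F1 score is never lower than the IoU, which matches the ordering of the two figures reported in the abstract.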
In digital signal processing, image enhancement and image denoising are challenging tasks because pixel quality must be preserved. Several approaches, from conventional to deep learning, are used to address these problems, but they still face challenges such as computational requirements, overfitting, and poor generalization. To address such issues, optimization algorithms provide greater control and transparency in designing digital filters for image enhancement and denoising. Therefore, this paper presents a novel denoising approach for medical applications using an Optimized Learning-based Multi-level discrete Wavelet Cascaded Convolutional Neural Network (OLMWCNN). In this approach, the optimal filter parameters are identified to preserve the image quality after denoising. The performance and efficiency of the OLMWCNN filter are evaluated, demonstrating significant progress in denoising medical images while overcoming the limitations of conventional methods.
With the fast development of multimedia social platforms, content dissemination on social media platforms is becoming more popular. Social image sharing can also raise privacy concerns. Image encryption can protect social images. However, most existing image protection methods cannot be applied to multimedia social platforms because they encrypt in the spatial domain. In this work, the authors propose a secure social image-sharing method with watermarking/fingerprinting and encryption. First, the fingerprint code with a hierarchical community structure is designed based on social network analysis. Then, the discrete wavelet transform (DWT) is computed directly from the block discrete cosine transform (DCT). After that, all codeword segments are embedded into the LL, LH, and HL subbands, respectively. The selected subbands are confused based on the Game of Life (GoL), and then all subbands are diffused with singular value decomposition (SVD). Experimental results and security analysis demonstrate the security, invisibility, and robustness of the method. Further, the superiority of the technique is elaborated through comparison with some related image security algorithms. The solution not only performs the fast transformation from block DCT to one-level DWT but also protects users' privacy in multimedia social platforms. With the proposed method, secure sharing of JPEG images in multimedia social platforms can be ensured.
Abstract: The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, all improvements over the baseline results on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
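As an illustration of the kind of transform the rock-classification abstract calls Equalize, here is a minimal sketch of histogram equalization on a grayscale image. It assumes that "Equalize" means standard histogram equalization (the abstract does not name the library or the exact variant used), and the function name `equalize` and the toy image are hypothetical.

```python
def equalize(image, levels=256):
    """Histogram-equalize a grayscale image (nested lists of ints in [0, levels))."""
    # Count how often each gray level occurs.
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Map each level so the output histogram is spread over the full range.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]


# Low-contrast toy image (values crowded between 100 and 103).
thin_section = [[100, 100, 101],
                [101, 102, 103]]
equalized = equalize(thin_section)
print(equalized)
```

In an augmentation pipeline, a transform like this would typically be applied to a random subset of training images (per channel or on a luminance channel for color images); that choice is an assumption here, not something the abstract specifies.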
Funding: Supported by the National Key Research and Development Project of China (No. 2023YFB3709605), the National Natural Science Foundation of China (No. 62073193), and the National College Student Innovation Training Program (No. 202310422122).
Abstract: Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing is proposed: the median filtering algorithm is used to denoise the original infrared image, and an image segmentation algorithm is then applied to divide the image.
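The median-filtering step mentioned in the infrared abstract can be sketched as follows. This is a generic sliding-window median filter, not the authors' implementation; the window radius, the border handling (windows are clipped at the image edge), and the name `median_filter` are assumptions.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood.

    Windows are clipped at the image border, so edge pixels use fewer samples.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low keeps the result an actual pixel value (no averaging).
            result[y][x] = median_low(window)
    return result


# Toy "infrared" frame: uniform background with one hot-pixel outlier.
frame = [[20] * 5 for _ in range(5)]
frame[2][2] = 255
filtered = median_filter(frame)
print(filtered[2][2])
```

Because an isolated outlier never reaches the middle of the sorted window, impulse noise is removed while edges are largely preserved, which is the usual reason for choosing a median filter before segmentation.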
Funding: Supported by the Henan Province Key Research and Development Project (231111211300), the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005), the Henan Province Key Research and Development Project (231111110100), the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006), and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which are used to extract low-frequency and high-frequency information from the image. This extraction may lead to some information not being captured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2021R1I1A3049788).
Abstract: In today's digital era, the rapid evolution of image editing technologies has brought about a significant simplification of image manipulation. Unfortunately, this progress has also given rise to the misuse of manipulated images across various domains. One of the pressing challenges stemming from this advancement is the increasing difficulty in discerning between unaltered and manipulated images. This paper offers a comprehensive survey of existing methodologies for detecting image tampering, shedding light on the diverse approaches employed in the field of contemporary image forensics. The methods used to identify image forgery can be broadly classified into two primary categories: classical machine learning techniques, heavily reliant on manually crafted features, and deep learning methods. Additionally, this paper explores recent developments in image forensics, placing particular emphasis on the detection of counterfeit colorization. Image colorization involves predicting colors for grayscale images, thereby enhancing their visual appeal. The advancements in colorization techniques have reached a level where distinguishing between authentic and forged images with the naked eye has become an exceptionally challenging task. This paper serves as an in-depth exploration of the intricacies of image forensics in the modern age, with a specific focus on the detection of colorization forgery, presenting a comprehensive overview of methodologies in this critical field.
Abstract: Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid-based methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of these four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. The survey considers recent advancements, such as Transformer models, and cutting-edge technologies, such as Augmented Reality (AR), when discussing potential research directions, and then concludes the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
Funding: Supported by the National Natural Science (No. U19A2063), the Jilin Provincial Development Program of Science and Technology (No. 20230201080GX), and the Jilin Province Education Department Scientific Research Project (No. JJKH20230851KJ).
Abstract: When an image is rendered by the Monte Carlo method, the visual noise differs from one light-intensity area to another. However, existing denoising algorithms have limited denoising performance under complex lighting conditions and tend to lose detailed information. We therefore propose a rendered image denoising method with filtering guided by lighting information. First, we design an image segmentation algorithm based on lighting information to segment the image into different illumination areas. Then, we establish the parameter prediction model guided by lighting information for filtering (PGLF) to predict the filtering parameters of different illumination areas. For different illumination areas, we use these filtering parameters to construct area filters, and the filters are guided by the lighting information to perform sub-area filtering. Finally, the filtering results are fused with auxiliary features to output denoised images, improving the overall denoising effect. On the physically based rendering tool (PBRT) scenes and the Tungsten dataset, the experimental results show that, compared with other guided filtering denoising methods, our method improves the peak signal-to-noise ratio (PSNR) by 4.2164 dB on average and the structural similarity index (SSIM) by 7.8% on average. This shows that our method can better reduce the noise in complex lighting scenes and improve the image quality.
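For readers unfamiliar with the PSNR figure quoted in the rendering abstract, here is a minimal sketch of the standard definition, PSNR = 10·log10(MAX² / MSE). It assumes 8-bit images (peak value 255) stored as nested lists; the function name `psnr` and the toy images are hypothetical.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel is off by exactly one gray level, so MSE = 1.
clean = [[10, 20], [30, 40]]
noisy = [[11, 19], [31, 39]]
score = psnr(clean, noisy)
print(f"{score:.4f} dB")
```

Since PSNR is logarithmic, an average gain of about 4.2 dB corresponds to the mean squared error dropping to roughly 38% of its previous value (10^(-0.42) ≈ 0.38).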
Funding: Supported by the National Natural Science Foundation of China (No. 61502274).
Abstract: In the field of image forensics, image tampering detection is a critical and challenging task. Traditional methods based on manually designed feature extraction typically focus on a specific type of tampering operation, which limits their effectiveness in complex scenarios involving multiple forms of tampering. Although deep learning-based methods offer the advantage of automatic feature learning, current approaches still require further improvements in terms of detection accuracy and computational efficiency. To address these challenges, this study applies the UNet 3+ model to image tampering detection and proposes a hybrid framework, referred to as DDT-Net (Deep Detail Tracking Network), which integrates deep learning with traditional detection techniques. In contrast to traditional additive methods, this approach applies a multiplicative fusion technique during downsampling, effectively combining the deep learning feature maps at each layer with those generated by the Bayar noise stream. This design enables noise residual features to guide the learning of semantic features more precisely and efficiently, thus facilitating comprehensive feature-level interaction. Furthermore, by leveraging the complementary strengths of deep networks in capturing large-scale semantic manipulations and traditional algorithms' proficiency in detecting fine-grained local traces, the method significantly enhances the accuracy and robustness of tampered region detection. Compared with other approaches, the proposed method achieves an F1 score improvement exceeding 30% on the DEFACTO and DIS25k datasets. In addition, it has been extensively validated on other datasets, including CASIA and DIS25k. Experimental results demonstrate that this method achieves outstanding performance across various types of image tampering detection tasks.
Funding: Supported by the National Natural Science Foundation of China (No. 61862041).
Abstract: Medical institutions frequently utilize cloud servers for storing digital medical imaging data, aiming to lower both storage and computational expenses. Nevertheless, the reliability of cloud servers as third-party providers is not always guaranteed. To safeguard against the exposure and misuse of personal privacy information, and to achieve secure and efficient retrieval, a secure medical image retrieval method based on a multi-attention mechanism and triplet deep hashing (abbreviated as MATDH) is proposed in this paper. Specifically, this method first utilizes the contrast-limited adaptive histogram equalization method applicable to color images to enhance chest X-ray images. Next, a designed multi-attention mechanism focuses on important local features during the feature extraction stage. Moreover, a triplet loss function is utilized to learn discriminative hash codes to construct a compact and efficient triplet deep hashing. Finally, upsampling is used to restore the original resolution of the images during retrieval, thereby enabling more accurate matching. To ensure the security of medical image data, a lightweight image encryption method based on frequency domain encryption is designed to encrypt the chest X-ray images. The experimental findings indicate that, in comparison to various advanced image retrieval techniques, the proposed approach improves the precision of feature extraction and retrieval on the COVIDx dataset. Additionally, it offers enhanced protection for the confidentiality of medical images stored in cloud settings and demonstrates strong practicality.
Funding: Funded by the Deanship of Scientific Research at King Khalid University through a large group Research Project under grant number RGP2/421/45; supported via funding from Prince Sattam bin Abdulaziz University, project number (PSAU/2024/R/1446); supported by the Researchers Supporting Project Number (UM-DSR-IG-2023-07), Almaarefa University, Riyadh, Saudi Arabia; and supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2021R1F1A1055408).
Abstract: Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution. Next, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Abstract: In the field of image processing, the analysis of Synthetic Aperture Radar (SAR) images is crucial due to its broad range of applications. However, SAR images are often affected by coherent speckle noise, which significantly degrades image quality. Traditional denoising methods, typically based on filter techniques, often face challenges related to inefficiency and limited adaptability. To address these limitations, this study proposes a novel SAR image denoising algorithm based on an enhanced residual network architecture, with the objective of enhancing the utility of SAR imagery in complex electromagnetic environments. The proposed algorithm integrates residual network modules, which directly process the noisy input images to generate denoised outputs. This approach not only reduces computational complexity but also mitigates the difficulties associated with model training. By combining the Transformer module with the residual block, the algorithm enhances the network's ability to extract global features, offering superior feature extraction capabilities compared to CNN-based residual modules. Additionally, the algorithm employs the adaptive activation function Meta-ACON, which dynamically adjusts the activation patterns of neurons, thereby improving the network's feature extraction efficiency. The effectiveness of the proposed denoising method is empirically validated using real SAR images from the RSOD dataset. The proposed algorithm exhibits remarkable performance in terms of EPI, SSIM, and ENL, while achieving a substantial enhancement in PSNR when compared to traditional and deep learning-based algorithms. The PSNR performance is enhanced by over twofold. Moreover, the evaluation on the MSTAR SAR dataset substantiates the algorithm's robustness and applicability in SAR denoising tasks, with a PSNR of 25.2021 being attained. These findings underscore the efficacy of the proposed algorithm in mitigating speckle noise while preserving critical features in SAR imagery, thereby enhancing its quality and usability in practical scenarios.
Funding: Funded by the Institute of Information Technology, Vietnam Academy of Science and Technology (project number CSCL02.02/22-23), "Research and Development of Methods for Searching Similar Trademark Images Using Machine Learning to Support Trademark Examination in Vietnam".
Abstract: Image-based similar trademark retrieval is a time-consuming and labor-intensive task in the trademark examination process. This paper aims to support trademark examiners by training Deep Convolutional Neural Network (DCNN) models for effective Trademark Image Retrieval (TIR). To achieve this goal, we first develop a novel labeling method that automatically generates hundreds of thousands of labeled similar and dissimilar trademark image pairs using accompanying data fields such as citation lists, Vienna classification (VC) codes, and trademark ownership information. This approach eliminates the need for manual labeling and provides a large-scale dataset suitable for training deep learning models. We then train DCNN models based on Siamese and Triplet architectures, evaluating various feature extractors to determine the most effective configuration. Furthermore, we present an Adapted Contrastive Loss Function (ACLF) for the trademark retrieval task, specifically engineered to mitigate the influence of noisy labels found in automatically created datasets. Experimental results indicate that our proposed model (Efficient-Net_v21_Siamese) performs best at both True Negative Rate (TNR) threshold levels, TNR = 0.9 and TNR = 0.95, with respective True Positive Rates (TPRs) of 77.7% and 70.8% and accuracies of 83.9% and 80.4%. Additionally, when testing on the public trademark dataset METU_v2, our model achieves a normalized average rank (NAR) of 0.0169, outperforming the current state-of-the-art (SOTA) model. Based on these findings, we estimate that considering only approximately 10% of the returned trademarks would be sufficient, significantly reducing the review time. Therefore, the paper highlights the potential of utilizing national trademark data to enhance the accuracy and efficiency of trademark retrieval systems, ultimately supporting trademark examiners in their evaluation tasks.
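The "TPR at a fixed TNR" figures in the trademark abstract can be read as follows: choose the similarity threshold at which the required share of dissimilar pairs is rejected, then report how many similar pairs are still accepted. The sketch below illustrates that reading under stated assumptions; the function name `tpr_at_tnr`, the convention that a pair is called similar when its score is at or above the threshold, and the toy scores are all hypothetical and not taken from the paper.

```python
def tpr_at_tnr(pos_scores, neg_scores, target_tnr=0.9):
    """Return (threshold, TPR) for the lowest threshold whose TNR >= target_tnr.

    A pair is predicted "similar" when its score is >= threshold.
    """
    candidates = sorted(set(pos_scores) | set(neg_scores))
    candidates.append(float("inf"))  # guarantees that TNR = 1.0 is reachable
    for threshold in candidates:
        tnr = sum(s < threshold for s in neg_scores) / len(neg_scores)
        if tnr >= target_tnr:
            tpr = sum(s >= threshold for s in pos_scores) / len(pos_scores)
            return threshold, tpr
    raise ValueError("unreachable: the infinite threshold always satisfies the target")


# Toy similarity scores (higher = more similar); purely illustrative.
similar_pairs = [0.5, 0.85, 0.92, 0.97]
dissimilar_pairs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
threshold, tpr = tpr_at_tnr(similar_pairs, dissimilar_pairs, target_tnr=0.9)
print(threshold, tpr)
```

Scanning thresholds in ascending order returns the most permissive operating point that still meets the TNR target, which is why tightening the target from 0.9 to 0.95 can only lower the reported TPR.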
Funding: Supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
Abstract: Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
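The cross-attention mechanism named in the multi-modal abstract can be illustrated with a bare scaled dot-product sketch, in which tokens from one modality (queries) attend over tokens from another (keys and values). This is the generic textbook operation without learned projection matrices or multiple heads; the function names and the toy "MRI" and "PET" feature vectors are hypothetical and do not reproduce the paper's architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query mixes the values of the other modality."""
    dim = len(keys[0])
    outputs, weights = [], []
    for query in queries:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        attn = softmax(scores)
        weights.append(attn)
        outputs.append([
            sum(a * value[j] for a, value in zip(attn, values))
            for j in range(len(values[0]))
        ])
    return outputs, weights


# Toy tokens: one query from modality A attends over two tokens from modality B.
mri_tokens = [[1.0, 0.0]]
pet_keys = [[1.0, 0.0], [1.0, 0.0]]     # identical keys -> equal attention
pet_values = [[0.0, 2.0], [4.0, 6.0]]
fused, attention = cross_attention(mri_tokens, pet_keys, pet_values)
print(fused, attention)
```

The attention matrix has one row per query and one column per key, so its memory cost grows with the product of the two token counts; this is the cost that the linear and sparse attention variants mentioned in the abstract aim to reduce.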
Funding: Supported by the Gansu Natural Science Foundation Programme (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and Gansu Provincial Education, Science and Technology Innovation and Industry (No. 2021CYZC-04).
Abstract: Brain tumor segmentation is critical in clinical diagnosis and treatment planning. Existing methods for brain tumor segmentation with missing modalities often struggle when dealing with multiple missing modalities, a common scenario in real-world clinical settings. These methods primarily focus on handling a single missing modality at a time, making them insufficiently robust for the additional complexity encountered with incomplete data containing various missing modality combinations. Additionally, most existing methods rely on single models, which may limit their performance and increase the risk of overfitting the training data. This work proposes a novel method called the ensemble adversarial co-training neural network (EACNet) for accurate brain tumor segmentation from multi-modal magnetic resonance imaging (MRI) scans with multiple missing modalities. The proposed method consists of three key modules: an ensemble of pre-trained models, which captures diverse feature representations from the MRI data; adversarial learning, a competitive training approach in which a generator model creates realistic missing data while sub-networks acting as discriminators learn to distinguish real data from the generated "fake" data; and a co-training framework, which utilizes the information extracted by the multimodal path (trained on complete scans) to guide the learning process in the path handling missing modalities. The model potentially compensates for missing information through co-training interactions by exploiting the relationships between available modalities and the tumor segmentation task. EACNet was evaluated on the BraTS2018 and BraTS2020 challenge datasets and achieved state-of-the-art and competitive performance, respectively. Notably, the segmentation results for the whole tumor (WT) dice similarity coefficient (DSC) reached 89.27%, surpassing the performance of existing methods. The analysis suggests that the ensemble approach offers potential benefits, and the adversarial co-training contributes to the increased robustness and accuracy of EACNet for brain tumor segmentation of MRI scans with missing modalities. The experimental results show that EACNet has promising results for the task of brain tumor segmentation of MRI scans with missing modalities and is a better candidate for real-world clinical applications.
Abstract: BACKGROUND: Optical coherence tomography (OCT) enables high-resolution, non-invasive visualization of retinal structures. Recent evidence suggests that retinal layer alterations may reflect central nervous system changes associated with psychiatric disorders such as schizophrenia (SZ). AIM: To develop an advanced deep learning model to classify OCT images and distinguish patients with SZ from healthy controls using retinal biomarkers. METHODS: A novel convolutional neural network, Self-AttentionNeXt, was designed by integrating grouped self-attention mechanisms, residual and inverted bottleneck blocks, and a final 1×1 convolution for feature refinement. The model was trained and tested on both a custom OCT dataset collected from patients with SZ and a publicly available OCT dataset (OCT2017). RESULTS: Self-AttentionNeXt achieved 97.0% accuracy on the collected SZ OCT dataset and over 95% accuracy on the public OCT2017 dataset. Gradient-weighted class activation mapping visualizations confirmed the model's attention to clinically relevant retinal regions, suggesting effective feature localization. CONCLUSION: Self-AttentionNeXt effectively combines transformer-inspired attention mechanisms with a convolutional neural network architecture to support the early and accurate detection of SZ using OCT images. This approach offers a promising direction for artificial intelligence-assisted psychiatric diagnostics and clinical decision support.
Funding: The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University, Kingdom of Saudi Arabia, for funding this work through the Small Research Group Project under Grant Number RGP.1/316/45.
Abstract: Content-Based Image Retrieval (CBIR) and image mining are becoming more important study fields in computer vision due to their wide range of applications in healthcare, security, and various other domains. An image retrieval system mainly relies on the efficiency and accuracy of its classification models. This research addresses the challenge of enhancing the image retrieval system by developing a novel approach, EfficientNet-Convolutional Neural Network (EffNet-CNN). The key objective of this research is to evaluate the proposed EffNet-CNN model's performance in image classification, image mining, and CBIR. The novelty of the proposed EffNet-CNN model lies in the integration of different techniques and modifications. The model includes the Mahalanobis distance metric for feature matching, which enhances the similarity measurements. The model extends the EfficientNet architecture by incorporating additional convolutional layers, batch normalization, dropout, and pooling layers for improved hierarchical feature extraction. The work also includes systematic hyperparameter optimization using SGD, performance evaluation with three datasets, and data normalization for improving feature representations. The EffNet-CNN is assessed using precision, accuracy, F-measure, and recall metrics across the MS-COCO, CIFAR-10, and CIFAR-100 datasets. The model achieved accuracy values ranging from 90.60% to 95.90% for the MS-COCO dataset, 96.8% to 98.3% for the CIFAR-10 dataset, and 92.9% to 98.6% for the CIFAR-100 dataset. A comparison of the EffNet-CNN model's results with other models reveals the proposed model's superior performance. The results highlight the potential of the proposed EffNet-CNN model for image classification and its usefulness in image mining and CBIR.
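The Mahalanobis distance used for feature matching in the CBIR abstract is d(u, v) = sqrt((u − v)ᵀ S⁻¹ (u − v)), where S is the covariance of the feature vectors. The sketch below is a generic illustration that solves the linear system instead of inverting S explicitly; the helper names, the toy feature vectors, and the way the covariance is estimated are assumptions and not details from the paper.

```python
import math


def covariance(samples):
    """Sample covariance matrix (n-1 denominator) of a list of feature vectors."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(dim)]
    return [
        [sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
         for j in range(dim)]
        for i in range(dim)
    ]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(u, v, cov):
    """Mahalanobis distance between feature vectors u and v under covariance cov."""
    diff = [a - b for a, b in zip(u, v)]
    weighted = solve(cov, diff)  # equals cov^{-1} @ diff
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


# Toy database features: the first dimension varies much more than the second.
database = [[0.0, 0.0], [4.0, 0.0], [0.0, 1.0], [4.0, 1.0]]
cov = covariance(database)
print(mahalanobis([0.0, 0.0], [4.0, 0.0], cov), mahalanobis([0.0, 0.0], [0.0, 1.0], cov))
```

Unlike the Euclidean distance, this metric rescales each direction by its variance, so a large difference along a naturally high-variance feature counts for no more than a small difference along a tightly clustered one.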
Funding: Funded by the Deanship of Research and Graduate Studies at King Khalid University. The authors extend their appreciation to the Deanship for funding this work through the Large Group Project under grant number (RGP.2/556/45).
Abstract: Ensuring information security in the quantum era is a growing challenge due to advancements in cryptographic attacks and the emergence of quantum computing. To address these concerns, this paper presents the mathematical and computer modeling of a novel two-dimensional (2D) chaotic system for secure key generation in quantum image encryption (QIE). The proposed map employs trigonometric perturbations in conjunction with rational-saturation functions and is hence named the Trigonometric-Rational-Saturation (TRS) map. Through rigorous mathematical analysis and computational simulations, the map is extensively evaluated for bifurcation behaviour, chaotic trajectories, and Lyapunov exponents. The security evaluation validates the map's non-linearity, unpredictability, and sensitive dependence on initial conditions. In addition, the proposed TRS map has been further tested by integrating it into a QIE scheme. The QIE scheme first quantum-encodes the classical image using the Novel Enhanced Quantum Representation (NEQR) technique; the TRS map is then used to generate a secure diffusion key, which is XOR-ed with the quantum-ready image to obtain the encrypted image. The security evaluation of the QIE scheme demonstrates superior security of the encrypted images against statistical attacks and against differential attacks. The encrypted images exhibit zero correlation and maximum entropy, and demonstrate strong resilience, with values of 99.62% and 33.47% for the Number of Pixels Change Rate (NPCR) and Unified Average Changing Intensity (UACI), respectively. The results validate the effectiveness of the TRS-based quantum encryption scheme in securing digital images against emerging quantum threats, making it suitable for secure image encryption in IoT and edge-based applications.
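To make the XOR diffusion step and the NPCR/UACI metrics in the encryption abstract concrete, here is a minimal classical sketch. The TRS map is not defined in the abstract, so the keystream below uses the well-known logistic map purely as a stand-in; `logistic_keystream`, its seed values, and the toy pixel data are hypothetical, and nothing here models the NEQR quantum encoding.

```python
def logistic_keystream(length, x0=0.3141, r=3.99):
    """Byte keystream from the logistic map (a stand-in, NOT the paper's TRS map)."""
    x = x0
    stream = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return stream


def xor_diffuse(pixels, key):
    """XOR each 8-bit pixel with the matching key byte (the operation is its own inverse)."""
    return [p ^ k for p, k in zip(pixels, key)]


def npcr_uaci(cipher_a, cipher_b):
    """NPCR and UACI (both in percent) between two 8-bit cipher images."""
    n = len(cipher_a)
    changed = sum(a != b for a, b in zip(cipher_a, cipher_b))
    intensity = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * changed / n, 100.0 * intensity / (255 * n)


# Toy flattened image and round trip through the XOR step.
plain = [12, 200, 34, 90, 255, 0, 17, 128]
key = logistic_keystream(len(plain))
cipher = xor_diffuse(plain, key)
recovered = xor_diffuse(cipher, key)
print(cipher, recovered == plain)
print(npcr_uaci([0, 0], [255, 0]))
```

For two statistically independent, uniformly random 8-bit images, the expected NPCR is about 99.61% and the expected UACI about 33.46%, which is why the values reported in the abstract are read as evidence of resistance to differential attacks. A plain XOR with a fixed keystream, as in this sketch, would not reach those values on its own for a one-pixel change in the plaintext; the sketch only shows how the quantities are computed.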
Abstract: Osteosarcomas are malignant neoplasms derived from undifferentiated osteogenic mesenchymal cells. They cause severe and permanent damage to human tissue and have a high mortality rate. The condition can occur in any bone; however, it most often affects long bones such as those of the arms and legs. Prompt identification and intervention are essential for improving patient survival. However, the intricate composition and erratic placement of osteosarcoma make it difficult for clinicians to determine the extent of the affected area accurately. There is a pressing need for an algorithm that can automatically detect bone tumors with high accuracy. Therefore, in this study, we propose a novel feature extraction framework coupled with a supervised three-class XGBoost algorithm for the detection of osteosarcoma in whole slide histopathology images. This method allows for quicker and more effective data analysis. The first step involves preprocessing the imbalanced histopathology dataset, followed by augmentation and balancing using two techniques: SMOTE and ADASYN. Next, the feature extraction framework extracts features, which are then fed into the supervised three-class XGBoost algorithm for classification into three categories: non-tumor, viable tumor, and non-viable tumor. The experimental findings indicate that the proposed model is more efficient, more accurate, and more lightweight than other current models for osteosarcoma detection.
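The balancing step mentioned in this abstract can be illustrated with a minimal sketch of the interpolation idea behind SMOTE: each synthetic minority sample is placed on the line segment between a real minority sample and one of its nearest minority neighbours. This is an illustrative approximation, not the authors' pipeline; the function name `smote_like_oversample`, the tuple-based feature vectors, and the parameters `k` and `seed` are assumptions. ADASYN follows the same interpolation but generates more samples near minority points that are harder to learn, which is not shown here.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (feature vectors).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    synthetic = []
    for _ in range(n_new):
        # Pick a random base sample from the minority class.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical 2-D feature vectors for an under-represented class.
minority_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like_oversample(minority_features, n_new=5)
```

Because every synthetic point is a convex combination of two real samples, the new points stay inside the region already occupied by the minority class. In practice a maintained implementation would normally be used instead of a hand-written one.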
Funding: Supported by the National Key R&D Program of China (2021YFC2600400 and 2023YFC2605200), the National Key Research Program of China (2021YFD1401100), and the "San Nong Jiu Fang" Sciences and Technologies Cooperation Project of Zhejiang Province, China (2024SNJF010).
Abstract: Agromyzid leafminers cause significant economic losses in both vegetable and horticultural crops, and precise assessments of pesticide needs must be based on the extent of leaf damage. Traditionally, surveyors estimate the damage by visually comparing the proportion of damaged to intact leaf area, a method that lacks objectivity, precision, and reliable data traceability. To address these issues, an advanced survey system that combines augmented reality (AR) glasses with a camera and an artificial intelligence (AI) algorithm was developed in this study to objectively and accurately assess leafminer damage in the field. By wearing AR glasses equipped with a voice-controlled camera, surveyors can easily flatten damaged leaves by hand and capture images for analysis. This method can provide a precise and reliable diagnosis of leafminer damage levels, which in turn supports the implementation of scientifically grounded and targeted pest management strategies. To calculate the leafminer damage level, the DeepLab-Leafminer model was proposed to precisely segment the leafminer-damaged regions and the intact leaf region. The integration of an edge-aware module and a Canny loss function into the DeepLabv3+ model enhanced the DeepLab-Leafminer model's capability to accurately segment the edges of leafminer-damaged regions, which often exhibit irregular shapes. Compared with state-of-the-art segmentation models, the DeepLab-Leafminer model achieved superior segmentation performance, with an Intersection over Union (IoU) of 81.23% and an F1 score of 87.92% on leafminer-damaged leaves. The test results revealed a 92.38% diagnosis accuracy for leafminer damage levels based on the DeepLab-Leafminer model. A mobile application and a web platform were developed to assist surveyors in displaying the diagnostic results of leafminer damage levels. This system provides surveyors with an advanced, user-friendly, and accurate tool for assessing agromyzid leafminer damage in agricultural fields using wearable AR glasses and an AI model. This method can also be used to automatically diagnose pest and disease damage levels in other crops based on leaf images.
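The two segmentation metrics reported in this abstract can be made concrete with a short sketch that computes IoU and the F1 score for one class from a predicted and a ground-truth binary mask. This is a generic illustration rather than the authors' evaluation code; the function name `iou_and_f1` and the list-of-rows mask layout are assumptions, and the abstract does not say how its figures were averaged over images.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two equal-sized binary masks (lists of rows of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # damaged pixel correctly predicted
            elif p and not t:
                fp += 1  # predicted damaged, actually intact
            elif t and not p:
                fn += 1  # damaged pixel that was missed

    if tp + fp + fn == 0:
        # Both masks are empty: treat as a perfect match.
        return 1.0, 1.0

    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Toy 2x3 masks: 1 marks a leafminer-damaged pixel.
predicted = [[1, 1, 0], [0, 1, 0]]
reference = [[1, 0, 0], [0, 1, 1]]
iou, f1 = iou_and_f1(predicted, reference)
```

For a single pair of masks the two metrics are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU. Scores averaged over many images or classes need not satisfy this identity exactly.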
Abstract: In digital signal processing, image enhancement and image denoising are challenging tasks because pixel quality must be preserved. Approaches ranging from conventional filters to deep learning have been used to address them, but they still face challenges such as high computational requirements, overfitting, and poor generalization. Optimization algorithms provide greater control and transparency in designing digital filters for image enhancement and denoising. Therefore, this paper presents a novel denoising approach for medical applications using an Optimized Learning-based Multi-level discrete Wavelet Cascaded Convolutional Neural Network (OLMWCNN). In this approach, the optimal filter parameters are identified to preserve image quality after denoising. The performance and efficiency of the OLMWCNN filter are evaluated, demonstrating significant progress in denoising medical images while overcoming the limitations of conventional methods.
Funding: Funded by NSFC Grants 61502154 and 61972136, the NSF of Hubei Province (2023AFB004, 2024AFB544), the Hubei Provincial Department of Education Project (No. Q20232206), and the Project of Hubei University of Economics (No. T201410).
Abstract: With the fast development of multimedia social platforms, content dissemination on social media is becoming more popular. Social image sharing can also raise privacy concerns, and image encryption can protect social images. However, most existing image protection methods cannot be applied to multimedia social platforms because they encrypt in the spatial domain. In this work, the authors propose a secure social image-sharing method that combines watermarking/fingerprinting with encryption. First, a fingerprint code with a hierarchical community structure is designed based on social network analysis. Then, the discrete wavelet transform (DWT) is obtained directly from the block discrete cosine transform (DCT). After that, the codeword segments are embedded into the LL, LH, and HL subbands, respectively. The selected subbands are confused based on the Game of Life (GoL), and then all subbands are diffused with singular value decomposition (SVD). Experimental results and security analysis demonstrate the security, invisibility, and robustness of the method. Further, the superiority of the technique is shown through comparison with related image security algorithms. The solution not only performs a fast transformation from block DCT to one-level DWT but also protects users' privacy on multimedia social platforms. With the proposed method, secure sharing of JPEG images on multimedia social platforms can be ensured.