The noise in depth images obtained with RGB-D sensors arises from a combination of hardware limitations and environmental factors, and it degrades downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image details along with the noise. The framework presented in this paper is a novel denoising model that combines Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, preserving spatial structure integrity and reducing redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input data into a latent space, and the decoder reconstructs the clean image from it. Experiments on a benchmark data set show that the suggested technique attains a PSNR of 45 dB and an SSIM of 0.90, which is 10 dB higher than the performance of conventional convolutional autoencoders and 15 times higher than that of the wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers a highly efficient and scalable system for depth image denoising, with high potential for close-range 3D systems such as robotic manipulation and gesture-based interfaces.
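The PSNR figure quoted in this abstract can be made concrete with a minimal sketch. It assumes images are given as flat sequences of pixel values and that the peak value is 255; the function name `psnr` and both assumptions are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel values (an assumption made
    for this sketch); max_value is the largest possible pixel value.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error between the clean reference and the estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [10, 10, 10, 10]
    denoised = [11, 9, 10, 10]
    print(f"PSNR: {psnr(clean, denoised):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of 10 dB corresponds to a tenfold reduction in that error.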
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been a very attractive research topic in computer vision, attracting the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency features of HSI. Spatial features contain rich local details, spectral features contain the information and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution and yield higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module (FFEM) to extract the frequency features. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse multidimensional features. Experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
Hematoxylin and Eosin (H&E) images, popularly used in the field of digital pathology, often pose challenges due to their limited color richness, hindering the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA utilizes machine learning algorithms to identify crucial features within cell images, subsequently mapping these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells from tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows the potential to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could enhance cancer diagnosis, contributing a first step toward using SFGA in medical image enhancement.
Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that combines an improved multi-scale and multi-direction Harris algorithm with a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that our proposed method obtains a stable and even distribution of key points as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
Currently, thyroid diseases are prevalent worldwide; therefore, it is necessary to develop techniques that help doctors improve their diagnostic skills for such diseases. In previous studies, 2-dimensional convolutional neural network (2D CNN) techniques were employed to classify thyroid nodules as benign or malignant without detecting the presence of thyroid nodules in the obtained ultrasound images. To address this issue, we propose a 3-dimensional convolutional neural network (3D CNN) for thyroid nodule detection. The proposed CNN exploits the 3D information and spatial features contained in ultrasound images and generates distinctive features during its training using multiple samples, even for small nodules. In contrast, a 2D CNN only depends on spatial features. In this study, we used two datasets of 2210 ultrasound images obtained from the Sultan Abdul Aziz Shah Hospital in Malaysia, and a public open dataset, Digital Database Thyroid Image (DDTI). We created folders containing three images each, processed the images, and extracted volumetric features suitable for the 3D CNN. The proposed model was assessed using four metrics: accuracy, recall, precision, and F1 score. The results showed that the accuracy of the model in predicting the presence of thyroid nodules in ultrasound images was 96%. In conclusion, this study could help radiologists in hospitals and medical centres in classifying ultrasound images and detecting thyroid nodules.
Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. Therefore, the more congruent pixels are sorted in the descending order of the selection, which identifies better regions than the distribution. Now, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. Therefore, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
BACKGROUND: Pancreatic cancer remains one of the most lethal malignancies worldwide, with a poor prognosis often attributed to late diagnosis. Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning. AIM: To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features. METHODS: We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution. Pathological types were determined by histopathological examination of the surgical specimens or biopsy samples. The imaging features were assessed using computed tomography, magnetic resonance imaging, and endoscopic ultrasound. Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics. RESULTS: There were 320 (64%) cases of pancreatic ductal adenocarcinoma, 75 (15%) of intraductal papillary mucinous neoplasms, 50 (10%) of neuroendocrine tumors, and 55 (11%) of other rare types. Distinct imaging features were identified in each pathological type. Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography, whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules. Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging. Statistical analysis revealed significant correlations between specific imaging features and pathological types (P<0.001). CONCLUSION: This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features. These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
The self-attention mechanism of Transformers, which captures long-range contextual information, has demonstrated significant potential in image segmentation. However, their ability to learn local, contextual relationships between pixels requires further improvement. Previous methods face challenges in efficiently managing multi-scale features of different granularities from the encoder backbone, leaving room for improvement in their global representation and feature extraction capabilities. To address these challenges, we propose a novel Decoder with Multi-Head Feature Receptors (DMHFR), which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities: coarse, fine-grained, and full set. These groups are subsequently processed by Multi-Head Feature Receptors (MHFRs) after feature capture and modeling operations. MHFRs include two Three-Head Feature Receptors (THFRs) and one Four-Head Feature Receptor (FHFR). Each group of features is passed through these MHFRs and then fed into axial transformers, which help the model capture long-range dependencies within the features. The three MHFRs produce three distinct feature outputs. The output from the FHFR serves as auxiliary features in the prediction head, and the prediction outputs and their losses are eventually aggregated. Experimental results show that the Transformer using DMHFR outperforms 15 state-of-the-art (SOTA) methods on five public datasets. Specifically, it achieved significant improvements in mean DICE scores over the classic Parallel Reverse Attention Network (PraNet) method, with gains of 4.1%, 2.2%, 1.4%, 8.9%, and 16.3% on the CVC-ClinicDB, Kvasir-SEG, CVC-T, CVC-ColonDB, and ETIS-LaribPolypDB datasets, respectively.
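The DICE score used for the comparison in this abstract can be illustrated with a minimal sketch. It assumes binary masks given as flat sequences of 0/1 values and treats two empty masks as a perfect match; the function name `dice_score` and both conventions are assumptions for illustration, not details from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks (flat 0/1 sequences).

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    Two empty masks are scored as 1.0 (a convention assumed here).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(f"Dice: {dice_score(predicted, ground_truth):.3f}")
```

A mean DICE score such as the one reported per dataset would then be the average of this value over all test images.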
The study by Luo et al published in the World Journal of Gastrointestinal Oncology presents a thorough and scientific methodology. Pancreatic cancer is the most challenging malignancy in the digestive system, exhibiting one of the highest mortality rates associated with cancer globally. The delayed onset of symptoms and diagnosis often results in metastasis or local progression of the cancer, thereby constraining treatment options and outcomes. For these patients, prompt tumour identification and treatment strategising are crucial. The present objective of pancreatic cancer research is to examine the correlation between various pathological types and imaging data to facilitate therapeutic decision-making. This study aims to clarify the correlation between diverse pathological markers and imaging in pancreatic cancer patients, with prospective longitudinal studies potentially providing novel insights into the diagnosis and treatment of pancreatic cancer.
Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on the single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model’s generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
Lunar Laser Ranging has extremely high requirements for the pointing accuracy of the telescopes used. To improve its pointing accuracy and solve the problem of insufficiently accurate telescope pointing correction achieved by tracking stars in the all-sky region, we propose a processing scheme to select larger-sized lunar craters near the Lunar Corner Cube Retroreflector as reference features for telescope pointing bias computation. Accurately determining the position of the craters in the images is crucial for calculating the pointing bias; therefore, we propose a method for accurately calculating the crater position based on lunar surface feature matching. This method uses matched feature points obtained from image feature matching, using a deep learning method to solve the image transformation matrix. The known position of a crater in a reference image is mapped using this matrix to calculate the crater position in the target image. We validate this method using craters near the Lunar Corner Cube Retroreflectors of Apollo 15 and Luna 17 and find that the calculated position of a crater on the target image falls on the center of the crater, even for image features with large distortion near the lunar limb. The maximum image matching error is approximately 1″, and the minimum is only 0.47″, which meets the pointing requirements of Lunar Laser Ranging. This method provides a new technical means for the high-precision pointing bias calculation of the Lunar Laser Ranging system.
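The mapping step described in this abstract, where a known crater position in the reference image is carried into the target image by a transformation matrix, can be sketched as follows. The abstract does not state the form of the matrix; this sketch assumes a 3x3 projective (homography) matrix acting on homogeneous pixel coordinates, and the function name `map_point` and the example matrix are hypothetical.

```python
def map_point(H, point):
    """Map an (x, y) pixel position through a 3x3 transformation matrix H.

    H is a nested list in row-major order; the point is lifted to
    homogeneous coordinates (x, y, 1), multiplied by H, and normalized.
    """
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this matrix")
    return (u / w, v / w)


if __name__ == "__main__":
    # Hypothetical matrix: a pure shift of 5 pixels in x and -2 in y.
    shift = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
    crater_in_reference = (120.0, 80.0)
    print(map_point(shift, crater_in_reference))
```

In practice the matrix would be estimated from the matched feature points; the sketch only covers applying it once it is known.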
Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of our proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that our proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
In the last few years, guided image fusion algorithms have become more and more popular. However, the current algorithms cannot eliminate halo artifacts. We propose an image fusion algorithm based on a fast weighted guided filter. Firstly, the source images are separated into a series of high and low frequency components. Secondly, three visual features of the source image are extracted to construct a decision graph model. Thirdly, a fast weighted guided filter is proposed to optimize the result obtained in the previous step and reduce the time complexity by considering the correlation among neighboring pixels. Finally, the image obtained in the previous step is combined with the weight map to realize the image fusion. The proposed algorithm is applied to multi-focus, visible-infrared, and multi-modal images respectively, and the final results show that the algorithm effectively removes the halo artifacts of the merged images with higher efficiency, and is better than the traditional method in terms of both subjective visual effect and objective evaluation.
The application of transformer networks and feature fusion models in medical image segmentation has attracted considerable attention in the academic community. Nevertheless, two main obstacles persist: (1) the limitations of the Transformer network in dealing with locally detailed features, and (2) the considerable loss of feature information in current feature fusion modules. To solve these issues, this study first presents a refined feature extraction approach, employing a double-branch feature extraction network to capture complex multi-scale local and global information from images. Subsequently, we propose a low-loss feature fusion method, the Multi-branch Feature Fusion Enhancement Module (MFFEM), which realizes effective feature fusion with minimal loss. Simultaneously, the cross-layer cross-attention fusion module (CLCA) is adopted to further achieve adequate feature fusion by enhancing the interaction between encoders and decoders of various scales. Finally, the feasibility of our method was verified using the Synapse and ACDC datasets, demonstrating its competitiveness. The average DSC (%) was 83.62 and 91.99 respectively, and the average HD95 (mm) was reduced to 19.55 and 1.15 respectively.
A wavelet-based local and global feature fusion network (LAGN) is proposed for low-light image enhancement, aiming to enhance image details and restore colors in dark areas. This study focuses on addressing three key issues in low-light image enhancement: enhancing low-light images using LAGN to preserve image details and colors; extracting image edge information via wavelet transform to enhance image details; and extracting local and global features of images through convolutional neural networks and Transformer to improve image contrast. Comparisons with state-of-the-art methods on two datasets verify that LAGN achieves the best performance in terms of details, brightness, and contrast.
BACKGROUND: Colorectal cancer (CRC) is a malignant tumor with high morbidity and mortality rates worldwide. With the development of medical imaging technology, imaging features are playing an increasingly important role in the prognostic evaluation of CRC. Laparoscopic radical resection is a common surgical approach for treating CRC. However, research on the link between preoperative imaging and short-term prognosis in this context is limited. We hypothesized that specific preoperative imaging features can predict the short-term prognosis in patients undergoing laparoscopic CRC resection. AIM: To investigate the imaging features of CRC and analyze their correlation with the short-term prognosis of laparoscopic radical resection. METHODS: This retrospective study conducted at the Affiliated Cancer Hospital of Shandong First Medical University included 122 patients diagnosed with CRC who underwent laparoscopic radical resection between January 2021 and February 2024. All patients underwent magnetic resonance imaging (MRI) and were diagnosed with CRC through pathological examination. MRI data and prognostic indicators were collected 30 days post-surgery. Logistic regression analysis identified imaging features linked to short-term prognosis, and a receiver operating characteristic (ROC) curve was used to evaluate the predictive value. RESULTS: Among 122 patients, 22 had irregular, low-intensity tumors with adjacent high signals. In 55, tumors were surrounded by alternating signals in the muscle layer. In 32, tumors extended through the muscular layer and blurred boundaries with perienteric adipose tissue. Tumor signals appeared in the adjacent tissues in 13 patients with blurred gaps. Logistic regression revealed differences in longitudinal tumor length, axial tumor length, volume transfer constant, plasma volume fraction, and apparent diffusion coefficient among patients with varying prognostic results. ROC analysis indicated that the areas under the curve for these parameters were 0.648, 0.927, 0.821, 0.809, and 0.831, respectively. Sensitivity values were 0.643, 0.893, 0.607, 0.714, and 0.714, and specificity values were 0.702, 0.904, 0.883, 0.968, and 0.894 (P<0.05). CONCLUSION: The imaging features of CRC correlate with the short-term prognosis following laparoscopic radical resection. These findings provide valuable insights for clinical decision-making.
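The ROC quantities reported in this abstract (area under the curve, sensitivity, specificity) can be illustrated with a minimal sketch. The function names, the rule that a score at or above the threshold counts as a positive prediction, and the toy data are all assumptions for illustration; none of them come from the study.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold predicts positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 1 marks a hypothetical poor-prognosis case.
    outcome = [0, 0, 1, 1]
    parameter = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(outcome, parameter))
    print(sensitivity_specificity(outcome, parameter, threshold=0.5))
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a cohort of this size.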
When detecting objects in images taken by Unmanned Aerial Vehicles (UAVs), the large number of objects and the high proportion of small objects pose huge challenges for detection algorithms based on the You Only Look Once (YOLO) framework, making it difficult for them to handle tasks that demand high precision. To address these problems, this paper proposes a high-precision object detection algorithm based on YOLOv10s. Firstly, a Multi-branch Enhancement Coordinate Attention (MECA) module is proposed to enhance feature extraction capability. Secondly, a Multilayer Feature Reconstruction (MFR) mechanism is designed to fully exploit multilayer features, which can enrich object information as well as remove redundant information. Finally, an MFR Path Aggregation Network (MFR-Neck) is constructed, which integrates multi-scale features to improve the network's ability to perceive objects of varying sizes. The experimental results demonstrate that the proposed algorithm increases the average detection accuracy by 14.15% on the VisDrone dataset compared to YOLOv10s, effectively enhancing object detection precision in UAV-taken images.
Recently, digital image blind forensics technology has received increasing attention in the academic community. This paper aims at developing a new identification approach for image authentication based on the statistical noise and exchangeable image file format (EXIF) information of an image. In particular, the authors can identify whether the current image has been modified by utilizing the relevance between noise and EXIF parameters and comparing the real values with the estimated values of the EXIF parameters. Experimental results validate the proposed method; that is, the detecting system can identify the doctored image effectively.
Generative image steganography is a technique that directly generates stego images from secret information. Unlike traditional methods, it theoretically resists steganalysis because there is no cover image. Currently, the existing generative image steganography methods generally have good steganography performance, but there is still room for enhancing both the quality of stego images and the accuracy of secret information extraction. Therefore, this paper proposes a generative image steganography algorithm based on attribute feature transformation and an invertible mapping rule. Firstly, the reference image is disentangled by a content encoder and an attribute encoder to obtain content features and attribute features, respectively. Then, a mean mapping rule is introduced to map the binary secret information into a noise vector conforming to the distribution of attribute features. This noise vector is input into the generator to produce the attribute-transformed stego image with the content feature of the reference image. Additionally, we design an adversarial loss, a reconstruction loss, and an image diversity loss to train the proposed model. Experimental results demonstrate that the stego images generated by the proposed method are of high quality, with an average extraction accuracy of 99.4% for the hidden information. Furthermore, since the stego image has a uniform distribution similar to the attribute-transformed image without secret information, it effectively resists both subjective and objective steganalysis.
Abstract: The noise present in depth images obtained with RGB-D sensors arises from a combination of hardware limitations and environmental factors, and it degrades downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image details along with the noise. This paper presents a novel denoising framework that combines Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, preserving spatial structure integrity and reducing redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input data into a latent space, which is then decoded to recover the clean image. Experiments on a benchmark data set show that the suggested technique attains a PSNR of 45 dB and an SSIM of 0.90, which is 10 dB higher than the performance of conventional convolutional autoencoders and 15 times higher than that of the wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers a highly efficient and scalable system for depth image denoising, with high potential for close-range 3D systems such as robotic manipulation and gesture-based interfaces.
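For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of how PSNR is conventionally computed from the mean squared error. It is not the authors' evaluation code; the function name `psnr`, the 8-bit peak value of 255, and the tiny example images are assumptions made for illustration.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    max_value is the assumed peak pixel value (255 for 8-bit data).
    """
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    if len(flat_ref) != len(flat_est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 images, used only to exercise the function.
clean = [[100, 100], [100, 100]]
noisy = [[101, 99], [100, 100]]
score = psnr(clean, noisy)
```

Because PSNR is logarithmic in the error, a gain of 10 dB corresponds to a tenfold reduction in mean squared error at the same peak value.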
Funding: Supported by the Henan Province Key Research and Development Project (231111211300); the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005); the Henan Province Key Research and Development Project (231111110100); the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006); and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Funding: Supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK249909299001-036); the National Key Research and Development Program of China (No. 2023YFB4502803); and the Zhejiang Provincial Natural Science Foundation of China (No. LDT23F01014F01).
Abstract: Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been a very attractive research topic in computer vision, attracting the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency features of HSI. Spatial features contain rich local details, spectral features contain the information and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution to obtain higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module (FFEM) to extract the frequency features. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse multidimensional features. Experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
Funding: Supported by grants from the North China University of Technology Research Start-Up Fund (11005136024XN147-14 and 110051360024XN151-97); the Guangzhou Development Zone Science and Technology Project (2023GH02); the National Key R&D Program of China (2021YFE0201100 and 2022YFA1103401 to Juntao Gao); the National Natural Science Foundation of China (981890991 to Juntao Gao); the Beijing Municipal Natural Science Foundation (Z200021 to Juntao Gao); the CAS Interdisciplinary Innovation Team (JCTD-2020-04 to Juntao Gao); 0032/2022/A by Macao FDCT; and MYRG2022-00271-FST.
Abstract: Hematoxylin and Eosin (H&E) images, widely used in digital pathology, often pose challenges due to their limited color richness, which hinders the differentiation of subtle cell features crucial for accurate classification. Enhancing the visibility of these elusive cell features helps train robust deep-learning models. However, the selection and application of image processing techniques for such enhancement have not been systematically explored in the research community. To address this challenge, we introduce Salient Features Guided Augmentation (SFGA), an approach that strategically integrates machine learning and image processing. SFGA uses machine learning algorithms to identify crucial features within cell images and then maps these features to appropriate image processing techniques to enhance training images. By emphasizing salient features and aligning them with corresponding image processing methods, SFGA is designed to enhance the discriminating power of deep learning models in cell classification tasks. Our research undertakes a series of experiments, each exploring the performance of different datasets and data enhancement techniques in classifying cell types, highlighting the significance of data quality and enhancement in mitigating overfitting and distinguishing cell characteristics. Specifically, SFGA focuses on identifying tumor cells from tissue for extranodal extension detection, with the SFGA-enhanced dataset showing notable advantages in accuracy. We conducted a preliminary study of five experiments, among which the accuracy of the pleomorphism experiment improved significantly from 50.81% to 95.15%. The accuracy of the other four experiments also increased, with improvements ranging from 3 to 43 percentage points. Our preliminary study shows the potential to enhance the diagnostic accuracy of deep learning models and proposes a systematic approach that could enhance cancer diagnosis, serving as a first step in using SFGA in medical image enhancement.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61462046 and 61762052); the Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172); the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015); the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633); and the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381).
Abstract: Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that combines an improved multi-scale and multi-direction Harris algorithm with a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that the proposed method obtains a stable and even distribution of key points as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
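As background for the multi-direction gray level co-occurrence matrix mentioned above, here is a minimal sketch of the standard GLCM construction for one pixel offset. It is not the compound feature from the paper; the function name `glcm`, the four direction offsets, and the 3x3 example image are illustrative assumptions.

```python
def glcm(image, dx, dy, levels):
    """Count co-occurrences of gray levels (i, j) at pixel offset (dx, dy).

    image is a 2D list of integer gray levels in the range [0, levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

# Offsets commonly used for the 0, 45, 90 and 135 degree directions
# (assumed here; rows grow downward, so "up" is dy = -1).
DIRECTIONS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

# Hypothetical 3x3 image with three gray levels.
sample = [[0, 0, 1],
          [1, 2, 2],
          [2, 2, 0]]
multi_direction = {angle: glcm(sample, dx, dy, levels=3)
                   for angle, (dx, dy) in DIRECTIONS.items()}
horizontal = multi_direction[0]
```

Texture statistics such as contrast or energy are usually derived from each directional matrix after normalizing it to sum to one.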
Funding: Supported by the Ministry of Higher Education under the Fundamentals Research Grant Scheme (FRGS/1/2024/ICT02/UPM/02/5).
Abstract: Thyroid diseases are currently prevalent worldwide; therefore, it is necessary to develop techniques that help doctors improve their diagnostic skills for such diseases. In previous studies, 2-dimensional convolutional neural network (2D CNN) techniques were employed to classify thyroid nodules as benign or malignant without detecting the presence of thyroid nodules in the obtained ultrasound images. To address this issue, we propose a 3-dimensional convolutional neural network (3D CNN) for thyroid nodule detection. The proposed CNN exploits the 3D information and spatial features contained in ultrasound images and generates distinctive features during its training using multiple samples, even for small nodules. In contrast, a 2D CNN depends only on spatial features. In this study, we used two datasets: 2210 ultrasound images obtained from the Sultan Abdul Aziz Shah Hospital in Malaysia, and a public open dataset, Digital Database Thyroid Image (DDTI). We created folders containing three images each, processed the images, and extracted volumetric features suitable for the 3D CNN. The proposed model was assessed using four metrics: accuracy, recall, precision, and F1 score. The results showed that the accuracy of the model in predicting the presence of thyroid nodules in ultrasound images was 96%. In conclusion, this study could help radiologists in hospitals and medical centres in classifying ultrasound images and detecting thyroid nodules.
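The four metrics named in this abstract follow directly from confusion-matrix counts. The sketch below shows the standard definitions; the function name `classification_metrics` and the counts passed to it are made up for illustration and are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 45 nodules detected, 5 false alarms,
# 3 missed nodules, 47 correctly rejected images.
metrics = classification_metrics(tp=45, fp=5, fn=3, tn=47)
```

Reporting all four together matters because accuracy alone can look high on an imbalanced dataset even when recall for the positive class is poor.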
Funding: Funded by the Deanship of Scientific Research at King Khalid University through a large group Research Project under grant number RGP2/421/45; by Prince Sattam bin Abdulaziz University, project number (PSAU/2024/R/1446); by the Researchers Supporting Project Number (UM-DSR-IG-2023-07), Almaarefa University, Riyadh, Saudi Arabia; and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2021R1F1A1055408).
Abstract: Machine learning (ML) is increasingly applied to medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and increase computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method that selects the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Next, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution alone. The learning paradigm is then trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection is improved regardless of the textures and medical image patterns. This process enhances the performance of ML applications across different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Abstract: BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide, with a poor prognosis often attributed to late diagnosis. Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning. AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features. METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution. Pathological types were determined by histopathological examination of the surgical specimens or biopsy samples. The imaging features were assessed using computed tomography, magnetic resonance imaging, and endoscopic ultrasound. Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics. RESULTS There were 320 (64%) cases of pancreatic ductal adenocarcinoma, 75 (15%) of intraductal papillary mucinous neoplasms, 50 (10%) of neuroendocrine tumors, and 55 (11%) of other rare types. Distinct imaging features were identified in each pathological type. Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography, whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules. Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging. Statistical analysis revealed significant correlations between specific imaging features and pathological types (P < 0.001). CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features. These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
Funding: Supported by the Xiamen Medical and Health Guidance Project in 2021 (No. 3502Z20214ZD1070) and by a grant from the Guangxi Key Laboratory of Machine Vision and Intelligent Control, China (No. 2023B02).
Abstract: The self-attention mechanism of Transformers, which captures long-range contextual information, has demonstrated significant potential in image segmentation. However, their ability to learn local contextual relationships between pixels requires further improvement. Previous methods face challenges in efficiently managing multi-scale features of different granularities from the encoder backbone, leaving room for improvement in their global representation and feature extraction capabilities. To address these challenges, we propose a novel Decoder with Multi-Head Feature Receptors (DMHFR), which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities: coarse, fine-grained, and full set. These groups are subsequently processed by Multi-Head Feature Receptors (MHFRs) after feature capture and modeling operations. The MHFRs include two Three-Head Feature Receptors (THFRs) and one Four-Head Feature Receptor (FHFR). Each group of features is passed through these MHFRs and then fed into axial transformers, which help the model capture long-range dependencies within the features. The three MHFRs produce three distinct feature outputs. The output from the FHFR serves as auxiliary features in the prediction head, and the prediction outputs and their losses are eventually aggregated. Experimental results show that the Transformer using DMHFR outperforms 15 state-of-the-art (SOTA) methods on five public datasets. Specifically, it achieved significant improvements in mean DICE scores over the classic Parallel Reverse Attention Network (PraNet) method, with gains of 4.1%, 2.2%, 1.4%, 8.9%, and 16.3% on the CVC-ClinicDB, Kvasir-SEG, CVC-T, CVC-ColonDB, and ETIS-LaribPolypDB datasets, respectively.
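The DICE score used for comparison above measures the overlap between a predicted mask and the ground truth. A minimal sketch of the standard definition on binary masks follows; the function name `dice_score`, the convention of returning 1.0 for two empty masks, and the example masks are assumptions for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary 2D masks."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    size_sum = sum(flat_pred) + sum(flat_target)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum

# Hypothetical masks: three foreground pixels each, two of them shared.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
overlap = dice_score(predicted, ground_truth)
```

A mean DICE score over a dataset is typically the average of this per-image value.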
Funding: Supported by the National Health Commission's Key Laboratory of Gastrointestinal Tumor Diagnosis and Treatment for the Year 2022, National Health Commission's Master's and Doctoral/Postdoctoral Fund Project, No. NHCDP2022001; and the Gansu Provincial People's Hospital Doctoral Supervisor Training Project, No. 22GSSYA-3.
Abstract: The study by Luo et al published in the World Journal of Gastrointestinal Oncology presents a thorough and scientific methodology. Pancreatic cancer is the most challenging malignancy of the digestive system, with one of the highest cancer-associated mortality rates globally. The delayed onset of symptoms and diagnosis often results in metastasis or local progression of the cancer, thereby constraining treatment options and outcomes. For these patients, prompt tumour identification and treatment planning are crucial. The present objective of pancreatic cancer research is to examine the correlation between various pathological types and imaging data to facilitate therapeutic decision-making. This study aims to clarify the correlation between diverse pathological markers and imaging in pancreatic cancer patients, with prospective longitudinal studies potentially providing novel insights into the diagnosis and treatment of pancreatic cancer.
Funding: Supported by the National Natural Science Foundation of China (62302167, 62477013); the Natural Science Foundation of Shanghai (No. 24ZR1456100); the Science and Technology Commission of Shanghai Municipality (No. 24DZ2305900); and the Shanghai Municipal Special Fund for Promoting High-Quality Development of Industries (2211106).
Abstract: Multi-label image classification is a challenging task due to the diverse sizes and complex backgrounds of objects in images. Obtaining class-specific precise representations at different scales is a key aspect of feature representation. However, existing methods often rely on a single-scale deep feature, neglecting shallow and deeper layer features, which poses challenges when predicting objects of varying scales within the same image. Although some studies have explored multi-scale features, they rarely address the flow of information between scales or efficiently obtain class-specific precise representations for features at different scales. To address these issues, we propose a two-stage, three-branch Transformer-based framework. The first stage incorporates multi-scale image feature extraction and hierarchical scale attention. This design enables the model to consider objects at various scales while enhancing the flow of information across different feature scales, improving the model's generalization to diverse object scales. The second stage includes a global feature enhancement module and a region selection module. The global feature enhancement module strengthens interconnections between different image regions, mitigating the issue of incomplete representations, while the region selection module models the cross-modal relationships between image features and labels. Together, these components enable the efficient acquisition of class-specific precise feature representations. Extensive experiments on public datasets, including COCO2014, VOC2007, and VOC2012, demonstrate the effectiveness of our proposed method. Our approach achieves consistent performance gains of 0.3%, 0.4%, and 0.2% over state-of-the-art methods on the three datasets, respectively. These results validate the reliability and superiority of our approach for multi-label image classification.
Funding: Funded by the Natural Science Foundation of Jilin Province (20220101125JC) and the National Natural Science Foundation of China (12273079).
Abstract: Lunar Laser Ranging places extremely high requirements on the pointing accuracy of the telescopes used. To improve pointing accuracy and to address the insufficiently accurate telescope pointing correction achieved by tracking stars in the all-sky region, we propose a processing scheme that selects larger lunar craters near the Lunar Corner Cube Retroreflector as reference features for computing the telescope pointing bias. Accurately determining the position of the craters in the images is crucial for calculating the pointing bias; therefore, we propose a method for accurately calculating the crater position based on lunar surface feature matching. This method uses matched feature points, obtained from image feature matching with a deep learning method, to solve the image transformation matrix. The known position of a crater in a reference image is mapped using this matrix to calculate the crater position in the target image. We validate this method using craters near the Lunar Corner Cube Retroreflectors of Apollo 15 and Luna 17 and find that the calculated position of a crater on the target image falls on the center of the crater, even for image features with large distortion near the lunar limb. The maximum image matching error is approximately 1″, and the minimum is only 0.47″, which meets the pointing requirements of Lunar Laser Ranging. This method provides a new technical means for high-precision pointing bias calculation in Lunar Laser Ranging systems.
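The mapping step described above can be pictured as applying a planar transformation matrix to a known pixel position. The sketch below assumes the matrix is a 3x3 homography acting on homogeneous coordinates; the abstract does not state the matrix type, and the function name `map_point`, the example matrix, and the crater coordinates are hypothetical.

```python
def map_point(matrix, point):
    """Map an (x, y) pixel position through a 3x3 matrix in homogeneous coordinates."""
    x, y = point
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this transformation")
    # Divide by the homogeneous scale to return to pixel coordinates.
    return (xh / w, yh / w)

# Hypothetical transformation: a pure shift of +10 px in x and -5 px in y.
reference_to_target = [[1.0, 0.0, 10.0],
                       [0.0, 1.0, -5.0],
                       [0.0, 0.0, 1.0]]
crater_in_reference = (120.0, 80.0)
crater_in_target = map_point(reference_to_target, crater_in_reference)
```

In practice the matrix would be estimated from the matched feature points, for example by a robust least-squares fit, before being applied to the crater position.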
Funding: Supported by the National Natural Science Foundation of China (No. 51975374).
Abstract: Recent advances in convolutional neural networks (CNNs) have fostered progress in object recognition and semantic segmentation, which in turn has improved the performance of hyperspectral image (HSI) classification. Nevertheless, the difficulty of high-dimensional feature extraction and the shortage of training samples seriously hinder the future development of HSI classification. In this paper, we propose a novel algorithm for HSI classification based on a three-dimensional (3D) CNN and a feature pyramid network (FPN), called 3D-FPN. The framework contains a principal component analysis, a feature extraction structure, and a logistic regression. Specifically, the FPN built with 3D convolutions not only retains the advantages of 3D convolution to fully extract the spectral-spatial feature maps, but also concentrates on more detailed information and performs multi-scale feature fusion. This method avoids excessive model complexity and is suitable for small-sample hyperspectral classification with varying categories and spatial resolutions. To test the performance of our proposed 3D-FPN method, rigorous experimental analysis was performed on three public hyperspectral data sets and hyperspectral data from the GF-5 satellite. Quantitative and qualitative results indicated that our proposed method attained the best performance among current state-of-the-art end-to-end deep learning-based methods.
Funding: Supported by the National Natural Science Foundation of China (61472324, 61671383) and the Shaanxi Key Industry Innovation Chain Project (2018ZDCXL-G-12-2, 2019ZDLGY14-02-02).
Abstract: In the last few years, guided image fusion algorithms have become increasingly popular. However, current algorithms cannot eliminate halo artifacts. We propose an image fusion algorithm based on a fast weighted guided filter. Firstly, the source images are separated into a series of high- and low-frequency components. Secondly, three visual features of the source image are extracted to construct a decision graph model. Thirdly, a fast weighted guided filter is proposed to optimize the result obtained in the previous step and to reduce the time complexity by considering the correlation among neighboring pixels. Finally, the image obtained in the previous step is combined with the weight map to realize the image fusion. The proposed algorithm is applied to multi-focus, visible-infrared, and multi-modal images, and the final results show that the algorithm effectively removes the halo artifacts of the merged images with higher efficiency, and is better than the traditional method in both subjective visual effect and objective evaluation.
Funding: Funded by the Henan Science and Technology Research Project (222103810042); the open project of the scientific research platform of the grain information processing center of Henan University of Technology (KFJJ-2021-108); the innovative funds plan of Henan University of Technology (2021ZKCJ14); and the Henan University of Technology Youth Backbone Teacher Program.
Abstract: The application of Transformer networks and feature fusion models in medical image segmentation has attracted considerable attention in the academic community. Nevertheless, two main obstacles persist: (1) the limitations of the Transformer network in dealing with locally detailed features, and (2) the considerable loss of feature information in current feature fusion modules. To solve these issues, this study first presents a refined feature extraction approach, employing a double-branch feature extraction network to capture complex multi-scale local and global information from images. Subsequently, we propose a low-loss feature fusion method, the Multi-branch Feature Fusion Enhancement Module (MFFEM), which realizes effective feature fusion with minimal loss. In addition, the cross-layer cross-attention fusion module (CLCA) is adopted to further achieve adequate feature fusion by enhancing the interaction between encoders and decoders of various scales. Finally, the feasibility of our method was verified using the Synapse and ACDC datasets, demonstrating its competitiveness. The average DSC (%) was 83.62 and 91.99, respectively, and the average HD95 (mm) was reduced to 19.55 and 1.15, respectively.
Abstract: A wavelet-based local and global feature fusion network (LAGN) is proposed for low-light image enhancement, aiming to enhance image details and restore colors in dark areas. This study focuses on addressing three key issues in low-light image enhancement: enhancing low-light images using LAGN to preserve image details and colors; extracting image edge information via wavelet transform to enhance image details; and extracting local and global features of images through convolutional neural networks and a Transformer to improve image contrast. Comparisons with state-of-the-art methods on two datasets verify that LAGN achieves the best performance in terms of details, brightness, and contrast.
Abstract: BACKGROUND Colorectal cancer (CRC) is a malignant tumor with high morbidity and mortality rates worldwide. With the development of medical imaging technology, imaging features are playing an increasingly important role in the prognostic evaluation of CRC. Laparoscopic radical resection is a common surgical approach for treating CRC. However, research on the link between preoperative imaging and short-term prognosis in this context is limited. We hypothesized that specific preoperative imaging features can predict the short-term prognosis in patients undergoing laparoscopic CRC resection. AIM To investigate the imaging features of CRC and analyze their correlation with the short-term prognosis of laparoscopic radical resection. METHODS This retrospective study, conducted at the Affiliated Cancer Hospital of Shandong First Medical University, included 122 patients diagnosed with CRC who underwent laparoscopic radical resection between January 2021 and February 2024. All patients underwent magnetic resonance imaging (MRI) and were diagnosed with CRC through pathological examination. MRI data and prognostic indicators were collected 30 days post-surgery. Logistic regression analysis identified imaging features linked to short-term prognosis, and a receiver operating characteristic (ROC) curve was used to evaluate the predictive value. RESULTS Among 122 patients, 22 had irregular, low-intensity tumors with adjacent high signals. In 55, tumors were surrounded by alternating signals in the muscle layer. In 32, tumors extended through the muscular layer and blurred boundaries with perienteric adipose tissue. Tumor signals appeared in the adjacent tissues in 13 patients with blurred gaps. Logistic regression revealed differences in longitudinal tumor length, axial tumor length, volume transfer constant, plasma volume fraction, and apparent diffusion coefficient among patients with varying prognostic results. ROC analysis indicated that the areas under the curve for these parameters were 0.648, 0.927, 0.821, 0.809, and 0.831, respectively. Sensitivity values were 0.643, 0.893, 0.607, 0.714, and 0.714, and specificity values were 0.702, 0.904, 0.883, 0.968, and 0.894 (P < 0.05). CONCLUSION The imaging features of CRC correlate with the short-term prognosis following laparoscopic radical resection. These findings provide valuable insights for clinical decision-making.
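To make the ROC quantities reported above concrete, the following is a minimal sketch of how sensitivity, specificity, and the area under the curve are conventionally obtained from scores and binary outcomes. It is not the study's analysis; the function names, the 0.5 threshold, and the six score/label pairs are invented for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical predictor values and outcomes (1 = poor prognosis).
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
sens, spec = sensitivity_specificity(example_scores, example_labels, 0.5)
auc = roc_auc(example_scores, example_labels)
```

Each reported sensitivity and specificity pair corresponds to one chosen threshold on the ROC curve, whereas the AUC summarizes performance across all thresholds.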
Funding: Co-supported by the National Natural Science Foundation of China (No. 62103190) and the Natural Science Foundation of Jiangsu Province, China (No. BK20230923).
Abstract: When detecting objects in images taken by Unmanned Aerial Vehicles (UAVs), the large number of objects and the high proportion of small objects pose major challenges for detection algorithms based on the You Only Look Once (YOLO) framework, making it difficult for them to handle tasks that demand high precision. To address these problems, this paper proposes a high-precision object detection algorithm based on YOLOv10s. Firstly, a Multi-branch Enhancement Coordinate Attention (MECA) module is proposed to enhance feature extraction capability. Secondly, a Multilayer Feature Reconstruction (MFR) mechanism is designed to fully exploit multilayer features, which can enrich object information as well as remove redundant information. Finally, an MFR Path Aggregation Network (MFR-Neck) is constructed, which integrates multi-scale features to improve the network's ability to perceive objects of varying sizes. The experimental results demonstrate that the proposed algorithm increases the average detection accuracy by 14.15% on the VisDrone dataset compared to YOLOv10s, effectively enhancing object detection precision in UAV-taken images.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61370195 and 11101048; the Beijing Natural Science Foundation under Grant No. 4132060; and the National Cryptography Development Foundation of China under Grant No. MMJJ201201002.
Abstract: Recently, digital image blind forensics technology has received increasing attention in the academic community. This paper aims at developing a new identification approach for image authentication based on the statistical noise and exchangeable image file format (EXIF) information of an image. In particular, the authors identify whether an image has been modified by exploiting the relevance between noise and EXIF parameters and comparing the real values of the EXIF parameters with their estimated values. Experimental results validate the proposed method; that is, the detection system can identify doctored images effectively.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62202234, 62401270); the China Postdoctoral Science Foundation (No. 2023M741778); and the Natural Science Foundation of Jiangsu Province (Nos. BK20240706, BK20240694).
Abstract: Generative image steganography is a technique that directly generates stego images from secret information. Unlike traditional methods, it theoretically resists steganalysis because there is no cover image. Existing generative image steganography methods generally have good steganography performance, but there is still room to improve both the quality of stego images and the accuracy of secret information extraction. Therefore, this paper proposes a generative image steganography algorithm based on attribute feature transformation and an invertible mapping rule. Firstly, the reference image is disentangled by a content encoder and an attribute encoder to obtain content features and attribute features, respectively. Then, a mean mapping rule is introduced to map the binary secret information into a noise vector that conforms to the distribution of attribute features. This noise vector is input into the generator to produce the attribute-transformed stego image with the content feature of the reference image. Additionally, we design an adversarial loss, a reconstruction loss, and an image diversity loss to train the proposed model. Experimental results demonstrate that the stego images generated by the proposed method are of high quality, with an average extraction accuracy of 99.4% for the hidden information. Furthermore, since the stego image has a uniform distribution similar to the attribute-transformed image without secret information, it effectively resists both subjective and objective steganalysis.