High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes an...High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes and wealth of spatial details pose challenges for semantic segmentation.While convolutional neural networks(CNNs)excel at capturing local features,they are limited in modeling long-range dependencies.Conversely,transformers utilize multihead self-attention to integrate global context effectively,but this approach often incurs a high computational cost.This paper proposes a global-local multiscale context network(GLMCNet)to extract both global and local multiscale contextual information from HRSIs.A detail-enhanced filtering module(DEFM)is proposed at the end of the encoder to refine the encoder outputs further,thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information.In addition,a global-local multiscale transformer block(GLMTB)is proposed in the decoding stage to enable the modeling of rich multiscale global and local information.We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively.Finally,we propose the semantic awareness enhancement module(SAEM),which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention.Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method.Specifically,our method achieved a mean Intersection over Union(mIoU)of 86.89%on the ISPRS Potsdam dataset and 84.34%on the ISPRS Vaihingen dataset,outperforming existing models such as ABCNet and BANet.展开更多
Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive te...Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive text data.Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future(Thirunavukarasu et al.,2023).This article aims to provide an in-depth analysis of LLMs’current and potential impact on clinical practices.Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education(Hirosawa et al.,2023;Koga et al.,2023).展开更多
Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image dis...Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image distortion,inaccurate localization of the tampered regions,and difficulty in recovering content.Given these shortcomings,a fragile image watermarking algorithm for tampering blind-detection and content self-recovery is proposed.The multi-feature watermarking authentication code(AC)is constructed using texture feature of local binary patterns(LBP),direct coefficient of discrete cosine transform(DCT)and contrast feature of gray level co-occurrence matrix(GLCM)for detecting the tampered region,and the recovery code(RC)is designed according to the average grayscale value of pixels in image blocks for recovering the tampered content.Optimal pixel adjustment process(OPAP)and least significant bit(LSB)algorithms are used to embed the recovery code and authentication code into the image in a staggered manner.When detecting the integrity of the image,the authentication code comparison method and threshold judgment method are used to perform two rounds of tampering detection on the image and blindly recover the tampered content.Experimental results show that this algorithm has good transparency,strong and blind detection,and self-recovery performance against four types of malicious attacks and some conventional signal processing operations.When resisting copy-paste,text addition,cropping and vector quantization under the tampering rate(TR)10%,the average tampering detection rate is up to 94.09%,and the peak signal-to-noise ratio(PSNR)of the watermarked image and the recovered image are both greater than 41.47 and 40.31 dB,which demonstrates its excellent advantages compared with other related algorithms in recent years.展开更多
A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-d...A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-dimensional cusp boundary from a two-dimensional X-ray image because the detected X-ray signals will be integrated along the line of sight.In this work,a global magnetohydrodynamic code was used to simulate the X-ray images and photon count images,assuming an interplanetary magnetic field with a pure Bz component.The assumption of an elliptic cusp boundary at a given altitude was used to trace the equatorward and poleward boundaries of the cusp from a simulated X-ray image.The average discrepancy was less than 0.1 RE.To reduce the influence of instrument effects and cosmic X-ray backgrounds,image denoising was considered before applying the method above to SXI photon count images.The cusp boundaries were reasonably reconstructed from the noisy X-ray image.展开更多
Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approach...Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approaches,while effective in global illumination modeling,often struggle to simultaneously suppress noise and preserve structural details,especially under heterogeneous lighting.Furthermore,misalignment between luminance and color channels introduces additional challenges to accurate enhancement.In response to the aforementioned difficulties,we introduce a single-stage framework,M2ATNet,using the multi-scale multi-attention and Transformer architecture.First,to address the problems of texture blurring and residual noise,we design a multi-scale multi-attention denoising module(MMAD),which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities.Secondly,to solve the non-alignment problem of the luminance and color channels,we introduce the multi-channel feature fusion Transformer(CFFT)module,which effectively recovers the dark details and corrects the color shifts through cross-channel alignment and deep feature interaction.To guide the model to learn more stably and efficiently,we also fuse multiple types of loss functions to form a hybrid loss term.We extensively evaluate the proposed method on various standard datasets,including LOL-v1,LOL-v2,DICM,LIME,and NPE.Evaluation in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches.Ablation studies further confirm the critical roles played by the MMAD and CFFT modules to detail preservation and visual fidelity under challenging illumination-deficient environments.展开更多
Driven by advancements in mobile internet technology,images have become a crucial data medium.Ensuring the security of image information during transmission has thus emerged as an urgent challenge.This study proposes ...Driven by advancements in mobile internet technology,images have become a crucial data medium.Ensuring the security of image information during transmission has thus emerged as an urgent challenge.This study proposes a novel image encryption algorithm specifically designed for grayscale image security.This research introduces a new Cantor diagonal matrix permutation method.The proposed permutation method uses row and column index sequences to control the Cantor diagonal matrix,where the row and column index sequences are generated by a spatiotemporal chaotic system named coupled map lattice(CML).The high initial value sensitivity of the CML system makes the permutation method highly sensitive and secure.Additionally,leveraging fractal theory,this study introduces a chaotic fractal matrix and applies this matrix in the diffusion process.This chaotic fractal matrix exhibits selfsimilarity and irregularity.Using the Cantor diagonal matrix and chaotic fractal matrix,this paper introduces a fast image encryption algorithm involving two diffusion steps and one permutation step.Moreover,the algorithm achieves robust security with only a single encryption round,ensuring high operational efficiency.Experimental results show that the proposed algorithm features an expansive key space,robust security,high sensitivity,high efficiency,and superior statistical properties for the ciphered images.Thus,the proposed algorithm not only provides a practical solution for secure image transmission but also bridges fractal theory with image encryption techniques,thereby opening new research avenues in chaotic cryptography and advancing the development of information security technology.展开更多
Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi...Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi-stego images provides good image quality but often results in low embedding capability.To address these challenges,this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image.The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order.Four secret bits are embedded into each block’s maximum pixel value,while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold.A similar embedding strategy is also applied to the minimum side of the block,including the second-smallest pixel value.This design enables each block to embed up to 14 bits of secret data.Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches,advancing the field of reversible steganography.展开更多
Photoresponsive memristors(i.e.,photomemristors)have been recently highly regarded to tackle data latency and energy consumption challenges in conventional Von Neumann architecture-based image recognition systems.Howe...Photoresponsive memristors(i.e.,photomemristors)have been recently highly regarded to tackle data latency and energy consumption challenges in conventional Von Neumann architecture-based image recognition systems.However,their efficacy in recognizing low-contrast images is quite limited,and while preprocessing algorithms are usually employed to enhance these images,which naturally introduce delays that hinder real-time recognition in complex conditions.To address this challenge,here we present a selfdriven polarization-sensitive ferroelectric photomemristor inspired by advanced biological systems.The proposed prototype device is engineered to extract image polarization information,enabling real-time and in-situ enhanced image recognition and classification capabilities.By combining the anisotropic optical feature of the two-dimensional material(ReSe_(2))and ferroelectric polarization of singlecrystalline diisopropylammonium bromide(DIPAB)thin film,tunable and self-driven polarized responsiveness with intelligence was achieved.With remarkable optoelectronic synaptic characteristics of the fabricated device,a significant enhancement was demonstrated in recognition probability—averaging an impressive 85.9% for low-contrast scenarios,in contrast to the mere 47.5% exhibited by traditional photomemristors.This holds substantial implications for the detection and recognition of subtle information in diverse scenes such as autonomous driving,medical imaging,and astronomical observation.展开更多
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods ex...Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods exhibit deficiencies in detail recovery and noise suppression,particularly when processing complex landscapes(e.g.,forests,farmlands),leading to artifacts and spectral distortions that limit practical utility.To address this,we propose an enhanced Super-Resolution Generative Adversarial Network(SRGAN)framework featuring three key innovations:(1)Replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing;(2)A multi-loss joint optimization strategy dynamically weighting Charbonnier loss(β=0.5),Visual Geometry Group(VGG)perceptual loss(α=1),and adversarial loss(γ=0.1)to synergize pixel-level accuracy and perceptual quality;(3)A multi-scale residual network(MSRN)capturing cross-scale texture features(e.g.,forest canopies,mountain contours).Validated on Sentinel-2(10 m)and SPOT-6/7(2.5 m)datasets covering 904 km2 in Motuo County,Xizang,our method outperforms the SRGAN baseline(SR4RS)with Peak Signal-to-Noise Ratio(PSNR)gains of 0.29 dB and Structural Similarity Index(SSIM)improvements of 3.08%on forest imagery.Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity(LPIPS)increases.The method significantly improves noise robustness and edge retention in complex geomorphology,demonstrating 18%faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring.Future work will integrate spectral constraints and lightweight architectures.展开更多
Colorectal cancer(CRC)with lung oligometastases,particularly in the presence of extrapulmonary disease,poses considerable therapeutic challenges in clinical practice.We have carefully studied the multicenter study by ...Colorectal cancer(CRC)with lung oligometastases,particularly in the presence of extrapulmonary disease,poses considerable therapeutic challenges in clinical practice.We have carefully studied the multicenter study by Hu et al,which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation(IGTA).These findings provide valuable clinical evidence supporting IGTA as a feasible,minimally invasive approach and underscore the prognostic significance of metastatic distribution.However,the study by Hu et al has several limitations,including that not all pulmonary lesions were pathologically confirmed,postoperative follow-up mainly relied on dynamic contrast-enhanced computed tomography,no comparative analysis was performed with other local treatments,and the impact of other imaging features on efficacy and prognosis was not evaluated.Future studies should include complete pathological confirmation,integrate functional imaging and radiomics,and use prospective multicenter collaboration to optimize patient selection standards for IGTA treatment,strengthen its clinical evidence base,and ultimately promote individualized decision-making for patients with metastatic CRC.展开更多
Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often stru...Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often struggle with low-contrast MRI images,class imbalance,and suboptimal feature extraction.This paper develops a Hybrid DL system that unites MobileNetV2 with adaptive classification methods to boost Alzheimer’s diagnosis by processing MRI scans.Image enhancement is done using Contrast-Limited Adaptive Histogram Equalization(CLAHE)and Enhanced Super-Resolution Generative Adversarial Networks(ESRGAN).A classification robustness enhancement system integrates class weighting techniques and a Matthews Correlation Coefficient(MCC)-based evaluation method into the design.The trained and validated model gives a 98.88%accuracy rate and 0.9614 MCC score.We also performed a 10-fold cross-validation experiment with an average accuracy of 96.52%(±1.51),a loss of 0.1671,and an MCC score of 0.9429 across folds.The proposed framework outperforms the state-of-the-art models with a 98%weighted F1-score while decreasing misdiagnosis results for every AD stage.The model demonstrates apparent separation abilities between AD progression stages according to the results of the confusion matrix analysis.These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis,contributing to improved computer-aided diagnosis(CAD)systems in clinical practice.展开更多
In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, t...In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, the block is considered informative. A total of 12 features including statistics of brightness, color components and texture measures are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6 to give the final feature vectors. Relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors hence the blocks. Blocks falling into the same group show similar visual appearances.展开更多
This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorit...This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorithm. Then all the CCs including text CCs and non-text CCs are verified on their text features by a 2-stage classification module, where most non-text CCs are discarded by an attentional cascade classifier and remaining CCs are further verified by an SVM. All the accepted CCs are output to result in text only binary image. Experiments with many images in different scenes showed satisfactory performance of our proposed method.展开更多
This paper develops a deep learning classification method with fully-connected 8-layers characteristics to classification of coastal wetland based on CHRIS hyperspectral image. The method combined spectral feature and...This paper develops a deep learning classification method with fully-connected 8-layers characteristics to classification of coastal wetland based on CHRIS hyperspectral image. The method combined spectral feature and multi-spatial texture feature information has been applied in the Huanghe(Yellow) River Estuary coastal wetland.The results show that:(1) Based on testing samples, the DCNN model combined spectral feature and texture feature after K-L transformation appear high classification accuracy, which is up to 99%.(2) The accuracy by using spectral feature with all the texture feature is lower than that using spectral only and combing spectral and texture feature after K-L transformation. The DCNN classification accuracy using spectral feature and texture feature after K-L transformation was up to 99.38%, and the outperformed that of all the texture feature by 4.15%.(3) The classification accuracy of the DCNN method achieves better performance than other methods based on the whole validation image, with an overall accuracy of 84.64% and the Kappa coefficient of 0.80.(4) The developed DCNN model classification algorithm ensured the accuracy of all types is more balanced, and it also greatly improved the accuracy of tidal flat and farmland, while kept the classification accuracy of main types almost invariant compared to the shallow algorithms. The classification accuracy of tidal flat and farmland is up to 79.26% and 56.72%respectively based on the DCNN model. And it improves by about 2.51% and 10.6% compared with that of the other shallow classification methods.展开更多
The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the qu...The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification(LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size(CSS), Massachusetts Institute of Technology(MIT) States and Fashion200K(+8.3%, +0.7% and +4.6% in R@1).展开更多
The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and hist...The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.展开更多
Aiming at the binary text image's characteristics of simple pixel, complex texture and bad immunity of information concealment, a digital watermarking embedment location choosing method has been put forward based upo...Aiming at the binary text image's characteristics of simple pixel, complex texture and bad immunity of information concealment, a digital watermarking embedment location choosing method has been put forward based upon compatible roughness set. The method divides binary text image into different equivalent classes. Equivalent classes are further divided into different subclasses according to each pixel's degree and texture changes between blocks. Through properties' combination, the embedment block and location which are fit for watermarking are found out. At last, different binary text images are chosen for emulation experiment. After being embedded, the image is compressed in JPIG-2. Gaussian noise, salt & pepper noise are added and cutting is employed to imitate the actual environment in which images may suffer from various attacks and interferences. The result shows that the detector has a sound testing effect under various conditions.展开更多
基金provided by the Science Research Project of Hebei Education Department under grant No.BJK2024115.
文摘High-resolution remote sensing images(HRSIs)are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies.However,their significant scale changes and wealth of spatial details pose challenges for semantic segmentation.While convolutional neural networks(CNNs)excel at capturing local features,they are limited in modeling long-range dependencies.Conversely,transformers utilize multihead self-attention to integrate global context effectively,but this approach often incurs a high computational cost.This paper proposes a global-local multiscale context network(GLMCNet)to extract both global and local multiscale contextual information from HRSIs.A detail-enhanced filtering module(DEFM)is proposed at the end of the encoder to refine the encoder outputs further,thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information.In addition,a global-local multiscale transformer block(GLMTB)is proposed in the decoding stage to enable the modeling of rich multiscale global and local information.We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively.Finally,we propose the semantic awareness enhancement module(SAEM),which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention.Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method.Specifically,our method achieved a mean Intersection over Union(mIoU)of 86.89%on the ISPRS Potsdam dataset and 84.34%on the ISPRS Vaihingen dataset,outperforming existing models such as ABCNet and BANet.
文摘Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive text data.Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future(Thirunavukarasu et al.,2023).This article aims to provide an in-depth analysis of LLMs’current and potential impact on clinical practices.Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education(Hirosawa et al.,2023;Koga et al.,2023).
基金supported by Postgraduate Research&Practice Innovation Program of Jiangsu Province,China(Grant No.SJCX24_1332)Jiangsu Province Education Science Planning Project in 2024(Grant No.B-b/2024/01/122)High-Level Talent Scientific Research Foundation of Jinling Institute of Technology,China(Grant No.jit-b-201918).
文摘Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright.However,in practical applications,this technology faces various problems such as severe image distortion,inaccurate localization of the tampered regions,and difficulty in recovering content.Given these shortcomings,a fragile image watermarking algorithm for tampering blind-detection and content self-recovery is proposed.The multi-feature watermarking authentication code(AC)is constructed using texture feature of local binary patterns(LBP),direct coefficient of discrete cosine transform(DCT)and contrast feature of gray level co-occurrence matrix(GLCM)for detecting the tampered region,and the recovery code(RC)is designed according to the average grayscale value of pixels in image blocks for recovering the tampered content.Optimal pixel adjustment process(OPAP)and least significant bit(LSB)algorithms are used to embed the recovery code and authentication code into the image in a staggered manner.When detecting the integrity of the image,the authentication code comparison method and threshold judgment method are used to perform two rounds of tampering detection on the image and blindly recover the tampered content.Experimental results show that this algorithm has good transparency,strong and blind detection,and self-recovery performance against four types of malicious attacks and some conventional signal processing operations.When resisting copy-paste,text addition,cropping and vector quantization under the tampering rate(TR)10%,the average tampering detection rate is up to 94.09%,and the peak signal-to-noise ratio(PSNR)of the watermarked image and the recovered image are both greater than 41.47 and 40.31 dB,which demonstrates its excellent advantages compared with other related algorithms in recent years.
基金funded by the National Natural Science Foundation of China(NNSFC)under Grant Numbers 42322408,42188101,and 42441809Additional support was provided by the Climbing Program of the National Space Science Center(NSSC,Grant No.E4PD3005)as well as the Specialized Research Fund for State Key Laboratories of China.
文摘A large-scale view of the magnetospheric cusp is expected to be obtained by the Soft X-ray Imager(SXI)onboard the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE).However,it is challenging to trace the three-dimensional cusp boundary from a two-dimensional X-ray image because the detected X-ray signals will be integrated along the line of sight.In this work,a global magnetohydrodynamic code was used to simulate the X-ray images and photon count images,assuming an interplanetary magnetic field with a pure Bz component.The assumption of an elliptic cusp boundary at a given altitude was used to trace the equatorward and poleward boundaries of the cusp from a simulated X-ray image.The average discrepancy was less than 0.1 RE.To reduce the influence of instrument effects and cosmic X-ray backgrounds,image denoising was considered before applying the method above to SXI photon count images.The cusp boundaries were reasonably reconstructed from the noisy X-ray image.
基金funded by the National Natural Science Foundation of China,grant numbers 52374156 and 62476005。
文摘Images taken in dim environments frequently exhibit issues like insufficient brightness,noise,color shifts,and loss of detail.These problems pose significant challenges to dark image enhancement tasks.Current approaches,while effective in global illumination modeling,often struggle to simultaneously suppress noise and preserve structural details,especially under heterogeneous lighting.Furthermore,misalignment between luminance and color channels introduces additional challenges to accurate enhancement.In response to the aforementioned difficulties,we introduce a single-stage framework,M2ATNet,using the multi-scale multi-attention and Transformer architecture.First,to address the problems of texture blurring and residual noise,we design a multi-scale multi-attention denoising module(MMAD),which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities.Secondly,to solve the non-alignment problem of the luminance and color channels,we introduce the multi-channel feature fusion Transformer(CFFT)module,which effectively recovers the dark details and corrects the color shifts through cross-channel alignment and deep feature interaction.To guide the model to learn more stably and efficiently,we also fuse multiple types of loss functions to form a hybrid loss term.We extensively evaluate the proposed method on various standard datasets,including LOL-v1,LOL-v2,DICM,LIME,and NPE.Evaluation in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches.Ablation studies further confirm the critical roles played by the MMAD and CFFT modules to detail preservation and visual fidelity under challenging illumination-deficient environments.
基金supported by the National Natural Science Foundation of China(62376106)The Science and Technology Development Plan of Jilin Province(20250102212JC).
文摘Driven by advancements in mobile internet technology,images have become a crucial data medium.Ensuring the security of image information during transmission has thus emerged as an urgent challenge.This study proposes a novel image encryption algorithm specifically designed for grayscale image security.This research introduces a new Cantor diagonal matrix permutation method.The proposed permutation method uses row and column index sequences to control the Cantor diagonal matrix,where the row and column index sequences are generated by a spatiotemporal chaotic system named coupled map lattice(CML).The high initial value sensitivity of the CML system makes the permutation method highly sensitive and secure.Additionally,leveraging fractal theory,this study introduces a chaotic fractal matrix and applies this matrix in the diffusion process.This chaotic fractal matrix exhibits selfsimilarity and irregularity.Using the Cantor diagonal matrix and chaotic fractal matrix,this paper introduces a fast image encryption algorithm involving two diffusion steps and one permutation step.Moreover,the algorithm achieves robust security with only a single encryption round,ensuring high operational efficiency.Experimental results show that the proposed algorithm features an expansive key space,robust security,high sensitivity,high efficiency,and superior statistical properties for the ciphered images.Thus,the proposed algorithm not only provides a practical solution for secure image transmission but also bridges fractal theory with image encryption techniques,thereby opening new research avenues in chaotic cryptography and advancing the development of information security technology.
基金funded by University of Transport and Communications(UTC)under grant number T2025-CN-004.
文摘Reversible data hiding(RDH)enables secret data embedding while preserving complete cover image recovery,making it crucial for applications requiring image integrity.The pixel value ordering(PVO)technique used in multi-stego images provides good image quality but often results in low embedding capability.To address these challenges,this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image.The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order.Four secret bits are embedded into each block’s maximum pixel value,while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold.A similar embedding strategy is also applied to the minimum side of the block,including the second-smallest pixel value.This design enables each block to embed up to 14 bits of secret data.Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches,advancing the field of reversible steganography.
基金supported by the National Key Research and Development Program of China for International Cooperation under Grant 2023YFE0117100the National Natural Science Foundation of China(Nos.62074040 and 62074045).
文摘Photoresponsive memristors(i.e.,photomemristors)have been recently highly regarded to tackle data latency and energy consumption challenges in conventional Von Neumann architecture-based image recognition systems.However,their efficacy in recognizing low-contrast images is quite limited,and while preprocessing algorithms are usually employed to enhance these images,which naturally introduce delays that hinder real-time recognition in complex conditions.To address this challenge,here we present a selfdriven polarization-sensitive ferroelectric photomemristor inspired by advanced biological systems.The proposed prototype device is engineered to extract image polarization information,enabling real-time and in-situ enhanced image recognition and classification capabilities.By combining the anisotropic optical feature of the two-dimensional material(ReSe_(2))and ferroelectric polarization of singlecrystalline diisopropylammonium bromide(DIPAB)thin film,tunable and self-driven polarized responsiveness with intelligence was achieved.With remarkable optoelectronic synaptic characteristics of the fabricated device,a significant enhancement was demonstrated in recognition probability—averaging an impressive 85.9% for low-contrast scenarios,in contrast to the mere 47.5% exhibited by traditional photomemristors.This holds substantial implications for the detection and recognition of subtle information in diverse scenes such as autonomous driving,medical imaging,and astronomical observation.
基金This study was supported by:Inner Mongolia Academy of Forestry Sciences Open Research Project(Grant No.KF2024MS03)The Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences(Grant No.2024NLTS04)The Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University(Grant No.X202410022268).
文摘Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring,urban planning,and disaster assessment.However,traditional methods exhibit deficiencies in detail recovery and noise suppression,particularly when processing complex landscapes(e.g.,forests,farmlands),leading to artifacts and spectral distortions that limit practical utility.To address this,we propose an enhanced Super-Resolution Generative Adversarial Network(SRGAN)framework featuring three key innovations:(1)Replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing;(2)A multi-loss joint optimization strategy dynamically weighting Charbonnier loss(β=0.5),Visual Geometry Group(VGG)perceptual loss(α=1),and adversarial loss(γ=0.1)to synergize pixel-level accuracy and perceptual quality;(3)A multi-scale residual network(MSRN)capturing cross-scale texture features(e.g.,forest canopies,mountain contours).Validated on Sentinel-2(10 m)and SPOT-6/7(2.5 m)datasets covering 904 km2 in Motuo County,Xizang,our method outperforms the SRGAN baseline(SR4RS)with Peak Signal-to-Noise Ratio(PSNR)gains of 0.29 dB and Structural Similarity Index(SSIM)improvements of 3.08%on forest imagery.Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity(LPIPS)increases.The method significantly improves noise robustness and edge retention in complex geomorphology,demonstrating 18%faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring.Future work will integrate spectral constraints and lightweight architectures.
文摘Colorectal cancer(CRC)with lung oligometastases,particularly in the presence of extrapulmonary disease,poses considerable therapeutic challenges in clinical practice.We have carefully studied the multicenter study by Hu et al,which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation(IGTA).These findings provide valuable clinical evidence supporting IGTA as a feasible,minimally invasive approach and underscore the prognostic significance of metastatic distribution.However,the study by Hu et al has several limitations,including that not all pulmonary lesions were pathologically confirmed,postoperative follow-up mainly relied on dynamic contrast-enhanced computed tomography,no comparative analysis was performed with other local treatments,and the impact of other imaging features on efficacy and prognosis was not evaluated.Future studies should include complete pathological confirmation,integrate functional imaging and radiomics,and use prospective multicenter collaboration to optimize patient selection standards for IGTA treatment,strengthen its clinical evidence base,and ultimately promote individualized decision-making for patients with metastatic CRC.
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No.(DGSSR-2025-02-01295).
文摘Alzheimer’s Disease(AD)is a progressive neurodegenerative disorder that significantly affects cognitive function,making early and accurate diagnosis essential.Traditional Deep Learning(DL)-based approaches often struggle with low-contrast MRI images,class imbalance,and suboptimal feature extraction.This paper develops a Hybrid DL system that unites MobileNetV2 with adaptive classification methods to boost Alzheimer’s diagnosis by processing MRI scans.Image enhancement is done using Contrast-Limited Adaptive Histogram Equalization(CLAHE)and Enhanced Super-Resolution Generative Adversarial Networks(ESRGAN).A classification robustness enhancement system integrates class weighting techniques and a Matthews Correlation Coefficient(MCC)-based evaluation method into the design.The trained and validated model gives a 98.88%accuracy rate and 0.9614 MCC score.We also performed a 10-fold cross-validation experiment with an average accuracy of 96.52%(±1.51),a loss of 0.1671,and an MCC score of 0.9429 across folds.The proposed framework outperforms the state-of-the-art models with a 98%weighted F1-score while decreasing misdiagnosis results for every AD stage.The model demonstrates apparent separation abilities between AD progression stages according to the results of the confusion matrix analysis.These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis,contributing to improved computer-aided diagnosis(CAD)systems in clinical practice.
基金Project supported by the National Natural Science Foundation of China (Grant No.60502039), the Shanghai Rising-Star Program (Grant No.06QA14022), and the Key Project of Shanghai Municipality for Basic Research (Grant No.04JC14037)
文摘In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two SVs axe below a given threshold, the block is considered informative. A total of 12 features including statistics of brightness, color components and texture measures are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6 to give the final feature vectors. Relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors hence the blocks. Blocks falling into the same group show similar visual appearances.
基金Project supported by the OMRON and SJTU Collaborative Founda-tion under PVS project (2005.03~2005.10)
文摘This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected-components (CCs) by Niblack clustering algorithm. Then all the CCs including text CCs and non-text CCs are verified on their text features by a 2-stage classification module, where most non-text CCs are discarded by an attentional cascade classifier and remaining CCs are further verified by an SVM. All the accepted CCs are output to result in text only binary image. Experiments with many images in different scenes showed satisfactory performance of our proposed method.
基金The National Natural Science Foundation of China under contract No.61601133 and 41206172the Marine Application System of High Resolution Earth Observation System Major Project
文摘This paper develops a deep learning classification method with fully-connected 8-layers characteristics to classification of coastal wetland based on CHRIS hyperspectral image. The method combined spectral feature and multi-spatial texture feature information has been applied in the Huanghe(Yellow) River Estuary coastal wetland.The results show that:(1) Based on testing samples, the DCNN model combined spectral feature and texture feature after K-L transformation appear high classification accuracy, which is up to 99%.(2) The accuracy by using spectral feature with all the texture feature is lower than that using spectral only and combing spectral and texture feature after K-L transformation. The DCNN classification accuracy using spectral feature and texture feature after K-L transformation was up to 99.38%, and the outperformed that of all the texture feature by 4.15%.(3) The classification accuracy of the DCNN method achieves better performance than other methods based on the whole validation image, with an overall accuracy of 84.64% and the Kappa coefficient of 0.80.(4) The developed DCNN model classification algorithm ensured the accuracy of all types is more balanced, and it also greatly improved the accuracy of tidal flat and farmland, while kept the classification accuracy of main types almost invariant compared to the shallow algorithms. The classification accuracy of tidal flat and farmland is up to 79.26% and 56.72%respectively based on the DCNN model. And it improves by about 2.51% and 10.6% compared with that of the other shallow classification methods.
基金Foundation items:Shanghai Sailing Program,China (No. 21YF1401300)Shanghai Science and Technology Innovation Action Plan,China (No.19511101802)Fundamental Research Funds for the Central Universities,China (No.2232021D-25)。
文摘The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification(LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size(CSS), Massachusetts Institute of Technology(MIT) States and Fashion200K(+8.3%, +0.7% and +4.6% in R@1).
文摘The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
基金Supported by the Natural Science Foundation ofHubei Province(B200627003)
文摘Aiming at the binary text image's characteristics of simple pixel, complex texture and bad immunity of information concealment, a digital watermarking embedment location choosing method has been put forward based upon compatible roughness set. The method divides binary text image into different equivalent classes. Equivalent classes are further divided into different subclasses according to each pixel's degree and texture changes between blocks. Through properties' combination, the embedment block and location which are fit for watermarking are found out. At last, different binary text images are chosen for emulation experiment. After being embedded, the image is compressed in JPIG-2. Gaussian noise, salt & pepper noise are added and cutting is employed to imitate the actual environment in which images may suffer from various attacks and interferences. The result shows that the detector has a sound testing effect under various conditions.