Among the various polymer architectures, end-group-free rings have attracted growing interest because their physicochemical properties differ from those of their linear counterparts, as exemplified by reduced hydrodynamic size and slower degradation. Developing facile methods for the large-scale synthesis of polymer rings with tunable compositions and microstructures is therefore key. Recent progress in the large-scale synthesis of polymer rings against single-chain dynamic nanoparticles, together with example applications in simultaneously enhancing the toughness and strength of polymer nanocomposites, is summarized. Once breakthroughs are achieved in the rational design and effective large-scale synthesis of polymer rings and their functional derivatives, a family of cyclic functional hybrids would become available, providing a new paradigm for polymer science and engineering.
Digital watermarking technology plays an important role in detecting malicious tampering and protecting image copyright. In practical applications, however, it faces problems such as severe image distortion, inaccurate localization of tampered regions, and difficulty in recovering content. To address these shortcomings, a fragile image watermarking algorithm for blind tampering detection and content self-recovery is proposed. The multi-feature watermarking authentication code (AC) is constructed from the texture feature of local binary patterns (LBP), the DC (direct-current) coefficient of the discrete cosine transform (DCT), and the contrast feature of the gray level co-occurrence matrix (GLCM) for detecting the tampered region, and the recovery code (RC) is designed according to the average grayscale value of the pixels in each image block for recovering the tampered content. The optimal pixel adjustment process (OPAP) and least significant bit (LSB) algorithms are used to embed the recovery code and the authentication code into the image in a staggered manner. When the integrity of the image is checked, the authentication code comparison method and the threshold judgment method are used to perform two rounds of tampering detection and to blindly recover the tampered content. Experimental results show that the algorithm has good transparency, strong blind detection, and good self-recovery performance against four types of malicious attacks and some conventional signal processing operations. When resisting copy-paste, text addition, cropping, and vector quantization attacks at a tampering rate (TR) of 10%, the average tampering detection rate is up to 94.09%, and the peak signal-to-noise ratio (PSNR) of the watermarked image and of the recovered image exceed 41.47 dB and 40.31 dB, respectively, which demonstrates its advantages over other related algorithms of recent years.
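The LSB embedding and PSNR figures mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits OPAP, the LBP/DCT/GLCM authentication code, and the staggered layout, and the function names and sample values are hypothetical. It only shows why plain LSB substitution keeps the PSNR high, since each pixel changes by at most 1.

```python
import math

def embed_lsb(pixels, bits):
    """Write one payload bit into the least significant bit of each leading pixel."""
    if len(bits) > len(pixels):
        raise ValueError("more bits than pixels")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | (bit & 1)  # clear the LSB, then set it to the bit
    return out

def extract_lsb(pixels, count):
    """Read back the first `count` embedded bits."""
    return [p & 1 for p in pixels[:count]]

def psnr(original, modified, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, modified)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical 8-pixel grayscale block and 8-bit payload.
cover = [52, 55, 61, 66, 70, 61, 64, 73]
code = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(cover, code)
print(extract_lsb(marked, len(code)) == code, round(psnr(cover, marked), 2))
```

Because the mean squared error of single-bit LSB substitution cannot exceed 1, the PSNR of an 8-bit image stays above roughly 48 dB in this simplified setting.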
Images taken in dim environments frequently exhibit insufficient brightness, noise, color shifts, and loss of detail, all of which pose significant challenges for dark image enhancement. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on multi-scale multi-attention and a Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance structural and texture modeling. Second, to solve the misalignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluations in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information, owing to advances in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial detail pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers use multi-head self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is placed at the end of the encoder to further refine the encoder outputs, enhancing the key details extracted by the encoder and suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to model rich multiscale global and local information. We also design a stair fusion mechanism to progressively transmit semantic information from deep to shallow layers. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
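For readers unfamiliar with the mIoU metric quoted above, the following is a minimal sketch of how it is commonly computed from a pixel-level confusion matrix. The matrix values are made up for illustration, and benchmark-specific conventions (for example, which classes are excluded on ISPRS Potsdam or Vaihingen) are not modeled.

```python
def mean_iou(confusion):
    """confusion[i][j] = number of pixels with true class i predicted as class j."""
    n = len(confusion)
    ious = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k pixels that were missed
        fp = sum(confusion[i][k] for i in range(n)) - tp  # pixels wrongly labeled as class k
        denom = tp + fp + fn
        if denom:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Hypothetical two-class confusion matrix.
conf = [[50, 10],
        [5, 35]]
print(round(mean_iou(conf), 4))
```

Per-class IoU is TP / (TP + FP + FN), and mIoU is the unweighted mean over classes, so rare classes count as much as frequent ones.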
Driven by advances in mobile internet technology, images have become a crucial data medium, and ensuring the security of image information during transmission has emerged as an urgent challenge. This study proposes a novel image encryption algorithm designed for grayscale images. It introduces a new Cantor diagonal matrix permutation method, which uses row and column index sequences to control the Cantor diagonal matrix; these sequences are generated by a spatiotemporal chaotic system, the coupled map lattice (CML). The high sensitivity of the CML system to initial values makes the permutation method highly sensitive and secure. Additionally, leveraging fractal theory, this study introduces a chaotic fractal matrix, which exhibits self-similarity and irregularity, and applies it in the diffusion process. Using the Cantor diagonal matrix and the chaotic fractal matrix, this paper introduces a fast image encryption algorithm involving two diffusion steps and one permutation step. Moreover, the algorithm achieves robust security with only a single encryption round, ensuring high operational efficiency. Experimental results show that the proposed algorithm features a large key space, robust security, high sensitivity, high efficiency, and superior statistical properties for the ciphered images. The proposed algorithm thus not only provides a practical solution for secure image transmission but also bridges fractal theory and image encryption, opening new research avenues in chaotic cryptography and advancing the development of information security technology.
Reversible data hiding (RDH) enables secret data embedding while preserving complete recovery of the cover image, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used with multiple stego images provides good image quality but often results in low embedding capacity. To address this, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks whose pixels are sorted in ascending order. Four secret bits are embedded into each block's maximum pixel value, and three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data (4 + 3 bits on the maximum side and 4 + 3 bits on the minimum side). Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared with existing triple-stego RDH approaches, advancing the field of reversible steganography.
Remote sensing image super-resolution is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of the L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy that dynamically weights the Charbonnier loss (β = 0.5), the Visual Geometry Group (VGG) perceptual loss (α = 1), and the adversarial loss (γ = 0.1) to combine pixel-level accuracy with perceptual quality; (3) a multi-scale residual network (MSRN) that captures cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal increases in Learned Perceptual Image Patch Similarity (LPIPS). The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating an 18% faster response in forest fire early warning and providing high-resolution support for agricultural and urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
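The loss combination described in the abstract above can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: the weights α = 1, β = 0.5, and γ = 0.1 come from the abstract, but they are treated here as fixed constants (the abstract describes the weighting as dynamic), the smoothing constant `eps` is a hypothetical choice, and the perceptual and adversarial terms are passed in as plain numbers instead of being computed by VGG or a discriminator.

```python
import math

def charbonnier(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(r^2 + eps^2), a smooth approximation of L1."""
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)

def joint_loss(pixel, perceptual, adversarial, alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum: alpha * perceptual + beta * Charbonnier + gamma * adversarial."""
    return alpha * perceptual + beta * pixel + gamma * adversarial

# Hypothetical pixel values plus placeholder perceptual and adversarial terms.
pixel_term = charbonnier([0.2, 0.5, 0.9], [0.1, 0.5, 1.0])
print(joint_loss(pixel_term, perceptual=0.4, adversarial=1.0))
```

For large residuals the Charbonnier penalty behaves like the absolute error, while near zero it stays differentiable, which is the usual motivation for preferring it over plain L1 or L2.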
Existing single-pixel imaging (SPI) and sensing techniques suffer from poor reconstruction quality and heavy computational cost, limiting their widespread application. To tackle these challenges, we propose a large-scale single-pixel imaging and sensing (SPIS) technique that enables high-quality megapixel SPI and highly efficient image-free sensing at a low sampling rate. Specifically, we first scan and sample the entire scene using small-sized optimized patterns to obtain information-coupled measurements. Compared with conventional full-sized patterns, small-sized optimized patterns achieve higher imaging fidelity and sensing accuracy with one order of magnitude fewer pattern parameters. Next, the coupled measurements are processed by a transformer-based encoder to extract high-dimensional features, followed by a task-specific plug-and-play decoder for imaging or image-free sensing. Because regions with rich textures and edges are more difficult to reconstruct, we use an uncertainty-driven self-adaptive loss function to reinforce the network's attention to these regions, thereby improving imaging and sensing performance. Extensive experiments demonstrate that the reported technique achieves 24.13 dB megapixel SPI at a sampling rate of 3% within 1 s. In terms of sensing, it outperforms existing methods by 12% in image-free segmentation accuracy and achieves state-of-the-art image-free object detection accuracy with an order of magnitude less data bandwidth.
Colorectal cancer (CRC) with lung oligometastases, particularly in the presence of extrapulmonary disease, poses considerable therapeutic challenges in clinical practice. We have carefully reviewed the multicenter study by Hu et al., which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation (IGTA). Its findings provide valuable clinical evidence supporting IGTA as a feasible, minimally invasive approach and underscore the prognostic significance of metastatic distribution. However, the study has several limitations: not all pulmonary lesions were pathologically confirmed, postoperative follow-up relied mainly on dynamic contrast-enhanced computed tomography, no comparative analysis was performed with other local treatments, and the impact of other imaging features on efficacy and prognosis was not evaluated. Future studies should include complete pathological confirmation, integrate functional imaging and radiomics, and use prospective multicenter collaboration to optimize patient selection standards for IGTA, strengthen its clinical evidence base, and ultimately promote individualized decision-making for patients with metastatic CRC.
Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer's diagnosis from MRI scans. Images are enhanced using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To make classification more robust, the design integrates class weighting and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model reaches 98.88% accuracy and an MCC of 0.9614. A 10-fold cross-validation experiment yielded an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC of 0.9429 across folds. The proposed framework outperforms state-of-the-art models with a 98% weighted F1-score while reducing misdiagnoses for every AD stage. The confusion matrix analysis shows that the model clearly separates the stages of AD progression. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer's diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
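As background for the MCC values reported above, here is a minimal sketch of the standard multiclass generalization of the Matthews Correlation Coefficient, computed from a confusion matrix. The abstract does not state which MCC variant or library was used, so this is an assumption, and the example matrices are made up.

```python
import math

def multiclass_mcc(confusion):
    """MCC from a square confusion matrix (rows: true class, columns: predicted class)."""
    n = len(confusion)
    s = sum(sum(row) for row in confusion)             # total number of samples
    c = sum(confusion[k][k] for k in range(n))         # correctly classified samples
    t = [sum(confusion[k]) for k in range(n)]          # true count per class
    p = [sum(confusion[i][k] for i in range(n)) for k in range(n)]  # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = math.sqrt((s * s - sum(pk * pk for pk in p)) *
                    (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0

# Hypothetical confusion matrices: one imperfect binary case, one perfect 3-class case.
print(round(multiclass_mcc([[3, 1], [2, 6]]), 4))
print(multiclass_mcc([[5, 0, 0], [0, 4, 0], [0, 0, 3]]))
```

In the two-class case this expression reduces to the familiar (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), which is why MCC is often preferred over accuracy when classes are imbalanced.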
Owing to advances in satellite and sensor technology, the number and size of Remote Sensing (RS) images continue to grow at a rapid pace. The continuous stream of sensor data from satellites poses major challenges for retrieving relevant information from those data streams. The Bag-of-Words (BoW) framework is a leading image search approach that has been applied successfully to a broad range of computer vision problems and has therefore received much attention from the RS community. However, the recognition performance of a typical BoW framework becomes very poor in application scenarios where the appearance and texture of the images are very similar. In this paper, we propose a simple method to improve the recognition performance of a typical BoW framework by representing images with local features extracted from base images. In addition, we propose a similarity measure for RS images that counts the number of identical words assigned to the images. We compare the performance of these methods with a typical BoW framework. Our experiments show that the proposed method achieves better recognition performance than BoW and requires less storage space for saving local invariant features.
Video captioning is the task of assigning complex high-level semantic descriptions (e.g., sentences or paragraphs) to video data. Unlike earlier video analysis techniques such as video annotation, video event detection, and action recognition, video captioning is much closer to human cognition, with a smaller semantic gap. However, the scarcity of captioned video data severely limits its development. In this paper, we propose a novel video captioning approach that describes videos by leveraging a freely available image corpus with abundant literal knowledge. Our approach has two key aspects: 1) an effective integration strategy bridging videos and images, and 2) high efficiency in handling ever-increasing training data. To achieve these goals, we adopt sophisticated visual hashing techniques to efficiently index and search large-scale images for relevant captions, an approach that extends readily to evolving data and the corresponding semantics. Extensive experimental results on various real-world visual datasets show the effectiveness of our approach with different hashing techniques, e.g., LSH (locality-sensitive hashing), PCA-ITQ (principal component analysis iterative quantization), and supervised discrete hashing, as compared with state-of-the-art methods. It is worth noting that the empirical computational cost of our approach is much lower than that of an existing method: it takes 1/256 of the memory requirement and 1/64 of the time cost of the method of Devlin et al.
Unmanned aerial vehicle (UAV)-based imaging systems have many advantages over other platforms, such as high flexibility and low cost in collecting images, and therefore have wide application prospects. However, UAV-based acquisition commonly produces very high-resolution and very large-scale images, which poses great challenges for subsequent applications. An efficient representation of large-scale UAV images is therefore necessary to extract the required information in a reasonable time. In this work, we propose a multi-scale hierarchical representation, the binary partition tree, for analyzing large-scale UAV images. More precisely, we first obtain an initial partition of the images with an oversegmentation algorithm, simple linear iterative clustering. Next, we merge similar superpixels to build an object-based hierarchical structure by fully considering the spectral and spatial information of the superpixels and their topological relationships. Objects of interest and an optimal segmentation are then obtained by applying object-based analysis methods to the hierarchical structure. Experimental results on post-seismic UAV images of the 2013 Ya'an earthquake and on a mosaic of images from the south-west of Munich demonstrate the effectiveness and efficiency of the proposed method.
In this paper, an improved optical flow method for image registration is proposed. Its novelty is that it augments the optical flow method with an initial motion estimator, the extended phase correlation technique (EPCT), using the merits of the latter to compensate for the deficiencies of the former. In more detail, the optical flow method can reach sub-pixel accuracy and handle complex distortion patterns such as chirping and tilting, but it is weak with large-scale movements. Because EPCT measures large translations and rotations with pixel-level accuracy and has a low computational load, it is a good initial motion estimator for the optical flow method. Tests show that the improved method significantly enhances registration performance, especially for images with large-scale movements, and is robust against random noise.
Image captioning is a high-level task in the area of image understanding, in which most models adopt a convolutional neural network (CNN) to extract image features and a recurrent neural network (RNN) to generate sentences. In recent years, researchers have tended to design complex networks with deeper layers to improve feature extraction. Increasing the size of the network can yield high-quality features, but it is inefficient in terms of computational cost. The large number of parameters introduced by the CNN makes the research difficult to apply in daily life. To reduce the information loss of the convolutional process at lower cost, we propose a lightweight convolutional neural network named Bifurcate-CNN (B-CNN). Furthermore, whereas recent works are devoted to generating captions in English, in this paper we develop an image captioning model that generates descriptions in Chinese. Compared with Inception-v3, our model is shallower, has fewer parameters, and has a lower computational cost. Evaluated on the AI CHALLENGER dataset, our model improves BLEU-4 from 46.1 to 49.9 and CIDEr from 142.5 to 156.6.
Image-based 3D modeling is an effective method for reconstructing large-scale scenes, especially city-level scenarios. In the image-based modeling pipeline, obtaining a watertight mesh model from a noisy multi-view stereo point cloud is a key step toward ensuring model quality. However, some state-of-the-art methods rely on a global Delaunay-based optimization formed by all the points and cameras, and thus encounter scaling problems when dealing with large scenes. To circumvent these limitations, this study proposes a scalable point-cloud meshing approach to aid the reconstruction of city-scale scenes with minimal time consumption and memory usage. First, the entire scene is divided along the x and y axes into several overlapping chunks so that each chunk satisfies the memory limit. Then, the Delaunay-based optimization is performed to extract a mesh for each chunk in parallel. Finally, the local meshes are merged by resolving local inconsistencies in the overlapping areas between chunks. We test the proposed method on three city-scale scenes with hundreds of millions of points and thousands of images, and demonstrate its scalability, accuracy, and completeness in comparison with state-of-the-art methods.
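The first step described above, dividing the scene along the x and y axes into overlapping chunks, can be sketched as below. This is only an illustration under stated assumptions: the abstract sizes chunks by a memory limit, whereas this sketch uses a fixed hypothetical `chunk_size` and `overlap`, and the function name is invented for the example.

```python
def split_into_chunks(min_xy, max_xy, chunk_size, overlap):
    """Tile a 2D bounding box with (x0, y0, x1, y1) boxes, each grown by `overlap`."""
    (xmin, ymin), (xmax, ymax) = min_xy, max_xy
    boxes = []
    y = ymin
    while y < ymax:
        x = xmin
        while x < xmax:
            boxes.append((
                max(x - overlap, xmin),               # extend left, clamped to the scene
                max(y - overlap, ymin),               # extend down, clamped to the scene
                min(x + chunk_size + overlap, xmax),  # extend right, clamped to the scene
                min(y + chunk_size + overlap, ymax),  # extend up, clamped to the scene
            ))
            x += chunk_size
        y += chunk_size
    return boxes

# Hypothetical 100 x 50 scene, 50-unit chunks, 5-unit overlap.
chunks = split_into_chunks((0, 0), (100, 50), chunk_size=50, overlap=5)
print(chunks)
```

Neighboring boxes share a strip of width 2 × `overlap`, which is the region where per-chunk meshes would later need to be reconciled.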
With growing computing capability, large-scale simulations in aerodynamics have been generating massive amounts of data. Sort-last parallel rendering is the most classical image compositing method for large-scale scientific visualization. However, in the image compositing stage, the sort-last method may suffer from scalability problems on large numbers of processors. Existing image compositing algorithms tend to perform well only in certain situations. For instance, Direct Send performs well at small and medium scales, and Radix-k performs well only when the k-value is appropriate. In this paper, we propose a novel method named mSwap for scientific visualization in aerodynamics, which selects the best scale of processors to keep its performance at its best. mSwap groups the available processors using an (m, k) table, which records the best combination of m (the number of processors in each subgroup of a group) and k (the number of processors in each group). Within each group, an m-ary tree is used to composite the image, reducing communication among processors. Finally, the image is composited across the groups to generate the final image. The performance and scalability of mSwap are demonstrated through experiments with thousands of processors.
1. Introduction. Climate change mitigation pathways aimed at limiting global anthropogenic carbon dioxide (CO₂) emissions while striving to constrain the global temperature increase to below 2 °C, as outlined by the Intergovernmental Panel on Climate Change (IPCC), consistently predict the widespread implementation of CO₂ geological storage on a global scale.
The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the role of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in classifying igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study focuses primarily on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques such as Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, all improvements over the baseline results obtained on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, also an improvement. Conversely, methods such as Distort decrease accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates the adoption of DL methods in this domain for automation and improved results. The findings can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
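To make the Equalize augmentation mentioned above more concrete, the following is a minimal sketch of textbook histogram equalization on 8-bit grayscale values. The abstract does not name the augmentation library or its exact formula, so this is an assumed, generic version; the function name and sample values are hypothetical.

```python
def equalize(pixels, levels=256):
    """Spread grayscale values over the full range using the cumulative histogram."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest occupied level
    n = len(pixels)
    if n == cdf_min:                         # constant image: nothing to stretch
        return list(pixels)
    lut = [round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1)) for c in cdf]
    return [lut[v] for v in pixels]

# Hypothetical low-contrast patch: values crowded into 100..103.
patch = [100, 100, 101, 102, 103, 103]
print(equalize(patch))
```

The darkest occupied level maps to 0 and the brightest to 255, so a low-contrast thin-section patch is stretched across the full intensity range while the ordering of gray levels is preserved.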
Funding (polymer ring synthesis abstract): Supported by the National Natural Science Foundation of China (Nos. 52293472, 22473096 and 22471164).
Funding (fragile image watermarking abstract): Supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (Grant No. SJCX24_1332), the Jiangsu Province Education Science Planning Project in 2024 (Grant No. B-b/2024/01/122), and the High-Level Talent Scientific Research Foundation of Jinling Institute of Technology, China (Grant No. jit-b-201918).
Funding: Funded by the National Natural Science Foundation of China, grant numbers 52374156 and 62476005.
Abstract: Images taken in dim environments frequently exhibit issues such as insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on multi-scale multi-attention and the Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance structural and texture modeling capabilities. Second, to solve the misalignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluations in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Funding: Provided by the Science Research Project of Hebei Education Department under Grant No. BJK2024115.
Abstract: High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multihead self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is placed at the end of the encoder to further refine the encoder outputs, thereby enhancing the key details extracted by the encoder and suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to progressively transmit semantic information from deep to shallow layers. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
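For readers unfamiliar with the mIoU metric reported above, a minimal sketch of its usual definition follows. The confusion matrix `toy` is made-up data, not a result from the paper, and the convention of skipping classes that never occur is an assumption.

```python
def mean_iou(confusion):
    """Mean Intersection over Union from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(confusion)
    ious = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # true k, predicted other
        fp = sum(confusion[r][k] for r in range(n)) - tp     # predicted k, true other
        denom = tp + fp + fn
        if denom > 0:                                        # skip classes that never appear
            ious.append(tp / denom)
    return sum(ious) / len(ious)


toy = [[50, 10],
       [5, 35]]
score = mean_iou(toy)
```

Each class contributes equally to the mean regardless of how many pixels it covers, so rare classes weigh as much as dominant ones.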
Funding: Supported by the National Natural Science Foundation of China (62376106) and the Science and Technology Development Plan of Jilin Province (20250102212JC).
Abstract: Driven by advancements in mobile internet technology, images have become a crucial data medium. Ensuring the security of image information during transmission has thus emerged as an urgent challenge. This study proposes a novel image encryption algorithm specifically designed for grayscale image security. It introduces a new Cantor diagonal matrix permutation method, which uses row and column index sequences to control the Cantor diagonal matrix; these index sequences are generated by a spatiotemporal chaotic system named the coupled map lattice (CML). The high initial-value sensitivity of the CML system makes the permutation method highly sensitive and secure. Additionally, leveraging fractal theory, this study introduces a chaotic fractal matrix and applies it in the diffusion process. This chaotic fractal matrix exhibits self-similarity and irregularity. Using the Cantor diagonal matrix and the chaotic fractal matrix, this paper introduces a fast image encryption algorithm involving two diffusion steps and one permutation step. Moreover, the algorithm achieves robust security with only a single encryption round, ensuring high operational efficiency. Experimental results show that the proposed algorithm features an expansive key space, robust security, high sensitivity, high efficiency, and superior statistical properties for the ciphered images. Thus, the proposed algorithm not only provides a practical solution for secure image transmission but also bridges fractal theory with image encryption techniques, thereby opening new research avenues in chaotic cryptography and advancing the development of information security technology.
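The abstract above derives permutation index sequences from a coupled map lattice. The sketch below shows one common way to do that: a ring-coupled logistic lattice whose final states are ranked to form a permutation. It does not reproduce the Cantor diagonal matrix method. The parameters `mu`, `eps`, the seed, the initialisation rule, and the iteration count are all illustrative assumptions.

```python
def logistic(x, mu=3.99):
    """Logistic map used as the local dynamics of the lattice."""
    return mu * x * (1.0 - x)


def cml_states(seed, n_sites, n_steps, eps=0.1):
    """Iterate a ring-coupled logistic lattice and return the final site states."""
    # Hypothetical initialisation: spread the seed over the lattice sites.
    state = [(seed + i / n_sites) % 1.0 for i in range(n_sites)]
    for _ in range(n_steps):
        f = [logistic(x) for x in state]
        state = [
            (1.0 - eps) * f[i] + (eps / 2.0) * (f[i - 1] + f[(i + 1) % n_sites])
            for i in range(n_sites)
        ]
    return state


def index_sequence(seed, n):
    """Rank the chaotic states to obtain a permutation of range(n)."""
    states = cml_states(seed, n, n_steps=100)
    return sorted(range(n), key=lambda i: states[i])


row = [10, 20, 30, 40, 50, 60, 70, 80]       # toy pixel row
perm = index_sequence(0.123456, len(row))
scrambled = [row[p] for p in perm]

# Decryption regenerates the same permutation from the key and inverts it.
restored = [0] * len(row)
for j, p in enumerate(perm):
    restored[p] = scrambled[j]
```

The seed plays the role of the key here: the receiver must reproduce exactly the same floating-point iteration to rebuild `perm`.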
Funding: Funded by the University of Transport and Communications (UTC) under grant number T2025-CN-004.
Abstract: Reversible data hiding (RDH) enables secret data embedding while preserving complete cover image recovery, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used in multi-stego images provides good image quality but often results in low embedding capacity. To address this limitation, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order. Four secret bits are embedded into each block’s maximum pixel value, while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is also applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data. Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches, advancing the field of reversible steganography.
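As background for the scheme above, here is a minimal sketch of classic single-image PVO embedding on the maximum side of one block, which carries at most one bit per block. It is not the triple-stego, 14-bit scheme the abstract describes. Function names are illustrative, and overflow handling at pixel value 255 is deliberately omitted.

```python
def _max_pair(block):
    """Indices of the largest and second-largest pixels (ties broken by position)."""
    order = sorted(range(len(block)), key=lambda i: (block[i], i))
    return order[-1], order[-2]


def pvo_embed_max(block, bit):
    """Return (stego_block, embedded) where embedded says whether the bit was used."""
    i_max, i_second = _max_pair(block)
    diff = block[i_max] - block[i_second]
    out = list(block)
    if diff == 1:            # embeddable: difference becomes 1 (bit 0) or 2 (bit 1)
        out[i_max] += bit
        return out, True
    if diff > 1:             # not embeddable: shift by one so extraction stays unambiguous
        out[i_max] += 1
    return out, False        # diff == 0: block left untouched


def pvo_extract_max(stego):
    """Return (restored_block, bit), with bit None when the block carried no data."""
    i_max, i_second = _max_pair(stego)
    diff = stego[i_max] - stego[i_second]
    out = list(stego)
    if diff == 1:
        return out, 0
    if diff == 2:
        out[i_max] -= 1
        return out, 1
    if diff > 2:             # undo the shift
        out[i_max] -= 1
    return out, None


cover = [10, 12, 11, 9]
stego, used = pvo_embed_max(cover, 1)
restored, bit = pvo_extract_max(stego)
```

Only the maximum pixel is ever modified and it only grows, so the pixel ordering inside the block is preserved and the decoder can locate the same pixel.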
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03); the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04); and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of the L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting the Charbonnier loss (β = 0.5), the Visual Geometry Group (VGG) perceptual loss (α = 1), and the adversarial loss (γ = 0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural and urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
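The loss design above can be illustrated with a short sketch. It uses the standard Charbonnier form sqrt((x - y)^2 + eps^2) and combines three loss values with the weights quoted in the abstract (α = 1, β = 0.5, γ = 0.1). The value of `eps`, the function names, and the use of fixed rather than dynamically adjusted weights are assumptions. The perceptual and adversarial terms are passed in as plain numbers because computing them needs a trained network.

```python
import math


def charbonnier(pred, target, eps=1e-3):
    """Mean Charbonnier loss: a smooth, outlier-tolerant variant of the L1 loss."""
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def joint_loss(l_perceptual, l_charbonnier, l_adversarial,
               alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum of the three loss terms (fixed weights in this sketch)."""
    return alpha * l_perceptual + beta * l_charbonnier + gamma * l_adversarial


pred = [0.10, 0.50, 0.90]
target = [0.10, 0.40, 0.95]
pixel_term = charbonnier(pred, target)
# The perceptual and adversarial values below are placeholders, not measured losses.
total = joint_loss(l_perceptual=0.8, l_charbonnier=pixel_term, l_adversarial=0.3)
```

For large errors the Charbonnier term grows linearly like L1, while near zero it stays differentiable because of `eps`, which is the usual motivation for preferring it over plain L1 or L2.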
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62322502, 62131003, and 62088101) and the Guangdong Province Key Laboratory of Intelligent Detection in Complex Environment of Aerospace, Land and Sea (Grant No. 2022KSYS016).
Abstract: Existing single-pixel imaging (SPI) and sensing techniques suffer from poor reconstruction quality and heavy computation cost, limiting their widespread application. To tackle these challenges, we propose a large-scale single-pixel imaging and sensing (SPIS) technique that enables high-quality megapixel SPI and highly efficient image-free sensing with a low sampling rate. Specifically, we first scan and sample the entire scene using small-sized optimized patterns to obtain information-coupled measurements. Compared with conventional full-sized patterns, small-sized optimized patterns achieve higher imaging fidelity and sensing accuracy with one order of magnitude fewer pattern parameters. Next, the coupled measurements are processed through a transformer-based encoder to extract high-dimensional features, followed by a task-specific plug-and-play decoder for imaging or image-free sensing. Considering that regions with rich textures and edges are more difficult to reconstruct, we use an uncertainty-driven self-adaptive loss function to reinforce the network’s attention to these regions, thereby improving imaging and sensing performance. Extensive experiments demonstrate that the reported technique achieves 24.13 dB megapixel SPI at a sampling rate of 3% within 1 s. In terms of sensing, it outperforms existing methods by 12% on image-free segmentation accuracy and achieves state-of-the-art image-free object detection accuracy with an order of magnitude less data bandwidth.
Abstract: Colorectal cancer (CRC) with lung oligometastases, particularly in the presence of extrapulmonary disease, poses considerable therapeutic challenges in clinical practice. We have carefully studied the multicenter study by Hu et al., which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation (IGTA). Its findings provide valuable clinical evidence supporting IGTA as a feasible, minimally invasive approach and underscore the prognostic significance of metastatic distribution. However, the study by Hu et al. has several limitations: not all pulmonary lesions were pathologically confirmed, postoperative follow-up relied mainly on dynamic contrast-enhanced computed tomography, no comparative analysis was performed with other local treatments, and the impact of other imaging features on efficacy and prognosis was not evaluated. Future studies should include complete pathological confirmation, integrate functional imaging and radiomics, and use prospective multicenter collaboration to optimize patient selection standards for IGTA treatment, strengthen its clinical evidence base, and ultimately promote individualized decision-making for patients with metastatic CRC.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under Grant No. DGSSR-2025-02-01295.
Abstract: Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer’s diagnosis from MRI scans. Image enhancement is performed using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To make classification more robust, the design integrates class weighting techniques and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model achieves 98.88% accuracy and an MCC of 0.9614. We also performed a 10-fold cross-validation experiment, obtaining an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC of 0.9429 across folds. The proposed framework outperforms state-of-the-art models with a 98% weighted F1-score while reducing misdiagnoses for every AD stage. Confusion matrix analysis shows that the model separates the AD progression stages clearly. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
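Since the abstract above reports MCC alongside accuracy, a minimal sketch of the multiclass MCC computed from a confusion matrix may help. The formula is the standard multiclass generalisation, which reduces to the familiar binary expression for two classes. The example matrices are toy data, not results from the paper, and returning 0.0 for a degenerate denominator is a convention assumed here.

```python
import math


def mcc(confusion):
    """Multiclass Matthews Correlation Coefficient.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    s = sum(sum(row) for row in confusion)                     # total samples
    c = sum(confusion[k][k] for k in range(n))                 # correct predictions
    t = [sum(confusion[k]) for k in range(n)]                  # true count per class
    p = [sum(confusion[r][k] for r in range(n)) for k in range(n)]  # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    return numerator / denominator if denominator else 0.0


balanced = [[3, 1],
            [1, 3]]
score = mcc(balanced)
```

Unlike accuracy, MCC stays near zero for a classifier that always predicts the majority class, which makes it a useful companion metric under class imbalance.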
Abstract: Due to advances in satellite and sensor technology, the number and size of Remote Sensing (RS) images continue to grow at a rapid pace. The continuous stream of sensor data from satellites poses major challenges for retrieving relevant information from those data streams. The Bag-of-Words (BoW) framework is a leading image search approach that has been successfully applied to a broad range of computer vision problems and has hence received much attention from the RS community. However, the recognition performance of a typical BoW framework becomes very poor in application scenarios where the appearance and texture of images are very similar. In this paper, we propose a simple method to improve the recognition performance of a typical BoW framework by representing images with local features extracted from base images. In addition, we propose a similarity measure for RS images that counts the number of identical words assigned to the images. We compare the performance of these methods with a typical BoW framework. Our experiments show that the proposed method has better recognition performance than the typical BoW framework and requires less storage space for saving local invariant features.
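The similarity measure described above counts visual words shared by two images. A minimal sketch of one plausible reading follows: each image is a list of visual-word IDs, and the score is the size of the multiset intersection. Whether the authors count repeated words with multiplicity is not stated, so that choice, the normalised variant, and the toy word lists are assumptions.

```python
from collections import Counter


def shared_word_count(words_a, words_b):
    """Number of visual words common to both images, counted with multiplicity."""
    common = Counter(words_a) & Counter(words_b)   # per-word minimum of the two counts
    return sum(common.values())


def normalised_similarity(words_a, words_b):
    """Hypothetical normalisation to [0, 1] by the larger word list."""
    longest = max(len(words_a), len(words_b))
    return shared_word_count(words_a, words_b) / longest if longest else 0.0


image_a = [3, 3, 7, 12, 12, 12]    # visual-word IDs assigned to local features
image_b = [3, 7, 7, 12, 12, 40]
count = shared_word_count(image_a, image_b)
similarity = normalised_similarity(image_a, image_b)
```

Storing only word IDs rather than full descriptors is also what keeps the storage footprint small in this kind of representation.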
Funding: This work was partially supported by the National Basic Research 973 Program of China under Grant No. 2014CB347600, the National Natural Science Foundation of China under Grant Nos. 61522203, 61572108, 61632007, and 61502081, the National Ten-Thousand Talents Program of China (Young Top-Notch Talent), the National Thousand Young Talents Program of China, the Fundamental Research Funds for the Central Universities of China under Grant Nos. ZYGX2014Z007 and ZYGX2015J055, and the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20140058.
Abstract: Video captioning is the task of assigning complex high-level semantic descriptions (e.g., sentences or paragraphs) to video data. Unlike previous video analysis techniques such as video annotation, video event detection, and action recognition, video captioning is much closer to human cognition, with a smaller semantic gap. However, the scarcity of captioned video data severely limits the development of video captioning. In this paper, we propose a novel video captioning approach that describes videos by leveraging freely available image corpora with abundant literal knowledge. There are two key aspects of our approach: 1) an effective integration strategy bridging videos and images, and 2) high efficiency in handling ever-increasing training data. To achieve these goals, we adopt sophisticated visual hashing techniques to efficiently index and search large-scale images for relevant captions, which extends readily to evolving data and the corresponding semantics. Extensive experimental results on various real-world visual datasets show the effectiveness of our approach with different hashing techniques, e.g., LSH (locality-sensitive hashing), PCA-ITQ (principal component analysis iterative quantization), and supervised discrete hashing, as compared with state-of-the-art methods. It is worth noting that the empirical computational cost of our approach is much lower than that of an existing method, i.e., it takes 1/256 of the memory requirement and 1/64 of the time cost of the method of Devlin et al.
Funding: This work was supported in part by the National Key Basic Research and Development Program of China (grant number 2013CB733404) and the National Natural Science Foundation of China (grant numbers 61271401 and 91338113).
Abstract: Unmanned aerial vehicle (UAV)-based imaging systems have many advantages over other platforms, such as high flexibility and low cost in collecting images, which gives them wide application prospects. However, UAV-based acquisition commonly produces very high resolution and very large-scale images, which poses great challenges for subsequent applications. Therefore, an efficient representation of large-scale UAV images is necessary for extracting the required information in a reasonable time. In this work, we propose a multi-scale hierarchical representation, i.e., a binary partition tree, for analyzing large-scale UAV images. More precisely, we first obtain an initial partition of the images with an oversegmentation algorithm, i.e., simple linear iterative clustering. Next, we merge similar superpixels to build an object-based hierarchical structure by fully considering the spectral and spatial information of the superpixels and their topological relationships. Moreover, objects of interest and an optimal segmentation are obtained using object-based analysis methods on the hierarchical structure. Experimental results on the post-seismic UAV images of the 2013 Ya’an earthquake and on a mosaic of images of the south-west of Munich demonstrate the effectiveness and efficiency of our proposed method.
Abstract: In this paper, an improved optical flow method for image registration is proposed. Its novelty is that it augments the optical flow method with an initial motion estimator, the extended phase correlation technique (EPCT), using the merits of the latter to compensate for the deficiencies of the former. In more detail, the optical flow method can reach sub-pixel accuracy and handle complex distortion patterns such as chirping and tilting, but it is weak with large-scale movements. Because EPCT measures large translations and rotations with pixel-level accuracy and has a low computational load, it serves as a good initial motion estimator for the optical flow method. Tests show that the improved method significantly enhances registration performance, especially for images with large-scale movements, and is robust against random noise.
Funding: Supported by the National Natural Science Foundation of China (No. 61571328).
Abstract: Image captioning is a high-level task in the area of image understanding, in which most models adopt a convolutional neural network (CNN) to extract image features and a recurrent neural network (RNN) to generate sentences. In recent years, researchers have tended to design complex networks with deeper layers to improve feature extraction. Increasing the size of the network can yield high-quality features, but it is not efficient in terms of computational cost. The large number of parameters in such CNNs makes the research difficult to apply in daily life. To reduce the information loss of the convolutional process at lower cost, we propose a lightweight convolutional neural network named Bifurcate-CNN (B-CNN). Furthermore, whereas recent works are devoted to generating captions in English, in this paper we develop an image captioning model that generates descriptions in Chinese. Compared with Inception-v3, our model is shallower, has fewer parameters, and has a lower computational cost. Evaluated on the AI CHALLENGER dataset, our model improves BLEU-4 from 46.1 to 49.9 and CIDEr from 142.5 to 156.6.
Funding: This work was supported by the Natural Science Foundation of China (Nos. 61632003 and 61873265).
Abstract: Image-based 3D modeling is an effective method for reconstructing large-scale scenes, especially city-level scenarios. In the image-based modeling pipeline, obtaining a watertight mesh model from a noisy multi-view stereo point cloud is a key step toward ensuring model quality. However, some state-of-the-art methods rely on a global Delaunay-based optimization formed by all the points and cameras; thus, they encounter scaling problems when dealing with large scenes. To circumvent these limitations, this study proposes a scalable point-cloud meshing approach to aid the reconstruction of city-scale scenes with minimal time consumption and memory usage. First, the entire scene is divided along the x and y axes into several overlapping chunks so that each chunk satisfies the memory limit. Then, the Delaunay-based optimization is performed to extract meshes for each chunk in parallel. Finally, the local meshes are merged by resolving local inconsistencies in the overlapping areas between the chunks. We test the proposed method on three city-scale scenes with hundreds of millions of points and thousands of images, and demonstrate its scalability, accuracy, and completeness compared with state-of-the-art methods.
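The first step of the pipeline above, dividing the scene along x and y into overlapping chunks, can be sketched as follows. This is only an illustration: the abstract sizes chunks to fit a memory limit, whereas this sketch takes the chunk counts `nx`, `ny` and the `overlap` margin as given inputs, and all names and numbers are hypothetical.

```python
def split_axis(lo, hi, n_chunks, overlap):
    """Split [lo, hi] into n_chunks equal intervals, each grown by `overlap` on both sides."""
    step = (hi - lo) / n_chunks
    return [
        (max(lo, lo + i * step - overlap), min(hi, lo + (i + 1) * step + overlap))
        for i in range(n_chunks)
    ]


def split_scene(bbox, nx, ny, overlap):
    """Return chunk bounding boxes (x0, y0, x1, y1) covering the scene bounding box."""
    x_min, y_min, x_max, y_max = bbox
    xs = split_axis(x_min, x_max, nx, overlap)
    ys = split_axis(y_min, y_max, ny, overlap)
    return [(x0, y0, x1, y1) for (x0, x1) in xs for (y0, y1) in ys]


def points_in_chunk(points, chunk):
    """Select the (x, y, z) points falling inside one chunk; border points land in several chunks."""
    x0, y0, x1, y1 = chunk
    return [p for p in points if x0 <= p[0] <= x1 and y0 <= p[1] <= y1]


chunks = split_scene((0.0, 0.0, 100.0, 60.0), nx=2, ny=2, overlap=5.0)
cloud = [(10.0, 10.0, 1.0), (50.0, 30.0, 2.0), (90.0, 55.0, 3.0)]
per_chunk = [points_in_chunk(cloud, c) for c in chunks]
```

Points in the overlap bands belong to more than one chunk, which is what gives the later merging step shared geometry to reconcile.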
Funding: Supported by the National Numerical Windtunnel Project and, in part, by the National Natural Science Foundation of China under Grant No. 61702360.
Abstract: With increasing computing power, large-scale simulations have been generating massive amounts of data in aerodynamics. Sort-last parallel rendering is the most classical approach for large-scale scientific visualization. However, in the image compositing stage, the sort-last method may suffer from scalability problems on large numbers of processors. Existing image compositing algorithms tend to perform well only in certain situations. For instance, Direct Send works well at small and medium scales, and Radix-k performs well only when the k-value is appropriate. In this paper, we propose a novel method named mSwap for scientific visualization in aerodynamics, which selects the best processor scale to keep its performance optimal. mSwap groups the available processors using an (m, k) table, which records the best combination of m (the number of processors in each subgroup of a group) and k (the number of processors in each group). Then, within each group, an m-ary tree is used to composite the image, reducing communication between processors. Finally, the image is composited across the different groups to generate the final image. The performance and scalability of our mSwap method are demonstrated through experiments with thousands of processors.
Funding: Supported by the National Key Research and Development Program of China (2022YFE0206700).
Abstract: 1. Introduction. Climate change mitigation pathways aimed at limiting global anthropogenic carbon dioxide (CO₂) emissions while striving to constrain the global temperature increase to below 2 °C, as outlined by the Intergovernmental Panel on Climate Change (IPCC), consistently predict the widespread implementation of CO₂ geological storage on a global scale.
Abstract: The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study focuses primarily on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques such as Equalize significantly enhance the model's classification capabilities, achieving an F1-score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, all improvements over the baseline results on the original images. Moreover, the weighted average F1-score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods such as Distort lead to decreased accuracy and F1-score, with an F1-score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates the adoption of DL methods in this domain for automation and improved results. The findings can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.