Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics, offering critical insights into bacterial physiology and pathogenicity. Image segmentation techniques enable quantitative analysis of bacterial structures, facilitating precise measurement of morphological variations and population behaviors at single-cell resolution. This paper reviews advancements in bacterial image segmentation, emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches. Convolutional neural networks (CNNs), U-Net architectures, and three-dimensional (3D) frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes. These methods combine automated feature extraction with physics-informed postprocessing. Despite progress, challenges persist in computational efficiency, cross-species generalizability, and integration with multimodal experimental workflows. Future progress will depend on improving model robustness across species and imaging modalities, integrating multimodal data for phenotype-function mapping, and developing standard pipelines that link computational tools with clinical diagnostics. These innovations will expand microbial phenotyping beyond structural analysis, enabling deeper insights into bacterial physiology and ecological interactions.
Colorectal cancer (CRC) with lung oligometastases, particularly in the presence of extrapulmonary disease, poses considerable therapeutic challenges in clinical practice. We have carefully studied the multicenter study by Hu et al, which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation (IGTA). These findings provide valuable clinical evidence supporting IGTA as a feasible, minimally invasive approach and underscore the prognostic significance of metastatic distribution. However, the study by Hu et al has several limitations: not all pulmonary lesions were pathologically confirmed, postoperative follow-up relied mainly on dynamic contrast-enhanced computed tomography, no comparative analysis was performed with other local treatments, and the impact of other imaging features on efficacy and prognosis was not evaluated. Future studies should include complete pathological confirmation, integrate functional imaging and radiomics, and use prospective multicenter collaboration to optimize patient selection standards for IGTA treatment, strengthen its clinical evidence base, and ultimately promote individualized decision-making for patients with metastatic CRC.
Background: Brain volume measurement serves as a critical approach for assessing brain health status. Considering the close biological connection between the eyes and brain, this study aims to investigate the feasibility of estimating brain volume through retinal fundus imaging integrated with clinical metadata, and to offer a cost-effective approach for assessing brain health. Methods: Based on clinical information, retinal fundus images, and neuroimaging data derived from a multicenter, population-based cohort study, the Kailuan Study, we proposed a cross-modal correlation representation (CMCR) network to elucidate the intricate co-degenerative relationships between the eyes and brain for 755 subjects. Specifically, individual clinical information, followed up for as long as 12 years, was encoded as a prompt to enhance the accuracy of brain volume estimation. Independent internal validation and external validation were performed to assess the robustness of the proposed model. Root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) metrics were employed to quantitatively evaluate the quality of synthetic brain images derived from retinal imaging data. Results: The proposed framework yielded average RMSE, PSNR, and SSIM values of 98.23, 35.78 dB, and 0.64, respectively, significantly outperforming 5 other methods: multi-channel variational autoencoder (mcVAE), Pixel-to-Pixel (Pixel2pixel), transformer-based U-Net (TransUNet), multi-scale transformer network (MT-Net), and residual vision transformer (ResViT). The two- (2D) and three-dimensional (3D) visualization results showed that the shape and texture of the synthetic brain images generated by the proposed method most closely resembled those of actual brain images. Thus, the CMCR framework accurately captured the latent structural correlations between the fundus and the brain. The average difference between predicted and actual brain volumes was 61.36 cm³, with a relative error of 4.54%. When all of the clinical information (including age and sex, daily habits, cardiovascular factors, metabolic factors, and inflammatory factors) was encoded, the difference decreased to 53.89 cm³, with a relative error of 3.98%. Based on brain magnetic resonance images synthesized from retinal fundus images, the volumes of brain tissues could be estimated with high accuracy. Conclusion: This study provides an innovative, accurate, and cost-effective approach to characterize brain health status through readily accessible retinal fundus images.
Honeycombing Lung (HCL) is a chronic lung condition marked by advanced fibrosis, resulting in enlarged air spaces with thick fibrotic walls, which are visible on Computed Tomography (CT) scans. Differentiating between normal lung tissue, honeycombing lungs, and Ground Glass Opacity (GGO) in CT images is often challenging for radiologists and may lead to misinterpretations. Although earlier studies have proposed models to detect and classify HCL, many faced limitations such as high computational demands, lower accuracy, and difficulty distinguishing between HCL and GGO. CT images are highly effective for lung classification due to their high resolution, 3D visualization, and sensitivity to tissue density variations. This study introduces the Honeycombing Lungs Network (HCL Net), a novel classification algorithm inspired by ResNet50V2 and enhanced to overcome the shortcomings of previous approaches. HCL Net incorporates additional residual blocks, refined preprocessing techniques, and selective parameter tuning to improve classification performance. The dataset, sourced from the University Malaya Medical Centre (UMMC) and verified by expert radiologists, consists of CT images of normal, honeycombing, and GGO lungs. Experimental evaluations across five assessments demonstrated that HCL Net achieved an outstanding classification accuracy of approximately 99.97%. It also recorded strong performance on other metrics, achieving 93% precision, 100% sensitivity, 89% specificity, and an AUC-ROC score of 97%. Comparative analysis with baseline feature engineering methods confirmed the superior efficacy of HCL Net. The model significantly reduces misclassification, particularly between honeycombing and GGO lungs, enhancing diagnostic precision and reliability in lung image analysis.
As urban landscapes evolve and vehicular volumes soar, traditional traffic monitoring systems struggle to scale, often failing under the complexities of dense, dynamic, and occluded environments. This paper introduces a novel, unified deep learning framework for vehicle detection, tracking, counting, and classification in aerial imagery, designed explicitly for the demands of modern smart city infrastructure. Our approach begins with adaptive histogram equalization to optimize aerial image clarity, followed by a cutting-edge scene parsing technique using Mask2Former, enabling robust segmentation even in visually congested settings. Vehicle detection leverages the latest YOLOv11 architecture, delivering superior accuracy in aerial contexts by addressing occlusion, scale variance, and fine-grained object differentiation. We incorporate the highly efficient ByteTrack algorithm for tracking, enabling seamless identity preservation across frames. Vehicle counting is achieved through an unsupervised DBSCAN-based method, ensuring adaptability to varying traffic densities. We further introduce a hybrid feature extraction module combining Convolutional Neural Networks (CNNs) with Zernike Moments, capturing both deep semantic and geometric signatures of vehicles. The final classification is powered by NASNet, a neural architecture search-optimized model, ensuring high accuracy across diverse vehicle types and orientations. Extensive evaluations on the VAID benchmark dataset demonstrate the system's outstanding performance, achieving 96% detection, 94% tracking, and 96.4% classification accuracy. On the UAVDT dataset, the system attains 95% detection, 93% tracking, and 95% classification accuracy, confirming its robustness across diverse aerial traffic scenarios. These results establish new benchmarks in aerial traffic analysis and validate the framework's scalability, making it a powerful and adaptable solution for next-generation intelligent transportation systems and urban surveillance.
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because they cannot effectively capture global information from images, CNNs can easily lose contours and textures in segmentation results. The transformer model can effectively capture long-range dependencies in an image, and combining a CNN with a transformer can effectively extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. Specifically, in the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
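The residual attention idea in M2ANet's second component can be sketched in a few lines of numpy. This is a toy stand-in rather than the paper's implementation: CBAM's 7x7 convolution for spatial attention is replaced by a simple pooled gate, and the bottleneck MLP weights `w1` and `w2` are random placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Channel attention: pool over the spatial dims, pass through a tiny
    two-layer MLP, and gate each channel. x has shape (C, H, W)."""
    avg = x.mean(axis=(1, 2))          # (C,)
    mx = x.max(axis=(1, 2))            # (C,)
    gate = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    return x * gate[:, None, None]

def spatial_attention(x):
    """Spatial attention: gate each location by pooled channel statistics
    (a stand-in for CBAM's 7x7 convolution over the pooled maps)."""
    avg = x.mean(axis=0, keepdims=True)
    mx = x.max(axis=0, keepdims=True)
    gate = sigmoid(avg + mx)
    return x * gate

def residual_cbam(x, w1, w2):
    """Attention block with a residual shortcut, so attention refines
    rather than replaces the input and gradients can bypass the block."""
    y = spatial_attention(channel_attention(x, w1, w2))
    return x + y                       # residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1   # placeholder bottleneck MLP weights
w2 = rng.standard_normal((8, 2)) * 0.1
out = residual_cbam(x, w1, w2)
print(out.shape)
```

The residual shortcut is what lets the attention module "alleviate gradient vanishing": even when the gates saturate, the identity path keeps the gradient flowing.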
The convolutional neural network (CNN) with an encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability, but it faces limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational cost can be high. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining a CNN and a transformer and promoting feature interactions across various semantic levels. The core components of the proposed method are the cross-window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing voxel attention between different branch features, and uses convolution to strengthen dense local information. Experiments on prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as the Dice similarity coefficient, intersection over union, 95% Hausdorff distance, and average symmetric surface distance.
Existing Transformer-based image captioning models typically rely on the self-attention mechanism to capture long-range dependencies, which effectively extracts and leverages the global correlation of image features. However, these models still face challenges in effectively capturing local associations. Moreover, since the encoder extracts global and local association features that focus on different semantic information, semantic noise may occur during the decoding stage. To address these issues, we propose the Local Relationship Enhanced Gated Transformer (LREGT). In the encoder part, we introduce the Local Relationship Enhanced Encoder (LREE), whose core component is the Local Relationship Enhanced Module (LREM). LREM consists of two novel designs: the Local Correlation Perception Module (LCPM) and the Local-Global Fusion Module (LGFM), which are beneficial for generating a comprehensive feature representation that integrates both global and local information. In the decoder part, we propose the Dual-level Multi-branch Gated Decoder (DMGD). It first creates multiple decoding branches to generate multi-perspective contextual feature representations. Subsequently, it employs the Dual-Level Gating Mechanism (DLGM) to model the multi-level relationships of these multi-perspective contextual features, enhancing their fine-grained semantics and intrinsic relationship representations. This ultimately leads to the generation of high-quality and semantically rich image captions. Experiments on the standard MSCOCO dataset demonstrate that LREGT achieves state-of-the-art performance, with a CIDEr score of 140.8 and a BLEU-4 score of 41.3, significantly outperforming existing mainstream methods. These results highlight LREGT's superiority in capturing complex visual relationships and resolving semantic noise during decoding.
Objective: In the Radiology Department of Mzuzu Central Hospital (MCH), daily training for radiographers now includes content on Computed Tomography (CT) image quality control and equipment maintenance to ensure the normal, continuous, and stable operation of the 16-slice spiral CT scanner. Methods: Through comprehensive analysis of the relevant equipment, we identified key parameters that significantly impact CT image quality. Innovative optimization strategies and solutions targeting these parameters were developed and integrated into the daily training programs. Furthermore, starting from an examination of prevalent failure modes observed in CT equipment, we delve into essential maintenance and preservation techniques that CT technologists must master to ensure optimal system performance. Results: (1) Crucial factors affecting CT image quality include artifacts, noise, partial volume effects, and surrounding gap phenomena, alongside spatial and density resolutions, CT dose, reconstruction algorithms, and human factors during the scanning process. In the daily training for radiographers, emphasis is placed on strictly implementing image quality control measures at every stage of the CT scanning process and skillfully applying advanced scanning and image processing techniques. By doing so, we can provide clinicians with accurate and reliable imaging references for diagnosis and treatment. (2) Strategies for CT equipment maintenance: ① environmental inspection of the CT room to ensure cleanliness and hygiene; ② rational and accurate operation, including calibration software proficiency; ③ regular maintenance and servicing to minimize machine downtime; ④ maintenance of the CT X-ray tube. Through training, CT technicians can become proficient in equipment maintenance and upkeep techniques, which can significantly extend the service life of CT systems and reduce the occurrence of malfunctions. Conclusion: Through the regular implementation of rigorous CT image quality control training for radiology technicians, coupled with diligent and proactive CT equipment maintenance, we have observed profound and beneficial impacts on image quality. The accuracy and fidelity of the radiological data ultimately lead to more accurate diagnoses and effective treatments.
Objective: Early prediction of response before neoadjuvant chemotherapy (NAC) is crucial for personalized treatment plans for locally advanced breast cancer patients. We aim to develop a multi-task model using multiscale whole slide image (WSI) features to predict the response to breast cancer NAC more precisely. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing set, external testing sets, and prospective testing set of the weakly-supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pathological complete response (pCR) to NAC. Our approach models pairwise feature interactions across scales by employing concatenation fusion of single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent performance for the prediction of treatment response, with areas under the receiver operating characteristic curve (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806-0.933] in the internal testing set and 0.841 (95% CI: 0.814-0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763-0.964) in the internal testing set and 0.821 (95% CI: 0.763-0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754-0.903) and 0.821 (95% CI: 0.692-0.949) for treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P<0.05). Heatmaps were employed to interpret the decision-making basis of the model. Furthermore, during exploration of the biological basis, high DLMM scores were found to be associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
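The concatenation fusion with gating-based attention described in the Methods can be sketched roughly as follows. The feature dimensions, the weight matrix `w_gate`, and the scalar per-branch gates are illustrative assumptions for a minimal numpy sketch, not DLMM's actual architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(feat_a, feat_b, w_gate):
    """Concatenate two single-scale feature vectors so the gate sees their
    pairwise interaction, then weight each branch with a learned gate that
    controls how much each scale contributes to the fused representation."""
    concat = np.concatenate([feat_a, feat_b])   # joint input for the gate
    gates = sigmoid(w_gate @ concat)            # one scalar gate per branch
    return gates[0] * feat_a + gates[1] * feat_b

rng = np.random.default_rng(1)
a = rng.standard_normal(16)          # e.g. features from low-magnification tiles
b = rng.standard_normal(16)          # e.g. features from high-magnification tiles
w = rng.standard_normal((2, 32)) * 0.1
fused = gated_fusion(a, b, w)
print(fused.shape)
```

Because the gate is computed from the concatenation of both branches, each scale's contribution is modulated by what the other scale contains, which is the sense in which the mechanism "controls the expressiveness of each representation".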
To improve image quality under low illumination conditions, a novel low-light image enhancement method based on multi-illumination estimation and multi-scale fusion (MIMS) is proposed in this paper. First, the illumination is processed by contrast-limited adaptive histogram equalization (CLAHE), an adaptive complementary gamma function (ACG), and an adaptive detail-preserving S-curve (ADPS), respectively, to obtain three components. Then, the fusion-relevant features, exposure and color contrast, are selected as the weight maps. Subsequently, these components and weight maps are fused at multiple scales to generate the enhanced illumination. Finally, the enhanced images are obtained by multiplying the enhanced illumination and the reflectance. Compared with existing approaches, the proposed method achieves average increases of 0.81% and 2.89% in the structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR), and decreases of 6.17% and 32.61% in the natural image quality evaluator (NIQE) and gradient magnitude similarity deviation (GMSD), respectively.
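The overall shape of this pipeline (illumination estimation, several enhanced illumination components, weighted fusion, recombination with reflectance) can be sketched in numpy. This is a simplified stand-in under stated assumptions: the gamma and S-curve functions below are generic substitutes for ACG and ADPS, CLAHE and the color-contrast weight are omitted, and single-scale weighted averaging replaces the paper's multi-scale fusion.

```python
import numpy as np

def gamma_component(illum, gamma=0.6):
    """Brighten dark regions with a gamma curve (a generic stand-in for
    the adaptive complementary gamma function)."""
    return illum ** gamma

def s_curve_component(illum, k=8.0):
    """Sigmoid-shaped S-curve that boosts mid-tone contrast (a generic
    stand-in for the adaptive detail-preserving S-curve)."""
    return 1.0 / (1.0 + np.exp(-k * (illum - 0.5)))

def exposure_weight(c, sigma=0.25):
    """Favor well-exposed pixels (values near 0.5), as in exposure fusion."""
    return np.exp(-((c - 0.5) ** 2) / (2 * sigma ** 2))

def enhance(image):
    """Retinex-style enhancement: estimate illumination as the channel
    maximum, build enhanced illumination components, fuse them with
    per-pixel exposure weights, and multiply back by the reflectance."""
    illum = image.max(axis=2)                       # rough illumination map
    reflect = image / (illum[..., None] + 1e-6)     # reflectance layer
    comps = [illum, gamma_component(illum), s_curve_component(illum)]
    weights = np.stack([exposure_weight(c) for c in comps])
    weights /= weights.sum(axis=0, keepdims=True) + 1e-6
    fused = sum(w * c for w, c in zip(weights, comps))
    return np.clip(reflect * fused[..., None], 0.0, 1.0)

rng = np.random.default_rng(2)
dark = rng.uniform(0.0, 0.2, size=(16, 16, 3))      # synthetic low-light image
out = enhance(dark)
print(out.shape)
```

The exposure weight is what makes the fusion adaptive: where the gamma component is well exposed it dominates, while regions the S-curve handles better are drawn from that component instead.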
Although guided image filtering (GIF) is known for preserving edges and fast computation, it may produce inaccurate outputs in depth map restoration. In this paper, a novel confidence-weighted GIF called mutual-structure weighted GIF (MSWGIF) is proposed, which replaces the mean filtering strategy used by GIF when handling overlapping windows. The confidence value is composed of a depth term and a mutual-structure term: the depth term is utilized to protect the edges of the output, and the mutual-structure term helps to select accurate windows while the structure characteristics of the guidance image are transferred to the output. Experimental results show that MSWGIF reduces the root mean square error (RMSE) by an average of 12.37%, increases the correlation (CORR) by an average of 0.07%, and increases the structural similarity index measure (SSIM) by an average of 0.34%.
Gamma-ray imaging systems are powerful tools in radiographic diagnosis. However, the recorded images suffer from degradations such as noise, blurring, and downsampling, and consequently fail to meet high-precision diagnostic requirements. In this paper, we propose a novel single-image super-resolution algorithm to enhance the spatial resolution of gamma-ray imaging systems. A mathematical model of the gamma-ray imaging system is established based on maximum a posteriori estimation. Within the plug-and-play framework, the half-quadratic splitting method is employed to decouple the data fidelity term and the regularization term. An image denoiser using convolutional neural networks is adopted as an implicit image prior, referred to as a deep denoiser prior, eliminating the need to explicitly design a regularization term. Furthermore, the impact of the image boundary condition on reconstruction results is considered, and a method for estimating image boundaries is introduced. The results show that the proposed algorithm effectively addresses boundary artifacts. By increasing the pixel count of the reconstructed images, the proposed algorithm is capable of recovering more details. Notably, in both simulation and real experiments, the proposed algorithm is demonstrated to achieve subpixel resolution, surpassing the Nyquist sampling limit determined by the camera pixel size.
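The half-quadratic splitting loop at the heart of the plug-and-play framework alternates a data-fidelity update with a denoiser call. Below is a minimal sketch under stated assumptions: the forward operator is a symmetric box blur (so it serves as its own adjoint), a box blur also stands in for the CNN denoiser prior, and the step size, inner iteration count, and penalty `mu` are arbitrary illustrative choices, not the paper's settings.

```python
import numpy as np

def box_smooth(z):
    """Small separable box blur. In the paper's framework the prior step
    is a CNN denoiser (the 'deep denoiser prior'); this is a stand-in."""
    out = np.copy(z)
    for axis in (0, 1):
        out = (np.roll(out, 1, axis) + out + np.roll(out, -1, axis)) / 3.0
    return out

def hqs_restore(y, forward, n_iter=20, mu=0.1):
    """Half-quadratic splitting for y = H(x) + noise. The x-update takes
    a few gradient steps on ||H(x) - y||^2 + mu * ||x - z||^2 (the data
    fidelity subproblem); the z-update is one call to the plug-in denoiser
    (the regularization subproblem)."""
    x = np.copy(y)
    z = np.copy(y)
    for _ in range(n_iter):
        for _ in range(5):                      # inner gradient steps on x
            grad = forward(forward(x) - y) + mu * (x - z)
            x = x - 0.2 * grad
        z = box_smooth(x)                       # plug-and-play prior step
    return x

def blur(img):
    """Forward operator H: symmetric, so H is used as its own adjoint."""
    return box_smooth(img)

rng = np.random.default_rng(3)
truth = np.zeros((16, 16)); truth[6:10, 6:10] = 1.0   # synthetic source
y = blur(truth) + 0.01 * rng.standard_normal((16, 16))
x_hat = hqs_restore(y, blur)
print(x_hat.shape)
```

Decoupling the two subproblems is the point of the splitting: the data term only needs the physics of the imaging model, while the prior term only needs a denoiser, so neither requires an explicit regularization formula.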
The clustering technique examines each pixel in the image and assigns it to one of the clusters according to the minimum distance, yielding a primary classification of the image into different intensity regions. A watershed transformation technique is then employed. This includes computing the gradient of the classified image, dividing the image into markers, and checking the marker image for zero points (watershed lines). The watershed lines are then deleted from the marker image created by the watershed algorithm. A Region Adjacency Graph (RAG) and Region Adjacency Boundary (RAB) are created between pairs of regions in the marker image. Finally, region merging is performed according to region average intensity and two edge strengths (T1, T2). The authors' approach is tested on remote sensing and brain MR medical images. The final segmentation result is one closed boundary per actual region in the image.
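The first step, assigning each pixel to the cluster whose centre is at minimum distance, is essentially intensity k-means. A minimal numpy sketch follows; the cluster count `k` and the synthetic two-region image are illustrative assumptions, and the watershed and region-merging stages are not shown.

```python
import numpy as np

def kmeans_intensity(image, k=3, n_iter=25, seed=0):
    """Assign each pixel to the nearest of k intensity centres, giving a
    primary classification of the image into k intensity regions (the
    step that precedes the watershed transform)."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1).astype(float)
    centres = rng.choice(pixels, size=k, replace=False)  # random init
    for _ in range(n_iter):
        # minimum-distance assignment of every pixel to a centre
        labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = pixels[labels == j].mean()  # recompute centres
    return labels.reshape(image.shape), np.sort(centres)

rng = np.random.default_rng(4)
img = np.concatenate([rng.normal(0.2, 0.02, (8, 16)),
                      rng.normal(0.5, 0.02, (8, 16))])   # two intensity regions
labels, centres = kmeans_intensity(img, k=2)
print(len(np.unique(labels)))
```

The resulting label map is what the gradient and marker construction of the watershed stage would operate on.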
Deep learning (DL)-based image reconstruction methods have garnered increasing interest in the last few years. Numerous studies demonstrate that DL-based reconstruction methods perform admirably in optical tomographic imaging techniques such as bioluminescence tomography (BLT). Nevertheless, nearly every existing DL-based method utilizes an explicit neural representation for the reconstruction problem, which either consumes much memory or requires various complicated computations. In this paper, we present a neural field (NF)-based image reconstruction scheme for BLT that uses an implicit neural representation. The proposed NF-based method establishes a transformation between the coordinates of an arbitrary spatial point and the source value at that point with a relatively lightweight multilayer perceptron, which has remarkable computational efficiency. Another simple neural network, composed of two fully connected layers and a 1D convolutional layer, is used to generate the neural features. Results of simulations and experiments show that the proposed NF-based method matches the performance of the photon density complement network and the two-stage network, while consuming fewer floating-point operations with fewer model parameters.
Deep learning techniques have significantly improved image restoration tasks in recent years. As a crucial component of deep learning, the loss function plays a key role in network optimization and performance enhancement. However, the currently prevalent loss functions assign equal weight to each pixel during loss calculation, which hampers the ability to reflect the roles of different pixels and fails to fully exploit the image's characteristics. To address this issue, this study proposes an asymmetric loss function based on the image and data characteristics of the image recovery task. This novel loss function adjusts the weight of the reconstruction loss based on the grey value of each pixel, thereby effectively optimizing network training by differentially utilizing the grey information from the original image. Specifically, we calculate a weight factor for each pixel based on its grey value and combine it with the reconstruction loss to create a new loss function. This ensures that pixels with smaller grey values receive greater attention, improving network recovery. To verify the effectiveness of the proposed asymmetric loss function, we conducted experimental tests on the image super-resolution task. The results show that introducing asymmetric loss weights improves all metrics of the processing results without increasing the training time. In the typical super-resolution network SRCNN, introducing asymmetric weights improves the peak signal-to-noise ratio (PSNR) by up to about 0.5% and the structural similarity index (SSIM) by up to about 0.3%, and reduces the root-mean-square error (RMSE) by up to about 1.7%, with essentially no increase in training time. In addition, we further tested the performance of the proposed method on the denoising task to verify its potential applicability to image restoration more broadly.
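The core idea, a grey-value-dependent weight multiplying the per-pixel reconstruction loss, can be sketched as follows. The linear weighting scheme and the parameter `alpha` are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def asymmetric_loss(pred, target, alpha=1.0):
    """Pixel-wise weighted reconstruction loss: pixels with smaller grey
    values get larger weights, so errors in dark regions are not drowned
    out by bright ones during training. The linear weight below is an
    illustrative choice; the paper derives its weight factor from the
    image's grey-level characteristics."""
    weights = 1.0 + alpha * (1.0 - target)   # darker target -> larger weight
    return np.mean(weights * (pred - target) ** 2)

target = np.array([[0.1, 0.9], [0.2, 0.8]])
pred = target + 0.1                          # the same error at every pixel
plain = np.mean((pred - target) ** 2)
weighted = asymmetric_loss(pred, target)
print(weighted > plain)  # True
```

With a uniform prediction error, the weighted loss exceeds the plain mean squared error precisely because the dark pixels' errors are amplified, which is the asymmetry the method exploits.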
Instance segmentation of impacted teeth in oral panoramic X-ray images is an active research topic. However, due to the complex structure, low contrast, and complex background of teeth in panoramic X-ray images, the task is technically challenging. In particular, the contrast between impacted teeth and periodontal tissues such as the gingiva, periodontal membrane, and alveolar bone is low, resulting in fuzzy boundaries of impacted teeth. A model based on Teeth YOLACT is proposed to provide a more efficient and accurate solution for the segmentation of impacted teeth in oral panoramic X-ray films. First, a Multi-scale Res-Transformer Module (MRTM) is designed. In this module, depthwise separable convolutions with different receptive fields are used to enhance the sensitivity of the model to lesion size, and the Vision Transformer is integrated to improve the model's ability to perceive global features. Second, the Context Interaction-awareness Module (CIaM) is designed to fuse deep and shallow features. The deep semantic features guide the shallow spatial features; the shallow spatial features are then embedded into the deep semantic features, and a cross-weighted attention mechanism aggregates the deep and shallow features efficiently to obtain richer context information. Third, the Edge-preserving Perception Module (E2PM) is designed to enhance tooth edge features. A first-order differential operator is used to obtain tooth edge weights, improving the perception of tooth edge features, and the shallow spatial features are fused through linear mapping, weight concatenation, and matrix multiplication to preserve tooth edge information. Finally, comparison and ablation experiments are conducted on the oral panoramic X-ray image datasets. The results show that the APdet, APseg, ARdet, ARseg, mAPdet, and mAPseg indicators of the proposed model are 89.9%, 91.9%, 77.4%, 77.6%, 72.8%, and 73.5%, respectively. This study further verifies the application potential of combining multi-scale feature extraction, multi-scale feature fusion, and edge perception enhancement in medical image segmentation, and provides a valuable reference for future related research.
Aircraft systems have recently gained a reputation as reliable and efficient tools for sensing and parsing aerial scenes. However, accurate and fast semantic segmentation of high-resolution aerial images for remote sensing applications still faces three challenges: the requirement for limited processing resources and low-latency operation on aerial platforms, the balance between high accuracy and real-time efficiency in model performance, and confusing objects with large intra-class variations and small inter-class differences in high-resolution aerial images. To address these issues, a lightweight, dual-path deep convolutional architecture, namely the Aerial Bilateral Segmentation Network (Aerial-BiSeNet), is proposed to perform real-time segmentation on high-resolution aerial images with favorable accuracy. Specifically, inspired by the receptive field concept in human visual systems, a Receptive Field Module (RFM) is proposed to encode rich multi-scale contextual information. Based on the channel attention mechanism, two novel modules, the Feature Attention Module (FAM) and the Channel Attention based Feature Fusion Module (CAFFM), are proposed to refine and combine features effectively to boost model performance. Aerial-BiSeNet is evaluated on the Potsdam and Vaihingen datasets, where leading performance is reported compared with other state-of-the-art models in terms of both accuracy and efficiency.
Photoacoustic computed tomography (PACT) is an innovative biomedical imaging technique that has gained significant application in the field of biomedicine due to its ability to visualize optical contrast with high resolution and deep tissue penetration. However, the inherent challenges associated with photoacoustic signal excitation, propagation, and detection often result in suboptimal image quality. To overcome these limitations, researchers have developed various advanced algorithms that span the entire image reconstruction pipeline. This review paper aims to present a detailed analysis of the latest advancements in PACT algorithms and synthesize these algorithms into a coherent framework. We provide a tripartite analysis, from signal processing to reconstruction solution to image processing, covering a spectrum of techniques. The principles and methodologies, as well as their applicability and limitations, are thoroughly discussed. The primary objective of this study is to provide a thorough review of advanced algorithms applicable to PACT, offering both theoretical foundations and practical guidance for enhancing the imaging effect of PACT.
Funding: Financially supported by the Open Project Program of Wuhan National Laboratory for Optoelectronics (No. 2022WNLOKF009), the National Natural Science Foundation of China (No. 62475216), the Key Research and Development Program of Shaanxi (No. 2024GH-ZDXM-37), the Fujian Provincial Natural Science Foundation of China (No. 2024J01060), the Startup Program of XMU, and the Fundamental Research Funds for the Central Universities.
Abstract: Microscopy imaging is fundamental in analyzing bacterial morphology and dynamics, offering critical insights into bacterial physiology and pathogenicity. Image segmentation techniques enable quantitative analysis of bacterial structures, facilitating precise measurement of morphological variations and population behaviors at single-cell resolution. This paper reviews advancements in bacterial image segmentation, emphasizing the shift from traditional thresholding and watershed methods to deep learning-driven approaches. Convolutional neural networks (CNNs), U-Net architectures, and three-dimensional (3D) frameworks excel at segmenting dense biofilms and resolving antibiotic-induced morphological changes. These methods combine automated feature extraction with physics-informed postprocessing. Despite progress, challenges persist in computational efficiency, cross-species generalizability, and integration with multimodal experimental workflows. Future progress will depend on improving model robustness across species and imaging modalities, integrating multimodal data for phenotype-function mapping, and developing standard pipelines that link computational tools with clinical diagnostics. These innovations will expand microbial phenotyping beyond structural analysis, enabling deeper insights into bacterial physiology and ecological interactions.
Abstract: Colorectal cancer (CRC) with lung oligometastases, particularly in the presence of extrapulmonary disease, poses considerable therapeutic challenges in clinical practice. We have carefully studied the multicenter study by Hu et al., which evaluated the survival outcomes of patients with metastatic CRC who received image-guided thermal ablation (IGTA). These findings provide valuable clinical evidence supporting IGTA as a feasible, minimally invasive approach and underscore the prognostic significance of metastatic distribution. However, the study by Hu et al. has several limitations: not all pulmonary lesions were pathologically confirmed, postoperative follow-up relied mainly on dynamic contrast-enhanced computed tomography, no comparative analysis was performed with other local treatments, and the impact of other imaging features on efficacy and prognosis was not evaluated. Future studies should include complete pathological confirmation, integrate functional imaging and radiomics, and use prospective multicenter collaboration to optimize patient selection standards for IGTA treatment, strengthen its clinical evidence base, and ultimately promote individualized decision-making for patients with metastatic CRC.
Funding: Supported by the National Natural Science Foundation of China (62522119 and 62372358), the Beijing Natural Science Foundation (7242267), the Beijing Scholars Program ([2015]160), the Natural Science Basic Research Program of Shaanxi (2023-JC-QN-0719), and the Guangdong Basic and Applied Basic Research Foundation (2022A1515110453).
Abstract: Background: Brain volume measurement serves as a critical approach for assessing brain health status. Considering the close biological connection between the eyes and brain, this study aims to investigate the feasibility of estimating brain volume through retinal fundus imaging integrated with clinical metadata, and to offer a cost-effective approach for assessing brain health. Methods: Based on clinical information, retinal fundus images, and neuroimaging data derived from a multicenter, population-based cohort study, the Kailuan Study, we proposed a cross-modal correlation representation (CMCR) network to elucidate the intricate co-degenerative relationships between the eyes and brain for 755 subjects. Specifically, individual clinical information, which has been followed up for as long as 12 years, was encoded as a prompt to enhance the accuracy of brain volume estimation. Independent internal validation and external validation were performed to assess the robustness of the proposed model. Root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) metrics were employed to quantitatively evaluate the quality of synthetic brain images derived from retinal imaging data. Results: The proposed framework yielded average RMSE, PSNR, and SSIM values of 98.23, 35.78 dB, and 0.64, respectively, which significantly outperformed five other methods: multi-channel Variational Autoencoder (mcVAE), Pixel-to-Pixel (Pixel2pixel), transformer-based U-Net (TransUNet), multi-scale transformer network (MT-Net), and residual vision transformer (ResViT). The two- (2D) and three-dimensional (3D) visualization results showed that the shape and texture of the synthetic brain images generated by the proposed method most closely resembled those of actual brain images. Thus, the CMCR framework accurately captured the latent structural correlations between the fundus and the brain. The average difference between predicted and actual brain volumes was 61.36 cm³, with a relative error of 4.54%. When all of the clinical information (including age and sex, daily habits, cardiovascular factors, metabolic factors, and inflammatory factors) was encoded, the difference decreased to 53.89 cm³, with a relative error of 3.98%. Based on the synthesized brain magnetic resonance images from retinal fundus images, the volumes of brain tissues could be estimated with high accuracy. Conclusion: This study provides an innovative, accurate, and cost-effective approach to characterize brain health status through readily accessible retinal fundus images.
Abstract: Honeycombing Lung (HCL) is a chronic lung condition marked by advanced fibrosis, resulting in enlarged air spaces with thick fibrotic walls, which are visible on Computed Tomography (CT) scans. Differentiating between normal lung tissue, honeycombing lungs, and Ground Glass Opacity (GGO) in CT images is often challenging for radiologists and may lead to misinterpretations. Although earlier studies have proposed models to detect and classify HCL, many faced limitations such as high computational demands, lower accuracy, and difficulty distinguishing between HCL and GGO. CT images are highly effective for lung classification due to their high resolution, 3D visualization, and sensitivity to tissue density variations. This study introduces the Honeycombing Lungs Network (HCLNet), a novel classification algorithm inspired by ResNet50V2 and enhanced to overcome the shortcomings of previous approaches. HCLNet incorporates additional residual blocks, refined preprocessing techniques, and selective parameter tuning to improve classification performance. The dataset, sourced from the University Malaya Medical Centre (UMMC) and verified by expert radiologists, consists of CT images of normal, honeycombing, and GGO lungs. Experimental evaluations across five assessments demonstrated that HCLNet achieved an outstanding classification accuracy of approximately 99.97%. It also recorded strong performance in other metrics, achieving 93% precision, 100% sensitivity, 89% specificity, and an AUC-ROC score of 97%. Comparative analysis with baseline feature engineering methods confirmed the superior efficacy of HCLNet. The model significantly reduces misclassification, particularly between honeycombing and GGO lungs, enhancing diagnostic precision and reliability in lung image analysis.
Funding: Funded by the Open Access Initiative of the University of Bremen and the DFG via SuUB Bremen. The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through the Large Group Project under grant number RGP2/367/46. This research is also supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number PNURSP2025R410, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: As urban landscapes evolve and vehicular volumes soar, traditional traffic monitoring systems struggle to scale, often failing under the complexities of dense, dynamic, and occluded environments. This paper introduces a novel, unified deep learning framework for vehicle detection, tracking, counting, and classification in aerial imagery, designed explicitly for the demands of modern smart city infrastructure. Our approach begins with adaptive histogram equalization to optimize aerial image clarity, followed by a cutting-edge scene parsing technique using Mask2Former, enabling robust segmentation even in visually congested settings. Vehicle detection leverages the latest YOLOv11 architecture, delivering superior accuracy in aerial contexts by addressing occlusion, scale variance, and fine-grained object differentiation. We incorporate the highly efficient ByteTrack algorithm for tracking, enabling seamless identity preservation across frames. Vehicle counting is achieved through an unsupervised DBSCAN-based method, ensuring adaptability to varying traffic densities. We further introduce a hybrid feature extraction module combining Convolutional Neural Networks (CNNs) with Zernike Moments, capturing both deep semantic and geometric signatures of vehicles. The final classification is powered by NASNet, a neural architecture search-optimized model, ensuring high accuracy across diverse vehicle types and orientations. Extensive evaluations on the VAID benchmark dataset demonstrate the system's outstanding performance, achieving 96% detection, 94% tracking, and 96.4% classification accuracy. On the UAVDT dataset, the system attains 95% detection, 93% tracking, and 95% classification accuracy, confirming its robustness across diverse aerial traffic scenarios. These results establish new benchmarks in aerial traffic analysis and validate the framework's scalability, making it a powerful and adaptable solution for next-generation intelligent transportation systems and urban surveillance.
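The DBSCAN-based counting step lends itself to a compact sketch. The following minimal implementation is our illustration, not the authors' code (a production system would use an indexed neighbor search): it clusters 2-D vehicle centroids, marks noise as -1, and the vehicle-group count is simply the number of clusters found.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 2-D points (e.g., detected vehicle centroids).

    Returns an integer label per point; -1 marks noise. Neighborhood
    queries are brute-force O(n^2), which is fine for a sketch.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        neigh = list(np.flatnonzero(np.linalg.norm(points - points[i], axis=1) <= eps))
        if len(neigh) < min_pts:
            continue  # provisionally noise; may be claimed by a later cluster
        labels[i] = cluster
        queue = neigh
        while queue:
            j = queue.pop()
            if not visited[j]:
                visited[j] = True
                nb = list(np.flatnonzero(np.linalg.norm(points - points[j], axis=1) <= eps))
                if len(nb) >= min_pts:
                    queue.extend(nb)  # j is a core point: expand the cluster
            if labels[j] == -1:
                labels[j] = cluster
        cluster += 1
    return labels
```

With `eps` tuned to the typical inter-vehicle spacing at the given altitude, `labels.max() + 1` gives the cluster count across varying traffic densities.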
Funding: Supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Abstract: Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, due to the inability to effectively capture global information from images, CNNs can easily lead to loss of contours and textures in segmentation results. Notice that the transformer model can effectively capture long-range dependencies in the image; furthermore, combining the CNN and the transformer can effectively extract both local details and global contextual features of the image. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. Specifically, in the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important features of images and alleviate the phenomenon of gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then multi-head attention is introduced to further enrich feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
Funding: National Key Research and Development Program of China, Grant/Award Number: 2018YFE0206900; China Postdoctoral Science Foundation, Grant/Award Number: 2023M731204; the Open Project of the Key Laboratory for Quality Evaluation of Ultrasound Surgical Equipment of the National Medical Products Administration, Grant/Award Number: SMDTKL-2023-1-01; the Hubei Province Key Research and Development Project, Grant/Award Number: 2023BCB007; CAAI-Huawei MindSpore Open Fund.
Abstract: The convolutional neural network (CNN) with the encoder-decoder structure is popular in medical image segmentation due to its excellent local feature extraction ability, but it faces limitations in capturing global features. The transformer can extract global information well, but adapting it to small medical datasets is challenging and its computational complexity can be heavy. In this work, a serial and parallel network is proposed for accurate 3D medical image segmentation by combining CNN and transformer and promoting feature interactions across various semantic levels. The core components of the proposed method include the cross-window self-attention based transformer (CWST) and multi-scale local enhanced (MLE) modules. The CWST module enhances global context understanding by partitioning 3D images into non-overlapping windows and calculating sparse global attention between windows. The MLE module selectively fuses features by computing the voxel attention between different branch features, and uses convolution to strengthen dense local information. Experiments on prostate, atrium, and pancreas MR/CT image datasets consistently demonstrate the advantage of the proposed method over six popular segmentation models in both qualitative evaluation and quantitative indexes such as the Dice similarity coefficient, Intersection over Union, 95% Hausdorff distance, and average symmetric surface distance.
Funding: Supported by the Natural Science Foundation of China (62473105, 62172118), the Natural Science Key Foundation of Guangxi (2021GXNSFDA196002), in part by the Guangxi Key Laboratory of Image and Graphic Intelligent Processing under Grants GIIP2302, GIIP2303, and GIIP2304, and the Innovation Project of Guangxi Graduate Education (2024YCXB09, 2024YCXS039).
Abstract: Existing Transformer-based image captioning models typically rely on the self-attention mechanism to capture long-range dependencies, which effectively extracts and leverages the global correlation of image features. However, these models still face challenges in effectively capturing local associations. Moreover, since the encoder extracts global and local association features that focus on different semantic information, semantic noise may occur during the decoding stage. To address these issues, we propose the Local Relationship Enhanced Gated Transformer (LREGT). In the encoder part, we introduce the Local Relationship Enhanced Encoder (LREE), whose core component is the Local Relationship Enhanced Module (LREM). LREM consists of two novel designs: the Local Correlation Perception Module (LCPM) and the Local-Global Fusion Module (LGFM), which are beneficial for generating a comprehensive feature representation that integrates both global and local information. In the decoder part, we propose the Dual-level Multi-branch Gated Decoder (DMGD). It first creates multiple decoding branches to generate multi-perspective contextual feature representations. Subsequently, it employs the Dual-Level Gating Mechanism (DLGM) to model the multi-level relationships of these multi-perspective contextual features, enhancing their fine-grained semantics and intrinsic relationship representations. This ultimately leads to the generation of high-quality and semantically rich image captions. Experiments on the standard MSCOCO dataset demonstrate that LREGT achieves state-of-the-art performance, with a CIDEr score of 140.8 and a BLEU-4 score of 41.3, significantly outperforming existing mainstream methods. These results highlight LREGT's superiority in capturing complex visual relationships and resolving semantic noise during decoding.
Funding: Supported by the First Affiliated Hospital of Xi'an Jiaotong University Teaching Reform Project (Grant Nos. JG2023-0206 and JG2022-0324).
Abstract: Objective: In the Radiology Department of Mzuzu Central Hospital (MCH), daily training for radiographers now includes content on Computed Tomography (CT) image quality control and equipment maintenance to ensure the normal, continuous, and stable operation of the 16-slice spiral CT scanner. Methods: Through comprehensive analysis of the relevant equipment, we identified key parameters that significantly impact CT image quality. Innovative optimization strategies and solutions targeting these parameters were developed and integrated into daily training programs. Furthermore, starting from an examination of prevalent failure modes observed in CT equipment, we delve into the essential maintenance and preservation techniques that CT technologists must master to ensure optimal system performance. Results: (1) Crucial factors affecting CT image quality include artifacts, noise, partial volume effects, and surrounding gap phenomena, alongside spatial and density resolutions, CT dose, reconstruction algorithms, and human factors during the scanning process. In the daily training for radiographers, emphasis is placed on strictly implementing image quality control measures at every stage of the CT scanning process and skillfully applying advanced scanning and image processing techniques. By doing so, we can provide clinicians with accurate and reliable imaging references for diagnosis and treatment. (2) Strategies for CT equipment maintenance: ① environmental inspection of the CT room to ensure cleanliness and hygiene; ② rational and accurate operation, including calibration software proficiency; ③ regular maintenance and servicing to minimize machine downtime; ④ maintenance of the CT X-ray tube. CT technicians can become proficient in equipment maintenance and upkeep techniques through training, which can significantly extend the service life of CT systems and reduce the occurrence of malfunctions. Conclusion: Through the regular implementation of rigorous CT image quality control training for radiology technicians, coupled with diligent and proactive CT equipment maintenance, we have observed profound and beneficial impacts on image quality. The accuracy and fidelity of radiological data ultimately lead to more accurate diagnoses and effective treatments.
Funding: Supported by the National Natural Science Foundation of China (No. 82371933), the National Natural Science Foundation of Shandong Province of China (No. ZR2021MH120), the Taishan Scholars Project (No. tsqn202211378), and the Shandong Provincial Natural Science Foundation for Excellent Young Scholars (No. ZR2024YQ075).
Abstract: Objective: Early prediction of response before neoadjuvant chemotherapy (NAC) is crucial for personalized treatment plans for locally advanced breast cancer patients. We aim to develop a multi-task model using multi-scale whole slide image (WSI) features to predict the response to breast cancer NAC more finely. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of the weakly supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pCR to NAC. Our approach models two-by-two feature interactions across scales by employing concatenated fusion of single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent predictive performance for treatment response, with areas under the receiver operating characteristic curve (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806-0.933] in the internal testing set and 0.841 (95% CI: 0.814-0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763-0.964) in the internal testing set and 0.821 (95% CI: 0.763-0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754-0.903) and 0.821 (95% CI: 0.692-0.949) in treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P<0.05). Heatmaps were employed to interpret the decision-making basis of the model. Furthermore, during biological basis exploration, it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
Funding: Supported by the National Key R&D Program of China (No. 2022YFB3205101) and NSAF (No. U2230116).
Abstract: To improve image quality under low-illumination conditions, a novel low-light image enhancement method based on multi-illumination estimation and multi-scale fusion (MIMS) is proposed in this paper. Firstly, the illumination is processed by contrast-limited adaptive histogram equalization (CLAHE), an adaptive complementary gamma function (ACG), and an adaptive detail-preserving S-curve (ADPS), respectively, to obtain three components. Then, the fusion-relevant features, exposure and color contrast, are selected as the weight maps. Subsequently, these components and weight maps are fused at multiple scales to generate the enhanced illumination. Finally, the enhanced images are obtained by multiplying the enhanced illumination and the reflectance. Compared with existing approaches, the proposed method achieves an average increase of 0.81% and 2.89% in the structural similarity index measurement (SSIM) and peak signal-to-noise ratio (PSNR), and a decrease of 6.17% and 32.61% in the natural image quality evaluator (NIQE) and gradient magnitude similarity deviation (GMSD), respectively.
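The enhance-then-fuse idea above can be sketched compactly. The snippet below is our single-scale illustration under stated assumptions, not the paper's implementation: it enhances an illumination map with several gamma curves (standing in for the CLAHE/ACG/ADPS branches) and blends them with a Mertens-style well-exposedness weight; the full method additionally uses color-contrast weights and pyramid (multi-scale) blending.

```python
import numpy as np

def gamma_enhance(L, g):
    """Brighten an illumination map L (values in [0, 1]) with exponent g <= 1."""
    return np.power(L, g)

def exposure_weight(L, sigma=0.2):
    """Gaussian well-exposedness weight centered at mid-grey 0.5."""
    return np.exp(-((L - 0.5) ** 2) / (2 * sigma ** 2))

def fuse_illumination(L, gammas=(0.5, 0.7, 1.0)):
    """Blend several gamma-enhanced versions of L with normalized
    exposure weights (single-scale sketch of the MIMS fusion step)."""
    comps = [gamma_enhance(L, g) for g in gammas]
    ws = [exposure_weight(c) for c in comps]
    wsum = np.sum(ws, axis=0) + 1e-12  # avoid division by zero
    return sum(w * c for w, c in zip(ws, comps)) / wsum
```

In the Retinex-style pipeline of the paper, the fused illumination would then be multiplied back with the reflectance to form the enhanced image.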
Funding: Supported by the National Key Research and Development Program of China (No. 2019YFB2204302).
Abstract: Although guided image filtering (GIF) is known for preserving edges and fast computation, it may produce inaccurate outputs in depth map restoration. In this paper, a novel confidence-weighted GIF called mutual-structure weighted GIF (MSWGIF) is proposed, which replaces the mean filtering strategy of GIF when handling overlapping windows. The confidence value is composed of a depth term and a mutual-structure term, where the depth term is utilized to protect the edges of the output, and the mutual-structure term helps to select accurate windows while the structure characteristics of the guidance image are transferred to the output. Experimental results show that MSWGIF reduces the root mean square error (RMSE) by an average of 12.37%, the average growth rate of correlation (CORR) is 0.07%, and the average growth rate of the structural similarity index measure (SSIM) is 0.34%.
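MSWGIF modifies the window-averaging step of the classic guided filter, so the baseline is worth seeing concretely. Below is a numpy sketch of the standard guided filter (He et al.'s linear-model formulation), not the proposed mutual-structure weighting: per-window coefficients a, b are estimated from guidance I and input p, then averaged over overlapping windows.

```python
import numpy as np

def box(a, r):
    """Mean filter with window radius r (edge-padded), via 2-D cumulative sums."""
    pad = np.pad(a, r, mode='edge')
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))  # zero row/col so window sums are 4 lookups
    d = 2 * r + 1
    H, W = a.shape
    s = c[d:d + H, d:d + W] - c[:H, d:d + W] - c[d:d + H, :W] + c[:H, :W]
    return s / (d * d)

def guided_filter(I, p, r=2, eps=1e-3):
    """Classic guided image filter: q = a*I + b with per-window linear
    coefficients, mean-averaged over all windows covering each pixel
    (the step MSWGIF replaces with confidence-weighted averaging)."""
    mI, mp = box(I, r), box(p, r)
    cov_Ip = box(I * p, r) - mI * mp
    var_I = box(I * I, r) - mI * mI
    a = cov_Ip / (var_I + eps)  # eps regularizes flat regions
    b = mp - a * mI
    return box(a, r) * I + box(b, r)
```

On a constant guidance image the filter degenerates to plain box smoothing of p, which is the failure mode the confidence weighting targets in depth maps.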
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12175183).
Abstract: Gamma-ray imaging systems are powerful tools in radiographic diagnosis. However, the recorded images suffer from degradations such as noise, blurring, and downsampling, and consequently fail to meet high-precision diagnostic requirements. In this paper, we propose a novel single-image super-resolution algorithm to enhance the spatial resolution of gamma-ray imaging systems. A mathematical model of the gamma-ray imaging system is established based on maximum a posteriori estimation. Within the plug-and-play framework, the half-quadratic splitting method is employed to decouple the data fidelity term and the regularization term. An image denoiser using convolutional neural networks is adopted as an implicit image prior, referred to as a deep denoiser prior, eliminating the need to explicitly design a regularization term. Furthermore, the impact of the image boundary condition on reconstruction results is considered, and a method for estimating image boundaries is introduced. The results show that the proposed algorithm effectively addresses boundary artifacts. By increasing the pixel number of the reconstructed images, the proposed algorithm is capable of recovering more details. Notably, in both simulation and real experiments, the proposed algorithm is demonstrated to achieve subpixel resolution, surpassing the Nyquist sampling limit determined by the camera pixel size.
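The half-quadratic splitting (HQS) alternation described above is simple to write down. The following 1-D toy is our sketch, not the paper's algorithm: the degradation is a small circulant blur, and a plain moving average stands in for the CNN deep denoiser prior; each iteration alternates a least-squares data-fidelity step with a denoising step.

```python
import numpy as np

def hqs_restore(y, H, denoise, mu=0.1, iters=8):
    """Plug-and-play HQS sketch for y = H @ x: alternate
    x-step: argmin_x ||y - H x||^2 + mu * ||x - z||^2  (linear solve)
    z-step: z = denoise(x)                              (implicit prior)."""
    z = y.copy()
    A = H.T @ H + mu * np.eye(len(y))
    for _ in range(iters):
        x = np.linalg.solve(A, H.T @ y + mu * z)
        z = denoise(x)
    return x

def moving_average(v, r=1):
    """Toy stand-in denoiser: circular moving average of radius r."""
    n = len(v)
    idx = np.arange(n)
    out = np.zeros_like(v)
    for k in range(-r, r + 1):
        out += v[(idx + k) % n]
    return out / (2 * r + 1)
```

In the paper the denoiser is a learned CNN and the forward operator additionally models downsampling and estimated boundary conditions; the alternation itself is unchanged.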
Abstract: The clustering technique is used to examine each pixel in the image, which is assigned to one of the clusters depending on the minimum distance, to obtain a primary classified image with different intensity regions. A watershed transformation technique is then employed. This includes: computing the gradient of the classified image, dividing the image into markers, and checking the marker image to see if it has zero points (watershed lines). The watershed lines are then deleted in the marker image created by the watershed algorithm. A Region Adjacency Graph (RAG) and Region Adjacency Boundary (RAB) are created between two regions from the marker image. Finally, region merging is done according to region average intensity and two edge strengths (T1, T2). The authors' approach is tested on remote sensing and brain MR medical images. The final segmentation result is one closed boundary per actual region in the image.
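The first stage of the pipeline, minimum-distance clustering of pixel intensities, can be sketched as a 1-D k-means (our illustration under the assumption of a Lloyd-style iteration; the watershed and region-merging stages are omitted):

```python
import numpy as np

def kmeans_intensity(img, k=3, iters=20, seed=0):
    """1-D k-means on pixel intensities: each pixel is assigned to the
    cluster with the minimum distance to its center, producing the
    primary classified image described above."""
    rng = np.random.default_rng(seed)
    flat = img.reshape(-1).astype(np.float64)
    centers = rng.choice(flat, size=k, replace=False)
    for _ in range(iters):
        # Assign each pixel to its nearest cluster center.
        labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        # Update each center to the mean intensity of its pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = flat[labels == j].mean()
    return labels.reshape(img.shape), centers
```

The label image is what the gradient/watershed stage would then operate on.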
Funding: Supported in part by the National Natural Science Foundation of China (62101278, 62001379, 62271023) and the Beijing Natural Science Foundation (7242269).
Abstract: Deep learning (DL)-based image reconstruction methods have garnered increasing interest in the last few years. Numerous studies demonstrate that DL-based reconstruction methods function admirably in optical tomographic imaging techniques, such as bioluminescence tomography (BLT). Nevertheless, nearly every existing DL-based method utilizes an explicit neural representation for the reconstruction problem, which either consumes much memory space or requires various complicated computations. In this paper, we present a neural field (NF)-based image reconstruction scheme for BLT that uses an implicit neural representation. The proposed NF-based method establishes a transformation between the coordinate of an arbitrary spatial point and the source value of that point with a relatively lightweight multilayer perceptron, which has remarkable computational efficiency. Another simple neural network, composed of two fully connected layers and a 1D convolutional layer, is used to generate the neural features. Results of simulations and experiments show that the proposed NF-based method has similar performance to the photon density complement network and the two-stage network, while consuming fewer floating-point operations with fewer model parameters.
Funding: Supported by the National Natural Science Foundation of China (62201618).
Abstract: Deep learning techniques have significantly improved image restoration tasks in recent years. As a crucial component of deep learning, the loss function plays a key role in network optimization and performance enhancement. However, the currently prevalent loss functions assign equal weight to each pixel during loss calculation, which hampers the ability to reflect the roles of different pixels and fails to fully exploit the image's characteristics. To address this issue, this study proposes an asymmetric loss function based on the image and data characteristics of the image recovery task. This novel loss function adjusts the weight of the reconstruction loss based on the grey values of different pixels, thereby effectively optimizing network training by differentially utilizing the grey information from the original image. Specifically, we calculate a weight factor for each pixel based on its grey value and combine it with the reconstruction loss to create a new loss function. This ensures that pixels with smaller grey values receive greater attention, improving network recovery. To verify the effectiveness of the proposed asymmetric loss function, we conducted experimental tests on the image super-resolution task. The experimental results show that the model with asymmetric loss weights improves all indexes of the processing results without increasing the training time. In the typical super-resolution network SRCNN, introducing asymmetric weights improves the peak signal-to-noise ratio (PSNR) by up to about 0.5% and the structural similarity index (SSIM) by up to about 0.3%, and reduces the root-mean-square error (RMSE) by up to about 1.7%, with essentially no increase in training time. In addition, we further tested the performance of the proposed method on the denoising task to verify its potential applicability to image restoration tasks.
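The weighting idea generalizes to a few lines of code. The linear weighting function below is our assumption for illustration (the paper does not specify its exact form here): an MSE-style reconstruction loss where pixels with smaller grey values in the target receive larger weights.

```python
import numpy as np

def asymmetric_mse(pred, target, alpha=0.5):
    """Grey-value-weighted MSE sketch: darker target pixels get larger
    weights. The linear scheme w = 1 + alpha*(1 - g) is an illustrative
    assumption, with g the target grey value normalized to [0, 1]."""
    g = np.asarray(target, dtype=np.float64) / 255.0
    # Weight ranges from 1 + alpha (black pixels) down to 1 (white pixels).
    w = 1.0 + alpha * (1.0 - g)
    return float(np.mean(w * (np.asarray(pred, dtype=np.float64) - g * 255.0) ** 2))
```

The same equal-magnitude error thus costs more on a dark pixel than on a bright one, steering the network toward the low-grey regions the study emphasizes.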
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 62062003) and the Natural Science Foundation of Ningxia (Grant No. 2023AAC03293).
Abstract: Instance segmentation of impacted teeth in oral panoramic X-ray images is an active research topic. However, the complex structure, low contrast, and cluttered background of teeth in panoramic X-ray images make the task technically challenging. In particular, the contrast between impacted teeth and periodontal tissues such as the gingiva, periodontal membrane, and alveolar bone is low, resulting in fuzzy boundaries of the impacted teeth. A model based on Teeth YOLACT is proposed to provide a more efficient and accurate solution for segmenting impacted teeth in oral panoramic X-ray films. First, a Multi-scale Res-Transformer Module (MRTM) is designed, in which depthwise separable convolutions with different receptive fields enhance the model's sensitivity to lesion size, and a Vision Transformer is integrated to improve its ability to perceive global features. Second, a Context Interaction-awareness Module (CIaM) is designed to fuse deep and shallow features: the deep semantic features guide the shallow spatial features, the shallow spatial features are embedded into the deep semantic features, and a cross-weighted attention mechanism aggregates the two efficiently to obtain richer context information. Third, an Edge-preserving Perception Module (E2PM) is designed to enhance tooth edge features: a first-order differential operator yields tooth-edge weights, improving the perception of edge features, and the shallow spatial features are fused through linear mapping, weight concatenation, and matrix multiplication to preserve edge information. Finally, comparison and ablation experiments are conducted on oral panoramic X-ray image datasets. The results show that the APdet, APseg, ARdet, ARseg, mAPdet, and mAPseg of the proposed model reach 89.9%, 91.9%, 77.4%, 77.6%, 72.8%, and 73.5%, respectively. This study further verifies the application potential of combining multi-scale feature extraction, multi-scale feature fusion, and edge-perception enhancement in medical image segmentation, providing a valuable reference for future research.
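The abstract states that the E2PM obtains tooth-edge weights from a first-order differential operator. The module's internal architecture is not given in the abstract, so the following is only a minimal sketch of that one step, assuming a Sobel operator (a standard first-order differential operator) and a normalized per-pixel edge-weight map that could modulate shallow features:

```python
import numpy as np

def sobel_edge_weights(img):
    """Return a normalized edge-weight map from a 2-D grayscale image."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # horizontal gradient kernel
    ky = kx.T                                                          # vertical gradient kernel
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    mag = np.hypot(gx, gy)                       # first-order gradient magnitude
    return mag / mag.max() if mag.max() > 0 else mag
```

In a network such weights would typically be applied as a multiplicative mask on shallow feature maps so that edge pixels are emphasized during fusion.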
Funding: Co-supported by the National Natural Science Foundation of China (Nos. U1833117 and 61806015) and the National Key Research and Development Program of China (No. 2017YFB0503402).
Abstract: Aircraft systems have recently gained a reputation as reliable and efficient tools for sensing and parsing aerial scenes. However, accurate and fast semantic segmentation of high-resolution aerial images for remote-sensing applications still faces three challenges: the limited processing resources and low-latency requirements of aerial platforms, the trade-off between high accuracy and real-time efficiency, and confusing objects with large intra-class variations and small inter-class differences in high-resolution aerial images. To address these issues, a lightweight dual-path deep convolutional architecture, the Aerial Bilateral Segmentation Network (Aerial-BiSeNet), is proposed to perform real-time segmentation of high-resolution aerial images with favorable accuracy. Specifically, inspired by the receptive-field concept in human visual systems, a Receptive Field Module (RFM) is proposed to encode rich multi-scale contextual information. Based on the channel attention mechanism, two novel modules, the Feature Attention Module (FAM) and the Channel Attention based Feature Fusion Module (CAFFM), are proposed to refine and combine features effectively and boost model performance. Aerial-BiSeNet is evaluated on the Potsdam and Vaihingen datasets, where it achieves leading performance compared with other state-of-the-art models in terms of both accuracy and efficiency.
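The abstract names FAM and CAFFM as channel-attention-based modules but does not specify their internals. As a hedged illustration of the general channel-attention pattern such modules build on (squeeze by global average pooling, excite through a small gating function, re-weight channels), here is a minimal NumPy sketch; the weight shapes and reduction ratio are assumptions, not the paper's design:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Re-weight channels of feat (C, H, W) using gating weights w1 (C//r, C), w2 (C, C//r)."""
    s = feat.mean(axis=(1, 2))               # squeeze: global average pool -> (C,)
    z = np.maximum(w1 @ s, 0.0)              # excitation: FC + ReLU
    a = 1.0 / (1.0 + np.exp(-(w2 @ z)))      # FC + sigmoid -> per-channel gate in (0, 1)
    return feat * a[:, None, None]           # broadcast gate over spatial dims
```

In a fusion module like CAFFM, the gate would typically be computed from the concatenated or summed features of the two paths before re-weighting them.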
Funding: This research was supported by the Research Fund for the Doctoral Program of Higher Education (No. 99025508).
Abstract: Objective assessment of fabric pilling based on light projection and image analysis has recently been exploited. The device for capturing cross-sectional images of pilled fabrics under light projection is elaborated, and the detection of the profile line and the integration of sequential cross-sectional pilled images are discussed. A threshold based on a Gaussian model is recommended for pill segmentation. The results show that the installed system can eliminate interference from fabric color and pattern in the pill information.
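The abstract recommends a Gaussian-model threshold for pill segmentation without giving its parameters. A common reading of such a scheme is to model the background profile signal as Gaussian and label as pills the samples deviating more than k standard deviations from the mean; the following sketch assumes that interpretation, and the value of k is a placeholder:

```python
import numpy as np

def gaussian_threshold(signal, k=3.0):
    """Binary pill mask: True where the profile signal exceeds mean + k * std."""
    mu, sigma = signal.mean(), signal.std()  # Gaussian background model parameters
    return signal > mu + k * sigma
```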
Funding: Supported by the Beijing Natural Science Foundation (7232146), National Natural Science Foundation of China (NSFC) Grants (62475277, 62105355, 82122034, 82327805, 81927807, 62275062), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0930000), Shenzhen Science and Technology Innovation Grants (JCYJ20220531100409023, JCYJ20210324101403010, JCYJ20220818101403008), the Project of the Shandong Innovation and Startup Community of High-end Medical Apparatus and Instruments Grant (2021-SGTTXM005), the Shandong Province Technology Innovation Guidance Plan (Central Leading Local Science and Technology Development Fund, YDZX2023115), and the Taishan Scholar Special Funding Project of Shandong Province.
Abstract: Photoacoustic computed tomography (PACT) is an innovative biomedical imaging technique that has found significant application in biomedicine owing to its ability to visualize optical contrast with high resolution and deep tissue penetration. However, the inherent challenges of photoacoustic signal excitation, propagation, and detection often result in suboptimal image quality. To overcome these limitations, researchers have developed advanced algorithms spanning the entire image-reconstruction pipeline. This review presents a detailed analysis of the latest advancements in PACT algorithms and synthesizes them into a coherent framework, providing a tripartite analysis from signal processing through reconstruction to image processing and covering a spectrum of techniques. Their principles and methodologies, as well as their applicability and limitations, are thoroughly discussed. The primary objective of this study is to provide a thorough review of advanced algorithms applicable to PACT, offering both theoretical foundations and practical guidance for enhancing PACT imaging.
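The reconstruction stage of the PACT pipeline is not detailed in the abstract. As a hedged, minimal illustration of one classical approach it likely covers, delay-and-sum back-projection, the sketch below sums each sensor's signal at the time-of-flight from every image point; the geometry, sound speed, and sampling rate are illustrative assumptions:

```python
import numpy as np

def delay_and_sum(signals, sensor_pos, grid_pts, c, fs):
    """signals: (n_sensors, n_samples); sensor_pos, grid_pts: (n, 2) in meters;
    c: speed of sound (m/s); fs: sampling rate (Hz)."""
    n_sensors, n_samples = signals.shape
    image = np.zeros(len(grid_pts))
    for s in range(n_sensors):
        d = np.linalg.norm(grid_pts - sensor_pos[s], axis=1)  # point-to-sensor distances
        idx = np.round(d / c * fs).astype(int)                # time-of-flight in samples
        valid = idx < n_samples
        image[valid] += signals[s, idx[valid]]                # back-project each sensor trace
    return image / n_sensors
```

Practical PACT reconstructions refine this basic idea with apodization, bandwidth compensation, and model-based or learned priors, which is where much of the algorithmic work the review surveys takes place.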