Despite its remarkable performance on natural images, the Segment Anything Model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. First, a local multi-scale feature encoder is added to SAM to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. Compared with eight other representative image segmentation models, the proposed model shows average improvements ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
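To give a rough sense of why shrinking the input matters for a ViT encoder, the sketch below counts patch tokens at both resolutions. It assumes non-overlapping 16×16 patches (the patch size of the original SAM image encoder) and a global self-attention layer whose cost grows with the square of the token count. The abstract does not state the patch size used in LMSFE-SAM, so the function names and the numbers are illustrative only.

```python
def num_patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def attention_pairs(tokens: int) -> int:
    """Token pairs scored by one global self-attention layer (quadratic in tokens)."""
    return tokens * tokens


if __name__ == "__main__":
    large = num_patch_tokens(1024)  # assumed patch size of 16
    small = num_patch_tokens(256)
    print("tokens at 1024x1024:", large)
    print("tokens at 256x256:", small)
    print("attention-pair ratio:", attention_pairs(large) // attention_pairs(small))
```

Under these assumptions, a 4× reduction in side length cuts the token count by 16× and the global attention pair count by 256×.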
A new two-step framework is proposed for image segmentation. In the first step, the gray-value distribution of the given image is reshaped to have larger inter-class variance and smaller intra-class variance. In the second step, discriminant-based or clustering-based methods are applied to the reshaped distribution. The study focuses on a typical clustering method, the Gaussian mixture model (GMM), and its variant to demonstrate the feasibility of the framework. Because the first step is independent of the second, it can be integrated into both pixel-based and histogram-based methods to improve their segmentation quality. Experiments on artificial and real images show that the framework achieves effective and robust segmentation results.
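The two quantities targeted by the first step can be made concrete with a small sketch. It computes the between-class (inter-class) and within-class (intra-class) variance of gray values for a two-class split at a threshold. The abstract does not describe how the distribution is reshaped, so the "reshaped" list below is a hand-made, hypothetical stand-in used only to show the direction of the change.

```python
def class_variances(values, threshold):
    """Return (between_class_variance, within_class_variance) for a two-class split."""
    low = [v for v in values if v <= threshold]
    high = [v for v in values if v > threshold]
    if not low or not high:
        raise ValueError("threshold must leave both classes non-empty")
    n = len(values)
    w0, w1 = len(low) / n, len(high) / n          # class weights
    m0, m1 = sum(low) / len(low), sum(high) / len(high)  # class means
    v0 = sum((v - m0) ** 2 for v in low) / len(low)
    v1 = sum((v - m1) ** 2 for v in high) / len(high)
    between = w0 * w1 * (m0 - m1) ** 2
    within = w0 * v0 + w1 * v1
    return between, within


if __name__ == "__main__":
    original = [90, 100, 110, 140, 150, 160]
    reshaped = [80, 82, 84, 170, 172, 174]  # hypothetical reshaped gray values
    print("original:", class_variances(original, 125))
    print("reshaped:", class_variances(reshaped, 125))
```

A distribution with higher between-class and lower within-class variance is easier for a threshold or a mixture model to separate, which is the motivation the abstract gives for the first step.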
Lower back pain is one of the most common medical problems in the world and is experienced by a large percentage of people everywhere. Because it produces a detailed view of the soft tissues, including the spinal cord, nerves, intervertebral discs, and vertebrae, Magnetic Resonance Imaging (MRI) is thought to be the most effective method for imaging the spine. The semantic segmentation of vertebrae plays a major role in the diagnosis of lumbar diseases. It is difficult to semantically separate the vertebrae in MRI images from the variety of surrounding tissues, including muscles, ligaments, and intervertebral discs. U-Net is a powerful deep-learning architecture that handles the challenges of medical image analysis tasks and achieves high segmentation accuracy. This work proposes a modified U-Net architecture, named MU-Net, that includes a Meijering convolutional layer incorporating the Meijering filter to perform semantic segmentation of lumbar vertebrae L1 to L5 and sacral vertebra S1. Pseudo-colour mask images were generated and used as ground truth for training the model. The work was carried out on 1312 images expanded from T1-weighted mid-sagittal MRI images of 515 patients in the Lumbar Spine MRI Dataset, publicly available from Mendeley Data. On this dataset, the proposed MU-Net model achieves 98.79% pixel accuracy (PA), 98.66% Dice similarity coefficient (DSC), 97.36% Jaccard coefficient, and 92.55% mean Intersection over Union (mean IoU).
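The overlap metrics named in this abstract can be illustrated with a minimal sketch for a single binary mask pair. This is not the authors' evaluation code: the function names are hypothetical, masks are plain nested lists of 0/1 values, and the multi-class averaging behind mean IoU is omitted.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives over two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Pixel accuracy, Dice coefficient, and Jaccard coefficient for one mask pair."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    overlap_denominator = tp + fp + fn
    # Convention assumed here: two empty foregrounds count as a perfect match.
    dice = 1.0 if overlap_denominator == 0 else 2 * tp / (2 * tp + fp + fn)
    jaccard = 1.0 if overlap_denominator == 0 else tp / overlap_denominator
    return {
        "pixel_accuracy": (tp + tn) / (tp + fp + fn + tn),
        "dice": dice,
        "jaccard": jaccard,
    }


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    truth = [[1, 0, 0], [0, 1, 1]]
    print(segmentation_metrics(pred, truth))
```

For a single mask pair the two overlap scores are tied by J = D / (2 − D), so Dice is never lower than Jaccard on the same prediction.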
Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require a large amount of computation for training or searching. In this paper, a novel training-free method that uses the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than with simple manipulations, without incurring the heavy computation of training a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only in image segmentation but also in data augmentation.
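The step of obtaining prompt boxes from the original annotations can be sketched as follows, assuming each annotation is available as a binary mask. The function name and the (x_min, y_min, x_max, y_max) box layout are assumptions made for this illustration; the abstract does not specify how PTSAM-DA derives or perturbs its boxes, and the call that passes the box to a pre-trained SAM is deliberately left out.

```python
def mask_to_box(mask):
    """Tight (x_min, y_min, x_max, y_max) box around nonzero pixels, or None if empty."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing annotated, so there is no box to prompt with
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    annotation = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    print(mask_to_box(annotation))
```

A box of this kind would then serve as the prompt for the segmentation model, whose predicted mask becomes the augmented annotation.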
Deep learning (DL), derived from the domain of Artificial Neural Networks (ANN), forms one of the most essential components of modern deep learning algorithms. DL segmentation models rely on layer-by-layer convolution-based feature representation, guided by forward and backward propagation. A critical aspect of this process is the selection of an appropriate activation function (AF) to ensure robust model learning. However, existing activation functions often fail to effectively address the vanishing gradient problem or are complicated by the need for manual parameter tuning. Most current research on activation function design focuses on classification tasks using natural image datasets such as MNIST, CIFAR-10, and CIFAR-100. To address this gap, this study proposes Med-ReLU, a novel activation function specifically designed for medical image segmentation. Med-ReLU prevents deep learning models from suffering dead neurons or vanishing gradient issues. It is a hybrid activation function that combines the properties of ReLU and Softsign. For positive inputs, Med-ReLU adopts the linear behavior of ReLU to avoid vanishing gradients, while for negative inputs, it exhibits Softsign's polynomial convergence, ensuring robust training and avoiding inactive neurons across the training set. The training performance and segmentation accuracy of Med-ReLU have been thoroughly evaluated, demonstrating stable learning behavior and resistance to overfitting. It consistently outperforms state-of-the-art activation functions in medical image segmentation tasks. Designed as a parameter-free function, Med-ReLU is simple to implement in complex deep learning architectures, and its effectiveness spans various neural network models and anomaly detection scenarios.
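The hybrid behaviour described here can be sketched in a few lines. The abstract does not give the exact formula, so this sketch assumes the negative branch is the standard Softsign, x / (1 + |x|), and the positive branch is the identity as in ReLU; the names med_relu_sketch and med_relu_sketch_grad are hypothetical.

```python
def softsign(x: float) -> float:
    """Standard Softsign: x / (1 + |x|), bounded in (-1, 1)."""
    return x / (1.0 + abs(x))


def med_relu_sketch(x: float) -> float:
    """Identity for positive inputs, Softsign for the rest (assumed form)."""
    return x if x > 0 else softsign(x)


def med_relu_sketch_grad(x: float) -> float:
    """Derivative of the sketch: 1 for positive inputs, 1 / (1 + |x|)^2 otherwise."""
    return 1.0 if x > 0 else 1.0 / (1.0 + abs(x)) ** 2


if __name__ == "__main__":
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(x, med_relu_sketch(x), med_relu_sketch_grad(x))
```

Under this assumed form the gradient for negative inputs decays polynomially but never reaches exactly zero, which matches the abstract's claim about avoiding inactive neurons, and the function needs no tunable parameter.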
Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the rapid development of deep learning, deep-learning-based remote sensing image segmentation models have emerged and produced a large body of research. This article reviews the latest achievements in deep-learning-based remote sensing image segmentation and explores future development directions. First, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Second, deep-learning-based segmentation models are classified and summarized, and the principles, characteristics, and applications of the various models are presented. Then, the key technologies involved in deep-learning-based remote sensing image segmentation are introduced. Finally, future development directions and application prospects of remote sensing image segmentation are discussed. The review can provide reference and inspiration for research on remote sensing image segmentation.
The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. To address them, a hierarchical feature fusion network (HFFNet) was proposed. First, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions, and the extracted features were processed independently. Subsequently, the features from the transformer and the CNN were fused under the guidance of features from different sources. This fusion helped restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
The objective of the research is to develop a fast procedure for segmenting typical videophone images. In this paper, a new approach to color image segmentation based on the HSI (Hue, Saturation, Intensity) color model is reported. In contrast to conventional approaches, it uses the three components of the HSI color model in succession. This strategy makes the segmentation procedure much faster and more effective. Experimental results with typical "head-and-shoulders" real images taken from videophone sequences show that the new approach can fulfill the application requirements.
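For readers unfamiliar with the HSI model, the sketch below converts one RGB pixel to hue, saturation, and intensity using the common geometric formulation. It only illustrates the three components that the approach processes in succession; it makes no claim about how the paper orders or thresholds them, and the function name is hypothetical.

```python
import math


def rgb_to_hsi(r: float, g: float, b: float):
    """Convert RGB in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation are undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; guard against rounding.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0.0:
        hue = 0.0  # gray pixel: hue is undefined, use 0 by convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity


if __name__ == "__main__":
    for name, rgb in (("red", (1, 0, 0)), ("green", (0, 1, 0)), ("blue", (0, 0, 1))):
        print(name, rgb_to_hsi(*rgb))
```

Because hue and saturation are separated from intensity, a segmentation procedure can test one component at a time instead of working in the full three-dimensional RGB space.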
The mixture-model-based image segmentation method, which assumes that image pixels are independent and does not consider the positional relationship between pixels, is not robust to noise and usually leads to misclassification. A new segmentation method, called the multi-resolution Gaussian mixture model method, is proposed. First, an image pyramid is constructed and a son-father link relationship is built between adjacent levels of the pyramid. Then the mixture model segmentation method is applied to the top level. The segmentation result on the top level is passed top-down to the bottom level according to the son-father link relationship between levels. Because the proposed method considers both local and global information of the image, it overcomes the effect of noise and obtains better segmentation results. Experimental results demonstrate its effectiveness.
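The pyramid idea can be sketched under simplifying assumptions: each father pixel is the average of a fixed 2×2 block of sons, a plain threshold stands in for the Gaussian mixture model at the top level, and labels are copied down along the fixed links. The paper's actual link construction and mixture fitting are not given in the abstract, so every function name and constant here is hypothetical.

```python
def downsample_2x2(image):
    """Build the next pyramid level: each father is the mean of its four sons."""
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("image sides must be even")
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def threshold_labels(image, threshold):
    """Stand-in for mixture-model segmentation at the top level."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def propagate_labels(coarse_labels, fine_height, fine_width):
    """Each son at (y, x) inherits the label of its father at (y // 2, x // 2)."""
    return [
        [coarse_labels[y // 2][x // 2] for x in range(fine_width)]
        for y in range(fine_height)
    ]


if __name__ == "__main__":
    noisy = [
        [10, 12, 200, 210],
        [11, 250, 205, 198],  # 250 is a noise pixel in the dark region
        [9, 10, 202, 0],      # 0 is a noise pixel in the bright region
        [12, 11, 199, 201],
    ]
    top = downsample_2x2(noisy)
    labels = propagate_labels(threshold_labels(top, 128), 4, 4)
    for row in labels:
        print(row)
```

In this toy case the two isolated noise pixels are averaged away at the coarse level, so the labels passed back down are clean, which is the effect the abstract attributes to using global information.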
The growth patterns of mammary fat pads and of the glandular tissues inside them may be related to the risk factors of breast cancer. Quantitative measurements of this relationship become available after segmentation of the mammary pads and glandular tissues. Rat fat pads may lose continuity along image sequences or adjoin areas of similar intensity, such as the epidermis and subcutaneous regions. A new approach for automatic tracing and segmentation of fat pads in magnetic resonance imaging (MRI) sequences is presented, which does not require the number of pads to be constant or the spatial location of pads to be adjacent among image slices. First, each image is decomposed into a cartoon image and a texture image based on the cartoon-texture model. These are used as the smooth image for segmentation and the feature image for targeting pad seeds, respectively. Then, two-phase direct energy segmentation based on the Chan-Vese active contour model is applied to partition the cartoon image into a set of regions, from which the pad boundary is traced iteratively from the pad seed. A tracing algorithm based on scanning order is proposed to trace the pad boundary accurately; it effectively removes the epidermis attached to the pad without any post-processing and solves the problem of over-segmentation of small holes inside the pad. The experimental results demonstrate the utility of this approach in accurately delineating various numbers of mammary pads from several sets of MRI images.
With the widespread application of deep learning in computer vision, using medical imaging technology to assist doctors in making diagnoses has great practical and research significance. To address the shortcomings of the traditional U-Net model in 3D spatial information extraction, over-fitting, and low degree of semantic information fusion, an improved medical image segmentation model is used to achieve more accurate segmentation of medical images. In this model, the residual network (ResNet) is used to address the over-fitting problem. To process and aggregate data at different scales, the inception network is used instead of the traditional convolutional layer, and dilated convolution is used to increase the receptive field. A conditional random field (CRF) completes the contour refinement. Compared with the traditional 3D U-Net, the segmentation accuracy of the improved model on liver and tumor images increases by 2.89% and 7.66%, respectively. As one stage of an image processing pipeline, the method can be used not only for medical image segmentation but also as a foundation for subsequent 3D image reconstruction.
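How dilated convolution enlarges the receptive field can be checked with simple arithmetic. The sketch below uses the standard relation for the effective kernel size, k + (k − 1)(d − 1), and assumes a stack of stride-1 convolution layers; the layer configurations shown are illustrative and are not taken from the paper.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along a single axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of stacked stride-1 convolutions given (kernel, dilation) pairs."""
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel(kernel_size, dilation) - 1
    return field


if __name__ == "__main__":
    plain = [(3, 1), (3, 1), (3, 1)]      # three ordinary 3x3 layers
    dilated = [(3, 1), (3, 2), (3, 4)]    # same layers with growing dilation
    print("plain:", receptive_field(plain))
    print("dilated:", receptive_field(dilated))
```

With the same number of weights per layer, the dilated stack sees a noticeably wider context, which is why dilation is a cheap way to increase the receptive field.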
The Spatially Constrained Mixture Model (SCMM) is an image segmentation model that works within the framework of maximum a posteriori estimation and Markov Random Fields (MAP-MRF). It defines its own maximization step for use within this framework. This research proposes an improvement to the SCMM's maximization step for segmenting simulated brain Magnetic Resonance Images (MRIs). The improved model is named the Weighted Spatially Constrained Finite Mixture Model (WSCFMM). To compare the performance of SCMM and WSCFMM, simulated T1-weighted normal MRIs were segmented. A region of interest (ROI) was extracted from the segmented images. The similarity between the extracted ROI and the ground truth (GT) was measured with the Jaccard and Dice similarity measures. Compared with the SCMM, WSCFMM showed an overall improvement of 4.72% according to the Jaccard measure and 2.65% according to the Dice measure. In addition, WSCFMM significantly stabilized and reduced the execution time, showing an improvement of 83.71%. The study concludes that WSCFMM is a stable model and performs better than the SCMM in both noisy and noise-free environments.
To reduce the computation cost of combining a probabilistic graphical model with a deep neural network in semantic segmentation, the local region conditional random field (LRCRF) model is investigated, which selectively applies the conditional random field (CRF) to the most active region in the image. The fully convolutional network structure is optimized with the ResNet-18 structure and dilated convolution to expand the receptive field. The tracking networks are also improved based on SiameseFC by considering the frame relations in consecutive-frame traffic scene maps. Moreover, segmentation results obtained with greyscale input data are more stable and effective than those obtained with RGB images for deep neural network feature extraction. The experimental results show that the proposed method takes advantage of the image features directly and achieves good real-time performance and high segmentation accuracy.
To address the inefficient segmentation of complex topography images by the traditional C-V model and the time-consuming process of solving the level set function with partial differential equations, an improved Chan-Vese model is presented in this paper. The model maintains the topological properties of the traditional level set method while avoiding the numerical solution of partial differential equations, so the same segmentation results can be obtained easily. Thus, a stable foundation for rapid segmentation-based image reconstruction and identification is established.
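As background for the Chan-Vese (C-V) model, the sketch below evaluates the region-fitting part of the classic C-V energy for a given inside/outside partition: the two region means c1 and c2, and the weighted sum of squared deviations from them. It shows the classic data term only, omits the contour-length regularizer, and does not reproduce the improvement proposed in the paper; the function names and the weights lam1 and lam2 are assumptions for illustration.

```python
def region_means(image, inside):
    """Mean intensity inside (c1) and outside (c2) the contour mask."""
    inside_values, outside_values = [], []
    for image_row, mask_row in zip(image, inside):
        for value, is_inside in zip(image_row, mask_row):
            (inside_values if is_inside else outside_values).append(value)
    c1 = sum(inside_values) / len(inside_values) if inside_values else 0.0
    c2 = sum(outside_values) / len(outside_values) if outside_values else 0.0
    return c1, c2


def fitting_energy(image, inside, lam1=1.0, lam2=1.0):
    """Classic C-V data term: squared deviation from the mean of each region."""
    c1, c2 = region_means(image, inside)
    energy = 0.0
    for image_row, mask_row in zip(image, inside):
        for value, is_inside in zip(image_row, mask_row):
            if is_inside:
                energy += lam1 * (value - c1) ** 2
            else:
                energy += lam2 * (value - c2) ** 2
    return energy


if __name__ == "__main__":
    image = [[0, 0, 10, 10], [0, 0, 10, 10]]
    aligned = [[0, 0, 1, 1], [0, 0, 1, 1]]   # contour on the true boundary
    shifted = [[0, 1, 1, 1], [0, 1, 1, 1]]   # contour one column off
    print(fitting_energy(image, aligned), fitting_energy(image, shifted))
```

The data term is smallest when the partition separates the two intensity populations, which is the quantity that level-set evolution in the classic model drives down.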
The dynamic transmission characteristics and the sensitivities of the three-stage idler gear system of a new NC power turret are studied in this paper. Considering strongly nonlinear factors such as the periodically time-varying mesh stiffness and the nonlinear tooth backlash, a lumped-parameter model of the gear system is developed with one rotational and two translational degrees of freedom for each gear. The eigenvalues and eigenvectors are derived and analyzed on the basis of real modal theory. The sensitivities of the natural frequencies to design parameters, including supporting and meshing stiffnesses, gear masses, and moments of inertia, are also calculated by the direct differential method. The results show the quantitative and qualitative impact of the parameters on the natural characteristics of the gear system. Furthermore, the periodic steady-state solutions are obtained by a numerical approach based on the nonlinear model. These results are employed to gain insight into the primary controlling parameters, to forecast the severity of the dynamic response, and to assess the acceptability of the gear design.
Liver hydatid disease is a common parasitic disease in farming and pastoral areas that seriously affects people's health. Based on the CT imaging features of this disease, an iterative approach for simultaneous liver segmentation and hydatid lesion extraction is proposed. Each iteration of the algorithm consists of two main steps: 1) according to the user-defined pixel seeds in the liver and hydatid lesion, Gaussian probability model fitting and smoothed Bayesian classification are applied to obtain an initial segmentation of the liver and lesion; 2) a parametric active contour model using a prior shape force field is adopted to refine the initial segmentation. The validity of the proposed algorithm is evaluated subjectively and objectively through liver and hydatid lesion segmentation experiments on CT slices from different patients. In comparison with ground-truth manual segmentation, the experimental results show the effectiveness of the method in segmenting the liver and hydatid lesions.
Jacquard image segmentation is one of the primary steps in image analysis for jacquard pattern identification. The main aim is to recognize homogeneous regions within a jacquard image as distinct regions belonging to different patterns. Active contour models have become popular for finding the contours of patterns with complex shapes. However, the performance of active contour models is often inadequate in noisy environments. In this paper, a robust algorithm based on the Mumford-Shah model is proposed for the segmentation of noisy jacquard images. First, the Mumford-Shah model is discretized on piecewise linear finite element spaces to yield greater stability. Then, an iterative relaxation algorithm for numerically solving the discrete version of the model is presented. In this algorithm, an adaptive triangular mesh is refined to generate a Delaunay-type triangular mesh defined on structured triangulations, and a quasi-Newton numerical method is then applied to find the absolute minimum of the discrete model. Experimental results on noisy jacquard images demonstrate the efficacy of the proposed algorithm.
An effective way to recognize complex underwater objects in a sonar image is to segment the image into target, shadow, and sea-bottom reverberation regions and then extract the edge of the object. Because of the time-varying and space-varying characteristics of the underwater acoustic environment, sonar images have poor quality and serious speckle noise, so traditional image segmentation cannot achieve precise results. In this paper, the image segmentation process based on the Markov random field (MRF) model is studied, and a practical method of estimating the model parameters is proposed. By analyzing the impact of the chosen model parameters, a sonar image segmentation algorithm based on an MRF model with fixed parameters is proposed. It achieves both a good segmentation effect and a low computing load. Applied to a synthesized texture image and an actual side-scan sonar image, the algorithm achieves precise segmentation results.
Computed Tomography (CT) is a commonly used technology in non-destructive testing of Printed Circuit Boards (PCB), and element segmentation of CT images is a key subsequent step. With the development of deep learning, researchers began to exploit the “pre-training and fine-tuning” training process for multi-element segmentation, reducing the time spent on manual annotation. However, existing element segmentation models focus only on the overall accuracy at the pixel level, ignoring whether the element connectivity relationship is correctly identified. To this end, this paper proposes a PCB CT image element segmentation model that optimizes the semantic perception of connectivity relationships (OSPC-seg). The overall training adopts a “pre-training and fine-tuning” process. A loss function that optimizes the semantic perception of circuit connectivity relationships (OSPC Loss) is designed to alleviate the class imbalance problem and improve the correct connectivity rate. In addition, the correct connectivity rate (CCR) index is proposed to evaluate the model's ability to recognize connectivity relationships. Experiments show that the mIoU and CCR of OSPC-seg on the authors' datasets are 90.1% and 97.0%, improvements of 1.5% and 1.6%, respectively, over the baseline model. The visualization results show that the segmentation performance at connection positions is significantly improved, which also demonstrates the effectiveness of OSPC-seg.
In this paper, we consider the Chan–Vese (C-V) model for image segmentation and obtain its numerical solution accurately and efficiently. For this purpose, we present a local radial basis function method based on a Gaussian kernel (GA-LRBF) for spatial discretization. Compared to the standard radial basis function method, this approach consumes less CPU time and maintains good stability because it uses only a small subset of points in the whole computational domain. Additionally, since the Gaussian function has the property of dimensional separation, the GA-LRBF method is suitable for dealing with isotropic images. Finally, a numerical scheme that couples GA-LRBF with the fourth-order Runge–Kutta method is applied to the C-V model, and a comparison of numerical results demonstrates that this scheme achieves much more reliable image segmentation.
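The time-stepping half of such a scheme can be illustrated with the classical fourth-order Runge–Kutta update. In the paper it is coupled with the GA-LRBF spatial discretization of the C-V evolution equation; the sketch below instead applies it to the scalar test problem dy/dt = −y, which is an assumption made purely to keep the example self-contained, and the function names are hypothetical.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2.0, y + h * k1 / 2.0)
    k3 = f(t + h / 2.0, y + h * k2 / 2.0)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(f, y0, t_end, steps):
    """Integrate from t = 0 to t_end with a fixed number of equal RK4 steps."""
    h = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


if __name__ == "__main__":
    approx = integrate(lambda t, y: -y, 1.0, 1.0, 10)
    print("RK4:", approx, "exact:", math.exp(-1.0))
```

In a full solver, y would be the vector of level-set values at the discretization points and f would be the spatially discretized right-hand side.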
Funding (LMSFE-SAM): supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Funding (two-step segmentation framework): supported by the National Natural Science Foundation of China (60505004, 60773061).
Funding: Natural Science Foundation of Zhejiang Province, Grant/Award Number: LY23F020025; Science and Technology Commissioner Program of Huzhou, Grant/Award Number: 2023GZ42; Sichuan Provincial Science and Technology Support Program, Grant/Award Numbers: 2023ZHCG0005, 2023ZHCG0008.
Abstract: Data augmentation plays an important role in training deep neural models by expanding the size and diversity of the dataset. Initially, data augmentation mainly involved simple transformations of images. Later, to increase the diversity and complexity of data, more advanced methods appeared and evolved into sophisticated generative models. However, these methods require a large amount of computation for training or searching. In this paper, a novel training-free method that uses the pre-trained Segment Anything Model (SAM) as a data augmentation tool (PTSAM-DA) is proposed to generate augmented annotations for images. Without any training, it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations. In this way, annotations are augmented more ingeniously than with simple manipulations, without incurring the heavy computation of training a data augmentation model. Multiple comparative experiments are conducted on three datasets: an in-house dataset, ADE20K, and COCO2017. On the in-house dataset, the Agricultural Plot Segmentation Dataset, maximum improvements of 3.77% and 8.92% are gained in two mainstream metrics, mIoU and mAcc, respectively. Consequently, large vision models like SAM prove promising not only in image segmentation but also in data augmentation.
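The abstract above says that prompt boxes are obtained from the original annotations but does not say how. As a minimal sketch, assuming each annotation is a binary mask and that a prompt box is simply the tight bounding box in (x_min, y_min, x_max, y_max) order, the step could look like the following. The function name `mask_to_box` and the box convention are illustrative assumptions, not the authors' code, and no SAM call is shown.

```python
import numpy as np


def mask_to_box(mask):
    """Return the tight bounding box (x_min, y_min, x_max, y_max) of a binary mask.

    Hypothetical helper: the box convention is an assumption, not taken from the paper.
    Returns None when the mask has no foreground pixels.
    """
    ys, xs = np.nonzero(np.asarray(mask))
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


if __name__ == "__main__":
    demo = np.zeros((5, 5), dtype=np.uint8)
    demo[1:3, 2:5] = 1  # foreground occupies rows 1-2, columns 2-4
    print(mask_to_box(demo))
```

A box produced this way would then be passed as the prompt to whatever SAM inference interface is in use.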
Funding: The researchers would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2025).
Abstract: Deep learning (DL), derived from the domain of Artificial Neural Networks (ANN), is one of the most essential components of modern learning algorithms. DL segmentation models rely on layer-by-layer convolution-based feature representation, guided by forward and backward propagation. A critical aspect of this process is the selection of an appropriate activation function (AF) to ensure robust model learning. However, existing activation functions often fail to address the vanishing gradient problem effectively or are complicated by the need for manual parameter tuning. Most current research on activation function design focuses on classification tasks using natural image datasets such as MNIST, CIFAR-10, and CIFAR-100. To address this gap, this study proposes Med-ReLU, a novel activation function specifically designed for medical image segmentation. Med-ReLU prevents deep learning models from suffering dead neurons or vanishing gradient issues. It is a hybrid activation function that combines the properties of ReLU and Softsign. For positive inputs, Med-ReLU adopts the linear behavior of ReLU to avoid vanishing gradients, while for negative inputs, it exhibits the Softsign’s polynomial convergence, ensuring robust training and avoiding inactive neurons across the training set. The training performance and segmentation accuracy of Med-ReLU have been thoroughly evaluated, demonstrating stable learning behavior and resistance to overfitting. It consistently outperforms state-of-the-art activation functions in medical image segmentation tasks. Designed as a parameter-free function, Med-ReLU is simple to implement in complex deep learning architectures, and its effectiveness spans various neural network models and anomaly detection scenarios.
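The abstract above describes Med-ReLU only in words: linear like ReLU for positive inputs, Softsign-like for negative inputs, with no tunable parameters. A minimal NumPy sketch of one function with exactly those properties is given below. The precise formula is an assumption inferred from that description (identity for x > 0, x / (1 + |x|) otherwise), and the name `med_relu_sketch` is hypothetical; the paper's actual definition may differ.

```python
import numpy as np


def med_relu_sketch(x):
    """Hybrid activation sketch (assumed form, not confirmed by the paper).

    x > 0  -> x                (ReLU branch, gradient 1)
    x <= 0 -> x / (1 + |x|)    (Softsign branch, gradient 1 / (1 + |x|)**2, never exactly 0)
    """
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, x / (1.0 + np.abs(x)))


if __name__ == "__main__":
    print(med_relu_sketch([-3.0, -1.0, 0.0, 2.0]))
```

Under this assumed form the negative branch is bounded below by -1 and keeps a non-zero gradient, which is consistent with the abstract's claim of avoiding dead neurons.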
Abstract: Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the rapid development of deep learning, remote sensing image segmentation models based on deep learning have gradually emerged and produced a large number of research achievements. This article reviews the latest deep-learning-based achievements in remote sensing image segmentation and explores future development directions. Firstly, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Secondly, segmentation models based on deep learning are classified and summarized, and the principles, characteristics, and applications of the various models are presented. Then, the key technologies involved in deep-learning-based remote sensing image segmentation are introduced. Finally, the future development directions and application prospects of remote sensing image segmentation are discussed. The review can provide reference and inspiration for research on remote sensing image segmentation.
Funding: Supported by the National Natural Science Foundation of China (No. 52374155) and the Anhui Provincial Natural Science Foundation (No. 2308085 MF218).
Abstract: The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. A hierarchical feature fusion network (HFFNet) was proposed. Firstly, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process helped restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
Abstract: The objective of the research is to develop a fast procedure for segmenting typical videophone images. In this paper, a new approach to color image segmentation based on the HSI (Hue, Saturation, Intensity) color model is reported. In contrast to conventional approaches, it uses the three components of the HSI color model in succession. This strategy makes the segmentation procedure much faster and more effective. Experimental results with typical “head-and-shoulders” real images taken from videophone sequences show that the new approach can fulfill the application requirements.
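The approach above operates on the hue, saturation, and intensity components, so a first step is converting RGB pixels to HSI. The sketch below uses the standard textbook RGB-to-HSI formulas and assumes RGB values scaled to [0, 1] with hue returned in radians. It illustrates the color model only; it is not the paper's segmentation procedure, and the function name `rgb_to_hsi` is an illustrative choice.

```python
import numpy as np


def rgb_to_hsi(rgb):
    """Convert RGB values in [0, 1] (last axis of size 3) to (hue, saturation, intensity).

    Textbook conversion; hue is in radians in [0, 2*pi) and is set to 0 for gray pixels,
    where it is undefined.
    """
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-12

    intensity = (r + g + b) / 3.0
    min_rgb = np.minimum(np.minimum(r, g), b)
    saturation = np.where(intensity > eps, 1.0 - min_rgb / np.maximum(intensity, eps), 0.0)

    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    theta = np.arccos(np.clip(num / np.maximum(den, eps), -1.0, 1.0))
    hue = np.where(b > g, 2.0 * np.pi - theta, theta)
    hue = np.where(den > eps, hue, 0.0)  # gray pixels: hue undefined, use 0
    return hue, saturation, intensity


if __name__ == "__main__":
    h, s, i = rgb_to_hsi([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.5]])
    print(np.degrees(h), s, i)
```

With the three channels separated like this, each one can be thresholded or clustered in turn, which is one plausible reading of using the components "in succession".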
Funding: This project was supported by the National Natural Science Foundation of China (60404022) and the Foundation of the Department of Education of Hebei Province (2002209).
Abstract: Mixture-model-based image segmentation, which assumes that image pixels are independent and does not consider the positional relationship between pixels, is not robust to noise and usually leads to misclassification. A new segmentation method, called the multi-resolution Gaussian mixture model method, is proposed. First, an image pyramid is constructed and a son-father link relationship is built between the levels of the pyramid. Then the mixture model segmentation method is applied to the top level. The segmentation result on the top level is passed top-down to the bottom level according to the son-father link relationship between levels. Because the proposed method considers both local and global image information, it overcomes the effect of noise and obtains better segmentation results. Experimental results demonstrate its effectiveness.
Funding: Supported by the National Basic Research Program of China (No. 2003CB716103); partially supported by the US Army Breast Cancer Research Program (DAMD17-03-1-0446).
Abstract: The growth patterns of mammary fat pads and the glandular tissues inside them may be related to the risk factors of breast cancer. Quantitative measurements of this relationship become available after segmentation of mammary pads and glandular tissues. Rat fat pads may lose continuity along image sequences or adjoin areas of similar intensity, such as epidermis and subcutaneous regions. A new approach for automatic tracing and segmentation of fat pads in magnetic resonance imaging (MRI) image sequences is presented, which does not require the number of pads to be constant or the spatial locations of pads to be adjacent among image slices. First, each image is decomposed into a cartoon image and a texture image based on the cartoon-texture model. These are used as the smooth image for segmentation and the feature image for targeting pad seeds, respectively. Then, two-phase direct energy segmentation based on the Chan-Vese active contour model is applied to partition the cartoon image into a set of regions, from which the pad boundary is traced iteratively from the pad seed. A tracing algorithm based on scanning order is proposed to trace the pad boundary accurately; it effectively removes the epidermis attached to the pad without any post-processing and solves the problem of over-segmentation of small holes inside the pad. The experimental results demonstrate the utility of this approach in accurately delineating various numbers of mammary pads from several sets of MRI images.
Abstract: With the widespread application of deep learning in computer vision, using medical imaging technology to assist doctors in making diagnoses has great practical and research significance. To address the shortcomings of the traditional U-Net model in 3D spatial information extraction, over-fitting, and low degree of semantic information fusion, an improved medical image segmentation model is used to achieve more accurate segmentation of medical images. In this model, the residual network (ResNet) is used to solve the over-fitting problem. To process and aggregate data at different scales, the inception network is used instead of the traditional convolutional layer, and dilated convolution is used to increase the receptive field. A conditional random field (CRF) performs contour refinement. Compared with the traditional 3D U-Net network, the segmentation accuracy of the improved model on liver and tumor images increases by 2.89% and 7.66%, respectively. As part of the image processing pipeline, the method can be used not only for medical image segmentation but also to lay the foundation for subsequent 3D image reconstruction.
Abstract: The Spatially Constrained Mixture Model (SCMM) is an image segmentation model that works within the framework of maximum a-posteriori and Markov Random Field (MAP-MRF). It has its own maximization step for use within this framework. This research proposes an improvement to the SCMM’s maximization step for segmenting simulated brain Magnetic Resonance Images (MRIs). The improved model is named the Weighted Spatially Constrained Finite Mixture Model (WSCFMM). To compare the performance of SCMM and WSCFMM, simulated T1-weighted normal MRIs were segmented. A region of interest (ROI) was extracted from the segmented images. The similarity between the extracted ROI and the ground truth (GT) was measured using the Jaccard and Dice similarity measures. Against the SCMM, WSCFMM showed an overall improvement of 4.72% by the Jaccard measure and 2.65% by the Dice measure. In addition, WSCFMM significantly stabilized and reduced the execution time, showing an improvement of 83.71%. The study concludes that WSCFMM is a stable model and performs better than the SCMM in noisy and noise-free environments.
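The Jaccard and Dice measures used above compare a segmented region with the ground truth. A minimal sketch for binary masks follows, using the standard definitions J = |A ∩ B| / |A ∪ B| and D = 2|A ∩ B| / (|A| + |B|). The function name and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def jaccard_and_dice(pred, gt):
    """Return (Jaccard, Dice) for two binary masks of the same shape.

    Convention (an assumption): two empty masks count as a perfect match (1.0, 1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()

    jaccard = intersection / union if union else 1.0
    dice = 2.0 * intersection / total if total else 1.0
    return float(jaccard), float(dice)


if __name__ == "__main__":
    roi = [[1, 1], [0, 0]]
    truth = [[1, 0], [0, 0]]
    print(jaccard_and_dice(roi, truth))
```

The two scores are linked by D = 2J / (1 + J), so Dice is never lower than Jaccard for the same pair of masks.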
Abstract: To reduce the computation cost of combining a probabilistic graphical model with a deep neural network in semantic segmentation, the local region conditional random field (LRCRF) model is investigated, which selectively applies the conditional random field (CRF) to the most active region in the image. The fully convolutional network structure is optimized with the ResNet-18 structure and dilated convolution to expand the receptive field. The tracking networks are also improved based on SiameseFC by considering the frame relations in consecutive-frame traffic scene maps. Moreover, the segmentation results on greyscale input data sets are more stable and effective than those obtained using RGB images for deep neural network feature extraction. The experimental results show that the proposed method takes advantage of the image features directly and achieves good real-time performance and high segmentation accuracy.
Abstract: To address the inefficient segmentation of complex topography images by the traditional C-V model and the time-consuming process of solving the level set function through partial differential equations, an improved Chan-Vese model is presented in this paper. The model retains the traditional level set method's good performance in maintaining topological properties while avoiding the numerical solution of partial differential equations, so the same segmentation results can be obtained easily. Thus, a stable foundation for rapid segmentation based on image reconstruction identification is established.
Abstract: The dynamic transmission characteristics and the sensitivities of the three-stage idler gear system of the new NC power turret are studied in this paper. Considering strongly nonlinear factors such as the periodically time-varying mesh stiffness and the nonlinear tooth backlash, a lumped-parameter model of the gear system is developed with one rotational and two translational degrees of freedom for each gear. The eigenvalues and eigenvectors are derived and analyzed on the basis of real modal theory. The sensitivities of natural frequencies to design parameters, including supporting and meshing stiffnesses, gear masses, and moments of inertia, are also calculated by the direct differential method. The results show the quantitative and qualitative impact of the parameters on the natural characteristics of the gear system. Furthermore, the periodic steady-state solutions are obtained by a numerical approach based on the nonlinear model. These results are employed to gain insight into the primary controlling parameters, to forecast the severity of the dynamic response, and to assess the acceptability of the gear design.
Funding: Science Special Fund for "Special Training" of Ethnical Minority Professional and Technical Intelligent in Xinjiang, sponsored by the Science and Technology Department of Xinjiang Uygur Autonomous Region (grant number: 200723104); National Natural Science Foundation of China (grant number: 30960097).
Abstract: Liver hydatid disease is a common parasitic disease in farming and pastoral areas that seriously affects people's health. Based on the CT imaging features of this disease, an iterative approach that simultaneously segments the liver and extracts hydatid lesions is proposed. Each iteration of the algorithm consists of two main steps: 1) starting from user-defined pixel seeds in the liver and the hydatid lesion, Gaussian probability model fitting and smoothed Bayesian classification are applied to obtain an initial segmentation of the liver and lesion; 2) a parametric active contour model using a prior shape force field is adopted to refine the initial segmentation. The validity of the proposed algorithm is evaluated subjectively and objectively through liver and hydatid lesion segmentation experiments on CT slices from different patients. Compared with ground-truth manual segmentation, the experimental results show that the method segments the liver and hydatid lesions effectively.
Funding: Project (No. 2003AA411021) supported by the Hi-Tech Research and Development Program (863) of China.
Abstract: Jacquard image segmentation is one of the primary steps in image analysis for jacquard pattern identification. The main aim is to identify homogeneous regions within a jacquard image as distinct regions belonging to different patterns. Active contour models have become popular for finding the contours of a pattern with a complex shape. However, the performance of active contour models is often inadequate in noisy environments. In this paper, a robust algorithm based on the Mumford-Shah model is proposed for the segmentation of noisy jacquard images. First, the Mumford-Shah model is discretized on piecewise linear finite element spaces to yield greater stability. Then, an iterative relaxation algorithm for numerically solving the discrete version of the model is presented. In this algorithm, an adaptive triangular mesh is refined to generate a Delaunay-type triangular mesh defined on structured triangulations, and then a quasi-Newton numerical method is applied to find the absolute minimum of the discrete model. Experimental results on noisy jacquard images demonstrate the efficacy of the proposed algorithm.
Funding: Supported by the China Postdoctoral Science Foundation (Grant No. LRB00025), the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20050217010), and the Foundation of the Underwater Acoustic Technology National Key Lab (Grant No. 9140C200501060C20).
Abstract: An effective way to recognize complex underwater objects in a sonar image is to segment the image into target, shadow, and sea-bottom reverberation regions and then extract the edge of the object. Because the underwater acoustic environment varies in time and space, sonar images have poor quality and serious speckle noise, so traditional image segmentation cannot achieve precise results. In this paper, the image segmentation process based on the MRF (Markov random field) model is studied, and a practical method of estimating model parameters is proposed. By analyzing the impact of the chosen model parameters, a sonar image segmentation algorithm based on a fixed-parameter MRF model is proposed. It achieves both good segmentation quality and a low computing load. Applied to a synthesized texture image and an actual side-scan sonar image, the algorithm yields precise segmentation results.
Abstract: Computed Tomography (CT) is a commonly used technology in Printed Circuit Board (PCB) non-destructive testing, and element segmentation of CT images is a key subsequent step. With the development of deep learning, researchers began to exploit the “pre-training and fine-tuning” training process for multi-element segmentation, reducing the time spent on manual annotation. However, existing element segmentation models focus only on overall pixel-level accuracy, ignoring whether the connectivity relationships between elements are correctly identified. To this end, this paper proposes a PCB CT image element segmentation model that optimizes the semantic perception of connectivity relationships (OSPC-seg). The overall training adopts a “pre-training and fine-tuning” process. A loss function that optimizes the semantic perception of circuit connectivity relationships (OSPC Loss) is designed to alleviate the class imbalance problem and improve the correct connectivity rate. In addition, the correct connectivity rate (CCR) index is proposed to evaluate the model’s ability to recognize connectivity relationships. Experiments show that the mIoU and CCR of OSPC-seg on our datasets are 90.1% and 97.0%, improved by 1.5% and 1.6% respectively compared with the baseline model. Visualization results show that segmentation performance at connection positions is significantly improved, which also demonstrates the effectiveness of OSPC-seg.
Funding: Sponsored by the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2021A1515110680 and Guangzhou Basic and Applied Basic Research under Grant No. 202102020340.
Abstract: In this paper, we consider the Chan–Vese (C-V) model for image segmentation and obtain its numerical solution accurately and efficiently. For this purpose, we present a local radial basis function method based on a Gaussian kernel (GA-LRBF) for spatial discretization. Compared to the standard radial basis function method, this approach consumes less CPU time and maintains good stability because it uses only a small subset of points in the whole computational domain. Additionally, since the Gaussian function has the property of dimensional separation, the GA-LRBF method is suitable for dealing with isotropic images. Finally, a numerical scheme that couples GA-LRBF with the fourth-order Runge–Kutta method is applied to the C-V model, and a comparison of numerical results demonstrates that this scheme achieves much more reliable image segmentation.