Inspections of power transmission lines (PTLs) conducted using unmanned aerial vehicles (UAVs) are complicated by the fine structure of the lines and by complex backgrounds, making accurate and efficient segmentation challenging. This study presents the Wavelet-Guided Transformer U-Net (WGT-UNet) model, a new hybrid network that combines Convolutional Neural Networks (CNNs), the Discrete Wavelet Transform (DWT), and Transformer architectures. The model's primary contribution is the use of spatial and channel attention mechanisms derived from wavelet subbands to guide the Transformer's self-attention structure. Low- and high-frequency components are separated at each stage using the DWT, suppressing structural noise and making linear objects more prominent. The design is supported by multi-component hybrid cost functions that simultaneously address class imbalance, edge sharpness, structural integrity, and spatial regularity. With the DWT-guided attention mechanism, the model produces sharp boundaries and continuous line structures. Experiments conducted on the TTPLA dataset show that the version using the ConvNeXt backbone outperforms current state-of-the-art approaches, with an F1-Score of 79.33% and an Intersection over Union (IoU) value of 68.38%. The models and visual outputs of the developed method and all compared models can be accessed at https://github.com/burhanbarakli/WGT-UNET.
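The frequency separation that the WGT-UNet abstract attributes to the DWT can be illustrated with a minimal sketch. It assumes a single-level orthonormal Haar wavelet on a plain list-of-lists image; the abstract does not say which wavelet family WGT-UNet uses, and the function and subband names below are hypothetical.

```python
def haar_dwt2(image):
    """One level of a 2D orthonormal Haar DWT (illustrative, not the paper's code).

    `image` is a list of equal-length rows with even height and width.
    Returns four half-size subbands: (approx, row_detail, col_detail, diag_detail).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2)  # low-frequency average
            r_row.append((a + b - d - e) / 2)  # change between the two rows
            c_row.append((a - b + d - e) / 2)  # change between the two columns
            d_row.append((a - b - d + e) / 2)  # diagonal change
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


if __name__ == "__main__":
    # A thin vertical line: only the column-detail subband responds.
    line = [[0, 1], [0, 1]]
    print(haar_dwt2(line))
```

A thin vertical structure ends up in the column-detail subband while flat background stays in the approximation subband, which is the kind of separation an attention module could exploit to emphasize linear objects.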
Thyroid nodules, a common disorder of the endocrine system, require accurate segmentation in ultrasound images for effective diagnosis and treatment. However, precise segmentation remains a challenge because of scattering noise, low contrast, and limited resolution in ultrasound images. Although existing segmentation models have made progress, they still suffer from limitations such as high error rates, low generalizability, overfitting, and limited feature learning capability. To address these challenges, this paper proposes a Multi-level Relation Transformer-based U-Net (MLRT-UNet) to improve thyroid nodule segmentation. The MLRT-UNet leverages a novel Relation Transformer, which processes images at multiple scales, overcoming the limitations of traditional encoding methods. This transformer integrates local and global features through self-attention and cross-attention units, capturing intricate relationships within the data. The approach also introduces a Co-operative Transformer Fusion (CTF) module that combines multi-scale features from different encoding layers, enhancing the model's ability to capture complex patterns in the data. Furthermore, the Relation Transformer block strengthens long-distance dependencies during decoding, improving segmentation accuracy. Experimental results show that the MLRT-UNet achieves high segmentation accuracy, reaching 98.2% on the Digital Database Thyroid Image (DDT) dataset, 97.8% on the Thyroid Nodule 3493 (TG3K) dataset, and 98.2% on the Thyroid Nodule 3K (TN3K) dataset. These findings demonstrate that the proposed method significantly enhances the accuracy of thyroid nodule segmentation, addressing the limitations of existing models.
Retinal vessel segmentation is a challenging medical task owing to small datasets, very fine blood vessels, and low image contrast. To address these issues, this paper introduces a novel convolutional neural network that takes advantage of both adversarial learning and recurrent neural networks. An iterative network design with a recurrent unit gradually refines the segmentation results from the input retinal image. The recurrent unit preserves high-level semantic information for feature reuse, so that the network outputs a sufficiently refined segmentation map instead of a coarse mask. Moreover, an adversarial loss imposes integrity and connectivity constraints on the segmented vessel regions, greatly reducing topology errors in the segmentation. Experimental results on the DRIVE dataset show that the method achieves an area under the curve of 98.17% and a sensitivity of 80.64%. The method achieves superior performance in retinal vessel segmentation compared with other existing state-of-the-art methods.
Existing glass segmentation networks have high computational complexity and a large memory footprint, leading to high hardware requirements and time overheads for model inference, which is not conducive to efficiency-seeking real-time tasks such as autonomous driving. The inefficiency of the models is mainly due to employing homogeneous modules to process features of different layers. These modules require computationally intensive convolutions and weight calculation branches with numerous parameters to accommodate the differences in information across layers. We propose an efficient glass segmentation network (EGSNet) based on a multi-level heterogeneous architecture and boundary awareness to balance model performance and efficiency. EGSNet divides the feature layers from different stages into low-level understanding, semantic-level understanding, and global understanding with boundary guidance. Based on the information differences among the layers, we further propose the multi-angle collaborative enhancement (MCE) module, which extracts detailed information from shallow features, and the large-scale contextual feature extraction (LCFE) module, which captures semantic logic from deep features. The models are trained and evaluated on the glass segmentation datasets HSO (Home-Scene-Oriented) and Trans10k-stuff, and EGSNet achieves the best efficiency and performance compared to advanced methods. On the HSO test set, the IoU, Fβ, MAE (Mean Absolute Error), and BER (Balance Error Rate) of EGSNet are 0.804, 0.847, 0.084, and 0.085, respectively, and the computational cost is only 27.15 GFLOPs (giga floating-point operations). Experimental results show that EGSNet significantly improves the efficiency of the glass segmentation task with better performance.
Semantic segmentation for mixed scenes of aerial remote sensing and road traffic is one of the key technologies for the visual perception of flying cars. State-of-the-art (SOTA) semantic segmentation methods have made remarkable achievements in both fine-grained segmentation and real-time performance. However, they still face great challenges from the huge differences in scale and semantic categories that arise in mixed scenes of aerial remote sensing and road traffic, and there is little related research. To address this issue, this paper proposes a semantic segmentation model specifically for mixed datasets of aerial remote sensing and road traffic scenes. First, a novel decoding-recoding multi-scale feature iterative refinement structure is proposed, which uses the re-integration and continuous enhancement of multi-scale information to deal with the huge scale differences between cross-domain scenes, while using a fully convolutional structure to meet lightweight and real-time requirements. Second, a well-designed cross-window attention mechanism combined with a global information integration decoding block forms an enhanced global context perception, which captures the long-range dependencies and multi-scale global context information of different scenes, thereby achieving fine-grained semantic segmentation. The proposed method is tested on a large-scale mixed dataset of aerial remote sensing and road traffic scenes. The results confirm that it can effectively deal with the large scale differences in cross-domain scenes. Its segmentation accuracy surpasses that of the SOTA methods while meeting real-time requirements.
Traditional image segmentation methods based on the Markov random field (MRF) converge slowly and require a predefined weight. To address these disadvantages, a fast segmentation approach based on a simple MRF is proposed for SAR images. The approach first performs coarse segmentation in blocks. The image is then modeled with a simple MRF, and adaptive variable weighting forms are applied in homogeneous and heterogeneous regions. As a result, convergence is accelerated while the segmentation results in homogeneous regions and at borders are improved. Simulations with synthetic and real SAR images demonstrate the effectiveness of the proposed approach.
Simple linear iterative clustering (SLIC) is widely used because of its controllable superpixel number, accurate edge coverage, symmetrical production, and fast computation. The main problem of the SLIC algorithm is under-segmentation when it is applied to artificial structure images with unobvious boundaries and narrow regions. Therefore, this paper presents an improved clustering segmentation algorithm that corrects the segmentation results of SLIC. The allocation of a pixel is related not only to its own characteristics but also to those of its surrounding pixels. Hence, it is appropriate to improve the standard SLIC by focusing on pixels at boundaries. An improved SLIC method that adheres better to image boundaries is proposed, using the first- and second-order difference operators as magnification factors. Experimental results demonstrate that the proposed method achieves excellent boundary adherence for artificial structure images. The application of the proposed method is extended to images with unobvious boundaries in the Berkeley Segmentation Dataset BSDS500. In comparison with SLIC, the boundary adherence is clearly increased.
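For context, the pixel-to-cluster distance of standard SLIC can be sketched as follows. This is the baseline measure, not the improved method of the abstract above, whose difference-operator weighting is not specified there; the function and parameter names are hypothetical.

```python
import math


def slic_distance(pixel_lab, pixel_xy, center_lab, center_xy, grid_step, compactness):
    """Standard SLIC distance D = sqrt(d_c^2 + (d_s / S)^2 * m^2).

    d_c is the color distance in CIELAB, d_s the spatial distance in pixels,
    S (`grid_step`) the sampling interval between cluster centers, and
    m (`compactness`) the weight trading color similarity against proximity.
    """
    d_c = math.dist(pixel_lab, center_lab)
    d_s = math.dist(pixel_xy, center_xy)
    return math.sqrt(d_c ** 2 + (d_s / grid_step) ** 2 * compactness ** 2)


if __name__ == "__main__":
    print(slic_distance((53, 4, 0), (3, 4), (50, 0, 0), (0, 0), grid_step=10, compactness=10))
```

A larger `compactness` makes superpixels more regular at the expense of boundary adherence, which is the trade-off that boundary-focused variants try to improve.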
To overcome the shortcomings of the 1D and 2D Otsu thresholding techniques, the 3D Otsu method has been developed. Among all Otsu methods, the 3D Otsu technique provides the best threshold values for multi-level thresholding. In this paper, a simple and effective multilevel thresholding method is introduced to improve the quality of segmented images. The proposed approach focuses on preserving edge detail by computing 3D Otsu together with image fusion. The advantages of the presented scheme include higher-quality outcomes, better preservation of tiny details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image; it aims to improve localized information after the thresholding process. Fusing images based on local contrast can improve segmentation performance by minimizing the loss of local contrast, details, and gray-level distributions. Results show that the proposed method yields more promising segmentation results than the conventional 1D Otsu, 2D Otsu, and 3D Otsu methods, as evident from objective and subjective evaluations.
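As background for the Otsu variants compared above, the following is a minimal sketch of classic single-threshold 1D Otsu, which picks the gray level that maximizes the between-class variance. It is not the 3D Otsu or fusion scheme of the abstract; the function name is hypothetical.

```python
def otsu_threshold(histogram):
    """Return the threshold t maximizing between-class variance (classic 1D Otsu).

    `histogram[i]` is the number of pixels with gray level i.
    Pixels with level <= t form class 0, the rest form class 1.
    """
    total = sum(histogram)
    weighted_total = sum(level * count for level, count in enumerate(histogram))
    best_t, best_variance = 0, -1.0
    count0 = 0        # number of pixels in class 0 so far
    weighted0 = 0     # sum of gray levels in class 0 so far
    for t, count in enumerate(histogram):
        count0 += count
        weighted0 += t * count
        if count0 == 0:
            continue  # class 0 is still empty
        count1 = total - count0
        if count1 == 0:
            break     # class 1 is empty from here on
        mean0 = weighted0 / count0
        mean1 = (weighted_total - weighted0) / count1
        # Between-class variance up to the constant factor 1 / total^2.
        variance = count0 * count1 * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t


if __name__ == "__main__":
    # Bimodal toy histogram with peaks at levels 2 and 7.
    print(otsu_threshold([0, 0, 10, 0, 0, 0, 0, 10, 0, 0]))
```

The 2D and 3D variants extend the same criterion to joint histograms that also include neighborhood statistics, which is why their cost grows quickly with the number of threshold levels.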
This study proposes a novel nature-inspired meta-heuristic optimizer, called RSA-SSA, that combines the Reptile Search Algorithm with the Salp Swarm Algorithm for image segmentation using gray-scale multi-level thresholding. The proposed method introduces a better search space in which to find the optimal solution at each iteration. RSA-SSA is designed to avoid repeatedly searching the same area and to determine the optimal multi-level thresholds. The solutions obtained by the proposed method are represented using the image histogram. RSA-SSA employs Otsu's between-class variance function to obtain the best threshold values at each level. The performance of the proposed method is validated using the fitness function value, structural similarity index, peak signal-to-noise ratio, and Friedman ranking test. Several benchmark COVID-19 images are used to validate the performance of the proposed RSA-SSA. The results show that RSA-SSA outperforms other metaheuristic optimization algorithms published in the literature.
Detecting and segmenting the lung regions in chest X-ray images is an important part of artificial intelligence-based computer-aided diagnosis/detection (AI-CAD) systems for chest radiography. However, if the chest X-ray images themselves are used as training data for the AI-CAD system, the system might learn irrelevant image-based information, decreasing its performance. In this study, we propose a lung region segmentation method that automatically removes the shoulder and scapula regions, the mediastinum, and the diaphragm regions in advance from various chest X-ray images to be used as learning data. The proposed method consists of three main steps. First, the simple linear iterative clustering algorithm, the lazy snapping technique, and a local entropy filter are employed to generate an entropy map. Second, morphological operations are applied to the entropy map to obtain a lung mask. Third, automated segmentation of the lung field is performed using the obtained mask. A total of 30 images were used for the experiments. To verify the effectiveness of the proposed method, two other texture maps, namely the maps created by standard deviation filtering and range filtering, were used for comparison. The proposed method using the entropy map was able to appropriately remove the unnecessary regions. In addition, it was able to remove the markers present in the image, whereas the other two methods could not. The experimental results show that the proposed method is a highly generalizable and useful algorithm. We believe that this method might play an important role in enhancing the performance of AI-CAD systems for chest X-ray images.
Objective: For computer-aided Chinese medical diagnosis, and to address the problem of insufficient segmentation, a novel multi-level method based on the multi-scale fusion residual neural network (MF2ResU-Net) model is proposed. Methods: To obtain refined features of retinal blood vessels, three cascaded U-Net networks are employed. To deal with the difference between the encoder and decoder parts, MF2ResU-Net uses shortcut connections to combine the encoder and decoder layers in the blocks. To refine the segmentation features, atrous spatial pyramid pooling (ASPP) is embedded to obtain multi-scale features for the final segmentation networks. Results: MF2ResU-Net was superior to existing methods on the criteria of sensitivity (Sen), specificity (Spe), accuracy (ACC), and area under curve (AUC), with values of 0.8013 and 0.8102, 0.9842 and 0.9809, 0.9700 and 0.9776, and 0.9797 and 0.9837, respectively, for DRIVE and CHASE_DB1. The experimental results demonstrated the effectiveness and robustness of the model in segmenting blood vessels that are small or have complex curvature. Conclusion: Based on residual connections and multi-feature fusion, the proposed method can obtain accurate segmentation of retinal blood vessels by refining the segmentation features, which can provide another diagnosis method for computer-aided Chinese medical diagnosis.
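The Sen, Spe, and ACC criteria reported above follow from pixel-wise confusion counts. The sketch below assumes flattened binary masks (1 = vessel, 0 = background) that contain at least one pixel of each class; the function name and toy data are hypothetical and not taken from the paper.

```python
def vessel_metrics(predicted, reference):
    """Sensitivity, specificity, and accuracy of a binary mask against a reference.

    Both arguments are flat sequences of 0/1 labels of equal length.
    Assumes the reference contains at least one positive and one negative pixel.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # vessel found
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # vessel missed
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # background kept
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # background mislabeled
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


if __name__ == "__main__":
    reference = [1] * 10 + [0] * 90
    predicted = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 85
    print(vessel_metrics(predicted, reference))
```

Because vessel pixels are a small minority, accuracy and specificity tend to be high even for mediocre masks, so sensitivity is usually the more telling of the three numbers.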
A novel stepwise thresholding method for fuzzy image segmentation is proposed. Unlike published iterative or recursive thresholding methods, this method segments regions into sub-regions iteratively by increasing the threshold value in a stepwise manner, based on a preset intensity homogeneity criterion. The method is particularly suited to the segmentation of laser scanning confocal microscopy (LSCM) images, computerised tomography (CT) images, magnetic resonance (MR) images, fingerprint images, etc. The method has been tested on some typical fuzzy image data sets. In this paper, the novel stepwise thresholding is addressed first. Next, a new method of region labelling for region extraction is introduced. Then the design of the intensity homogeneity segmentation criterion is presented. Finally, some examples of experimental results of fuzzy image segmentation by the method are given.
Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a "split-and-enhance" principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy with lower computational cost compared to recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches an MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Complex product development inevitably involves planning the design of multiple coupled activities. Overlapping these activities can reduce product development time, but at the risk of additional cost. Although existing research considers the information dependence of downstream tasks on upstream tasks, the overall iteration of the design process caused by information interdependence between activities is hardly discussed, especially the effect of the valid information accumulation process on that overall iteration. In addition, most studies focus only on a single overlapping process between two activities and rarely take into account the multi-segment and multi-ply overlapping of multiple coupled activities, especially the inherent link between product development time and cost that originates from this overlapping. To solve these problems, and to address the insufficiency of accumulated valid information in the overlapping process, a function of the valid information evolution (VIE) degree is constructed. Stochastic process theory is used to describe design information exchange and valid information accumulation in the overlapping segment, and planning models of a single overlapping segment are then built. On this basis, by analyzing the overlapping processes and overlapping features of multiple coupled activities, multi-segment and multi-ply overlapping planning models are built. By sorting the overlapping processes and analyzing the construction of these planning models, two conclusions are obtained: (1) for multi-segment and multi-ply overlapping of multiple coupled activities, the total decrement of the task set development time is the sum of the time decrements caused by the basic overlapping segments minus the sum of the time increments caused by the multiple overlapping segments; (2) the total increment of development cost is the sum of the cost increments caused by all overlapping processes. Then, based on an analysis of the overlapping degree of these planning models and using the VIE degree function, four lemmas are proved theoretically, and two propositions are finally proved: (1) multi-ply overlapping of multiple coupled activities weakens the effect of basic overlapping on reducing the development cycle time; (2) overlapping multiple coupled activities decreases the product development cycle but increases product development cost, so there is a trade-off between development time and cost. Accordingly, two methods are given to slacken and eliminate multi-ply overlapping effects. Finally, an example of a vehicle upper subsystem design illustrates the application of the proposed models; compared with a sequential execution pattern, the decrease in development cycle (22%) and the increase in development cost (3%) show the validity of the method in the example. The proposed research not only lays a theoretical foundation for correctly planning complex product development processes, but also provides specific and effective operational methods for overlapping multiple coupled activities.
By converting an optimal control problem for nonlinear systems into a Hamiltonian system, a symplectic-preserving method is proposed. The state and costate variables are approximated by Lagrange polynomials. The state variables at the two ends of the time interval are taken as independent variables. Based on the dual variable principle, the nonlinear optimal control problem is replaced with a set of nonlinear equations. Furthermore, in the implementation of the symplectic algorithm, a multilevel method is proposed based on the 2N algorithm. When the time grid is refined from a low level to a high level, the initial state and costate variables of the nonlinear equations can be obtained by Lagrange interpolation on the low-level grid to improve efficiency. Numerical simulations show the precision and efficiency of the proposed algorithm.
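The multilevel warm start described above, where a coarse-grid solution is interpolated to seed the solve on a finer grid, can be sketched with plain Lagrange interpolation. The grids, sample values, and function names below are hypothetical; the abstract does not give the actual discretization.

```python
def lagrange_interpolate(nodes, values, t):
    """Evaluate at t the Lagrange polynomial through the points (nodes[i], values[i])."""
    total = 0.0
    for i, (t_i, v_i) in enumerate(zip(nodes, values)):
        basis = 1.0
        for j, t_j in enumerate(nodes):
            if j != i:
                basis *= (t - t_j) / (t_i - t_j)  # i-th Lagrange basis polynomial
        total += v_i * basis
    return total


def refine_initial_guess(coarse_nodes, coarse_values, fine_nodes):
    """Interpolate a coarse-grid solution onto a finer time grid.

    The result is intended as the starting point of an iterative nonlinear
    solve on the fine grid, not as the fine-grid solution itself.
    """
    return [lagrange_interpolate(coarse_nodes, coarse_values, t) for t in fine_nodes]


if __name__ == "__main__":
    coarse_nodes = [0.0, 0.5, 1.0]
    coarse_state = [t ** 2 for t in coarse_nodes]  # stand-in for a coarse solution
    fine_nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
    print(refine_initial_guess(coarse_nodes, coarse_state, fine_nodes))
```

Starting the fine-grid nonlinear solve from an interpolated guess typically needs fewer iterations than starting from zeros, which is the efficiency gain the abstract refers to.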
Funding for the study on semantic segmentation of mixed aerial remote sensing and road traffic scenes: supported by the National Key Research and Development of China (No. 2022YFB2503400).
Funding: Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20070699013), the Natural Science Foundation of Shaanxi Province (2006F05), and the Aeronautical Science Foundation (05I53076).
Abstract: Traditional image segmentation methods based on Markov random fields (MRF) converge slowly and require pre-defined weights. To address these disadvantages, a fast segmentation approach based on a simple MRF is proposed for SAR images. The approach first performs coarse segmentation in blocks. The image is then modeled with a simple MRF, and adaptive variable weighting forms are applied in homogeneous and heterogeneous regions. As a result, convergence is accelerated while the segmentation results in homogeneous regions and at borders are improved. Simulations with synthetic and real SAR images demonstrate the effectiveness of the proposed approach.
Funding: Supported by the Defense Industrial Technology Development Program (JCKY2017602C016).
Abstract: Simple linear iterative clustering (SLIC) is widely used because of its controllable superpixel number, accurate edge covering, symmetrical production, and fast computation. The main problem of the SLIC algorithm is under-segmentation when it is applied to artificial structure images with indistinct boundaries and narrow regions. Therefore, an improved clustering segmentation algorithm that corrects the segmentation results of SLIC is presented in this paper. The allocation of a pixel is related not only to its own characteristics but also to those of its surrounding pixels. Hence, it is appropriate to improve the standard SLIC by focusing on pixels at boundaries. An improved SLIC method that adheres better to image boundaries is proposed, using the first- and second-order difference operators as magnifying factors. Experimental results demonstrate that the proposed method achieves excellent boundary adherence for artificial structure images. The application of the proposed method is extended to images with indistinct boundaries in the Berkeley Segmentation Dataset BSDS500. In comparison with SLIC, boundary adherence is noticeably improved.
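For context, the pixel-allocation step of standard SLIC can be sketched as follows. This shows only the baseline distance measure that combines CIELAB color distance with spatial distance; the abstract does not say how the first- and second-order difference operators enter the improved method, so that part is deliberately left out. The names `slic_distance` and `assign_label` and the default compactness of 10 are assumptions for illustration.

```python
import math


def slic_distance(pixel, center, grid_interval, compactness=10.0):
    """Standard SLIC distance between a pixel and a cluster center.

    pixel, center: tuples (l, a, b, x, y) of CIELAB color and image position.
    grid_interval: sampling interval S between initial cluster centers.
    compactness:   weight m balancing color against spatial proximity.
    """
    l1, a1, b1, x1, y1 = pixel
    l2, a2, b2, x2, y2 = center
    d_color = math.sqrt((l1 - l2) ** 2 + (a1 - a2) ** 2 + (b1 - b2) ** 2)
    d_space = math.hypot(x1 - x2, y1 - y2)
    # D = sqrt(d_c^2 + (d_s / S)^2 * m^2)
    return math.sqrt(d_color ** 2 + (d_space / grid_interval) ** 2 * compactness ** 2)


def assign_label(pixel, centers, grid_interval, compactness=10.0):
    """Return the index of the nearest cluster center under the SLIC distance."""
    distances = [slic_distance(pixel, c, grid_interval, compactness) for c in centers]
    return distances.index(min(distances))
```

A larger `compactness` favors spatially regular superpixels, while a smaller value lets superpixels follow color edges more closely, which is the trade-off that boundary-focused variants adjust.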
Abstract: To overcome the shortcomings of the 1D and 2D Otsu thresholding techniques, the 3D Otsu method has been developed. Among all Otsu methods, the 3D Otsu technique provides the best threshold values for multi-level thresholding. In this paper, a simple and effective multilevel thresholding method is introduced to improve the quality of segmented images. The proposed approach focuses on preserving edge detail by combining 3D Otsu thresholding with image fusion. The advantages of the presented scheme include higher-quality outcomes, better preservation of tiny details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image; it aims to improve localized information after the thresholding process. Fusing images based on local contrast can improve image segmentation performance by minimizing the loss of local contrast, details, and gray-level distributions. Results show that the proposed method yields more promising segmentation results than the conventional 1D Otsu, 2D Otsu, and 3D Otsu methods, as is evident from the objective and subjective evaluations.
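As background for the Otsu family discussed above, the classic 1D criterion can be sketched in a few lines. This is the single-threshold baseline computed from a gray-level histogram, not the 3D or fusion-based method of the paper; the function name `otsu_threshold` is an assumption.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes the between-class variance.

    hist: list of pixel counts per gray level.
    Pixels with level <= t form class 0, the rest form class 1.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    w0 = 0          # number of pixels in class 0
    sum0 = 0.0      # sum of gray levels in class 0
    best_t, best_var = 0, -1.0
    for t, count in enumerate(hist[:-1]):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance with class probabilities w0/total and w1/total.
        var = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

The 2D and 3D variants extend the same idea by building the histogram over additional neighborhood statistics alongside the pixel intensity, at the cost of a much larger search space.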
Abstract: This study proposes a novel nature-inspired meta-heuristic optimizer, called RSA-SSA, based on the Reptile Search Algorithm combined with the Salp Swarm Algorithm for image segmentation using gray-scale multi-level thresholding. The proposed method introduces a better search space in which to find the optimal solution at each iteration. RSA-SSA is designed to avoid repeatedly searching the same area and to determine the optimal multi-level thresholds. The solutions obtained by the proposed method are represented using the image histogram. RSA-SSA employs Otsu's between-class variance function to obtain the best threshold values at each level. The performance of the proposed method is validated using the fitness function value, structural similarity index, peak signal-to-noise ratio, and the Friedman ranking test. Several COVID-19 benchmark images are used to validate the performance of the proposed RSA-SSA. The results show that the proposed RSA-SSA outperforms other metaheuristic optimization algorithms published in the literature.
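The objective that a metaheuristic such as RSA-SSA would maximize can be sketched as Otsu's multi-level between-class variance. In the sketch below, an exhaustive search over all threshold combinations stands in for the optimizer, purely to show what is being optimized; it is not the RSA-SSA procedure, and both function names are hypothetical.

```python
from itertools import combinations


def between_class_variance(hist, thresholds):
    """Otsu's multi-level objective for a histogram and a set of thresholds.

    hist:       list of pixel counts per gray level.
    thresholds: gray levels t_1 < ... < t_k; class i covers (t_{i-1}, t_i].
    """
    total = sum(hist)
    mu_total = sum(level * count for level, count in enumerate(hist)) / total
    bounds = [-1] + sorted(thresholds) + [len(hist) - 1]

    variance = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        levels = range(lo + 1, hi + 1)
        count = sum(hist[i] for i in levels)
        if count == 0:
            continue  # an empty class contributes nothing
        weight = count / total
        mean = sum(i * hist[i] for i in levels) / count
        variance += weight * (mean - mu_total) ** 2
    return variance


def best_thresholds(hist, k):
    """Exhaustive stand-in for the optimizer: try every set of k thresholds."""
    candidates = combinations(range(len(hist) - 1), k)
    return max(candidates, key=lambda ts: between_class_variance(hist, ts))
```

Exhaustive search grows combinatorially with the number of thresholds, which is the usual motivation for replacing it with a population-based search over the same fitness function.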
Abstract: Detecting and segmenting the lung regions in chest X-ray images is an important part of artificial intelligence-based computer-aided diagnosis/detection (AI-CAD) systems for chest radiography. However, if the chest X-ray images themselves are used as training data for the AI-CAD system, the system might learn irrelevant image-based information, resulting in decreased performance. In this study, we propose a lung region segmentation method that can automatically remove the shoulder and scapula regions, mediastinum, and diaphragm regions in advance from various chest X-ray images to be used as learning data. The proposed method consists of three main steps. First, the simple linear iterative clustering algorithm, the lazy snapping technique, and a local entropy filter are employed to generate an entropy map. Second, morphological operations are applied to the entropy map to obtain a lung mask. Third, automated segmentation of the lung field is performed using the obtained mask. A total of 30 images were used for the experiments. To verify the effectiveness of the proposed method, two other texture maps, namely the maps created by standard deviation filtering and range filtering, were used for comparison. As a result, the proposed method using the entropy map was able to appropriately remove the unnecessary regions. In addition, this method was able to remove the markers present in the image, whereas the other two methods could not. The experimental results have revealed that our proposed method is a highly generalizable and useful algorithm. We believe that this method might play an important role in enhancing the performance of AI-CAD systems for chest X-ray images.
Funding: Key R&D Projects in Hebei Province (22370301D); Scientific Research Foundation of Hebei University for Distinguished Young Scholars (521100221081); Scientific Research Foundation of Colleges and Universities in Hebei Province (QN2022107).
Abstract: Objective: For computer-aided Chinese medical diagnosis, and aiming at the problem of insufficient segmentation, a novel multi-level method based on the multi-scale fusion residual neural network (MF2ResU-Net) model is proposed. Methods: To obtain refined features of retinal blood vessels, three cascade-connected UNet networks are employed. To deal with the difference between the encoder and decoder parts, shortcut connections are used in MF2ResU-Net to combine the encoder and decoder layers in the blocks. To refine the segmentation features, atrous spatial pyramid pooling (ASPP) is embedded to obtain multi-scale features for the final segmentation networks. Results: The MF2ResU-Net was superior to existing methods on the criteria of sensitivity (Sen), specificity (Spe), accuracy (ACC), and area under curve (AUC), with values of 0.8013 and 0.8102, 0.9842 and 0.9809, 0.9700 and 0.9776, and 0.9797 and 0.9837 on DRIVE and CHASE DB1, respectively. The experimental results demonstrated the effectiveness and robustness of the model in segmenting vessels with complex curvature and small blood vessels. Conclusion: Based on residual connections and multi-feature fusion, the proposed method can obtain accurate segmentation of retinal blood vessels by refining the segmentation features, which can provide another diagnosis method for computer-aided Chinese medical diagnosis.
Abstract: A novel stepwise thresholding method for fuzzy image segmentation is proposed. Unlike published iterative or recursive thresholding methods, this method segments regions into sub-regions iteratively by increasing the threshold value in a stepwise manner, based on a preset intensity homogeneity criterion. The method is particularly suited to the segmentation of laser scanning confocal microscopy (LSCM) images, computerised tomography (CT) images, magnetic resonance (MR) images, fingerprint images, etc. The method has been tested on some typical fuzzy image data sets. In this paper, the novel stepwise thresholding is addressed first. Next, a new method of region labelling for region extraction is introduced. Then the design of the intensity homogeneity segmentation criterion is presented. Some examples of experimental results of fuzzy image segmentation by the method are given at the end.
Abstract: Salient object detection (SOD) models struggle to simultaneously preserve global structure, maintain sharp object boundaries, and sustain computational efficiency in complex scenes. In this study, we propose SPSALNet, a task-driven two-stage (macro–micro) architecture that restructures the SOD process around superpixel representations. In the proposed approach, a “split-and-enhance” principle, introduced to our knowledge for the first time in the SOD literature, hierarchically classifies superpixels and then applies targeted refinement only to ambiguous or error-prone regions. At the macro stage, the image is partitioned into content-adaptive superpixel regions, and each superpixel is represented by a high-dimensional region-level feature vector. These representations define a regional decomposition problem in which superpixels are assigned to three classes: background, object interior, and transition regions. Superpixel tokens interact with a global feature vector from a deep network backbone through a cross-attention module and are projected into an enriched embedding space that jointly encodes local topology and global context. At the micro stage, the model employs a U-Net-based refinement process that allocates computational resources only to ambiguous transition regions. The image and distance–similarity maps derived from superpixels are processed through a dual-encoder pathway. Subsequently, channel-aware fusion blocks adaptively combine information from these two sources, producing sharper and more stable object boundaries. Experimental results show that SPSALNet achieves high accuracy with lower computational cost compared to recent competing methods. On the PASCAL-S and DUT-OMRON datasets, SPSALNet exhibits a clear performance advantage across all key metrics, and it ranks first on accuracy-oriented measures on HKU-IS. On the challenging DUT-OMRON benchmark, SPSALNet reaches a MAE of 0.034. Across all datasets, it preserves object boundaries and regional structure in a stable and competitive manner.
Funding: Sponsored by the Jiangsu Provincial Colleges and Universities Natural Science Foundation of China (Grant No. 08KJD410001), the Humanities and Social Sciences Planning Fund of the Ministry of Education of China (Grant No. 12YJAZH151), and the Humanities and Social Sciences Youth Fund of the Ministry of Education of China (Grant No. 12YJCZH209).
Abstract: Complex product development inevitably involves design planning for multiple coupled activities. Overlapping these activities can reduce product development time, but at the risk of additional cost. Although existing research already considers the dependence of downstream tasks on upstream task information, the overall iteration of the design process caused by information interdependence between activities is rarely discussed, especially the impact of the valid information accumulation process on that overall iteration. Moreover, most studies focus only on a single overlapping process between two activities and rarely take into account the multi-segment and multi-ply overlapping of multiple coupled activities, especially the inherent link between product development time and cost that originates from such overlapping. To solve these problems, and to address the insufficiency of accumulated valid information in the overlapping process, a valid information evolution (VIE) degree function is constructed. Stochastic process theory is used to describe design information exchange and valid information accumulation in the overlapping segment, and planning models for a single overlapping segment are then built.
On this basis, by analyzing the overlapping processes and overlapping features of multiple coupled activities, multi-segment and multi-ply overlapping planning models are built. By sorting the overlapping processes and analyzing the construction of these planning models, two conclusions are obtained: (1) for multi-segment and multi-ply overlapping of multiple coupled activities, the total reduction in the development time of the task set is the sum of the time reductions caused by the basic overlapping segments minus the sum of the time increments caused by multiple overlapping segments; (2) the total increase in development cost is the sum of the cost increases caused by all overlapping processes. Then, based on an overlapping degree analysis of these planning models and using the VIE degree function, four lemmas are proved theoretically, and two propositions are finally established: (1) multi-ply overlapping of multiple coupled activities weakens the effect of basic overlapping on reducing the development cycle time; (2) overlapping multiple coupled activities shortens the product development cycle but increases product development cost, so there is a trade-off between development time and cost. Two methods are therefore given to mitigate and eliminate multi-ply overlapping effects. Finally, an example involving the design of a vehicle upper subsystem illustrates the application of the proposed models; compared with a sequential execution pattern, the 22% reduction in development cycle and the 3% increase in development cost demonstrate the validity of the method. The proposed research not only lays a theoretical foundation for correctly planning complex product development processes, but also provides specific and effective operational methods for overlapping multiple coupled activities.
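The two conclusions stated in words above can be written compactly. The notation is assumed for illustration only and is not taken from the paper: $B$ indexes the basic overlapping segments, $M$ the multiple overlapping segments, and $O$ all overlapping processes; $\Delta t$ and $\Delta c$ denote per-segment time and cost changes.

```latex
\begin{align}
  \Delta T_{\text{total}} &= \sum_{i \in B} \Delta t_i^{\text{basic}}
                            - \sum_{j \in M} \Delta t_j^{\text{multi}}, \\
  \Delta C_{\text{total}} &= \sum_{k \in O} \Delta c_k .
\end{align}
```

Under this reading, the first proposition corresponds to the second sum in $\Delta T_{\text{total}}$ being non-negative, so that multi-ply overlapping offsets part of the gain from basic overlapping.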
Funding: Supported by the National Natural Science Foundation of China (Nos. 10632030, 10902020, and 10721062), the Research Fund for the Doctoral Program of Higher Education of China (No. 20070141067), the Doctoral Fund of Liaoning Province (No. 20081091), the Key Laboratory Fund of Liaoning Province of China (No. 2009S018), and the Young Researcher Funds of Dalian University of Technology (No. SFDUT07002).
Abstract: By converting an optimal control problem for nonlinear systems into a Hamiltonian system, a symplectic-preserving method is proposed. The state and costate variables are approximated by Lagrange polynomials. The state variables at the two ends of the time interval are taken as independent variables. Based on the dual variable principle, nonlinear optimal control problems are replaced with nonlinear equations. Furthermore, in the implementation of the symplectic algorithm, a multilevel method based on the 2N algorithm is proposed. When the time grid is refined from a low level to a high level, the initial state and costate variables for the nonlinear equations can be obtained by Lagrange interpolation on the low-level grid to improve efficiency. Numerical simulations show the precision and efficiency of the proposed algorithm.
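The grid-refinement step described above, in which a coarse-grid solution seeds the nonlinear solve on a finer grid, can be sketched with plain Lagrange interpolation. This covers only the interpolation of initial guesses, under the assumption of a scalar variable sampled at distinct time nodes; the symplectic discretization and the 2N algorithm themselves are not reproduced. Both function names are hypothetical.

```python
def lagrange_interpolate(nodes, values, t):
    """Evaluate the Lagrange polynomial through (nodes[i], values[i]) at time t."""
    result = 0.0
    for i, (ti, vi) in enumerate(zip(nodes, values)):
        basis = 1.0
        for j, tj in enumerate(nodes):
            if j != i:
                basis *= (t - tj) / (ti - tj)  # nodes must be distinct
        result += vi * basis
    return result


def refine_initial_guess(coarse_nodes, coarse_values, fine_nodes):
    """Interpolate a coarse-grid solution onto a finer time grid.

    The returned values are intended only as starting guesses for an
    iterative solver on the fine grid.
    """
    return [lagrange_interpolate(coarse_nodes, coarse_values, t) for t in fine_nodes]
```

For example, a state sampled at times 0, 1, and 2 can be interpolated onto a grid with spacing 0.5, so that the fine-level iteration starts close to the converged coarse-level solution instead of from an arbitrary guess.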