Nonlinear analysis of heart rate variability (HRV) has become important because the heart behaves as a complex system. In this work, approximate entropy (ApEn) is used as a nonlinear measure. A new concept, modified multiple scale (segment) entropy (MMPE), is introduced, in which ApEn is estimated in different segments of a long recording. Estimating the approximate entropy in different segments is useful for detecting the nonlinear dynamics of the heart present over the entire length of the data. The work covers three cases, namely normal healthy heart (NHH) data, congestive heart failure (CHF) data, and atrial fibrillation (AF) data, and the data are analyzed using MMPE techniques. The mean value of ApEn for NHH data is observed to be much higher than the mean values for CHF data and AF data. The ApEn profiles of CHF, AF, and NHH data for different segments, obtained using MPE profiles, measure the heart dynamism for the three cases. In addition, the power spectral density is obtained using fast Fourier transform (FFT) analysis, and the ratio of low-frequency to high-frequency (LF/HF) power is computed on multiple scales/segments, termed MPLH (multiple scale low frequency to high frequency), for the NHH, CHF, and AF data and analyzed using MPLH techniques. The results are presented and discussed in the paper.
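The segment-wise ApEn estimate described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding dimension m = 2, the tolerance r = 0.2 times the segment's standard deviation, the non-overlapping segments, and the function names are all assumptions that the abstract does not state.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) of a 1-D sequence; r is an absolute tolerance."""
    n = len(series)

    def phi(dim):
        # All overlapping templates of length `dim`.
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r (self-match included).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


def segment_apen(series, segment_length, m=2, r_factor=0.2):
    """ApEn of each non-overlapping segment, with r scaled by the segment SD."""
    values = []
    for start in range(0, len(series) - segment_length + 1, segment_length):
        seg = series[start:start + segment_length]
        mean = sum(seg) / len(seg)
        sd = math.sqrt(sum((x - mean) ** 2 for x in seg) / len(seg))
        values.append(approximate_entropy(seg, m, r_factor * sd))
    return values
```

Averaging the returned list gives a mean ApEn per recording, and plotting it against the segment index gives a per-segment profile of the kind the abstract compares across NHH, CHF, and AF data. A strictly periodic series yields a value near zero, while an irregular one yields a larger value.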
Brain tumors present significant challenges in medical diagnosis and treatment, where early detection is crucial for reducing morbidity and mortality rates. This research introduces a novel deep learning model, the Progressive Layered U-Net (PLU-Net), designed to improve brain tumor segmentation accuracy from Magnetic Resonance Imaging (MRI) scans. The PLU-Net extends the standard U-Net architecture by incorporating progressive layering, attention mechanisms, and multi-scale data augmentation. The progressive layering involves a cascaded structure that refines segmentation masks across multiple stages, allowing the model to capture features at different scales and resolutions. Attention gates within the convolutional layers selectively focus on relevant features while suppressing irrelevant ones, enhancing the model's ability to delineate tumor boundaries. Additionally, multi-scale data augmentation techniques increase the diversity of training data and boost the model's generalization capabilities. Evaluated on the BraTS 2021 dataset, the PLU-Net achieved state-of-the-art performance with a Dice coefficient of 0.91, specificity of 0.92, sensitivity of 0.89, and Hausdorff95 of 2.5, outperforming other modified U-Net architectures in segmentation accuracy. These results underscore the effectiveness of the PLU-Net in improving brain tumor segmentation from MRI scans, supporting clinicians in early diagnosis, treatment planning, and the development of new therapies.
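For reference, the overlap metrics quoted in the abstract above (Dice coefficient, sensitivity, specificity) follow from the confusion counts of a predicted mask against a ground-truth mask. The sketch below uses the standard definitions on flattened binary masks; the function names are illustrative and the conventions for empty masks are assumptions, not taken from the paper.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two equal-length flattened binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice_coefficient(pred, truth):
    """Dice = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks agree perfectly.
    return 2 * tp / denom if denom else 1.0


def sensitivity(pred, truth):
    """Fraction of true foreground pixels that were detected."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(pred, truth):
    """Fraction of true background pixels that were left unlabelled."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp) if (tn + fp) else 0.0
```

For a binary task the Dice coefficient equals the F1-score, so the same function covers abstracts in this list that report F1 for vessel segmentation.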
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. Note that the transformer model can effectively capture long-range dependencies in the image and, furthermore, that combining the CNN and the transformer can effectively extract both local details and global contextual features of the image. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and to alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with respect to preserving contours and textures.
Background: Diabetic retinopathy (DR) is one of the primary causes of visual impairment globally, resulting from microvascular abnormalities in the retina. Accurate segmentation of retinal blood vessels from fundus images plays a pivotal role in the early diagnosis, progression monitoring, and treatment planning of DR and related ocular conditions. Traditional convolutional neural networks often struggle to capture the intricate structures of thin vessels under varied illumination and contrast conditions. Methods: In this study, we propose an improved U-Net-based framework named MSAC U-Net, which enhances feature extraction and reconstruction through multiscale and attention-based modules. Specifically, the encoder replaces standard convolutions with a Multiscale Asymmetric Convolution (MSAC) block, incorporating parallel 1×n, n×1, and n×n kernels at different scales (3×3, 5×5, 7×7) to effectively capture fine-grained vascular structures. To further refine spatial representation, skip connections are utilized, and the decoder is augmented with dual activation strategies, Squeeze-and-Excitation blocks, and Convolutional Block Attention Modules for improved contextual understanding. Results: The model was evaluated on the publicly available DRIVE dataset. It achieved an accuracy of 96.48%, sensitivity of 88.31%, specificity of 97.90%, and an AUC of 98.59%, demonstrating superior performance compared to several state-of-the-art segmentation methods. Conclusion: The proposed MSAC U-Net provides a robust and accurate approach for retinal vessel segmentation, offering substantial clinical value in the early detection and management of diabetic retinopathy. Its design contributes to enhanced segmentation reliability and may serve as a foundation for broader applications in medical image analysis.
Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. Firstly, based on SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the size of the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared to eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Semantic segmentation has made significant breakthroughs in various application fields, but achieving both accurate and efficient segmentation with limited computational resources remains a major challenge. To this end, we propose CGMISeg, an efficient semantic segmentation architecture based on a context-guided multi-scale interaction strategy, aiming to significantly reduce computational overhead while maintaining segmentation accuracy. CGMISeg consists of three core components: context-aware attention modulation, feature reconstruction, and cross-information fusion. Context-aware attention modulation is designed to capture key contextual information through channel and spatial attention mechanisms. The feature reconstruction module reconstructs contextual information from different scales, modeling key rectangular areas by capturing critical contextual information in both horizontal and vertical directions, thereby enhancing the focus on foreground features. The cross-information fusion module fuses the reconstructed high-level features with the original low-level features during upsampling, promoting multi-scale interaction and enhancing the model's ability to handle objects at different scales. We extensively evaluated CGMISeg on ADE20K, Cityscapes, and COCO-Stuff, three widely used benchmark datasets, and the experimental results show that CGMISeg exhibits significant advantages in segmentation performance, computational efficiency, and inference speed, clearly outperforming several mainstream methods, including SegFormer, Feedformer, and SegNeXt. Specifically, CGMISeg achieves 42.9% mIoU (mean Intersection over Union) and 15.7 FPS (frames per second) on the ADE20K dataset with 3.8 GFLOPs (giga floating-point operations), outperforming Feedformer and SegNeXt by 3.7% and 1.8% in mIoU, respectively, while also offering reduced computational complexity and faster inference. CGMISeg strikes a good balance between accuracy and efficiency, improving both computational cost and inference speed while maintaining high precision, which indicates practical value and potential for wide application.
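The mIoU figure reported in the abstract above is the per-class intersection over union averaged over classes. A minimal sketch of the standard definition on flattened label maps follows; the function name and the choice to skip classes absent from both prediction and ground truth are assumptions, and benchmark toolkits may differ in such details.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union across classes for flattened label maps."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # Assumed convention: ignore classes absent from both maps.
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with predictions `[0, 0, 1, 1]` against labels `[0, 1, 1, 1]`, class 0 scores 1/2 and class 1 scores 2/3, so the mean is 7/12.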
Lung cancer (LC) is a major cancer that accounts for high mortality rates worldwide. Doctors utilise many imaging modalities to identify lung tumours and their severity at early stages. Nowadays, machine learning (ML) and deep learning (DL) methodologies are utilised for the robust detection and prediction of lung tumours. Recently, multi-modal imaging has emerged as a robust technique for lung tumour detection by combining various imaging features. Building on this, we propose a novel multi-modal imaging technique named versatile scale malleable image integration and patch wise attention network (VSMI2-PANet), which adopts three imaging modalities, namely computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts input from CT and MRI images and passes it to the VSMI2 module, which is composed of three sub-modules: an image cropping module, a scale malleable convolution layer (SMCL) and a PANet module. CT and MRI images are passed through the image cropping module in parallel to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that examine those patches in parallel while preserving the spatial information. The output from the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing the height, width and channels of each image patch. As a result, it outputs high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as input to the backbone module, which uses a light wave transformer (LWT) to segment the lung tumours into three classes: normal, benign and malignant. In addition, the LWT also accepts SPECT images as input to capture the variations precisely when segmenting the lung tumours. The performance of the proposed model is validated using several performance metrics, such as accuracy, precision, recall, F1-score and AUC, and the results show that the proposed work outperforms the existing approaches.
In high-risk industrial environments like nuclear power plants, precise defect identification and localization are essential for maintaining production stability and safety. However, the complexity of such a harsh environment leads to significant variations in the shape and size of the defects. To address this challenge, we propose the multivariate time series segmentation network (MSSN), which adopts a multiscale convolutional network with multi-stage and depth-separable convolutions for efficient feature extraction through variable-length templates. To tackle the classification difficulty caused by structural signal variance, MSSN employs logarithmic normalization to adjust instance distributions. Furthermore, it integrates classification with smoothing loss functions to accurately identify defect segments amid similar structural and defect signal subsequences. Evaluated on both the Mackey-Glass dataset and an industrial dataset, our algorithm achieves over 95% localization and demonstrates its capture capability on the synthetic dataset. On a nuclear plant's heat transfer tube dataset, it captures 90% of defect instances with a 75% middle localization F1 score.
Segmentation of the retinal vessels in the fundus is crucial for diagnosing ocular diseases. Retinal vessel images often suffer from category imbalance and large scale variations. This ultimately results in incomplete vessel segmentation and poor continuity. In this study, we propose CT-MFENet to address these issues. First, the use of a context transformer (CT) allows for the integration of contextual feature information, which helps establish the connection between pixels and solve the problem of incomplete vessel continuity. Second, multi-scale dense residual networks are used instead of a traditional CNN to address the issue of inadequate local feature extraction when the model encounters vessels at multiple scales. In the decoding stage, we introduce a local-global fusion module. It enhances the localization of vascular information and reduces the semantic gap between high- and low-level features. To address the class imbalance in retinal images, we propose a hybrid loss function that enhances the segmentation ability of the model for topological structures. We conducted experiments on the publicly available DRIVE, CHASEDB1, STARE, and IOSTAR datasets. The experimental results show that our CT-MFENet performs better than most existing methods, including the baseline U-Net.
Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.
Optimal scale selection is the key step in slope segmentation. Taking three geomorphological units in different parts of the loess as test areas and 5 m-resolution DEMs as the original test data, this paper employed the modified ROC-LV (Lucian, 2010) to judge the optimal scales in the slope segmentation process. The experimental results showed that this method is effective in determining the optimal scale in slope segmentation. The results also showed that slope segmentation of different geomorphological units requires different optimal scales because landform complexity varies. The three test areas require the same scale to distinguish small gullies, because all the test areas have many gullies of the same size. However, when it comes to distinguishing basins, the test areas require different scales because the complexity of the three areas is different.
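The abstract above does not say how ROC-LV was modified, so the sketch below shows only the commonly used unmodified form: the percentage rate of change of local variance (LV) between consecutive scale levels, with local peaks taken as candidate optimal scales. The function names, the strict-peak rule, and the numbers in the usage note are illustrative assumptions.

```python
def roc_lv(local_variance):
    """Rate of change of local variance (%) between consecutive scale levels."""
    return [
        100.0 * (cur - prev) / prev
        for prev, cur in zip(local_variance, local_variance[1:])
    ]


def candidate_scales(scales, local_variance):
    """Scales whose ROC-LV is a strict local peak."""
    roc = roc_lv(local_variance)
    # roc[i] describes the step from scales[i] to scales[i + 1],
    # so a peak at roc[i] is attributed to scales[i + 1].
    return [
        scales[i + 1]
        for i in range(1, len(roc) - 1)
        if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]
    ]
```

With hypothetical scales `[10, 20, 30, 40, 50, 60]` and LV values `[10, 12, 15, 15.3, 18, 18.5]`, the ROC-LV series peaks at scales 30 and 50, which would be the candidates to inspect. Several peaks at different scales are consistent with the abstract's observation that gullies and basins call for different segmentation scales.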
A new algorithm for segmentation of suspected lung regions of interest (ROIs) by mean-shift clustering and multi-scale Hessian matrix dot filtering was proposed. The original image was first filtered by multi-scale Hessian matrix dot filters, so that round suspected nodular lesions in the image were enhanced and linear regions of the trachea and vessels were suppressed. Then, three types of information, namely the shape filtering value of the Hessian matrix, the gray value, and the spatial location, were introduced into the feature space. The kernel function of mean-shift clustering was written as the product of three kernel functions corresponding to the three types of feature information. Finally, bandwidths were calculated adaptively for each suspected area and used in mean-shift clustering segmentation. Experimental results show that, by introducing Hessian matrix dot-filtering information into mean-shift clustering, nodular regions can be segmented from blood vessels, the trachea, or cross regions connected to the nodule, non-nodular areas can be removed from ROIs properly, and ground glass object (GGO) nodular areas can also be segmented. For the experimental data set of 127 nodules of different forms, the average accuracy of the proposed algorithm is more than 90%.
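The product-form kernel described in the abstract above can be sketched as a mean-shift mode search over samples of the form (x, y, gray value, shape response). This is only an illustration under stated assumptions: Gaussian kernels are assumed for all three factors, the bandwidths are fixed inputs rather than adaptively computed as in the paper, and the shape response is taken as given instead of being produced by a Hessian dot filter.

```python
import math


def product_kernel_weight(a, b, h_space, h_gray, h_shape):
    """Product of three Gaussian kernels: spatial (x, y), gray value, shape response."""
    d_space = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    d_gray = (a[2] - b[2]) ** 2
    d_shape = (a[3] - b[3]) ** 2
    return (math.exp(-0.5 * d_space / h_space ** 2)
            * math.exp(-0.5 * d_gray / h_gray ** 2)
            * math.exp(-0.5 * d_shape / h_shape ** 2))


def mean_shift_mode(start, samples, h_space, h_gray, h_shape,
                    tol=1e-6, max_iter=200):
    """Move `start` to the nearest density mode of `samples` under the product kernel."""
    point = tuple(float(v) for v in start)
    for _ in range(max_iter):
        weights = [product_kernel_weight(point, s, h_space, h_gray, h_shape)
                   for s in samples]
        total = sum(weights)
        if total == 0.0:  # No sample within numerical reach of the kernel.
            return point
        # Mean-shift update: kernel-weighted average of all samples.
        shifted = tuple(
            sum(w * s[d] for w, s in zip(weights, samples)) / total
            for d in range(4)
        )
        if max(abs(u - v) for u, v in zip(shifted, point)) < tol:
            return shifted
        point = shifted
    return point
```

Pixels whose searches end at the same mode form one cluster. Because the three kernels multiply, a pixel must be close in position, intensity, and shape response at once to pull on the mode, which is how a bright tubular vessel next to a round nodule can end up in a different cluster.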
In this study, we examined the thermal effects throughout the process of placing span-scale girder segments on a 6×110-m continuous steel box girder in the Hong Kong-Zhuhai-Macao Bridge. Firstly, when a span-scale girder segment is temporarily stored in the open air, temperature gradients will significantly increase the maximum reaction force on temporary supports and cause local buckling at the bottom of the girder segment. Secondly, due to the temperature difference of the girder segments before and after girth-welding, some residual thermal deflections will appear on the girder segments because the boundary conditions of the structure are changed by the girth-welding. Thirdly, the thermal expansion and thermal bending of girder segments will cause movement and rotation of bearings, which must be considered in setting bearings. We propose control measures for these problems based on finite element method simulation with field-measured temperatures. The local buckling during open-air storage can be avoided by determining appropriate positions for the temporary supports through analysis of overall and local stresses. The residual thermal deflections can be overcome by performing girth-welding during a period when the vertical temperature difference of the girder is within 1 °C, such as after 22:00. Some formulas are proposed to determine the pre-set distances for bearings, in which the movement and rotation of the bearings due to dead loads and thermal loads are considered. Finally, the feasibility of these control measures in the placement of span-scale girder segments on a real continuous girder was verified: no local buckling was observed during open-air storage; the residual thermal deflections after girth-welding were controlled within 5 mm; and the residual pre-set distances of bearings when the whole continuous girder reached its design state were controlled within 20 mm.
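As a rough illustration of why bearing pre-set distances must account for temperature, the free (unrestrained) longitudinal thermal movement of one span follows the textbook linear-expansion relation below. This is not one of the paper's formulas, which the abstract does not reproduce. The coefficient α ≈ 1.2 × 10⁻⁵ per °C is a typical value for structural steel, and the 10 °C temperature change is purely hypothetical.

```latex
\Delta L = \alpha \, L \, \Delta T
         = \left(1.2 \times 10^{-5}\,{}^{\circ}\mathrm{C}^{-1}\right)
           \times 110\,\mathrm{m} \times 10\,{}^{\circ}\mathrm{C}
         \approx 13.2\,\mathrm{mm}
```

A movement of this size for a single 110 m span under a modest temperature change is of the same order as the 20 mm tolerance quoted for the residual pre-set distances, so thermal terms cannot be neglected when setting the bearings.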
The oxide scale present on the feedstock particles is critical for inter-particle bond formation in the cold spray (CS) coating process. Oxide scale break-up is therefore a prerequisite for clean metallic contact, which greatly improves the quality of inter-particle bonding within the deposited coating. In general, a spray powder with a thicker oxide scale on its surface (i.e., a powder with a high oxygen content) requires a higher critical particle velocity for coating formation, which also lowers the deposition efficiency (DE) and makes the whole process challenging. In this work, it is reported for the first time that an artificially oxidized copper (Cu) powder with a high oxygen content of 0.81 wt.% and a thick surface oxide scale of 0.71 μm can help achieve a marked increase in DE. During high-velocity particle impact, the surface oxide scale was observed to evolve from crack initiation through segmentation to peeling-off, which contributes to the increase in DE. Single-particle deposit observations revealed that the thick oxide scale peels off from most of the sprayed powder surfaces during the high-velocity impact, leaving a clean metallic surface on the deposited particle. This allows successive particles to bond easily and thus leads to a higher DE. Further, owing to the peeling-off of the oxide scale from the feedstock particles, very few discontinuous oxide scale segments are retained at inter-particle boundaries, ensuring a high electrical conductivity within the resulting deposit. The dependence of peeling-off during high-velocity particle impact on a threshold oxide scale thickness was also investigated.
In the smart logistics industry, unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and outperform traditional forklifts driven by humans. They therefore play a critical role in smart warehousing, and semantic segmentation is an effective method for the intelligent identification of logistics pallets. However, most current recognition algorithms are ineffective due to the diverse types of pallets, their complex shapes, frequent occlusions in production environments, and changing lighting conditions. This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention (MFMBA) neural network for logistics pallet segmentation. To better predict the foreground category (the pallet) and the background category (the cargo) of a pallet image, our approach extracts three types of features (grayscale, texture, and Hue, Saturation, Value (HSV) features) and fuses them. In a complex real environment, the size and shape of the pallet may appear different in the image, which usually makes feature extraction difficult. To deal with this problem, our study proposes a multiscale architecture that can extract additional semantic features. Also, since a traditional attention mechanism only assigns attention weights from a single direction, we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions, horizontally and vertically, significantly improving segmentation. Finally, comparative experimental results show that the precision of the proposed algorithm is 0.53% to 8.77% better than that of the other methods we compared.
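Two of the three feature types named in the abstract above, grayscale and HSV, can be computed per pixel with the Python standard library, as sketched below. Fusing them by simple concatenation, the BT.601 luma weights for grayscale, and the function names are assumptions for illustration. The abstract does not describe the paper's fusion step, and the texture feature is omitted here because its definition is not given.

```python
import colorsys


def pixel_features(r, g, b):
    """Return [gray, hue, saturation, value] for one 8-bit RGB pixel, each in [0, 1]."""
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    gray = 0.299 * rn + 0.587 * gn + 0.114 * bn  # ITU-R BT.601 luma weights (assumed)
    h, s, v = colorsys.rgb_to_hsv(rn, gn, bn)
    return [gray, h, s, v]


def image_features(pixels):
    """Map a 2-D grid of (r, g, b) tuples to a grid of fused feature vectors."""
    return [[pixel_features(*px) for px in row] for row in pixels]
```

Hue and saturation are largely insensitive to brightness, which is one plausible reason to pair HSV with grayscale when lighting conditions change, as the abstract notes they do in warehouses.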
Traditional models for semantic segmentation in point clouds primarily focus on smaller scales. However, in real-world applications, point clouds often exhibit larger scales, leading to heavy computational and memory requirements. The key to handling large-scale point clouds lies in leveraging random sampling, which offers higher computational efficiency and lower memory consumption compared to other sampling methods. Nevertheless, the use of random sampling can potentially result in the loss of crucial points during the encoding stage. To address these issues, this paper proposes the cross-fusion self-attention network (CFSA-Net), a lightweight and efficient network architecture specifically designed for directly processing large-scale point clouds. At the core of this network is the incorporation of random sampling alongside a local feature extraction module based on cross-fusion self-attention (CFSA). This module effectively integrates long-range contextual dependencies between points by employing hierarchical position encoding (HPC). Furthermore, it enhances the interaction between each point's coordinates and feature information through cross-fusion self-attention pooling, enabling the acquisition of more comprehensive geometric information. Finally, a residual optimization (RO) structure is introduced to extend the receptive field of individual points by stacking hierarchical position encoding and cross-fusion self-attention pooling, thereby reducing the impact of information loss caused by random sampling. Experimental results on the Stanford Large-Scale 3D Indoor Spaces (S3DIS), Semantic3D, and SemanticKITTI datasets demonstrate the superiority of this algorithm over advanced approaches such as RandLA-Net and KPConv. These findings underscore the excellent performance of CFSA-Net in large-scale 3D semantic segmentation.
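The random-sampling step that the abstract above relies on is cheap because it never inspects point coordinates: each encoder stage simply keeps a random subset. A minimal sketch follows; the decimation ratio of 4, the function names, and the use of a seed are illustrative assumptions rather than details from the paper.

```python
import random


def random_downsample(points, ratio, seed=None):
    """Keep about len(points) / ratio points, chosen uniformly without replacement."""
    rng = random.Random(seed)
    keep = min(len(points), max(1, len(points) // ratio))
    return rng.sample(points, keep)


def encoder_sizes(num_points, ratio, num_layers):
    """Point counts after each of `num_layers` successive downsampling stages."""
    sizes = [num_points]
    for _ in range(num_layers):
        sizes.append(max(1, sizes[-1] // ratio))
    return sizes
```

Because the kept subset is arbitrary, a small but important structure can lose most of its points after a few stages. That is the information loss that the abstract's local feature aggregation and residual optimization structure are meant to offset.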
Zanthoxylum bungeanum Maxim, generally called prickly ash, is widely grown in China. Zanthoxylum rust is the main disease affecting the growth and quality of Zanthoxylum. Traditional methods for recognizing the degree of infection of Zanthoxylum rust rely mainly on manual experience. Due to the complex colors and shapes of rust areas, the accuracy of manual recognition is low and difficult to quantify. In recent years, the application of artificial intelligence technology in the agricultural field has gradually increased. In this paper, based on the DeepLabV2 model, we propose a Zanthoxylum rust image segmentation model based on the FASPP module and enhanced features of rust areas. We constructed a fine-grained Zanthoxylum rust image dataset, in which each Zanthoxylum rust image was segmented and labeled according to leaves, spore piles, and brown lesions. The experimental results showed that the proposed Zanthoxylum rust image segmentation method was effective. The segmentation accuracy rates of leaves, spore piles, and brown lesions reached 99.66%, 85.16%, and 82.47%, respectively. MPA reached 91.80%, and MIoU reached 84.99%. The proposed image segmentation model also had good efficiency, processing 22 images per minute. This article provides an intelligent method for efficiently and accurately recognizing the degree of infection of Zanthoxylum rust.
As an important part of the new generation of information technology, the Internet of Things (IoT) has attracted wide attention and is regarded as an enabling technology for the next generation of health care systems. Fundus photography equipment is connected to the cloud platform through the IoT to enable real-time uploading of fundus images and the rapid issuance of diagnostic suggestions by artificial intelligence. At the same time, important security and privacy issues have emerged. The data uploaded to the cloud platform involve personal attributes, health status, and medical application data of patients. Once these data are leaked, abused, or improperly disclosed, personal information security is violated. Therefore, it is important to address the security and privacy issues of massive medical and healthcare equipment connecting to the infrastructure of IoT healthcare and health systems. To meet this challenge, we propose MIA-UNet, a multi-scale iterative aggregation U-network, which aims to achieve accurate and efficient retinal vessel segmentation for ophthalmic auxiliary diagnosis while keeping the computational complexity of the network low enough for mobile terminals. In this way, users do not need to upload the data to the cloud platform and can analyze and process the fundus images on their own mobile terminals, thus eliminating the leakage of personal information. Specifically, the interconnection between encoder and decoder, as well as the internal connection between decoder subnetworks in the classic U-Net, are redefined and redesigned. Furthermore, we propose a hybrid loss function to smooth the gradient and deal with the imbalance between foreground and background. Compared with the U-Net, the segmentation performance of the proposed network is significantly improved while the number of parameters increases by only 2%. When applied to three publicly available datasets, DRIVE, STARE, and CHASE DB1, the proposed network achieves accuracy/F1-score values of 96.33%/84.34%, 97.12%/83.17%, and 97.06%/84.10%, respectively. The experimental results show that MIA-UNet is superior to the state-of-the-art methods.
Watershed segmentation is sensitive to noise and irregular details within the image, which frequently leads to serious over-segmentation. Linear filtering before watershed segmentation can reduce over-segmentation to some extent; however, it often shifts the position of object contours. To reduce over-segmentation while preserving the location of object contours, a watershed segmentation method based on hierarchical multi-scale modification of the morphological gradient is proposed. Firstly, multi-scale morphological filtering was employed to smooth the original image. Then, the gradient image was divided into multiple levels by the volume of the three-dimensional topographic relief, where the lower gradient layers were further modified by morphological closing with larger structuring elements, and the higher layers with smaller ones. In this way, most local minima caused by irregular details and noise can be removed, while the region contour positions corresponding to the target area are largely preserved. Finally, the morphological watershed algorithm was applied to the modified gradient image to perform segmentation. The experimental results show that the proposed method can greatly reduce the over-segmentation of the watershed and avoid the position offset of the object contours.
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked and need more delicate operations in intelligent picking machines. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet) using a dataset from a tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, where the proportion of small to medium-sized targets is 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing the tea shoots, showing better performance than the other models. Through ablation experiments, we found that ResNet50, the PointRend strategy, and the Feature Pyramid Network (FPN) architecture can improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform shoot segmentation in unstructured environments, realizing the individual distinction of tea shoots and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
Abstract: Nonlinear analysis of heart rate variability (HRV) has become important because the heart behaves as a complex system. In this work, the approximate entropy (ApEn) has been used as a nonlinear measure. A new concept of estimating the ApEn in different segments of a long recording, called modified multiple scale (segment) entropy (MMPE), is introduced. Estimating the approximate entropy in different segments is useful for detecting the nonlinear dynamics of the heart present over the entire length of the data. The present work has been carried out for three cases, namely normal healthy heart (NHH) data, congestive heart failure (CHF) data and atrial fibrillation (AF) data, and the data are analyzed using MMPE techniques. It is observed that the mean value of ApEn for NHH data is much higher than the mean values for CHF data and AF data. The ApEn profiles of CHF, AF and NHH data for different segments, obtained using MPE profiles, measure the heart dynamism for the three different cases. In addition, the power spectral density is obtained using fast Fourier transform (FFT) analysis, and the ratio of LF/HF (low frequency/high frequency) power is computed on multiple scales/segments, termed MPLH (multiple scale low frequency to high frequency), for the NHH data, CHF data and AF data and analyzed using MPLH techniques. The results are presented and discussed in the paper.
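The segment-wise entropy idea in the abstract above can be illustrated with a minimal sketch. It uses the standard approximate entropy definition (template length m, tolerance r as a fraction of the standard deviation) and applies it to consecutive, non-overlapping segments. The defaults `m=2` and `r_factor=0.2`, the segment length, and the function names are assumptions chosen for illustration; the abstract does not state the settings used for MMPE.

```python
import math


def approximate_entropy(series, m=2, r_factor=0.2):
    """Approximate entropy of a 1-D sequence (standard definition).

    m is the template length; the tolerance r is r_factor times the
    standard deviation of the series. Both defaults are illustrative.
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    r = r_factor * std

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of a
            # (the self-match is included, so the count is never zero).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


def segment_apen(series, segment_len, m=2, r_factor=0.2):
    """ApEn of each consecutive, non-overlapping segment of the series."""
    return [
        approximate_entropy(series[s:s + segment_len], m, r_factor)
        for s in range(0, len(series) - segment_len + 1, segment_len)
    ]
```

Calling `segment_apen(rr_intervals, 500)` on a hypothetical list of RR intervals would return one ApEn value per 500-beat segment, which can then be averaged or plotted as a profile. The double loop is quadratic in the segment length, so long segments may need a vectorized implementation.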
Abstract: Brain tumors present significant challenges in medical diagnosis and treatment, where early detection is crucial for reducing morbidity and mortality rates. This research introduces a novel deep learning model, the Progressive Layered U-Net (PLU-Net), designed to improve brain tumor segmentation accuracy from Magnetic Resonance Imaging (MRI) scans. The PLU-Net extends the standard U-Net architecture by incorporating progressive layering, attention mechanisms, and multi-scale data augmentation. The progressive layering involves a cascaded structure that refines segmentation masks across multiple stages, allowing the model to capture features at different scales and resolutions. Attention gates within the convolutional layers selectively focus on relevant features while suppressing irrelevant ones, enhancing the model's ability to delineate tumor boundaries. Additionally, multi-scale data augmentation techniques increase the diversity of training data and boost the model's generalization capabilities. Evaluated on the BraTS 2021 dataset, the PLU-Net achieved state-of-the-art performance with a Dice coefficient of 0.91, specificity of 0.92, sensitivity of 0.89, and Hausdorff95 of 2.5, outperforming other modified U-Net architectures in segmentation accuracy. These results underscore the effectiveness of the PLU-Net in improving brain tumor segmentation from MRI scans, supporting clinicians in early diagnosis, treatment planning, and the development of new therapies.
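The overlap metrics reported in the abstract above (Dice coefficient, sensitivity, specificity) follow from the four confusion counts of a binary mask. The sketch below uses the standard definitions on flattened masks; the function names and the toy masks in the usage note are illustrative and are not taken from the paper. Hausdorff95 is omitted because it needs boundary distances rather than counts.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two equally sized flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    return tp, fp, fn, tn


def dice(pred, truth):
    """Dice coefficient: 2*TP / (2*TP + FP + FN); 1.0 if both masks are empty."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred, truth):
    """True positive rate: TP / (TP + FN). Assumes truth has foreground."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)


def specificity(pred, truth):
    """True negative rate: TN / (TN + FP). Assumes truth has background."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp)
```

For example, with `pred = [1, 1, 0, 0, 1, 0]` and `truth = [1, 0, 0, 0, 1, 1]` the counts are TP = 2, FP = 1, FN = 1, TN = 2, so Dice, sensitivity and specificity all equal 2/3.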
Funding: Supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Abstract: Segmentation methods based on convolutional neural networks (CNNs) have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. Notably, the transformer model can effectively capture long-range dependencies in the image, and combining the CNN and the transformer can effectively extract both local details and global contextual features of the image. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. Specifically, in the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important features of images and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation (2024A1515010987) and the Medical Scientific Research Foundation of Guangdong Province (B2024035).
Abstract: Background: Diabetic retinopathy (DR) is one of the primary causes of visual impairment globally, resulting from microvascular abnormalities in the retina. Accurate segmentation of retinal blood vessels from fundus images plays a pivotal role in the early diagnosis, progression monitoring, and treatment planning of DR and related ocular conditions. Traditional convolutional neural networks often struggle to capture the intricate structures of thin vessels under varied illumination and contrast conditions. Methods: In this study, we propose an improved U-Net-based framework named MSAC U-Net, which enhances feature extraction and reconstruction through multiscale and attention-based modules. Specifically, the encoder replaces standard convolutions with a Multiscale Asymmetric Convolution (MSAC) block, incorporating parallel 1×n, n×1, and n×n kernels at different scales (3×3, 5×5, 7×7) to effectively capture fine-grained vascular structures. To further refine spatial representation, skip connections are utilized, and the decoder is augmented with dual activation strategies, Squeeze-and-Excitation blocks, and Convolutional Block Attention Modules for improved contextual understanding. Results: The model was evaluated on the publicly available DRIVE dataset. It achieved an accuracy of 96.48%, sensitivity of 88.31%, specificity of 97.90%, and an AUC of 98.59%, demonstrating superior performance compared to several state-of-the-art segmentation methods. Conclusion: The proposed MSAC U-Net provides a robust and accurate approach for retinal vessel segmentation, offering substantial clinical value in the early detection and management of diabetic retinopathy. Its design contributes to enhanced segmentation reliability and may serve as a foundation for broader applications in medical image analysis.
Funding: Supported by the Natural Science Foundation Programme of Gansu Province (No. 24JRRA231), the National Natural Science Foundation of China (No. 62061023), and the Gansu Provincial Science and Technology Plan Key Research and Development Program Project (No. 24YFFA024).
Abstract: Despite its remarkable performance on natural images, the segment anything model (SAM) lacks domain-specific information in medical imaging and faces the challenge of losing local multi-scale information in the encoding phase. This paper presents a medical image segmentation model based on SAM with a local multi-scale feature encoder (LMSFE-SAM) to address these issues. Firstly, based on SAM, a local multi-scale feature encoder is introduced to improve the representation of features within the local receptive field, thereby supplying the Vision Transformer (ViT) branch in SAM with enriched local multi-scale contextual information. At the same time, a multiaxial Hadamard product module (MHPM) is incorporated into the local multi-scale feature encoder in a lightweight manner to reduce the quadratic complexity and noise interference. Subsequently, a cross-branch balancing adapter is designed to balance the local and global information between the local multi-scale feature encoder and the ViT encoder in SAM. Finally, to obtain a smaller input image size and to mitigate overlapping in patch embeddings, the size of the input image is reduced from 1024×1024 pixels to 256×256 pixels, and a multidimensional information adaptation component is developed, which includes feature adapters, position adapters, and channel-spatial adapters. This component effectively integrates the information from small-sized medical images into SAM, enhancing its suitability for clinical deployment. The proposed model demonstrates an average improvement ranging from 0.0387 to 0.3191 across six objective evaluation metrics on the BUSI, DDTI, and TN3K datasets compared to eight other representative image segmentation models. This significantly enhances the performance of SAM on medical images, providing clinicians with a powerful tool for clinical diagnosis.
Funding: Supported by the National Natural Science Foundation of China (62162007) and the Guizhou Provincial Basic Research Program (Natural Science) (No. QianKeHeJiChu-ZK[2024]YiBan079).
Abstract: Semantic segmentation has made significant breakthroughs in various application fields, but achieving both accurate and efficient segmentation with limited computational resources remains a major challenge. To this end, we propose CGMISeg, an efficient semantic segmentation architecture based on a context-guided multi-scale interaction strategy, aiming to significantly reduce computational overhead while maintaining segmentation accuracy. CGMISeg consists of three core components: context-aware attention modulation, feature reconstruction, and cross-information fusion. Context-aware attention modulation is designed to capture key contextual information through channel and spatial attention mechanisms. The feature reconstruction module reconstructs contextual information from different scales, modeling key rectangular areas by capturing critical contextual information in both horizontal and vertical directions, thereby enhancing the focus on foreground features. The cross-information fusion module fuses the reconstructed high-level features with the original low-level features during upsampling, promoting multi-scale interaction and enhancing the model's ability to handle objects at different scales. We extensively evaluated CGMISeg on ADE20K, Cityscapes, and COCO-Stuff, three widely used benchmark datasets, and the experimental results show that CGMISeg exhibits significant advantages in segmentation performance, computational efficiency, and inference speed, clearly outperforming several mainstream methods, including SegFormer, Feedformer, and SegNeXt. Specifically, CGMISeg achieves 42.9% mIoU (mean Intersection over Union) and 15.7 FPS (frames per second) on the ADE20K dataset with 3.8 GFLOPs (giga floating-point operations), outperforming Feedformer and SegNeXt by 3.7% and 1.8% in mIoU, respectively, while also offering reduced computational complexity and faster inference. CGMISeg strikes a balance between accuracy and efficiency, improving both computational and inference performance while maintaining high precision, which indicates practical value and potential for widespread application.
Funding: Supported by the VTT Technical Research Centre of Finland; the work of Nayef Alqahtani is supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Grant KFU251882).
Abstract: Lung cancer (LC) is a major cancer that accounts for high mortality rates worldwide. Doctors utilise many imaging modalities to identify lung tumours and their severity at early stages. Nowadays, machine learning (ML) and deep learning (DL) methodologies are utilised for the robust detection and prediction of lung tumours. Recently, multi-modal imaging has emerged as a robust technique for lung tumour detection by combining various imaging features. Building on this, we propose a novel multi-modal imaging technique named versatile scale malleable image integration and patch wise attention network (VSMI2-PANet), which adopts three imaging modalities: computed tomography (CT), magnetic resonance imaging (MRI) and single photon emission computed tomography (SPECT). The designed model accepts input from CT and MRI images and passes it to the VSMI2 module, which is composed of three sub-modules: an image cropping module, a scale malleable convolution layer (SMCL) and a PANet module. CT and MRI images are passed through the image cropping module in parallel to crop the meaningful image patches and provide them to the SMCL module. The SMCL module is composed of adaptive convolutional layers that process those patches in parallel while preserving the spatial information. The output from the SMCL is then fused and provided to the PANet module. The PANet module examines the fused patches by analysing the height, width and channels of each image patch. As a result, it outputs high-resolution spatial attention maps indicating the location of suspicious tumours. The high-resolution spatial attention maps are then provided as input to the backbone module, which uses a light wave transformer (LWT) to segment the lung tumours into three classes: normal, benign and malignant. In addition, the LWT also accepts the SPECT image as input to capture the variations precisely when segmenting the lung tumours. The performance of the proposed model is validated using several performance metrics, such as accuracy, precision, recall, F1-score and AUC, and the results show that the proposed work outperforms the existing approaches.
Funding: Supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2024ZD0608100) and the National Natural Science Foundation of China (62332017, U22A2022).
Abstract: In high-risk industrial environments like nuclear power plants, precise defect identification and localization are essential for maintaining production stability and safety. However, the complexity of such a harsh environment leads to significant variations in the shape and size of the defects. To address this challenge, we propose the multivariate time series segmentation network (MSSN), which adopts a multiscale convolutional network with multi-stage and depth-separable convolutions for efficient feature extraction through variable-length templates. To tackle the classification difficulty caused by structural signal variance, MSSN employs logarithmic normalization to adjust instance distributions. Furthermore, it integrates classification with smoothing loss functions to accurately identify defect segments amid similar structural and defect signal subsequences. Evaluated on both the Mackey-Glass dataset and an industrial dataset, our algorithm achieves over 95% localization and demonstrates its capture capability on the synthetic dataset. On a nuclear plant's heat transfer tube dataset, it captures 90% of defect instances with a 75% middle localization F1 score.
Funding: The National Natural Science Foundation of China (No. 62266025).
Abstract: Segmentation of the retinal vessels in the fundus is crucial for diagnosing ocular diseases. Retinal vessel images often suffer from category imbalance and large scale variations. This ultimately results in incomplete vessel segmentation and poor continuity. In this study, we propose CT-MFENet to address these issues. First, the use of a context transformer (CT) allows for the integration of contextual feature information, which helps establish the connection between pixels and solve the problem of incomplete vessel continuity. Second, multi-scale dense residual networks are used instead of a traditional CNN to address the issue of inadequate local feature extraction when the model encounters vessels at multiple scales. In the decoding stage, we introduce a local-global fusion module. It enhances the localization of vascular information and reduces the semantic gap between high- and low-level features. To address the class imbalance in retinal images, we propose a hybrid loss function that enhances the segmentation ability of the model for topological structures. We conducted experiments on the publicly available DRIVE, CHASEDB1, STARE, and IOSTAR datasets. The experimental results show that our CT-MFENet performs better than most existing methods, including the baseline U-Net.
Funding: Open Access funding provided by the National Institutes of Health (NIH). The funding for this project was provided by the NCATS Intramural Fund.
Abstract: Semantic segmentation plays a foundational role in biomedical image analysis, providing precise information about cellular, tissue, and organ structures in both biological and medical imaging modalities. Traditional approaches often fail in the face of challenges such as low contrast, morphological variability, and densely packed structures. Recent advancements in deep learning have transformed segmentation capabilities through the integration of fine-scale detail preservation, coarse-scale contextual modeling, and multi-scale feature fusion. This work provides a comprehensive analysis of state-of-the-art deep learning models, including U-Net variants, attention-based frameworks, and Transformer-integrated networks, highlighting innovations that improve accuracy, generalizability, and computational efficiency. Key architectural components such as convolution operations, shallow and deep blocks, skip connections, and hybrid encoders are examined for their roles in enhancing spatial representation and semantic consistency. We further discuss the importance of hierarchical and instance-aware segmentation and annotation in interpreting complex biological scenes and multiplexed medical images. By bridging methodological developments with diverse application domains, this paper outlines current trends and future directions for semantic segmentation, emphasizing its critical role in facilitating annotation, diagnosis, and discovery in biomedical research.
Abstract: Optimal scale selection is the key step in slope segmentation. Taking three geomorphological units in different parts of the loess as test areas and 5 m-resolution DEMs as the original test data, this paper employed the modified ROC-LV (Lucian, 2010) to judge the optimal scales in the slope segmentation process. The experimental results showed that this method is effective in determining the optimal scale in slope segmentation. The results also showed that the slope segmentation of different geomorphological units requires different optimal scales because the landform complexity varies. The three test areas require the same scale to distinguish small gullies, because all the test areas have many gullies of the same size; however, when it comes to distinguishing basins, the test areas require different scales, since the complexity of the three areas is different.
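As a rough illustration of the scale-selection idea in the abstract above, the sketch below computes the rate of change of local variance (ROC-LV) as it is commonly defined: the percentage change in mean local variance from one scale level to the next, with local peaks treated as candidate optimal scales. This is the unmodified baseline; the abstract does not describe how the paper changed it. The function names and the sample numbers in the usage note are hypothetical.

```python
def roc_lv(local_variances):
    """Percent change of local variance between consecutive scale levels.

    Element i of the result compares scale level i+1 with level i.
    Assumes every local variance value is positive.
    """
    return [
        (cur - prev) / prev * 100.0
        for prev, cur in zip(local_variances, local_variances[1:])
    ]


def candidate_scales(scales, local_variances):
    """Scales at which the ROC-LV curve has a strict local peak."""
    roc = roc_lv(local_variances)
    peaks = []
    for i in range(1, len(roc) - 1):
        if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]:
            # roc[i] describes the step into scales[i + 1].
            peaks.append(scales[i + 1])
    return peaks
```

With hypothetical scale parameters `[10, 20, 30, 40, 50, 60]` and local variances `[2.0, 2.2, 2.3, 2.9, 3.0, 3.05]`, the only ROC-LV peak is the jump from 2.3 to 2.9, so `candidate_scales` returns `[40]`.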
Funding: Projects (61172002, 61001047, 60671050) supported by the National Natural Science Foundation of China; Project (N100404010) supported by the Fundamental Research Grant Scheme for the Central Universities, China.
Abstract: A new algorithm for the segmentation of suspected lung ROIs (regions of interest) by mean-shift clustering and multi-scale Hessian matrix dot filtering was proposed. The original image was first filtered by multi-scale Hessian matrix dot filters, so that round suspected nodular lesions in the image were enhanced and the linear regions of the trachea and blood vessels were suppressed. Then, three types of information, namely the shape filtering value of the Hessian matrix, the gray value, and the spatial location, were introduced into the feature space. The kernel function of mean-shift clustering was written as the product of three kernel functions corresponding to the three kinds of feature information. Finally, bandwidths were calculated adaptively for each suspected area and used in mean-shift clustering segmentation. Experimental results show that by introducing Hessian matrix dot filtering information into mean-shift clustering, nodular regions can be segmented from blood vessels, trachea, or cross regions connected to the nodule, non-nodular areas can be removed from ROIs properly, and ground glass opacity (GGO) nodular areas can also be segmented. For the experimental data set of 127 nodules of different forms, the average accuracy of the proposed algorithm is more than 90%.
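The product-form kernel described in the abstract above can be sketched as a single mean-shift update in which the weight of each sample is the product of three kernels, one per feature (shape-filter value, gray value, spatial location). The Gaussian profile, the fixed bandwidths, the tuple layout `(shape, gray, (x, y))` and all function names are assumptions for illustration; the paper computes bandwidths adaptively and does not specify its kernel profile here.

```python
import math


def product_kernel_weight(p, q, h_shape, h_gray, h_space):
    """Product of three Gaussian kernels between feature points p and q.

    Each point is (shape_value, gray_value, (x, y)); h_* are bandwidths.
    """
    ds = (p[0] - q[0]) / h_shape
    dg = (p[1] - q[1]) / h_gray
    dx = (p[2][0] - q[2][0]) / h_space
    dy = (p[2][1] - q[2][1]) / h_space
    return (math.exp(-0.5 * ds * ds)
            * math.exp(-0.5 * dg * dg)
            * math.exp(-0.5 * (dx * dx + dy * dy)))


def mean_shift_step(p, samples, h_shape, h_gray, h_space):
    """Move p to the kernel-weighted mean of the samples."""
    weights = [product_kernel_weight(p, s, h_shape, h_gray, h_space)
               for s in samples]
    total = sum(weights)  # positive whenever samples is non-empty
    shape = sum(w * s[0] for w, s in zip(weights, samples)) / total
    gray = sum(w * s[1] for w, s in zip(weights, samples)) / total
    x = sum(w * s[2][0] for w, s in zip(weights, samples)) / total
    y = sum(w * s[2][1] for w, s in zip(weights, samples)) / total
    return (shape, gray, (x, y))


def mean_shift_mode(p, samples, h_shape, h_gray, h_space,
                    tol=1e-9, max_iter=500):
    """Iterate mean_shift_step until the bandwidth-scaled move is below tol."""
    for _ in range(max_iter):
        q = mean_shift_step(p, samples, h_shape, h_gray, h_space)
        move = max(abs(q[0] - p[0]) / h_shape,
                   abs(q[1] - p[1]) / h_gray,
                   abs(q[2][0] - p[2][0]) / h_space,
                   abs(q[2][1] - p[2][1]) / h_space)
        p = q
        if move < tol:
            break
    return p
```

Pixels whose iterations end at the same mode would be grouped into one cluster. Because the weight is a product, a sample must be close in all three features at once to influence the update, which is what lets a bright round nodule separate from an adjacent vessel of similar gray value.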
Funding: Project supported by the National Natural Science Foundation of China (Nos. 51578496 and 51878603) and the Zhejiang Provincial Natural Science Foundation of China (No. LZ16E080001).
Abstract: In this study, we examined the thermal effects throughout the process of the placement of span-scale girder segments on a 6×110-m continuous steel box girder in the Hong Kong-Zhuhai-Macao Bridge. Firstly, when a span-scale girder segment is temporarily stored in the open air, temperature gradients will significantly increase the maximum reaction force on temporary supports and cause local buckling at the bottom of the girder segment. Secondly, due to the temperature difference of the girder segments before and after girth-welding, some residual thermal deflections will appear on the girder segments because the boundary conditions of the structure are changed by the girth-welding. Thirdly, the thermal expansion and thermal bending of girder segments will cause movement and rotation of bearings, which must be considered in setting bearings. We propose control measures for these problems based on finite element method simulation with field-measured temperatures. The local buckling during open-air storage can be avoided by reasonably determining the appropriate positions of temporary supports using analysis of overall and local stresses. The residual thermal deflections can be overcome by performing girth-welding during a period when the vertical temperature difference of the girder is within 1 °C, such as after 22:00. Some formulas are proposed to determine the pre-set distances for bearings, in which the movement and rotation of the bearings due to dead loads and thermal loads are considered. Finally, the feasibility of these control measures in the placement of span-scale girder segments on a real continuous girder was verified: no local buckling was observed during open-air storage; the residual thermal deflections after girth-welding were controlled within 5 mm and the residual pre-set distances of bearings when the whole continuous girder reached its design state were controlled within 20 mm.
Funding: Supported financially by the National Natural Science Foundation of China (No. 51875443), the Guangdong Basic and Applied Basic Research Foundation (Nos. 2019B1515120016 and 202002030290), the Shaanxi Co-Innovation Projects (No. 2015KTTSGY03-03), the Shaanxi Natural Science Foundation (No. 2015JQ5200), and the Open Project from the Key Lab of Guangdong for Modern Surface Engineering Technology, with financial support from the Guangdong Academy of Sciences' Project of Constructing First-class Domestic Research Institutions (Nos. 2019GDASYL-0503006, 2020GDASYL-20200302011).
Abstract: The oxide scale present on feedstock particles is critical for inter-particle bond formation in the cold spray (CS) coating process; oxide scale break-up is therefore a prerequisite for clean metallic contact, which greatly improves the quality of inter-particle bonding within the deposited coating. In general, a spray powder with a thicker oxide scale on its surface (i.e., a powder with high oxygen content) requires a higher critical particle velocity for coating formation, which also lowers the deposition efficiency (DE) and makes the whole process challenging. In this work, it is reported for the first time that an artificially oxidized copper (Cu) powder containing a high oxygen content of 0.81 wt.% with a thick surface oxide scale of 0.71 μm can help achieve a marked increase in DE. During high-velocity particle impact, the surface oxide scale was observed to evolve from crack initiation through segmentation to peeling-off, which contributes to the increase in DE. Single-particle deposit observations revealed that the thick oxide scale peels off from most of the sprayed powder surfaces during the high-velocity impact, leaving a clean metallic surface on the deposited particle. This allows successive particles to bond easily and thus leads to a higher DE. Further, owing to the peeling-off of the oxide scale from the feedstock particles, very few discontinuous oxide scale segments are retained at inter-particle boundaries, ensuring a high electrical conductivity within the resulting deposit. The dependency of peeling-off during high-velocity particle impact on the threshold thickness of the oxide scale was also investigated.
Funding: Supported by the Postgraduate Scientific Research Innovation Project of Hunan Province under Grant QL20210212 and the Scientific Innovation Fund for Postgraduates of Central South University of Forestry and Technology under Grant CX202102043.
Abstract: In the smart logistics industry, unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and outperform traditional manually driven forklifts. Therefore, they play a critical role in smart warehousing, and semantic segmentation is an effective method to realize the intelligent identification of logistics pallets. However, most current recognition algorithms are ineffective due to the diverse types of pallets, their complex shapes, frequent occlusions in production environments, and changing lighting conditions. This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention (MFMBA) neural network for logistics pallet segmentation. To better predict the foreground category (the pallet) and the background category (the cargo) of a pallet image, our approach extracts three types of features (grayscale, texture, and Hue, Saturation, Value features) and fuses them. In actual, complex environments, the size and shape of the pallet may appear different in the image, which usually makes feature extraction difficult; to deal with this, our study proposes a multiscale architecture that can extract additional semantic features. Also, since a traditional attention mechanism only assigns attention weights from a single direction, we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions, horizontally and vertically, significantly improving segmentation. Finally, comparative experimental results show that the precision of the proposed algorithm is 0.53%–8.77% better than that of the other methods we compared.
Funding: Funded by the National Natural Science Foundation of China Youth Project (61603127).
Abstract: Traditional models for semantic segmentation in point clouds primarily focus on smaller scales. However, in real-world applications, point clouds often exhibit larger scales, leading to heavy computational and memory requirements. The key to handling large-scale point clouds lies in leveraging random sampling, which offers higher computational efficiency and lower memory consumption compared to other sampling methods. Nevertheless, the use of random sampling can potentially result in the loss of crucial points during the encoding stage. To address these issues, this paper proposes the cross-fusion self-attention network (CFSA-Net), a lightweight and efficient network architecture specifically designed for directly processing large-scale point clouds. At the core of this network is the incorporation of random sampling alongside a local feature extraction module based on cross-fusion self-attention (CFSA). This module effectively integrates long-range contextual dependencies between points by employing hierarchical position encoding (HPC). Furthermore, it enhances the interaction between each point's coordinates and feature information through cross-fusion self-attention pooling, enabling the acquisition of more comprehensive geometric information. Finally, a residual optimization (RO) structure is introduced to extend the receptive field of individual points by stacking hierarchical position encoding and cross-fusion self-attention pooling, thereby reducing the impact of information loss caused by random sampling. Experimental results on the Stanford Large-Scale 3D Indoor Spaces (S3DIS), Semantic3D, and SemanticKITTI datasets demonstrate the superiority of this algorithm over advanced approaches such as RandLA-Net and KPConv. These findings underscore the excellent performance of CFSA-Net in large-scale 3D semantic segmentation.
Funding: This work was supported by the Natural Science Foundation of China (Grant No. 62071098); the Sichuan Science and Technology Program (Grant Nos. 2019YFG0191, 2021YFG0307); and the Sichuan Zizhou Agricultural Science and Technology Co., Ltd. project: Internet+smart Zanthoxylum planting weather risk warning system.
Abstract: Zanthoxylum bungeanum Maxim, generally called prickly ash, is widely grown in China. Zanthoxylum rust is the main disease affecting the growth and quality of Zanthoxylum. Traditional methods for recognizing the degree of Zanthoxylum rust infection rely mainly on manual experience. Because of the complex colors and shapes of rust areas, manual recognition has low accuracy and is difficult to quantify. In recent years, artificial intelligence technology has been applied increasingly in agriculture. In this paper, building on the DeepLabV2 model, we propose a Zanthoxylum rust image segmentation model based on the FASPP module and enhanced features of rust areas. We also construct a fine-grained Zanthoxylum rust image dataset in which each image is segmented and labeled according to leaves, spore piles, and brown lesions. The experimental results show that the proposed segmentation method is effective. The segmentation accuracy rates for leaves, spore piles, and brown lesions reached 99.66%, 85.16%, and 82.47%, respectively. MPA reached 91.80%, and MIoU reached 84.99%. The proposed model is also efficient, processing 22 images per minute. This article provides an intelligent method for efficiently and accurately recognizing the degree of Zanthoxylum rust infection.
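For readers unfamiliar with the two summary metrics, the sketch below shows how mean pixel accuracy (MPA) and mean intersection over union (MIoU) are conventionally computed from a confusion matrix. It uses the standard definitions, which the abstract does not spell out; the class encoding and the toy labels are hypothetical and unrelated to the paper's data.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """Average over classes of (correct pixels / true pixels of that class)."""
    accs = [cm[i][i] / sum(cm[i]) for i in range(len(cm)) if sum(cm[i]) > 0]
    return sum(accs) / len(accs)


def mean_iou(cm):
    """Average over classes of intersection / union."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # true class i, predicted other
        fp = sum(cm[r][i] for r in range(n)) - tp   # predicted i, true class other
        union = tp + fn + fp
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Hypothetical encoding: 0 = leaf, 1 = spore pile, 2 = brown lesion.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, 3)
mpa = mean_pixel_accuracy(cm)   # 0.75 on this toy example
miou = mean_iou(cm)             # about 0.589 on this toy example
```

MIoU also penalizes false positives, so it is typically lower than MPA, which is consistent with the ordering of the two figures reported in the abstract.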
Funding: This work was supported in part by the National Natural Science Foundation of China (Nos. 62072074, 62076054, 62027827, 61902054); the Frontier Science and Technology Innovation Projects of National Key R&D Program (No. 2019QY1405); the Sichuan Science and Technology Innovation Platform and Talent Plan (No. 2020JDJQ0020); the Sichuan Science and Technology Support Plan (No. 2020YFSY0010); and the Natural Science Foundation of Guangdong Province (No. 2018A030313354).
Abstract: As an important part of the new generation of information technology, the Internet of Things (IoT) has attracted wide attention and is regarded as an enabling technology for next-generation health care systems. Fundus photography equipment is connected to a cloud platform through the IoT so that fundus images can be uploaded in real time and diagnostic suggestions can be issued rapidly by artificial intelligence. At the same time, important security and privacy issues have emerged. The data uploaded to the cloud platform include patients' personal attributes, health status, and medical application data. If these data are leaked, abused, or improperly disclosed, personal information security is violated. It is therefore important to address the security and privacy issues that arise when large numbers of medical and healthcare devices connect to IoT healthcare infrastructure. To meet this challenge, we propose MIA-UNet, a multi-scale iterative aggregation U-network, which aims to achieve accurate and efficient retinal vessel segmentation for ophthalmic auxiliary diagnosis while keeping computational complexity low enough for mobile terminals. In this way, users do not need to upload data to the cloud platform and can analyze and process fundus images on their own mobile terminals, thus avoiding the leakage of personal information. Specifically, the interconnection between encoder and decoder, as well as the internal connections between decoder subnetworks in the classic U-Net, are redefined and redesigned. Furthermore, we propose a hybrid loss function to smooth the gradient and deal with the imbalance between foreground and background. Compared with U-Net, the segmentation performance of the proposed network is significantly improved while the number of parameters increases by only 2%. When applied to three publicly available datasets, DRIVE, STARE, and CHASE_DB1, the proposed network achieves accuracy/F1-scores of 96.33%/84.34%, 97.12%/83.17%, and 97.06%/84.10%, respectively. The experimental results show that MIA-UNet is superior to state-of-the-art methods.
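The gap between the reported accuracy and F1-score follows from the foreground/background imbalance the abstract mentions. The sketch below uses the standard definitions of both metrics on a hypothetical 100-pixel mask in which only 10 pixels are vessel; the numbers are illustrative and are not taken from the paper.

```python
def accuracy_and_f1(y_true, y_pred):
    """Pixel accuracy and F1-score of the foreground (label 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Hypothetical mask: 10 vessel pixels followed by 90 background pixels.
y_true = [1] * 10 + [0] * 90
# Prediction misses 2 vessel pixels and raises 2 false alarms.
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88

acc, f1 = accuracy_and_f1(y_true, y_pred)   # acc = 0.96, f1 = 0.80
```

Accuracy is dominated by the abundant background pixels, whereas F1 depends only on how well the thin vessel class is recovered, so the two are usually reported together for vessel segmentation.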
Funding: National Natural Science Foundation of China (No. 61261029).
Abstract: Watershed segmentation is sensitive to noise and irregular details within the image, which frequently leads to serious over-segmentation. Linear filtering before watershed segmentation can reduce over-segmentation to some extent; however, it often shifts the position of object contours. To reduce over-segmentation while preserving the location of object contours, a watershed segmentation method based on hierarchical multi-scale modification of the morphological gradient is proposed. First, multi-scale morphological filtering is employed to smooth the original image. Then, the gradient image is divided into multiple levels by the volume of the three-dimensional topographic relief; the lower gradient layers are modified by morphological closing with larger structuring elements, and the higher layers with smaller ones. In this way, most local minima caused by irregular details and noise are removed, while the region contour positions corresponding to the target area are largely preserved. Finally, the morphological watershed algorithm is applied to the modified gradient image. The experimental results show that the proposed method greatly reduces the over-segmentation of the watershed and avoids the position offset of object contours.
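The level-dependent closing step can be sketched on a one-dimensional gradient profile. This is a simplified illustration, not the paper's procedure: the layers here are split by a single fixed threshold rather than by the volume of the topographic relief, and the structuring-element widths (5 and 3), the threshold, and the sample profile are all assumed values.

```python
def dilate(signal, size):
    """Flat grayscale dilation: maximum over a centered window of odd width."""
    half, n = size // 2, len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def erode(signal, size):
    """Flat grayscale erosion: minimum over a centered window of odd width."""
    half, n = size // 2, len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def closing(signal, size):
    """Morphological closing (dilation, then erosion); fills narrow valleys."""
    return erode(dilate(signal, size), size)


def layered_closing(gradient, threshold, large=5, small=3):
    """Close low-gradient samples with the large element, high ones with the small."""
    closed_large = closing(gradient, large)
    closed_small = closing(gradient, small)
    return [cl if g < threshold else cs
            for g, cl, cs in zip(gradient, closed_large, closed_small)]


# Low-gradient area with a shallow noise valley (the zeros), followed by two
# strong contour ridges (the 9s) that enclose a genuine thin region (the 7s).
gradient = [1, 0, 0, 0, 1, 9, 7, 7, 7, 9, 1, 1, 1]

modified = layered_closing(gradient, threshold=5)
uniform = closing(gradient, 5)   # one large element everywhere, for comparison
```

In this example the layered version fills the shallow valley in the low-gradient area but keeps the valley between the two ridges, whereas closing everything with the large element merges the ridges into one plateau. That difference is the reason for using smaller structuring elements on the higher gradient layers.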
Funding: This research was supported by the National Natural Science Foundation of China (No. 62276086); the National Key R&D Program of China (No. 2022YFD2000100); and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LTGN23D010002).
Abstract: Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked, so intelligent picking machines need to perform more delicate operations. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and they are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet), using a dataset collected from a tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, in which small to medium-sized targets account for 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing tea shoots, outperforming the other models. Through ablation experiments, we found that ResNet50, the PointRend strategy, and the Feature Pyramid Network (FPN) architecture improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrate that the proposed multi-scale and point selection strategy improves feature extraction for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform shoot segmentation in unstructured environments, distinguishing individual tea shoots and fully extracting shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.