In the automated analysis of eye fundus images, it is a common and consequential fallacy that prior works achieve very high scores in lesion segmentation. The fallacy is fueled by some reviews reporting very high scores, and perhaps by some confusion over terms. A simple analysis of the few prior works that really do segmentation reveals scores between 7% and 70% in sensitivity at 1 FPI. That is clearly below the level of medical doctors trained to detect signs of Diabetic Retinopathy, since they can distinguish the contours of lesions in Eye Fundus Images (EFI) well. Still, a full segmentation of lesions could be an important step for both visualization and further automated analysis, using rigorous quantification of areas and numbers of lesions to improve diagnosis. I discuss what prior work really does, using evidence-based analysis, and compare it with segmentation networks on the terms used by prior work, showing that the best-performing segmentation network outperforms those prior works. I also compare architectures to understand how the network architecture influences the results. I conclude that, with the correct architecture and tuning, the semantic segmentation network improves by up to 20 percentage points over prior work in the real task of lesion segmentation. I also conclude that the network architecture and optimizations are important factors and that current work still has important limitations.
Transformer-based semantic segmentation approaches, which divide the image into different regions by sliding windows and model the relation inside each window, have achieved outstanding success. However, since the relation modeling between windows was not the primary emphasis of previous work, it was not fully utilized. To address this issue, we propose a Graph-Segmenter, including a graph transformer and a boundary-aware attention module, which is an effective network for simultaneously modeling the more profound relation between windows in a global view and various pixels inside each window as a local one, and for substantial low-cost boundary adjustment. Specifically, we treat every window and pixel inside the window as nodes to construct graphs for both views and devise the graph transformer. The introduced boundary-aware attention module optimizes the edge information of the target objects by modeling the relationship between the pixels on the object's edge. Extensive experiments on three widely used semantic segmentation datasets (Cityscapes, ADE-20k and PASCAL Context) demonstrate that our proposed network, a Graph Transformer with Boundary-aware Attention, can achieve state-of-the-art segmentation performance.
Detecting pavement cracks is critical for road safety and infrastructure management. Traditional methods, relying on manual inspection and basic image processing, are time-consuming and prone to errors. Recent deep-learning (DL) methods automate crack detection, but many still struggle with variable crack patterns and environmental conditions. This study aims to address these limitations by introducing the Masker Transformer, a novel hybrid deep learning model that integrates the precise localization capabilities of Mask Region-based Convolutional Neural Network (Mask R-CNN) with the global contextual awareness of Vision Transformer (ViT). The research focuses on leveraging the strengths of both architectures to enhance segmentation accuracy and adaptability across different pavement conditions. We evaluated the performance of the Masker Transformer against other state-of-the-art models such as U-Net, Transformer U-Net (TransUNet), U-Net Transformer (UNETr), Swin U-Net Transformer (Swin-UNETr), You Only Look Once version 8 (YoloV8), and Mask R-CNN using two benchmark datasets: Crack500 and DeepCrack. The findings reveal that the Masker Transformer significantly outperforms the existing models, achieving the highest Dice Similarity Coefficient (DSC), precision, recall, and F1-Score across both datasets. Specifically, the model attained a DSC of 80.04% on Crack500 and 91.37% on DeepCrack, demonstrating superior segmentation accuracy and reliability. The high precision and recall rates further substantiate its effectiveness in real-world applications, suggesting that the Masker Transformer can serve as a robust tool for automated pavement crack detection, potentially replacing more traditional methods.
Retinal aging has been recognized as a significant risk factor for various retinal disorders, including diabetic retinopathy, age-related macular degeneration, and glaucoma, following a growing understanding of the molecular underpinnings of their development. This comprehensive review explores the mechanisms of retinal aging and investigates potential neuroprotective approaches, focusing on the activation of transcription factor EB. Recent meta-analyses have demonstrated promising outcomes of transcription factor EB-targeted strategies, such as exercise, calorie restriction, rapamycin, and metformin, in patients and animal models of these common retinal diseases. The review critically assesses the role of transcription factor EB in retinal biology during aging, its neuroprotective effects, and its therapeutic potential for retinal disorders. The impact of transcription factor EB on retinal aging is cell-specific, influencing metabolic reprogramming and energy homeostasis in retinal neurons through the regulation of mitochondrial quality control and nutrient-sensing pathways. In vascular endothelial cells, transcription factor EB controls important processes, including endothelial cell proliferation, endothelial tube formation, and nitric oxide levels, thereby influencing the inner blood-retinal barrier, angiogenesis, and retinal microvasculature. Additionally, transcription factor EB affects vascular smooth muscle cells, inhibiting vascular calcification and atherogenesis. In retinal pigment epithelial cells, transcription factor EB modulates functions such as autophagy, lysosomal dynamics, and clearance of the aging pigment lipofuscin, thereby promoting photoreceptor survival and regulating vascular endothelial growth factor A expression involved in neovascularization. These cell-specific functions of transcription factor EB significantly impact retinal aging mechanisms encompassing proteostasis, neuronal synapse plasticity, energy metabolism, microvasculature, and inflammation, ultimately offering protection against retinal aging and diseases. The review emphasizes transcription factor EB as a potential therapeutic target for retinal diseases. Therefore, it is imperative to obtain well-controlled direct experimental evidence to confirm the efficacy of transcription factor EB modulation in retinal diseases while minimizing its risk of adverse effects.
In recent years, advancements in autonomous vehicle technology have accelerated, promising safer and more efficient transportation systems. However, achieving fully autonomous driving in challenging weather conditions, particularly in snowy environments, remains a challenge. Snow-covered roads introduce unpredictable surface conditions, occlusions, and reduced visibility that require robust and adaptive path detection algorithms. This paper presents an enhanced road detection framework for snowy environments, leveraging Simple Framework for Contrastive Learning of Visual Representations (SimCLR) for self-supervised pretraining, hyperparameter optimization, and uncertainty-aware object detection to improve the performance of You Only Look Once version 8 (YOLOv8). The model is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø, Norway, which covers a range of snow textures, illumination conditions, and road geometries. The proposed framework achieves scores in terms of mAP@50 equal to 99% and mAP@50–95 equal to 97%, demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions. The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments, enabling robust decision-making in hazardous weather conditions. This research lays the groundwork for more resilient perception models in self-driving systems, paving the way for the future development of intelligent and adaptive transportation networks.
The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. Simultaneous Localization and Mapping (SLAM) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for use in static environments, and their accuracy and robustness can degrade in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread is required to segment dynamic target instances, and the Mask R-CNN algorithm is applied on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, with an improvement from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.
Brain tumors present significant challenges in medical diagnosis and treatment, where early detection is crucial for reducing morbidity and mortality rates. This research introduces a novel deep learning model, the Progressive Layered U-Net (PLU-Net), designed to improve brain tumor segmentation accuracy from Magnetic Resonance Imaging (MRI) scans. The PLU-Net extends the standard U-Net architecture by incorporating progressive layering, attention mechanisms, and multi-scale data augmentation. The progressive layering involves a cascaded structure that refines segmentation masks across multiple stages, allowing the model to capture features at different scales and resolutions. Attention gates within the convolutional layers selectively focus on relevant features while suppressing irrelevant ones, enhancing the model's ability to delineate tumor boundaries. Additionally, multi-scale data augmentation techniques increase the diversity of training data and boost the model's generalization capabilities. Evaluated on the BraTS 2021 dataset, the PLU-Net achieved state-of-the-art performance with a dice coefficient of 0.91, specificity of 0.92, sensitivity of 0.89, and Hausdorff95 of 2.5, outperforming other modified U-Net architectures in segmentation accuracy. These results underscore the effectiveness of the PLU-Net in improving brain tumor segmentation from MRI scans, supporting clinicians in early diagnosis, treatment planning, and the development of new therapies.
Brazil’s deforestation monitoring combines accuracy with up-to-date observation for land use and land cover applications. Regular monitoring of deforested and non-deforested areas requires Sentinel-2 multispectral satellite images with several bands at various frequencies, a mix of high- and low-resolution images that makes object classification difficult because of the mixed pixel problem. The mixed pixel problem, which occurs when a pixel belongs to more than one class, reduces accuracy and makes detection challenging. To identify mixed pixels, Band Math is used to merge several bands into a new NDVI band. Thresholding is used to analyze the edges of deforested and non-deforested areas. Segmentation is then used to analyze the pixels, which helps to identify the number of mixed pixels and to compute the deforested and non-deforested areas. Segmented image pixels are used to categorize the deforestation of the Brazilian Amazon Forest between 2019 and 2023. To verify how many pixels are mixed, improve accuracy, and identify mixed pixel issues, the mixed and pure pixels from fuzzy clustering are compared with the subtracted morphological image pixels. With the help of segmentation and clustering, researchers can effectively validate mixed pixels in a specific area. The proposed methodology is easy to analyze and helpful for an appropriate calculation of deforested and non-deforested areas.
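The band-math and thresholding steps in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: NDVI is the standard (NIR - Red) / (NIR + Red) ratio, which for Sentinel-2 is normally computed from bands B8 and B4, while the reflectance values, the 0.4 threshold, and the function names below are assumptions chosen only for the example.

```python
# Illustrative sketch: reflectance values and the 0.4 threshold are made up.

def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def ndvi_band(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized 2-D bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def threshold_mask(band, threshold=0.4):
    """1 = vegetated (NDVI >= threshold), 0 = non-vegetated."""
    return [[1 if v >= threshold else 0 for v in row] for row in band]


# Hypothetical 2x2 tile: top row forested, bottom row cleared.
nir = [[0.60, 0.55], [0.30, 0.20]]
red = [[0.10, 0.15], [0.25, 0.18]]

mask = threshold_mask(ndvi_band(nir, red))
cleared = sum(row.count(0) for row in mask)
deforested_fraction = cleared / sum(len(row) for row in mask)
print(mask, deforested_fraction)
```

A hard threshold like this assigns every pixel to exactly one class, which is where the mixed pixel problem arises; the fuzzy clustering mentioned in the abstract is one way to handle pixels near the threshold.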
Waterproof performance of gaskets between segments is the focus of shield tunnels. This paper proposed an analytical method for determining seepage characteristics at tunnel-gasketed joints based on the hydraulic fracturing theories. First, the mathematical model was established, and the seepage governing equation and boundary conditions were obtained. Second, three dimensionless parameters were introduced for simplifying the expressions, and the seepage governing equations were normalized. Third, analytical expressions were derived for the interface opening and liquid pressure. Moreover, the influencing factors of the seepage process at the gasketed interface were analyzed. Parametric analyses revealed that, in the normalized criterion of liquid viscosity, the liquid tip coordinate was influenced by the degree of negative pressure in the liquid lag region, which was related to the initial contact stress. The coordinate of the liquid tip affected the liquid pressure distribution and the interface opening, which were analyzed under different liquid tip coordinate conditions. Finally, under two limit states, comparative analysis showed that the variation trends obtained with the proposed method agree well with those of previous research. Overall, the proposed analytical method provides a novel solution for waterproofing design in shield tunnels.
Karst fractures serve as crucial seepage channels and storage spaces for carbonate natural gas reservoirs, and electrical image logs are vital data for visualizing and characterizing such fractures. However, the conventional approach of identifying fractures using electrical image logs predominantly relies on manual processes that are not only time-consuming but also highly subjective. In addition, the heterogeneity and strong dissolution tendency of karst carbonate reservoirs lead to complexity and variety in fracture geometry, which makes it difficult to accurately identify fractures. In this paper, the electrical image logs network (EILnet), a deep-learning-based intelligent semantic segmentation model with a selective attention mechanism and selective feature fusion module, was created to enable the intelligent identification and segmentation of different types of fractures through electrical logging images. Data from electrical image logs representing structural and induced fractures were first selected using the sliding window technique before image inpainting and data augmentation were implemented for these images to improve the generalizability of the model. Various image-processing tools, including the bilateral filter, Laplace operator, and Gaussian low-pass filter, were also applied to the electrical logging images to generate a multi-attribute dataset to help the model learn the semantic features of the fractures. The results demonstrated that the EILnet model outperforms mainstream deep-learning semantic segmentation models, such as Fully Convolutional Networks (FCN-8s), U-Net, and SegNet, for both the single-channel dataset and the multi-attribute dataset. The EILnet provided significant advantages for the single-channel dataset, and its mean intersection over union (MIoU) and pixel accuracy (PA) were 81.32% and 89.37%, respectively. In the case of the multi-attribute dataset, the identification capability of all models improved to varying degrees, with the EILnet achieving the highest MIoU and PA of 83.43% and 91.11%, respectively. Further, applying the EILnet model to various blind wells demonstrated its ability to provide reliable fracture identification, thereby indicating its promising potential applications.
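For readers unfamiliar with the two metrics reported above, the following sketch shows how MIoU and PA are commonly computed from a confusion matrix. It assumes the standard definitions (per-class IoU = TP / (TP + FP + FN), averaged over classes; PA = correctly labeled pixels / all pixels); the abstract does not state the exact evaluation code, and the tiny label maps and function names here are hypothetical.

```python
def confusion_matrix(pred, truth, num_classes):
    """m[t][p] counts pixels whose true class is t and predicted class is p."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            m[t][p] += 1
    return m


def pixel_accuracy(m):
    """Fraction of pixels on the diagonal of the confusion matrix."""
    correct = sum(m[c][c] for c in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def mean_iou(m):
    """Average of per-class IoU, skipping classes absent from both maps."""
    ious = []
    for c in range(len(m)):
        tp = m[c][c]
        fn = sum(m[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in m) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Hypothetical 2x3 label maps: 0 = background, 1 = fracture.
truth = [[0, 0, 1], [0, 1, 1]]
pred = [[0, 1, 1], [0, 1, 0]]

m = confusion_matrix(pred, truth, num_classes=2)
print(pixel_accuracy(m), mean_iou(m))
```

Because IoU penalizes both false positives and false negatives for each class, MIoU is typically lower than PA on the same predictions, which matches the pattern in the reported figures.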
Real-time prediction and precise control of sinter quality are pivotal for energy saving, cost reduction, quality improvement and efficiency enhancement in the ironmaking process. To advance the accuracy and comprehensiveness of sinter quality prediction, an intelligent flare monitoring system for sintering machine tails was proposed, based on a hybrid neural network that integrates a convolutional neural network with a long short-term memory network (CNN-LSTM). The system utilized a high-temperature thermal imager for image acquisition at the sintering machine tail and employed a zone-triggered method to accurately capture dynamic feature images under challenging conditions of high temperature, high dust, and occlusion. The feature images were then segmented through a triple-iteration multi-thresholding approach based on the maximum between-class variance method to minimize detail loss during the segmentation process. Leveraging the advantages of CNN and LSTM networks in capturing temporal and spatial information, a comprehensive model for sinter quality prediction was constructed, with inputs including the proportion of combustion layer, porosity rate, temperature distribution, and image features obtained from the convolutional neural network, and outputs comprising quality indicators such as underburning index, uniformity index, and FeO content of the sinter. The accuracy is notably increased, achieving a 95.8% hit rate within an error margin of ±1.0. After the system is applied, the average qualified rate of FeO content increases from 87.24% to 89.99%, an improvement of 2.75 percentage points. The average monthly solid fuel consumption is reduced from 49.75 to 46.44 kg/t, a 6.65% reduction that underscores significant energy saving and cost reduction effects.
The Tan-Lu Fault Zone is a large NNE-trending fault zone that has a substantial effect on the development of eastern China and its earthquake disaster prevention efforts. Aiming at the azimuthally anisotropic structure in the upper crust and seismogenic tectonics in the Hefei segment of this fault, we collected phase velocity dispersion data of fundamental mode Rayleigh waves from ambient noise cross-correlation functions of ~400 temporary seismographs in an area of approximately 80 × 70 km along the fault zone. The period band of the dispersion data was ~0.5–10 s. We inverted for the upper crustal three-dimensional (3-D) shear velocity model with azimuthal anisotropy from the surface to 10 km depth by using a 3-D direct azimuthal anisotropy inversion method. The inversion result shows the spatial distribution characteristics of the tectonic units in the upper crust. Additionally, the deformation of the Tan-Lu Fault Zone and its conjugated fault systems could be inferred from the anisotropy model. In particular, the faults that have remained active from the early and middle Pleistocene control the anisotropic characteristics of the upper crustal structure in this area. The direction of fast axes near the fault zone area in the upper crust is consistent with the strike of the faults, whereas for the region far away from the fault zone, the direction of fast axes is consistent with the direction of the regional principal stress caused by plate movement. Combined with the azimuthal anisotropy models in the deep crust and uppermost mantle from the surface wave and Pn wave, the different anisotropic patterns caused by the Tan-Lu Fault Zone and its conjugated fault system nearby are shown in the upper and lower crust. Furthermore, by using the double-difference method, we relocated the Lujiang earthquake series, which contained 32 earthquakes with a depth shallower than 10 km. Both the Vs model and earthquake relocation results indicate that earthquakes mostly occurred in the vicinity of structural boundaries with fractured media, with high-level development of cracks and small-scale faults jammed between more rigid areas.
A segmented predictor-corrector method is proposed for hypersonic glide vehicles to address the issue of the slow computational speed of obtaining guidance commands using the traditional predictor-corrector guidance method. Firstly, an altitude-energy profile is designed, and the bank angle is derived analytically as the initial iteration value for the predictor-corrector method. The predictor-corrector guidance method has been improved by deriving an analytical form for predicting the range-to-go error, which greatly accelerates the iterative speed. Then, a segmented guidance algorithm is proposed. The above analytical predictor-corrector guidance method is adopted when the energy exceeds an energy threshold. When the energy is less than the threshold, the equidistant test method is used to calculate the bank angle command, which ensures guidance accuracy as well as computational efficiency. Additionally, an adaptive guidance cycle strategy is applied to reduce the computational time of the reentry guidance trajectory. Finally, the accuracy and robustness of the proposed method are verified through a series of simulations and Monte-Carlo experiments. Compared with the traditional integral method, the proposed method requires 75% less computation time on average and achieves a lower landing error.
Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing has been proposed: utilizing the median filtering algorithm to denoise the original infrared image, then applying an image segmentation algorithm to divide the image.
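The two-step pipeline described here (median filtering, then segmentation) can be illustrated with a small pure-Python sketch. The abstract does not name the segmentation algorithm, so a fixed temperature threshold is used below purely as a stand-in; the 3x3 window, the 60-degree threshold, the sample frame, and the function names are all assumptions.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Windows are clipped at the image border, so edge pixels use fewer samples.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median(window)
    return out


def segment_hot(img, threshold):
    """Stand-in segmentation: 1 where the temperature exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in img]


# Hypothetical 4x4 infrared frame (degrees Celsius): one isolated noisy
# pixel at row 1, column 1, and a genuinely hot 2x2 region bottom-right.
frame = [
    [30, 30, 30, 30],
    [30, 95, 30, 30],
    [30, 30, 80, 82],
    [30, 30, 81, 83],
]

denoised = median_filter_3x3(frame)
hot_mask = segment_hot(denoised, threshold=60)
for row in hot_mask:
    print(row)
```

The median filter suits this task because it removes isolated spikes without blurring the boundary of a real hot region the way a mean filter would, so the isolated 95-degree pixel is suppressed while the 2x2 hot block survives segmentation.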
Existing semi-supervised medical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch. However, current copy-paste methods have three limitations: (1) training the model solely with copy-paste mixed pictures from labeled and unlabeled input loses a lot of labeled information; (2) low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data; (3) the segmentation performance in low-contrast and local regions is less than optimal. We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy (SADT), which enhances feature diversity and learns high-quality features to overcome these problems. To be more precise, SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data, which prevents the loss of rare labeled data. We introduce a bi-directional copy-paste mask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision. For the mixed images, Deep-Shallow Spatial Contrastive Learning (DSSCL) is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas. In this procedure, the features retrieved by the Student Network are subjected to a random feature perturbation technique. On two openly available datasets, extensive trials show that our proposed SADT performs much better than the state-of-the-art semi-supervised medical segmentation techniques. Using only 10% of the labeled data for training, SADT was able to acquire a Dice score of 90.10% on the ACDC (Automatic Cardiac Diagnosis Challenge) dataset.
Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules. In the production process, defect samples occur infrequently and exhibit random shapes and sizes, which makes it challenging to collect defective samples. Additionally, the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions. This paper proposes a novel Lightweight Multiscale Feature Fusion network (LMFF) to address these challenges. The network comprises a feature extraction network, a multi-scale feature fusion module (MFF), and a segmentation network. Specifically, a feature extraction network is proposed to obtain multi-scale feature outputs, and a multi-scale feature fusion module (MFF) is used to fuse multi-scale feature information effectively. In order to capture finer-grained multi-scale information from the fusion features, we propose a multi-scale attention module (MSA) in the segmentation network to enhance the network’s ability for small target detection. Moreover, depthwise separable convolutions are introduced to construct depthwise separable residual blocks (DSR) to reduce the model’s parameter number. Finally, to validate the proposed method’s defect segmentation and localization performance, we constructed three solar cell defect detection datasets: SolarCells, SolarCells-S, and PVEL-S. SolarCells and SolarCells-S are monocrystalline silicon datasets, and PVEL-S is a polycrystalline silicon dataset. Experimental results show that the IoU of our method on these three datasets can reach 68.5%, 51.0%, and 92.7%, respectively, and the F1-Score can reach 81.3%, 67.5%, and 96.2%, respectively, which surpasses other commonly used methods and verifies the effectiveness of our LMFF network.
Liver cancer remains a leading cause of mortality worldwide, and precise diagnostic tools are essential for effective treatment planning. Liver Tumors (LTs) vary significantly in size, shape, and location, and can present with tissues of similar intensities, making automatically segmenting and classifying LTs from abdominal tomography images crucial and challenging. This review examines recent advancements in Liver Segmentation (LS) and Tumor Segmentation (TS) algorithms, highlighting their strengths and limitations regarding precision, automation, and resilience. Performance metrics are utilized to assess key detection algorithms and analytical methods, emphasizing their effectiveness and relevance in clinical contexts. The review also addresses ongoing challenges in liver tumor segmentation and identification, such as managing high variability in patient data and ensuring robustness across different imaging conditions. It suggests directions for future research, with insights into technological advancements that can enhance surgical planning and diagnostic accuracy by comparing popular methods. This paper contributes to a comprehensive understanding of current liver tumor detection techniques, provides a roadmap for future innovations, and improves diagnostic and therapeutic outcomes for liver cancer by integrating recent progress with remaining challenges.
文摘In the context of automated analysis of eye fundus images, it is an important common fallacy that prior works achieve very high scores in segmentation of lesions, and that fallacy is fueled by some reviews reporting very high scores, and perhaps some confusion with terms. A simple analysis of the detail of the few prior works that really do segmentation reveals scores between 7% and 70% in sensitivity for 1 FPI. That is clearly sub-par with medical doctors trained to detect signs of Diabetic Retinopathy, since they can distinguish well the contours of lesions in Eye Fundus Images (EFI). Still, a full segmentation of lesions could be an important step for both visualization and further automated analysis using rigorous quantification or areas and numbers of lesions to better diagnose. I discuss what prior work really does, using evidence-based analysis, and confront with segmentation networks, comparing on the terms used by prior work to show that the best performing segmentation network outperforms those prior works. I also compare architectures to understand how the network architecture influences the results. I conclude that, with the correct architecture and tuning, the semantic segmentation network improves up to 20 percentage points over prior work in the real task of segmentation of lesions. I also conclude that the network architecture and optimizations are important factors and that there are still important limitations in current work.
Abstract: Transformer-based semantic segmentation approaches, which divide the image into different regions by sliding windows and model the relation inside each window, have achieved outstanding success. However, since relation modeling between windows was not the primary emphasis of previous work, it was not fully utilized. To address this issue, we propose Graph-Segmenter, which includes a graph transformer and a boundary-aware attention module. It is an effective network for simultaneously modeling the deeper relation between windows in a global view and between the pixels inside each window in a local view, and for substantial low-cost boundary adjustment. Specifically, we treat every window and every pixel inside a window as nodes to construct graphs for both views and devise the graph transformer. The introduced boundary-aware attention module optimizes the edge information of the target objects by modeling the relationship between the pixels on the object's edge. Extensive experiments on three widely used semantic segmentation datasets (Cityscapes, ADE-20k and PASCAL Context) demonstrate that our proposed network, a Graph Transformer with Boundary-aware Attention, can achieve state-of-the-art segmentation performance.
Abstract: Detecting pavement cracks is critical for road safety and infrastructure management. Traditional methods, relying on manual inspection and basic image processing, are time-consuming and prone to errors. Recent deep-learning (DL) methods automate crack detection, but many still struggle with variable crack patterns and environmental conditions. This study aims to address these limitations by introducing the Masker Transformer, a novel hybrid deep learning model that integrates the precise localization capabilities of the Mask Region-based Convolutional Neural Network (Mask R-CNN) with the global contextual awareness of the Vision Transformer (ViT). The research focuses on leveraging the strengths of both architectures to enhance segmentation accuracy and adaptability across different pavement conditions. We evaluated the performance of the Masker Transformer against other state-of-the-art models such as U-Net, Transformer U-Net (TransUNet), U-Net Transformer (UNETr), Swin U-Net Transformer (Swin-UNETr), You Only Look Once version 8 (YoloV8), and Mask R-CNN using two benchmark datasets: Crack500 and DeepCrack. The findings reveal that the Masker Transformer significantly outperforms the existing models, achieving the highest Dice Similarity Coefficient (DSC), precision, recall, and F1-Score across both datasets. Specifically, the model attained a DSC of 80.04% on Crack500 and 91.37% on DeepCrack, demonstrating superior segmentation accuracy and reliability. The high precision and recall rates further substantiate its effectiveness in real-world applications, suggesting that the Masker Transformer can serve as a robust tool for automated pavement crack detection, potentially replacing more traditional methods.
Funding: Supported by the Start-up Fund for new faculty from the Hong Kong Polytechnic University (PolyU) (A0043215) (to SA); the General Research Fund and Research Impact Fund from the Hong Kong Research Grants Council (15106018, R5032-18) (to DYT); the Research Center for SHARP Vision in PolyU (P0045843) (to SA); and the InnoHK scheme from the Hong Kong Special Administrative Region Government (to DYT).
Abstract: Retinal aging has been recognized as a significant risk factor for various retinal disorders, including diabetic retinopathy, age-related macular degeneration, and glaucoma, following a growing understanding of the molecular underpinnings of their development. This comprehensive review explores the mechanisms of retinal aging and investigates potential neuroprotective approaches, focusing on the activation of transcription factor EB. Recent meta-analyses have demonstrated promising outcomes of transcription factor EB-targeted strategies, such as exercise, calorie restriction, rapamycin, and metformin, in patients and animal models of these common retinal diseases. The review critically assesses the role of transcription factor EB in retinal biology during aging, its neuroprotective effects, and its therapeutic potential for retinal disorders. The impact of transcription factor EB on retinal aging is cell-specific, influencing metabolic reprogramming and energy homeostasis in retinal neurons through the regulation of mitochondrial quality control and nutrient-sensing pathways. In vascular endothelial cells, transcription factor EB controls important processes, including endothelial cell proliferation, endothelial tube formation, and nitric oxide levels, thereby influencing the inner blood-retinal barrier, angiogenesis, and retinal microvasculature. Additionally, transcription factor EB affects vascular smooth muscle cells, inhibiting vascular calcification and atherogenesis. In retinal pigment epithelial cells, transcription factor EB modulates functions such as autophagy, lysosomal dynamics, and clearance of the aging pigment lipofuscin, thereby promoting photoreceptor survival and regulating vascular endothelial growth factor A expression involved in neovascularization. These cell-specific functions of transcription factor EB significantly impact retinal aging mechanisms encompassing proteostasis, neuronal synapse plasticity, energy metabolism, microvasculature, and inflammation, ultimately offering protection against retinal aging and diseases. The review emphasizes transcription factor EB as a potential therapeutic target for retinal diseases. Therefore, it is imperative to obtain well-controlled direct experimental evidence to confirm the efficacy of transcription factor EB modulation in retinal diseases while minimizing its risk of adverse effects.
Abstract: In recent years, advancements in autonomous vehicle technology have accelerated, promising safer and more efficient transportation systems. However, achieving fully autonomous driving in challenging weather conditions, particularly in snowy environments, remains a challenge. Snow-covered roads introduce unpredictable surface conditions, occlusions, and reduced visibility that require robust and adaptive path detection algorithms. This paper presents an enhanced road detection framework for snowy environments, leveraging the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) for self-supervised pretraining, hyperparameter optimization, and uncertainty-aware object detection to improve the performance of You Only Look Once version 8 (YOLOv8). The model is trained and evaluated on a custom-built dataset collected from snowy roads in Tromsø, Norway, which covers a range of snow textures, illumination conditions, and road geometries. The proposed framework achieves an mAP@50 of 99% and an mAP@50–95 of 97%, demonstrating the effectiveness of YOLOv8 for real-time road detection in extreme winter conditions. The findings contribute to the safe and reliable deployment of autonomous vehicles in Arctic environments, enabling robust decision-making in hazardous weather conditions. This research lays the groundwork for more resilient perception models in self-driving systems, paving the way for the future development of intelligent and adaptive transportation networks.
Funding: Funded by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (grant number 22KJD440001) and the Changzhou Science & Technology Program (grant number CJ20220232).
Abstract: The Internet of Vehicles (IoV) has become an important direction in the field of intelligent transportation, in which vehicle positioning is a crucial part. Simultaneous Localization and Mapping (SLAM) technology plays a crucial role in vehicle localization and navigation. Traditional SLAM systems are designed for use in static environments, and their accuracy and robustness can degrade in dynamic environments where objects are in constant movement. To address this issue, a new real-time visual SLAM system called MG-SLAM has been developed. Based on ORB-SLAM2, MG-SLAM incorporates a dynamic target detection process that enables the detection of both known and unknown moving objects. In this process, a separate semantic segmentation thread is required to segment dynamic target instances, and the Mask R-CNN algorithm is run on the Graphics Processing Unit (GPU) to accelerate segmentation. To reduce computational cost, only key frames are segmented to identify known dynamic objects. Additionally, a multi-view geometry method is adopted to detect unknown moving objects. The results demonstrate that MG-SLAM achieves higher precision, with an improvement from 0.2730 m to 0.0135 m. Moreover, the processing time required by MG-SLAM is significantly reduced compared to other dynamic scene SLAM algorithms, which illustrates its efficacy in locating objects in dynamic scenes.
Abstract: Brain tumors present significant challenges in medical diagnosis and treatment, where early detection is crucial for reducing morbidity and mortality rates. This research introduces a novel deep learning model, the Progressive Layered U-Net (PLU-Net), designed to improve brain tumor segmentation accuracy from Magnetic Resonance Imaging (MRI) scans. The PLU-Net extends the standard U-Net architecture by incorporating progressive layering, attention mechanisms, and multi-scale data augmentation. The progressive layering involves a cascaded structure that refines segmentation masks across multiple stages, allowing the model to capture features at different scales and resolutions. Attention gates within the convolutional layers selectively focus on relevant features while suppressing irrelevant ones, enhancing the model's ability to delineate tumor boundaries. Additionally, multi-scale data augmentation techniques increase the diversity of training data and boost the model's generalization capabilities. Evaluated on the BraTS 2021 dataset, the PLU-Net achieved state-of-the-art performance with a Dice coefficient of 0.91, specificity of 0.92, sensitivity of 0.89, and Hausdorff95 of 2.5, outperforming other modified U-Net architectures in segmentation accuracy. These results underscore the effectiveness of the PLU-Net in improving brain tumor segmentation from MRI scans, supporting clinicians in early diagnosis, treatment planning, and the development of new therapies.
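The Dice coefficient, sensitivity, and specificity quoted in the abstract above are all derived from the same four pixel counts. The following is a minimal sketch of those definitions for a binary mask, not the evaluation code of the paper; the function name `binary_metrics` and the convention of returning 1.0 for an empty denominator are assumptions made for illustration.

```python
def binary_metrics(pred, truth):
    """Return (dice, sensitivity, specificity) for two flat 0/1 sequences.

    Hypothetical helper: pred and truth are flattened binary masks of
    equal length, with 1 marking a lesion pixel.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    # Count the four confusion-matrix cells pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)

    # Assumed convention: an empty denominator counts as a perfect score.
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 1.0
    specificity = tn / (tn + fp) if (tn + fp) else 1.0
    return dice, sensitivity, specificity


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(binary_metrics(predicted, reference))
```

Dice weights true positives twice, so it rewards overlap rather than the large background class, which is why it is usually reported alongside specificity in tumor segmentation.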
Abstract: Brazil's deforestation monitoring combines accuracy with up-to-date observation for land use and land cover applications. Regular monitoring of deforested and non-deforested areas requires Sentinel-2 multispectral satellite images with several bands at various frequencies; the mix of high- and low-resolution images makes object classification difficult because of the mixed pixel problem. Accuracy is affected by the mixed pixel problem, which occurs when a pixel belongs to more than one class and makes detection challenging. To identify mixed pixels, band math is used to combine several bands into a new band, the Normalized Difference Vegetation Index (NDVI). Thresholding is used to analyze the edges of deforested and non-deforested areas. Segmentation is then used to analyze the pixels, which helps to identify the number of mixed pixels and to compute the deforested and non-deforested areas. Segmented image pixels are used to categorize the deforestation of the Brazilian Amazon Forest between 2019 and 2023. To improve accuracy and identify mixed pixel issues, the number of mixed pixels is verified by comparing the mixed and pure pixels from fuzzy clustering with the pixels of the subtracted morphological image. With the help of segmentation and clustering, researchers can effectively validate mixed pixels in a specific area. The proposed methodology is easy to analyze and helpful for an appropriate calculation of deforested and non-deforested areas.
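The band-math and thresholding steps described in the abstract above can be sketched as follows. NDVI is computed as (NIR - Red) / (NIR + Red); for Sentinel-2 the near-infrared and red bands are B8 and B4. The function names and the 0.4 threshold are hypothetical choices for illustration and are not taken from the paper.

```python
def ndvi(nir, red):
    """Compute NDVI per pixel from two equally sized 2-D reflectance grids."""
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            denom = n + r
            # Guard against division by zero on no-data pixels.
            row.append((n - r) / denom if denom != 0 else 0.0)
        result.append(row)
    return result


def vegetation_mask(ndvi_grid, threshold=0.4):
    """Label pixels at or above the (assumed) threshold as vegetated (1)."""
    return [[1 if v >= threshold else 0 for v in row] for row in ndvi_grid]


if __name__ == "__main__":
    nir_band = [[0.5, 0.3], [0.4, 0.0]]    # e.g. Sentinel-2 B8 reflectance
    red_band = [[0.1, 0.25], [0.4, 0.0]]   # e.g. Sentinel-2 B4 reflectance
    grid = ndvi(nir_band, red_band)
    print(grid)
    print(vegetation_mask(grid))
```

Pixels whose NDVI lies close to the threshold are the natural candidates for mixed pixels, since they combine vegetated and cleared surfaces within one cell.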
Funding: Project (52278421) supported by the National Natural Science Foundation of China; Project (2024ZZTS0754) supported by the Fundamental Research Funds for the Central Universities of Central South University, China; Project (2023CXQD067) supported by the Central South University Innovation-Driven Research Programme, China; Project (2022QNRC001) supported by the Young Elite Scientists Sponsorship Program by CAST; Project (2023TJ-N24) supported by the Youth Talent Program by China Railway Society and the Hunan Provincial Science and Technology Promotion Talent Project.
Abstract: The waterproof performance of gaskets between segments is a key concern in shield tunnels. This paper proposed an analytical method for determining seepage characteristics at tunnel-gasketed joints based on hydraulic fracturing theories. First, the mathematical model was established, and the seepage governing equation and boundary conditions were obtained. Second, three dimensionless parameters were introduced to simplify the expressions, and the seepage governing equations were normalized. Third, analytical expressions were derived for the interface opening and liquid pressure. Moreover, the factors influencing the seepage process at the gasketed interface were analyzed. Parametric analyses revealed that, in the normalized criterion of liquid viscosity, the liquid tip coordinate was influenced by the degree of negative pressure in the liquid lag region, which was related to the initial contact stress. The coordinate of the liquid tip affected the liquid pressure distribution and the interface opening, which were analyzed under different liquid tip coordinate conditions. Finally, under two limit states, comparative analysis showed that the variation trends obtained with the proposed method agree well with those of previous research. Overall, the proposed analytical method provides a novel solution for waterproofing design in shield tunnels.
Funding: The National Natural Science Foundation of China (42472194, 42302153, and 42002144) and the Fundamental Research Funds for the Central Universities (22CX06002A).
Abstract: Karst fractures serve as crucial seepage channels and storage spaces for carbonate natural gas reservoirs, and electrical image logs are vital data for visualizing and characterizing such fractures. However, the conventional approach of identifying fractures using electrical image logs predominantly relies on manual processes that are not only time-consuming but also highly subjective. In addition, the heterogeneity and strong dissolution tendency of karst carbonate reservoirs lead to complexity and variety in fracture geometry, which makes it difficult to accurately identify fractures. In this paper, the electrical image logs network (EILnet), a deep-learning-based intelligent semantic segmentation model with a selective attention mechanism and selective feature fusion module, was created to enable the intelligent identification and segmentation of different types of fractures through electrical logging images. Data from electrical image logs representing structural and induced fractures were first selected using the sliding window technique before image inpainting and data augmentation were implemented for these images to improve the generalizability of the model. Various image-processing tools, including the bilateral filter, Laplace operator, and Gaussian low-pass filter, were also applied to the electrical logging images to generate a multi-attribute dataset to help the model learn the semantic features of the fractures. The results demonstrated that the EILnet model outperforms mainstream deep-learning semantic segmentation models, such as Fully Convolutional Networks (FCN-8s), U-Net, and SegNet, for both the single-channel dataset and the multi-attribute dataset. The EILnet provided significant advantages for the single-channel dataset, and its mean intersection over union (MIoU) and pixel accuracy (PA) were 81.32% and 89.37%, respectively. In the case of the multi-attribute dataset, the identification capability of all models improved to varying degrees, with the EILnet achieving the highest MIoU and PA of 83.43% and 91.11%, respectively. Further, applying the EILnet model to various blind wells demonstrated its ability to provide reliable fracture identification, thereby indicating its promising potential applications.
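Mean intersection over union (MIoU) and pixel accuracy (PA), the two metrics reported above, are commonly computed from a class confusion matrix. The sketch below shows the standard definitions; it assumes `conf[i][j]` counts pixels of true class `i` predicted as class `j`, skips classes that never occur, and is not the evaluation code used for EILnet.

```python
def miou_and_pa(conf):
    """Return (mean IoU, pixel accuracy) from a square confusion matrix.

    conf[i][j] = number of pixels with true class i predicted as class j
    (an assumed layout for this illustration).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))
    pixel_accuracy = correct / total

    ious = []
    for i in range(n):
        tp = conf[i][i]
        true_count = sum(conf[i])                       # row sum
        pred_count = sum(conf[r][i] for r in range(n))  # column sum
        union = true_count + pred_count - tp
        if union > 0:  # ignore classes absent from both masks
            ious.append(tp / union)
    return sum(ious) / len(ious), pixel_accuracy


if __name__ == "__main__":
    # Two classes: background (0) and fracture (1).
    confusion = [[50, 10],
                 [5, 35]]
    print(miou_and_pa(confusion))
```

PA is dominated by the majority class, so a model can score high PA while missing thin fractures; MIoU averages over classes and is the stricter of the two.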
Funding: Funded by the Open Project Program of Anhui Province Key Laboratory of Metallurgical Engineering and Resources Recycling (Anhui University of Technology) (No. SKF21-06) and the Research Fund for Young Teachers of Anhui University of Technology in 2020 (No. QZ202001).
Abstract: Real-time prediction and precise control of sinter quality are pivotal for energy saving, cost reduction, quality improvement and efficiency enhancement in the ironmaking process. To advance the accuracy and comprehensiveness of sinter quality prediction, an intelligent flare monitoring system for sintering machine tails was proposed, based on a hybrid neural network that integrates a convolutional neural network with long short-term memory (CNN-LSTM) networks. The system utilized a high-temperature thermal imager for image acquisition at the sintering machine tail and employed a zone-triggered method to accurately capture dynamic feature images under challenging conditions of high temperature, high dust, and occlusion. The feature images were then segmented through a triple-iteration multi-thresholding approach based on the maximum between-class variance method to minimize detail loss during the segmentation process. Leveraging the advantages of CNN and LSTM networks in capturing temporal and spatial information, a comprehensive model for sinter quality prediction was constructed, with inputs including the proportion of combustion layer, porosity rate, temperature distribution, and image features obtained from the convolutional neural network, and outputs comprising quality indicators such as underburning index, uniformity index, and FeO content of the sinter. The accuracy is notably increased, achieving a 95.8% hit rate within an error margin of ±1.0. After the system is applied, the average qualified rate of FeO content increases from 87.24% to 89.99%, an improvement of 2.75 percentage points. The average monthly solid fuel consumption is reduced from 49.75 to 46.44 kg/t, a 6.65% reduction, underscoring significant energy saving and cost reduction effects.
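The maximum between-class variance method named in the abstract above is Otsu's method. The sketch below shows only the basic single-threshold version on 8-bit gray levels; how the paper iterates it three times to obtain multiple thresholds is not described in the abstract, so that part is deliberately left out. The function name `otsu_threshold` is a hypothetical choice.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    pixels: flat iterable of integer gray levels in [0, levels).
    Pixels <= t form one class and pixels > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0 = 0      # number of pixels in the lower class
    sum0 = 0.0       # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        weight1 = total - weight0
        if weight0 == 0:
            continue          # lower class still empty
        if weight1 == 0:
            break             # upper class empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance up to the constant factor 1 / total**2.
        between = weight0 * weight1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


if __name__ == "__main__":
    image = [10] * 5 + [200] * 5   # dark background, bright flare region
    t = otsu_threshold(image)
    print(t, [1 if p > t else 0 for p in image])
```

One plausible way to obtain several thresholds is to reapply the function to the pixels on each side of the first threshold, but that is an assumption and may differ from the triple-iteration scheme of the paper.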
Funding: Financially supported by the National Key Research and Development Program of China (2022YFC3005600); the Foundation of the Anhui Educational Commission (2023AH051198); the National Natural Science Foundation of China (42125401 and 42104063); and the Joint Open Fund of Mengcheng National Geophysical Observatory (MENGO-202201).
Abstract: The Tan-Lu Fault Zone is a large NNE-trending fault zone that has a substantial effect on the development of eastern China and its earthquake disaster prevention efforts. To study the azimuthally anisotropic structure in the upper crust and the seismogenic tectonics in the Hefei segment of this fault, we collected phase velocity dispersion data of fundamental mode Rayleigh waves from ambient noise cross-correlation functions of ~400 temporary seismographs in an area of approximately 80 × 70 km along the fault zone. The period band of the dispersion data was ~0.5–10 s. We inverted for the upper crustal three-dimensional (3-D) shear velocity model with azimuthal anisotropy from the surface to 10 km depth by using a 3-D direct azimuthal anisotropy inversion method. The inversion result shows the spatial distribution characteristics of the tectonic units in the upper crust. Additionally, the deformation of the Tan-Lu Fault Zone and its conjugated fault systems could be inferred from the anisotropy model. In particular, the faults that have remained active from the early and middle Pleistocene control the anisotropic characteristics of the upper crustal structure in this area. The direction of fast axes near the fault zone area in the upper crust is consistent with the strike of the faults, whereas for the region far away from the fault zone, the direction of fast axes is consistent with the direction of the regional principal stress caused by plate movement. Combined with the azimuthal anisotropy models in the deep crust and uppermost mantle from the surface wave and Pn wave, the different anisotropic patterns caused by the Tan-Lu Fault Zone and its conjugated fault system nearby are shown in the upper and lower crust. Furthermore, by using the double-difference method, we relocated the Lujiang earthquake series, which contained 32 earthquakes with a depth shallower than 10 km. Both the Vs model and earthquake relocation results indicate that earthquakes mostly occurred in the vicinity of structural boundaries with fractured media, with high-level development of cracks and small-scale faults jammed between more rigid areas.
Funding: National Natural Science Foundation of China (Nos. 61773387 and 62022061).
Abstract: A segmented predictor-corrector method is proposed for hypersonic glide vehicles to address the slow computation of guidance commands in the traditional predictor-corrector guidance method. Firstly, an altitude-energy profile is designed, and the bank angle is derived analytically as the initial iteration value for the predictor-corrector method. The predictor-corrector guidance method is improved by deriving an analytical form for predicting the range-to-go error, which greatly accelerates the iteration. Then, a segmented guidance algorithm is proposed. The above analytical predictor-corrector guidance method is adopted when the energy exceeds an energy threshold. When the energy is less than the threshold, the equidistant test method is used to calculate the bank angle command, which ensures guidance accuracy as well as computational efficiency. Additionally, an adaptive guidance cycle strategy is applied to reduce the computational time of the reentry guidance trajectory. Finally, the accuracy and robustness of the proposed method are verified through a series of simulations and Monte-Carlo experiments. Compared with the traditional integral method, the proposed method requires 75% less computation time on average and achieves a lower landing error.
Funding: Supported by the National Key Research and Development Project of China (No. 2023YFB3709605), the National Natural Science Foundation of China (No. 62073193), and the National College Student Innovation Training Program (No. 202310422122).
Abstract: Potential high-temperature risks exist in heat-prone components of electric moped charging devices, such as sockets, interfaces, and controllers. Traditional detection methods have limitations in terms of real-time performance and monitoring scope. To address this, a temperature detection method based on infrared image processing has been proposed: utilizing the median filtering algorithm to denoise the original infrared image, then applying an image segmentation algorithm to divide the image.
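The median-filter denoising step mentioned in the abstract above can be illustrated with a small sketch. The 3 × 3 window and the edge-replication border handling are assumptions, since the abstract does not state the kernel size; `median_filter3` is a hypothetical name.

```python
from statistics import median


def median_filter3(img):
    """Apply a 3x3 median filter to a 2-D list of gray levels.

    Borders are handled by replicating the nearest edge pixel
    (an assumed choice for this illustration).
    """
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp row index
                    xx = min(max(x + dx, 0), width - 1)   # clamp column index
                    window.append(img[yy][xx])
            out[y][x] = median(window)
    return out


if __name__ == "__main__":
    # A uniform thermal patch with one salt-noise pixel in the middle.
    noisy = [[20, 20, 20],
             [20, 255, 20],
             [20, 20, 20]]
    print(median_filter3(noisy))
```

Because the median discards outliers rather than averaging them, an isolated hot pixel is removed without blurring the edge of a genuinely hot component, which matters when the next step is segmentation.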
Funding: Supported by the Natural Science Foundation of China (No. 41804112, author: Chengyun Song).
Abstract: Existing semi-supervised medical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch. However, current copy-paste methods have three limitations: (1) training the model solely with copy-paste mixed images from labeled and unlabeled input loses a lot of labeled information; (2) low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data; (3) the segmentation performance in low-contrast and local regions is less than optimal. We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy (SADT), which enhances feature diversity and learns high-quality features to overcome these problems. More precisely, SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data, which prevents the loss of rare labeled data. We introduce a bi-directional copy-paste mask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision. For the mixed images, Deep-Shallow Spatial Contrastive Learning (DSSCL) is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas. In this procedure, the features extracted by the Student Network are subjected to a random feature perturbation technique. On two openly available datasets, extensive experiments show that our proposed SADT performs much better than the state-of-the-art semi-supervised medical segmentation techniques. Using only 10% of the labeled data for training, SADT achieved a Dice score of 90.10% on the ACDC (Automatic Cardiac Diagnosis Challenge) dataset.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62463002, 62062021 and 62473033; in part by the Guiyang Scientific Plan Project [2023]48–11; and in part by QKHZYD[2023]010, Guizhou Province Science and Technology Innovation Base Construction Project "Key Laboratory Construction of Intelligent Mountain Agricultural Equipment".
Abstract: Solar cell defect detection is crucial for quality inspection in photovoltaic power generation modules. In the production process, defect samples occur infrequently and exhibit random shapes and sizes, which makes it challenging to collect defective samples. Additionally, the complex surface background of polysilicon cell wafers complicates the accurate identification and localization of defective regions. This paper proposes a novel Lightweight Multiscale Feature Fusion network (LMFF) to address these challenges. The network comprises a feature extraction network, a multi-scale feature fusion module (MFF), and a segmentation network. Specifically, a feature extraction network is proposed to obtain multi-scale feature outputs, and the MFF is used to fuse multi-scale feature information effectively. In order to capture finer-grained multi-scale information from the fusion features, we propose a multi-scale attention module (MSA) in the segmentation network to enhance the network's ability for small target detection. Moreover, depthwise separable convolutions are introduced to construct depthwise separable residual blocks (DSR) to reduce the model's parameter count. Finally, to validate the proposed method's defect segmentation and localization performance, we constructed three solar cell defect detection datasets: SolarCells, SolarCells-S, and PVEL-S. SolarCells and SolarCells-S are monocrystalline silicon datasets, and PVEL-S is a polycrystalline silicon dataset. Experimental results show that the IoU of our method on these three datasets reaches 68.5%, 51.0%, and 92.7%, respectively, and the F1-Score reaches 81.3%, 67.5%, and 96.2%, respectively, which surpasses other commonly used methods and verifies the effectiveness of our LMFF network.
Funding: Supported by the "Intelligent Recognition Industry Service Center" as part of the Featured Areas Research Center Program under the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan, and by the National Science and Technology Council, Taiwan, under grants 113-2221-E-224-041 and 113-2622-E-224-002. Additionally, partial support was provided by Isuzu Optics Corporation.
Abstract: Liver cancer remains a leading cause of mortality worldwide, and precise diagnostic tools are essential for effective treatment planning. Liver Tumors (LTs) vary significantly in size, shape, and location, and can present with tissues of similar intensities, making the automatic segmentation and classification of LTs from abdominal tomography images both crucial and challenging. This review examines recent advancements in Liver Segmentation (LS) and Tumor Segmentation (TS) algorithms, highlighting their strengths and limitations regarding precision, automation, and resilience. Performance metrics are utilized to assess key detection algorithms and analytical methods, emphasizing their effectiveness and relevance in clinical contexts. The review also addresses ongoing challenges in liver tumor segmentation and identification, such as managing high variability in patient data and ensuring robustness across different imaging conditions. It suggests directions for future research, with insights into technological advancements that can enhance surgical planning and diagnostic accuracy by comparing popular methods. This paper contributes to a comprehensive understanding of current liver tumor detection techniques, provides a roadmap for future innovations, and improves diagnostic and therapeutic outcomes for liver cancer by integrating recent progress with remaining challenges.