Rapidly and accurately assessing the geometric characteristics of coarse aggregate particles is crucial for ensuring pavement performance in highway engineering.This article introduces an innovative system for the thr...Rapidly and accurately assessing the geometric characteristics of coarse aggregate particles is crucial for ensuring pavement performance in highway engineering.This article introduces an innovative system for the three-dimensional(3D)surface reconstruction of coarse aggregate particles using occlusion-free multi-view imaging.The system captures synchronized images of particles in free fall,employing a matte sphere and a nonlinear optimization approach to estimate the camera projection matrices.A pre-trained segmentation model is utilized to eliminate the background of the images.The Shape from Silhouettes(SfS)algorithm is then applied to generate 3D voxel data,followed by the Marching Cubes algorithm to construct the 3D surface contour.Validation against standard parts and diverse coarse aggregate particles confirms the method's high accuracy,with an average measurement precision of 0.434 mm and a significant increase in scanning and reconstruction efficiency.展开更多
A new algorithm is proposed for restoring disocclusion regions in depth-image-based rendering (DIBR) warped images. Current solutions include layered depth image (LDI), pre-filtering methods, and post-processing m...A new algorithm is proposed for restoring disocclusion regions in depth-image-based rendering (DIBR) warped images. Current solutions include layered depth image (LDI), pre-filtering methods, and post-processing methods. The LDI is complicated, and pre-filtering of depth images causes noticeable geometrical distortions in cases of large baseline warping. This paper presents a depth-aided inpainting method which inherits merits from Criminisi's inpainting algorithm. The proposed method features incorporation of a depth cue into texture estimation. The algorithm efficiently handles depth ambiguity by penalizing larger Lagrange multipliers of flling points closer to the warping position compared with the surrounding existing points. We perform morphological operations on depth images to accelerate the algorithm convergence, and adopt a luma-first strategy to adapt to various color sampling formats. Experiments on test multi-view sequence showed that our method has superiority in depth differentiation and geometrical loyalty in the restoration of warped images. Also, peak signal-to-noise ratio (PSNR) statistics on non-hole regions and whole image comparisons both compare favorably to those obtained by state of the art techniques.展开更多
In multi-view image localization task,the features of the images captured from different views should be fused properly.This paper considers the classification-based image localization problem.We propose the relationa...In multi-view image localization task,the features of the images captured from different views should be fused properly.This paper considers the classification-based image localization problem.We propose the relational graph location network(RGLN)to perform this task.In this network,we propose a heterogeneous graph construction approach for graph classification tasks,which aims to describe the location in a more appropriate way,thereby improving the expression ability of the location representation module.Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin.In addition,the proposed localization method outperforms the compared localization methods by around 1.7%in terms of meter-level accuracy.展开更多
Targeting at a reliable image matching of multiple remote sensing images for the generation of digital surface models,this paper presents a geometric-constrained multi-view image matching method,based on an energy min...Targeting at a reliable image matching of multiple remote sensing images for the generation of digital surface models,this paper presents a geometric-constrained multi-view image matching method,based on an energy minimization framework.By employing a geometrical constraint,the cost value of the energy function was calculated from multiple images,and the cost value was aggregated in an image space using a semi-global optimization approach.A homography transform parameter calculation method is proposed for fast calculation of projection pixel on each image when calculation cost values.It is based on the known interior orientation parameters,exterior orientation parameters,and a given elevation value.For an efficient and reliable processing of multiple remote sensing images,the proposed matching method was performed via a coarse-to-fine strategy through image pyramid.Three sets of airborne remote sensing images were used to evaluate the performance of the proposed method.Results reveal that the multi-view image matching can improve matching reliability.Moreover,the experimental results show that the proposed method performs better than traditional methods.展开更多
Traditional three-dimensional(3D)image reconstruction method,which highly dependent on the environment and has poor reconstruction effect,is easy to lead to mismatch and poor real-time performance.The accuracy of feat...Traditional three-dimensional(3D)image reconstruction method,which highly dependent on the environment and has poor reconstruction effect,is easy to lead to mismatch and poor real-time performance.The accuracy of feature extraction from multiple images affects the reliability and real-time performance of 3D reconstruction technology.To solve the problem,a multi-view image 3D reconstruction algorithm based on self-encoding convolutional neural network is proposed in this paper.The algorithm first extracts the feature information of multiple two-dimensional(2D)images based on scale and rotation invariance parameters of Scale-invariant feature transform(SIFT)operator.Secondly,self-encoding learning neural network is introduced into the feature refinement process to take full advantage of its feature extraction ability.Then,Fish-Net is used to replace the U-Net structure inside the self-encoding network to improve gradient propagation between U-Net structures,and Generative Adversarial Networks(GAN)loss function is used to replace mean square error(MSE)to better express image features,discarding useless features to obtain effective image features.Finally,an incremental structure from motion(SFM)algorithm is performed to calculate rotation matrix and translation vector of the camera,and the feature points are triangulated to obtain a sparse spatial point cloud,and meshlab software is used to display the results.Simulation experiments show that compared with the traditional method,the image feature extraction method proposed in this paper can significantly improve the rendering effect of 3D point cloud,with an accuracy rate of 92.5%and a reconstruction complete rate of 83.6%.展开更多
Through the analysis and comparison of shortcomings and advantages of existing technologies on object modeling in 3D applications,we propose a new modeling method for virtual scene based on multi-view image sequence t...Through the analysis and comparison of shortcomings and advantages of existing technologies on object modeling in 3D applications,we propose a new modeling method for virtual scene based on multi-view image sequence to model irregular objects efficiently in 3D application.In 3D scene,this method can get better visual effect by tracking the viewer's real-time perspective position and projecting the photos from different perspectives dynamically.The philosophy of design,the steps of development and some other relevant topics are discussed in details,and the validity of the algorithm is analyzed.The results demonstrate that this method represents more superiority on simulating irregular objects by applying it to the modeling of virtual museum.展开更多
The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and hist...The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.展开更多
Potential high-temperature risks exist in heat-prone components of electric moped charging devices,such as sockets,interfaces,and controllers.Traditional detection methods have limitations in terms of real-time perfor...Potential high-temperature risks exist in heat-prone components of electric moped charging devices,such as sockets,interfaces,and controllers.Traditional detection methods have limitations in terms of real-time performance and monitoring scope.To address this,a temperature detection method based on infrared image processing has been proposed:utilizing the median filtering algorithm to denoise the original infrared image,then applying an image segmentation algorithm to divide the image.展开更多
Karst fractures serve as crucial seepage channels and storage spaces for carbonate natural gas reservoirs,and electrical image logs are vital data for visualizing and characterizing such fractures.However,the conventi...Karst fractures serve as crucial seepage channels and storage spaces for carbonate natural gas reservoirs,and electrical image logs are vital data for visualizing and characterizing such fractures.However,the conventional approach of identifying fractures using electrical image logs predominantly relies on manual processes that are not only time-consuming but also highly subjective.In addition,the heterogeneity and strong dissolution tendency of karst carbonate reservoirs lead to complexity and variety in fracture geometry,which makes it difficult to accurately identify fractures.In this paper,the electrical image logs network(EILnet)da deep-learning-based intelligent semantic segmentation model with a selective attention mechanism and selective feature fusion moduledwas created to enable the intelligent identification and segmentation of different types of fractures through electrical logging images.Data from electrical image logs representing structural and induced fractures were first selected using the sliding window technique before image inpainting and data augmentation were implemented for these images to improve the generalizability of the model.Various image-processing tools,including the bilateral filter,Laplace operator,and Gaussian low-pass filter,were also applied to the electrical logging images to generate a multi-attribute dataset to help the model learn the semantic features of the fractures.The results demonstrated that the EILnet model outperforms mainstream deep-learning semantic segmentation models,such as Fully Convolutional Networks(FCN-8s),U-Net,and SegNet,for both the single-channel dataset and the multi-attribute dataset.The EILnet provided significant advantages for the single-channel dataset,and its mean intersection over union(MIoU)and pixel accuracy(PA)were 81.32%and 89.37%,respectively.In the case of the multi-attribute dataset,the identification capability of all models improved to varying degrees,with the EILnet achieving the highest MIoU and PA of 83.43%and 91.11%,respectively.Further,applying the EILnet model to various blind wells demonstrated its ability to provide reliable fracture identification,thereby indicating its promising potential applications.展开更多
The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is i...The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection.In this study,we integrate Raman imaging technology with an artificial intelligence(AI)generative model,proposing an innovative approach for intraoperative margin status diagnosis.This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images,which are then transformed into hematoxylin-eosin(H&E)-stained histopathological images using an AI generative model for histopathological diagnosis.The generated H&E-stained images clearly illustrate the tissue’s pathological conditions.Independently reviewed by three pathologists,the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%.Notably,it outperforms current clinical practices,especially in TSCC with positive lymph node metastasis or moderately differentiated grades.This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations,promising a versatile diagnostic tool beyond TSCC.展开更多
Indirect X-ray modulation imaging has been adopted in a number of solar missions and provided reconstructed X-ray images of solar flares that are of great scientific importance.However,the assessment of the image qual...Indirect X-ray modulation imaging has been adopted in a number of solar missions and provided reconstructed X-ray images of solar flares that are of great scientific importance.However,the assessment of the image quality of the reconstruction is still difficult,which is particularly useful for scheme design of X-ray imaging systems,testing and improvement of imaging algorithms,and scientific research of X-ray sources.Currently,there is no specified method to quantitatively evaluate the quality of X-ray image reconstruction and the point-spread function(PSF)of an X-ray imager.In this paper,we propose percentage proximity degree(PPD)by considering the imaging characteristics of X-ray image reconstruction and in particular,sidelobes and their effects on imaging quality.After testing a variety of imaging quality assessments in six aspects,we utilized the technique for order preference by similarity to ideal solution to the indices that meet the requirements.Then we develop the final quality index for X-ray image reconstruction,QuIX,which consists of the selected indices and the new PPD.QuIX performs well in a series of tests,including assessment of instrument PSF and simulation tests under different grid configurations,as well as imaging tests with RHESSI data.It is also a useful tool for testing of imaging algorithms,and determination of imaging parameters for both RHESSI and ASO-S/Hard X-ray Imager,such as field of view,beam width factor,and detector selection.展开更多
Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify sp...Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify specific flaws/diseases for diagnosis.The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification.Most of the extracted image features are irrelevant and lead to an increase in computation time.Therefore,this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features.This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions.The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis.Later,the correlation based on intensity and distribution is analyzed to improve the feature selection congruency.Therefore,the more congruent pixels are sorted in the descending order of the selection,which identifies better regions than the distribution.Now,the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection.Therefore,the probability of feature selection,regardless of the textures and medical image patterns,is improved.This process enhances the performance of ML applications for different medical image processing.The proposed method improves the accuracy,precision,and training rate by 13.19%,10.69%,and 11.06%,respectively,compared to other models for the selected dataset.The mean error and selection time is also reduced by 12.56%and 13.56%,respectively,compared to the same models and dataset.展开更多
Imaging observations of solar X-ray bursts can reveal details of the energy release process and particle acceleration in flares.Most hard X-ray imagers make use of the modulation-based Fourier transform imaging method...Imaging observations of solar X-ray bursts can reveal details of the energy release process and particle acceleration in flares.Most hard X-ray imagers make use of the modulation-based Fourier transform imaging method,an indirect imaging technique that requires algorithms to reconstruct and optimize images.During the last decade,a variety of algorithms have been developed and improved.However,it is difficult to quantitatively evaluate the image quality of different solutions without a true,reference image of observation.How to choose the values of imaging parameters for these algorithms to get the best performance is also an open question.In this study,we present a detailed test of the characteristics of these algorithms,imaging dynamic range and a crucial parameter for the CLEAN method,clean beam width factor(CBWF).We first used SDO/AIA EUV images to compute DEM maps and calculate thermal X-ray maps.Then these realistic sources and several types of simulated sources are used as the ground truth in the imaging simulations for both RHESSI and ASO-S/HXI.The different solutions are evaluated quantitatively by a number of means.The overall results suggest that EM,PIXON,and CLEAN are exceptional methods for sidelobe elimination,producing images with clear source details.Although MEM_GE,MEM_NJIT,VIS_WV and VIS_CS possess fast imaging processes and generate good images,they too possess associated imperfections unique to each method.The two forward fit algorithms,VF and FF,perform differently,and VF appears to be more robust and useful.We also demonstrated the imaging capability of HXI and available HXI algorithms.Furthermore,the effect of CBWF on image quality was investigated,and the optimal settings for both RHESSI and HXI were proposed.展开更多
Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive te...Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive text data.Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future(Thirunavukarasu et al.,2023).This article aims to provide an in-depth analysis of LLMs’current and potential impact on clinical practices.Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education(Hirosawa et al.,2023;Koga et al.,2023).展开更多
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Image-maps,a hybrid design with satellite images as background and map symbols uploaded,aim to combine the advantages of maps’high interpretation efficiency and satellite images’realism.The usability of image-maps i...Image-maps,a hybrid design with satellite images as background and map symbols uploaded,aim to combine the advantages of maps’high interpretation efficiency and satellite images’realism.The usability of image-maps is influenced by the representations of background images and map symbols.Many researchers explored the optimizations for background images and symbolization techniques for symbols to reduce the complexity of image-maps and improve the usability.However,little literature was found for the optimum amount of symbol loading.This study focuses on the effects of background image complexity and map symbol load on the usability(i.e.,effectiveness and efficiency)of image-maps.Experiments were conducted by user studies via eye-tracking equipment and an online questionnaire survey.Experimental data sets included image-maps with ten levels of map symbol load in ten areas.Forty volunteers took part in the target searching experiments.It has been found that the usability,i.e.,average time viewed(efficiency)and average revisits(effectiveness)of targets recorded,is influenced by the complexity of background images,a peak exists for optimum symbol load for an image-map.The optimum levels for symbol load for different image-maps also have a peak when the complexity of the background image/image map increases.The complexity of background images serves as a guideline for optimum map symbol load in image-map design.This study enhanced user experience by optimizing visual clarity and managing cognitive load.Understanding how these factors interact can help create adaptive maps that maintain clarity and usability,guiding AI algorithms to adjust symbol density based on user context.This research establishes the practices for map design,making cartographic tools more innovative and more user-centric.展开更多
The unmanned aerial vehicle(UAV)images captured under low-light conditions are often suffering from noise and uneven illumination.To address these issues,we propose a low-light image enhancement algorithm for UAV imag...The unmanned aerial vehicle(UAV)images captured under low-light conditions are often suffering from noise and uneven illumination.To address these issues,we propose a low-light image enhancement algorithm for UAV images,which is inspired by the Retinex theory and guided by a light weighted map.Firstly,we propose a new network for reflectance component processing to suppress the noise in images.Secondly,we construct an illumination enhancement module that uses a light weighted map to guide the enhancement process.Finally,the processed reflectance and illumination components are recombined to obtain the enhancement results.Experimental results show that our method can suppress the noise in images while enhancing image brightness,and prevent over enhancement in bright regions.Code and data are available at https://gitee.com/baixiaotong2/uav-images.git.展开更多
Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale who...Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale whole slide images(WSIs)features to predict the response to breast cancer NAC more finely.Methods:This work collected 1,670 whole slide images for training and validation sets,internal testing sets,external testing sets,and prospective testing sets of the weakly-supervised deep learning-based multi-task model(DLMM)in predicting treatment response and pCR to NAC.Our approach models two-by-two feature interactions across scales by employing concatenate fusion of single-scale feature representations,and controls the expressiveness of each representation via a gating-based attention mechanism.Results:In the retrospective analysis,DLMM exhibited excellent predictive performance for the prediction of treatment response,with area under the receiver operating characteristic curves(AUCs)of 0.869[95%confidence interval(95%CI):0.806−0.933]in the internal testing set and 0.841(95%CI:0.814−0.867)in the external testing sets.For the pCR prediction task,DLMM reached AUCs of 0.865(95%CI:0.763−0.964)in the internal testing and 0.821(95%CI:0.763−0.878)in the pooled external testing set.In the prospective testing study,DLMM also demonstrated favorable predictive performance,with AUCs of 0.829(95%CI:0.754−0.903)and 0.821(95%CI:0.692−0.949)in treatment response and pCR prediction,respectively.DLMM significantly outperformed the baseline models in all testing sets(P<0.05).Heatmaps were employed to interpret the decision-making basis of the model.Furthermore,it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment during biological basis exploration.Conclusions:The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.展开更多
基金Supported by the Key R&D Projects in Shaanxi Province(2022JBGS3-08)。
文摘Rapidly and accurately assessing the geometric characteristics of coarse aggregate particles is crucial for ensuring pavement performance in highway engineering.This article introduces an innovative system for the three-dimensional(3D)surface reconstruction of coarse aggregate particles using occlusion-free multi-view imaging.The system captures synchronized images of particles in free fall,employing a matte sphere and a nonlinear optimization approach to estimate the camera projection matrices.A pre-trained segmentation model is utilized to eliminate the background of the images.The Shape from Silhouettes(SfS)algorithm is then applied to generate 3D voxel data,followed by the Marching Cubes algorithm to construct the 3D surface contour.Validation against standard parts and diverse coarse aggregate particles confirms the method's high accuracy,with an average measurement precision of 0.434 mm and a significant increase in scanning and reconstruction efficiency.
基金Project supported by the National Natural Science Foundation of China (No 60802013)the Natural Science Foundation of Zhe-jiang Province, China (No Y106574)
文摘A new algorithm is proposed for restoring disocclusion regions in depth-image-based rendering (DIBR) warped images. Current solutions include layered depth image (LDI), pre-filtering methods, and post-processing methods. The LDI is complicated, and pre-filtering of depth images causes noticeable geometrical distortions in cases of large baseline warping. This paper presents a depth-aided inpainting method which inherits merits from Criminisi's inpainting algorithm. The proposed method features incorporation of a depth cue into texture estimation. The algorithm efficiently handles depth ambiguity by penalizing larger Lagrange multipliers of flling points closer to the warping position compared with the surrounding existing points. We perform morphological operations on depth images to accelerate the algorithm convergence, and adopt a luma-first strategy to adapt to various color sampling formats. Experiments on test multi-view sequence showed that our method has superiority in depth differentiation and geometrical loyalty in the restoration of warped images. Also, peak signal-to-noise ratio (PSNR) statistics on non-hole regions and whole image comparisons both compare favorably to those obtained by state of the art techniques.
文摘In multi-view image localization task,the features of the images captured from different views should be fused properly.This paper considers the classification-based image localization problem.We propose the relational graph location network(RGLN)to perform this task.In this network,we propose a heterogeneous graph construction approach for graph classification tasks,which aims to describe the location in a more appropriate way,thereby improving the expression ability of the location representation module.Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin.In addition,the proposed localization method outperforms the compared localization methods by around 1.7%in terms of meter-level accuracy.
基金This work was supported by the National Key Research and Development Program of China[grant number 2017YFC0803802]and the National Natural Science Foundation of China[grant number 41771486].
文摘Targeting at a reliable image matching of multiple remote sensing images for the generation of digital surface models,this paper presents a geometric-constrained multi-view image matching method,based on an energy minimization framework.By employing a geometrical constraint,the cost value of the energy function was calculated from multiple images,and the cost value was aggregated in an image space using a semi-global optimization approach.A homography transform parameter calculation method is proposed for fast calculation of projection pixel on each image when calculation cost values.It is based on the known interior orientation parameters,exterior orientation parameters,and a given elevation value.For an efficient and reliable processing of multiple remote sensing images,the proposed matching method was performed via a coarse-to-fine strategy through image pyramid.Three sets of airborne remote sensing images were used to evaluate the performance of the proposed method.Results reveal that the multi-view image matching can improve matching reliability.Moreover,the experimental results show that the proposed method performs better than traditional methods.
基金This work is funded by Key Scientific Research Projects of Colleges and Universities in Henan Province under Grant 22A460022Training Plan for Young Backbone Teachers in Colleges and Universities in Henan Province under Grant 2021GGJS077.
文摘Traditional three-dimensional(3D)image reconstruction method,which highly dependent on the environment and has poor reconstruction effect,is easy to lead to mismatch and poor real-time performance.The accuracy of feature extraction from multiple images affects the reliability and real-time performance of 3D reconstruction technology.To solve the problem,a multi-view image 3D reconstruction algorithm based on self-encoding convolutional neural network is proposed in this paper.The algorithm first extracts the feature information of multiple two-dimensional(2D)images based on scale and rotation invariance parameters of Scale-invariant feature transform(SIFT)operator.Secondly,self-encoding learning neural network is introduced into the feature refinement process to take full advantage of its feature extraction ability.Then,Fish-Net is used to replace the U-Net structure inside the self-encoding network to improve gradient propagation between U-Net structures,and Generative Adversarial Networks(GAN)loss function is used to replace mean square error(MSE)to better express image features,discarding useless features to obtain effective image features.Finally,an incremental structure from motion(SFM)algorithm is performed to calculate rotation matrix and translation vector of the camera,and the feature points are triangulated to obtain a sparse spatial point cloud,and meshlab software is used to display the results.Simulation experiments show that compared with the traditional method,the image feature extraction method proposed in this paper can significantly improve the rendering effect of 3D point cloud,with an accuracy rate of 92.5%and a reconstruction complete rate of 83.6%.
文摘Through the analysis and comparison of shortcomings and advantages of existing technologies on object modeling in 3D applications,we propose a new modeling method for virtual scene based on multi-view image sequence to model irregular objects efficiently in 3D application.In 3D scene,this method can get better visual effect by tracking the viewer's real-time perspective position and projecting the photos from different perspectives dynamically.The philosophy of design,the steps of development and some other relevant topics are discussed in details,and the validity of the algorithm is analyzed.The results demonstrate that this method represents more superiority on simulating irregular objects by applying it to the modeling of virtual museum.
文摘The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
基金supported by the National Key Research and Development Project of China(No.2023YFB3709605)the National Natural Science Foundation of China(No.62073193)the National College Student Innovation Training Program(No.202310422122)。
文摘Potential high-temperature risks exist in heat-prone components of electric moped charging devices,such as sockets,interfaces,and controllers.Traditional detection methods have limitations in terms of real-time performance and monitoring scope.To address this,a temperature detection method based on infrared image processing has been proposed:utilizing the median filtering algorithm to denoise the original infrared image,then applying an image segmentation algorithm to divide the image.
基金the National Natural Science Foundation of China(42472194,42302153,and 42002144)the Fundamental Research Funds for the Central Univer-sities(22CX06002A).
文摘Karst fractures serve as crucial seepage channels and storage spaces for carbonate natural gas reservoirs,and electrical image logs are vital data for visualizing and characterizing such fractures.However,the conventional approach of identifying fractures using electrical image logs predominantly relies on manual processes that are not only time-consuming but also highly subjective.In addition,the heterogeneity and strong dissolution tendency of karst carbonate reservoirs lead to complexity and variety in fracture geometry,which makes it difficult to accurately identify fractures.In this paper,the electrical image logs network(EILnet)da deep-learning-based intelligent semantic segmentation model with a selective attention mechanism and selective feature fusion moduledwas created to enable the intelligent identification and segmentation of different types of fractures through electrical logging images.Data from electrical image logs representing structural and induced fractures were first selected using the sliding window technique before image inpainting and data augmentation were implemented for these images to improve the generalizability of the model.Various image-processing tools,including the bilateral filter,Laplace operator,and Gaussian low-pass filter,were also applied to the electrical logging images to generate a multi-attribute dataset to help the model learn the semantic features of the fractures.The results demonstrated that the EILnet model outperforms mainstream deep-learning semantic segmentation models,such as Fully Convolutional Networks(FCN-8s),U-Net,and SegNet,for both the single-channel dataset and the multi-attribute dataset.The EILnet provided significant advantages for the single-channel dataset,and its mean intersection over union(MIoU)and pixel accuracy(PA)were 81.32%and 89.37%,respectively.In the case of the multi-attribute dataset,the identification capability of all models improved to varying degrees,with the EILnet achieving the highest MIoU and PA of 83.43%and 91.11%,respectively.Further,applying the EILnet model to various blind wells demonstrated its ability to provide reliable fracture identification,thereby indicating its promising potential applications.
基金supported by the National Natural Science Foundation of China(Grant Nos.82272955 and 22203057)the Natural Science Foundation of Fujian Province(Grant No.2021J011361).
文摘The presence of a positive deep surgical margin in tongue squamous cell carcinoma(TSCC)significantly elevates the risk of local recurrence.Therefore,a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection.In this study,we integrate Raman imaging technology with an artificial intelligence(AI)generative model,proposing an innovative approach for intraoperative margin status diagnosis.This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images,which are then transformed into hematoxylin-eosin(H&E)-stained histopathological images using an AI generative model for histopathological diagnosis.The generated H&E-stained images clearly illustrate the tissue’s pathological conditions.Independently reviewed by three pathologists,the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%.Notably,it outperforms current clinical practices,especially in TSCC with positive lymph node metastasis or moderately differentiated grades.This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations,promising a versatile diagnostic tool beyond TSCC.
基金supported by the National Natural Science Foundation of China(NSFC)12333010the National Key R&D Program of China 2022YFF0503002+3 种基金the Strategic Priority Research Program of the Chinese Academy of Sciences(grant No.XDB0560000)the NSFC 11921003supported by the Prominent Postdoctoral Project of Jiangsu Province(2023ZB304)supported by the Strategic Priority Research Program on Space Science,the Chinese Academy of Sciences,grant No.XDA15320000.
文摘Indirect X-ray modulation imaging has been adopted in a number of solar missions and provided reconstructed X-ray images of solar flares that are of great scientific importance.However,the assessment of the image quality of the reconstruction is still difficult,which is particularly useful for scheme design of X-ray imaging systems,testing and improvement of imaging algorithms,and scientific research of X-ray sources.Currently,there is no specified method to quantitatively evaluate the quality of X-ray image reconstruction and the point-spread function(PSF)of an X-ray imager.In this paper,we propose percentage proximity degree(PPD)by considering the imaging characteristics of X-ray image reconstruction and in particular,sidelobes and their effects on imaging quality.After testing a variety of imaging quality assessments in six aspects,we utilized the technique for order preference by similarity to ideal solution to the indices that meet the requirements.Then we develop the final quality index for X-ray image reconstruction,QuIX,which consists of the selected indices and the new PPD.QuIX performs well in a series of tests,including assessment of instrument PSF and simulation tests under different grid configurations,as well as imaging tests with RHESSI data.It is also a useful tool for testing of imaging algorithms,and determination of imaging parameters for both RHESSI and ASO-S/Hard X-ray Imager,such as field of view,beam width factor,and detector selection.
基金the Deanship of Scientifc Research at King Khalid University for funding this work through large group Research Project under grant number RGP2/421/45supported via funding from Prince Sattam bin Abdulaziz University project number(PSAU/2024/R/1446)+1 种基金supported by theResearchers Supporting Project Number(UM-DSR-IG-2023-07)Almaarefa University,Riyadh,Saudi Arabia.supported by the Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(No.2021R1F1A1055408).
文摘Machine learning(ML)is increasingly applied for medical image processing with appropriate learning paradigms.These applications include analyzing images of various organs,such as the brain,lung,eye,etc.,to identify specific flaws/diseases for diagnosis.The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification.Most of the extracted image features are irrelevant and lead to an increase in computation time.Therefore,this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features.This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions.The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis.Later,the correlation based on intensity and distribution is analyzed to improve the feature selection congruency.Therefore,the more congruent pixels are sorted in the descending order of the selection,which identifies better regions than the distribution.Now,the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection.Therefore,the probability of feature selection,regardless of the textures and medical image patterns,is improved.This process enhances the performance of ML applications for different medical image processing.The proposed method improves the accuracy,precision,and training rate by 13.19%,10.69%,and 11.06%,respectively,compared to other models for the selected dataset.The mean error and selection time is also reduced by 12.56%and 13.56%,respectively,compared to the same models and dataset.
基金supported by the National Key R&D Program of China 2022YFF0503002the National Natural Science Foundation of China(NSFC,Grant Nos.12333010 and 12233012)+2 种基金the Strategic Priority Research Program of the Chinese Academy of Sciences(grant No.XDB0560000)supported by the Prominent Postdoctoral Project of Jiangsu Province(2023ZB304)supported by the Strategic Priority Research Program on Space Science,the Chinese Academy of Sciences,grant No.XDA15320000.
文摘Imaging observations of solar X-ray bursts can reveal details of the energy release process and particle acceleration in flares.Most hard X-ray imagers make use of the modulation-based Fourier transform imaging method,an indirect imaging technique that requires algorithms to reconstruct and optimize images.During the last decade,a variety of algorithms have been developed and improved.However,it is difficult to quantitatively evaluate the image quality of different solutions without a true,reference image of observation.How to choose the values of imaging parameters for these algorithms to get the best performance is also an open question.In this study,we present a detailed test of the characteristics of these algorithms,imaging dynamic range and a crucial parameter for the CLEAN method,clean beam width factor(CBWF).We first used SDO/AIA EUV images to compute DEM maps and calculate thermal X-ray maps.Then these realistic sources and several types of simulated sources are used as the ground truth in the imaging simulations for both RHESSI and ASO-S/HXI.The different solutions are evaluated quantitatively by a number of means.The overall results suggest that EM,PIXON,and CLEAN are exceptional methods for sidelobe elimination,producing images with clear source details.Although MEM_GE,MEM_NJIT,VIS_WV and VIS_CS possess fast imaging processes and generate good images,they too possess associated imperfections unique to each method.The two forward fit algorithms,VF and FF,perform differently,and VF appears to be more robust and useful.We also demonstrated the imaging capability of HXI and available HXI algorithms.Furthermore,the effect of CBWF on image quality was investigated,and the optimal settings for both RHESSI and HXI were proposed.
文摘Large language models(LLMs),such as ChatGPT developed by OpenAI,represent a significant advancement in artificial intelligence(AI),designed to understand,generate,and interpret human language by analyzing extensive text data.Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future(Thirunavukarasu et al.,2023).This article aims to provide an in-depth analysis of LLMs’current and potential impact on clinical practices.Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education(Hirosawa et al.,2023;Koga et al.,2023).
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金National Natural Science Foundation of China(No.42301518)Hubei Key Laboratory of Regional Development and Environmental Response(No.2023(A)002)Key Laboratory of the Evaluation and Monitoring of Southwest Land Resources(Ministry of Education)(No.TDSYS202304).
文摘Image-maps,a hybrid design with satellite images as background and map symbols uploaded,aim to combine the advantages of maps’high interpretation efficiency and satellite images’realism.The usability of image-maps is influenced by the representations of background images and map symbols.Many researchers explored the optimizations for background images and symbolization techniques for symbols to reduce the complexity of image-maps and improve the usability.However,little literature was found for the optimum amount of symbol loading.This study focuses on the effects of background image complexity and map symbol load on the usability(i.e.,effectiveness and efficiency)of image-maps.Experiments were conducted by user studies via eye-tracking equipment and an online questionnaire survey.Experimental data sets included image-maps with ten levels of map symbol load in ten areas.Forty volunteers took part in the target searching experiments.It has been found that the usability,i.e.,average time viewed(efficiency)and average revisits(effectiveness)of targets recorded,is influenced by the complexity of background images,a peak exists for optimum symbol load for an image-map.The optimum levels for symbol load for different image-maps also have a peak when the complexity of the background image/image map increases.The complexity of background images serves as a guideline for optimum map symbol load in image-map design.This study enhanced user experience by optimizing visual clarity and managing cognitive load.Understanding how these factors interact can help create adaptive maps that maintain clarity and usability,guiding AI algorithms to adjust symbol density based on user context.This research establishes the practices for map design,making cartographic tools more innovative and more user-centric.
基金supported by the National Natural Science Foundation of China(Nos.62201454 and 62306235)the Xi’an Science and Technology Program of Xi’an Science and Technology Bureau(No.23SFSF0004)。
文摘The unmanned aerial vehicle(UAV)images captured under low-light conditions are often suffering from noise and uneven illumination.To address these issues,we propose a low-light image enhancement algorithm for UAV images,which is inspired by the Retinex theory and guided by a light weighted map.Firstly,we propose a new network for reflectance component processing to suppress the noise in images.Secondly,we construct an illumination enhancement module that uses a light weighted map to guide the enhancement process.Finally,the processed reflectance and illumination components are recombined to obtain the enhancement results.Experimental results show that our method can suppress the noise in images while enhancing image brightness,and prevent over enhancement in bright regions.Code and data are available at https://gitee.com/baixiaotong2/uav-images.git.
基金supported by the National Natural Science Foundation of China(No.82371933)the National Natural Science Foundation of Shandong Province of China(No.ZR2021MH120)+1 种基金the Taishan Scholars Project(No.tsqn202211378)the Shandong Provincial Natural Science Foundation for Excellent Young Scholars(No.ZR2024YQ075).
文摘Objective:Early predicting response before neoadjuvant chemotherapy(NAC)is crucial for personalized treatment plans for locally advanced breast cancer patients.We aim to develop a multi-task model using multiscale whole slide images(WSIs)features to predict the response to breast cancer NAC more finely.Methods:This work collected 1,670 whole slide images for training and validation sets,internal testing sets,external testing sets,and prospective testing sets of the weakly-supervised deep learning-based multi-task model(DLMM)in predicting treatment response and pCR to NAC.Our approach models two-by-two feature interactions across scales by employing concatenate fusion of single-scale feature representations,and controls the expressiveness of each representation via a gating-based attention mechanism.Results:In the retrospective analysis,DLMM exhibited excellent predictive performance for the prediction of treatment response,with area under the receiver operating characteristic curves(AUCs)of 0.869[95%confidence interval(95%CI):0.806−0.933]in the internal testing set and 0.841(95%CI:0.814−0.867)in the external testing sets.For the pCR prediction task,DLMM reached AUCs of 0.865(95%CI:0.763−0.964)in the internal testing and 0.821(95%CI:0.763−0.878)in the pooled external testing set.In the prospective testing study,DLMM also demonstrated favorable predictive performance,with AUCs of 0.829(95%CI:0.754−0.903)and 0.821(95%CI:0.692−0.949)in treatment response and pCR prediction,respectively.DLMM significantly outperformed the baseline models in all testing sets(P<0.05).Heatmaps were employed to interpret the decision-making basis of the model.Furthermore,it was discovered that high DLMM scores were associated with immune-related pathways and cells in the microenvironment during biological basis exploration.Conclusions:The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.