Image captioning, a pivotal research area at the intersection of image understanding, artificial intelligence, and linguistics, aims to generate natural language descriptions for images. This paper proposes an efficient image captioning model named Mob-IMWTC, which integrates improved wavelet convolution (IMWTC) with an enhanced MobileNet V3 architecture. The enhanced MobileNet V3 uses a transformer encoder as its encoding module and a transformer decoder as its decoding module. This network significantly reduces the required memory and model training time while maintaining high accuracy in generating image descriptions. IMWTC provides large receptive fields without significantly increasing the number of parameters or the computational overhead. The improved MobileNet V3 model has its classifier removed and employs IMWTC layers in place of the original convolutional layers, making Mob-IMWTC well suited for deployment on low-resource devices. Experimental results on objective evaluation metrics, including BLEU, ROUGE, CIDEr, METEOR, and SPICE, demonstrate that Mob-IMWTC outperforms state-of-the-art models, including three CNN architectures (CNN-LSTM, CNN-Att-LSTM, CNN-Tran), two mainstream methods (LCM-Captioner, ClipCap), and our previous work (Mob-Tran). Subjective evaluations further validate the model's superiority in grammaticality, adequacy, logic, readability, and humanness. Mob-IMWTC thus offers a lightweight yet effective solution for image captioning on resource-constrained devices.
The noise present in depth images obtained with RGB-D sensors stems from a combination of hardware limitations and environmental factors, and it degrades downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering remove noise but also tend to discard significant image details. The framework presented in this paper is a novel denoising model that couples Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, which are used to preserve spatial structure and reduce redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input into a latent space, which is then decoded to recover the clean image. Experiments on a benchmark dataset show that the proposed technique attains a PSNR of 45 dB and an SSIM of 0.90, 10 dB higher than conventional convolutional autoencoders and 15 times higher than wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers an efficient and scalable system for depth image denoising, with high potential for close-range 3D applications such as robotic manipulation and gesture-based interfaces.
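The Boruta step above can be illustrated with a toy, model-free sketch: shuffled "shadow" copies of the features set a noise baseline, and a real feature is kept only when its importance beats the best shadow in most rounds. Absolute correlation with the target is used as the importance measure here purely for illustration (the abstract does not specify one), and all names and data are hypothetical.

```python
import numpy as np

def boruta_like_select(X, y, n_rounds=20, seed=0):
    """Toy Boruta-style screen: a feature survives if its |correlation| with y
    beats the best shuffled 'shadow' feature in more than half the rounds."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    imp = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    hits = np.zeros(p, dtype=int)
    for _ in range(n_rounds):
        shadows = rng.permuted(X, axis=0)   # shuffling breaks any feature-target link
        shadow_imp = np.abs([np.corrcoef(shadows[:, j], y)[0, 1] for j in range(p)])
        hits += imp > shadow_imp.max()
    return np.where(hits > n_rounds // 2)[0]

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 5))
y = 2 * X[:, 0] - X[:, 2] + 0.1 * rng.normal(size=500)
selected = boruta_like_select(X, y)
print(selected)   # features 0 and 2 carry the signal
```

In the real framework the importance scores would come from the trained model rather than raw correlations, but the shadow-comparison logic is the same.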
Research has been conducted to reduce resource consumption in 3D medical image segmentation for diverse resource-constrained environments. However, decreasing the number of parameters to enhance computational efficiency can also lead to performance degradation. Moreover, these methods face challenges in balancing global and local features, increasing the risk of errors in multi-scale segmentation; this issue is particularly pronounced when segmenting small and complex structures within the human body. To address this problem, we propose a multi-stage hierarchical architecture composed of a detector and a segmentor. The detector extracts regions of interest (ROIs) in a 3D image, while the segmentor performs segmentation in the extracted ROI. Removing unnecessary areas in the detector allows segmentation to be performed on a more compact input. The segmentor is designed with multiple stages, where each stage utilizes a different input size, and it implements a stage-skipping mechanism that deactivates certain stages based on the initial input size. This approach minimizes unnecessary computation while segmenting the essential regions, reducing computational overhead. The proposed framework preserves segmentation performance while reducing resource consumption, enabling segmentation even in resource-constrained environments.
Background In recent years, the demand for interactive photorealistic three-dimensional (3D) environments has increased in various fields, including architecture, engineering, and entertainment. However, achieving a balance between quality and efficiency in high-performance 3D applications and virtual reality (VR) remains challenging. Methods This study addresses this issue by revisiting and extending view interpolation for image-based rendering (IBR), which enables the exploration of spacious open environments in 3D and VR. We introduce multimorphing, a novel rendering method based on a spatial data structure of 2D image patches, called the image graph. Using this approach, novel views can be rendered with up to six degrees of freedom using only a sparse set of views. The rendering process requires neither 3D reconstruction of the geometry nor per-pixel depth information; all relevant data for the output are extracted from the local morphing cells of the image graph. The detection of parallax image regions during preprocessing reduces rendering artifacts by extrapolating image patches from adjacent cells in real time. In addition, a GPU-based solution is presented to resolve exposure inconsistencies within a dataset, enabling seamless transitions of brightness when moving between areas with varying light intensities. Results Experiments on multiple real-world and synthetic scenes demonstrate that the presented method achieves high, "VR-compatible" frame rates, even on mid-range and legacy hardware. While achieving adequate visual quality even for sparse datasets, it outperforms other IBR and current neural rendering approaches. Conclusions Using the correspondence-based decomposition of input images into morphing cells of 2D image patches, multidimensional image morphing provides high-performance novel view generation, supporting open 3D and VR environments. Nevertheless, the handling of morphing artifacts in parallax image regions remains a topic for future research.
Deep learning networks are increasingly exploited in the field of neuronal soma segmentation. However, annotating datasets is an expensive and time-consuming task. Unsupervised domain adaptation is an effective way to mitigate this problem, as it can learn an adaptive segmentation model by transferring knowledge from a richly labeled source domain. In this paper, we propose a multi-level distribution alignment-based unsupervised domain adaptation network (MDA-Net) for the segmentation of 3D neuronal soma images. Distribution alignment is performed in both the feature space and the output space. In the feature space, features from different scales are adaptively fused to enhance the feature extraction capability for small target somata and are constrained to be domain invariant by an adversarial adaptation strategy. In the output space, local discrepancy maps that reveal the spatial structures of somata are constructed on the predicted segmentation results. Distribution alignment is then performed on the local discrepancy maps across domains to obtain a superior discrepancy map in the target domain, achieving refined segmentation of neuronal somata. Additionally, after a period of the distribution alignment procedure, a portion of target samples with highly confident pseudo-labels are selected as training data, which helps learn a more adaptive segmentation network. We verified the superiority of the proposed algorithm by comparing several domain adaptation networks on two 3D mouse brain neuronal soma datasets and one macaque brain neuronal soma dataset.
This study introduces a novel method for reconstructing a 3D model of aluminum foam from cross-sectional sequence images. Combining precision milling and image acquisition, high-quality cross-sectional images are obtained. Pore structures are segmented by a U-shaped network (U-Net) integrated with the Canny edge detection operator, ensuring accurate pore delineation and edge extraction; the trained U-Net achieves 98.55% accuracy. The 2D data are stacked and processed into 3D point clouds, enabling reconstruction of the pore structure and the aluminum skeleton. Analysis of pore 01 shows that its cross-sectional area initially increases and then decreases with milling depth, with a uniform point distribution of 40 per layer. The reconstructed model exhibits a porosity of 77.5%, with section overlap rates between the 2D pore segmentation and the reconstructed model exceeding 96%, confirming high fidelity. Equivalent sphere diameters decrease with size, averaging 1.95 mm. Compression simulations reveal that the stress-strain curve of the reconstructed aluminum foam model exhibits fluctuations and that stresses concentrate on thin cell walls, leading to localized deformations. This method accurately restores the aluminum foam's complex internal structure, improving reconstruction precision and simulation reliability, and offers a cost-efficient, high-precision technique for optimizing material performance in engineering applications.
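The section overlap rate reported above can be computed per slice once both the segmentation and the reconstruction are binarized. The definition used here, the fraction of segmented pore pixels reproduced in the corresponding reconstructed slice, is an assumption, since the abstract does not state the exact formula:

```python
import numpy as np

def section_overlap_rate(seg_slice, recon_slice):
    """Fraction of segmented pore pixels also marked as pore in the reconstructed slice."""
    seg = np.asarray(seg_slice, dtype=bool)
    recon = np.asarray(recon_slice, dtype=bool)
    return (seg & recon).sum() / seg.sum()

seg = np.zeros((10, 10), dtype=bool)
seg[2:7, 2:7] = True                     # 25 segmented pore pixels
recon = seg.copy()
recon[2, 2] = False                      # one pixel lost in reconstruction
print(section_overlap_rate(seg, recon))  # 24/25 = 0.96
```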
This work focuses on methods and procedures for three-dimensional (3D) characterization of pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Based on in-house 3D image analysis programs developed in Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean pore diameter, and the effective pore size (d50) increase with increasing particle size. A lognormal or Gauss distribution is most suitable for modeling the pore size distribution. The degree of connectivity, investigated on the basis of a cluster-labeling algorithm, also increases approximately with increasing particle size.
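Once the CT stack is segmented into a binary volume, volume porosity reduces to a voxel count; a minimal sketch, assuming pore voxels are labeled 1:

```python
import numpy as np

def volume_porosity(volume):
    """Fraction of pore voxels (value 1) in a binary 3D volume."""
    volume = np.asarray(volume, dtype=bool)
    return volume.sum() / volume.size

# synthetic 10x10x10 bed with a single 4x4x4 pore block
bed = np.zeros((10, 10, 10), dtype=np.uint8)
bed[3:7, 3:7, 3:7] = 1
print(volume_porosity(bed))  # 64/1000 = 0.064
```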
Objective: To explore the diagnostic value of fused pseudo-color maps of T_2 star mapping and T_1 images with 3D DESS in articular cartilage injury. Methods: Twenty-six patients with articular cartilage injury underwent T_2 star mapping, T_1 images, and 3D DESS scans, and the T_1 images and T_2 star mapping were fused with the 3D DESS images. The degree of cartilage injury in the femur, tibia, and patella was assessed and compared with arthroscopic findings, and the specificity and sensitivity of the fused pseudo-color maps for diagnosing cartilage injury, as well as their agreement with arthroscopy, were calculated. Results: The sensitivity, specificity, and Kappa value of the T_1 images-3D DESS fused pseudo-color maps for diagnosing cartilage injury were 92.8%, 93.0%, and 0.769, respectively; those of the T_2 star mapping-3D DESS fused pseudo-color maps were 91.4%, 94.2%, and 0.787, respectively. Conclusion: Fused pseudo-color maps of T_2 star mapping and T_1 images with 3D DESS are superior to arthroscopy in evaluating early articular cartilage injury.
A novel algorithm for 3-D surface image registration is proposed. It makes use of the array information of 3-D points and takes vector/vertex-like features as the basis of the matching; this array information can be easily obtained when capturing the original 3-D images. The iterative least-mean-squares (LMS) algorithm is applied to adaptively optimize the transformation matrix parameters, which effectively improves registration performance and speeds up the matching process. Experimental results show that the algorithm achieves a good subjective impression on the aligned 3-D images. Although the algorithm focuses primarily on the human head model, it can also be used for other objects with small modifications.
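The abstract does not detail the iterative LMS update, but the quantity it optimizes, a rigid transformation aligning two 3-D point sets, also has a classical closed-form least-squares solution (the Kabsch/Procrustes method) when correspondences are known. The sketch below shows that related estimator for illustration, not the paper's algorithm:

```python
import numpy as np

def best_rigid_transform(P, Q):
    """Least-squares rotation R and translation t mapping point set P onto Q (Kabsch)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)               # 3x3 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection solution
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# verify on a known rotation about the z axis plus a translation
theta = np.pi / 6
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
P = np.random.default_rng(0).random((20, 3))
Q = P @ Rz.T + np.array([1.0, -2.0, 0.5])
R, t = best_rigid_transform(P, Q)
print(np.allclose(R, Rz), np.allclose(P @ R.T + t, Q))  # True True
```

An iterative scheme such as LMS becomes attractive when correspondences are uncertain or arrive incrementally, which is the setting the paper targets.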
A stereo particle image velocimetry (stereo-PIV) system was developed to measure three-dimensional (3D) soil deformation around a laterally loaded pile in sand. The stereo-PIV technique extends 2D measurement to 3D based on a binocular vision model, in which two cameras in a well-designed geometrical arrangement image the same object simultaneously. The system uses two open-source software packages and some simple MATLAB programs, so it can easily be adjusted to meet user needs at low cost. The failure planes form angles of 27°-29° with the horizontal, approximately three-fourths of the frictional angle of the soil. The edge of the strain wedge formed in front of the pile is an arc, slightly different from the straight line reported in the literature. The active and passive influence zones are about twice and six times the diameter of the pile, respectively. The test demonstrates the good performance and feasibility of this stereo-PIV system for more advanced geotechnical testing.
A new object-oriented method has been developed for the extraction of Mars rocks from Mars rover data. It is based on a combination of Mars rover imagery and 3D point cloud data. First, Navcam or Pancam images taken by the Mars rovers are segmented into homogeneous objects with a mean-shift algorithm. Then, the objects in the segmented images are classified into small rock candidates, rock shadows, and large objects. Rock shadows and large objects are considered regions within which large rocks may exist. In these regions, large rock candidates are extracted through ground-plane fitting with the 3D point cloud data. Small and large rock candidates are combined and post-processed to obtain the final rock extraction results. The shape properties of the rocks (angularity, circularity, width, height, and width-height ratio) have been calculated for subsequent geological studies.
To overcome the shortcomings of 1D and 2D Otsu thresholding techniques, the 3D Otsu method has been developed. Among Otsu's methods, the 3D Otsu technique provides the best threshold values for multi-level thresholding. In this paper, to improve the quality of segmented images, a simple and effective multilevel thresholding method is introduced. The proposed approach focuses on preserving edge detail by computing the 3D Otsu thresholds together with a fusion scheme. The advantages of the presented scheme include higher-quality outcomes, better preservation of fine details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image; it aims to improve localized information after the thresholding process. Fusing images based on local contrast can improve segmentation performance by minimizing the loss of local contrast, loss of details, and distortion of gray-level distributions. Results show that the proposed method yields more promising segmentation results than the conventional 1D Otsu, 2D Otsu, and 3D Otsu methods, as evident from both objective and subjective evaluations.
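For reference, classic 1D Otsu, the baseline that the 2D and 3D variants extend, selects the threshold that maximizes between-class variance of the gray-level histogram; a minimal numpy sketch:

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Classic 1D Otsu: return the threshold maximizing between-class variance."""
    hist, edges = np.histogram(image, bins=bins)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability up to each bin
    mu = np.cumsum(p * np.arange(bins))     # cumulative first moment (bin indices)
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    k = np.nanargmax(sigma_b)               # ends are 0/0 = nan; skip them
    return edges[k + 1]                     # upper edge of the best bin

# bimodal toy image: dark background plus a bright object
rng = np.random.default_rng(1)
img = np.concatenate([rng.normal(50, 5, 5000), rng.normal(200, 5, 5000)])
t = otsu_threshold(img)
print(50 < t < 200)  # True
```

The 2D and 3D variants enlarge the histogram with local-mean and local-median axes, which is what makes them more robust to noise at the cost of extra computation.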
An ILRIS-36D 3-D laser image scanning system was used to monitor the Anjialing strip mine slope at Pingshuo in Shanxi Province. The basic working principles, performance indexes, features, and data collection and processing methods are illustrated, and the point cloud results are analyzed in detail. The rescaled range analysis method was used to analyze the deformation characteristics of the slope. The results show that the trend of slope displacement is stable and that the degree of landslide danger is low. This work indicates that 3-D laser image scanning can supply multi-parameter, high-precision, real-time data over long distances. These data can be used to study the deformation of the slope quickly and accurately.
Mineral dissemination and pore space distribution in ore particles are important features that influence heap leaching performance. To quantify the mineral dissemination and pore space distribution of an ore particle, a cylindrical copper oxide ore sample (Φ4.6 mm × 5.6 mm) was scanned using high-resolution X-ray computed tomography (HRXCT), a nondestructive imaging technology, at a spatial resolution of 4.85 μm. Combined with three-dimensional (3D) image analysis techniques, the main mineral phases and pore space were segmented and the volume fraction of each phase was calculated. In addition, the mass fraction of each mineral phase was estimated and the result was validated against that obtained using traditional techniques. Furthermore, the pore phase features, including the pore size distribution, pore surface area, pore fractal dimension, pore centerline, and pore connectivity, were investigated quantitatively. The pore space analysis results indicate that the pore size distribution closely fits a log-normal distribution and that the pore space morphology is complicated, with a large surface area and low connectivity. This study demonstrates that the combination of HRXCT and 3D image analysis is an effective tool for acquiring 3D mineralogical and pore structural data.
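The log-normal fit of the pore size distribution can be sketched with a simple moment fit (mean and standard deviation of log-diameters). The equivalent-sphere-diameter helper below is a common convention assumed for illustration, not a formula taken from the paper:

```python
import numpy as np

def fit_lognormal(diameters):
    """Moment fit of a log-normal: returns (mu, sigma) of ln(d)."""
    logs = np.log(diameters)
    return logs.mean(), logs.std(ddof=1)

def equivalent_sphere_diameter(voxel_count, voxel_size):
    """Diameter of the sphere with the same volume as a labeled pore."""
    volume = voxel_count * voxel_size ** 3
    return (6.0 * volume / np.pi) ** (1.0 / 3.0)

# a 1000-voxel pore at 0.1 mm voxels has ~1 mm^3 volume
d_one = equivalent_sphere_diameter(1000, 0.1)

# synthetic pore diameters drawn from a known log-normal
rng = np.random.default_rng(2)
d = rng.lognormal(mean=0.6, sigma=0.3, size=2000)
mu, sigma = fit_lognormal(d)
print(round(mu, 2), round(sigma, 2))   # recovers roughly (0.6, 0.3)
```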
In this paper, uniaxial compression tests were carried out on a series of composite rock specimens with different dip angles, made from two types of rock-like material with different strengths. The acoustic emission technique was used to monitor the acoustic signal characteristics of the specimens during the entire loading process. At the same time, an optical non-contact 3D digital image correlation technique was used to study the evolution of the axial strain field and the maximal strain field before and after the peak strength at different stress levels. The effect of bedding plane inclination on deformation and strength during uniaxial loading was analyzed, and methods for solving the elastic constants of the hard and weak rock were described. The damage evolution process, deformation and failure mechanism, and failure mode during uniaxial loading were fully determined. The experimental results show that the θ = 0°-45° specimens exhibited obvious plastic deformation during loading, while the brittleness of the θ = 60°-90° specimens gradually increased. When the anisotropy angle θ increased from 0° to 90°, the peak strength, peak strain, and apparent elastic modulus all decreased initially and then increased. The failure mode of the composite rock specimens under uniaxial loading can be divided into three categories: tensile fracture across the discontinuities (θ = 0°-30°), sliding failure along the discontinuities (θ = 45°-75°), and tensile splitting along the discontinuities (θ = 90°).
The axial strains of the weak and hard rock layers in the composite specimens differed significantly during loading in the θ = 0°-45° specimens and were almost the same in the θ = 60°-90° specimens. As for the strain localization highlighted in the maximum principal strain field, in the θ = 0°-30° specimens it appeared in the rock matrix approximately parallel to the loading direction, while in the θ = 45°-90° specimens it appeared at the interface between the hard and weak rock layers.
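The abstract mentions methods for obtaining the elastic constants of the hard and weak layers without giving them; first-order estimates for a layered composite are the classical Voigt (loading parallel to bedding) and Reuss (loading normal to bedding) bounds. The sketch below is this textbook illustration under assumed moduli, not the paper's actual procedure:

```python
def reuss_modulus(f_weak, E_weak, E_hard):
    """Series (Reuss) estimate for layers loaded normal to bedding."""
    f_hard = 1.0 - f_weak
    return 1.0 / (f_weak / E_weak + f_hard / E_hard)

def voigt_modulus(f_weak, E_weak, E_hard):
    """Parallel (Voigt) estimate for layers loaded along bedding."""
    f_hard = 1.0 - f_weak
    return f_weak * E_weak + f_hard * E_hard

# assumed: equal layer fractions, weak rock 10 GPa, hard rock 40 GPa
E_series = reuss_modulus(0.5, 10.0, 40.0)
E_parallel = voigt_modulus(0.5, 10.0, 40.0)
print(E_series, E_parallel)  # 16.0 25.0
```

The gap between the two bounds is one way to picture why the apparent modulus varies with the bedding angle θ.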
In view of the fact that the traditional Hausdorff image matching algorithm is very sensitive to image size and has unsatisfactory real-time performance in practical applications, an image matching algorithm is proposed that combines it with YOLOv3. First, features of the reference image are selected for pretraining, and the training results are used to extract features from the real images; the coordinates of the center points of the feature areas are then used to complete coarse matching. Finally, the Hausdorff algorithm is used to complete fine image matching. Experiments show that the proposed algorithm significantly improves the speed and accuracy of image matching and is robust to rotation changes.
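The fine-matching stage relies on the Hausdorff distance; a brute-force numpy sketch for two point sets follows. Its O(|A||B|) pairwise cost is exactly the size sensitivity the paper's coarse YOLOv3 stage is meant to mitigate:

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two 2D point sets (brute force)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # pairwise distances
    return max(D.min(axis=1).max(),   # directed A -> B
               D.min(axis=0).max())   # directed B -> A

A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
print(hausdorff(A, B))  # 1.0
```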
Fusion methods based on multi-scale transforms have become the mainstream of pixel-level image fusion. However, most of these methods cannot fully exploit the spatial-domain information of source images, which leads to degradation of the fused image. This paper presents a fusion framework based on block matching and 3D (BM3D) multi-scale transform. The algorithm first divides the image into blocks and groups the 2D image blocks into 3D arrays by their similarity. It then applies a 3D transform, consisting of a 2D multi-scale transform and a 1D transform, to map the arrays into transform coefficients, and the resulting low- and high-frequency coefficients are fused by different fusion rules. The final fused image is obtained from the series of fused 3D image block groups after the inverse transform, using an aggregation process. In the experimental part, we comparatively analyze existing algorithms and the use of different transforms, e.g., the non-subsampled contourlet transform (NSCT) and the non-subsampled shearlet transform (NSST), in the 3D transform step. Experimental results show that the proposed fusion framework not only improves subjective visual effect but also obtains better objective evaluation criteria than state-of-the-art methods.
Image fusion is performed between one band of a multi-spectral image and two bands of a hyperspectral image to produce a fused image with the same spatial resolution as the source multi-spectral image and the same spectral resolution as the source hyperspectral image. According to the characteristics and 3-dimensional (3-D) feature analysis of multi-spectral and hyperspectral image data volumes, a new fusion approach using a 3-D wavelet based method is proposed. This approach is composed of four major procedures: spatial and spectral resampling, 3-D wavelet transform, wavelet coefficient integration, and 3-D inverse wavelet transform. In particular, a novel method, Ratio Image Based Spectral Resampling (RIBSR), is proposed to accomplish data resampling in the spectral domain by utilizing the property of the ratio image, and a new fusion rule, the Average and Substitution (A&S) rule, is employed to accomplish wavelet coefficient integration. Experimental results illustrate that the fusion approach using the 3-D wavelet transform can utilize both spatial and spectral characteristics of the source images more adequately and produce a fused image with higher quality and fewer artifacts than the approach using a 2-D wavelet transform. It is also revealed that the RIBSR method is capable of interpolating the missing data more effectively and correctly, and the A&S rule can integrate coefficients of source images in the 3-D wavelet domain to preserve both spatial and spectral features of the source images more properly.
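The abstract does not give the RIBSR formula. One plausible minimal reading, estimating a missing spectral sample by scaling a known band with a per-pixel ratio image, can be sketched as follows; the function and its inputs are hypothetical illustrations, not the paper's definition:

```python
import numpy as np

def ratio_resample(band_known, band_ref_known, band_ref_target):
    """Hypothetical RIBSR-style estimate: scale a known band by a per-pixel
    ratio image built from a reference pair of bands."""
    eps = 1e-9  # avoid division by zero in dark pixels
    ratio = band_ref_target / (band_ref_known + eps)
    return band_known * ratio

known = np.full((4, 4), 100.0)       # band present at the target resolution
ref_known = np.full((4, 4), 50.0)    # reference band, known sample
ref_target = np.full((4, 4), 25.0)   # reference band, target sample
est = ratio_resample(known, ref_known, ref_target)
print(est[0, 0])  # close to 50.0
```

The appeal of a ratio image in this role is that it captures per-pixel spectral shape while canceling shared illumination, which matches the abstract's claim of preserving both spatial and spectral features.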
Plants respond to drought stress in different physical manners, such as the morphology and color of leaves. Thus, plants can be considered a sort of living sensor for monitoring the dynamics of soil water content or the water stored in the plant body. Because of the difficulty of identifying early wilting symptoms of plants from results in 2D (two-dimensional) space, this paper presents a preliminary study with 3D (three-dimensional) images, in which a laser scanner was used to acquire the morphological information of zucchini (Cucurbita pepo) leaves. Moreover, a leaf wilting index (DLWIF) was defined by fractal dimension. The experiment consisted of phase-1 for observing the temporal variation of DLWIF and phase-2 for the validation of this index. During the experiment, air temperature, luminous intensity, and volumetric soil water content (VSWC) were simultaneously recorded over time. The results of both phases fitted the 1:1 bisector line with R^2 = 0.903 and RMSE = 0.155. More significantly, the influence of VSWC at three levels (0.22, 0.30, and 0.36 cm^3 cm^-3) on the response of plant samples to drought stress was observed from the separated traces of DLWIF. In brief, two conclusions were made: (i) the laser scanner is an effective tool for the non-contact detection of morphological wilting of plants, and (ii) the defined DLWIF can be a promising indicator for a category of plants like zucchini.
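The wilting index DLWIF is defined through fractal dimension; a standard box-counting estimator over a binary mask (a generic sketch, not the paper's exact implementation) looks like:

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Box-counting fractal dimension of a binary 2D mask."""
    counts = []
    h, w = mask.shape
    for s in sizes:
        # tile the mask into s x s boxes and count boxes containing foreground
        grid = mask[: h - h % s, : w - w % s].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(grid.any(axis=(1, 3))))
    # dimension = slope of log N(s) versus log(1/s)
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

filled = np.ones((64, 64), dtype=bool)           # a filled square has dimension ~2
print(round(box_counting_dimension(filled), 2))  # 2.0
```

A wilting leaf's projected silhouette becomes more convoluted, so its box-counting dimension shifts, which is the kind of change an index like DLWIF can track over time.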
Funding: This work was funded by the National Social Science Fund of China, grant number 23BYY197.
Funding: Supported by the Bavarian Academic Forum (BayWISS), as part of the joint academic partnership digitalization program.
Abstract: Background In recent years, the demand for interactive photorealistic three-dimensional (3D) environments has increased in various fields, including architecture, engineering, and entertainment. However, balancing quality and efficiency in high-performance 3D applications and virtual reality (VR) remains challenging. Methods This study addresses this issue by revisiting and extending view interpolation for image-based rendering (IBR), enabling the exploration of spacious open environments in 3D and VR. We introduce multimorphing, a novel rendering method based on a spatial data structure of 2D image patches called the image graph. Using this approach, novel views can be rendered with up to six degrees of freedom from only a sparse set of views. The rendering process requires neither 3D reconstruction of the geometry nor per-pixel depth information; all relevant data for the output are extracted from the local morphing cells of the image graph. Detecting parallax image regions during preprocessing reduces rendering artifacts by extrapolating image patches from adjacent cells in real time. In addition, a GPU-based solution resolves exposure inconsistencies within a dataset, enabling seamless brightness transitions when moving between areas of varying light intensity. Results Experiments on multiple real-world and synthetic scenes demonstrate that the presented method achieves "VR-compatible" frame rates even on mid-range and legacy hardware. While achieving adequate visual quality even for sparse datasets, it outperforms other IBR and current neural rendering approaches. Conclusions Using the correspondence-based decomposition of input images into morphing cells of 2D image patches, multidimensional image morphing provides high-performance novel view generation, supporting open 3D and VR environments. The handling of morphing artifacts in parallax image regions remains a topic for future research.
Funding: Supported by the Fund of the Key Laboratory of Biomedical Engineering of Hainan Province (No. BME20240001), the STI2030-Major Projects (No. 2021ZD0200104), and the National Natural Science Foundation of China under Grant 61771437.
Abstract: Deep learning networks are increasingly exploited for neuronal soma segmentation. However, annotating datasets is an expensive and time-consuming task. Unsupervised domain adaptation is an effective way to mitigate this problem, learning an adaptive segmentation model by transferring knowledge from a richly labeled source domain. In this paper, we propose a multi-level distribution alignment-based unsupervised domain adaptation network (MDA-Net) for segmentation of 3D neuronal soma images. Distribution alignment is performed in both the feature space and the output space. In the feature space, features from different scales are adaptively fused to enhance feature extraction for small target somata and constrained to be domain-invariant through an adversarial adaptation strategy. In the output space, local discrepancy maps that reveal the spatial structures of somata are constructed on the predicted segmentation results; distribution alignment is then performed on these maps across domains to obtain a superior discrepancy map in the target domain, refining the segmentation of neuronal somata. Additionally, after a period of distribution alignment, a portion of target samples with high-confidence pseudo-labels are selected as training data, which helps learn a more adaptive segmentation network. We verified the superiority of the proposed algorithm by comparing several domain adaptation networks on two 3D mouse brain neuronal soma datasets and one macaque brain neuronal soma dataset.
Funding: Supported by the Key Research and Development Plan of Shanxi Province, China (No. 201803D421045) and the Natural Science Foundation of Shanxi Province (No. 2021-0302-123104).
Abstract: This study introduces a novel method for reconstructing a 3D model of aluminum foam from cross-sectional sequence images. Combining precision milling with image acquisition yields high-quality cross-sectional images. Pore structures are segmented by a U-shaped network (U-Net) integrated with the Canny edge detection operator, ensuring accurate pore delineation and edge extraction; the trained U-Net achieves 98.55% accuracy. The 2D data are stacked and processed into 3D point clouds, enabling reconstruction of the pore structure and aluminum skeleton. Analysis of pore 01 shows that the cross-sectional area first increases and then decreases with milling depth, with a uniform point distribution of 40 per layer. The reconstructed model exhibits a porosity of 77.5%, and the section overlap between the 2D pore segmentation and the reconstructed model exceeds 96%, confirming high fidelity. Equivalent sphere diameters decrease with size, averaging 1.95 mm. Compression simulations reveal that the stress-strain curve of the reconstructed aluminum foam model fluctuates, and stresses concentrate on thin cell walls, leading to localized deformation. This method accurately restores the aluminum foam's complex internal structure, improving reconstruction precision and simulation reliability, and offers a cost-efficient, high-precision technique for optimizing material performance in engineering applications.
Funding: Projects (50934002, 51074013, 51304076, 51104100) supported by the National Natural Science Foundation of China; Project (IRT0950) supported by the Program for Changjiang Scholars and Innovative Research Teams in Universities, China; Project (2012M510007) supported by the China Postdoctoral Science Foundation.
Abstract: This work focuses on methods and procedures for three-dimensional (3D) characterization of pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Based on in-house 3D image analysis programs developed in Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean pore diameter, and the effective pore size (d50) increase with increasing particle size. A lognormal or Gaussian distribution is most suitable for modeling the pore size distribution. The degree of connectivity, investigated using a cluster-labeling algorithm, also increases approximately with particle size.
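The cluster-labeling idea behind the connectivity measure above can be illustrated with a minimal flood fill over a binary voxel grid. This is a sketch assuming 6-connectivity (face neighbors only); the paper's Matlab implementation is not shown in the abstract, so the function name and grid convention are illustrative.

```python
def count_pore_clusters(grid):
    """Count 6-connected pore clusters in a binary 3D voxel grid
    (1 = pore voxel, 0 = solid) using an iterative flood fill."""
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    seen = set()
    clusters = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if grid[z][y][x] == 1 and (z, y, x) not in seen:
                    clusters += 1
                    stack = [(z, y, x)]
                    seen.add((z, y, x))
                    while stack:
                        cz, cy, cx = stack.pop()
                        # Visit the six face-adjacent neighbors.
                        for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                            w = (cz + dz, cy + dy, cx + dx)
                            if (0 <= w[0] < nz and 0 <= w[1] < ny and 0 <= w[2] < nx
                                    and grid[w[0]][w[1]][w[2]] == 1 and w not in seen):
                                seen.add(w)
                                stack.append(w)
    return clusters
```

Fewer, larger clusters for a given porosity indicate higher connectivity, which is the quantity the abstract reports as increasing with particle size.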
Abstract: Objective To explore the diagnostic value of fused pseudo-color maps of T2 star mapping and T1 images with 3D DESS in articular cartilage injury. Methods Twenty-six patients with articular cartilage injury underwent T2 star mapping, T1 images, and 3D DESS scans. The T1 images and T2 star mapping were fused with the 3D DESS images, the degree of cartilage injury in the femoral, tibial, and patellar articular cartilage was evaluated and compared with arthroscopy findings, and the sensitivity, specificity, and agreement with arthroscopic diagnosis of the fused pseudo-color maps were calculated. Results The sensitivity, specificity, and Kappa value of the T1 images-3D DESS fused pseudo-color maps for diagnosing cartilage injury were 92.8%, 93.0%, and 0.769, respectively; those of the T2 star mapping-3D DESS fused pseudo-color maps were 91.4%, 94.2%, and 0.787. Conclusion Fused pseudo-color maps of T2 star mapping and T1 images with 3D DESS are superior to arthroscopy in evaluating early articular cartilage injury.
Abstract: A novel algorithm for 3-D surface image registration is proposed. It makes use of the array information of 3-D points and takes vector/vertex-like features as the basis for matching; this array information can easily be obtained when capturing the original 3-D images. An iterative least-mean-squares (LMS) algorithm is applied to adaptively optimize the transformation matrix parameters, which effectively improves registration performance and speeds up the matching process. Experimental results show that the method produces a good subjective impression on aligned 3-D images. Although the algorithm focuses primarily on the human head model, it can also be applied to other objects with small modifications.
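The least-squares transform estimation at the heart of such registration can be shown in a 2D analogue. The sketch below uses the closed-form Kabsch/Procrustes solution for a rigid (rotation + translation) fit with known correspondences, rather than the paper's iterative LMS scheme; the function name and interface are assumptions for illustration.

```python
import math

def rigid_align_2d(src, dst):
    """Closed-form least-squares rigid alignment of 2D point sets with
    known correspondences. Returns (theta, tx, ty) such that
    dst_i ~= R(theta) @ src_i + (tx, ty)."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Accumulate the 2x2 cross-covariance terms of the centered sets.
    sxx = sxy = syx = syy = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x -= csx; y -= csy; u -= cdx; v -= cdy
        sxx += x * u; sxy += x * v; syx += y * u; syy += y * v
    # Optimal rotation angle maximizing the correlation of the sets.
    theta = math.atan2(sxy - syx, sxx + syy)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty
```

An iterative LMS variant would instead update the transform parameters with small gradient steps per correspondence; the closed form above gives the same least-squares optimum in one pass.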
Funding: Project (104244) supported by the Natural Sciences and Engineering Research Council of Canada.
Abstract: A stereo particle image velocimetry (stereo-PIV) system was developed to measure three-dimensional (3D) soil deformation around a laterally loaded pile in sand. The stereo-PIV technique extends 2D measurement to 3D using a binocular vision model, in which two cameras with a carefully arranged geometry image the same object simultaneously. The system uses two open software packages and some simple MATLAB programs, and can easily be adjusted to meet user needs at low cost. The failure planes form an angle with the horizontal of 27°-29°, approximately three-fourths of the frictional angle of the soil. The edge of the strain wedge formed in front of the pile is an arc, slightly different from the straight line reported in the literature. The active and passive influence zones are about two and six times the pile diameter, respectively. The test demonstrates the good performance and feasibility of this stereo-PIV system for more advanced geotechnical testing.
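The core of any binocular vision model of this kind is depth recovery from disparity. A minimal sketch, assuming an ideal rectified pinhole stereo pair (the abstract does not give the actual camera model or calibration):

```python
def depth_from_disparity(focal_px, baseline, x_left, x_right):
    """Ideal rectified stereo: depth Z = f * B / d, where f is the focal
    length in pixels, B the camera baseline, and d = x_left - x_right the
    horizontal disparity of the same point in the two images."""
    d = x_left - x_right
    if d <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline / d
```

With a 1000 px focal length and a 0.1 m baseline, a 50 px disparity corresponds to a point 2 m from the cameras; in practice the two views are calibrated and rectified first, and sub-pixel disparity is estimated from correlated image patches.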
Funding: Supported by the National Natural Science Foundation of China (Nos. 41171355 and 41002120).
Abstract: A new object-oriented method has been developed for the extraction of Mars rocks from Mars rover data, based on a combination of rover imagery and 3D point cloud data. First, Navcam or Pancam images taken by the rovers are segmented into homogeneous objects with a mean-shift algorithm. The objects in the segmented images are then classified into small rock candidates, rock shadows, and large objects. Rock shadows and large objects are treated as regions within which large rocks may exist; in these regions, large rock candidates are extracted through ground-plane fitting with the 3D point cloud data. Small and large rock candidates are combined and postprocessed to obtain the final extraction results. The shape properties of the rocks (angularity, circularity, width, height, and width-height ratio) are calculated for subsequent geological studies.
Abstract: To overcome the shortcomings of 1D and 2D Otsu thresholding techniques, the 3D Otsu method has been developed; among Otsu's methods, the 3D technique provides the best threshold values for multi-level thresholding. In this paper, to improve the quality of segmented images, a simple and effective multilevel thresholding method is introduced. The proposed approach preserves edge detail by computing the 3D Otsu thresholds within a fusion framework. Its advantages include higher-quality outcomes, better preservation of fine details and boundaries, and reduced execution time as the number of threshold levels rises. The fusion approach depends on the differences between pixel intensity values within a small local region of an image, and aims to improve localized information after thresholding. Fusing images based on local contrast improves segmentation performance by minimizing the loss of local contrast, detail, and gray-level distributions. Results show that the proposed method yields more promising segmentation than conventional 1D, 2D, and 3D Otsu methods, as evident from both objective and subjective evaluations.
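For reference, the classic 1D Otsu criterion that the 2D and 3D variants generalize can be written compactly: choose the threshold maximizing the between-class variance of the grayscale histogram. This is a sketch of the standard algorithm, not of the paper's fusion-based 3D extension.

```python
def otsu_threshold(hist):
    """Classic 1D Otsu: return the threshold t (pixels <= t go to class 0)
    that maximizes the between-class variance over a grayscale histogram."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0          # cumulative class-0 weight
    sum0 = 0        # cumulative class-0 intensity sum
    best_t, best_var = 0, -1.0
    for t in range(len(hist)):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

The 2D and 3D variants extend the histogram with local mean and median intensities per pixel, which makes the criterion more robust to noise at the cost of a larger search space.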
Funding: Supported by the National "Eleventh Five-Year" Forestry Support Program of China (No. 2006BAD03A1603).
Abstract: An ILRIS-36D 3-D laser image scanning system was used to monitor the Anjialing strip mine slope at Pingshuo in Shanxi province. The basic working principles, performance indexes, features, and data collection and processing methods are illustrated, and the point cloud results are analyzed in detail. The rescaled range analysis method was used to examine the deformation characteristics of the slope. The results show that the trend of slope displacement is stable and that the degree of landslide danger is low. This work indicates that 3-D laser image scanning can supply multi-parameter, high-precision, real-time data over long distances, which can be used to study slope distortion quickly and accurately.
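Rescaled range (R/S) analysis, used above to characterize the displacement series, estimates the Hurst exponent H from how the rescaled range grows with window size; H > 0.5 indicates a persistent (trending) series. A minimal sketch under the usual non-overlapping-window convention (the paper's exact windowing choices are not given in the abstract):

```python
import math

def rescaled_range(series):
    """R/S statistic for one window: range of cumulative deviations from
    the window mean, divided by the window standard deviation."""
    n = len(series)
    mean = sum(series) / n
    dev, cum = [], 0.0
    for x in series:
        cum += x - mean
        dev.append(cum)
    r = max(dev) - min(dev)
    s = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return r / s if s > 0 else 0.0

def hurst_exponent(series, window_sizes):
    """Estimate H as the least-squares slope of log(mean R/S) vs log(n)."""
    xs, ys = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        rs = [rescaled_range(c) for c in chunks if len(c) == n]
        xs.append(math.log(n))
        ys.append(math.log(sum(rs) / len(rs)))
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))
```

A steadily creeping slope would give H near 1, while a stable, noise-like displacement record gives H near 0.5, which is the kind of distinction the deformation analysis above relies on.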
Funding: Financially supported by the National Natural Science Foundation of China (No. 51304076) and the Natural Science Foundation of Hunan Province, China (No. 14JJ4064).
Abstract: Mineral dissemination and pore space distribution in ore particles are important features that influence heap leaching performance. To quantify them, a cylindrical copper oxide ore sample (Φ4.6 mm × 5.6 mm) was scanned using high-resolution X-ray computed tomography (HRXCT), a nondestructive imaging technology, at a spatial resolution of 4.85 μm. Combined with three-dimensional (3D) image analysis techniques, the main mineral phases and pore space were segmented and the volume fraction of each phase was calculated. The mass fraction of each mineral phase was also estimated, and the result was validated against that obtained with traditional techniques. Furthermore, the pore phase features, including the pore size distribution, pore surface area, pore fractal dimension, pore centerline, and pore connectivity, were investigated quantitatively. The pore space analysis indicates that the pore size distribution closely fits a log-normal distribution and that the pore space morphology is complicated, with a large surface area and low connectivity. This study demonstrates that the combination of HRXCT and 3D image analysis is an effective tool for acquiring 3D mineralogical and pore structural data.
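Fitting a log-normal distribution to measured pore diameters, as reported above, reduces to estimating the mean and standard deviation of log(d). A minimal method-of-moments sketch (the paper's actual fitting procedure is not specified in the abstract):

```python
import math

def fit_lognormal(diameters):
    """Fit a log-normal distribution by matching the mean and population
    standard deviation of log(d). Returns (mu, sigma); the distribution
    median is exp(mu)."""
    logs = [math.log(d) for d in diameters]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    return mu, sigma

def lognormal_pdf(d, mu, sigma):
    """Density of the fitted log-normal at diameter d > 0."""
    coef = 1.0 / (d * sigma * math.sqrt(2 * math.pi))
    return coef * math.exp(-(math.log(d) - mu) ** 2 / (2 * sigma ** 2))
```

Goodness of fit is then typically checked by comparing the fitted curve against the empirical histogram of diameters, e.g. with a chi-square or Kolmogorov-Smirnov statistic.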
Funding: Supported by the National Basic Research 973 Program of China (Grant 2014CB046905), the Natural Science Foundation of Jiangsu Province for Distinguished Young Scholars (Grant BK20150005), the Fundamental Research Funds for the Central Universities (China University of Mining and Technology) (Grant 2014XT03), and the Innovation Research Project for Academic Graduates of Jiangsu Province (Grant KYLX16_0536).
Abstract: In this paper, uniaxial compression tests were carried out on a series of composite rock specimens with different dip angles, made from two types of rock-like material of different strength. The acoustic emission technique was used to monitor the acoustic signal characteristics of the specimens throughout loading. At the same time, an optical non-contact 3D digital image correlation technique was used to study the evolution of the axial strain field and the maximum strain field before and after the peak strength at different stress levels. The effect of bedding plane inclination on deformation and strength under uniaxial loading was analyzed, methods for solving the elastic constants of the hard and weak rock were described, and the damage evolution process, deformation and failure mechanism, and failure mode were fully determined. The experimental results show that the θ = 0°-45° specimens exhibited obvious plastic deformation during loading, while the brittleness of the θ = 60°-90° specimens gradually increased. As the anisotropy angle θ increased from 0° to 90°, the peak strength, peak strain, and apparent elastic modulus all first decreased and then increased. The failure mode of the composite rock specimens under uniaxial loading can be divided into three categories: tensile fracture across the discontinuities (θ = 0°-30°), sliding failure along the discontinuities (θ = 45°-75°), and tensile splitting along the discontinuities (θ = 90°). The axial strain of the weak and hard rock layers in the composite specimen differed significantly for the θ = 0°-45° specimens and was almost the same for the θ = 60°-90° specimens. As for the strain localization highlighted in the maximum principal strain field, in the θ = 0°-30° specimens it appeared in the rock matrix approximately parallel to the loading direction, while in the θ = 45°-90° specimens it appeared at the interface between the hard and weak rock layers.
Funding: Supported by the Foundation of the Graduate Innovation Center of Nanjing University of Aeronautics and Astronautics (No. kfjj20191506).
Abstract: Given that the traditional Hausdorff image matching algorithm is very sensitive to image size and shows unsatisfactory real-time performance in practice, an image matching algorithm combining Yolov3 with Hausdorff matching is proposed. First, features of the reference image are selected for pretraining; the training results are then used to extract features from the real images, and the coordinates of the center points of the feature areas complete the coarse matching. Finally, the Hausdorff algorithm completes the fine matching. Experiments show that the proposed algorithm significantly improves the speed and accuracy of image matching and is robust to rotation changes.
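The Hausdorff distance used in the fine-matching stage measures the worst-case mismatch between two point sets (e.g. edge points of two images). A minimal brute-force sketch:

```python
def directed_hausdorff(A, B):
    """Directed Hausdorff distance: for each point in A, find its nearest
    neighbor in B, then take the worst (largest) of those distances."""
    return max(
        min(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in B)
        for ax, ay in A
    )

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two 2D point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

The brute-force version is O(|A|·|B|), which is exactly why restricting it to a small Yolov3-detected region (the coarse-matching step above) pays off compared with running it over the whole image.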
Funding: Supported by the National Natural Science Foundation of China (61572063, 61401308), the Fundamental Research Funds for the Central Universities (2016YJS039), the Natural Science Foundation of Hebei Province (F2016201142, F2016201187), the Natural Social Foundation of Hebei Province (HB15TQ015), the Science Research Project of Hebei Province (QN2016085, ZC2016040), and the Natural Science Foundation of Hebei University (2014-303).
Abstract: Fusion methods based on multi-scale transforms have become the mainstream of pixel-level image fusion. However, most of these methods cannot fully exploit the spatial-domain information of the source images, which degrades the fused image. This paper presents a fusion framework based on block-matching and 3D (BM3D) multi-scale transforms. The algorithm first divides the images into blocks and groups similar 2D blocks into 3D arrays. It then applies a 3D transform, consisting of a 2D multi-scale transform and a 1D transform, to convert the arrays into transform coefficients, and the resulting low- and high-frequency coefficients are fused with different fusion rules. The final fused image is obtained from the fused 3D block groups after the inverse transform, via an aggregation process. In the experimental part, we comparatively analyze several existing algorithms and the use of different transforms in the 3D transform step, e.g., the non-subsampled Contourlet transform (NSCT) and the non-subsampled Shearlet transform (NSST). Experimental results show that the proposed framework not only improves subjective visual quality but also achieves better objective evaluation criteria than state-of-the-art methods.
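The block-matching step that forms the 3D groups can be sketched as a similarity search: collect the blocks whose sum-of-squared-differences to a reference block falls below a threshold. The function name, block representation, and threshold convention below are illustrative assumptions; BM3D implementations typically restrict the search to a local window and cap the group size.

```python
def group_similar_blocks(blocks, ref_idx, threshold):
    """Collect indices of 2D blocks whose SSD to the reference block is
    below a threshold; the matched blocks are then stacked into the 3D
    array that the subsequent 3D transform operates on."""
    ref = blocks[ref_idx]
    group = []
    for i, b in enumerate(blocks):
        ssd = sum((p - q) ** 2
                  for row_b, row_r in zip(b, ref)
                  for p, q in zip(row_b, row_r))
        if ssd <= threshold:
            group.append(i)
    return group
```

Stacking mutually similar blocks makes the 1D transform along the group axis highly sparse, which is what lets the coefficient-domain fusion rules separate structure from noise.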
Abstract: Image fusion is performed between one band of a multi-spectral image and two bands of a hyperspectral image to produce a fused image with the spatial resolution of the multi-spectral source and the spectral resolution of the hyperspectral source. Based on the characteristics and three-dimensional (3-D) feature analysis of multi-spectral and hyperspectral image data volumes, a new fusion approach using a 3-D wavelet transform is proposed. The approach comprises four major procedures: spatial and spectral resampling, 3-D wavelet transform, wavelet coefficient integration, and 3-D inverse wavelet transform. In particular, a novel method, Ratio Image Based Spectral Resampling (RIBSR), is proposed to accomplish data resampling in the spectral domain by utilizing the properties of the ratio image, and a new fusion rule, the Average and Substitution (A&S) rule, is employed for wavelet coefficient integration. Experimental results illustrate that the 3-D wavelet fusion approach exploits both spatial and spectral characteristics of the source images more adequately and produces fused images of higher quality with fewer artifacts than the 2-D wavelet approach. The results also show that the RIBSR method interpolates missing data more effectively and correctly, and that the A&S rule integrates coefficients in the 3-D wavelet domain so as to better preserve both spatial and spectral features of the source images.
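The abstract names but does not define the A&S coefficient-integration rule, so the sketch below is one plausible reading, clearly labeled as an assumption: average the low-frequency (approximation) coefficients of the two sources and substitute, per position, the high-frequency (detail) coefficient of larger magnitude. The paper's exact rule may differ.

```python
def average_and_substitution(low_a, low_b, high_a, high_b):
    """Hypothetical A&S-style fusion of wavelet coefficients: average the
    approximation coefficients, keep the stronger detail coefficient."""
    fused_low = [(a + b) / 2 for a, b in zip(low_a, low_b)]
    fused_high = [a if abs(a) >= abs(b) else b
                  for a, b in zip(high_a, high_b)]
    return fused_low, fused_high
```

Averaging the approximation band blends the overall radiometry of the two sources, while magnitude-based substitution in the detail bands keeps the sharper edges from whichever source provides them.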
Funding: Supported by the Chinese-German Center for Scientific Promotion (Chinesisch-Deutsches Zentrum für Wissenschaftsförderung) under the Project of the Sino-German Research Group (GZ494); the Beijing Municipal Education Commission program for building scientific research bases (2008BJKY01); the German Academic Exchange Service (DAAD) and the China Scholarship Council (CSC), for enhancing our cooperation; and the International Cooperation Fund of the Ministry of Science and Technology, China (2010DFA34670).
Abstract: Plants respond to drought stress in different physical manners, such as the morphology and color of their leaves; thus, plants can be regarded as living sensors for monitoring the dynamics of soil water content or the water stored in the plant body. Because early wilting symptoms are difficult to identify from results in 2D (two-dimensional) space, this paper presents a preliminary study with 3D (three-dimensional) images, in which a laser scanner was used to acquire morphological information on zucchini (Cucurbita pepo) leaves, and a leaf wilting index (DLWIF) was defined by fractal dimension. The experiment consisted of phase 1, for observing the temporal variation of DLWIF, and phase 2, for validating the index. During the experiment, air temperature, luminous intensity, and volumetric soil water content (VSWC) were recorded simultaneously over time. The results of both phases fit the bisector (line 1:1) with R2 = 0.903 and RMSE = 0.155. More significantly, the influence of VSWC at three levels (0.22, 0.30, and 0.36 cm3 cm-3) on the response of the plant samples to drought stress was observed from separated traces of DLWIF. In brief, two conclusions are drawn: (i) the laser scanner is an effective tool for non-contact detection of morphological wilting of plants, and (ii) the defined DLWIF can be a promising indicator for a category of plants like zucchini.
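A fractal-dimension index like DLWIF is commonly estimated by box counting: cover the point set at several box sizes s, count occupied boxes N(s), and take the slope of log N(s) against log(1/s). The sketch below works on a 2D point set for brevity; the paper's index is computed from 3D leaf scans, and the exact estimator it uses is not given in the abstract.

```python
import math

def box_counting_dimension(points, box_sizes):
    """Estimate the fractal dimension of a 2D point set as the
    least-squares slope of log N(s) versus log(1/s)."""
    xs, ys = [], []
    for s in box_sizes:
        # Occupied boxes: distinct integer grid cells at scale s.
        boxes = {(int(x // s), int(y // s)) for x, y in points}
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(boxes)))
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))
```

Points sampled along a straight segment give a dimension near 1, while points filling a patch give a value near 2; a wilting leaf surface crumpling out of its plane shifts such an index, which is the behavior the DLWIF traces exploit.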