40 articles found
Reconstructing the 3D digital core with a fully convolutional neural network (cited by: 2)
1
Authors: Li Qiong, Chen Zheng, He Jian-Jun, Hao Si-Yu, Wang Rui, Yang Hao-Tao, Sun Hua-Jun. 《Applied Geophysics》 (SCIE, CSCD), 2020, No. 3, pp. 401-410, 10 pages.
This paper describes the complete process of constructing a 3D digital core with a fully convolutional neural network. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone based on a small number of CT images. The Hamming distance, together with the Minkowski functions for porosity, average volume specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction, is used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functions and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but can also restore the real characteristics of pore distribution in sandstone, which offers a new way to characterize the internal microstructure of rocks.
Keywords: fully convolutional neural network; 3D digital core; numerical simulation; training set
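The evaluation named in this abstract (a Hamming distance between the real and reconstructed cores, plus relative errors on scalar descriptors such as porosity) can be illustrated with a minimal Python sketch. The abstract does not say how the Hamming distance was normalized, so dividing by the voxel count is an assumption here; the function names and the tiny 2×2×2 volumes are hypothetical and are not taken from the paper.

```python
def flatten(volume):
    """Flatten a nested-list 3D binary volume (planes -> rows -> voxels)."""
    return [voxel for plane in volume for row in plane for voxel in row]


def porosity(volume):
    """Fraction of voxels labeled as pore (1)."""
    voxels = flatten(volume)
    return sum(voxels) / len(voxels)


def normalized_hamming(a, b):
    """Fraction of voxel positions where two equally shaped volumes differ.

    Normalizing by the voxel count is an assumption, not the paper's stated choice.
    """
    flat_a, flat_b = flatten(a), flatten(b)
    if len(flat_a) != len(flat_b):
        raise ValueError("volumes must have the same number of voxels")
    return sum(x != y for x, y in zip(flat_a, flat_b)) / len(flat_a)


def relative_error(reference, estimate):
    """Relative error of an estimate with respect to a nonzero reference value."""
    return abs(estimate - reference) / abs(reference)


# Hypothetical toy data: 1 = pore, 0 = solid.
real_core = [[[1, 0], [0, 0]], [[1, 1], [0, 0]]]
reconstruction = [[[1, 0], [0, 0]], [[1, 0], [0, 0]]]

hamming = normalized_hamming(real_core, reconstruction)
porosity_error = relative_error(porosity(real_core), porosity(reconstruction))
```

The other Minkowski functions mentioned in the abstract (specific surface area, curvature, connectivity) would need a proper image-analysis routine and are deliberately left out of this sketch.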
Image-Based Flow Prediction of Vocal Folds Using 3D Convolutional Neural Networks
2
Authors: Yang Zhang, Tianmei Pu, Jiasen Xu, Chunhua Zhou. 《Journal of Bionic Engineering》 (SCIE, EI, CSCD), 2024, No. 2, pp. 991-1002, 12 pages.
In this work, a three-dimensional (3D) convolutional neural network (CNN) model based on image slices of various normal and pathological vocal folds is proposed for accurate and efficient prediction of glottal flows. The 3D CNN model is composed of a feature extraction block and a regression block. The feature extraction block is capable of learning low-dimensional features from the high-dimensional image data of the glottal shape, and the regression block is employed to flatten the output from the feature extraction block and obtain the desired glottal flow data. The input image data is the condensed set of 2D image slices captured in the axial plane of the 3D vocal folds, where these glottal shapes are synthesized based on the equations of normal vibration modes. The output flow data is the corresponding flow rate, averaged glottal pressure and nodal pressure distributions over the glottal surface. The 3D CNN model is built to establish the mapping between the input image data and output flow data. The ground-truth flow variables of each glottal shape in the training and test datasets are obtained by a high-fidelity sharp-interface immersed-boundary solver. The proposed model is trained to predict the concerned flow variables for glottal shapes in the test set. The present 3D CNN model is more efficient than traditional Computational Fluid Dynamics (CFD) models while the accuracy can still be retained, and more powerful than previous data-driven prediction models because more details of the glottal flow can be provided. The prediction performance of the trained 3D CNN model in accuracy and efficiency indicates that this model could be promising for future clinical applications.
Keywords: vocal folds; computational fluid dynamics; machine learning; 3D convolutional neural network
SGT-Net: A Transformer-Based Stratified Graph Convolutional Network for 3D Point Cloud Semantic Segmentation
3
Authors: Suyi Liu, Jianning Chi, Chengdong Wu, Fang Xu, Xiaosheng Yu. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 6, pp. 4471-4489, 19 pages.
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete the point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further augment the weights of crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
Keywords: 3D point cloud; semantic segmentation; long-range contexts; global-local feature; graph convolutional network; dense-sparse sampling strategy
Review of Artificial Intelligence for Oil and Gas Exploration: Convolutional Neural Network Approaches and the U-Net 3D Model
4
Author: Weiyan Liu. 《Open Journal of Geology》 (CAS), 2024, No. 4, pp. 578-593, 16 pages.
Deep learning, especially through convolutional neural networks (CNN) such as the U-Net 3D model, has revolutionized fault identification from seismic data, representing a significant leap over traditional methods. Our review traces the evolution of CNN, emphasizing the adaptation and capabilities of the U-Net 3D model in automating seismic fault delineation with unprecedented accuracy. We find: 1) The transition from basic neural networks to sophisticated CNN has enabled remarkable advancements in image recognition, which are directly applicable to analyzing seismic data. The U-Net 3D model, with its innovative architecture, exemplifies this progress by providing a method for detailed and accurate fault detection with reduced manual interpretation bias. 2) The U-Net 3D model has demonstrated its superiority over traditional fault identification methods in several key areas: it has enhanced interpretation accuracy, increased operational efficiency, and reduced the subjectivity of manual methods. 3) Despite these achievements, challenges such as the need for effective data preprocessing, acquisition of high-quality annotated datasets, and achieving model generalization across different geological conditions remain. Future research should therefore focus on developing more complex network architectures and innovative training strategies to refine fault identification performance further. Our findings confirm the transformative potential of deep learning, particularly CNN like the U-Net 3D model, in geosciences, advocating for its broader integration to revolutionize geological exploration and seismic analysis.
Keywords: deep learning; convolutional neural networks (CNN); seismic fault identification; U-Net 3D model; geological exploration
Enhancing SS-OCT 3D image reconstruction: A real-time system with stripe artifact suppression and GPU parallel acceleration
5
Author: Dandan LIU. 《虚拟现实与智能硬件(中英文)》, 2026, No. 1, pp. 115-130, 16 pages.
Optical coherence tomography (OCT), particularly Swept-Source OCT, is widely employed in medical diagnostics and industrial inspections owing to its high-resolution imaging capabilities. However, Swept-Source OCT 3D imaging often suffers from stripe artifacts caused by unstable light sources, system noise, and environmental interference, posing challenges to real-time processing of large-scale datasets. To address this issue, this study introduces a real-time reconstruction system that integrates stripe-artifact suppression and parallel computing using a graphics processing unit. This approach employs a frequency-domain filtering algorithm with adaptive anti-suppression parameters, dynamically adjusted through an image quality evaluation function and optimized using a convolutional neural network for complex frequency-domain feature learning. Additionally, a graphics-processing-unit-integrated 3D reconstruction framework is developed, enhancing data processing throughput and real-time performance via a dual-queue decoupling mechanism. Experimental results demonstrate significant improvements in structural similarity (0.92), peak signal-to-noise ratio (31.62 dB), and stripe suppression ratio (15.73 dB) compared with existing methods. On the RTX 4090 platform, the proposed system achieved an end-to-end delay of 94.36 milliseconds, a frame rate of 10.3 frames per second, and a throughput of 121.5 million voxels per second, effectively suppressing artifacts while preserving image details and enhancing real-time 3D reconstruction performance.
Keywords: stripe artifact suppression; 3D reconstruction; GPU parallel computing; adaptive frequency domain filtering; convolutional neural network
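For readers unfamiliar with the peak signal-to-noise ratio quoted in this abstract, the standard definition is 10·log10(MAX² / MSE). The following is a minimal Python sketch of that textbook formula only. The 8-bit peak value of 255 and the sample pixel values are assumptions for illustration, and nothing here reproduces the paper's evaluation pipeline.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and equally sized")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have no noise
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values; the mean squared error here is 50.
clean = [100, 100, 100, 100]
noisy = [110, 90, 100, 100]
value_db = psnr(clean, noisy)
```

A higher value means the processed image is closer to the reference, which is why the abstract reports it alongside structural similarity.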
Segmentation of retinal fluid based on deep learning: application of three-dimensional fully convolutional neural networks in optical coherence tomography images (cited by: 4)
6
Authors: Meng-Xiao Li, Su-Qin Yu, Wei Zhang, Hao Zhou, Xun Xu, Tian-Wei Qian, Yong-Jing Wan. 《International Journal of Ophthalmology(English edition)》 (SCIE, CAS), 2019, No. 6, pp. 1012-1020, 9 pages.
AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. In order to solve the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function based on the 2D fully convolutional network were modified. For this network, the correlations of corresponding positions among adjacent images in space are ignored. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation in the retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, Kappa coefficient is 98.47%, and F1 score of retinal fluid is 95.50%. CONCLUSION: The OCT image segmentation algorithm based on deep learning is primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.
Keywords: optical coherence tomography images; fluid segmentation; 2D fully convolutional network; 3D fully convolutional network
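The three scores named in the RESULTS section (accuracy, Kappa coefficient, F1 score) have standard definitions, sketched below in Python for a binary fluid/background labeling. This is only an illustration of the usual formulas: the function name and the toy label vectors are hypothetical, and the sketch assumes both classes occur so that no denominator is zero.

```python
def binary_metrics(truth, pred):
    """Return (accuracy, Cohen's kappa, F1 of the positive class) for 0/1 labels."""
    if len(truth) != len(pred) or not truth:
        raise ValueError("inputs must be non-empty and equally sized")
    n = len(truth)
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, pred))
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal label frequencies.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, kappa, f1


# Hypothetical per-pixel labels: 1 = fluid, 0 = background.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 1]
accuracy, kappa, f1 = binary_metrics(truth, pred)
```

On imbalanced data such as retinal fluid, accuracy alone can look high even when the minority class is poorly segmented, which is one reason Kappa and F1 are reported next to it.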
CurveNet: Curvature-Based Multitask Learning Deep Networks for 3D Object Recognition (cited by: 4)
7
Authors: A.A.M.Muzahid, Wanggen Wan, Ferdous Sohel, Lianyao Wu, Li Hou. 《IEEE/CAA Journal of Automatica Sinica》 (SCIE, EI, CSCD), 2021, No. 6, pp. 1177-1187, 11 pages.
In computer vision, 3D object recognition is one of the most important tasks for many real-world applications. Three-dimensional convolutional neural networks (CNNs) have demonstrated their advantages in 3D object recognition. In this paper, we propose to use the principal curvature directions of 3D objects (using a CAD model) to represent the geometric features as inputs for the 3D CNN. Our framework, namely CurveNet, learns perceptually relevant salient features and predicts object class labels. Curvature directions incorporate complex surface information of a 3D object, which helps our framework to produce more precise and discriminative features for object recognition. Multitask learning is inspired by sharing features between two related tasks, where we consider pose classification as an auxiliary task to enable our CurveNet to better generalize object label classification. Experimental results show that our proposed framework using curvature vectors performs better than voxels as an input for 3D object classification. We further improved the performance of CurveNet by combining two networks with both curvature direction and voxels of a 3D object as the inputs. A Cross-Stitch module was adopted to learn effective shared features across multiple representations. We evaluated our methods using three publicly available datasets and achieved competitive performance in the 3D object recognition task.
Keywords: 3D shape analysis; convolutional neural network; DNNs; object classification; volumetric CNN
Behavior recognition algorithm based on the improved R3D and LSTM network fusion (cited by: 1)
8
Authors: Wu Jin, An Yiyuan, Dai Wei, Zhao Bo. 《High Technology Letters》 (EI, CAS), 2021, No. 4, pp. 381-387, 7 pages.
Because behavior recognition is based on video frame sequences, this paper proposes a behavior recognition algorithm that combines a 3D residual convolutional neural network (R3D) and long short-term memory (LSTM). First, the residual module is extended to three dimensions, which can extract features in the time and space domains at the same time. Second, the integrity of the time-domain features is preserved by changing the size of the pooling layer window; at the same time, to overcome the difficulty of network training and over-fitting problems, a batch normalization (BN) layer and a dropout layer are added. After that, because the global average pooling (GAP) layer is affected by the size of the feature map, the network cannot be further deepened, so a convolution layer and a maxpool layer are added to the R3D network. Finally, because LSTM has the ability to memorize information and can extract more abstract timing features, the LSTM network is introduced into the R3D network. Experimental results show that the R3D+LSTM network achieves a 91% recognition rate on the UCF-101 dataset.
Keywords: behavior recognition; three-dimensional residual convolutional neural network (R3D); long short-term memory (LSTM); dropout; batch normalization (BN)
Short‐term and long‐term memory self‐attention network for segmentation of tumours in 3D medical images
9
Authors: Mingwei Wen, Quan Zhou, Bo Tao, Pavel Shcherbakov, Yang Xu, Xuming Zhang. 《CAAI Transactions on Intelligence Technology》 (SCIE, EI), 2023, No. 4, pp. 1524-1537, 14 pages.
Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. The popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively, while the latter lacks inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks provide only limited performance improvement, or even poorer segmentation performance than the pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. Firstly, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines the above information to exclude similar regions and detect multiple tumours. Finally, the multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method provides Dice (82.4% vs. 76.6%) and Hausdorff distance 95% (HD95) (10.66 vs. 11.54 mm) on the KiTS19 as well as Dice (80.2% vs. 78.4%) and HD95 (9.632 vs. 12.17 mm) on the LiTS.
Keywords: 3D medical images; convolutional neural network; self-attention network; Transformer; tumor segmentation
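The Dice score reported in this abstract is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted mask A and a reference mask B. A minimal Python sketch on flattened binary masks follows; the function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper. HD95 is not covered here.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two flat 0/1 masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical flattened masks: 1 = tumour voxel, 0 = background.
predicted_mask = [1, 1, 0, 0]
reference_mask = [1, 0, 1, 0]
score = dice_coefficient(predicted_mask, reference_mask)
```

A 3D volume can be passed through the same function after flattening, since the measure only counts voxels and ignores their spatial arrangement.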
Evaluating Method of Lower Limb Coordination Based on Spatial-Temporal Dependency Networks
10
Authors: Xuelin Qin, Huinan Sang, Shihua Wu, Shishu Chen, Zhiwei Chen, Yongjun Ren. 《Computers, Materials & Continua》, 2025, No. 10, pp. 1959-1980, 22 pages.
As an essential tool for quantitative analysis of lower limb coordination, optical motion capture systems with marker-based encoding still suffer from inefficiency, high costs, spatial constraints, and the requirement for multiple markers. While 3D pose estimation algorithms combined with ordinary cameras offer an alternative, their accuracy often deteriorates under significant body occlusion. To address the challenge of insufficient 3D pose estimation precision in occluded scenarios, which hinders the quantitative analysis of athletes' lower-limb coordination, this paper proposes a multimodal training framework integrating spatiotemporal dependency networks with text-semantic guidance. Compared to traditional optical motion capture systems, this work achieves low-cost, high-precision motion parameter acquisition through the following innovations: (1) A spatiotemporal dependency attention module is designed to establish dynamic spatiotemporal correlation graphs via cross-frame joint semantic matching, effectively resolving the feature fragmentation issue in existing methods. (2) A noise-suppressed multi-scale temporal module is proposed, leveraging KL divergence-based information gain analysis for progressive feature filtering in long-range dependencies, reducing errors by 1.91 mm compared to conventional temporal convolutions. (3) A text-pose contrastive learning paradigm is introduced for the first time, where BERT-generated action descriptions align semantic-geometric features via the BERT encoder, significantly enhancing robustness under severe occlusion (50% joint invisibility). On the Human3.6M dataset, the proposed method achieves an MPJPE of 56.21 mm under Protocol 1, outperforming the state-of-the-art baseline MHFormer by 3.3%. Extensive ablation studies on Human3.6M demonstrate the individual contributions of the core modules: the spatiotemporal dependency module and noise-suppressed multi-scale temporal module reduce MPJPE by 0.30 and 0.34 mm, respectively, while the multimodal training strategy further decreases MPJPE by 0.6 mm through text-skeleton contrastive learning. Comparative experiments involving 16 athletes show that the sagittal plane coupling angle measurements of hip-ankle joints differ by less than 1.2° from those obtained via traditional optical systems (two one-sided t-tests, p<0.05), validating real-world reliability. This study provides an AI-powered analytical solution for competitive sports training, serving as a viable alternative to specialized equipment.
Keywords: graph convolutional networks; lower limb coordination quantification; 3D pose estimation
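MPJPE, the metric quoted in this abstract, is the mean per-joint position error: the average Euclidean distance between predicted and reference 3D joint positions. The Python sketch below shows only that basic definition for a single pose. Any alignment steps prescribed by the Human3.6M protocols are omitted, and the joint coordinates are hypothetical.

```python
import math


def mpjpe(predicted, reference):
    """Mean Euclidean distance between matching 3D joints, in the input units."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    return sum(distances) / len(distances)


# Hypothetical two-joint pose in millimetres.
predicted_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_pose = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
error_mm = mpjpe(predicted_pose, reference_pose)  # per-joint distances are 3 and 4
```

Averaging this value over all frames of a test sequence gives a dataset-level figure comparable in kind to the one the abstract reports.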
Integrating deep learning and logging data analytics for lithofacies classification and 3D modeling of tight sandstone reservoirs (cited by: 6)
11
Authors: Jing-Jing Liu, Jian-Chao Liu. 《Geoscience Frontiers》 (SCIE, CAS, CSCD), 2022, No. 1, pp. 350-363, 14 pages.
Lithofacies classification is essential for oil and gas reservoir exploration and development. The traditional method of lithofacies classification is based on "core calibration logging" and the experience of geologists. This approach has strong subjectivity, low efficiency, and high uncertainty. This uncertainty may be one of the key factors affecting the results of 3D modeling of tight sandstone reservoirs. In recent years, deep learning, which is a cutting-edge artificial intelligence technology, has attracted attention from various fields. However, the study of deep-learning techniques in the field of lithofacies classification has not been sufficient. Therefore, this paper proposes a novel hybrid deep-learning model based on the efficient data feature-extraction ability of convolutional neural networks (CNN) and the excellent ability of long short-term memory networks (LSTM) to describe time-dependent features, and uses it to conduct lithological facies-classification experiments. The results of a series of experiments show that the hybrid CNN-LSTM model had an average accuracy of 87.3% and the best classification effect compared to the CNN, the LSTM, or the three commonly used machine learning models (support vector machine, random forest, and gradient boosting decision tree). In addition, the borderline synthetic minority oversampling technique (BSMOTE) is introduced to address the class-imbalance issue of the raw data. The results show that balancing the processed data can significantly improve the accuracy of lithofacies classification. Besides, based on the fine lithofacies constraints, the sequential indicator simulation method is used to establish a three-dimensional lithofacies model, which completes the fine description of the spatial distribution of tight sandstone reservoirs in the study area. According to this comprehensive analysis, the proposed CNN-LSTM model, which eliminates class imbalance, can be effectively applied to lithofacies classification, and is expected to improve the realism of the geological model for tight sandstone reservoirs.
Keywords: deep learning; convolutional neural networks; LSTM; lithological-facies classification; 3D modeling; class imbalance
Panicle-3D: A low-cost 3D-modeling method for rice panicles based on deep learning, shape from silhouette, and supervoxel clustering (cited by: 3)
12
Authors: Dan Wu, Lejun Yu, Junli Ye, Ruifang Zhai, Lingfeng Duan, Lingbo Liu, Nai Wu, Zedong Geng, Jingbo Fu, Chenglong Huang, Shangbin Chen, Qian Liu, Wanneng Yang. 《The Crop Journal》 (SCIE, CSCD), 2022, No. 5, pp. 1386-1398, 13 pages.
Self-occlusions are common in rice canopy images and strongly influence the calculation accuracies of panicle traits. Such interference can be largely eliminated if panicles are phenotyped at the 3D level. Research on 3D panicle phenotyping has been limited. Given that existing 3D modeling techniques do not focus on specified parts of a target object, an efficient method for panicle modeling of large numbers of rice plants is lacking. This paper presents an automatic and nondestructive method for 3D panicle modeling. The proposed method integrates shoot rice reconstruction with shape from silhouette, 2D panicle segmentation with a deep convolutional neural network, and 3D panicle segmentation with ray tracing and supervoxel clustering. A multiview imaging system was built to acquire image sequences of rice canopies with an efficiency of approximately 4 min per rice plant. The execution time of panicle modeling per rice plant using 90 images was approximately 26 min. The outputs of the algorithm for a single rice plant are a shoot rice model, surface shoot rice model, panicle model, and surface panicle model, all represented by a list of spatial coordinates. The efficiency and performance were evaluated and compared with the classical structure-from-motion algorithm. The results demonstrated that the proposed method is well qualified to recover the 3D shapes of rice panicles from multiview images and is readily adaptable to rice plants of diverse accessions and growth stages. The proposed algorithm is superior to the structure-from-motion method in terms of texture preservation and computational efficiency. The sample images and implementation of the algorithm are available online. This automatic, cost-efficient, and nondestructive method of 3D panicle modeling may be applied to high-throughput 3D phenotyping of large rice populations.
Keywords: panicle phenotyping; deep convolutional neural network; 3D reconstruction; shape from silhouette; point-cloud segmentation; ray tracing; supervoxel clustering
3D Bounding Box Proposal for on-Street Parking Space Status Sensing in Real World Conditions (cited by: 1)
13
Authors: Yaocheng Zheng, Weiwei Zhang, Xuncheng Wu, Bo Zhao. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2019, No. 6, pp. 559-576, 18 pages.
Vision-based technologies have been extensively applied for on-street parking space sensing, aiming at providing timely and accurate information for drivers and improving daily travel convenience. However, this task faces great challenges because partial visualization regularly occurs owing to occlusion from static or dynamic objects or a limited perspective of the camera. This paper presents an imagery-based framework to infer parking space status by generating a 3D bounding box of the vehicle. A specially designed convolutional neural network based on ResNet and a feature pyramid network is proposed to overcome challenges from partial visualization and occlusion. It predicts 3D box candidates on multi-scale feature maps with five different 3D anchors, which are generated by clustering diverse scales of ground truth boxes according to different vehicle templates in the source data set. Subsequently, a vehicle distribution map is constructed jointly from the coordinates of the vehicle box and artificially segmented parking spaces, where the normative degree of a parked vehicle is calculated by computing the intersection over union between the vehicle's box and the parking space edge. In space status inference, to further eliminate mutual vehicle interference, three adjacent spaces are combined into one unit and then a multinomial logistic regression model is trained to refine the status of the unit. Experiments on the KITTI benchmark and Shanghai road show that the proposed method outperforms most monocular approaches in 3D box regression and achieves satisfactory accuracy in space status inference.
Keywords: 3D object proposal; image processing and analysis; parking space detection; fully convolutional network; multinomial logistic regression model
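The intersection over union (IoU) used in this abstract to score how well a vehicle sits inside a parking space can be illustrated in Python. The sketch assumes axis-aligned rectangles given as (x_min, y_min, x_max, y_max) in a common ground-plane coordinate system. That box representation, the function name, and the sample coordinates are hypothetical, since the abstract does not specify how the boxes are parameterized.

```python
def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned rectangles given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Overlap extent along each axis, clamped to zero when the boxes are disjoint.
    overlap_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    overlap_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = overlap_w * overlap_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical footprints in metres.
vehicle_box = (0.0, 0.0, 2.0, 2.0)
parking_space = (1.0, 1.0, 3.0, 3.0)
iou = intersection_over_union(vehicle_box, parking_space)
```

A value near 1 would indicate a vehicle footprint that closely matches the marked space, while a low value suggests a vehicle straddling the boundary or parked elsewhere.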
Mural Anomaly Region Detection Algorithm Based on Hyperspectral Multiscale Residual Attention Network
14
Authors: Bolin Guo, Shi Qiu, Pengchang Zhang, Xingjia Tang. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 10, pp. 1809-1833, 25 pages.
Mural paintings hold significant historical information and possess substantial artistic and cultural value. However, murals are inevitably damaged by natural environmental factors such as wind and sunlight, as well as by human activities. For this reason, the study of damaged areas is crucial for mural restoration. These damaged regions differ significantly from undamaged areas and can be considered abnormal targets. Traditional manual visual processing lacks strong characterization capabilities and is prone to omissions and false detections. Hyperspectral imaging can reflect the material properties more effectively than visual characterization methods. Thus, this study employs hyperspectral imaging to obtain mural information and proposes a mural anomaly detection algorithm based on a hyperspectral multi-scale residual attention network (HM-MRANet). The innovations of this paper include: (1) Constructing mural painting hyperspectral datasets. (2) Proposing a multi-scale residual spectral-spatial feature extraction module based on a 3D CNN (Convolutional Neural Networks) network to better capture multiscale information and improve performance on small-sample hyperspectral datasets. (3) Proposing the Enhanced Residual Attention Module (ERAM) to address the feature redundancy problem, enhance the network's feature discrimination ability, and further improve abnormal area detection accuracy. The experimental results show that the AUC (Area Under Curve), Specificity, and Accuracy of this paper's algorithm reach 85.42%, 88.84%, and 87.65%, respectively, on this dataset. These results represent improvements of 3.07%, 1.11%, and 2.68% compared to the SSRN algorithm, demonstrating the effectiveness of this method for mural anomaly detection.
Keywords: murals; anomaly detection; hyperspectral; 3D CNN (Convolutional Neural Networks); residual network
Automatic detection of breast lesions in automated 3D breast ultrasound with cross-organ transfer learning
15
作者 Lingyun BAO Zhengrui HUANG +7 位作者 Zehui LIN Yue SUN Hui CHEN You LI Zhang LI Xiaochen YUAN Lin XU Tao TAN 《虚拟现实与智能硬件(中英文)》 EI 2024年第3期239-251,共13页
Background Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications,particularly in visual recognition tasks such as image and video analyses.There is a growing... Background Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications,particularly in visual recognition tasks such as image and video analyses.There is a growing interest in applying this technology to diverse applications in medical image analysis.Automated three dimensional Breast Ultrasound is a vital tool for detecting breast cancer,and computer-assisted diagnosis software,developed based on deep learning,can effectively assist radiologists in diagnosis.However,the network model is prone to overfitting during training,owing to challenges such as insufficient training data.This study attempts to solve the problem caused by small datasets and improve model detection performance.Methods We propose a breast cancer detection framework based on deep learning(a transfer learning method based on cross-organ cancer detection)and a contrastive learning method based on breast imaging reporting and data systems(BI-RADS).Results When using cross organ transfer learning and BIRADS based contrastive learning,the average sensitivity of the model increased by a maximum of 16.05%.Conclusion Our experiments have demonstrated that the parameters and experiences of cross-organ cancer detection can be mutually referenced,and contrastive learning method based on BI-RADS can improve the detection performance of the model. 展开更多
Keywords: Breast ultrasound; Automated 3D breast ultrasound; Breast cancers; Deep learning; Transfer learning; Convolutional neural networks; Computer-aided diagnosis; Cross organ learning
Web3D Learning Framework for 3D Shape Retrieval Based on Hybrid Convolutional Neural Networks (Cited by: 1)
16
Authors: Wen Zhou, Jinyuan Jia, Chengxi Huang, Yongqing Cheng. 《Tsinghua Science and Technology》, SCIE, EI, CAS, CSCD, 2020, Issue 1, pp. 93-102 (10 pages)
With the rapid development of Web3D technologies, sketch-based model retrieval has become an increasingly important challenge, while the application of Virtual Reality and 3D technologies has made shape retrieval of furniture over a web browser feasible. In this paper, we propose a learning framework for shape retrieval based on two Siamese VGG-16 Convolutional Neural Networks (CNNs), and a CNN-based hybrid learning algorithm to select the best view for a shape. In this algorithm, the AlexNet and VGG-16 CNN architectures are used to perform classification tasks and to extract features, respectively. In addition, a feature fusion method is used to measure the similarity relation of the output features from the two Siamese networks. The proposed framework can provide new alternatives for furniture retrieval in the Web3D environment. The primary innovation is in the employment of deep learning methods to solve the challenge of obtaining the best view of 3D furniture, and to address cross-domain feature learning problems. We conduct an experiment to verify the feasibility of the framework and the results show our approach to be superior in comparison to many mainstream state-of-the-art approaches.
Keywords: Web3D; sketch-based model retrieval; convolutional neural networks (CNNs); best view; cross-domain
An interactive platform for low-cost 3D building modeling from VGI data using convolutional neural network (Cited by: 1)
17
Authors: Hongchao Fan, Gefei Kong, Chaoquan Zhang. 《Big Earth Data》, EI, 2021, Issue 1, pp. 49-65 (17 pages)
The applications of 3D building models are limited as producing them requires massive labor and time costs as well as expensive devices. In this paper, we aim to propose a novel, web-based interactive platform, VGI3D, to overcome these challenges. The platform is designed to reconstruct 3D building models by using free images from internet users or volunteered geographic information (VGI) platforms, even though not all these images are of high quality. Our interactive platform can effectively obtain each 3D building model from images in 30 seconds, with the help of a user interaction module and a convolutional neural network (CNN). The user interaction module provides the boundary of building facades for 3D building modeling, and the CNN can detect facade elements even when multiple architectural styles and complex scenes are within the images. Moreover, the user interaction module is designed to be as simple as possible to make it easier to use for both expert and non-expert users. Meanwhile, we conducted usability testing and collected feedback from participants to better optimize the platform and user experience. In general, the usage of VGI data reduces labor and device costs, and the CNN simplifies the process of element extraction in 3D building modeling. Hence, our proposed platform offers a promising solution to the 3D modeling community.
Keywords: 3D building modeling; VGI; convolutional neural network; user interaction; low cost
Deep Learning-Based Lip-Reading for Vocal Impaired Patient Rehabilitation
18
Authors: Chiara Innocente, Matteo Boemio, Gianmarco Lorenzetti, Ilaria Pulito, Diego Romagnoli, Valeria Saponaro, Giorgia Marullo, Luca Ulrich, Enrico Vezzetti. 《Computer Modeling in Engineering & Sciences》, 2025, Issue 5, pp. 1355-1379 (25 pages)
Lip-reading technology, based on visual speech decoding and automatic speech recognition, offers a promising solution to overcoming communication barriers, particularly for individuals with temporary or permanent speech impairments. However, most Visual Speech Recognition (VSR) research has primarily focused on the English language and general-purpose applications, limiting its practical applicability in medical and rehabilitative settings. This study introduces the first Deep Learning (DL) based lip-reading system for the Italian language designed to assist individuals with vocal cord pathologies in daily interactions, facilitating communication for patients recovering from vocal cord surgeries, whether temporarily or permanently impaired. To ensure relevance and effectiveness in real-world scenarios, a carefully curated vocabulary of twenty-five Italian words was selected, encompassing critical semantic fields such as Needs, Questions, Answers, Emergencies, Greetings, Requests, and Body Parts. These words were chosen to address both essential daily communication and urgent medical assistance requests. Our approach combines a spatiotemporal Convolutional Neural Network (CNN) with a bidirectional Long Short-Term Memory (BiLSTM) recurrent network and a Connectionist Temporal Classification (CTC) loss function to recognize individual words, without requiring predefined word boundaries. The experimental results demonstrate the system’s robust performance in recognizing target words, reaching an average accuracy of 96.4% in individual word recognition, suggesting that the system is particularly well-suited for offering support in constrained clinical and caregiving environments, where quick and reliable communication is critical. In conclusion, the study highlights the importance of developing language-specific, application-driven VSR solutions, particularly for non-English languages with limited linguistic resources. By bridging the gap between deep learning-based lip-reading and real-world clinical needs, this research advances assistive communication technologies, paving the way for more inclusive and medically relevant applications of VSR in rehabilitation and healthcare.
Keywords: lip-reading; deep learning; automatic speech recognition; visual speech decoding; 3D convolutional neural network
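The abstract above states that a CTC loss lets the network recognize words without predefined word boundaries. The snippet below is a minimal sketch of the standard greedy CTC decoding rule (collapse consecutive repeats, then drop blanks), which is one common way to read out a CTC-trained model. It is an illustration only: the abstract does not say which decoder the authors use, and the function name `ctc_greedy_decode`, the integer label ids, and the choice of `0` as the blank id are assumptions made here.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence into an output label sequence.

    frame_labels: the most likely label id for each video frame
                  (hypothetical input; a real model would supply an argmax
                  over its per-frame class scores).
    blank:        id reserved for the CTC blank symbol (assumed to be 0).
    """
    decoded = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

Because a blank separates genuine repeats, the frame sequence `[1, 1, 0, 1]` decodes to two occurrences of label 1, while `[1, 1, 1]` decodes to one.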
Efficient 3D Biomedical Image Segmentation by Parallelly Multiscale Transformer−CNN Aggregation Network
19
Authors: Wei Liu, Yuxiao He, Tiantian Man, Fulin Zhu, Qiaoliang Chen, Yaqi Huang, Xuyu Feng, Bin Li, Ying Wan, Jian He, Shengyuan Deng. 《Chemical & Biomedical Imaging》, 2025, Issue 8, pp. 522-533 (12 pages)
Accurate and automated segmentation of 3D biomedical images is a sophisticated imperative in clinical diagnosis, imaging-guided surgery, and prognosis judgment. Although the burgeoning of deep learning technologies has fostered smart segmentators, the successive and simultaneous garnering of global and local features still remains challenging, which is essential for an exact and efficient imageological assay. To this end, a segmentation solution dubbed the mixed parallel shunted transformer (MPSTrans) is developed here, highlighting 3DMPST blocks in a U-form framework. It enabled not only comprehensive characteristic capture and multiscale slice synchronization but also deep supervision in the decoder to facilitate the fetching of hierarchical representations. Performing on an unpublished colon cancer data set, this model achieved an impressive increase in dice similarity coefficient (DSC) and a 1.718 mm decrease in Hausdorff distance at 95% (HD95), alongside a substantial shrink of computational load of 56.7% in giga floating-point operations per second (GFLOPs). Meanwhile, MPSTrans outperforms other mainstream methods (Swin UNETR, UNETR, nnU-Net, PHTrans, and 3D U-Net) on three public multiorgan (aorta, gallbladder, kidney, liver, pancreas, spleen, stomach, etc.) and multimodal (CT, PET-CT, and MRI) data sets of medical segmentation decathlon (MSD) brain tumor, multiatlas labeling beyond cranial vault (BCV), and automated cardiac diagnosis challenge (ACDC), accentuating its adaptability. These results reflect the potential of MPSTrans to advance the state-of-the-art in biomedical imaging analysis, which would offer a robust tool for enhanced diagnostic capacity.
Keywords: 3D biomedical image segmentation; shunted transformer; convolutional neural networks; parallel architecture; multiscale feature extraction
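The abstract above reports results in terms of the dice similarity coefficient (DSC). As a reminder of what that metric measures, here is a minimal sketch of the usual definition, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_coefficient`, the flat 0/1 list representation of a 3D mask, and the convention for two empty masks are assumptions made for illustration; this is not the authors' evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    pred, target: flat sequences of 0/1 voxel labels of equal length
                  (a 3D volume is assumed to be flattened beforehand).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total
```

A value of 1.0 means the predicted and reference masks coincide, and 0.0 means they share no foreground voxel.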
3D Filtering by Block Matching and Convolutional Neural Network for Image Denoising
20
Authors: Bei-Ji Zou, Yun-Di Guo, Qi He, Ping-Bo Ouyang, Ke Liu, Zai-Liang Chen. 《Journal of Computer Science & Technology》, SCIE, EI, CSCD, 2018, Issue 4, pp. 838-848 (11 pages)
Block matching based 3D filtering methods have achieved great success in image denoising tasks. However, the manually set filtering operation could not well describe a good model to transform noisy images to clean images. In this paper, we introduce a convolutional neural network (CNN) for the 3D filtering step to learn a well fitted model for denoising. With a trainable model, prior knowledge is utilized for better mapping from noisy images to clean images. This block matching and CNN joint model (BMCNN) could denoise images with different sizes and different noise intensity well, especially images with high noise levels. The experimental results demonstrate that among all competing methods, this method achieves the highest peak signal to noise ratio (PSNR) when denoising images with high noise levels (σ > 40), and the best visual quality when denoising images with all the tested noise levels.
Keywords: block matching; convolutional neural network (CNN); denoising; 3D filtering
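The abstract above compares denoisers by peak signal to noise ratio (PSNR). The sketch below shows the standard definition, PSNR = 10 · log10(MAX² / MSE), on flat lists of pixel intensities. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for illustration; the abstract does not state the image bit depth or the evaluation code used.

```python
import math


def psnr(clean, denoised, max_value=255.0):
    """Peak signal-to-noise ratio in decibels between two images.

    clean, denoised: flat sequences of pixel intensities of equal length.
    max_value:       largest possible pixel value (255 assumed for 8-bit images).
    """
    if len(clean) != len(denoised) or len(clean) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error between the reference and the denoised image.
    mse = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the denoised image is closer to the clean reference.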