7 articles found
Survey on path and view planning for UAVs
1
Authors: Xiaohui ZHOU, Zimu YI, Yilin LIU, Kai HUANG, Hui HUANG. Virtual Reality & Intelligent Hardware, 2020, No. 1, pp. 56-69 (14 pages)
Background: In recent decades, unmanned aerial vehicles (UAVs) have developed rapidly and been widely applied in many domains, including photography, reconstruction, monitoring, and search and rescue. In such applications, one key issue is path and view planning, which tells UAVs exactly where to fly and how to search. Methods: With specific consideration for three popular UAV applications (scene reconstruction, environment exploration, and aerial cinematography), we present a survey that should assist researchers in positioning and evaluating their work in the context of existing solutions. Results/Conclusions: It should also help newcomers and practitioners in related fields quickly gain an overview of the vast literature. In addition to the current research status, we analyze and elaborate on advantages, disadvantages, and potential explorative trends for each application domain.
Keywords: Unmanned aerial vehicle; Path planning; View planning; Multi-view reconstruction; Autonomous exploration; Scene navigation; Obstacle avoidance; Drone cinematography; Camera control
Long working distance portable smartphone microscopy for metallic mesh defect detection
2
Authors: Zhengang LU, Hongsheng QIN, Jing LI, Ming SUN, Jiubin TAN. Frontiers of Information Technology & Electronic Engineering, 2025, No. 7, pp. 1131-1143 (13 pages)
Metallic mesh is a transparent electromagnetic shielding film with a fine metal line structure. However, during production or in actual use it can develop defects that affect its optoelectronic performance. The development of in situ non-destructive testing (NDT) devices for metallic mesh requires long working distances, reflective optical path design, and miniaturization. To address the limitations of existing smartphone microscopes, which feature short working distances and inadequate transmission imaging for industrial in situ inspection, we propose a novel long-working-distance reflective smartphone microscopy (LD-RSM) system. LD-RSM comprises a 4f optical imaging system with external optical components and a smartphone. This system uses a beam splitter to achieve reflective imaging with the illumination system and imaging system on the same side of the sample. It achieves an optical resolution of 4.92 µm and a working distance of up to 22.23 mm. Additionally, we introduce dual-prior weighted robust principal component analysis (DW-RPCA) for defect detection. This approach leverages spectral filter fusion and the Hough transform to model different defect types, which enhances the accuracy and efficiency of defect identification. Coupled with a double-threshold segmentation approach, the DW-RPCA method achieves a pixel-level defect detection accuracy (f-value) of 0.856 and 0.848 on square and circular metallic mesh datasets, respectively. Our work shows strong potential in the field of in situ industrial product inspection.
Keywords: Smartphone microscope; Defect detection; Reflective portable imaging; Metallic mesh; Low-rank decomposition
Taming diffusion model for exemplar-based image translation
3
Authors: Hao Ma, Jingyuan Yang, Hui Huang. Computational Visual Media (CSCD), 2024, No. 6, pp. 1031-1043 (13 pages)
Exemplar-based image translation involves converting semantic masks into photorealistic images that adopt the style of a given exemplar. However, most existing GAN-based translation methods fail to produce photorealistic results. In this study, we propose a new diffusion model-based approach for generating high-quality images that are semantically aligned with the input mask and resemble an exemplar in style. The proposed method trains a conditional denoising diffusion probabilistic model (DDPM) with a SPADE module to integrate the semantic map. We then use a novel contextual loss and an auxiliary color loss to guide the optimization process, resulting in images that are visually pleasing and semantically accurate. Experiments demonstrate that our method outperforms state-of-the-art approaches in terms of both visual quality and quantitative metrics.
Keywords: exemplar; image translation; denoising diffusion probabilistic model (DDPM)
CLIP-Flow:Decoding images encoded in CLIP space
4
Authors: Hao Ma, Ming Li, Jingyuan Yang, Or Patashnik, Dani Lischinski, Daniel Cohen-Or, Hui Huang. Computational Visual Media (CSCD), 2024, No. 6, pp. 1157-1168 (12 pages)
This study introduces CLIP-Flow, a novel network for generating images from a given image or text. To effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image synthesis. In particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such information. Moreover, to bridge the embedding space of CLIP and the latent space of StyleGAN, Real NVP is employed and modified with activation normalization and invertible convolution. As the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image synthesis. We conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis method. In addition, we tested on the public dataset Multi-Modal CelebA-HQ for text-to-image synthesis. Experiments validated that our approach can generate high-quality text-matching images and is comparable with state-of-the-art methods, both qualitatively and quantitatively.
Keywords: image-to-image; text-to-image; contrastive language-image pretraining (CLIP); flow; StyleGAN
A Survey of Blue-Noise Sampling and Its Applications (cited by: 7)
5
Authors: 严冬明, 郭建伟, 王斌, 张晓鹏, Peter Wonka. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, No. 3, pp. 439-452 (14 pages)
In this paper, we survey recent approaches to blue-noise sampling and discuss their beneficial applications. We discuss the sampling algorithms that use points as sampling primitives and classify the sampling algorithms based on various aspects, e.g., the sampling domain and the type of algorithm. We demonstrate several well-known applications that can be improved by recent blue-noise sampling techniques, as well as some new applications such as dynamic sampling and blue-noise remeshing.
Keywords: blue-noise sampling; Poisson-disk sampling; Lloyd relaxation; rendering; remeshing
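Illustration for the entry above: the keywords name Poisson-disk sampling, a common way to obtain blue-noise point sets. The sketch below is naive dart throwing in the unit square, a textbook baseline and not necessarily any algorithm covered by the survey. The function names, the radius, the attempt budget, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def dart_throwing(radius, max_attempts=2000, seed=0):
    """Naive Poisson-disk sampling in the unit square (illustrative only)."""
    rng = random.Random(seed)
    points = []
    for _ in range(max_attempts):
        candidate = (rng.random(), rng.random())
        # Accept the candidate only if it is at least `radius` away
        # from every point accepted so far.
        if all(math.dist(candidate, p) >= radius for p in points):
            points.append(candidate)
    return points


def min_pairwise_distance(points):
    """Smallest distance between any two distinct points in the list."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


# Hypothetical parameters: minimum spacing of 0.1 in the unit square.
samples = dart_throwing(radius=0.1)
```

Each candidate is compared against every accepted point, so the cost grows quadratically with the number of samples; a spatial grid is the usual remedy when larger point sets are needed.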
Recurrent 3D attentional networks for end-to-end active object recognition (cited by: 1)
6
Authors: Min Liu, Yifei Shi, Lintao Zheng, Kai Xu, Hui Huang, Dinesh Manocha. Computational Visual Media (CSCD), 2019, No. 1, pp. 91-103 (13 pages)
Active vision is inherently attention-driven: an agent actively selects views to attend to in order to rapidly perform a vision task while improving its internal representation of the scene being observed. Inspired by the recent success of attention-based models in 2D vision tasks based on single RGB images, we address multi-view depth-based active object recognition using an attention mechanism, by means of an end-to-end recurrent 3D attentional network. The architecture takes advantage of a recurrent neural network to store and update an internal representation. Our model, trained with 3D shape datasets, is able to iteratively attend to the best views targeting an object of interest in order to recognize it. To realize 3D view selection, we derive a 3D spatial transformer network. It is differentiable, allowing training with backpropagation, and so achieves much faster convergence than the reinforcement learning employed by most existing attention-based models. Experiments show that our method, with only depth input, achieves state-of-the-art next-best-view performance both in terms of time taken and recognition accuracy.
Keywords: active object recognition; recurrent neural network; next-best-view; 3D attention
Fitting boxes to Manhattan scenes using linear integer programming
7
Authors: Minglei Li, Liangliang Nan, Shaochuang Liu. International Journal of Digital Earth (SCIE, EI, CSCD), 2016, No. 8, pp. 806-817 (12 pages)
We propose an approach for automatic generation of building models by assembling a set of boxes using a Manhattan-world assumption. The method first aligns the point cloud with a per-building local coordinate system, and then fits axis-aligned planes to the point cloud through an iterative regularization process. The refined planes partition the space of the data into a series of compact cubic cells (candidate boxes) spanning the entire 3D space of the input data. We then approximate the target building by assembling a subset of these candidate boxes using a binary linear programming formulation. The objective function is designed to maximize the point cloud coverage and the compactness of the final model. Finally, all selected boxes are merged into a lightweight polygonal mesh model, which is suitable for interactive visualization of large-scale urban scenes. Experimental results and a comparison with state-of-the-art methods demonstrate the effectiveness of the proposed framework.
Keywords: urban building models; aerial point cloud; Manhattan scenes; linear integer programming