Journal Articles
24 articles found
1. Research on indoor positioning and navigating technology based on scale hierarchical visual image feature matching
Authors: BIE Haoze, QIN Danyang, YANG Jiaqiang, LI Sitong. High Technology Letters, 2025, Issue 2, pp. 164-174 (11 pages).
The impact of location services on people's lives has grown significantly in the era of widespread smart device usage. Due to global navigation satellite system (GNSS) signal rejection, weak signal strength in indoor environments, and radio signal interference caused by multi-wall environments, which collectively lead to significant positioning errors, vision-based positioning has emerged as a crucial method in indoor positioning research. This paper introduces a scale hierarchical matching model to tackle the challenges associated with large visual databases and high scene similarity, both of which compromise matching accuracy and lead to prolonged positioning delays. The proposed model establishes an image feature database using GIST features and speeded up robust features (SURF) in the offline stage. In the online stage, a positioning and navigating algorithm is constructed based on Dijkstra's path planning. Additionally, a corresponding Android application has been developed to facilitate visual positioning and navigation in indoor environments. Experimental results obtained in real indoor environments demonstrate that the proposed method significantly enhances positioning accuracy compared with similar algorithms, while effectively reducing time overhead. This improvement caters to the requirements for indoor positioning and navigation, thereby meeting user needs.
Keywords: visual feature; scale hierarchy; feature matching; indoor positioning; indoor navigation
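To illustrate the Dijkstra-based navigation step mentioned in the abstract, here is a minimal sketch over a hypothetical indoor waypoint graph; the node names and edge weights are invented, and the paper's actual map representation is not specified:

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path over a weighted adjacency dict: {node: [(neighbor, cost), ...]}."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        if node == goal:
            break
        visited.add(node)
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    # Reconstruct the path by walking predecessors back from the goal.
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1], dist[goal]

# Hypothetical corridor graph: nodes are waypoints, weights are distances in meters.
graph = {
    "entrance": [("corridor_A", 5.0), ("corridor_B", 8.0)],
    "corridor_A": [("room_201", 3.0), ("corridor_B", 2.0)],
    "corridor_B": [("room_201", 6.0)],
    "room_201": [],
}
print(dijkstra(graph, "entrance", "room_201"))  # (['entrance', 'corridor_A', 'room_201'], 8.0)
```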
2. A Novelty Framework in Image-Captioning with Visual Attention-Based Refined Visual Features
Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Mohammed ELAffendi, Sajid Shah. Computers, Materials & Continua, 2025, Issue 3, pp. 3943-3964 (22 pages).
Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic information. However, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated captions. This limitation often leads to captions that are less accurate or descriptive. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic context. Our model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate captions. Specifically, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated caption. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language generation. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual relevance. Our results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
Keywords: image captioning; visual attention; deep learning; visual features
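The abstract does not detail how the VWM computes its partial attention; the following sketch shows one plausible form of the general idea — reweighting region features with attention conditioned on the decoder's current hidden state. All module and dimension names are assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class ContextualVisualReweighting(nn.Module):
    """Reweights region features using the current language hidden state
    (a generic stand-in for the paper's Visual Weighting Module)."""
    def __init__(self, feat_dim=2048, hid_dim=512):
        super().__init__()
        self.q = nn.Linear(hid_dim, hid_dim)    # query from the decoder state
        self.k = nn.Linear(feat_dim, hid_dim)   # keys from region features
        self.scale = hid_dim ** 0.5

    def forward(self, regions, hidden):
        # regions: (B, N, feat_dim); hidden: (B, hid_dim)
        scores = torch.einsum("bd,bnd->bn", self.q(hidden), self.k(regions)) / self.scale
        weights = torch.softmax(scores, dim=-1)            # (B, N)
        return regions * weights.unsqueeze(-1), weights    # reweighted features

regions = torch.randn(2, 36, 2048)   # e.g., 36 detected regions per image
hidden = torch.randn(2, 512)         # decoder state at the current time step
reweighted, w = ContextualVisualReweighting()(regions, hidden)
print(reweighted.shape, w.shape)     # torch.Size([2, 36, 2048]) torch.Size([2, 36])
```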
3. NeOR: neural exploration with feature-based visual odometry and tracking-failure-reduction policy
Authors: ZHU Ziheng, LIU Jialing, CHEN Kaiqi, TONG Qiyi, LIU Ruyu. Optoelectronics Letters, 2025, Issue 5, pp. 290-297 (8 pages).
Embodied visual exploration is critical for building intelligent visual agents. This paper presents the neural exploration with feature-based visual odometry and tracking-failure-reduction policy (NeOR), a framework for embodied visual exploration that possesses the efficient exploration capabilities of deep reinforcement learning (DRL)-based exploration policies and leverages feature-based visual odometry (VO) for more accurate mapping and positioning results. An improved local policy is also proposed to reduce tracking failures of feature-based VO in weakly textured scenes through a refined multi-discrete action space, keyframe fusion, and an auxiliary task. The experimental results demonstrate that NeOR has better mapping and positioning accuracy compared to other entirely learning-based exploration frameworks and improves the robustness of feature-based VO by significantly reducing tracking failures in weakly textured scenes.
Keywords: intelligent visual agents; deep reinforcement learning (DRL); embodied visual exploration; feature-based visual odometry; tracking-failure-reduction policy; neural exploration
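A minimal sketch of one feature-based VO step of the kind the framework builds on, using OpenCV with ORB features as a stand-in (the paper's exact feature pipeline is not given in the abstract):

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    """Estimate the relative camera pose between two frames via ORB feature
    matching (a generic feature-based VO step; NeOR's pipeline may differ)."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:500]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    # Essential matrix with RANSAC rejects outlier correspondences --
    # a common failure point in the weakly textured scenes the paper targets.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t  # rotation matrix and unit-scale translation direction
```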
4. Heterogeneous data-driven aerodynamic modeling based on physical feature embedding (Cited by: 2)
Authors: Weiwei ZHANG, Xuhao PENG, Jiaqing KOU, Xu WANG. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2024, Issue 3, pp. 1-6 (6 pages).
Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, while neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm, called heterogeneous data-driven aerodynamic modeling, is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within the end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its generalization capabilities over different airfoils are verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. Given the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis has revealed a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which shows the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
Keywords: transonic flow; data-driven modeling; feature embedding; heterogeneous data; feature visualization
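A simplified sketch of the core idea — compress distributed pressure data into low-dimensional physical features and use them alongside flow conditions when modeling integrated loads. The paper embeds the features inside an end-to-end deep network as constraints; here they are merely concatenated as auxiliary inputs, and all data shapes are invented:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Hypothetical data: flow conditions, surface Cp distributions, integrated loads.
X_cond = rng.normal(size=(200, 2))      # (Mach, angle of attack)
cp_dist = rng.normal(size=(200, 128))   # pressure coefficients at 128 surface points
y_load = rng.normal(size=200)           # integrated load (e.g., lift coefficient)

# Extract low-dimensional physical features from the distributed loads...
pca = PCA(n_components=8).fit(cp_dist)
phys_feat = pca.transform(cp_dist)

# ...then embed them alongside the flow conditions for integrated-load regression.
X = np.hstack([X_cond, phys_feat])
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X, y_load)
```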
5. A Concise and Varied Visual Features-Based Image Captioning Model with Visual Selection
Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Naveed Ahmed, Mohammed Ali Alshara. Computers, Materials & Continua (SCIE, EI), 2024, Issue 11, pp. 2873-2894 (22 pages).
Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, providing image captioning models with the capability of selecting the most relevant visual features from the input image and attending to them can significantly improve the utilization of these features. Consequently, this leads to enhanced captioning network performance. In light of this, we present an image captioning framework that efficiently exploits the extracted representations of the image. Our framework comprises three key components: the Visual Feature Detector module (VFD), the Visual Feature Visual Attention module (VFVA), and the language model. The VFD module is responsible for detecting a subset of the most pertinent features from the local visual features, creating an updated visual features matrix. Subsequently, the VFVA directs its attention to the visual features matrix generated by the VFD, resulting in an updated context vector employed by the language model to generate an informative description. Integrating the VFD and VFVA modules introduces an additional layer of processing for the visual features, thereby contributing to enhancing the image captioning model's performance. Using the MS-COCO dataset, our experiments show that the proposed framework competes well with state-of-the-art methods, effectively leveraging visual representations to improve performance. The implementation code can be found here: https://github.com/althobhani/VFDICM (accessed on 30 July 2024).
Keywords: visual attention; image captioning; visual feature detector; visual feature visual attention
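One plausible reading of the VFD — score local features and keep only the most pertinent subset — is sketched below; the scoring head, the value of k, and the soft gating are assumptions, not the paper's design:

```python
import torch
import torch.nn as nn

class TopKFeatureSelector(nn.Module):
    """Scores local visual features and keeps the top-k most pertinent ones
    (an illustrative stand-in for the paper's Visual Feature Detector)."""
    def __init__(self, feat_dim=2048, k=16):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, 1)  # hypothetical relevance head
        self.k = k

    def forward(self, feats):                        # feats: (B, N, feat_dim)
        scores = self.scorer(feats).squeeze(-1)      # (B, N) relevance scores
        top = scores.topk(self.k, dim=-1)
        batch = torch.arange(feats.size(0)).unsqueeze(-1)
        selected = feats[batch, top.indices]         # (B, k, feat_dim)
        # A soft gate keeps the scorer trainable through the hard selection.
        return selected * torch.sigmoid(top.values).unsqueeze(-1)

feats = torch.randn(4, 49, 2048)                     # e.g., 7x7 grid features
print(TopKFeatureSelector(k=16)(feats).shape)        # torch.Size([4, 16, 2048])
```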
6. Video-Based Deception Detection with Non-Contact Heart Rate Monitoring and Multi-Modal Feature Selection
Authors: Yanfeng Li, Jincheng Bian, Yiqun Gao, Rencheng Song. Journal of Beijing Institute of Technology (EI, CAS), 2024, Issue 3, pp. 175-185 (11 pages).
Deception detection plays a crucial role in criminal investigation. Videos contain a wealth of information regarding apparent and physiological changes in individuals, and thus can serve as an effective means of deception detection. In this paper, we investigate video-based deception detection considering both apparent visual features, such as eye gaze, head pose and facial action units (AU), and non-contact heart rate detected by the remote photoplethysmography (rPPG) technique. Multiple wrapper-based feature selection methods combined with the K-nearest neighbor (KNN) and support vector machine (SVM) classifiers are employed to screen the most effective features for deception detection. We evaluate the performance of the proposed method on both a self-collected physiological-assisted visual deception detection (PV3D) dataset and a public bag-of-lies (BOL) dataset. Experimental results demonstrate that the SVM classifier with symbiotic organisms search (SOS) feature selection yields the best overall performance, with an area under the curve (AUC) of 83.27% and accuracy (ACC) of 83.33% for PV3D, and an AUC of 71.18% and ACC of 70.33% for BOL. This demonstrates the stability and effectiveness of the proposed method in video-based deception detection tasks.
Keywords: deception detection; apparent visual features; remote photoplethysmography; non-contact heart rate; feature selection
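A sketch of wrapper-based feature selection around an SVM, the family of methods the paper evaluates. Scikit-learn's greedy forward selector stands in for the paper's SOS metaheuristic, and the data here is synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical stand-in data: rows are video samples, columns mix gaze, head-pose,
# AU and rPPG-derived heart-rate features; labels are truthful (0) / deceptive (1).
X, y = make_classification(n_samples=200, n_features=40, n_informative=8, random_state=0)

svm = SVC(kernel="rbf", C=1.0)
# Greedy forward wrapper selection: repeatedly add the feature that most improves
# cross-validated SVM accuracy. (SOS is a population-based metaheuristic search;
# this is a simpler wrapper method of the same family.)
selector = SequentialFeatureSelector(svm, n_features_to_select=10, cv=5).fit(X, y)
X_sel = selector.transform(X)
print(cross_val_score(svm, X_sel, y, cv=5).mean())
```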
7. VTAN: A Novel Video Transformer Attention-Based Network for Dynamic Sign Language Recognition
Authors: Ziyang Deng, Weidong Min, Qing Han, Mengxue Liu, Longfei Li. Computers, Materials & Continua, 2025, Issue 2, pp. 2793-2812 (20 pages).
Dynamic sign language recognition holds significant importance, particularly with the application of deep learning to address its complexity. However, existing methods face several challenges. Firstly, recognizing dynamic sign language requires identifying keyframes that best represent the signs, and missing these keyframes reduces accuracy. Secondly, some methods do not focus enough on hand regions, which are small within the overall frame, leading to information loss. To address these challenges, we propose a novel Video Transformer Attention-based Network (VTAN) for dynamic sign language recognition. Our approach prioritizes informative frames and hand regions effectively. To tackle the first issue, we designed a keyframe extraction module enhanced by a convolutional autoencoder, which focuses on selecting information-rich frames and eliminating redundant ones from the video sequences. For the second issue, we developed a soft attention-based transformer module that emphasizes extracting features from hand regions, ensuring that the network pays more attention to hand information within sequences. This dual-focus approach improves effective dynamic sign language recognition by addressing the key challenges of identifying critical frames and emphasizing hand regions. Experimental results on two public benchmark datasets demonstrate the effectiveness of our network, outperforming most of the typical methods in sign language recognition tasks.
Keywords: dynamic sign language recognition; transformer; soft attention; attention-based visual feature aggregation
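A minimal sketch of autoencoder-assisted keyframe extraction; the latent-distance criterion and threshold below are assumptions, since the abstract does not state VTAN's exact selection rule:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(                                   # a tiny conv encoder; the paper
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # trains it as half of an autoencoder
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

def select_keyframes(frames, thresh=0.5):
    """Keep a frame only if its latent code differs enough from the last kept
    keyframe (an illustrative criterion for dropping redundant frames)."""
    with torch.no_grad():
        z = encoder(frames)                          # (T, 32) latent codes
    keep = [0]
    for t in range(1, z.size(0)):
        if torch.norm(z[t] - z[keep[-1]]) > thresh:  # near-duplicate frames are skipped
            keep.append(t)
    return frames[keep]

video = torch.randn(30, 3, 112, 112)                 # a 30-frame clip
print(select_keyframes(video).shape)
```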
8. An investigation of the visual features of urban street vitality using a convolutional neural network (Cited by: 6)
Authors: Yi Qi, Sonam Chodron Drolma, Xiang Zhang, Jing Liang, Haibing Jiang, Jiangang Xu, Tianhua Ni. Geo-Spatial Information Science (SCIE, CSCD), 2020, Issue 4, pp. 341-351 (11 pages).
As a well-known urban landscape concept used to describe urban space quality, urban street vitality is a subjective human perception of the urban environment that is difficult to evaluate directly from the physical space. The study utilized a modern machine learning computer vision algorithm in the urban built environment to simulate the process, which starts with the visual perception of the urban street landscape and ends with the human reaction to street vitality. By analyzing the optimized trained model, we tried to identify urban street vitality's visual features and evaluate their importance. A region around Mochou Lake in Nanjing, China, was set as our study area. Seven investigators surveyed the area and recorded their evaluation scores on each site's vitality level, with a corresponding picture taken on site. A total of 370 pictures and recorded score pairs from 231 valid survey sites were used to train a convolutional neural network. After optimization, a deep neural network model with 43 layers, including 11 convolutional ones, was created. Heat maps were then used to identify the features which lead to high vitality score outputs. The spatial distributions of different types of feature entities were also analyzed to help identify the spatial effects. The study found that visual features including humans, construction sites, shop fronts, and roadside/walking pavements are vital ones that correspond to the vitality of the urban street. The consistency of these critical features with traditional urban vitality features indicates the model had learned useful knowledge from the training process. Applying the trained model in urban planning practices can help to improve the city environment for better attraction of residents' activities and communications.
Keywords: urban street vitality; visual feature; convolutional neural network; Nanjing, China
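The study's heat-map analysis can be approximated with a Grad-CAM-style computation; the sketch below uses a torchvision ResNet-18 as a stand-in for the paper's 43-layer custom network, whose exact heat-map method is not stated:

```python
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()    # stand-in backbone, not the paper's net
acts, grads = {}, {}
layer = model.layer4                            # last conv stage
layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

img = torch.randn(1, 3, 224, 224)               # a street-view photo would go here
score = model(img)[0].max()                     # top output, standing in for a vitality score
score.backward()

# Grad-CAM: weight each activation map by its pooled gradient, then ReLU.
w = grads["g"].mean(dim=(2, 3), keepdim=True)   # (1, C, 1, 1)
cam = torch.relu((w * acts["a"]).sum(dim=1))    # (1, H, W) coarse importance map
print(cam.shape)
```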
9. Video Concept Detection Based on Multiple Features and Classifiers Fusion (Cited by: 1)
Authors: Dong Yuan, Zhang Jiwei, Zhao Nan, Chang Xiaofu, Liu Wei. China Communications (SCIE, CSCD), 2012, Issue 8, pp. 105-121 (17 pages).
The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the problem of the semantic gap: low-level features extracted by computers always fail to coincide with high-level concepts interpreted by humans. In this paper, we present a generic scheme for the detection of video semantic concepts based on multiple visual features and machine learning. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. Then we combine the different features and sub-systems with both classifier-level and kernel-level fusion, which contribute to a more robust system. Our proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which proves that our concept detection engine develops a generic model and performs well on both object and scene type concepts.
Keywords: concept detection; visual feature extraction; kernel-based learning; classifier fusion
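A minimal sketch of kernel-level fusion as described: per-feature kernels are combined into a single Gram matrix for an SVM. The two feature views, their kernels, and the 0.6/0.4 weights are all invented for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel, chi2_kernel
from sklearn.svm import SVC

# Two hypothetical feature views of the same video shots (e.g., global color
# statistics and local descriptors); chi2 needs non-negative values.
X, y = make_classification(n_samples=150, n_features=60, random_state=0)
view1, view2 = X[:, :30], np.abs(X[:, 30:])

# Kernel-level fusion: a weighted sum of per-view kernels forms one Gram
# matrix, and the SVM trains on the precomputed kernel.
K = 0.6 * rbf_kernel(view1) + 0.4 * chi2_kernel(view2)
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.score(K, y))      # training accuracy on the fused kernel
```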
10. Hard exudates referral system in eye fundus utilizing speeded up robust features (Cited by: 1)
Authors: Syed Ali Gohar Naqvi, Hafiz Muhammad Faisal Zafar, Ihsanul Haq. International Journal of Ophthalmology (English edition) (SCIE, CAS), 2017, Issue 7, pp. 1171-1174 (4 pages).
In this paper, a referral system to assist medical experts in the screening/referral of diabetic retinopathy is suggested. The system has been developed by a sequential use of different existing mathematical techniques. These techniques involve speeded up robust features (SURF), K-means clustering and visual dictionaries (VD). Three databases are mixed to test the working of the system when the sources are dissimilar. When experiments were performed, an area under the curve (AUC) of 0.9343 was attained. The results acquired from the system are promising.
Keywords: referral system; speeded up robust features; eye fundus; visual dictionaries
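The SURF + K-means + visual dictionary pipeline is the classic bag-of-visual-words recipe; a sketch follows, with SIFT substituted for SURF because SURF requires a non-free OpenCV build:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bovw_histograms(images, vocab_size=100):
    """Build a visual dictionary from local descriptors and encode each image
    as a word histogram (the paper uses SURF; SIFT stands in here)."""
    det = cv2.SIFT_create()
    per_image = []
    for img in images:
        _, des = det.detectAndCompute(img, None)
        per_image.append(des if des is not None else np.empty((0, 128), np.float32))
    # K-means over all descriptors defines the visual dictionary.
    kmeans = KMeans(n_clusters=vocab_size, n_init=10).fit(np.vstack(per_image))
    hists = []
    for des in per_image:
        words = kmeans.predict(des) if len(des) else np.array([], int)
        h = np.bincount(words, minlength=vocab_size).astype(float)
        hists.append(h / max(h.sum(), 1.0))   # normalized word histogram
    return np.array(hists)
```

The resulting histograms can then feed any downstream classifier that flags fundus images containing hard exudates for referral.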
11. On Lemon Defect Recognition with Visual Feature Extraction and Transfers Learning
Authors: Yizhi He, Tiancheng Zhu, Mingxuan Wang, Hanqing Lu. Journal of Data Analysis and Information Processing, 2021, Issue 4, pp. 233-248 (16 pages).
Applying machine learning to lemon defect recognition can improve the efficiency of lemon quality detection. This paper proposes a deep learning-based classification method with visual feature extraction and transfer learning to recognize defect lemons (i.e., green and mold defects). First, data enhancement and brightness compensation techniques are used for data preprocessing. The visual feature extraction is used to quantify the defects and determine the feature variables as the basis for classification. Then we construct a convolutional neural network with an embedded Visual Geometry Group 16-based (VGG16-based) network using transfer learning. The proposed model is compared with many benchmark models such as K-nearest Neighbor (KNN) and Support Vector Machine (SVM). Results show that the proposed model achieves the highest accuracy (95.44%) on the testing data set. The research provides a new solution for lemon defect recognition.
Keywords: machine learning; visual feature extraction; convolutional neural networks; transfer learning
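A minimal sketch of the VGG16 transfer-learning setup the abstract describes, assuming three target classes (healthy, green defect, mold defect — the exact label set is not stated):

```python
import torch.nn as nn
from torchvision import models

# VGG16 backbone pretrained on ImageNet, with a replaced head for lemon classes.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False                 # freeze the convolutional features
model.classifier[6] = nn.Linear(4096, 3)    # retrain only the final layer
```

Only the replaced head is trained, which is the usual way to adapt ImageNet features to a small fruit-image dataset.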
12. Deep radio signal clustering with interpretability analysis based on saliency map
Authors: Huaji Zhou, Jing Bai, Yiran Wang, Junjie Ren, Xiaoniu Yang, Licheng Jiao. Digital Communications and Networks (CSCD), 2024, Issue 5, pp. 1448-1458 (11 pages).
With the development of information technology, radio communication technology has made rapid progress. Many radio signals that have appeared in space are difficult to classify without manual labeling. Unsupervised radio signal clustering methods have recently become an urgent need for this situation. Meanwhile, the high complexity of deep learning makes it difficult to understand the decision results of the clustering models, making it essential to conduct interpretable analysis. This paper proposes a combined loss function for unsupervised clustering based on an autoencoder. The combined loss function includes reconstruction loss and deep clustering loss. Deep clustering loss is added on top of reconstruction loss, which makes similar deep features converge more in feature space. In addition, a feature visualization method for signal clustering is proposed to analyze the interpretability of the autoencoder utilizing Saliency Map. Extensive experiments have been conducted on a modulated signal dataset, and the results indicate the superior performance of our proposed method over other clustering algorithms. In particular, for the simulated dataset containing six modulation modes, when the SNR is 20 dB, the clustering accuracy of the proposed method is greater than 78%. The interpretability analysis of the clustering model was performed to visualize the significant features of different modulated signals and verified the high separability of the features extracted by the clustering model.
Keywords: unsupervised radio signal clustering; autoencoder; clustering feature visualization; deep learning interpretability
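One plausible form of the combined loss — reconstruction error plus a deep-clustering term pulling latent codes toward their nearest centroid; the exact clustering term and the weight alpha are assumptions:

```python
import torch
import torch.nn.functional as F

def combined_loss(x, x_rec, z, centroids, alpha=0.1):
    """Reconstruction loss plus a clustering term that pulls each latent code
    toward its nearest centroid (one plausible reading of the combined loss;
    the paper's exact clustering term may differ)."""
    rec = F.mse_loss(x_rec, x)
    d = torch.cdist(z, centroids)             # (B, K) distances to centroids
    cluster = d.min(dim=1).values.pow(2).mean()
    return rec + alpha * cluster

x = torch.randn(32, 2, 128)                   # e.g., I/Q samples of modulated signals
x_rec = torch.randn(32, 2, 128)               # autoencoder reconstruction
z = torch.randn(32, 16)                       # latent codes from the encoder
centroids = torch.randn(6, 16)                # six modulation modes (per the abstract)
print(combined_loss(x, x_rec, z, centroids))
```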
13. Enhancing visual feature constraints in segmentation models for photovoltaic panel recognition
Authors: Zhiyu Zhao, Kangning Li, Yunhao Chen, Jinyang Wang. Energy and AI, 2025, Issue 3, pp. 963-981 (19 pages).
The integration of remote sensing and artificial intelligence technologies into photovoltaic (PV) power generation has significantly enhanced the efficiency and precision of monitoring and evaluating PV station construction. However, most semantic segmentation models are primarily developed for natural scenes, often neglecting the distinctive visual attributes of PV panels. We introduce a visual feature constraint method designed to tailor the segmentation network to the unique aspects of PV panels, including their texture, color, and shape. The method incorporates a constraint module, comprised of three adversarial autoencoders, into a conventional segmentation model. This technique represents a versatile training framework that can be seamlessly integrated with state-of-the-art models, providing clear insights into the learning process. Experimental results with UperNet, SegFormer, DeepLabV3+, TransUNet, CorrMatch, SCSM and UKAN as baseline models show a maximum IoU improvement of 2.16%. Notably, UperNet attains the best segmentation outcomes, whereas DeepLabV3+ exhibits the greatest benefit from the imposed constraints. Furthermore, our findings reveal that various models exhibit distinct sensitivities to different visual features, and employing multiple constraints typically yields better results than relying on single-feature constraints. Collectively, our proposed method showcases its potential to advance PV panel segmentation in remote sensing applications, presenting a scalable and effective solution.
Keywords: photovoltaic panels; visual features; feature constraints; segmentation
14. Robust Local Light Field Synthesis via Occlusion-aware Sampling and Deep Visual Feature Fusion
Authors: Wenpeng Xing, Jie Chen, Yike Guo. Machine Intelligence Research (EI, CSCD), 2023, Issue 3, pp. 408-420 (13 pages).
Novel view synthesis has attracted tremendous research attention recently for its applications in virtual reality and immersive telepresence. Rendering a locally immersive light field (LF) based on arbitrary large baseline RGB references is a challenging problem that lacks efficient solutions with existing novel view synthesis techniques. In this work, we aim at truthfully rendering local immersive novel views/LF images based on large baseline LF captures and a single RGB image in the target view. To fully explore the precious information from source LF captures, we propose a novel occlusion-aware source sampler (OSS) module which efficiently transfers the pixels of source views to the target view's frustum in an occlusion-aware manner. An attention-based deep visual fusion module is proposed to fuse the revealed occluded background content with a preliminary LF into a final refined LF. The proposed source sampling and fusion mechanism not only helps to provide information for occluded regions from varying observation angles, but also proves to be able to effectively enhance the visual rendering quality. Experimental results show that our proposed method is able to render high-quality LF images/novel views with sparse RGB references and outperforms state-of-the-art LF rendering and novel view synthesis methods.
Keywords: novel view synthesis; light field (LF) imaging; multi-view stereo; occlusion sampling; deep visual feature (DVF) fusion
15. Visual feature inter-learning for sign language recognition in emergency medicine
Authors: WEI Chao, LI Yunpeng, LIU Jingze. Optoelectronics Letters, 2025, Issue 10, pp. 619-625 (7 pages).
Accessible communication based on sign language recognition (SLR) is the key to emergency medical assistance for the hearing-impaired community. Balancing the capture of both local and global information in SLR for emergency medicine poses a significant challenge. To address this, we propose a novel approach based on the inter-learning of visual features between global and local information. Specifically, our method enhances the perception capabilities of the visual feature extractor by strategically leveraging the strengths of convolutional neural networks (CNN), which are adept at capturing local features, and visual transformers, which perform well at perceiving global features. Furthermore, to mitigate the issue of overfitting caused by the limited availability of sign language data for emergency medical applications, we introduce an enhanced short temporal module for data augmentation through additional subsequences. Experimental results on three publicly available sign language datasets demonstrate the efficacy of the proposed approach.
Keywords: sign language recognition (SLR); visual feature inter-learning; emergency medicine; visual feature extractor
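A minimal sketch of the CNN-plus-transformer combination the abstract describes — per-frame CNN features for local cues, a transformer encoder for global temporal context. The fusion scheme, backbone, and dimensions are assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

class HybridSLREncoder(nn.Module):
    """Per-frame CNN features (local cues) refined by a transformer encoder
    (global context) -- a sketch of combining the two families the paper
    inter-learns; the fusion details here are assumptions."""
    def __init__(self, d_model=512, n_classes=100):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()                       # 512-d frame features
        self.cnn = backbone
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, clips):                             # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        f = self.cnn(clips.flatten(0, 1)).view(b, t, -1)  # local features per frame
        g = self.temporal(f)                              # global inter-frame context
        return self.head(g.mean(dim=1))                   # clip-level sign logits

logits = HybridSLREncoder()(torch.randn(2, 16, 3, 112, 112))
print(logits.shape)   # torch.Size([2, 100])
```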
16. Visual learning graph convolution for multi-grained orange quality grading (Cited by: 1)
Authors: GUAN Zhi-bin, ZHANG Yan-qi, CHAI Xiu-juan, CHAI Xin, ZHANG Ning, ZHANG Jian-hua, SUN Tan. Journal of Integrative Agriculture (SCIE, CAS, CSCD), 2023, Issue 1, pp. 279-291 (13 pages).
The quality of oranges is grounded on their appearance and diameter. Appearance refers to the skin's smoothness and surface cleanliness; diameter refers to the transverse diameter size. They are visual attributes that visual perception technologies can automatically identify. Nonetheless, current orange quality assessment needs to address two issues: 1) there are no image datasets for orange quality grading; 2) it is challenging to effectively learn the fine-grained and distinct visual semantics of oranges from diverse angles. This study collected 12522 images from 2087 oranges for multi-grained grading tasks. In addition, it presented a visual learning graph convolution approach for multi-grained orange quality grading, including a backbone network and a graph convolutional network (GCN). The backbone network's object detection, data augmentation, and feature extraction can remove extraneous visual information. The GCN was utilized to learn the topological semantics of orange feature maps. Finally, evaluation results proved that the recognition accuracies of diameter size, appearance, and fine-grained orange quality were 99.50%, 97.27%, and 97.99%, respectively, indicating that the proposed approach is superior to others.
Keywords: GCN; multi-view; fine-grained; visual feature; appearance; diameter size
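For reference, one standard graph-convolution step of the kind a GCN stacks (Kipf-Welling normalization); the node and edge construction for orange feature maps below is invented:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighboring node features through
    a normalized adjacency, then transform (Kipf & Welling formulation)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        a = adj + torch.eye(adj.size(0))            # add self-loops
        deg = a.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        a_norm = d_inv_sqrt @ a @ d_inv_sqrt        # D^-1/2 (A+I) D^-1/2
        return torch.relu(self.w(a_norm @ x))

# Hypothetical use: nodes are positions in an orange's CNN feature map;
# edges encode spatial adjacency between positions/views.
x = torch.randn(49, 256)                            # 49 nodes, 256-d features
adj = (torch.rand(49, 49) > 0.8).float()
print(GCNLayer(256, 128)(x, adj).shape)             # torch.Size([49, 128])
```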
17. Webpage Matching Based on Visual Similarity
Authors: Mengmeng Ge, Xiangzhan Yu, Lin Ye, Jiantao Shi. Computers, Materials & Continua (SCIE, EI), 2022, Issue 5, pp. 3393-3405 (13 pages).
With the rapid development of the Internet, the types of webpages are more abundant than in previous decades. However, people are facing more and more significant network security risks and enormous losses caused by phishing webpages, which imitate the interface of real webpages and deceive the victims. To better identify and distinguish phishing webpages, a visual feature extraction method and a visual similarity algorithm are proposed. First, the visual feature extraction method improves the Vision-based Page Segmentation (VIPS) algorithm to extract the visual blocks and calculate their signatures by perceptual hash technology. Second, the visual similarity algorithm establishes a one-to-one correspondence based on the visual blocks' coordinates and thresholds. Then the weights are assigned according to the tree structure, and the similarity of the visual blocks is calculated on the basis of the Hamming distance between the visual features. Further, the visual similarity of webpages is generated by integrating the similarity and weight of different visual blocks. Finally, multiple pairs of phishing webpages and legitimate webpages are evaluated to verify the feasibility of the algorithm. The experimental results achieve excellent performance and demonstrate that our method can achieve 94% accuracy.
Keywords: web security; visual feature; perceptual hash; visual similarity
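The visual-block signature and Hamming-distance comparison can be illustrated with a simple perceptual hash; the average-hash variant below is an assumption, as the paper does not pin down which perceptual hash it uses:

```python
import numpy as np
from PIL import Image

def average_hash(img, size=8):
    """64-bit perceptual hash: downscale, grayscale, threshold at the mean
    (a simple member of the perceptual-hash family)."""
    small = np.asarray(img.convert("L").resize((size, size)), dtype=float)
    return (small > small.mean()).flatten()

def hamming_similarity(h1, h2):
    return 1.0 - np.count_nonzero(h1 != h2) / h1.size

# Hypothetical screenshots of two rendered visual blocks.
a = average_hash(Image.open("suspect_block.png"))
b = average_hash(Image.open("legitimate_block.png"))
print(hamming_similarity(a, b))   # near 1.0 suggests visually similar blocks
```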
18. Structured Computational Modeling of Human Visual System for No-reference Image Quality Assessment
Authors: Wen-Han Zhu, Wei Sun, Xiong-Kuo Min, Guang-Tao Zhai, Xiao-Kang Yang. International Journal of Automation and Computing (EI, CSCD), 2021, Issue 2, pp. 204-218 (15 pages).
Objective image quality assessment (IQA) plays an important role in various visual communication systems, as it can automatically and efficiently predict the perceived quality of images. The human eye is the ultimate evaluator of visual experience, thus the modeling of the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. The traditional model based on black-box fitting has low interpretability and is difficult to use for guiding experience optimization effectively, while the model based on physiological simulation is hard to integrate into practical visual communication services due to its high computational complexity. To bridge the gap between signal distortion and visual experience, in this paper we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. According to the mechanism of the human brain, we divide visual signal processing into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which conduct pixel information processing, primitive information processing and global image information processing, respectively. The natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that our proposed metric is highly competitive with or outperforms state-of-the-art NR IQA measures.
Keywords: image quality assessment (IQA); no-reference (NR); structural computational modeling; human visual system; visual feature extraction
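A sketch of the final aggregation stage: features from the three visual layers are concatenated and regressed onto subjective quality scores with SVR. The feature dimensions and data here are synthetic placeholders:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical per-image features from the three layers described above:
# NSS statistics (low-level), deep features (middle), free-energy terms (high).
nss = rng.normal(size=(300, 36))
deep = rng.normal(size=(300, 128))
energy = rng.normal(size=(300, 8))
mos = rng.uniform(0, 100, size=300)            # subjective mean opinion scores

X = np.hstack([nss, deep, energy])             # aggregate the three feature groups
model = SVR(kernel="rbf", C=10.0).fit(X, mos)  # regress perceived quality
print(model.predict(X[:3]))
```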
19. Historical Arabic Images Classification and Retrieval Using Siamese Deep Learning Model
Authors: Manal M. Khayyat, Lamiaa A. Elrefaei, Mashael M. Khayyat. Computers, Materials & Continua (SCIE, EI), 2022, Issue 7, pp. 2109-2125 (17 pages).
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded color images. Thus, there have been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch, but it did not generate high evaluation metrics. Thus, we re-built it from the VGG19 pre-trained deep learning model to generate higher evaluation metrics. Afterward, three different distance metrics combined with the Sigmoid activation function were tested, looking for the most accurate method for measuring the similarities among the retrieved images. The highest evaluation parameters were generated using the cosine distance metric. Moreover, the Graphics Processing Unit (GPU) was utilized to run the code instead of running it on the Central Processing Unit (CPU). This step optimized the execution further since it expedited both the training and the retrieval time efficiently. After extensive experimentation, we reached a satisfactory solution, recording 0.98 and 0.99 F-scores for the classification and the retrieval, respectively.
Keywords: visual feature vectors; deep learning models; distance methods; similar image retrieval
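A minimal sketch of the retrieval-side similarity computation: the same VGG19 branch embeds both images and cosine similarity compares them. The paper's full Siamese head (distance metrics plus a Sigmoid) is simplified away here:

```python
import torch
import torch.nn.functional as F
from torchvision import models

vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()
vgg.classifier = vgg.classifier[:5]            # keep fc layers up to a 4096-d embedding

def similarity(img1, img2):
    """Cosine similarity between VGG19 embeddings of two images -- the same
    branch weights applied to both inputs, as in a Siamese arrangement."""
    with torch.no_grad():
        e1, e2 = vgg(img1), vgg(img2)
    return F.cosine_similarity(e1, e2).item()

x1, x2 = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
print(similarity(x1, x2))                      # closer to 1.0 = more similar
```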
20. Book Retrieval Method Based on QR Code and CBIR Technology
Authors: Qiuyan Wang, Haibing Dong. Journal on Artificial Intelligence, 2019, Issue 2, pp. 101-110 (10 pages).
Applying mature and cutting-edge information technology to library information retrieval is the development trend of library information management. In order to realize the rapid retrieval of massive book information, this paper proposes a book retrieval method combining QR codes with image retrieval technology. This method analyzes the visual features of book images, designs a book image retrieval method based on boundary contour and regional pixel distribution features, and realizes the associated retrieval of book information combined with the QR code, so as to improve the efficiency of book retrieval. The experimental results show that books can be retrieved effectively through the boundary contour and regional pixel distribution features, book information can be displayed through QR codes, and readers can be provided with fast and intelligent massive book retrieval services.
Keywords: book retrieval; image retrieval; QR code; visual features
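The QR half of the pipeline can be sketched with OpenCV's built-in detector; the fallback to image-feature retrieval and the file names are assumptions:

```python
import cv2

detector = cv2.QRCodeDetector()

def lookup_book(cover_image_path):
    """Decode a QR code on the book image and use its payload as the retrieval
    key (a minimal sketch of the QR half of the pipeline; the CBIR half would
    match boundary-contour/pixel-distribution features instead)."""
    img = cv2.imread(cover_image_path)
    payload, points, _ = detector.detectAndDecode(img)
    if points is None or not payload:
        return None                # fall back to image-feature retrieval
    return payload                 # e.g., a catalog ID or record URL

print(lookup_book("shelf_photo.jpg"))   # hypothetical input image
```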