23 articles found
Analysis on the visual recognition effect of Tibetan-Chinese bilingual guide signs (cited by: 1)
1
Authors: Chen Fei, Su Guanhong, Zhang Danni, Wang Chenzhu, Zhang Yunlong, Bo Wu. Source: Journal of Southeast University (English Edition) [EI, CAS], 2023, No. 3, pp. 292-300 (9 pages)
The influence of Tibetan characters on the visual recognition effects of Tibetan-Chinese bilingual guide signs, based on drivers' visual characteristics, was studied. Four versions of Tibetan-Chinese bilingual guide signs with different heights and aspect ratios of Tibetan characters were designed, and corresponding road simulation models were established. Ten Tibetan drivers and ten Han drivers were selected to conduct driving simulation experiments using a driving simulator and an eye tracker. The participants' pupil diameters and visual recognition durations obtained from the eye tracker system were analyzed by analysis of variance. Combining the statistical analysis of the driving simulator data with the questionnaire results on the visual recognition experience, it can be concluded that, for Tibetan drivers, when the height of Tibetan characters was 2/3 of the height of Chinese characters, the visual recognition effect of the signs was better than at 1/3 or 1/2 of the height of Chinese characters, indicating that increasing the height of Tibetan characters was conducive to improving the visual recognition effect of guide signs. The aspect ratio of Tibetan characters had no significant effect on the difficulty of drivers' visual recognition, but it would affect the aesthetics of the bilingual guide signs. The recommended character height in Tibetan should be increased to improve the visual recognition process for Tibetan drivers.
Keywords: Tibetan-Chinese bilingual guide sign; driving simulation; visual recognition characteristics
Design of an Intelligent Robotic Excavator Based on Binocular Visual Recognition Technique (cited by: 1)
2
Authors: ZHANG Xin, LIU Jing, WEN Huai-xing. Source: International Journal of Plant Engineering and Management, 2009, No. 1, pp. 48-51 (4 pages)
Research on intelligent robotic excavators has become a focus both at home and abroad, and this type of excavator is becoming increasingly important in applications. In this paper, we developed a control system that enables the intelligent robotic excavator to perform excavating operations autonomously. It can recognize the excavating targets by itself, plan the operation automatically based on the original parameters, and finish all the tasks. Experimental results indicate the validity of the control system in terms of real-time performance and precision. The intelligent robotic excavator can remarkably ease labor intensity and enhance working efficiency.
Keywords: excavating robot; binocular visual recognition; distributed control system; trajectory tracing
Visual recognition of melamine in milk via selective metallo-hydrogel formation
3
Authors: Xiaoling Bao, Jianhong Liu, Qingshu Zheng, Wei Pei, Yimei Yang, Yanyun Dai, Tao Tu. Source: Chinese Chemical Letters [SCIE, CAS, CSCD], 2019, No. 12, pp. 2266-2270 (5 pages)
A series of novel six-coordinated terpyridine zinc complexes, containing ammonium salts and a thymine fragment at the two terminals, have been designed and synthesized. They can function as highly sensitive visual sensors for melamine detection via selective metallo-hydrogel formation. After full characterization by various techniques, the complementary triple hydrogen bonding between the thymine fragment and melamine, as well as π-π stacking interactions, may be responsible for the selective metallo-hydrogel formation. In light of the possible interference caused by milk ingredients (proteins, peptides and amino acids) and legal/illegal additives (urine, sugars and vitamins), a series of control experiments were carried out. To our delight, this visual recognition is highly selective: no gelation was observed with the selected milk ingredients or additives. Remarkably, this newly developed protocol enables convenient and highly selective visual recognition of melamine at a concentration as low as 10 ppm in raw milk without any tedious pretreatment.
Keywords: Hydrogen-bonding interaction; Pincer zinc complex; Melamine; Metallo-hydrogel; visual recognition
Preliminary study on visual recognition under low visibility conditions caused by artificial dynamic smog
4
Authors: Xu-Hong Zhang, Zhe-Yi Chen, Bin-Bin Su, Karunanedi Soobraydoo, Hao-Ran Wu, Qin-Zhuan Ren, Lu Sun, Fan Lyu, Jun Jiang. Source: International Journal of Ophthalmology (English edition) [SCIE, CAS], 2018, No. 11, pp. 1821-1828 (8 pages)
AIM: To quantitatively evaluate the effect of a simulated smog environment on human visual function by psychophysical methods. METHODS: The smog environment was simulated in a 40×40×60 cm³ glass chamber filled with a PM2.5 aerosol, and 14 subjects with normal visual function were examined by psychophysical methods with the foggy smog box placed in front of their eyes. The transmission of light through the smog box, an indication of the percentage concentration of smog, was determined with a luminance meter. Visual function under different smog concentrations was evaluated by E-visual acuity, crowded E-visual acuity and contrast sensitivity. RESULTS: E-visual acuity, crowded E-visual acuity and contrast sensitivity were all impaired with a decrease in the transmission rate (TR) according to power functions, with invariable exponents of -1.41, -1.62 and -0.7, respectively, and R² values of 0.99 for E-visual acuity and crowded E-visual acuity and 0.96 for contrast sensitivity. Crowded E-visual acuity decreased faster than E-visual acuity. There was a good correlation between the TR, the extinction coefficient and visibility under heavy-smog conditions. CONCLUSION: Increases in smog concentration have a strong effect on visual function.
Keywords: visual recognition; low visibility conditions; artificial smog
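The power-function relationship reported in the abstract above can be illustrated with a short sketch. The abstract gives only the exponents, so the form y = a · TR^b, the function name `relative_change`, and the choice to look at a halving of TR are all assumptions made here for illustration; the measured quantity y and the coefficient a are not stated in the abstract.

```python
# Minimal sketch, assuming each measured quantity follows y = a * TR**b,
# where TR is the transmission rate and b is the exponent reported in the abstract.
# The coefficient a is unknown, so only ratios between two TR values are computed.

# Exponents as reported in the abstract.
EXPONENTS = {
    "E-visual acuity": -1.41,
    "crowded E-visual acuity": -1.62,
    "contrast sensitivity": -0.7,
}


def relative_change(tr_ratio: float, exponent: float) -> float:
    """Factor by which y = a * TR**exponent changes when TR is multiplied by tr_ratio."""
    if tr_ratio <= 0:
        raise ValueError("tr_ratio must be positive")
    return tr_ratio ** exponent


if __name__ == "__main__":
    # Hypothetical scenario: the transmission rate drops to half its previous value.
    for name, b in EXPONENTS.items():
        print(f"{name}: factor {relative_change(0.5, b):.2f}")
```

Because the crowded E-visual acuity exponent has the largest magnitude, its factor is the largest for the same drop in TR, which is consistent with the abstract's statement that crowded E-visual acuity decreased faster than E-visual acuity.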
Research on ancient town style construction strategies based on coupled quantitative analysis of AI visual recognition and scenic beauty evaluation
5
Authors: Wu Jin, Bifeng Zhu, Hiroatsu Fukuda. Source: Frontiers of Architectural Research, 2025, No. 3, pp. 654-671 (18 pages)
How to create scenery is the key issue in ancient towns. In this study, 50 photos were collected and distributed through the Internet. First, 456 online questionnaires yielding 25,080 data points were collected. Respondents' preferences were affected by gender, age, region, profession, and education. Second, the SAM computer model was applied to image recognition of Wuzhen-style photos to analyze their visual elements. Third, SPSS software was used to analyze the correlation between the subjective beauty degree score and the objective landscape elements. Based on the coupled quantitative analysis of AI visual recognition and the beauty degree score, the landscape elements that tourists cared most about were found to be water bodies, ancient buildings and boats. The best proportions of landscape elements for the spatial sense of the ancient town are: sky, from 26.4% to 38.2%; water bodies, from 19.7% to 34.3%; and buildings, from 10.4% to 38.2%. This study reveals how different types of tourists evaluate the landscape and summarizes the landscape construction strategy of ancient towns in Jiangnan accordingly. The results not only benefit the cultural tourism of Wuzhen but can also be applied to many ancient towns in Jiangnan.
Keywords: create scenery; computer model; analyze correlation; AI visual recognition; quantitative analysis; online questionnaires; scenic beauty evaluation; ancient towns
CerfeVPR: Cross-Environment Robust Feature Enhancement for Visual Place Recognition
6
Authors: Lingyun Xiang, Hang Fu, Chunfang Yang. Source: Computers, Materials & Continua, 2025, No. 7, pp. 325-345 (21 pages)
In the Visual Place Recognition (VPR) task, existing research has leveraged large-scale pre-trained models to improve the performance of place recognition. However, when there are significant environmental differences between query images and reference images, a large number of ineffective local features will interfere with the extraction of key landmark features, leading to the retrieval of visually similar but geographically different images. To address this perceptual aliasing problem caused by environmental condition changes, we propose a novel Visual Place Recognition method with Cross-Environment Robust Feature Enhancement (CerfeVPR). This method uses a GAN network to generate similar images of the original images under different environmental conditions, thereby enhancing the learning of robust features of the original images. This enables the global descriptor to effectively ignore appearance changes caused by environmental factors such as seasons and lighting, showing better place recognition accuracy than other methods. Meanwhile, we introduce a large kernel convolution adapter to fine-tune the pre-trained model, obtaining a better image feature representation for subsequent robust feature learning. Then, we process the information of different local regions in the general features through a 3-layer pyramid scene parsing network and fuse it with a tag that retains global information to construct a multi-dimensional image feature representation. Based on this, we use the fused features of similar images to drive the robust feature learning of the original images and complete the feature matching between query images and retrieved images. Experiments on multiple commonly used datasets show that our method exhibits excellent performance. On average, CerfeVPR achieves the highest results, with all Recall@N values exceeding 90%. In particular, on the highly challenging Nordland dataset, the R@1 metric is improved by 4.6%, significantly outperforming other methods, which fully verifies the superiority of CerfeVPR in visual place recognition under complex environments.
Keywords: visual place recognition; cross-environment robustness; pre-trained model; feature learning
DIEONet:Domain-Invariant Information Extraction and Optimization Network for Visual Place Recognition
7
Authors: Shaoqi Hou, Zebang Qin, Chenyu Wu, Guangqiang Yin, Xinzhong Wang, Zhiguo Wang. Source: Computers, Materials & Continua, 2025, No. 3, pp. 5019-5033 (15 pages)
Visual Place Recognition (VPR) technology aims to use visual information to judge the location of agents, and it plays an irreplaceable role in tasks such as loop closure detection and relocation. It is well known that previous VPR algorithms emphasize the extraction and integration of general image features, while ignoring the mining of salient features that play a key role in the discrimination of VPR tasks. To this end, this paper proposes a Domain-invariant Information Extraction and Optimization Network (DIEONet) for VPR. The core of the algorithm is a newly designed Domain-invariant Information Mining Module (DIMM) and a Multi-sample Joint Triplet Loss (MJT Loss). Specifically, DIMM incorporates the interdependence between different spatial regions of the feature map in the cascaded convolutional unit group, which enhances the model's attention to the domain-invariant static object class. MJT Loss introduces the "joint processing of multiple samples" mechanism into the original triplet loss and adds a new distance constraint term for "positive and negative" samples, so that the model can avoid falling into a local optimum during training. We demonstrate the effectiveness of our algorithm by conducting extensive experiments on several authoritative benchmarks. In particular, the proposed method achieves the best performance on the TokyoTM dataset with a Recall@1 metric of 92.89%.
Keywords: visual place recognition; domain-invariant information mining module; multi-sample joint triplet loss
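For readers unfamiliar with the "original triplet loss" that the abstract above says MJT Loss extends, here is a minimal sketch of the standard single-triplet formulation. It is not the MJT Loss itself: the abstract does not give its formula, so the multi-sample mechanism and the extra distance constraint are deliberately left out. The function names, the Euclidean distance and the default margin are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 0.1) -> float:
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an illustrative default, not from the paper).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


if __name__ == "__main__":
    query = [0.0, 0.0]        # hypothetical descriptor of a query image
    same_place = [0.0, 1.0]   # hypothetical descriptor of the same place
    other_place = [3.0, 4.0]  # hypothetical descriptor of a different place
    print(triplet_loss(query, same_place, other_place, margin=0.5))
    print(triplet_loss(query, other_place, same_place, margin=0.5))
```

In the first call the negative is already far enough away, so the loss vanishes; in the second the roles are swapped and the loss is positive, which is the signal that pushes the descriptors apart during training.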
Detection and Recognition of Spray Code Numbers on Can Surfaces Based on OCR
8
Authors: Hailong Wang, Junchao Shi. Source: Computers, Materials & Continua [SCIE, EI], 2025, No. 1, pp. 1109-1128 (20 pages)
A two-stage algorithm based on deep learning for the detection and recognition of can bottom spray codes and numbers is proposed to address the problems of small character areas and fast production line speeds in can bottom spray code number recognition. In the coding number detection stage, the Differentiable Binarization Network is used as the backbone network, combined with the Attention and Dilation Convolutions Path Aggregation Network feature fusion structure to enhance the model detection effect. In terms of text recognition, using the Scene Visual Text Recognition coding number recognition network for end-to-end training can alleviate the problem of coding recognition errors caused by image color distortion due to variations in lighting and background noise. In addition, model pruning and quantization are used to reduce the number of model parameters to meet deployment requirements in resource-constrained environments. A comparative experiment was conducted using the dataset of tank bottom spray code numbers collected on-site, and a transfer experiment was conducted using the dataset of packaging box production dates. The experimental results show that the algorithm proposed in this study can effectively locate the coding of cans at different positions on the roller conveyor and can accurately identify the coding numbers at high production line speeds. The Hmean value of the coding number detection is 97.32%, and the accuracy of the coding number recognition is 98.21%. This verifies that the algorithm proposed in this paper has high accuracy in coding number detection and recognition.
Keywords: Can coding recognition; differentiable binarization network; scene visual text recognition; model pruning and quantification; transport model
Visualization of flatness pattern recognition based on T-S cloud inference network (cited by: 2)
9
Authors: 张秀玲, 赵亮, 臧佳音, 樊红敏. Source: Journal of Central South University [SCIE, EI, CAS, CSCD], 2015, No. 2, pp. 560-566 (7 pages)
Flatness pattern recognition is the key to flatness control. The accuracy of present flatness pattern recognition is limited, and shape defects cannot be reflected intuitively. To improve on this, a novel method based on a T-S cloud inference network optimized by a genetic algorithm (GA) is proposed. The T-S cloud inference network is constructed from a T-S fuzzy neural network and the cloud model, so both the rapidity of fuzzy logic and the cloud model's handling of uncertainty in data processing are taken into account. Moreover, the GA possesses a good parallel design structure and global optimization characteristics. Compared with the simulated recognition results of the traditional BP algorithm, the GA is more accurate and effective. In addition, virtual reality technology is introduced into the field of shape control through mixed programming with LabVIEW and MATLAB, and a virtual flatness pattern recognition interface is designed. The engineering analysis data and the actual model are thereby combined, and shape defects can be seen more vividly and intuitively.
Keywords: pattern recognition; T-S cloud inference network; cloud model; mixed programming; virtual reality; visual recognition
Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies (cited by: 1)
10
Authors: Yang Wu, Ding-Heng Wang, Xiao-Tong Lu, Fan Yang, Man Yao, Wei-Sheng Dong, Jian-Bo Shi, Guo-Qi Li. Source: Machine Intelligence Research [EI, CSCD], 2022, No. 5, pp. 366-411 (46 pages)
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and meets strong industrial needs. In particular, modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted recognition performance on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Although recognition accuracy is usually the first concern for new progress, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required by the entire community. While general surveys on the efficiency issue have been done from various perspectives, as far as we are aware, scarcely any of them has focused on visual recognition systematically, and thus it is unclear which advances are applicable to it and what else should be considered. In this survey, we review recent advances and offer our suggestions on possible new directions towards improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only from the model but also from the data point of view (which is not the case in existing surveys) and focus on four typical data types (images, video, points, and events). This survey attempts to provide a systematic summary that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.
Keywords: visual recognition; deep neural networks (DNNs); brain-inspired methodologies; network compression; dynamic inference; survey
METHODS OF VISUAL RECOGNITION, POSITIONING AND ORIENTATING OF 3D SIMPLE GEOMETRIC WORKPIECE
11
Authors: 王向军, 王以忠, 叶声华. Source: Transactions of Tianjin University [EI, CAS], 1998, No. 2, pp. 36-40 (5 pages)
The methods of visual recognition, positioning and orientating of simple 3D geometric workpieces are presented in this paper. The principle and operating process of multiple orientation run length coding, based on general orientation run length coding, and the visual recognition method are described in detail. The method of positioning and orientating based on the moment of inertia of the workpiece binary image is also stated. It has been applied in research on a flexible automatic coordinate measuring system formed by integrating computer aided design, computer vision and computer aided inspection planning with a coordinate measuring machine. The results show that integrating computer vision with the measurement system is a feasible and effective approach to improving its flexibility and automation.
Keywords: automatic measurement; visual recognition; visual positioning; visual orientating; coordinate measuring machine
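The abstract above mentions positioning and orientating a workpiece from the moment of inertia of its binary image. A common way to do this, sketched below, takes the centroid as the position and the principal axis of the second-order central moments as the orientation. Whether the paper uses exactly this formulation is an assumption; the function name `centroid_and_orientation` and the list-of-rows image format are hypothetical.

```python
import math
from typing import List, Tuple


def centroid_and_orientation(image: List[List[int]]) -> Tuple[float, float, float]:
    """Return (cx, cy, theta) for the foreground pixels of a binary image.

    image is a list of rows holding 0 (background) or 1 (workpiece).
    theta is the principal-axis angle in radians, measured from the x axis
    in image coordinates (y grows downwards).
    """
    pixels = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("image contains no foreground pixels")
    n = len(pixels)
    # Position: centroid of the foreground pixels.
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    # Second-order central moments (the "moments of inertia" of the shape).
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    # Orientation: angle of the axis of least inertia.
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return cx, cy, theta


if __name__ == "__main__":
    bar = [[0, 0, 0, 0, 0],
           [1, 1, 1, 1, 1],
           [0, 0, 0, 0, 0]]
    print(centroid_and_orientation(bar))
```

The angle is only defined up to a half turn, and it becomes ill-conditioned for shapes whose two principal moments are nearly equal (a square or a disc, for example), so a practical system would need an additional cue for such workpieces.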
Recognition and rejection of foreign eggs of different colors in Barn Swallows
12
Authors: Kui Yan, Wei Liang. Source: Avian Research [SCIE, CSCD], 2024, No. 3, pp. 374-378 (5 pages)
Brood parasitic birds lay eggs in the nests of other birds, and the parasitized hosts can reduce the cost of raising unrelated offspring through the recognition of parasitic eggs. Hosts can adopt vision-based cognitive mechanisms to recognize foreign eggs by comparing the colors of foreign and host eggs. However, there is currently no uniform conclusion as to whether this comparison involves the single or the multiple threshold decision rule. In this study, we tested both hypotheses by adding model eggs of different colors to the nests of Barn Swallows (Hirundo rustica) of two geographical populations breeding in Hainan and Heilongjiang Provinces in China. Results showed that Barn Swallows rejected more white model eggs (moderately mimetic of their own eggs) and blue model eggs (highly non-mimetic eggs with a shorter reflectance spectrum) than red model eggs (highly non-mimetic eggs with a longer reflectance spectrum). There was no difference in the rejection rate of model eggs between the two populations of Barn Swallows, and clutch size was not a factor affecting egg recognition. Our results are consistent with the single rejection threshold model. This study provides strong experimental evidence that the color of model eggs can have an important effect on egg recognition in Barn Swallows, opening up new avenues to uncover the evolution of cuckoo egg mimicry and explore the cognitive mechanisms underlying the visual recognition of foreign eggs by hosts.
Keywords: Barn Swallow; Egg color; Hirundo rustica; Multiple rejection threshold; Single rejection threshold; visual recognition system
Deep Learning-Based Approach for Arabic Visual Speech Recognition
13
Authors: Nadia H. Alsulami, Amani T. Jamal, Lamiaa A. Elrefaei. Source: Computers, Materials & Continua [SCIE, EI], 2022, No. 4, pp. 85-108 (24 pages)
Lip-reading technologies are progressing rapidly following the breakthrough of deep learning. They play a vital role in many applications, such as human-machine communication and security. In this paper, we propose an effective lip-reading recognition model for Arabic visual speech recognition by implementing deep learning algorithms. The Arabic visual datasets that have been collected contain 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. Firstly, we extract keyframes from our dataset. Secondly, we produce Concatenated Frame Images (CFIs) that represent the utterance sequence in one single image. Finally, VGG-19 is employed for visual feature extraction in our proposed model. We have examined different numbers of keyframes (10, 15, and 20) to compare two approaches in the proposed model: (1) the VGG-19 base model and (2) the VGG-19 base model with batch normalization. The results show that the second approach achieves greater accuracy: 94% for digit recognition, 97% for phrase recognition, and 93% for digit and phrase recognition in the test dataset. Therefore, our proposed model is superior to models based on CFIs input.
Keywords: Convolutional neural network; deep learning; lip reading; transfer learning; visual speech recognition
Relative attribute based incremental learning for image recognition (cited by: 3)
14
Authors: Emrah Ergul. Source: CAAI Transactions on Intelligence Technology, 2017, No. 1, pp. 1-11 (11 pages)
In this study, we propose an incremental learning approach based on a machine-machine interaction via relative attribute feedback that exploits comparative relationships among top-level image categories. One machine acts as 'Student (S)' with initially limited information, and it endeavors to capture the task domain gradually by questioning its mentor on a pool of unlabeled data. The other machine is 'Teacher (T)', with the implicit knowledge for helping S learn the class models. T initiates relative attributes as a communication channel by randomly sorting the classes in attribute space in an unsupervised manner. S starts modeling the categories at this intermediate level by using only a limited amount of labeled data. Thereafter, it first selects an entropy-based sample from the pool of unlabeled data and triggers the conversation by sending the selected image with its belief class in a query. Since T already knows the ground truth labels, it not only decides whether the belief is true or false, but also provides attribute-based feedback to S in each case, without revealing the true label of the query sample if the belief is false. The amount of training data is thus increased virtually by dropping the falsely predicted sample back into the unlabeled pool. Next, S updates the attribute space, which in fact has an impact on T's future responses, and then the category models are updated concurrently for the next run. We evaluate the weakly supervised algorithm on real-world datasets of faces and natural scenes in comparison with direct attribute prediction and semi-supervised learning approaches, and a noteworthy performance increase is achieved.
Keywords: Image classification; Incremental learning; Relative attribute; visual recognition
Baseline Isolated Printed Text Image Database for Pashto Script Recognition
15
Authors: Arfa Siddiqu, Abdul Basit, Waheed Noor, Muhammad Asfandyar Khan, M. Saeed H. Kakar, Azam Khan. Source: Intelligent Automation & Soft Computing [SCIE], 2023, No. 7, pp. 875-885 (11 pages)
Optical character recognition for right-to-left and cursive languages such as Arabic is challenging and has received little attention from researchers in the past compared to Latin-script languages. Moreover, the absence of a standard publicly available dataset for several low-resource languages, including the Pashto language, has remained a hurdle in the advancement of language processing. Recognizing that a clean dataset is the fundamental and core requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding. Keeping in view the fully autonomous recognition of the cursive Pashto script, the first achievement of this research is a clean and standard dataset for the isolated characters of the Pashto script. In this paper, a database of isolated Pashto characters for forty-four alphabets using various font styles is introduced. In order to overcome the font style shortage, the graphical software Inkscape has been used to generate sufficient image data samples for each character. The dataset has been pre-processed and reduced in dimensions to 32×32 pixels, and further converted into binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on the standard GitHub and Kaggle database servers, in both pixel and Comma Separated Values (CSV) formats.
Keywords: Text-image database; optical character recognition (OCR); Pashto isolated characters; visual recognition; autonomous language understanding; deep learning; convolutional neural network (CNN)
A Vision-Based Fingertip-Writing Character Recognition System (cited by: 1)
16
Authors: Ching-Long Shih, Wen-Yo Lee, Yu-Te Ku. Source: Journal of Computer and Communications, 2016, No. 4, pp. 160-168 (9 pages)
This paper presents a vision-based fingertip-writing character recognition system. The overall system is implemented through a CMOS image camera on an FPGA chip. A blue cover is mounted on the top of a finger to simplify fingertip detection and to enhance recognition accuracy. For each character stroke, 8 sample points (including start and end points) are recorded. The 7 tangent angles between consecutive sampled points are also recorded as features. In addition, 3 feature angles are extracted: the angles of the triangle consisting of the start point, the end point and the average point of all 8 sampled points. According to these key feature angles, a simple template matching K-nearest-neighbor classifier is applied to distinguish each character stroke. Experimental results showed that the system can successfully recognize fingertip-writing character strokes of digits and lowercase letters with an accuracy of almost 100%. Overall, the proposed fingertip-writing recognition system provides an easy-to-use and accurate visual character input method.
Keywords: visual Character recognition; Fingertip Detection; Template Matching; K-Nearest-Neighbor Classifier; FPGA
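The feature set described in the abstract above (7 tangent angles between 8 consecutive sample points, plus the 3 angles of the triangle formed by the start point, end point and average point) can be sketched as follows. The abstract does not say how the angles are computed or compared, so the use of `atan2`, the wrapped angular distance, and the reduction of the K-nearest-neighbor classifier to a single nearest template (k = 1) are assumptions; the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def interior_angle(p: Point, q: Point, r: Point) -> float:
    """Interior angle at vertex p of triangle pqr, in radians (0 to pi)."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.atan2(cross, dot))


def stroke_features(points: Sequence[Point]) -> List[float]:
    """Return 10 angles for a stroke sampled at 8 points: 7 tangent + 3 triangle."""
    if len(points) != 8:
        raise ValueError("expected exactly 8 sample points")
    # Direction of travel between each pair of consecutive sample points.
    tangents = [math.atan2(y2 - y1, x2 - x1)
                for (x1, y1), (x2, y2) in zip(points, points[1:])]
    start, end = points[0], points[-1]
    mean = (sum(x for x, _ in points) / 8, sum(y for _, y in points) / 8)
    triangle = [interior_angle(start, end, mean),
                interior_angle(end, mean, start),
                interior_angle(mean, start, end)]
    return tangents + triangle


def angular_distance(f: Sequence[float], g: Sequence[float]) -> float:
    """Sum of absolute angle differences, each wrapped into [0, pi]."""
    return sum(abs(math.atan2(math.sin(a - b), math.cos(a - b))) for a, b in zip(f, g))


def classify(features: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Nearest-template match (k = 1), a simplification of the K-nearest-neighbor step."""
    return min(templates, key=lambda label: angular_distance(features, templates[label]))


if __name__ == "__main__":
    # Hypothetical strokes: a straight line and a right-angle corner.
    line = [(float(i), 0.0) for i in range(8)]
    corner = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4)]
    templates = {"line": stroke_features(line), "corner": stroke_features(corner)}
    print(classify(stroke_features(corner), templates))
```

Using only angles makes the features independent of where the stroke is drawn and how large it is, which may be one reason such a compact template matcher is practical on an FPGA.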
Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning
17
作者 Nada Faisal Aljohani Emad Sami Jaha 《Computer Systems Science & Engineering》 SCIE EI 2023年第9期3037-3058,共22页
The continuing advances in deep learning have paved the way for several challenging ideas. One such idea is visual lip-reading, which has recently drawn considerable research interest. Lip-reading, often referred to as visual speech recognition, is the ability to understand and predict spoken speech based solely on lip movements without using sounds. Because research on visual speech recognition for the Arabic language is scarce in general, and absent from Quranic research, this study aims to fill this gap. This paper introduces a new publicly available Arabic lip-reading dataset containing 10490 videos captured from multiple viewpoints and comprising data samples at the letter level (i.e., single letters (single alphabets) and Quranic disjoined letters) and at the word level, based on the content and context of the book Al-Qaida Al-Noorania. This research uses visual speech recognition to recognize spoken Arabic letters (Arabic alphabets), Quranic disjoined letters, and Quranic words, mainly phonetic as they are recited in the Holy Quran according to the Quranic study aid entitled Al-Qaida Al-Noorania. This study could further validate the correctness of pronunciation and, subsequently, assist people in correctly reciting the Quran. Furthermore, a detailed description of the created dataset and its construction methodology is provided. This new dataset is used to train an effective pre-trained deep learning CNN model through transfer learning for lip-reading, achieving accuracies of 83.3%, 80.5%, and 77.5% on words, disjoined letters, and single letters, respectively, where an extended analysis of the results is provided. Finally, the experimental outcomes, different research aspects, and dataset collection consistency and challenges are discussed and concluded with several new promising trends for future work.
Keywords: visual speech recognition; lip-reading; deep learning; Quranic Arabic dataset; Tajwid
Nature's disguise: Empirical demonstration of dead-leaf masquerade in Kallima butterflies
18
Authors: Zeng-Tao Zhang, Long Yu, Hai-Zhen Chang, Shi-Chang Zhang, Dai-Qin Li. Zoological Research (SCIE, CSCD), 2024, No. 6, pp. 1201-1208 (8 pages)
Animals deploy diverse color-based defenses against predators, including crypsis, mimicry, aposematism, and masquerade. While crypsis, mimicry, and aposematism have been extensively studied, the strategy of masquerade, in which organisms imitate inedible or inanimate objects such as leaves, twigs, stones, and bird droppings, remains comparatively underexplored, particularly in adult butterflies. The Indian oakleaf butterfly (Kallima inachus) exemplifies this phenomenon, with its wings resembling dead leaves, providing a classic example of natural selection. Although it has long been postulated that these butterflies evade predation by being misidentified as dead leaves, direct experimental evidence is lacking. In the current study, using domestic chicks as predators, we manipulated their prior experience with dead leaves (model objects) while maintaining constant exposure to butterflies to test whether dead-leaf masquerade provides a protective advantage by preventing recognition. Results showed a marked delay in the initiation of attacks by chicks familiar with dead leaves compared to those with no prior exposure or those exposed to visually altered leaves. Chicks with prior dead-leaf experience required a similar amount of time to attack the butterflies as they did to attack dead leaves. These findings provide the first empirical demonstration of dead-leaf masquerade in Kallima butterflies, shedding light on its evolutionary significance. Our study highlights the effectiveness of masquerade in inducing the misclassification of butterflies as inanimate objects, showcasing the precise mimicry achieved by these organisms when viewed in isolation from the model objects. This study advances our understanding of the evolution of masquerade and its role as a potent antipredator strategy in nature.
Keywords: antipredator defense; camouflage; deadleaf butterfly; masquerade; natural selection; visual recognition
Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition
19
Authors: Hongda Liu, Yunlong Wang, Min Ren, Junxing Hu, Zhengquan Luo, Guangqi Hou, Zhenan Sun. Machine Intelligence Research, 2025, No. 3, pp. 466-483 (18 pages)
Skeleton-based action recognition has recently made significant progress. However, data imbalance is still a great challenge in real-world scenarios. The performance of current action recognition algorithms declines sharply when training data suffers from heavy class imbalance. The imbalanced data degrades the representations learned by these methods and becomes the bottleneck for action recognition. How to learn unbiased representations from imbalanced action data is the key to long-tailed action recognition. In this paper, we propose a novel balanced representation learning method to address the long-tailed problem in action recognition. Firstly, a spatial-temporal action exploration strategy is presented to expand the sample space effectively, generating more valuable samples in a rebalanced manner. Secondly, we design a detached action-aware learning schedule to further mitigate the bias in the representation space. The schedule detaches the representation learning of tail classes from training and proposes an action-aware loss to impose more effective constraints. Additionally, a skip-type representation is proposed to provide complementary structural information. The proposed method is validated on four skeleton datasets: NTU RGB+D 60, NTU RGB+D 120, NW-UCLA, and Kinetics. It not only achieves consistently large improvements over state-of-the-art (SOTA) methods, but also demonstrates a superior generalization capacity through extensive experiments. Our code is available at https://github.com/firework8/BRL.
Keywords: action recognition; skeleton sequence; long-tailed visual recognition; imbalance learning
Virtual Reality-based Teleoperation with Robustness Against Modeling Errors (cited by: 3)
20
Authors: 蒋再男, 刘宏, 王捷, 黄剑斌. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2009, No. 3, pp. 325-333 (9 pages)
This article investigates virtual reality (VR)-based teleoperation with robustness against modeling errors. VR technology is an effective way to overcome the large time delay during space robot teleoperation. However, it depends highly on the accuracy of the model. Model errors between the virtual and real environments inevitably exist. The existing way to deal with the problem is by means of either model matching or robot compliance control. As distinct from the existing methods, this article tries to combine m...
Keywords: space robot; teleoperation; virtual reality; model error; visual recognition; compliance control