Real-time and accurate drogue pose measurement during docking is fundamental and critical for Autonomous Aerial Refueling (AAR). Vision measurement is the most practicable technique, but its accuracy and robustness are easily degraded by the limited computing power of airborne equipment, complex aerial scenes, and partial occlusion. To address these challenges, we propose a novel drogue keypoint detection and pose measurement algorithm based on monocular vision and realize real-time processing on airborne embedded devices. First, a lightweight network is designed with structural re-parameterization to reduce computational cost and improve inference speed, and a sub-pixel keypoint prediction head and loss functions are adopted to improve keypoint detection accuracy. Second, a closed-form solution for the drogue pose is computed from double spatial circles, followed by a nonlinear refinement based on Levenberg-Marquardt optimization. Both virtual and physical simulation experiments were used to test the proposed method. In the virtual simulation, the mean pixel error of the proposed method is 0.787 pixels, significantly lower than that of other methods. In the physical simulation, the mean relative measurement error is 0.788%, and the mean processing time is 13.65 ms on embedded devices.
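The abstract above relies on structural re-parameterization to speed up inference. The following is a minimal sketch of the general idea only, not the paper's network: it assumes a hypothetical single-channel 1D layer with a 3-tap branch, a 1-tap branch, and an identity branch, and shows that the three branches can be folded into one kernel that produces the same output. Batch-normalization folding, which real re-parameterized networks also perform, is omitted, and all names (`conv1d_same`, `merge_branches`) are illustrative.

```python
def conv1d_same(x, kernel):
    """Cross-correlate x with an odd-length kernel using zero padding."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(x))]


def merge_branches(k3, k1):
    """Fold a 3-tap branch, a 1-tap branch, and an identity branch
    into a single equivalent 3-tap kernel."""
    # The 1-tap kernel and the identity both act only on the centre tap.
    return [k3[0], k3[1] + k1[0] + 1.0, k3[2]]


x = [1.0, 2.0, -1.0, 0.5]      # toy input signal
k3 = [0.2, -0.5, 0.1]          # 3-tap branch weights (made up)
k1 = [0.7]                     # 1-tap branch weight (made up)

# Training-time form: three parallel branches, summed.
multi_branch = [a + b + c
                for a, b, c in zip(conv1d_same(x, k3), conv1d_same(x, k1), x)]

# Inference-time form: one convolution with the merged kernel.
single_branch = conv1d_same(x, merge_branches(k3, k1))
```

Because convolution is linear, the merged kernel reproduces the multi-branch output while requiring a single pass at inference time, which is where the speed gain comes from.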
Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and use them as an additional input to augment the RGB images. Depth-based methods either convert the estimated depth maps to pseudo-LiDAR and apply LiDAR-based object detectors, or focus on fusion learning of image and depth. However, they show limited performance and efficiency because of depth inaccuracy and complex convolution-based fusion. In contrast, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) uses normalizing flows to build priors in depth maps and thereby obtain more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them with cross-attention to exchange information. Furthermore, using the pixel-wise relative depth values in the depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture a more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of the proposed method and its superior performance over previous counterparts.
Visual sensors are used to measure the state of the chaser spacecraft relative to the target spacecraft during close-range rendezvous phases. This article proposes a two-stage iterative algorithm based on an inverse projection ray approach to estimate the relative position and attitude from feature points and monocular vision. It consists of two stages: absolute orientation and depth recovery. In the first stage, Umeyama's algorithm is used to fit the three-dimensional (3D) model set and estimate the 3D point set, while in the second stage, the depths of the observed feature points are estimated. This procedure is repeated until the result converges. Moreover, the effectiveness and convergence of the proposed algorithm are verified through theoretical analysis and mathematical simulation.
To reduce vehicle crashes, a new rear-view vehicle detection system based on monocular vision is designed. First, a small and flexible hardware platform based on a DM642 digital signal processor (DSP) micro-controller is built. Then, a two-step vehicle detection algorithm is proposed. In the first step, a fast vehicle edge and symmetry fusion algorithm is used with a low threshold, so that possible vehicles have a nearly 100% detection rate (TP) while non-vehicles have a high false detection rate (FP), i.e., all possible vehicles are retained. In the second step, a probabilistic neural network (PNN) classifier based on multi-scale and orientation Gabor features is trained to classify the candidates and eliminate the falsely detected vehicles generated in the first step. Experimental results demonstrate that the proposed system maintains a high detection rate and a low false detection rate under different road, weather, and lighting conditions.
A system for mobile robot localization and navigation is presented. With the proposed system, the robot can be localized and navigated using a single landmark in a single image. The navigation mode may be track following, teaching and playback, or programming. The basic idea is that the system computes the difference between the expected and the recognized position at each time step and then steers the robot in a direction that reduces this difference. To minimize the robot's sensor equipment, only one omnidirectional camera is used. Experiments in environments with disturbances show that the presented algorithm is robust and easy to implement, without camera rectification. The root-mean-square error (RMSE) of localization is 1.4 cm, and the navigation error in teaching and playback is within 10 cm.
Vehicle anti-collision is a hot topic in Intelligent Transport System research. Preceding-vehicle detection and distance measurement, the key techniques, contribute greatly to safe driving. This paper presents a method to detect preceding vehicles and measure the distance between the host car and the car ahead. First, an adaptive threshold method is used to extract the shadow feature, and a shadow-area merging approach is used to deal with distortion of the shadow border. A region of interest (ROI) is obtained from the shadow feature. Then, within the ROI, the symmetry feature is analyzed to verify whether vehicles are present and to locate them. Finally, using monocular vision distance measurement based on the camera's intrinsic parameters and geometric reasoning, the distance between the host car and the preceding one is obtained. Experimental results show that the proposed method detects the preceding vehicle effectively and measures the inter-vehicle distance accurately.
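The abstract above does not spell out its distance formula. As a hedged illustration of what distance measurement from camera intrinsics and geometric reasoning can look like, the sketch below uses the common flat-road pinhole model. It assumes the optical axis is parallel to the road, the camera height is known, and the image row where the vehicle touches the road (for example the lower edge of its shadow) is given. The function name and all parameter values are hypothetical.

```python
def ground_distance(v_bottom, focal_px, cy, cam_height_m):
    """Distance along the road to a point on the ground plane.

    v_bottom     -- image row (pixels) of the vehicle/road contact point
    focal_px     -- focal length expressed in pixels
    cy           -- image row of the principal point (the horizon row here)
    cam_height_m -- camera height above the road in metres
    """
    dv = v_bottom - cy
    if dv <= 0:
        raise ValueError("contact point must lie below the horizon row")
    # Similar triangles: cam_height / distance == dv / focal_px
    return focal_px * cam_height_m / dv


# Made-up example: 800 px focal length, camera 1.2 m above the road,
# contact row 48 px below the principal point, i.e. roughly 20 m ahead.
distance_m = ground_distance(v_bottom=288, focal_px=800.0, cy=240, cam_height_m=1.2)
```

The model is sensitive near the horizon: a one-pixel error in `v_bottom` matters far more at long range than at short range, which is one reason the shadow border needs careful handling.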
Drogue recognition and 3D locating is a key problem during the docking phase of autonomous aerial refueling (AAR). To solve this problem, a novel and effective method based on monocular vision is presented in this paper. First, using computer vision with a red-ring-shaped feature, a drogue detection and recognition algorithm is proposed that guarantees safety and is robust to drogue diversity and changes in environmental conditions, without requiring a set of infrared light-emitting diodes (LEDs) on the parachute part of the drogue. Second, taking camera lens distortion into account, a monocular vision measurement algorithm for drogue 3D locating is designed to ensure the accuracy and real-time performance of the system, with the drogue attitude also provided. Finally, experiments are conducted to demonstrate the effectiveness of the proposed method. The experimental results compare the performance of the entire system with that of other methods and validate that the proposed method can recognize and locate the drogue in three dimensions rapidly and precisely.
A new visual measurement method is proposed to estimate the three-dimensional (3D) position of an object on the floor using a single camera. The camera, fixed on a robot, is inclined with respect to the floor. A measurement model involving the camera's extrinsic parameters, such as its height and pitch angle, is described. Once the camera's intrinsic parameters are calibrated, a single image of a chessboard pattern placed on the floor is enough to calibrate the extrinsic parameters. The position of an object on the floor can then be computed with the measurement model. Furthermore, the height of the object can be calculated from paired points on a vertical line that share the same position on the floor. Unlike the conventional method, which estimates positions only on the plane, this method can obtain 3D positions. An indoor experiment verifies the accuracy and validity of the proposed method.
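The measurement model in the abstract above can be sketched as a ray-plane intersection. This is a minimal illustration under assumed conventions, not the paper's exact formulation: the camera sits at height `cam_height` above the floor and is pitched down by `pitch` radians with no roll or yaw, and the world frame is X right, Y forward, Z up with the floor at Z = 0. All function names and the frame convention are hypothetical.

```python
import math


def pixel_ray(u, v, fx, fy, cx, cy, pitch):
    """World-frame direction (X right, Y forward, Z up) of the ray through pixel (u, v)."""
    xn, yn = (u - cx) / fx, (v - cy) / fy      # normalized image coords, y pointing down
    s, c = math.sin(pitch), math.cos(pitch)
    # Camera axes in the world frame: right = (1, 0, 0), down = (0, -s, -c), forward = (0, c, -s)
    return xn, c - yn * s, -(yn * c + s)


def pixel_to_floor(u, v, fx, fy, cx, cy, cam_height, pitch):
    """Intersect the pixel ray with the floor plane Z = 0 and return (X, Y)."""
    dx, dy, dz = pixel_ray(u, v, fx, fy, cx, cy, pitch)
    if dz >= 0:
        raise ValueError("ray does not hit the floor")
    t = cam_height / -dz                        # camera is at (0, 0, cam_height)
    return t * dx, t * dy


def object_height(u_top, v_top, floor_xy, fx, fy, cx, cy, cam_height, pitch):
    """Height of a point that lies vertically above a known floor position."""
    dx, dy, dz = pixel_ray(u_top, v_top, fx, fy, cx, cy, pitch)
    # Follow the ray until its horizontal distance matches that of the floor point.
    t = math.hypot(*floor_xy) / math.hypot(dx, dy)
    return cam_height + t * dz
```

A typical use is to call `pixel_to_floor` on the pixel where the object touches the floor, then pass that floor position together with the pixel of the object's top to `object_height`; this mirrors the paired-point idea described in the abstract.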
A hierarchical simultaneous localization and mapping (SLAM) method for mobile robots that yields accurate maps is presented. The local map level is composed of a set of local metric feature maps that are guaranteed to be statistically independent. The global level is a topological graph whose arcs are labeled with the relative locations between local maps. An estimate of these relative locations is maintained by a local map alignment algorithm, and a more accurate estimate is calculated through a global minimization procedure using the loop closure constraint. The local map is built with a Rao-Blackwellised particle filter (RBPF), in which the particle filter is used to extend the path posterior by sampling new poses. Landmark position estimation and update are implemented with an extended Kalman filter (EKF). A monocular camera mounted on the robot tracks 3D natural point landmarks, which are constructed from matched scale-invariant feature transform (SIFT) feature pairs. Matching of the multi-dimensional SIFT features is implemented with a KD-tree at a time cost of O(log₂ N). Experimental results on a Pioneer mobile robot in a real indoor environment show the superior performance of the proposed method.
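The KD-tree matching step mentioned above can be illustrated with a small exact nearest-neighbor search. This is a generic textbook sketch with hypothetical names, not the authors' implementation; it uses 2D points for readability, whereas SIFT descriptors are 128-dimensional, and in such high dimensions practical matchers commonly fall back on approximate variants of the search.

```python
def build_kdtree(points, depth=0):
    """Recursively build a KD-tree as nested dicts, splitting on one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                      # median point becomes the node
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to query."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["point"]) < sq_dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if diff * diff < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


features = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]   # toy "descriptors"
tree = build_kdtree(features)
match = nearest(tree, (9, 2))
```

On a balanced tree the search typically descends one branch and prunes most of the other, which is the source of the logarithmic average cost quoted in the abstract.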
In a laser displacement sensor measurement system, the laser beam direction is an important parameter; in particular, the azimuth and pitch angles are the most important parameters of a laser beam. In this paper, a laser beam direction measurement method based on monocular vision is proposed. First, a charge-coupled device (CCD) camera is placed above the base plane and adjusted and fixed so that its optical axis is nearly perpendicular to the base plane. The monocular vision localization model is established using a circular-aperture calibration board. The laser beam generating device is then placed and held at a fixed position on the base plane. At the same time, a special target block is placed on the base plane so that the laser beam projects onto the target and forms a laser spot. The CCD camera above the base plane acquires clear images of the laser spot and the target block, so the two-dimensional (2D) image coordinates of the centroid of the laser spot can be extracted by a correlation algorithm. The target is moved in equal steps along the laser beam direction, and the spot and target images at each position are collected by the CCD camera. Using the relevant transformation formula together with the intrinsic parameters of the target block, the 2D coordinates of the spot centroid are converted to three-dimensional (3D) coordinates in the base-plane frame. Because the target moves, the 3D coordinates of the laser spot centroid at different positions are obtained, and these 3D coordinates are fitted to a straight line in space that represents the laser beam to be measured. In the experiment, the target parameters are measured by high-precision instruments, and the camera is calibrated with a high-precision calibration board to establish the corresponding positioning model. The measurement accuracy is mainly determined by the monocular vision positioning accuracy and the centroid extraction accuracy. The experimental results show that the maximum error of the angle between laser beams is 0.04° and the maximum error of the beam pitch angle is 0.02°.
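The final step of the method above, turning a sequence of 3D spot centroids into a beam direction with azimuth and pitch angles, can be sketched as follows. The abstract does not state which line-fitting method is used; this sketch assumes the target is moved in equal steps, so an ordinary least-squares regression of each coordinate on the step index gives the direction. It further assumes, as a hypothetical convention, that the base plane is z = 0, azimuth is measured in that plane from the x axis, and pitch is the elevation above the plane.

```python
import math


def fit_beam_direction(points):
    """Least-squares line through equally spaced 3D points.

    Returns (centroid, unit_direction). Each coordinate is regressed on the
    step index, which is valid when the target moves in equal increments.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two spot positions")
    mean_idx = (n - 1) / 2
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    direction = [sum((i - mean_idx) * (p[k] - centroid[k])
                     for i, p in enumerate(points)) for k in range(3)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0:
        raise ValueError("points do not define a direction")
    return centroid, [c / norm for c in direction]


def azimuth_pitch_deg(direction):
    """Azimuth (in the base plane, from +x) and pitch (above the plane), in degrees."""
    dx, dy, dz = direction
    azimuth = math.degrees(math.atan2(dy, dx))
    pitch = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return azimuth, pitch


# Synthetic spot centroids along a beam rising at 45 degrees, for illustration only.
spots = [(i, i, i * math.sqrt(2)) for i in range(5)]
_, beam_dir = fit_beam_direction(spots)
azimuth, pitch = azimuth_pitch_deg(beam_dir)
```

If the step spacing were not uniform, an orthogonal (total) least-squares fit based on the principal axis of the point cloud would be the more appropriate choice.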
Building fences to manage cattle grazing can be very expensive and cost-inefficient, and fences do not provide dynamic control over the area in which the cattle graze. Existing virtual fencing techniques for controlling herds of cattle, which define boundaries by polygon coordinates, are limited in land mass coverage and dynamism. This work seeks to develop a more robust, improved monocular-vision-based boundary avoidance method for a non-invasive stray control system for cattle, with a view to increasing the land mass coverage and dynamism of virtual fencing techniques. The monocular-vision-based depth estimation will be modeled using the global Fourier Transform (FT) and the local Wavelet Transform (WT) of the image structure of scenes (boundaries). The magnitude of the global Fourier Transform gives the dominant orientations and textural patterns of the image, while the local Wavelet Transform gives the dominant spectral features of the image and their spatial distribution. Each scene image is described by features v, which contain the set of global (FT) and local (WT) statistics of the image. Distances to scenes or boundaries are obtained by estimating the depth D from the image features v. Sound cues with intensity corresponding to the magnitude of the depth D are applied to the animal's ears as stimuli. This brings about the desired control, as animals tend to move away from uncomfortable sounds.
To meet the requirement of automatically sorting pearls, this paper studies pearl contour feature extraction and shape recognition algorithms for rapid online identification of pearl shape, and designs a pearl shape detection device based on monocular dynamic machine vision. Blown air suspends the pearl in a funnel-shaped container and flips it rapidly inside the device. Images of the entire surface of the pearl under test can be promptly captured by a camera placed directly above the funnel. Illumination experiments conducted from different angles indicate that the image contour is best extracted under medium-angle illumination. The pearl shape test indicates that the method, combined with the air suspension device, classifies pearls into seven types according to the national standard, with an average error rate below 5.38%. The shape characteristics of the pearl can be detected promptly and reliably, meeting the needs of high-speed automatic sorting.
Background: We investigate whether changes in visual plasticity induced by monocular deprivation can be maintained across multiple days. Monocular deprivation is known to strengthen the deprived eye in adults with normal vision for a short period of time (30-60 minutes). This has been shown through a variety of visual tasks such as binocular combination and rivalry. Methods: Ten subjects were recruited and patched for two hours a day on five consecutive days. We used a binocular phase combination task to measure the subjects' sensory eye balance. We first measured the baseline sensory eye balance, patched the dominant eye, and then conducted post-patching measurements at 0, 3, 6, 12, 24, and 48 minutes after patching. Results: We performed a two-way ANOVA (before vs. after patching × day); although the effect of monocular deprivation on the deprived eye was significant, F(1,9)=17.32, P=0.002, the effect of day was not. Conclusions: We found no accumulation of the patching effect across five days in healthy adults. This suggests that the degree of remnant neural plasticity in adult primary visual cortex may be too limited to be exploited therapeutically.
Orientation measurement of objects is vital in micro-assembly. In this paper, we present a novel method based on monocular microscopic vision for 3-D orientation measurement of objects with planar surfaces. The proposed method measures the orientation of the object without requiring calibration of the intrinsic parameters of the microscopic camera. The orientation of the object is first measured by analytical computation based on feature points. The results of the analytical computation are coarse because the information about the feature points is not fully used. To improve the precision, the orientation measurement is converted into an optimization process based on the relationship between deviations in image space and in Cartesian space under microscopic vision. The results of the analytical computation are used as the initial values of the optimization process. The optimized variables are the three rotational angles of the object and the pixel equivalent coefficient. The objective of the optimization is to minimize the coordinate differences of the feature points on the object. The precision of the orientation measurement is thereby improved effectively. Experimental and comparative results validate the effectiveness of the proposed method.
A trajectory tracking method is presented for the visual navigation of a monocular mobile robot. The robot moves along a line trajectory drawn beforehand, recognizes a stop sign, and stops on it to carry out a specific task. The robot uses a forward-looking color digital camera to capture the scene in front of it and uses the HSI color model to segment the trajectory and the stop sign from the image. A "sampling estimate" method is then used to calculate the navigation parameters. The stop sign is easily recognized, and 256 different signs can be distinguished. Tests indicate that the method tolerates a wide range of brightness and offers good robustness and real-time performance.
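The brightness tolerance claimed above is consistent with how the HSI model separates color from intensity. The sketch below uses the standard textbook RGB-to-HSI conversion and a hue/saturation threshold; the threshold values and the function names are illustrative assumptions, not taken from the paper.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0                    # black: hue and saturation undefined
    saturation = 1.0 - min(r, g, b) / intensity
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        return 0.0, saturation, intensity       # gray: hue undefined, report 0
    cos_h = max(-1.0, min(1.0, 0.5 * ((r - g) + (r - b)) / den))
    hue = math.degrees(math.acos(cos_h))
    if b > g:
        hue = 360.0 - hue
    return hue, saturation, intensity


def is_track_pixel(r, g, b, hue_center=0.0, hue_tol=20.0, min_sat=0.4):
    """Hypothetical rule: accept saturated pixels whose hue is near the track color."""
    hue, sat, _ = rgb_to_hsi(r, g, b)
    hue_diff = min(abs(hue - hue_center), 360.0 - abs(hue - hue_center))
    return sat >= min_sat and hue_diff <= hue_tol


bright_red = rgb_to_hsi(1.0, 0.0, 0.0)
dim_red = rgb_to_hsi(0.5, 0.0, 0.0)             # same hue and saturation, lower intensity
```

Because the rule ignores the intensity channel, a red track is accepted whether it is brightly or dimly lit, as long as the hue and saturation remain measurable.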
This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight LMViT's ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT's potential for real-world defect detection but also underscores the promise of other transformer-based architectures such as Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT's computational efficiency for deployment in real-time quality control systems.
Objective: This study aimed to investigate the prevalence, causes, and influencing factors of vision impairment in the elderly population aged 60 years and above in Mangxin Town, Kashgar region, Xinjiang, China. Located in a region characterized by intense ultraviolet radiation and arid climatic conditions, Mangxin Town presents unique environmental challenges that may exacerbate ocular health issues. Despite the global emphasis on addressing vision impairment among aging populations, there remains a paucity of updated and region-specific data in Xinjiang, necessitating this comprehensive assessment to inform targeted interventions. Methods: A cross-sectional study was conducted from May to June 2024, involving 1,311 elderly participants (76.76% participation rate) out of a total eligible population of 1,708 individuals aged ≥60 years. Participants underwent detailed ocular examinations, including assessments of uncorrected visual acuity (UVA) and best-corrected visual acuity (BCVA) using standard logarithmic charts, slit-lamp biomicroscopy, optical coherence tomography (OCT, Topcon DRI OCT Triton), fundus photography, and intraocular pressure measurement (Canon TX-20 Tonometer). A multidisciplinary team of 10 ophthalmologists and 2 local village doctors, trained rigorously in standardized protocols, ensured consistent data collection. Demographic, lifestyle, and medical history data were collected via questionnaires. Statistical analyses, performed using STATA 16, included multivariate logistic regression to identify risk factors, with significance defined as P<0.05. Results: The overall prevalence of vision impairment was 13.21% (95% CI: 11.37%-15.04%), with low vision at 11.76% (95% CI: 10.01%-13.50%) and blindness at 1.45% (95% CI: 0.80%-2.10%). Cataract emerged as the leading cause, responsible for 68.20% of cases, followed by glaucoma (5.80%), optic atrophy (5.20%), and age-related macular degeneration (2.90%). Vision impairment prevalence escalated significantly with age: 7.74% in the 60–69 age group, 17.79% in 70–79, and 33.72% in those ≥80. Males exhibited higher prevalence than females (15.84% vs. 10.45%, P=0.004). Multivariate analysis revealed age ≥80 years (OR=6.43, 95% CI: 3.79-10.90), male sex (OR=0.53, 95% CI: 0.34-0.83), and daily exercise (OR=0.44, 95% CI: 0.20-0.95) as significant factors. History of eye disease showed a non-significant trend toward increased risk (OR=1.49, P=0.107). Education level, income, and smoking status showed no significant associations. Conclusions: This study underscores cataract as the predominant cause of vision impairment in Mangxin Town's elderly population, with age and sex as critical determinants. The findings align with global patterns but highlight region-specific challenges, such as environmental factors contributing to cataract prevalence. Public health strategies should prioritize improving access to cataract surgery, enhancing grassroots ophthalmic infrastructure, and integrating portable screening technologies for early detection of fundus diseases. Additionally, promoting health education on UV protection and lifestyle modifications, such as regular exercise, may mitigate risks. Future research should expand to broader regions in Xinjiang, employ advanced diagnostic tools for complex conditions like glaucoma, and explore longitudinal trends to refine intervention strategies. These efforts are vital to reducing preventable blindness and improving quality of life for aging populations in underserved areas.
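The participation rate and prevalence intervals reported above can be sanity-checked with a few lines of arithmetic. The sketch below assumes a normal-approximation (Wald) interval for a proportion; the abstract does not say which interval the authors computed in STATA, so agreement with the reported bounds is suggestive only. The function name is illustrative.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a proportion p observed in n subjects."""
    se = math.sqrt(p * (1.0 - p) / n)
    return p - z * se, p + z * se


n_participants = 1311
n_eligible = 1708

participation = n_participants / n_eligible          # reported as 76.76%
overall_low, overall_high = wald_ci(0.1321, n_participants)    # reported 11.37%-15.04%
lowvision_low, lowvision_high = wald_ci(0.1176, n_participants)  # reported 10.01%-13.50%
```

Under this assumption the computed bounds land within about 0.01 percentage points of the published ones, with the small gap attributable to rounding of the prevalence itself.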
Funding (AAR drogue keypoint detection and pose measurement): Supported by the National Science Fund for Distinguished Young Scholars, China (No. 51625501); the Aeronautical Science Foundation of China (No. 20240046051002); and the National Natural Science Foundation of China (No. 52005028).
Funding (NF-DVT monocular 3D object detection): Supported in part by the Major Project for New Generation of AI (2018AAA0100400), the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and the InnoHK Program.
Funding (spacecraft relative position and attitude estimation): Program for Changjiang Scholars and Innovative Research Team in University (IRT0520); Ph.D. Programs Foundation of Ministry of Education of China (20070213055).
Funding (rear-view vehicle detection): The National Key Technology R&D Program of China during the 11th Five-Year Plan Period (2009BAG13A04); Jiangsu Transportation Science Research Program (No. 08X09); Program of Suzhou Science and Technology (No. SG201076).
Funding: Supported by National Natural Science Foundation of China (Nos. 31000422 and 61201081); Tianjin Municipal Education Commission (No. 20110829); Tianjin Science and Technology Committee (No. 10JCZDJC22800)
Abstract: A system for mobile robot localization and navigation was presented. With the proposed system, the robot can be located and navigated by a single landmark in a single image. The navigation mode may be track following, teaching and playback, or programming. The basic idea is that the system computes the differences between the expected and the recognized position at each time and then steers the robot in a direction that reduces those differences. To minimize the robot's sensor equipment, only one omnidirectional camera was used. Experiments in disturbing environments show that the presented algorithm is robust and easy to implement, without camera rectification. The root-mean-square error (RMSE) of localization is 1.4 cm, and the navigation error in teaching and playback is within 10 cm.
Funding: Key Projects in the Tianjin Science & Technology Pillar Program
Abstract: Vehicle anti-collision is a hot topic in Intelligent Transport System research. Research on preceding-vehicle detection and distance measurement, which are the key techniques, contributes greatly to safe driving. This paper presents a method that can be used to detect preceding vehicles and obtain the distance between the host car and the car ahead. Firstly, an adaptive threshold method is used to extract the shadow feature, and a shadow-area merging approach is used to deal with the distortion of the shadow border. The region of interest (ROI) is obtained using the shadow feature. Then, within the ROI, the symmetry feature is analyzed to verify whether vehicles are present and to locate them. Finally, using monocular vision distance measurement based on the camera's intrinsic parameters and geometrical reasoning, the distance between the host car and the preceding one is obtained. Experimental results show that the proposed method can detect the preceding vehicle effectively and measure the distance between vehicles accurately.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61473307, 61304120)
Abstract: Drogue recognition and 3D locating is a key problem during the docking phase of autonomous aerial refueling (AAR). To solve this problem, a novel and effective method based on monocular vision is presented in this paper. Firstly, by employing computer vision with a red-ring-shape feature, a drogue detection and recognition algorithm is proposed to guarantee safety and ensure robustness to drogue diversity and changes in environmental conditions, without using a set of infrared light emitting diodes (LEDs) on the parachute part of the drogue. Secondly, considering camera lens distortion, a monocular vision measurement algorithm for drogue 3D locating is designed to ensure the accuracy and real-time performance of the system, with the drogue attitude provided. Finally, experiments are conducted to demonstrate the effectiveness of the proposed method. Experimental results show the performance of the entire system in contrast with other methods, which validates that the proposed method can recognize and locate the drogue in three dimensions rapidly and precisely.
Funding: Supported by National Natural Science Foundation of China (Nos. 61273352 and 61473295); National High Technology Research and Development Program of China (863 Program) (No. 2015AA042307); Beijing Natural Science Foundation (No. 4161002)
Abstract: A new visual measurement method is proposed to estimate the three-dimensional (3D) position of an object on the floor using a single camera. The camera, fixed on a robot, is inclined with respect to the floor. A measurement model with the camera's extrinsic parameters, such as the height and pitch angle, is described. A single image of a chessboard pattern placed on the floor is enough to calibrate the camera's extrinsic parameters once the camera's intrinsic parameters have been calibrated. The position of an object on the floor can then be computed with the measurement model. Furthermore, the height of the object can be calculated from paired points on a vertical line that share the same position on the floor. Compared to the conventional method used to estimate positions on the plane, this method can obtain 3D positions. The indoor experiment confirms the accuracy and validity of the proposed method.
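The measurement model in the abstract above can be sketched as a ray-to-floor intersection. The sketch below is only an illustration under assumptions the abstract does not state: a pinhole camera without lens distortion, zero roll and yaw, a world frame with X to the right, Y forward along the floor and Z up, and hypothetical function names.

```python
import math

def pixel_ray(u, v, fx, fy, cx, cy, pitch):
    """Direction (in the world frame) of the viewing ray through pixel (u, v).

    pitch is the downward tilt of the optical axis in radians.
    Assumed conventions: pinhole model, no distortion, no roll or yaw.
    """
    xn = (u - cx) / fx          # normalized image coordinates
    yn = (v - cy) / fy
    c, s = math.cos(pitch), math.sin(pitch)
    return (xn, c - yn * s, -(yn * c + s))

def floor_point(u, v, fx, fy, cx, cy, height, pitch):
    """(X, Y) of the floor point seen at pixel (u, v); camera at (0, 0, height)."""
    dx, dy, dz = pixel_ray(u, v, fx, fy, cx, cy, pitch)
    if dz >= 0:
        raise ValueError("viewing ray does not intersect the floor")
    k = height / -dz            # ray parameter where Z reaches 0
    return (k * dx, k * dy)

def object_height(u_top, v_top, floor_xy, fx, fy, cx, cy, height, pitch):
    """Height of a point located vertically above a known floor position."""
    dx, dy, dz = pixel_ray(u_top, v_top, fx, fy, cx, cy, pitch)
    # The top point shares the floor position, so match horizontal distances.
    k = math.hypot(*floor_xy) / math.hypot(dx, dy)
    return height + k * dz
```

For instance, with the camera 1 m above the floor and pitched down by 45 degrees, the pixel at the principal point maps to the floor point 1 m straight ahead, since the optical axis meets the floor at a horizontal distance of height / tan(pitch).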
Funding: The National High Technology Research and Development Program (863) of China (No. 2006AA04Z259); The National Natural Science Foundation of China (No. 60643005)
Abstract: A hierarchical mobile robot simultaneous localization and mapping (SLAM) method that yields accurate maps was presented. The local map level is composed of a set of local metric feature maps that are guaranteed to be statistically independent. The global level is a topological graph whose arcs are labeled with the relative location between local maps. An estimate of these relative locations is maintained with a local map alignment algorithm, and a more accurate estimate is calculated through a global minimization procedure using the loop closure constraint. The local map is built with a Rao-Blackwellised particle filter (RBPF), where the particle filter is used to extend the path posterior by sampling new poses. Landmark position estimation and update are implemented through an extended Kalman filter (EKF). Monocular vision mounted on the robot tracks the 3D natural point landmarks, which are constructed from matched scale invariant feature transform (SIFT) feature pairs. The matching of multi-dimensional SIFT features is implemented with a KD-tree at a time cost of O(log₂ N). Experimental results on a Pioneer mobile robot in a real indoor environment show the superior performance of the proposed method.
Funding: National Science and Technology Major Project of China (No. 2016ZX04003001); Tianjin Research Program of Application Foundation and Advanced Technology (No. 14JCZDJC39700)
Abstract: In a laser displacement sensor measurement system, the laser beam direction is an important parameter; in particular, the azimuth and pitch angles are the most important parameters of a laser beam. In this paper, a laser beam direction measurement method based on monocular vision is proposed. First, the charge coupled device (CCD) camera is placed above the base plane, and its position is adjusted and fixed so that the optical axis is nearly perpendicular to the base plane. The monocular vision localization model is established using a circular aperture calibration board. Then the laser beam generating device is placed and held at a fixed position on the base plane. At the same time, a special target block is placed on the base plane so that the laser beam projects onto the target and forms a laser spot. The CCD camera above the base plane can acquire clear images of the laser spot and the target block, so the two-dimensional (2D) image coordinates of the centroid of the laser spot can be extracted by a correlation algorithm. The target is moved in equal steps along the laser beam direction, and the spot and target images at each position are collected by the CCD camera. Using the relevant transformation formula together with the intrinsic parameters of the target block, the 2D coordinates of the spot centroid are converted to three-dimensional (3D) coordinates in the base plane. As the target moves, the 3D coordinates of the laser spot centroid at different positions are obtained, and these 3D coordinates are fitted to a straight line in space that represents the laser beam to be measured. In the experiment, the target parameters are measured by high-precision instruments, and the camera is calibrated with a high-precision calibration board to establish the corresponding positioning model. The measurement accuracy is mainly determined by the monocular vision positioning accuracy and the centroid extraction accuracy. The experimental results show that the maximum error of the angle between laser beams is 0.04° and the maximum error of the beam pitch angle is 0.02°.
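The final step of the abstract above, combining the 3D spot positions into a spatial line and reading off the beam angles, can be sketched as a least-squares line fit. The abstract does not say how the line is fitted or how the angles are defined, so the SVD-based fit, the angle conventions (azimuth measured in the base plane from the X axis, pitch measured from the base plane) and the function names below are all assumptions made for illustration.

```python
import numpy as np

def fit_line_3d(points):
    """Least-squares 3D line through points: returns (centroid, unit direction)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0]
    # Orient the direction from the first measured spot towards the last one.
    if np.dot(direction, pts[-1] - pts[0]) < 0:
        direction = -direction
    return centroid, direction

def beam_angles(direction):
    """Azimuth and pitch in degrees, assuming the base plane is Z = 0."""
    dx, dy, dz = np.asarray(direction, dtype=float) / np.linalg.norm(direction)
    azimuth = np.degrees(np.arctan2(dy, dx))
    pitch = np.degrees(np.arcsin(np.clip(dz, -1.0, 1.0)))
    return azimuth, pitch
```

Fitting through the centroid with the dominant singular vector minimizes the sum of squared perpendicular distances from the spots to the line, which treats errors in all three coordinates equally.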
Abstract: Building fences to manage cattle grazing can be very expensive and cost-inefficient, and fences do not provide dynamic control over the area in which the cattle are grazing. Existing virtual fencing techniques for controlling herds of cattle, based on polygon coordinate definitions of boundaries, are limited in land mass coverage and dynamism. This work seeks to develop a more robust, improved monocular-vision-based boundary avoidance scheme for a non-invasive stray control system for cattle, with a view to increasing land mass coverage and dynamism in virtual fencing techniques. The monocular-vision-based depth estimation will be modeled using the global Fourier Transform (FT) and local Wavelet Transform (WT) of the image structure of scenes (boundaries). The magnitude of the global Fourier Transform gives the dominant orientations and textural patterns of the image, while the local Wavelet Transform gives the dominant spectral features of the image and their spatial distribution. Each scene image is described by features v, which contain the set of global (FT) and local (WT) statistics of the image. Scene or boundary distances are obtained by estimating the depth D from the image features v. Sound cues of intensity equivalent to the magnitude of the depth D are applied to the animal's ears as stimuli. This brings about the desired control, as animals tend to move away from uncomfortable sounds.
Funding: The Foundation of Zhejiang Key Level 1 Discipline of Forestry Engineering within the Research Project (No. 2014lygcz018); the Public Welfare Project of Zhejiang Science and Technology Department (No. 2012C32021); the Preresearch Project of the Research Center for Smart Agriculture and Forestry, Zhejiang Agricultural and Forestry University (No. 2013ZHNL02); the Scientific Research Foundation of Zhejiang Agricultural and Forestry University (No. 2012FR070)
Abstract: To meet the requirement of automatically sorting pearls, pearl contour feature extraction and shape recognition algorithms are studied in this paper to enable rapid online identification of pearl shape, and a pearl shape detection device based on monocular dynamic machine vision is designed. By blowing air, the pearl is suspended in a funnel-shaped container and flipped rapidly in the device. Images of the entire surface of the pearl under measurement can be promptly captured by the camera placed directly above the funnel. The results of illumination experiments conducted from different angles indicate that the image contour is better extracted under medium-angle illumination. The pearl shape test indicates that the method, combined with the inflatable suspension device, classifies pearls into seven types according to the national standard, and the average error rate is kept below 5.38%. The shape characteristics of the pearl can be detected promptly and reliably, so the requirements of high-speed automatic sorting can be satisfied.
Abstract: Background: We investigate whether changes in visual plasticity induced by monocular deprivation can be maintained across multiple days. It is known that monocular deprivation strengthens the deprived eye in adults with normal vision for a short period of time (30-60 minutes). This has been shown through a variety of visual tasks such as binocular combination and rivalry. Methods: Ten subjects were recruited and patched for two hours per day on five consecutive days. We used a binocular phase combination task to measure the subjects' sensory eye balance. We initially measured their baseline sensory eye balance, patched their dominant eye, and then conducted post-patching measurements at 0, 3, 6, 12, 24 and 48 minutes after patching. Results: We performed a 2-way ANOVA (before vs. after patching × Day); we found that although the effect of monocular deprivation on the deprived eye was significant, F(1,9) = 17.32, P = 0.002, the effect of Day was not. Conclusions: Hence we found no accumulation of the patching effect across five days in healthy adults. This suggests that the degree of remnant neural plasticity in adult primary visual cortex may be too limited to be exploited therapeutically.
Funding: Supported by National Natural Science Foundation of China (Nos. 61733004 and 61873266).
Abstract: Orientation measurement of objects is vital in micro assembly. In this paper, we present a novel method based on monocular microscopic vision for 3-D orientation measurement of objects with planar surfaces. The proposed method measures the orientation of the object without requiring calibration of the intrinsic parameters of the microscopic camera. In our method, the orientation of the object is first measured by analytical computation based on feature points. The results of the analytical computation are coarse because the information about the feature points is not fully used. To improve the precision, the orientation measurement is converted into an optimization process based on the relationship between deviations in image space and in Cartesian space under microscopic vision. The results of the analytical computation are used as the initial values of the optimization process. The optimized variables are the three rotational angles of the object and the pixel equivalent coefficient. The objective of the optimization process is to minimize the coordinate differences of the feature points on the object. The precision of the orientation measurement is improved effectively. Experimental and comparative results validate the effectiveness of the proposed method.
Funding: Supported by a grant from the National High Technology Research and Development Program of China (863 Program) (No. 2002AA420110-3); the key project of the State Grid Corporation of China (SGKJ[2007]159)
Abstract: A trajectory tracking method is presented for the visual navigation of a monocular mobile robot. The robot moves along a line trajectory drawn beforehand, and recognizes and stops at a stop sign to carry out a specific task. The robot uses a forward-looking color digital camera to capture the scene in front of it, and segments the trajectory and the stop sign using the HSI model. Then the "sampling estimate" method is used to calculate the navigation parameters. The stop sign is easily recognized, and 256 different signs can be identified. Tests indicate that the method adapts to a wide range of brightness and offers better robustness and real-time performance.
Funding: Funded by Woosong University Academic Research 2024.
Abstract: This study investigates the application of Learnable Memory Vision Transformers (LMViT) for detecting metal surface flaws, comparing their performance with traditional CNNs, specifically ResNet18 and ResNet50, as well as other transformer-based models including Token to Token ViT, ViT without memory, and Parallel ViT. Leveraging a widely used steel surface defect dataset, the research applies data augmentation and t-distributed stochastic neighbor embedding (t-SNE) to enhance feature extraction and understanding. These techniques mitigated overfitting, stabilized training, and improved generalization capabilities. The LMViT model achieved a test accuracy of 97.22%, significantly outperforming ResNet18 (88.89%) and ResNet50 (88.90%), as well as Token to Token ViT (88.46%), ViT without memory (87.18%), and Parallel ViT (91.03%). Furthermore, LMViT exhibited superior training and validation performance, attaining a validation accuracy of 98.2% compared to 91.0% for ResNet18, 96.0% for ResNet50, and 89.12%, 87.51%, and 91.21% for Token to Token ViT, ViT without memory, and Parallel ViT, respectively. The findings highlight LMViT's ability to capture long-range dependencies in images, an area where CNNs struggle due to their reliance on local receptive fields and hierarchical feature extraction. The additional transformer-based models also demonstrate improved performance in capturing complex features over CNNs, with LMViT excelling particularly at detecting subtle and complex defects, which is critical for maintaining product quality and operational efficiency in industrial applications. For instance, the LMViT model successfully identified fine scratches and minor surface irregularities that CNNs often misclassify. This study not only demonstrates LMViT's potential for real-world defect detection but also underscores the promise of other transformer-based architectures like Token to Token ViT, ViT without memory, and Parallel ViT in industrial scenarios where complex spatial relationships are key. Future research may focus on enhancing LMViT's computational efficiency for deployment in real-time quality control systems.
Funding: Supported by Science and Technology Planning Project of Guangzhou City (2024A04J4474).
Abstract: Objective: This study aimed to investigate the prevalence, causes, and influencing factors of vision impairment in the elderly population aged 60 years and above in Mangxin Town, Kashgar region, Xinjiang, China. Located in a region characterized by intense ultraviolet radiation and arid climatic conditions, Mangxin Town presents unique environmental challenges that may exacerbate ocular health issues. Despite the global emphasis on addressing vision impairment among aging populations, there remains a paucity of updated and region-specific data in Xinjiang, necessitating this comprehensive assessment to inform targeted interventions. Methods: A cross-sectional study was conducted from May to June 2024, involving 1,311 elderly participants (76.76% participation rate) out of a total eligible population of 1,708 individuals aged ≥60 years. Participants underwent detailed ocular examinations, including assessments of uncorrected visual acuity (UVA) and best-corrected visual acuity (BCVA) using standard logarithmic charts, slit-lamp biomicroscopy, optical coherence tomography (OCT, Topcon DRI OCT Triton), fundus photography, and intraocular pressure measurement (Canon TX-20 Tonometer). A multidisciplinary team of 10 ophthalmologists and 2 local village doctors, trained rigorously in standardized protocols, ensured consistent data collection. Demographic, lifestyle, and medical history data were collected via questionnaires. Statistical analyses, performed using STATA 16, included multivariate logistic regression to identify risk factors, with significance defined as P<0.05. Results: The overall prevalence of vision impairment was 13.21% (95% CI: 11.37%-15.04%), with low vision at 11.76% (95% CI: 10.01%-13.50%) and blindness at 1.45% (95% CI: 0.80%-2.10%). Cataract emerged as the leading cause, responsible for 68.20% of cases, followed by glaucoma (5.80%), optic atrophy (5.20%), and age-related macular degeneration (2.90%). Vision impairment prevalence escalated significantly with age: 7.74% in the 60–69 age group, 17.79% in 70–79, and 33.72% in those ≥80. Males exhibited higher prevalence than females (15.84% vs. 10.45%, P=0.004). Multivariate analysis revealed age ≥80 years (OR=6.43, 95% CI: 3.79-10.90), male sex (OR=0.53, 95% CI: 0.34-0.83), and daily exercise (OR=0.44, 95% CI: 0.20-0.95) as significant factors. History of eye disease showed a non-significant trend toward increased risk (OR=1.49, P=0.107). Education level, income, and smoking status showed no significant associations. Conclusions: This study underscores cataract as the predominant cause of vision impairment in Mangxin Town's elderly population, with age and sex as critical determinants. The findings align with global patterns but highlight region-specific challenges, such as environmental factors contributing to cataract prevalence. Public health strategies should prioritize improving access to cataract surgery, enhancing grassroots ophthalmic infrastructure, and integrating portable screening technologies for early detection of fundus diseases. Additionally, promoting health education on UV protection and lifestyle modifications, such as regular exercise, may mitigate risks. Future research should expand to broader regions in Xinjiang, employ advanced diagnostic tools for complex conditions like glaucoma, and explore longitudinal trends to refine intervention strategies. These efforts are vital to reducing preventable blindness and improving quality of life for aging populations in underserved areas.
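As a rough arithmetic check on the prevalence figures in the abstract above, the sketch below computes normal-approximation (Wald) 95% confidence intervals from a proportion and the sample size of 1,311. The abstract does not state which interval method was used in STATA, so the Wald formula and the function name `wald_interval` are assumptions; under that assumption the results land within a few hundredths of a percentage point of the reported intervals.

```python
import math

def wald_interval(p, n, z=1.959964):
    """Normal-approximation confidence interval for a proportion p observed in n subjects.

    z defaults to the two-sided 95% quantile of the standard normal distribution.
    """
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width

# Prevalences as reported in the abstract, with n = 1,311 participants.
for label, p in [("vision impairment", 0.1321),
                 ("low vision", 0.1176),
                 ("blindness", 0.0145)]:
    low, high = wald_interval(p, 1311)
    print(f"{label}: {100 * low:.2f}% to {100 * high:.2f}%")
```

The Wald interval is the simplest choice and can behave poorly for proportions near 0 or 1, so for the blindness estimate a Wilson or exact interval would be a reasonable alternative.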