Funding: Supported by the National Natural Science Foundation of China (Nos. U22A2034, 62177047), the High Caliber Foreign Experts Introduction Plan funded by MOST, and the Central South University Research Programme of Advanced Interdisciplinary Studies (No. 2023QYJC020).
Abstract: Enhancing website security is crucial for combating malicious activity, and CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) has become a key method for distinguishing humans from bots. Text-based CAPTCHAs are designed to challenge machines while remaining readable to humans, but recent advances in deep learning have enabled models to recognize them with remarkable efficiency. We propose a two-layer visual attention framework for CAPTCHA recognition that extends traditional attention mechanisms with Guided Visual Attention (GVA), which sharpens the focus on relevant visual features. The framework adapts the well-established image captioning task to CAPTCHA recognition: the first-level attention module guides the second-level attention component, and two LSTM (Long Short-Term Memory) layers are incorporated to improve recognition. Evaluation on four diverse datasets, namely Weibo, BoC (Bank of China), Gregwar, and Captcha 0.3, shows the adaptability and efficacy of the method, which achieves an accuracy of 96.70% on BoC and 95.92% on Weibo. These results indicate that the method recognizes CAPTCHAs accurately and remains robust across varied recognition challenges.
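The abstract does not spell out how the first attention level guides the second, so the following is only a minimal sketch of one plausible reading: the context vector produced by the first attention pass is added to the query of the second pass. The function names, the dot-product scoring, and the vectors `h1` and `h2` (standing in for the hidden states of the two LSTM layers) are all assumptions made for illustration, not the authors' architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Dot-product soft attention.

    Returns the attention weights over the feature vectors and the
    weighted-sum context vector.
    """
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


def guided_attention(h1, h2, features):
    """Two attention passes; the first context guides the second query.

    h1, h2: hypothetical hidden states of the first and second LSTM layer.
    features: list of image-region feature vectors (same length as h1, h2).
    """
    w1, c1 = attend(h1, features)
    # Assumed form of guidance: add the first-level context to the
    # second-level query before scoring the regions again.
    guided_query = [a + b for a, b in zip(h2, c1)]
    w2, c2 = attend(guided_query, features)
    return w1, w2, c2


# Toy example with three 2-D region features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
first_weights, second_weights, context = guided_attention(
    [1.0, 0.0], [0.0, 0.0], regions
)
```

In a real model the scoring function and the way the two levels are combined would be learned; the point here is only the data flow from the first attention level into the second.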
Abstract: Robot vision guidance is an important research area in industrial automation, and image-based target pose estimation is one of its most challenging problems. This paper focuses on target pose estimation and presents a solution based on binocular stereo vision. To improve the robustness and speed of pose estimation, we propose a novel visual tracking algorithm based on the Fourier-Mellin transform to extract the target region. We evaluate the tracking algorithm on the online tracking benchmark-50 (OTB-50), and the results show that it outperforms other lightweight trackers, especially when the target is rotated or scaled. The final experiment shows that the improved pose estimation approach achieves a position accuracy of 1.84 mm at a speed of 7 FPS (frames per second). The approach is also robust to variations in illumination and works well in the range of 250-700 lux.
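As background for the binocular stereo step, the sketch below shows the textbook triangulation of a single matched point for an ideal rectified stereo pair. It is not the paper's pose estimation pipeline: the rectified-camera assumption, the function name, and all parameter names and values are illustrative assumptions.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Recover a 3-D point from one match in a rectified stereo pair.

    u_left, v_left: pixel coordinates of the point in the left image.
    u_right: column of the same point in the right image (same row).
    focal_px: focal length in pixels; baseline_m: camera separation in metres.
    cx, cy: principal point of the left camera in pixels.
    Returns (x, y, z) in metres in the left-camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from similar triangles
    x = (u_left - cx) * z / focal_px       # back-project the pixel column
    y = (v_left - cy) * z / focal_px       # back-project the pixel row
    return x, y, z


# Hypothetical camera: 800 px focal length, 10 cm baseline, 640x480 image.
point = triangulate_rectified(360.0, 240.0, 320.0, 800.0, 0.1, 320.0, 240.0)
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels costs more position accuracy at long range, which is one reason a tight target region from the tracker helps.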
Funding: Supported by the National High Technology Research and Development Program of China (No. 2003AA421030) and the National Science Foundation of China (No. 60375026).
Abstract: A mobile robot system with dual operational modes, based on visual guiding and visual servo control, is presented. The system consists of a mobile robot with a two-axis manipulator and a tele-operation station. In the visual guiding mode, the robot works under open-loop visual servo control, which greatly reduces the operator's manipulation burden. In the visual servo mode, the robot can locate the target assigned by the operator and pick it up with its manipulator. With the operator's help, the difficult problems of finding and handling a target in a complicated environment can be solved easily.
Abstract: Three-wheelers (3Ws) are widely used in low- and middle-income countries, particularly in the Asia-Pacific region, as a comparatively cheap means of passenger transportation and goods delivery. Their frequent use in day-to-day activities has led to a large number of accidents that injure passengers, yet little research has been carried out to identify the reasons behind 3W accidents. A survey carried out prior to this research identified stability control and speed control as the two key factors to which 3W accidents are attributed. The 3W fork is the main mechanical element that controls the balance and stability of the vehicle. A damaged fork (physical damage or a slight deformation) unbalances the 3W and has been identified as one of the reasons for a large number of accidents. Correctly reforming the damaged fork is therefore of paramount importance for the safety of 3Ws. Traditionally, both heat-treating and cold-working techniques are used in the mending process. This manual repair process not only weakens the fork but also produces an inaccurate profile. This paper discusses a hydraulically operated fork mending machine that uses an image processing technique to reform damaged 3W forks. An image comparator-based imaging technique is used for this machine-vision-based, visually guided repair process. Three cameras capture images from three perpendicular directions. A contour sketch of the original fork (before deformation) is compared against the faulty fork to assist the worker in carrying out the repair. Preliminary experiments show that the proposed technique can improve the repositioning of the camber angle by repairing the damaged fork.
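The abstract does not describe how the image comparator scores the difference between the reference contour and the damaged fork. As a stand-in, the sketch below measures how far each point of a measured contour lies from the nearest point of a reference contour (a directed Hausdorff-style deviation). The function names and the toy coordinates are hypothetical and only illustrate the idea of contour comparison in one camera view.

```python
import math


def point_deviations(contour, reference):
    """Distance from each contour point to its nearest reference point."""
    return [min(math.dist(p, r) for r in reference) for p in contour]


def max_deviation(contour, reference):
    """Largest deviation of the contour from the reference outline."""
    return max(point_deviations(contour, reference))


# Toy 2-D outlines in arbitrary units: the last point of the damaged
# outline is displaced by 0.5 relative to the reference.
reference_outline = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
damaged_outline = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5)]
worst = max_deviation(damaged_outline, reference_outline)
```

A worker-assist display could highlight the points whose deviation exceeds a tolerance; with three perpendicular cameras the same comparison would be repeated per view.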
Abstract: Objective: To investigate the potential value of saccade and antisaccade parameters in the early identification of Parkinson's disease (PD) and its motor subtypes. Methods: A total of 111 PD patients [tremor dominant (TD) type in 45, postural instability/gait difficulty (PIGD) type in 54, and indeterminate type in 12] and 54 healthy controls were recruited from the Department of Neurology, Guangdong Provincial People's Hospital, from July 2022 to July 2023. All subjects underwent oculomotor tests, including visually guided saccades and volitional antisaccades, with the Eyeknow-M10-B3 eye tracker. For PD patients, TD and PIGD scores were measured using the Movement Disorder Society Unified Parkinson's Disease Rating Scale (MDS-UPDRS) Part II and Part III. Oculomotor parameters were first compared among TD patients, PIGD patients, and healthy controls. Multiple linear regression analyses were then performed to assess the relationship between the ocular parameters that differed and the TD/PIGD scores. Receiver operating characteristic (ROC) curve analysis was performed between PD patients and healthy controls, as well as between the PIGD and TD subtypes. Results: Compared with healthy controls, PD patients showed significantly decreased saccadic accuracy [100.0% (90.0%, 100.0%) vs 100.0% (100.0%, 100.0%), U=1732.500, P<0.001], prolonged latency [252.2 (228.5, 300.1) ms vs 227.7 (214.2, 241.8) ms, U=1401.000, P<0.001], prolonged minimum duration [233.6 (211.2, 278.8) ms vs 211.0 (200.0, 222.5) ms, U=1534.500, P<0.001] and average duration [356.6 (313.8, 427.8) ms vs 279.4 (267.4, 312.9) ms, U=881.000, P<0.001], and decreased peak velocity [444.4 (335.0, 593.7)°/s vs 526.7 (412.6, 696.2)°/s, U=1971.000, P=0.007] and average velocity [196.3 (144.4, 240.5)°/s vs 256.7 (226.7, 312.0)°/s, U=1330.000, P<0.001] in saccades. In antisaccades, PD patients also showed prolonged latency [432.0 (362.9, 599.8) ms vs 352.9 (309.8, 407.6) ms, U=1553.000, P<0.001], prolonged minimum duration [333.4 (299.8, 377.6) ms vs 290.1 (263.9, 332.9) ms, U=1608.000, P<0.001] and average duration [518.2 (462.7, 603.5) ms vs 424.2 (377.1, 473.5) ms, U=1181.000, P<0.001], and decreased peak velocity [458.5 (327.9, 604.3)°/s vs 560.4 (440.3, 698.5)°/s, U=1838.500, P=0.001] and average velocity [186.6 (143.1, 228.1)°/s vs 263.2 (217.2, 301.5)°/s, U=1131.000, P<0.001]. There was no statistically significant difference in antisaccadic accuracy [55.0% (15.0%, 80.0%) vs 66.7% (39.4%, 86.9%), U=2167.500, P=0.053]. Compared with the TD subtype, PIGD patients showed significantly decreased antisaccadic peak velocity [416.2 (300.3, 534.3)°/s vs 527.1 (402.3, 636.4)°/s, U=-26.474, P=0.009]. After adjusting for age, gender, and education, antisaccadic peak velocity was negatively correlated with the PIGD score in PD patients (β=-0.296, P=0.001), and no correlation with the TD score was found. ROC analysis of the combined saccadic and antisaccadic metrics between PD patients and healthy controls gave an area under the curve (AUC) of 0.918. For antisaccadic peak velocity between the PIGD and TD subtypes, the AUC was 0.690. Conclusion: Eye movement metrics have potential value in distinguishing PD patients from healthy controls. Antisaccadic peak velocity is related to the severity of motor symptoms in PIGD patients, which helps distinguish the motor subtypes of PD.
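The abstract reports both U statistics (presumably from Mann-Whitney U tests) and ROC AUC values. For a single measure these two quantities are directly related: the AUC equals U divided by the product of the two group sizes. The sketch below illustrates that relationship on made-up numbers; the function names and the toy latencies are hypothetical and are not study data. The reported AUC of 0.918 comes from combined metrics, so it cannot be reproduced from any single U value in this way.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: number of (a, b) pairs with a > b.

    Ties contribute 0.5 each, as in the usual definition.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc_from_u(u, n_a, n_b):
    """AUC for separating the groups by this one measure: U / (n_a * n_b)."""
    return u / (n_a * n_b)


# Hypothetical latencies in ms, for illustration only.
patients = [250.0, 300.0, 260.0, 280.0]
controls = [220.0, 240.0, 255.0, 270.0]
u_value = mann_whitney_u(patients, controls)
auc_value = auc_from_u(u_value, len(patients), len(controls))
```

Statistical packages often report the smaller of the two possible U values, in which case the corresponding AUC is one minus the ratio computed here.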
Funding: Supported by the National '863' High-Tech Programme of China under grant No. 863-512-9915-01 and the National Natural Science Foundatio...
Abstract: This paper thoroughly investigates the problem of robot self-location by line correspondences. The original contributions are three-fold: (1) We obtain the necessary and sufficient condition for determining the robot's pose linearly from two line correspondences. (2) We show that if the space lines are vertical, the robot's pose cannot be determined linearly no matter how many line correspondences are available, and that at least 3 line correspondences are needed to determine the pose uniquely (but non-linearly). (3) We show that if the space lines are horizontal, the minimum number of line correspondences is 3 for linear determination and 2 for non-linear determination of the robot's pose.
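The count of three for vertical lines in contribution (2) can be motivated by the standard line-correspondence constraint. The following is a sketch under assumptions that the abstract does not state: the robot moves on a plane, so its pose has three unknowns (a heading angle about the vertical axis and a planar translation), the camera is calibrated, and the symbols l, d, P, R, t are notation introduced here rather than the paper's.

```latex
% A space line through the point P with direction d projects to the image
% line l (homogeneous coefficients in normalized image coordinates).
% With the camera pose X_c = R X_w + t, every point P + \lambda d of the
% space line must project onto l:
\[
  \mathbf{l}^{\top}\bigl(R(\mathbf{P} + \lambda\,\mathbf{d}) + \mathbf{t}\bigr) = 0
  \quad \forall \lambda
  \;\Longleftrightarrow\;
  \mathbf{l}^{\top} R\,\mathbf{d} = 0
  \;\text{ and }\;
  \mathbf{l}^{\top}\bigl(R\,\mathbf{P} + \mathbf{t}\bigr) = 0 .
\]
% Vertical lines under planar motion: R = R_z(\theta) and d = e_z, so
\[
  R_z(\theta)\,\mathbf{e}_z = \mathbf{e}_z
  \;\Longrightarrow\;
  \mathbf{l}^{\top} R\,\mathbf{d} = \mathbf{l}^{\top}\mathbf{e}_z ,
\]
% and the direction constraint no longer involves the unknown pose.
```

Under these assumptions each vertical line contributes only one equation in the three unknowns, which is consistent with the minimum of 3 correspondences stated in (2). The linearity results themselves depend on the paper's own analysis and are not derived here.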