Funding: This work was financially supported by the National Science Fund for Distinguished Young Scholars, China (No. 52025041); the National Natural Science Foundation of China (Nos. 52450003, U2341267, and 52174294); the National Postdoctoral Program for Innovative Talents, China (No. BX20240437); and the Fundamental Research Funds for the Central Universities, China (Nos. FRF-IDRY-23-037 and FRF-TP-20-02C2).
Abstract: Rapid advances in computer vision (CV) technology have transformed traditional approaches to material microstructure analysis. This review outlines the history of CV and explores the applications of deep-learning (DL)-driven CV in four key areas of materials science: microstructure-based performance prediction, microstructure information generation, microstructure defect detection, and crystal structure-based property prediction. CV has significantly reduced the cost of the traditional experimental methods used in material performance prediction. Moreover, recent progress in generating microstructure images and detecting microstructural defects using CV has increased the efficiency and reliability of material performance assessments. DL-driven CV models can accelerate the design of new materials with optimized performance by integrating predictions based on both crystal and microstructural data, thereby enabling the discovery of next-generation materials. Finally, the review provides insights into the rapid interdisciplinary developments in materials science and into future prospects.
Abstract: In the competitive retail industry of the digital era, data-driven insights into gender-specific customer behavior are essential. They support the optimization of store performance, layout design, product placement, and targeted marketing. However, existing computer vision solutions often rely on facial recognition to gather such insights, raising significant privacy and ethical concerns. To address these issues, this paper presents a privacy-preserving customer analytics system built on two key strategies. First, we deploy a deep learning framework using YOLOv9s, trained on the RCA-TVGender dataset. Cameras are positioned perpendicular to observation areas to reduce facial visibility while maintaining accurate gender classification. Second, we apply AES-128 encryption to customer position data, ensuring secure access and regulatory compliance. Overall, our system achieved 81.5% mAP@50, 77.7% precision, and 75.7% recall. Moreover, a 90-min observational study confirmed the system's ability to generate privacy-protected heatmaps revealing distinct behavioral patterns between male and female customers. For instance, women spent more time in certain areas and showed interest in different products. These results confirm the system's effectiveness in enabling personalized layout and marketing strategies without compromising privacy.
Abstract: In this paper, a 3-D video encoding scheme suitable for digital TV/HDTV (high-definition television) is studied through computer simulation. The encoding scheme is designed to provide a good match to human vision. In essence, it transmits low-frequency luminance information at the full frame rate for good motion rendition and high-frequency luminance information at a reduced frame rate for good detail in static images.
Abstract: For decades, visual field defects were considered irreversible because the regeneration potential of neuronal tissue in the visual system was thought to be low. Nevertheless, there is always some potential for partial recovery of a visual field defect, which can be achieved by inducing neuroplasticity. Neuroplasticity refers to the ability of the brain to change its own functional architecture by modulating synaptic efficacy. It is maintained throughout life, and just as neurological rehabilitation can improve motor coordination, visual field defects in glaucoma, diabetic retinopathy, or optic neuropathy can be improved by inducing neuroplasticity. In ophthalmology, many new treatment paradigms that can induce neuroplastic changes have been tested, including non-invasive alternating current stimulation. Treatment with alternating current stimulation (e.g., 30 minutes daily for 10 days using transorbital electrodes and ~10 Hz) activates the entire retina and parts of the brain. Electroencephalography and functional magnetic resonance imaging studies revealed local activation of the visual cortex, global reorganization of functional brain networks, and enhanced blood flow, which together activate neurons and their networks. The outlook for low vision is optimistic because vision loss is indeed partially reversible.
Abstract: This paper presents the discrete wavelet transform (DWT) and its inverse (IDWT) with Haar wavelets as tools to compute variable-size interpolated versions of an image at optimum computational load. As a human observer moves closer to or farther from a scene, the retinal image of the scene zooms in or out, respectively. This zooming in or out can be modeled using variable-scale interpolation. The paper proposes a novel way of applying the DWT and IDWT in a piecewise manner by non-uniform down- or up-sampling of the images to obtain partially sampled versions of the images. The partially sampled versions are then aggregated to obtain the final variable-scale interpolated images. The non-uniform down- or up-sampling is a function of the required scale of interpolation. Appropriate zero padding is used to make the images suitable for the required non-uniform sampling and the subsequent interpolation to the required scale. The concept of a zeroth-level DWT is introduced, which serves as the basis for interpolating images to a size larger than the original. The main emphasis is on computing variable-size images at lower computational load without compromising image quality. The images interpolated to different sizes and the reconstructed images are benchmarked using statistical parameters and visual comparison. The proposed approach is found to perform better than bilinear and bicubic interpolation techniques.
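As a rough illustration of the building blocks this abstract names, the sketch below implements a one-level orthonormal Haar DWT and its inverse on a single image row, then doubles the row length by treating the samples as approximation coefficients with all-zero detail coefficients. This is a common wavelet zero-padding idea chosen here only as an assumption; it is not the paper's piecewise, non-uniform scheme, and the function names (`haar_dwt`, `haar_idwt`, `upsample_2x`) are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level orthonormal Haar DWT of an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def upsample_2x(signal):
    """Double the length of a row (illustrative assumption, not the paper's method).

    The samples are scaled by sqrt(2) and used as approximation coefficients;
    the detail coefficients are set to zero before applying the inverse DWT.
    """
    approx = [v * SQRT2 for v in signal]
    detail = [0.0] * len(signal)
    return haar_idwt(approx, detail)


row = [4.0, 6.0, 10.0, 12.0]
coeffs = haar_dwt(row)
restored = haar_idwt(*coeffs)   # matches `row` up to floating-point rounding
bigger = upsample_2x(row)       # eight samples
```

With Haar wavelets and zero detail coefficients, this upsampling reduces to repeating each sample, which is why any quality gain has to come from how the sub-band pieces are sampled and combined rather than from the inverse transform alone.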
Funding: The National High Technology Research and Development Program (863) of China (No. 2006AA04Z259); the National Natural Science Foundation of China (No. 60643005).
Abstract: A hierarchical simultaneous localization and mapping (SLAM) method for mobile robots that yields accurate maps is presented. The local map level is composed of a set of local metric feature maps that are guaranteed to be statistically independent. The global level is a topological graph whose arcs are labeled with the relative locations between local maps. An estimate of these relative locations is maintained with a local map alignment algorithm, and a more accurate estimate is calculated through a global minimization procedure using the loop closure constraint. The local map is built with a Rao-Blackwellised particle filter (RBPF), where the particle filter is used to extend the path posterior by sampling new poses. Landmark position estimation and update are implemented through an extended Kalman filter (EKF). Monocular vision mounted on the robot tracks 3D natural point landmarks, which are constructed by matching scale-invariant feature transform (SIFT) feature pairs. Matching of the multi-dimensional SIFT features is implemented with a KD-tree at a time cost of O(log₂ N). Experimental results on a Pioneer mobile robot in a real indoor environment show the superior performance of the proposed method.
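The KD-tree matching step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: toy 2-D tuples stand in for SIFT descriptors, the tree is a plain dictionary structure, and the names `build_kdtree`, `sq_dist`, and `nearest` are hypothetical rather than taken from the paper. The logarithmic query cost holds on average for balanced trees over low-dimensional data; for high-dimensional descriptors, practical systems often use approximate search instead.

```python
def build_kdtree(points, depth=0):
    """Recursively build a balanced KD-tree from a list of equal-length tuples."""
    if not points:
        return None
    axis = depth % len(points[0])              # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                     # median point becomes the node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to `query` (exact nearest neighbor)."""
    if node is None:
        return best
    if best is None or sq_dist(query, node["point"]) < sq_dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


features = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(features)
match = nearest(tree, (9, 2))
```

The pruning test on the splitting plane is what lets a query skip most of the tree; without it, the search would degrade to a linear scan.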
Abstract: In this paper, we consider eyes from the human binocular system that simultaneously gaze on stationary point targets in space while optimally skipping from one target to the next by rotating their individual gaze directions. The head is assumed fixed on the torso, and the rotating gaze directions of the two eyes are assumed restricted to pass through a point in the visual space. It is further assumed that, individually, the rotations of the two eyes satisfy the well-known Listing's law. We formulate and study a combined optimal gaze rotation for the two eyes by constructing a single Riemannian metric on the associated parameter space. The goal is to optimally rotate so that the convergent gaze changes between two pre-specified target points in a finite time interval [0, 1]. The cost function we choose is the total energy, measured by the L2 norm, of the six external torques on the binocular system. The torque functions are synthesized by solving an associated two-point boundary value problem. The paper demonstrates, via simulation, the shape of the optimal gaze trajectory of the focused point of the binocular system. The Euclidean distance between the initial and the final point is compared to the arc length of the optimal trajectory. The consumed energy is computed for different eye movement chores and discussed in the paper. Via simulation, we observe that certain eye movement maneuvers are energy efficient and demonstrate that the optimal external torque is a linear function in time. We also explore and conclude that splitting an arbitrary optimal eye movement into optimal vergence and version components is not energy efficient, although this is how human oculomotor control seems to operate. The optimal gaze trajectories and optimal external torque functions reported in this paper are new.