Funding: the National Natural Science Foundation of China (No. 61136002), the Key Project of Chinese Ministry of Education (No. 211180), and the Shaanxi Provincial Industrial and Technological Project (No. 2011k06-47).
Abstract: This paper describes a dynamically reconfigurable data-flow hardware architecture optimized for image and video computation. It is a scalable, hierarchically organized parallel architecture that consists of data-flow clusters and finite-state machine (FSM) controllers. Each cluster contains various kinds of cells that are optimized for video processing. To facilitate the design process, we also provide a C-like language for design specification, together with associated design tools. Several video applications have been implemented on the architecture to demonstrate its applicability and flexibility. Experimental results show that the architecture, along with its video applications, can be used in many real-time video processing tasks.
Funding: the 2020 Innovation and Entrepreneurship Training Program for College Students in Jiangsu Province (project: Traceable multi-functional intelligent express cabinet, No. 201911460090P, No. 202011460090T); the National Natural Science Foundation of China Youth Science Foundation (project: Research on Deep Discriminant Sparse Representation Learning Method for Feature Extraction, No. 61806098); and the Scientific Research Project of Nanjing XiaoZhuang University (project: Multi-robot collaborative system, No. 2017NXY16).
Abstract: Although the express delivery industry is large, fully intelligent receiving and sending of parcels remains difficult to achieve. This paper proposes an intelligent express delivery system based on OpenCV image and video processing, the Faster R-CNN object detection algorithm, and other technologies. Using a depth camera and an electronic scale, the system identifies the category, volume, and weight of the items the sender places on the scale, and stores video of the objects being packed into the cabinet. The overall framework of the system was constructed, key technologies were applied to realize it, and its functions were tested. The experimental results show that the integrated system, which combines intelligent identification with information traceability, achieves intelligent automation of receiving and delivery, which promotes the development of the express delivery industry.
Funding: financial support from the Brazilian Federal Agency for Support and Evaluation of Graduate Education (Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior, CAPES; scholarship process no. BEX 0506/15-0); the Brazilian National Agency of Petroleum, Natural Gas and Biofuels (Agencia Nacional do Petroleo, Gas Natural e Biocombustiveis, ANP), in cooperation with the Brazilian Financier of Studies and Projects (Financiadora de Estudos e Projetos, FINEP) and the Brazilian Ministry of Science, Technology and Innovation (Ministério da Ciencia, Tecnologia e Inovacao, MCTI), through the ANP's Human Resources Program of the State University of Sao Paulo (Universidade Estadual Paulista, UNESP) for the Oil and Gas Sector, PRH-ANP/MCTI no. 48 (PRH48).
Abstract: Determining stream velocity matters in many sectors of industry, and increasingly so as precise measurements become necessary to set correct production rates, determine the volumetric production of undesired fluid, establish automated controls that avoid over-flooding or over-production, support accurate predictive maintenance, and so on. One difficulty is determining the velocity of a specific fluid embedded in another, for example the stream velocity of gas bubbles flowing through a liquid phase. Although various applicable methods have been researched and implemented in industry, a non-intrusive, automated way of providing those stream velocities is valuable and may have a large impact on project budgets. The script developed here breaks real-time video into frame images and analyzes pixel correlations for possible superposition matches, from which the gas bubble stream velocity is estimated. The script relies on functions and procedures already available in MATLAB for image processing. In the test run its accuracy was around 97%; the raw source code with comments had almost 3000 characters; and it ran on a workstation with an Intel Core Duo at 2.13 GHz and 2 GB of RAM. Despite the good results, only the end-point correlations actually contributed to the final solution. Self-learning functions or a neural network could therefore enhance the application's ability to run in real time without being exhausted by iterative loops.
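The frame-to-frame correlation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB script: the function names, the use of a single pixel row per frame, the Pearson correlation as the matching score, and the `metres_per_pixel` and `fps` calibration parameters are all assumptions made for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0


def best_shift(prev_row, curr_row, max_shift):
    """Shift in pixels (0..max_shift) at which curr_row best matches prev_row."""
    best, best_score = 0, float("-inf")
    for shift in range(max_shift + 1):
        # Compare only the samples that overlap once curr_row is shifted back.
        overlap_prev = prev_row[:len(prev_row) - shift]
        overlap_curr = curr_row[shift:]
        score = pearson(overlap_prev, overlap_curr)
        if score > best_score:
            best, best_score = shift, score
    return best


def stream_velocity(prev_row, curr_row, max_shift, metres_per_pixel, fps):
    """Convert the best pixel shift between consecutive frames into a velocity."""
    return best_shift(prev_row, curr_row, max_shift) * metres_per_pixel * fps


# Hypothetical data: a bright "bubble" that moves three pixels between frames.
prev_row = [0, 0, 9, 9, 0, 0, 0, 0, 0, 0]
curr_row = [0, 0, 0, 0, 0, 9, 9, 0, 0, 0]
shift = best_shift(prev_row, curr_row, max_shift=5)
```

A real implementation would correlate two-dimensional regions and handle several bubbles at once; the sketch only shows how a correlation peak becomes a displacement and then a velocity.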
Abstract: The alpha-stable self-similar stochastic process has proved to be an effective model for highly variable data traffic. This paper examines some special issues and considerations in using the process to model aggregated VBR video traffic. Different methods for estimating the stability parameter α and the self-similarity parameter H are compared. Procedures for generating linear fractional stable noise (LFSN) and alpha-stable random variables are provided. Model construction and quantitative comparisons with fractional Brownian motion (FBM) and real traffic are also examined. Open problems and future directions are discussed.
Funding: National Natural Science Foundation of China (60702012)
Abstract: The quality of the side information has an immense effect on the compression efficiency of a distributed video coding (DVC) system. Based on hierarchical motion estimation (HME), this article proposes a new side information generation algorithm that is integrated into the DVC system. First, forward motion estimation (FME) and bidirectional motion estimation (BME), both built on a variable-block-size HME algorithm, are used to acquire relatively accurate motion vectors. Second, a motion vector filter (MVF) is i...
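Abstract: This paper introduces an autofocus system for USB video cameras. Video images captured by the USB video camera are processed on a computer with an FFT or a differential operation to obtain spectral magnitude data or differential magnitude data. From these data the computer determines whether the camera lens is out of focus and controls a motor to move the lens to the in-focus position. The paper also discusses measures for improving autofocus accuracy. Experimental results show that the system achieves autofocus for the USB video camera well, and that it will make video cameras with a USB interface simpler and more convenient to use.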
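As a minimal sketch of the differential variant described in this abstract, the snippet below scores a grayscale frame by the sum of squared differences between neighbouring pixels and picks the lens position whose frame scores highest. The function names, the exact focus measure, and the exhaustive search over positions are assumptions for illustration; the abstract does not specify how the system computes or searches the measure.

```python
def gradient_focus_measure(image):
    """Sum of squared horizontal and vertical pixel differences.

    `image` is a list of rows of grayscale values. Sharper images have
    stronger local differences and therefore a larger score.
    """
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += (image[r][c + 1] - image[r][c]) ** 2
            if r + 1 < rows:
                total += (image[r + 1][c] - image[r][c]) ** 2
    return total


def best_lens_position(frames_by_position):
    """Return the lens position whose frame has the highest focus measure."""
    return max(frames_by_position,
               key=lambda pos: gradient_focus_measure(frames_by_position[pos]))


# Hypothetical frames captured at three motor positions.
frames = {
    0: [[5, 5], [5, 5]],    # fully blurred: no local contrast
    1: [[0, 10], [10, 0]],  # sharp: strong local contrast
    2: [[4, 6], [6, 4]],    # slightly blurred
}
in_focus = best_lens_position(frames)
```

An FFT-based measure would instead sum the spectral magnitude at high spatial frequencies; in both cases the controller moves the lens toward the position that maximizes the score.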
Abstract: During the past decade, feature extraction and knowledge acquisition based on video analysis have been extensively researched and tested in many applications, such as closed-circuit television (CCTV) data analysis, large-scale public event control, and other daily security monitoring and surveillance operations, with various degrees of success. However, because video processing is multi-phased and draws on theories and techniques ranging from fundamental image processing, computational geometry and graphics, and machine vision to advanced artificial intelligence, pattern analysis, and even cognitive science, many important problems remain to be resolved before it can be widely applied. Video event identification and detection are two prominent ones. In contrast to the frame-to-frame processing mode used by most current approaches and systems, this project reorganizes video data as a 3D volume structure that provides hybrid spatial and temporal information in a unified space. This paper reports an innovative technique for transforming the original video frames into 3D volume structures described by spatial and temporal features. It then highlights the volume array structure in a so-called "pre-suspicion" mechanism for later processing. The focus of this report is the development of an effective and efficient voxel-based segmentation technique that suits the volumetric nature of video events and is ready for deployment in 3D clustering operations. The paper concludes with a performance evaluation of the devised technique and a discussion of future work on accelerating the pre-processing of the original video data.
Funding: the National Natural Science Foundation of China (60532070)
Abstract: A novel temporal shape error concealment technique is proposed, which can be used in the context of object-based video coding schemes. To reduce the effect of the shape variations of a video object, the curvature scale space (CSS) technique is adopted to extract features, and these features are then used for boundary matching between the current frame and the previous frame. Because temporal, spatial, and statistical video contour information is all considered, the proposed method can find the optimal match, which is used to replace the damaged contours. The simulation results show that the proposed algorithm achieves better subjective and objective quality and higher efficiency than previously developed methods.
Abstract: With the growth of digital media manipulation, driven by readily available tampering software, the authenticity of records is at high risk, especially for video. There is a pressing need to detect such tampering and take the necessary action. In this work, we propose an approach that detects inter-frame video forgery using deep features obtained from a parallel deep neural network model together with thorough analytical computations. The proposed approach uses only the deep features extracted from the CNN model and then applies a conventional mathematical approach to these features to find forgery in the video. The correlation coefficient is calculated from the deep features of adjacent frames rather than directly from the frames. We divide forgery detection into two phases: video forgery detection and video forgery classification. In video forgery detection, the approach decides whether the input video is original or tampered. If the video is not original, it is passed to the next phase, video forgery classification, in which the method examines the forged video for insertion forgery and deletion forgery and checks again for originality. The proposed work is generalized and is tested on two different datasets. The experimental results show that our approach detects forgery with an accuracy of 91% on the VIFFD dataset and 90% on the TDTV dataset, and classifies the type of forgery (insertion or deletion) with an accuracy of 82% on the VIFFD dataset and 86% on the TDTV dataset. This work can help in the analysis of original and tampered video in various domains.
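The adjacent-frame correlation step in this abstract can be sketched as follows. The snippet assumes each frame has already been reduced to a feature vector (in the paper these come from a CNN; here they are hypothetical hand-written lists), and the fixed threshold used to flag a transition is an illustrative assumption, not the authors' decision rule.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0


def adjacent_correlations(features):
    """Correlation between the feature vectors of each pair of adjacent frames."""
    return [pearson(features[i], features[i + 1])
            for i in range(len(features) - 1)]


def suspicious_transitions(features, threshold=0.5):
    """Indices i where frames i and i+1 correlate below the (assumed) threshold."""
    return [i for i, r in enumerate(adjacent_correlations(features))
            if r < threshold]


# Hypothetical per-frame features: frames 0-1 and 2-3 are similar to each
# other, but the content changes abruptly between frames 1 and 2.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.0, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 3.0, 2.2, 1.0],
]
flagged = suspicious_transitions(features)
```

A sharp drop in correlation marks a candidate splice point; telling insertion apart from deletion, as the second phase in the abstract does, needs further analysis that this sketch does not attempt.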
Funding: National Natural Science Foundation of China (No. 69905003)
Abstract: Intensity flicker is a common form of degradation in archived film. Most algorithms for this distortion are complicated and hard to control. This paper presents a discrete mathematical model of flicker and designs a block-based method for estimating the model's parameters from the way intensity varies over large areas. The estimation result is used to construct a compensation model that repairs the current frame. The restoration approach is fully automatic, and repairing the current frame does not require information from the frames that follow it. The algorithm was implemented as a simple and adjustable repair system. The experimental results show that the proposed algorithm removes most intensity flicker while preserving the wanted effects.
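The abstract does not give the flicker model itself. As a hedged illustration of block-based parameter estimation and compensation, the sketch below assumes a linear gain-and-offset model, observed = gain × original + offset, and estimates both parameters for one block by matching its mean and standard deviation to the same block in an earlier, already restored frame. The model, the function names, and the estimator are assumptions, not the paper's method.

```python
from statistics import mean, pstdev


def estimate_flicker(reference_block, observed_block):
    """Estimate (gain, offset) such that observed ≈ gain * reference + offset.

    Both blocks are flat lists of pixel intensities covering the same area.
    """
    ref_std = pstdev(reference_block)
    # A flat reference block carries no contrast information; assume unit gain.
    gain = pstdev(observed_block) / ref_std if ref_std else 1.0
    if gain == 0:
        gain = 1.0  # a flat observed block cannot be inverted; fall back
    offset = mean(observed_block) - gain * mean(reference_block)
    return gain, offset


def compensate(observed_block, gain, offset):
    """Invert the assumed flicker model to recover the original intensities."""
    return [(y - offset) / gain for y in observed_block]


# Hypothetical block: the current frame is the reference brightened by
# a gain of 1.5 and an offset of 5.
reference = [10, 20, 30, 40]
observed = [1.5 * x + 5 for x in reference]
gain, offset = estimate_flicker(reference, observed)
restored = compensate(observed, gain, offset)
```

Because the estimate uses only an earlier frame as reference, the sketch is consistent with the abstract's statement that repairing the current frame needs no information from later frames.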
Funding: supported in part by NSFC (62002320, U19B2043, 61672456) and the Key R&D Program of Zhejiang Province, China (2021C01119).
Abstract: Video steganography plays an important role in secret communication: it conceals a secret video in a cover video by perturbing pixel values in the cover frames. Imperceptibility is the first and foremost requirement of any steganographic approach. Inspired by the fact that human eyes perceive pixel perturbation differently in different video areas, this paper proposes a novel, effective, and efficient Deeply-Recursive Attention Network (DRANet) for video steganography, which finds suitable areas for information hiding by modelling spatio-temporal attention. The DRANet contains two main components: a Non-Local Self-Attention (NLSA) block and a Non-Local Co-Attention (NLCA) block. The NLSA block selects the cover frame areas that are suitable for hiding by computing inter-frame and intra-frame correlations among the cover frames. The NLCA block produces enhanced representations of the secret frames to improve the robustness of the model and to alleviate the influence of different areas in the secret video. Furthermore, the DRANet reduces the number of model parameters by recursively performing similar operations on the different frames of an input video. Experimental results show that the proposed DRANet achieves better performance with fewer parameters than state-of-the-art competitors.
Abstract: Video-based vehicle detection technology is an integral part of Intelligent Transportation Systems (ITS) because it is non-intrusive and can collect comprehensive data on vehicle behavior. This paper proposes an efficient video-based vehicle detection system built on the Harris-Stephens corner detector algorithm. The algorithm was used to develop a stand-alone vehicle detection and tracking system that determines vehicle counts and speeds on arterial roadways and freeways. The system was designed to eliminate the need for complex calibration, to be robust to contrast variations, and to perform better on low-resolution video. The accuracy of the algorithm in vehicle counts and speeds was evaluated, and the performance of the proposed system is equivalent to or better than that of a commercial vehicle detection system. Using the developed detection and tracking system, an advance warning intelligent transportation system was designed and implemented to alert commuters ahead of speed reductions and congestion at work zones and special events. The effectiveness of the advance warning system was evaluated and its impact is discussed.
Funding: supported by the National Natural Science Foundation of P. R. China (60121302) and the National High Technology Research and Development Program of P. R. China (2002AA142100)
Abstract: This paper addresses the problem of detecting objectionable videos, which has not been carefully studied before. The method can be used to filter objectionable videos on the Internet efficiently. A tensor-based key-frame selection algorithm, a cube-based color model, and an objectionable video estimation algorithm are presented. Key-frame selection is based on motion analysis using the three-dimensional structure tensor. The cube-based color model is then employed to detect skin color in each key frame. Finally, the video estimation algorithm is applied to estimate the degree to which a video is objectionable. Experimental results on a variety of real-world videos downloaded from the Internet show that the method is promising.
Funding: supported by the National Natural Science Foundation of China under Grant No. 61301101
Abstract: The transmission delay of a real-time video packet depends mainly on the sensing time delay (a short-term factor) and the transmission delay of the entire frame (a long-term factor). The optimization problem in the spectrum handoff process should therefore be formulated as a combination of microscopic and macroscopic optimization. In this paper, we focus on combining these two optimization models and propose a novel Evolution Spectrum Handoff (ESH) strategy to minimize the expected transmission delay of real-time video packets. In the micro-optimized model, considering the tradeoff between the Primary User's (PU's) allowable collision percentage on each channel and the transmission delay of a video packet, we propose a mixed-integer non-linear programming scheme. The scheme achieves the minimum sensing time, which is termed the optimal stopping time. In the macro-optimized model, using the optimal stopping time as the reward function within the partially observable Markov decision process framework, the ESH strategy is designed to search for an optimal target channel set and to minimize the expected packet delay in long-term real-time video transmission. The minimum expected transmission delay is obtained under practical cognitive radio network conditions, i.e., secondary user mobility, random PU access, imperfect sensing information, etc. Theoretical analysis and simulation results show that the ESH strategy can effectively reduce the transmission delay of video packets in the spectrum handoff process.