In view of the weak ability of convolutional neural networks to explicitly learn spatial invariance, and the probabilistic loss of discriminative features caused by occlusion and background interference in pedestrian re-identification tasks, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the problem of misalignment of pedestrian spatial features. Then the network is divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch to extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses differential pooling to obtain differential features, which are fused with global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID, and CUHK03. The experimental results are better than those of the comparison methods, which verifies the effectiveness of the proposed method.
Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been an attractive research topic in computer vision and has drawn the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency features of HSI. Spatial features contain rich local details, spectral features contain the information and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution to obtain higher-quality high-resolution hyperspectral images. In MFNet, a frequency feature extraction module (FFEM) extracts the frequency features. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse multidimensional features. Experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
To address the low classification accuracy and low utilization of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method based on Gabor spatial texture features, nonparametric weighted spectral features, and the sparse representation classification method (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed GNWSF–SRC method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We use the proposed method to process two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods.
Embodied visual exploration is critical for building intelligent visual agents. This paper presents neural exploration with feature-based visual odometry and tracking-failure-reduction policy (NeOR), a framework for embodied visual exploration that retains the efficient exploration capabilities of deep reinforcement learning (DRL)-based exploration policies and leverages feature-based visual odometry (VO) for more accurate mapping and positioning results. An improved local policy is also proposed to reduce tracking failures of feature-based VO in weakly textured scenes through a refined multi-discrete action space, keyframe fusion, and an auxiliary task. The experimental results demonstrate that NeOR has better mapping and positioning accuracy than other entirely learning-based exploration frameworks and improves the robustness of feature-based VO by significantly reducing tracking failures in weakly textured scenes.
In this paper, we present a novel and efficient scheme for extracting, indexing, and retrieving color images. Our motivation is to reduce the space overhead of partition-based approaches by taking advantage of the fact that only a relatively low number of distinct values of a particular visual feature is present in most images. To extract color features and build indices into our image database, we take into consideration factors such as human color perception and perceptual range, and the image is partitioned into a set of regions using a simple classifying scheme. The compact color feature vector and the spatial color histogram, which are extracted from the segmented image regions, are used to represent the color and spatial information in the image. We have also developed region-based distance measures to compare the similarity of two images. Extensive tests on a large image collection were conducted to demonstrate the effectiveness of the proposed approach.
At present, underwater terrain images are all strip-shaped small fragment images preprocessed by the side-scan sonar imaging system. However, the processed underwater terrain images have few, inconspicuous feature points. To better stitch underwater terrain images and address the slow speed of traditional image stitching, we propose an improved algorithm for underwater terrain image stitching based on spatial gradient feature blocks. First, the spatial gradient fuzzy C-means algorithm is used to divide the underwater terrain image into feature blocks by fusing spatial and gradient information. The accelerated-KAZE (AKAZE) algorithm then combines the feature block information to match the reference image and the target image. Next, random sample consensus (RANSAC) is applied to optimize the matching results. Finally, image fusion is performed with the global homography and the optimal seam-line method to improve the accuracy of image overlay fusion. The experimental results show that the proposed method effectively divides images into feature blocks by combining spatial information and gradient information, which not only solves the problem of stitching failure caused by inconspicuous features and further reduces the sensitivity to noise, but also reduces the iterative calculation in the feature point matching process of the traditional method and improves the stitching speed. Ghosting and shape warping are significantly reduced by re-optimizing the overlap of the image.
Rapid development of deepfake technology has led to the spread of forged audio and video across network platforms, presenting risks for numerous countries, societies, and individuals, and posing a serious threat to cyberspace security. To address the insufficient extraction of spatial features and the fact that temporal features are not considered in deepfake video detection, we propose a detection method based on an improved CapsNet and temporal–spatial features (iCapsNet–TSF). First, the dynamic routing algorithm of CapsNet is improved using weight initialization and updating. Then, the optical flow algorithm is used to extract interframe temporal features of the videos to form a dataset of temporal–spatial features. Finally, the iCapsNet model is employed to fully learn the temporal–spatial features of facial videos, and the results are fused. Experimental results show that the detection accuracy of iCapsNet–TSF reaches 94.07%, 98.83%, and 98.50% on the Celeb-DF, FaceSwap, and Deepfakes datasets, respectively, outperforming most existing mainstream algorithms. The iCapsNet–TSF method combines the capsule network and the optical flow algorithm, providing a novel strategy for deepfake detection, which is of great significance to the prevention of deepfake attacks and the preservation of cyberspace security.
Feature-based design has been regarded as a promising approach for CAD/CAM integration. This paper aims to establish a domain-independent representation formalism for feature-based design in three aspects: formal representation, design process model, and design algorithm. The implementation scheme and formal description of feature taxonomy, feature operators, feature model validation, and feature transformation are given. A feature-based design process model suited to either sequential or concurrent engineering is proposed, and its application to product structural design and process plan design is presented. Some general design algorithms for developing feature-based design systems are also addressed. The proposed scheme provides an elementary formal methodology for developing and operating feature-based design systems in a structured way.
Multi-Object Tracking (MOT) is a fundamental but computationally demanding task in computer vision, with particular challenges arising in occluded and densely populated environments. While contemporary tracking systems have demonstrated considerable progress, persistent limitations, notably frequent occlusion-induced identity switches and tracking inaccuracies, continue to impede reliable real-world deployment. This work introduces a tracking framework that enhances association robustness through a two-stage matching paradigm combining spatial and appearance features. The proposed framework employs: (1) a Height Modulated and Scale Adaptive Spatial Intersection-over-Union (HMSIoU) metric for improved spatial correspondence estimation across variable object scales and partial occlusions; (2) a feature extraction module generating discriminative appearance descriptors for identity maintenance; and (3) a recovery association mechanism for refining matches between unassociated tracks and detections. Comprehensive evaluation on the standard MOT17 and MOT20 benchmarks demonstrates significant improvements in tracking consistency, with state-of-the-art performance across key metrics including HOTA (64), MOTA (80.7), IDF1 (79.8), and IDs (1379). These results substantiate the efficacy of our Cue-Tracker framework in complex real-world scenarios characterized by occlusions and crowd interactions.
Remote sensing cross-modal image-text retrieval (RSCIR) can flexibly and subjectively retrieve remote sensing images using query text, and has recently received growing attention from researchers. However, with the increasing number of parameters in visual-language pre-training models, direct transfer learning consumes a substantial amount of computational and storage resources. Moreover, recently proposed parameter-efficient transfer learning methods mainly focus on the reconstruction of channel features, ignoring the spatial features that are vital for modeling key entity relationships. To address these issues, we design an efficient transfer learning framework for RSCIR based on spatial feature efficient reconstruction (SPER). A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships. The spatial adapter spatially reconstructs the features in the backbone with few parameters while incorporating prior information from the channel dimension. We conduct quantitative and qualitative experiments on two commonly used RSCIR datasets. Compared with traditional methods, our approach achieves an improvement of 3% to 11% in the sumR metric. Compared with methods that fine-tune all parameters, our proposed method trains less than 1% of the parameters while maintaining an overall performance of about 96%.
This paper proposes an approach to developing a feature-based parametric product modeling system suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII's product data is managed by the object-oriented database management system OSCAR, and the product model is built according to the STEP standard. The product design is established on a unified product model, and all the product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design and supports the integration of CAD, CAPP, and CAM.
To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values and better extracts feature information from low-resolution images. Then a lightweight bidirectional feature pyramid network (L-BiFPN) is proposed, which efficiently fuses features of different scales. In addition, an intersection over union loss based on the minimum point distance (MPDIoU) is introduced for bounding box regression, which yields faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50% to 95% IoU (mAP50-95), respectively, and 38.1%, 37.3%, and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public PASCAL VOC dataset, and the results show that F-YOLOv8 has excellent generalized detection performance.
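The abstract does not spell out the MPDIoU loss, so the following is only a minimal sketch under an assumption: it follows the commonly published form, in which the IoU is penalized by the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. The function names `mpdiou` and `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2.

    Assumed form: IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2), where d1 and
    d2 are the distances between the top-left and bottom-right corners and
    (w, h) is the input image size.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_pred, box_true, img_w, img_h):
    """Regression loss: zero when the predicted box equals the target box."""
    return 1.0 - mpdiou(box_pred, box_true, img_w, img_h)
```

Unlike plain IoU, the corner-distance terms still change when two boxes do not overlap at all, which is one plausible reason such a loss converges faster.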
A large database is desired for machine learning (ML) technology to make accurate predictions of materials' physicochemical properties based on their molecular structure. When a large database is not available, developing a proper featurization method based on the physicochemical nature of the target properties can improve the predictive power of ML models with a smaller database. In this work, we show that two new featurization methods, the volume occupation spatial matrix and the heat contribution spatial matrix, can improve the accuracy in predicting energetic materials' crystal density (ρ_crystal) and solid-phase enthalpy of formation (H_f,solid) using a database containing 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. Leave-one-out cross-validation shows that the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_crystal and H_f,solid of CHON-based molecules in the PubChem database of 150 million entries, screening out 56 candidates with competitive detonation performance and reasonable chemical structures. With further improvement, spatial matrices have the potential to become multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Including textures in image classification has been shown to be beneficial. This paper studies an efficient use of semivariogram features for object-based high-resolution image classification. First, an input image is divided into segments, and a semivariogram is calculated for each segment. Second, candidate features are extracted at a number of key locations of the semivariogram functions. Third, an improved Relief algorithm and principal component analysis are used to select independent and significant features. The selected semivariogram features and the conventional spectral features are then combined into a feature vector for a support vector machine classifier. The effect of the selected semivariogram features is compared with those of gray-level co-occurrence matrix (GLCM) features and window-based semivariogram texture features (STFs). Tests with aerial and satellite images show that the selected semivariogram features are a more beneficial supplement to spectral features. The described method yields a higher classification accuracy than the combination of spectral features with GLCM features or STFs.
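To make the per-segment semivariogram concrete, here is a minimal sketch of the standard empirical estimator, gamma(h) = sum of squared differences over all pixel pairs at lag h, divided by twice the number of pairs. It assumes a rectangular segment stored as a list of pixel rows and considers horizontal lags only; the function name `horizontal_semivariogram` is hypothetical, and the paper's segments and lag directions may differ.

```python
def horizontal_semivariogram(segment, max_lag):
    """Empirical semivariogram of a 2-D segment for horizontal lags 1..max_lag.

    `segment` is a list of equally long rows of pixel values. Returns a list
    whose entry h-1 is gamma(h) = sum((z[i] - z[i+h])^2) / (2 * N(h)), where
    the sum runs over all N(h) pixel pairs that are h columns apart.
    """
    width = len(segment[0])
    if max_lag >= width:
        raise ValueError("max_lag must be smaller than the segment width")

    gamma = []
    for h in range(1, max_lag + 1):
        sq_diff_sum = 0.0
        pair_count = 0
        for row in segment:
            for i in range(width - h):
                sq_diff_sum += (row[i] - row[i + h]) ** 2
                pair_count += 1
        gamma.append(sq_diff_sum / (2 * pair_count))
    return gamma


# A smooth horizontal ramp: dissimilarity grows with the lag.
ramp = [[1, 2, 3, 4], [1, 2, 3, 4]]
print(horizontal_semivariogram(ramp, 3))  # [0.5, 2.0, 4.5]
```

Candidate texture features would then be read off this curve at selected lags, which is the step the abstract describes as extracting features at key locations of the semivariogram.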
In this paper, a feature selection method combining the ReliefF and SVM-RFE algorithms is proposed. The algorithm integrates the weight vector from ReliefF into the SVM-RFE method. ReliefF filters out many noisy features in the first stage. A new ranking criterion based on the SVM-RFE method is then applied to obtain the final feature subset. An SVM classifier is used to evaluate the final image classification accuracy. Experimental results show that the proposed relief-SVM-RFE algorithm achieves significant improvements for feature selection in image classification.
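The first, filtering stage can be illustrated with a simplified sketch. It uses the basic two-class Relief rule (one nearest hit and one nearest miss per sample) rather than the full ReliefF variant, visits every sample instead of sampling randomly, and omits the SVM-RFE stage entirely. Features are assumed to be scaled to [0, 1], each class needs at least two samples, and the names `relief_weights` and `filter_features` are hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief feature weights for a two-class problem.

    For every sample, a feature gains weight when it differs from the nearest
    sample of the other class (miss) and loses weight when it differs from the
    nearest sample of the same class (hit). Distances are Manhattan distances.
    """
    n, d = len(X), len(X[0])

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    weights = [0.0] * d
    for i in range(n):
        hits = [X[j] for j in range(n) if j != i and y[j] == y[i]]
        misses = [X[j] for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda s: dist(X[i], s))
        miss = min(misses, key=lambda s: dist(X[i], s))
        for f in range(d):
            weights[f] += (abs(X[i][f] - miss[f]) - abs(X[i][f] - hit[f])) / n
    return weights


def filter_features(weights, threshold=0.0):
    """Indices of the features whose Relief weight exceeds the threshold."""
    return [f for f, w in enumerate(weights) if w > threshold]


# Feature 0 separates the classes; feature 1 is noise.
X = [[0.0, 0.1], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
print(filter_features(relief_weights(X, y)))  # [0]
```

In the method described above, the surviving features would then be ranked by the SVM-RFE-based criterion, with the Relief weights folded into that ranking.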
In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two singular values (SVs) is below a given threshold, the block is considered informative. A total of 12 features, including statistics of brightness, color components, and texture measures, are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6, giving the final feature vectors. The relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors and hence the blocks. Blocks falling into the same group show similar visual appearances.
The product information model for welding structures plays an important role in the integration of welding CAD/CAPP/CAM. However, existing CAD modeling systems are not capable of providing enough information for subsequent manufacturing activities such as CAPP and CAM. A new design approach using feature techniques and object-oriented programming is put forward in this paper to create the product information model of a welding structure. With this approach, the product information model can effectively support computer-aided welding process planning, fixturing, assembling, welding robot path planning, and other manufacturing activities. The feature classification and representation scheme of welding structures are discussed. A prototype system is developed based on features and object-oriented programming. Its structure and functions are given in detail.
In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
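As background for the approximate schemes discussed above, the following sketch shows plain, exact policy iteration on a tiny finite-state discounted problem with cost minimization. It is not the feature-based aggregation method: there are no features or aggregate states, and policy evaluation is done by repeated fixed-point sweeps. The data layout (`P[a][s][t]` for transition probabilities, `g[a][s]` for expected stage costs) and the two-state example are assumptions made for illustration.

```python
def evaluate_policy(P, g, policy, alpha, sweeps=500):
    """Approximate the cost J of a fixed policy by repeated Bellman sweeps."""
    n = len(policy)
    J = [0.0] * n
    for _ in range(sweeps):
        J = [
            g[policy[s]][s]
            + alpha * sum(P[policy[s]][s][t] * J[t] for t in range(n))
            for s in range(n)
        ]
    return J


def improve_policy(P, g, J, alpha):
    """Greedy one-step lookahead: pick the cost-minimizing action per state."""
    n, num_actions = len(J), len(P)
    policy = []
    for s in range(n):
        q = [
            g[a][s] + alpha * sum(P[a][s][t] * J[t] for t in range(n))
            for a in range(num_actions)
        ]
        policy.append(min(range(num_actions), key=q.__getitem__))
    return policy


def policy_iteration(P, g, alpha, max_iters=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * len(g[0])
    J = evaluate_policy(P, g, policy, alpha)
    for _ in range(max_iters):
        new_policy = improve_policy(P, g, J, alpha)
        if new_policy == policy:
            break
        policy = new_policy
        J = evaluate_policy(P, g, policy, alpha)
    return policy, J


# Two states, two actions: action 0 stays put, action 1 switches state.
# State 0 costs 1 per stage and state 1 costs 0; switching adds 0.5.
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
g = [[1.0, 0.0], [1.5, 0.5]]
policy, J = policy_iteration(P, g, alpha=0.9)
print(policy)  # [1, 0]
```

In the aggregation setting, the evaluation step would instead be carried out on the smaller aggregate problem whose states relate to features, and the resulting cost approximation would drive the same greedy improvement step.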
Funding (A-STN person re-identification): the Foshan Science and Technology Innovation Team Project (No. FS0AA-KJ919-4402-0060) and the National Natural Science Foundation of China (No. 62263018).
Funding (MFNet hyperspectral super-resolution): supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK249909299001-036), the National Key Research and Development Program of China (No. 2023YFB4502803), and the Zhejiang Provincial Natural Science Foundation of China (No. LDT23F01014F01).
Funding (GNWSF–SRC hyperspectral classification): supported by the National Natural Science Foundation of China (No. 61275010), the Ph.D. Programs Foundation of Ministry of Education of China (No. 20132304110007), the Heilongjiang Natural Science Foundation (No. F201409), and the Fundamental Research Funds for the Central Universities (No. HEUCFD1410).
Funding (NeOR embodied visual exploration): supported by the National Natural Science Foundation of China (No. 62202137), the China Postdoctoral Science Foundation (No. 2023M730599), and the Zhejiang Provincial Natural Science Foundation of China (No. LMS25F020009).
Funding (underwater terrain image stitching): this research was funded by the College Student Innovation and Entrepreneurship Training Program, Grant Numbers 2021055Z and S202110082031, and the Special Project for Cultivating Scientific and Technological Innovation Ability of College and Middle School Students in Hebei Province, Grant Number 2021H011404.
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grant 2020JKF101 and the Research Funds of Sugon under Grant 2022KY001.
Abstract: The rapid development of deepfake technology has led to the spread of forged audio and video across network platforms, presenting risks for numerous countries, societies, and individuals, and posing a serious threat to cyberspace security. To address the insufficient extraction of spatial features and the fact that temporal features are not considered in deepfake video detection, we propose a detection method based on improved CapsNet and temporal–spatial features (iCapsNet–TSF). First, the dynamic routing algorithm of CapsNet is improved using weight initialization and updating. Then, the optical flow algorithm is used to extract interframe temporal features of the videos to form a dataset of temporal–spatial features. Finally, the iCapsNet model is employed to fully learn the temporal–spatial features of facial videos, and the results are fused. Experimental results show that the detection accuracy of iCapsNet–TSF reaches 94.07%, 98.83%, and 98.50% on the Celeb-DF, FaceSwap, and Deepfakes datasets, respectively, outperforming most existing mainstream algorithms. The iCapsNet–TSF method combines the capsule network and the optical flow algorithm, providing a novel strategy for deepfake detection, which is of great significance to the prevention of deepfake attacks and the preservation of cyberspace security.
Abstract: Feature-based design has been regarded as a promising approach for CAD/CAM integration. This paper aims to establish a domain-independent representation formalism for feature-based design in three aspects: formal representation, design process model and design algorithm. The implementation scheme and formal description of feature taxonomy, feature operator, feature model validation and feature transformation are given in the paper. A feature-based design process model suited for either sequential or concurrent engineering is proposed, and its application to product structural design and process plan design is presented. Some general design algorithms for developing feature-based design systems are also addressed. The proposed scheme provides a formal, basic methodology for developing and operating feature-based design systems in a structured way.
Abstract: Multi-Object Tracking (MOT) represents a fundamental but computationally demanding task in computer vision, with particular challenges arising in occluded and densely populated environments. While contemporary tracking systems have demonstrated considerable progress, persistent limitations, notably frequent occlusion-induced identity switches and tracking inaccuracies, continue to impede reliable real-world deployment. This work introduces an advanced tracking framework that enhances association robustness through a two-stage matching paradigm combining spatial and appearance features. The proposed framework employs: (1) a Height Modulated and Scale Adaptive Spatial Intersection-over-Union (HMSIoU) metric for improved spatial correspondence estimation across variable object scales and partial occlusions; (2) a feature extraction module generating discriminative appearance descriptors for identity maintenance; and (3) a recovery association mechanism for refining matches between unassociated tracks and detections. Comprehensive evaluation on the standard MOT17 and MOT20 benchmarks demonstrates significant improvements in tracking consistency, with state-of-the-art performance across key metrics including HOTA (64), MOTA (80.7), IDF1 (79.8), and IDs (1379). These results substantiate the efficacy of our Cue-Tracker framework in complex real-world scenarios characterized by occlusions and crowd interactions.
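Illustrative sketch (not from the paper): the two-stage idea in the abstract above, spatial matching first and appearance matching for whatever remains, might look roughly like the Python below. It assumes plain IoU rather than HMSIoU (whose definition is not given in the abstract), cosine similarity for appearance, and greedy matching; the function names, thresholds, and data layout are all hypothetical.

```python
import math


def iou(a, b):
    """Plain IoU of two boxes (x1, y1, x2, y2); a stand-in for HMSIoU."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity between two appearance vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def greedy_match(scores, threshold):
    """Pick the highest-scoring (track, detection) pairs above a threshold."""
    matches, used = {}, set()
    for (t, d), s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if s < threshold:
            break
        if t in matches or d in used:
            continue
        matches[t] = d
        used.add(d)
    return matches


def two_stage_associate(tracks, dets, iou_thr=0.5, app_thr=0.7):
    """tracks / dets map an id to (box, appearance_vector)."""
    # Stage 1: spatial association.
    spatial = {(t, d): iou(tb, db)
               for t, (tb, _) in tracks.items()
               for d, (db, _) in dets.items()}
    matches = greedy_match(spatial, iou_thr)
    # Stage 2: appearance association for unmatched tracks and detections.
    left_t = [t for t in tracks if t not in matches]
    left_d = [d for d in dets if d not in matches.values()]
    appearance = {(t, d): cosine(tracks[t][1], dets[d][1])
                  for t in left_t for d in left_d}
    matches.update(greedy_match(appearance, app_thr))
    return matches
```

A production tracker would more likely solve each stage as an assignment problem; greedy matching is used here only to keep the sketch short.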
Funding: Supported by the National Key R&D Program of China (No. 2022ZD0118402).
Abstract: Remote sensing cross-modal image-text retrieval (RSCIR) can flexibly and subjectively retrieve remote sensing images using query text, and has recently received increasing attention from researchers. However, with the increasing volume of visual-language pre-training model parameters, direct transfer learning consumes a substantial amount of computational and storage resources. Moreover, recently proposed parameter-efficient transfer learning methods mainly focus on the reconstruction of channel features, ignoring the spatial features which are vital for modeling key entity relationships. To address these issues, we design an efficient transfer learning framework for RSCIR, which is based on spatial feature efficient reconstruction (SPER). A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships. The spatial adapter is able to spatially reconstruct the features in the backbone with few parameters while incorporating the prior information from the channel dimension. We conduct quantitative and qualitative experiments on two commonly used RSCIR datasets. Compared with traditional methods, our approach achieves an improvement of 3%-11% in the sumR metric. Compared with methods that fine-tune all parameters, our proposed method trains less than 1% of the parameters while maintaining an overall performance of about 96%.
Abstract: This paper proposes an approach to developing a feature-based parametric product modeling system suitable for integrated engineering design in a CIMS environment. The architecture of ZD-MCADII and the characteristics of each of its modules are introduced in detail. ZD-MCADII's product data is managed by an object-oriented database management system, OSCAR, and the product model is built according to the STEP standard. The product design is established on a unified product model, and all the product data are globally associated in ZD-MCADII. ZD-MCADII provides various design features to facilitate product design, and supports the integration of CAD, CAPP and CAM.
Funding: Supported by the National Natural Science Foundation of China (No. 62103298).
Abstract: To address the low detection accuracy and large model size of existing object detection algorithms applied to complex road scenes, an improved you only look once version 8 (YOLOv8) object detection algorithm for infrared images, F-YOLOv8, is proposed. First, a spatial-to-depth network replaces the strided convolution or pooling layer of the traditional backbone network. It is combined with a channel attention mechanism so that the neural network focuses on the channels with large weight values to better extract low-resolution image feature information. Then, an improved feature pyramid network, the lightweight bidirectional feature pyramid network (L-BiFPN), is proposed, which can efficiently fuse features of different scales. In addition, a loss function based on intersection over union with the minimum point distance (MPDIoU) is introduced for bounding box regression, which obtains faster convergence and more accurate regression results. Experimental results on the FLIR dataset show that the improved algorithm can accurately detect infrared road targets in real time, with 3% and 2.2% improvements in mean average precision at 50% IoU (mAP50) and mean average precision at 50%-95% IoU (mAP50-95), respectively, and 38.1%, 37.3% and 16.9% reductions in the number of model parameters, the model weight, and floating-point operations (FLOPs), respectively. To further demonstrate the detection capability of the improved algorithm, it is tested on the public dataset PASCAL VOC, and the results show that F-YOLOv8 has excellent generalized detection performance.
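Illustrative sketch (not from the paper): the MPDIoU loss mentioned in the abstract above is commonly stated as 1 - (IoU - d1²/(w² + h²) - d2²/(w² + h²)), where d1 and d2 are the distances between the top-left and bottom-right corners of the predicted and ground-truth boxes and w, h are the input image width and height. The Python below follows that commonly cited form; the function name and the (x1, y1, x2, y2) box layout are assumptions, and the paper's actual implementation may differ.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """MPDIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - (IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2)).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Standard IoU term.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou - d1_sq / norm - d2_sq / norm)
```

Unlike plain IoU, the corner-distance terms still vary when the boxes do not overlap, so the loss continues to distinguish near misses from far misses.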
Funding: Support from the Ministry of Education (MOE) Singapore Tier 1 (RG8/20).
Abstract: A large database is desired for machine learning (ML) technology to make accurate predictions of materials' physicochemical properties based on their molecular structure. When a large database is not available, the development of a proper featurization method based on the physicochemical nature of the target properties can improve the predictive power of ML models with a smaller database. In this work, we show that two new featurization methods, volume occupation spatial matrix and heat contribution spatial matrix, can improve the accuracy in predicting energetic materials' crystal density (ρ_crystal) and solid phase enthalpy of formation (H_f,solid) using a database containing 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. Leave-one-out cross-validation indicates that the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_crystal and H_f,solid of CHON-based molecules in the 150-million-entry PubChem database, and 56 candidates with competitive detonation performance and reasonable chemical structures are identified by screening. With further improvement in the future, spatial matrices have the potential to become multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Funding: This work was supported by the National Natural Science Foundation of China [grant number 41101410]; the Comprehensive Transportation Applications of High-resolution Remote Sensing program [grant number 07-Y30B10-9001-14/16]; the Key Laboratory of Surveying Mapping and Geoinformation in Geographical Condition Monitoring [grant number 2014NGCM]; and the Science and Technology Plan of Sichuan Bureau of Surveying, Mapping and Geoinformation, China [grant number J2014ZC02].
Abstract: Inclusion of textures in image classification has been shown to be beneficial. This paper studies an efficient use of semivariogram features for object-based high-resolution image classification. First, an input image is divided into segments, for each of which a semivariogram is then calculated. Second, candidate features are extracted at a number of key locations of the semivariogram functions. Then we use an improved Relief algorithm and principal component analysis to select independent and significant features. The selected prominent semivariogram features and the conventional spectral features are then combined to constitute a feature vector for a support vector machine classifier. The effect of such selected semivariogram features is compared with those of the gray-level co-occurrence matrix (GLCM) features and window-based semivariogram texture features (STFs). Tests with aerial and satellite images show that such selected semivariogram features are a more beneficial supplement to spectral features. The method described in this paper yields a higher classification accuracy than the combination of spectral and GLCM features or STFs.
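Illustrative sketch (not from the paper): the per-segment semivariogram in the abstract above can be pictured with the standard empirical estimator γ(h) = (1 / 2N(h)) Σ (z(x) - z(x + h))², where N(h) is the number of pixel pairs at lag h. The Python below assumes a rectangular gray-level patch stored as a list of rows and pools horizontal and vertical pairs; the paper works on arbitrary segments and may choose directions and lags differently, so the function name and these choices are assumptions.

```python
def semivariogram(patch, max_lag):
    """Empirical semivariogram of a rectangular patch for lags 1..max_lag.

    Horizontal and vertical pixel pairs are pooled at each lag.
    Lags with no valid pairs yield None.
    """
    rows, cols = len(patch), len(patch[0])
    gammas = []
    for h in range(1, max_lag + 1):
        total, count = 0.0, 0
        for r in range(rows):
            for c in range(cols):
                if c + h < cols:  # horizontal pair at lag h
                    total += (patch[r][c] - patch[r][c + h]) ** 2
                    count += 1
                if r + h < rows:  # vertical pair at lag h
                    total += (patch[r][c] - patch[r + h][c]) ** 2
                    count += 1
        gammas.append(total / (2 * count) if count else None)
    return gammas
```

Values of γ(h) sampled at a few lags could then serve as candidate texture features for a segment, in the spirit of the "key locations" the abstract refers to.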
Abstract: In this paper, a feature selection method combining the reliefF and SVM-RFE algorithms is proposed. The algorithm integrates the weight vector from reliefF into the SVM-RFE method. In the first stage, reliefF filters out many noisy features. Then a new ranking criterion based on the SVM-RFE method is applied to obtain the final feature subset. An SVM classifier is used to evaluate the final image classification accuracy. Experimental results show that the proposed relief-SVM-RFE algorithm can achieve significant improvements for feature selection in image classification.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60502039), the Shanghai Rising-Star Program (Grant No. 06QA14022), and the Key Project of Shanghai Municipality for Basic Research (Grant No. 04JC14037).
Abstract: In this work, image feature vectors are formed for blocks containing sufficient information, which are selected using a singular-value criterion. When the ratio between the first two singular values (SVs) is below a given threshold, the block is considered informative. A total of 12 features, including statistics of brightness, color components and texture measures, are used to form intermediate vectors. Principal component analysis is then performed to reduce the dimension to 6, giving the final feature vectors. The relevance of the constructed feature vectors is demonstrated by experiments in which k-means clustering is used to group the vectors and hence the blocks. Blocks falling into the same group show similar visual appearances.
Abstract: The product information model for welding structures plays an important role in the integration of welding CAD/CAPP/CAM. However, existing CAD modeling systems are not capable of providing enough information for subsequent manufacturing activities such as CAPP and CAM. A new design approach using feature techniques and object-oriented programming is put forward in this paper to create the product information model of a welding structure. With this approach, the product information model is able to effectively support computer-aided welding process planning, fixturing, assembling, path planning of welding robots and other manufacturing activities. The feature classification and representation scheme of welding structures are discussed. A prototype system is developed based on features and object-oriented programming. Its structure and functions are given in detail.
Abstract: In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
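Illustrative sketch (not from the paper): one simple instance of feature-based aggregation is hard aggregation, where original states sharing a feature value form one aggregate state. The Python below evaluates a fixed policy on the aggregate chain and assigns every original state the cost of its aggregate state. It assumes uniform disaggregation probabilities within each group, a per-state expected stage cost, and plain fixed-point iteration; the function name and these choices are assumptions, and no neural-network feature construction is involved.

```python
def aggregate_policy_cost(P, g, feature, alpha, iters=500):
    """Approximate the discounted cost of a fixed policy by hard aggregation.

    P[i][j]    : transition probability i -> j under the policy
    g[i]       : expected one-stage cost at state i under the policy
    feature[i] : feature value (aggregate-state label) of state i
    alpha      : discount factor in (0, 1)
    """
    n = len(P)
    groups = {}
    for i in range(n):
        groups.setdefault(feature[i], []).append(i)
    labels = sorted(groups)

    # Aggregate cost and transitions, averaging uniformly within each group.
    g_hat = {x: sum(g[i] for i in groups[x]) / len(groups[x]) for x in labels}
    P_hat = {
        x: {
            y: sum(P[i][j] for i in groups[x] for j in groups[y]) / len(groups[x])
            for y in labels
        }
        for x in labels
    }

    # Solve r = g_hat + alpha * P_hat r by fixed-point iteration.
    r = {x: 0.0 for x in labels}
    for _ in range(iters):
        r = {
            x: g_hat[x] + alpha * sum(P_hat[x][y] * r[y] for y in labels)
            for x in labels
        }

    # Piecewise-constant approximation: each state inherits its group's cost.
    return [r[feature[i]] for i in range(n)]
```

The result is constant within each feature group, which is what makes it a nonlinear function of the features rather than a linear combination of them.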