The cross-modal person re-identification task aims to match visible and infrared images of the same individual. The main challenges in this field arise from significant differences between modalities and the lack of high-quality cross-modal correspondence methods. Existing approaches often attempt to establish modality correspondence by extracting shared features across different modalities. However, these methods tend to focus on local information extraction and fail to fully leverage the global identity information in the cross-modal features, resulting in limited correspondence accuracy and suboptimal matching performance. To address this issue, we propose a quadratic graph matching method designed to overcome the challenges posed by modality differences through precise cross-modal relationship alignment. This method transforms the cross-modal correspondence problem into a graph matching task and minimizes the matching cost using a center search mechanism. Building on this approach, we further design a block reasoning module to uncover latent relationships between person identities and optimize the modality correspondence results. The block strategy not only improves the efficiency of updating gallery images but also enhances matching accuracy while reducing computational load. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on the SYSU-MM01, RegDB, and RGBNT201 datasets in both matching accuracy and robustness, thereby validating its effectiveness in cross-modal person re-identification.
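The abstract above frames cross-modal correspondence as minimizing a matching cost. As a rough illustration only, the sketch below solves a much simpler linear assignment problem by exhaustive search. It is not the quadratic graph matching or the center search mechanism the authors propose, and the function name and cost values are hypothetical.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Return (assignment, total_cost) for a square cost matrix.

    cost[i][j] is an assumed dissimilarity between visible image i and
    infrared image j. Exhaustive search is only practical for tiny inputs.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Made-up costs: low values on the diagonal mean image i matches image i.
cost = [
    [0.2, 0.9, 0.8],
    [0.7, 0.1, 0.9],
    [0.8, 0.6, 0.3],
]
assignment, total = min_cost_matching(cost)
print(assignment, round(total, 3))
```

A quadratic formulation would additionally score pairs of matches against each other (edge-to-edge consistency), which this linear sketch deliberately leaves out.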
The unsupervised vehicle re-identification task aims to identify specific vehicles in surveillance videos without using annotation information. Because vehicles resemble one another in appearance more closely than pedestrians do, pseudo-labels generated through clustering are ineffective in mitigating the impact of noise, and the separation between inter-class and intra-class feature distances remains inadequate. To address these issues, we design a dual contrastive learning method based on knowledge distillation. During each iteration, we use a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories. By conducting contrastive learning between the two student models, we extract more discernible vehicle identity cues to alleviate the problem of imbalanced data distribution. Subsequently, we propose a context-aware pseudo-label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks. To produce more trustworthy pseudo-labels and lessen noise interference during clustering, context-aware scores are obtained by calculating the similarity between global features and contextual ones, and are then added to the pseudo-label encoding process. Extensive experiments on publicly available datasets show that the proposed method performs well in overcoming label noise and optimizing data distribution.
Unsupervised vehicle re-identification (Re-ID) methods have garnered widespread attention due to their potential in real-world traffic monitoring. However, existing unsupervised domain adaptation techniques often rely on pseudo-labels generated from the source domain, which struggle to address the diversity and dynamic nature of real-world scenarios. Given the limited variety of common vehicle types, enhancing the model's generalization capability across these types is crucial. To this end, an approach called meta-type generalization (MTG) is proposed. By dividing the training data into meta-train and meta-test sets based on vehicle type information, a novel gradient interaction computation strategy is designed to enhance the model's ability to learn type-invariant features. Integrated into the ResNet50 backbone, the MTG model achieves improvements of 4.50% and 12.04% on the VeRi-776 and VRAI datasets, respectively, compared with traditional unsupervised algorithms, and surpasses current state-of-the-art methods. This achievement holds promise for application in intelligent traffic systems, enabling more efficient urban traffic solutions.
In view of the weak ability of convolutional neural networks to explicitly learn spatial invariance, and the loss of discriminative features caused by occlusion and background interference in pedestrian re-identification tasks, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the problem of misaligned pedestrian spatial features. The network is then divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch to extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses differential pooling to obtain differential features, which are fused with global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID and CUHK03. The experimental results are better than those of the compared methods, which verifies the effectiveness of the proposed method.
Vehicle re-identification involves matching images of vehicles across varying camera views. The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-class difference in the collected vehicle images, which increases the complexity of re-identification tasks. To tackle these challenges, this study proposes AG-GCN (Attention-Guided Graph Convolutional Network), a novel framework integrating several pivotal components. First, AG-GCN embeds a lightweight attention module within the ResNet-50 structure to learn feature weights automatically, thereby improving the global representation of vehicle features by highlighting salient features and suppressing extraneous ones. Moreover, AG-GCN adopts a graph-based structure to encapsulate deep local features. A graph convolutional network then aggregates these features to capture the relationships among vehicle-related characteristics. Subsequently, feature maps from both the attention and graph-based branches are fused for a more comprehensive representation of vehicle features. The framework then measures feature similarities and ranks them, thus enhancing the accuracy of vehicle re-identification. Comprehensive qualitative and quantitative analyses on two publicly available datasets verify the efficacy of AG-GCN in addressing intra-class and inter-class variability issues.
Community detection is one of the most fundamental applications in understanding the structure of complicated networks. It is also an important approach to identifying closely linked clusters of nodes that may represent underlying patterns and relationships. Network structures in social networks are highly sensitive, requiring advanced techniques to accurately identify the structure of these communities. Most conventional community detection algorithms perform inadequately on complicated networks and fail to identify clusters accurately. Because single-objective optimization cannot always produce results as accurate and comprehensive as multi-objective optimization can, we use two objective functions that promote strong connections within communities and weak connections between them. In this study, we use the intra function, which has proven effective in state-of-the-art research. We propose a new inter function whose objective is to make the external connections between communities sparser, so that the communities become more distinct. Furthermore, we propose a multi-objective community strength enhancement algorithm (MOCSE). The proposed algorithm is based on the framework of the Multi-Objective Evolutionary Algorithm with Decomposition (MOEA/D), integrated with a new heuristic mutation strategy, community strength enhancement (CSE). The results demonstrate that the model is effective in accurately identifying community structures while also being computationally efficient. The performance measures used to evaluate the MOEA/D-based algorithm in our work are normalized mutual information (NMI) and modularity (Q). It was compared against five state-of-the-art algorithms on social networks, comprising real datasets (Zachary, Dolphin, Football, Krebs, SFI, Jazz, and Netscience) as well as twenty synthetic datasets. These results demonstrate the robustness and practical value of the proposed algorithm in multi-objective community identification.
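One of the two evaluation measures named above, modularity (Q), can be computed directly from an edge list and a community assignment. The sketch below uses the standard Newman definition for an undirected, unweighted graph, Q = Σ_c [L_c / m − (d_c / 2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The function name and the toy graph are illustrative, and the abstract does not state which variant of Q the authors use.

```python
def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    internal = {}    # number of edges with both ends in the community
    degree_sum = {}  # sum of node degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, split))
```

Placing every node in a single community gives Q = 0 under this definition, so a positive value indicates more internal edges than a degree-preserving random graph would be expected to have.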
Vehicle Edge Computing (VEC) and Cloud Computing (CC) significantly enhance the processing efficiency of delay-sensitive and computation-intensive applications by offloading compute-intensive tasks from resource-constrained onboard devices to nearby Roadside Units (RSUs), thereby achieving lower delay and energy consumption. However, due to the limited storage capacity and energy budget of RSUs, it is challenging to meet the demands of the highly dynamic Internet of Vehicles (IoV) environment. Therefore, determining reasonable service caching and computation offloading strategies is crucial. To address this, this paper proposes a joint service caching scheme for cloud-edge collaborative IoV computation offloading. By modeling the dynamic optimization problem as a Markov Decision Process (MDP), the scheme jointly optimizes task delay, energy consumption, load balancing, and privacy entropy to achieve better quality of service. Additionally, a dynamic adaptive multi-objective deep reinforcement learning algorithm is proposed. Each Double Deep Q-Network (DDQN) agent obtains rewards for different objectives based on distinct reward functions and dynamically updates the objective weights by learning the value changes between objectives using Radial Basis Function Networks (RBFN), thereby efficiently approximating the Pareto-optimal decisions for multiple objectives. Extensive experiments demonstrate that the proposed algorithm can better coordinate the three-tier computing resources of cloud, edge, and vehicles. Compared to existing algorithms, the proposed method reduces task delay and energy consumption by 10.64% and 5.1%, respectively.
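For readers unfamiliar with the DDQN agents mentioned above, the following sketch shows the standard Double DQN learning target: the online network selects the next action and the target network evaluates it. This is the generic textbook rule only. The multi-objective weighting and the RBFN component of the paper are not reproduced, and the function name, parameters, and numbers are hypothetical.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard Double DQN target for one transition.

    next_q_online / next_q_target: lists of Q-values for the next state,
    one entry per action, from the online and target networks.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Made-up values: the online network prefers action 1.
y = double_dqn_target(
    reward=1.0,
    next_q_online=[0.5, 2.0, 1.0],
    next_q_target=[3.0, 1.5, 4.0],
    gamma=0.9,
)
print(y)
```

Decoupling selection from evaluation is what distinguishes Double DQN from plain DQN, which would take the maximum over the target network's own values (4.0 in this toy case) and tends to overestimate.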
Rapid urbanization in China has led to spatial antagonism between urban development on the one hand and farmland protection and ecological security maintenance on the other. Multi-objective spatial collaborative optimization is a powerful method for achieving sustainable regional development. Previous studies on multi-objective spatial optimization do not apply spatial corrections to simulation results based on the natural endowment of space resources. This study proposes an Ecological Security-Food Security-Urban Sustainable Development (ES-FS-USD) spatial optimization framework. The framework combines the non-dominated sorting genetic algorithm II (NSGA-II) and the patch-generating land use simulation (PLUS) model with an ecological protection importance evaluation, a comprehensive agricultural productivity evaluation, and an urban sustainable development potential assessment, and optimizes the territorial space of the Yangtze River Delta (YRD) region for 2035. The proposed sustainable development (SD) scenario can effectively reduce the destruction of landscape patterns of various land-use types while considering both ecological and economic benefits. The simulation results were further revised by evaluating the land-use suitability of the YRD region. According to the revised spatial pattern for the YRD in 2035, farmland accounts for 43.59% of the total YRD area, which is 5.35% less than in 2010. Forest, grassland, and water areas account for 40.46% of the total, an increase of 1.42% compared with 2010. Construction land accounts for 14.72% of the total, an increase of 2.77% compared with 2010. The ES-FS-USD spatial optimization framework ensures that spatial optimization outcomes are aligned with the natural endowments of land resources, thereby promoting the sustainable use of land resources, improving spatial management capability, and providing valuable insights for decision makers.
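NSGA-II, named in the abstract above, ranks candidate solutions by Pareto dominance. The sketch below shows only that core idea, extracting the first non-dominated front for objectives that are all minimized. Crowding distance, selection, and the PLUS coupling are omitted, and the objective values are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (the first front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (ecological cost, economic cost) pairs for five land-use plans.
plans = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
print(pareto_front(plans))
```

In this toy set, (3, 4) is dominated by (2, 3) and (5, 5) by (1, 5), so three trade-off plans remain. NSGA-II repeats this extraction on the remaining points to build successive fronts.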
Vehicle re-identification (ReID) aims to retrieve a target vehicle from an extensive image gallery through its appearance from various views in cross-camera scenarios. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt joint learning of global and local features. However, they use the extracted global features directly, resulting in insufficient feature expression. Moreover, local features are primarily obtained through extra annotation and complex attention mechanisms, which incur additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle-level and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and randomly drops the same areas across a batch during training to enhance attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that the method performs well.
Visible-infrared Cross-modality Person Re-identification (VI-ReID) is a critical technology in smart public facilities such as cities, campuses and libraries. It aims to match pedestrians in visible light and infrared images for video surveillance, which poses the challenge of exploring cross-modal shared information accurately and efficiently. Multi-granularity feature learning methods have therefore been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes. However, existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks, the fusion module. This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network (ADMPFF-Net), incorporating the Multi-Granularity Pose-Aware Feature Fusion (MPFF) module to generate discriminative representations. MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network. ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks. By incorporating the multi-granularity feature disentanglement (mGFD) and posture information segmentation (pIS) strategies, it extracts more representative features concerning body structure information. The Local Information Enhancement (LIE) module augments high-performance features in VI-ReID, and the multi-granularity joint loss supervises model training for objective feature learning. Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.
Person re-identification (Re-ID) is the task of finding images of a specific person across a network of non-overlapping cameras, and has achieved many breakthroughs recently. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, due to the imaging limitations of a single visible light source. To handle this problem, we propose a novel deep red green blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Because prevalent RGB Re-ID datasets lack thermal data, we propose to use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network to learn effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of these person Re-ID datasets.
Person re-identification (Re-ID) is a fundamental subject in the field of computer vision. Traditional person Re-ID methods have difficulty handling illumination changes, occlusion and pose changes under complex backgrounds. Meanwhile, the introduction of deep learning has opened a new avenue for person Re-ID research and has become a hot spot in this field. This study reviews the traditional methods of person Re-ID, then focuses on papers about different deep-learning-based person Re-ID frameworks and discusses their advantages and disadvantages. Finally, the authors propose directions for further research, especially the prospects of person Re-ID methods based on deep learning.
Person re-identification (re-ID) aims to match images of the same pedestrian across different cameras. It plays an important role in the field of security and surveillance. Although it has been studied for many years, it is still considered an unsolved problem. Since the rise of deep learning, the accuracy of supervised person re-ID on public datasets has reached a very high level. However, these methods are difficult to apply to real-life scenarios because they require a large amount of labeled training data. Pedestrian identity labeling, especially cross-camera pedestrian identity labeling, is laborious and expensive. Why can we not apply a pre-trained model directly to an unseen camera network? Because of the domain bias between the source and target environments, the accuracy on the target dataset is always low. For example, a model trained in a shopping mall needs to adapt to the new environment of an airport. Recently, several lines of research have been proposed to solve this problem, including clustering-based methods, GAN-based methods, co-training methods and unsupervised domain adaptation methods.
Person re-identification (re-id) involves matching a person across non-overlapping views with different poses, illuminations and conditions. Visual attributes are understandable semantic information that helps address issues including illumination changes, viewpoint variations and occlusions. This paper proposes an end-to-end deep learning framework for attribute-based person re-id. In the feature representation stage of the framework, an improved convolutional neural network (CNN) model is designed to leverage the information contained in automatically detected attributes and learned low-dimensional CNN features. Moreover, an attribute classifier is trained on separate data, and its responses are included in the training process of the person re-id model. The coupled clusters loss function is used in the training stage of the framework, which enhances the discriminability of both types of features. The combined features are mapped into Euclidean space, where the L2 distance between any two pedestrians can be calculated to determine whether they are the same person. Extensive experiments validate the advantages of the proposed framework over state-of-the-art competitors on contemporary challenging person re-id datasets.
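The final matching step described above, comparing embedded features by L2 distance, can be sketched in a few lines. The feature vectors, identifiers, and the decision threshold below are hypothetical. In practice the features would come from the trained network and the threshold would be tuned on validation data.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Return gallery ids sorted from nearest to farthest from the query."""
    return sorted(gallery, key=lambda gid: l2_distance(query, gallery[gid]))


def is_same_person(a, b, threshold=0.5):
    """Assumed decision rule: same identity if the distance is below a threshold."""
    return l2_distance(a, b) < threshold


# Toy 2-D embeddings standing in for learned features.
query = [1.0, 0.0]
gallery = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
print(rank_gallery(query, gallery))
```

Ranking by distance gives the retrieval list used for evaluation, while the threshold rule corresponds to the pairwise same-or-different decision mentioned in the abstract.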
Person re-identification (ReID) aims to recognize the same person in multiple images from different camera views. Training person ReID models is time-consuming and resource-intensive; thus, cloud computing is an appropriate model training solution. However, the massive personal data required for training contain private information, which carries a significant risk of data leakage in cloud environments and also leads to significant communication overheads. This paper proposes a federated person ReID method with model-contrastive learning (MOON) in an edge-cloud environment, named FRM. Specifically, based on federated partial averaging, a MOON warmup is added to correct the local training of individual edge servers and improve the model's effectiveness by calculating and back-propagating a model-contrastive loss, which represents the similarity between local and global models. In addition, we propose a lightweight person ReID network, named multi-branch combined depth space network (MB-CDNet), to reduce the computing resource usage of the edge device when training and testing the person ReID model. MB-CDNet is a multi-branch version of the combined depth space network (CDNet). We add a part branch and a global branch on top of CDNet and introduce an attention pyramid to improve the performance of the model. Experimental results on open-access person ReID datasets demonstrate that FRM achieves better performance than existing baselines.
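The abstract does not spell out what "federated partial averaging" covers, so the following is a sketch under an explicit assumption: only a shared subset of parameters (here called `backbone`) is averaged across edge servers, weighted by local dataset size, while other parameters (here `classifier`) stay local. All names and values are hypothetical, and the model-contrastive loss is not shown.

```python
def federated_partial_average(client_models, client_sizes, shared_keys):
    """Size-weighted average of the shared parameters only.

    client_models: list of dicts mapping parameter name -> list of floats.
    client_sizes: number of local training samples per client.
    shared_keys: names of the parameters that are aggregated (assumption).
    """
    total = sum(client_sizes)
    averaged = {}
    for key in shared_keys:
        length = len(client_models[0][key])
        averaged[key] = [
            sum(model[key][i] * size for model, size in zip(client_models, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two toy edge servers; the second holds three times as much data.
clients = [
    {"backbone": [1.0, 2.0], "classifier": [9.0]},
    {"backbone": [3.0, 4.0], "classifier": [7.0]},
]
print(federated_partial_average(clients, [1, 3], ["backbone"]))
```

Keeping identity classifiers local is a common choice in federated ReID because each site observes different identities, but whether FRM does exactly this is not stated in the abstract.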
Person re-identification (re-id) on a robot platform is an important application for human-robot interaction (HRI), which aims to make the robot recognize the people around it in varying scenes. Although many effective methods have been proposed for surveillance re-id in recent years, re-id on a robot platform is still a novel unsolved problem. Most existing methods use supervised metric learning offline to improve accuracy. However, these methods cannot adapt to unknown scenes. To solve this problem, an online re-id framework is proposed. Considering that robots can afford to use high-resolution RGB-D sensors and that clear human faces may be captured, face information is used to update the metric model. First, the metric model is pre-trained offline using labeled data. Then, during the online stage, face information is used to mine incorrect body matching pairs, which are collected to update the metric model online. In addition, to make full use of both the appearance and skeleton information provided by RGB-D sensors, a novel feature funnel model (FFM) is proposed. Comparison studies show that the approach is more effective and adaptable to varying environments.
Person re-ID is becoming increasingly popular in the field of modern surveillance. The purpose of person re-ID is to retrieve persons of interest in non-overlapping multi-camera surveillance systems. Due to the complexity of surveillance scenes, the person images captured by cameras often suffer from problems such as size variation, rotation, occlusion and illumination differences, which bring great challenges to the study of person re-ID. In recent years, studies based on deep learning have achieved great success in person re-ID. Improvements to backbone networks and a large number of studies on the influencing factors have greatly improved the accuracy of person re-ID. Recently, some studies have used GANs to tackle the domain adaptation task by transferring person images from the source domain to the style of the target domain, and have achieved state-of-the-art results in person re-ID.
The attention mechanism can extract salient features in images and has been proved effective in improving the performance of person re-identification (Re-ID). However, most existing attention modules have two shortcomings. On the one hand, they mostly use global average pooling to generate context descriptors without highlighting the guiding role of salient information in descriptor generation, which limits the representation ability of the final attention mask. On the other hand, most attention modules have complicated designs, which greatly increases the computational cost of the model. To solve these problems, this paper proposes an attention module called the self-supervised recalibration (SR) block, which introduces both global and local information through adaptive weighted fusion to generate a more refined attention mask. In particular, a special "Squeeze-Excitation" (SE) unit is designed in the SR block to further process the generated intermediate masks, both to introduce nonlinearity into the features and to constrain the resulting computation by controlling the number of channels. Furthermore, we instantiate the SR block on the widely used ResNet-50 and verify its effectiveness on multiple Re-ID datasets. In particular, its mean Average Precision (mAP) on the Occluded-Duke dataset exceeds that of the state-of-the-art (SOTA) algorithm by 4.49%.
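The mAP figure quoted above is a ranking metric. As a reminder of how it is usually computed in Re-ID evaluation, the sketch below derives average precision from a ranked list of match flags and averages it over queries. Benchmark-specific filtering (for example, excluding same-camera gallery images) is omitted, and the function names and inputs are illustrative.

```python
def average_precision(ranked_matches):
    """Average precision for one query.

    ranked_matches: list of bools in rank order, True where the gallery
    image at that rank has the same identity as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_matches):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in all_ranked_matches) / len(all_ranked_matches)


# Two toy queries with made-up ranking outcomes.
queries = [[True, False, True], [False, True]]
print(mean_average_precision(queries))
```

Unlike Rank-1 accuracy, this measure rewards retrieving all true matches early in the list, which is why it is sensitive to occlusion-heavy benchmarks.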
As an emerging visual task, vehicle re-identification refers to the identification of the same vehicle across multiple cameras. Herein, we propose a novel vehicle re-identification method that uses an improved ResNet-50 architecture and utilizes the topology information of a surveillance network to rerank the final results. In the training stage, we apply several data augmentation approaches to expand our training data and increase their diversity in a cost-effective manner. We modify the original ResNet-50 architecture by adding non-local blocks to implement the attention mechanism and replacing part of the batch normalization operations with instance batch normalization. After obtaining preliminary results from the proposed model, we apply a reranking algorithm to optimize the final results; its core function is to increase the similarity scores of all images on the path along which the vehicle is most likely to appear. Compared with most existing state-of-the-art methods, our method is lighter, requires less data annotation, and offers competitive performance.
Person re-identification is a prevalent technology deployed in intelligent surveillance. There have been remarkable achievements in person re-identification methods based on the assumption that all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of captured pedestrian images inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. First, we use the multi-resolution analysis principle of the wavelet transform to process the low-frequency and high-frequency regions of LR images separately, in order to restore the lost detail information of LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR and HR images to obtain pedestrian features that are invariant across resolutions. Extensive qualitative and quantitative experiments on four benchmark datasets verify the superiority of the proposed approach.
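The low-frequency and high-frequency separation that the wavelet transform provides can be illustrated with one level of a 1-D Haar decomposition. The abstract does not state which wavelet family DSCWT uses, so Haar is an assumption chosen for simplicity. Real images would need the 2-D transform applied along rows and columns.

```python
def haar_step(signal):
    """One level of an averaging Haar transform on an even-length sequence.

    Returns (low, high): pairwise averages (coarse content) and pairwise
    half-differences (detail content).
    """
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Reconstruct the original sequence from the two sub-bands."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


row = [4, 6, 10, 12]  # a made-up row of pixel intensities
low, high = haar_step(row)
print(low, high, haar_inverse(low, high))
```

The low band is a half-resolution version of the input and the high band holds the detail lost by downsampling, which is the kind of information a cross-resolution method would aim to restore.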
Abstract: The cross-modal person re-identification task aims to match visible and infrared images of the same individual. The main challenges in this field arise from significant modality differences between individuals and the lack of high-quality cross-modal correspondence methods. Existing approaches often attempt to establish modality correspondence by extracting shared features across different modalities. However, these methods tend to focus on local information extraction and fail to fully leverage the global identity information in the cross-modal features, resulting in limited correspondence accuracy and suboptimal matching performance. To address this issue, we propose a quadratic graph matching method designed to overcome the challenges posed by modality differences through precise cross-modal relationship alignment. This method transforms the cross-modal correspondence problem into a graph matching task and minimizes the matching cost using a center search mechanism. Building on this approach, we further design a block reasoning module to uncover latent relationships between person identities and optimize the modality correspondence results. The block strategy not only improves the efficiency of updating gallery images but also enhances matching accuracy while reducing computational load. Experimental results demonstrate that our proposed method outperforms the state-of-the-art methods on the SYSU-MM01, RegDB, and RGBNT201 datasets, achieving excellent matching accuracy and robustness, thereby validating its effectiveness in cross-modal person re-identification.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62461037, 62076117 and 62166026; the Jiangxi Provincial Natural Science Foundation under Grant Nos. 20224BAB212011, 20232BAB202051, 20232BAB212008 and 20242BAB25078; and the Jiangxi Provincial Key Laboratory of Virtual Reality under Grant No. 2024SSY03151.
Abstract: The unsupervised vehicle re-identification task aims at identifying specific vehicles in surveillance videos without utilizing annotation information. Because vehicles resemble one another in appearance more closely than pedestrians do, pseudo-labels generated through clustering are ineffective in mitigating the impact of noise, and the gap between inter-class and intra-class feature distances has not been adequately improved. To address these issues, we design a dual contrastive learning method based on knowledge distillation. During each iteration, we utilize a teacher model to randomly partition the entire dataset into two sub-domains based on clustering pseudo-label categories. By conducting contrastive learning between the two student models, we extract more discernible vehicle identity cues to alleviate the problem of imbalanced data distribution. Subsequently, we propose a context-aware pseudo-label refinement strategy that leverages contextual features by progressively associating granularity information from different bottleneck blocks. To produce more trustworthy pseudo-labels and lessen noise interference during the clustering process, the context-aware scores are obtained by calculating the similarity between global features and contextual ones, and are subsequently added to the pseudo-label encoding process. Extensive experiments on publicly available datasets show that the proposed method achieves excellent performance in overcoming label noise and optimizing data distribution.
Funding: Supported by the National Natural Science Foundation of China (No. 61976098) and the Natural Science Foundation for Outstanding Young Scholars of Fujian Province (No. 2022J06023).
Abstract: Unsupervised vehicle re-identification (Re-ID) methods have garnered widespread attention due to their potential in real-world traffic monitoring. However, existing unsupervised domain adaptation techniques often rely on pseudo-labels generated from the source domain, which struggle to effectively address the diversity and dynamic nature of real-world scenarios. Given the limited variety of common vehicle types, enhancing the model’s generalization capability across these types is crucial. To this end, an innovative approach called meta-type generalization (MTG) is proposed. By dividing the training data into meta-train and meta-test sets based on vehicle type information, a novel gradient interaction computation strategy is designed to enhance the model’s ability to learn type-invariant features. Integrated into the ResNet50 backbone, the MTG model achieves improvements of 4.50% and 12.04% on the Veri-776 and VRAI datasets, respectively, compared with traditional unsupervised algorithms, and surpasses current state-of-the-art methods. This achievement holds promise for application in intelligent traffic systems, enabling more efficient urban traffic solutions.
Funding: The Foshan Science and Technology Innovation Team Project (No. FS0AA-KJ919-4402-0060); the National Natural Science Foundation of China (No. 62263018).
Abstract: Convolutional neural networks have a weak ability to explicitly learn spatial invariance, and occlusion and background interference cause a probabilistic loss of discriminative features in pedestrian re-identification tasks. To address both problems, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the problem of misalignment of pedestrian spatial features. Then the network is divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch to extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses an overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses a differential pooling to obtain differential features, which are fused with global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID and CUHK03. The experimental results are better than those of the comparative methods, which verifies the effectiveness of the proposed method.
Funding: Funded by the National Natural Science Foundation of China (grant number: 62172292).
Abstract: Vehicle re-identification involves matching images of vehicles across varying camera views. The diversity of camera locations along different roadways leads to significant intra-class variation and only minimal inter-class similarity in the collected vehicle images, which increases the complexity of re-identification tasks. To tackle these challenges, this study proposes AG-GCN (Attention-Guided Graph Convolutional Network), a novel framework integrating several pivotal components. First, AG-GCN embeds a lightweight attention module within the ResNet-50 structure to learn feature weights automatically, thereby improving the global representation of vehicle features by highlighting salient features and suppressing extraneous ones. Second, AG-GCN adopts a graph-based structure to encapsulate deep local features. A graph convolutional network then aggregates these features to model the relationships among vehicle-related characteristics. Next, the feature maps from the attention and graph-based branches are merged for a more comprehensive representation of vehicle features. The framework then measures feature similarities and ranks them, thus enhancing the accuracy of vehicle re-identification. Comprehensive qualitative and quantitative analyses on two publicly available datasets verify the efficacy of AG-GCN in addressing intra-class and inter-class variability issues.
Abstract: Community detection is one of the most fundamental applications in understanding the structure of complicated networks. Furthermore, it is an important approach to identifying closely linked clusters of nodes that may represent underlying patterns and relationships. Networking structures are highly sensitive in social networks, requiring advanced techniques to accurately identify the structure of these communities. Most conventional algorithms for detecting communities perform inadequately with complicated networks and fail to identify clusters accurately. Single-objective optimization cannot always generate results as accurate and comprehensive as multi-objective optimization can. Therefore, we utilized two objective functions that encourage strong connections within communities and weak connections between them. In this study, we utilized the intra function, which has proven effective in state-of-the-art research studies. We proposed a new inter function whose objective is to make the external connections between communities more distinct and sparse, and it has demonstrated its effectiveness. Furthermore, we proposed a Multi-Objective Community Strength Enhancement algorithm (MOCSE). The proposed algorithm is based on the framework of the Multi-Objective Evolutionary Algorithm with Decomposition (MOEA/D), integrated with a new heuristic mutation strategy, community strength enhancement (CSE). The results demonstrate that the model is effective in accurately identifying community structures while also being computationally efficient. The performance measures used to evaluate the MOEA/D algorithm in our work are normalized mutual information (NMI) and modularity (Q). It was tested against five state-of-the-art algorithms on social networks, comprising real datasets (Zachary, Dolphin, Football, Krebs, SFI, Jazz, and Netscience), as well as twenty synthetic datasets. These results demonstrate the robustness and practical value of the proposed algorithm in multi-objective community identification.
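The modularity measure (Q) named in the abstract above can be illustrated with a short sketch. It assumes the standard Newman definition for an undirected, unweighted graph without self-loops; the function name `modularity` and the toy graph are illustrative only and are not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to its community label
    """
    m = len(edges)
    degree = defaultdict(int)
    intra_edges = defaultdict(int)  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra_edges[community[u]] += 1

    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for node, d in degree.items():
        degree_sum[community[node]] += d

    # Q = sum over communities of (fraction of intra edges) - (expected fraction)
    return sum(
        intra_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than placing every node in one community, which is the behaviour a community detection objective rewards.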
Funding: Supported by the Key Science and Technology Program of Henan Province, China (Grant Nos. 242102210147, 242102210027) and the Fujian Province Young and Middle-aged Teacher Education Research Project (Science and Technology Category) (No. JZ240101) (Corresponding author: Dong Yuan).
Abstract: Vehicle Edge Computing (VEC) and Cloud Computing (CC) significantly enhance the processing efficiency of delay-sensitive and computation-intensive applications by offloading compute-intensive tasks from resource-constrained onboard devices to nearby Roadside Units (RSUs), thereby achieving lower delay and energy consumption. However, due to the limited storage capacity and energy budget of RSUs, it is challenging to meet the demands of the highly dynamic Internet of Vehicles (IoV) environment. Therefore, determining reasonable service caching and computation offloading strategies is crucial. To address this, this paper proposes a joint service caching scheme for cloud-edge collaborative IoV computation offloading. By modeling the dynamic optimization problem as a Markov Decision Process (MDP), the scheme jointly optimizes task delay, energy consumption, load balancing, and privacy entropy to achieve better quality of service. Additionally, a dynamic adaptive multi-objective deep reinforcement learning algorithm is proposed. Each Double Deep Q-Network (DDQN) agent obtains rewards for different objectives based on distinct reward functions and dynamically updates the objective weights by learning the value changes between objectives using Radial Basis Function Networks (RBFN), thereby efficiently approximating the Pareto-optimal decisions for multiple objectives. Extensive experiments demonstrate that the proposed algorithm can better coordinate the three-tier computing resources of cloud, edge, and vehicles. Compared to existing algorithms, the proposed method reduces task delay and energy consumption by 10.64% and 5.1%, respectively.
Funding: National Natural Science Foundation of China, No. 42301470, No. 52270185, No. 42171389; Capacity Building Program of Local Colleges and Universities in Shanghai, No. 21010503300.
Abstract: Rapid urbanization in China has led to spatial antagonism between urban development and farmland protection and ecological security maintenance. Multi-objective spatial collaborative optimization is a powerful method for achieving sustainable regional development. Previous studies on multi-objective spatial optimization do not involve spatial corrections to simulation results based on the natural endowment of space resources. This study proposes an Ecological Security-Food Security-Urban Sustainable Development (ES-FS-USD) spatial optimization framework. This framework combines the non-dominated sorting genetic algorithm II (NSGA-II) and patch-generating land use simulation (PLUS) model with an ecological protection importance evaluation, comprehensive agricultural productivity evaluation, and urban sustainable development potential assessment, and optimizes the territorial space in the Yangtze River Delta (YRD) region in 2035. The proposed sustainable development (SD) scenario can effectively reduce the destruction of landscape patterns of various land-use types while considering both ecological and economic benefits. The simulation results were further revised by evaluating the land-use suitability of the YRD region. According to the revised spatial pattern for the YRD in 2035, the farmland area accounts for 43.59% of the total YRD, which is 5.35% less than that in 2010. Forest, grassland, and water area account for 40.46% of the total YRD, an increase of 1.42% compared with 2010. Construction land accounts for 14.72% of the total YRD, an increase of 2.77% compared with 2010. The ES-FS-USD spatial optimization framework ensures that spatial optimization outcomes are aligned with the natural endowments of land resources, thereby promoting the sustainable use of land resources, improving the ability of spatial management, and providing valuable insights for decision makers.
Funding: This work was supported, in part, by the National Nature Science Foundation of China under Grant Numbers 61502240, 61502096, 61304205, 61773219; in part, by the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20201136, BK20191401; in part, by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant Number SJCX21_0363; in part, by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: Vehicle re-identification (ReID) aims to retrieve the target vehicle from an extensive image gallery through its appearance from various views in the cross-camera scenario. It has gradually become a core technology of intelligent transportation systems. Most existing vehicle re-identification models adopt the joint learning of global and local features. However, they directly use the extracted global features, resulting in insufficient feature expression. Moreover, local features are primarily obtained through advanced annotation and complex attention mechanisms, which require additional costs. To solve these issues, a multi-feature learning model with enhanced local attention for vehicle re-identification (MFELA) is proposed in this paper. The model consists of global and local branches. The global branch utilizes both middle and high-level semantic features of ResNet50 to enhance the global representation capability. In addition, multi-scale pooling operations are used to obtain multi-scale information. The local branch utilizes the proposed Region Batch Dropblock (RBD), which encourages the model to learn discriminative features for different local regions and randomly drops the same corresponding areas across a batch during training to enhance the attention to local regions. Features from both branches are then combined to provide a more comprehensive and distinctive feature representation. Extensive experiments on the VeRi-776 and VehicleID datasets show that our method achieves excellent performance.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62177029, 62307025; in part by the Startup Foundation for Introducing Talent of Nanjing University of Posts and Communications under Grant NY221041; in part by the General Project of The Natural Science Foundation of Jiangsu Higher Education Institution of China 22KJB520025, 23KJD580.
Abstract: Visible-infrared Cross-modality Person Re-identification (VI-ReID) is a critical technology in smart public facilities such as cities, campuses and libraries. It aims to match pedestrians in visible light and infrared images for video surveillance, which poses a challenge in exploring cross-modal shared information accurately and efficiently. Therefore, multi-granularity feature learning methods have been applied in VI-ReID to extract potential multi-granularity semantic information related to pedestrian body structure attributes. However, existing research mainly uses traditional dual-stream fusion networks and overlooks the core of cross-modal learning networks, the fusion module. This paper introduces a novel network called the Augmented Deep Multi-Granularity Pose-Aware Feature Fusion Network (ADMPFF-Net), incorporating the Multi-Granularity Pose-Aware Feature Fusion (MPFF) module to generate discriminative representations. MPFF efficiently explores and learns global and local features with multi-level semantic information by inserting disentangling and duplicating blocks into the fusion module of the backbone network. ADMPFF-Net also provides a new perspective for designing multi-granularity learning networks. By incorporating the multi-granularity feature disentanglement (mGFD) and posture information segmentation (pIS) strategies, it extracts more representative features concerning body structure information. The Local Information Enhancement (LIE) module augments high-performance features in VI-ReID, and the multi-granularity joint loss supervises model training for objective feature learning. Experimental results on two public datasets show that ADMPFF-Net efficiently constructs pedestrian feature representations and enhances the accuracy of VI-ReID.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61976002, 61976003 and 61860206004), the Natural Science Foundation of Anhui Higher Education Institutions of China (No. KJ2019A0033), and the Open Project Program of the National Laboratory of Pattern Recognition (No. 201900046).
Abstract: Person re-identification (Re-ID) is the task of finding images of a specific person across a network of non-overlapping cameras, and it has achieved many breakthroughs recently. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, due to the imaging limitations of a single visible light source. To handle this problem, we propose a novel deep red green blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Due to the lack of thermal data in prevalent RGB Re-ID datasets, we propose to use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network to learn effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of these person Re-ID datasets.
Funding: Supported by the Natural Science Foundation of China No. 61703119, 61573114; the Natural Science Fund of Heilongjiang Province of China No. QC2017070; and the Fundamental Research Funds for the Central Universities of China No. HEUCFM180405.
Abstract: Person re-identification (Re-ID) is a fundamental subject in the field of computer vision. Traditional person Re-ID methods have difficulty handling illumination changes, occlusion and pose variation against complex backgrounds. Meanwhile, the introduction of deep learning has opened a new avenue for person Re-ID research and has become a hot spot in this field. This study reviews the traditional methods of person Re-ID, then focuses on papers covering different deep-learning-based person Re-ID frameworks and discusses their advantages and disadvantages. Finally, the authors propose directions for further research, especially the prospects of person Re-ID methods based on deep learning.
Abstract: Person re-identification (re-ID) aims to match images of the same pedestrian across different cameras. It plays an important role in the field of security and surveillance. Although it has been studied for many years, it is still considered an unsolved problem. Since the rise of deep learning, the accuracy of supervised person re-ID on public datasets has reached the highest level. However, these methods are difficult to apply to real-life scenarios because a large amount of labeled training data is required. Pedestrian identity labeling, especially cross-camera pedestrian identity labeling, is laborious and expensive. Why can we not apply the pre-trained model directly to an unseen camera network? Due to the domain bias between the source and target environments, the accuracy on the target dataset is always low. For example, a model trained in a mall needs to adapt to the new environment of an airport. Recently, several approaches have been proposed to solve this problem, including clustering-based methods, GAN-based methods, co-training methods and unsupervised domain adaptation methods.
Funding: Supported by the National Natural Science Foundation of China (61471154, 61876057) and the Fundamental Research Funds for Central Universities (JZ2018YYPY0287).
Abstract: Person re-identification (re-id) involves matching a person across non-overlapping views, with different poses, illuminations and conditions. Visual attributes are understandable semantic information that helps mitigate issues including illumination changes, viewpoint variations and occlusions. This paper proposes an end-to-end deep learning framework for attribute-based person re-id. In the feature representation stage of the framework, an improved convolutional neural network (CNN) model is designed to leverage the information contained in automatically detected attributes and learned low-dimensional CNN features. Moreover, an attribute classifier is trained on separate data and its responses are included in the training process of our person re-id model. The coupled clusters loss function is used in the training stage of the framework, which enhances the discriminability of both types of features. The combined features are mapped into the Euclidean space. The L2 distance can be used to calculate the distance between any two pedestrians to determine whether they are the same. Extensive experiments validate the superiority and advantages of our proposed framework over state-of-the-art competitors on contemporary challenging person re-id datasets.
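The final matching step described above, comparing two pedestrians by the L2 distance between their embedded features, can be sketched as follows. This is a minimal illustration only: the feature vectors, the `threshold` value and the function names are hypothetical, and the paper's actual decision rule is not given in the abstract.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query, gallery):
    """Return gallery indices sorted from the closest to the farthest match."""
    return sorted(range(len(gallery)), key=lambda i: l2_distance(query, gallery[i]))


def same_person(a, b, threshold=0.5):
    """Hypothetical decision rule: accept the pair if the distance is below threshold."""
    return l2_distance(a, b) < threshold


# Toy 2-D embeddings (hypothetical); real features would be much higher-dimensional.
query = [0.0, 0.0]
gallery = [[3.0, 4.0], [0.1, 0.2], [1.0, 1.0]]
ranking = rank_gallery(query, gallery)
```

In a retrieval setting the ranked list is usually what gets evaluated; a fixed threshold is only needed when a yes/no decision on a single pair is required.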
Funding: Supported by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20211284 and the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps under Grant No. 2020DB005.
Abstract: Person re-identification (ReID) aims to recognize the same person in multiple images from different camera views. Training person ReID models is time-consuming and resource-intensive; thus, cloud computing is an appropriate model training solution. However, the massive personal data required for training contain private information, which carries a significant risk of data leakage in cloud environments, and transferring the data incurs significant communication overheads. This paper proposes a federated person ReID method with model-contrastive learning (MOON) in an edge-cloud environment, named FRM. Specifically, based on federated partial averaging, MOON warmup is added to correct the local training of individual edge servers and improve the model’s effectiveness by calculating and back-propagating a model-contrastive loss, which represents the similarity between local and global models. In addition, we propose a lightweight person ReID network, named multi-branch combined depth space network (MB-CDNet), to reduce the computing resource usage of the edge device when training and testing the person ReID model. MB-CDNet is a multi-branch version of the combined depth space network (CDNet). We add a part branch and a global branch on the basis of CDNet and introduce an attention pyramid to improve the performance of the model. The experimental results on open-access person ReID datasets demonstrate that FRM achieves better performance than existing baselines.
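A model-contrastive loss of the kind mentioned above can be sketched as follows. The sketch assumes the commonly cited MOON formulation, in which the local model's representation is pulled toward the global model's representation and pushed away from the previous local model's. Whether FRM uses exactly this form is not stated in the abstract; the function names and the `temperature` default are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def model_contrastive_loss(z_local, z_global, z_prev, temperature=0.5):
    """Contrastive loss over representations of the same input.

    z_local:  representation from the local model being trained
    z_global: representation from the current global model (positive pair)
    z_prev:   representation from the previous local model (negative pair)
    """
    pos = math.exp(cosine(z_local, z_global) / temperature)
    neg = math.exp(cosine(z_local, z_prev) / temperature)
    return -math.log(pos / (pos + neg))


# Toy representations (hypothetical values).
z_global = [1.0, 0.0]
z_prev = [0.0, 1.0]
loss_near_global = model_contrastive_loss([0.9, 0.1], z_global, z_prev)
loss_near_prev = model_contrastive_loss([0.1, 0.9], z_global, z_prev)
```

The loss is smaller when the local representation agrees with the global one, so minimizing it limits how far local training on an edge server drifts from the global model.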
Funding: This work is supported by the National Natural Science Foundation of China (NSFC, no. 61340046), the National High Technology Research and Development Programme of China (863 Programme, no. 2006AA04Z247), the Scientific and Technical Innovation Commission of Shenzhen Municipality (no. JCYJ20130331144631730), and the Specialized Research Fund for the Doctoral Programme of Higher Education (SRFDP, no. 20130001110011).
Abstract: Person re-identification (re-id) on a robot platform is an important application for human-robot interaction (HRI), which aims at making the robot recognize nearby persons in varying scenes. Although many effective methods have been proposed for surveillance re-id in recent years, re-id on a robot platform is still a novel unsolved problem. Most existing methods apply supervised metric learning offline to improve the accuracy. However, these methods cannot adapt to unknown scenes. To solve this problem, an online re-id framework is proposed. Considering that robots can afford to use high-resolution RGB-D sensors and that clear human faces may be captured, face information is used to update the metric model. First, the metric model is pre-trained offline using labeled data. Then, during the online stage, we use face information to mine incorrect body matching pairs, which are collected to update the metric model online. In addition, to make full use of both the appearance and skeleton information provided by RGB-D sensors, a novel feature funnel model (FFM) is proposed. Comparison studies show that our approach is more effective and adaptable to varying environments.
Abstract: Person re-ID is becoming increasingly popular in the field of modern surveillance. The purpose of person re-ID is to retrieve persons of interest in a non-overlapping multi-camera surveillance system. Due to the complexity of the surveillance scene, the person images captured by cameras often have problems such as size variation, rotation, occlusion and illumination differences, which bring great challenges to the study of person re-ID. In recent years, studies based on deep learning have achieved great success in person re-ID. The improvement of basic networks and a large number of studies on the influencing factors have greatly improved the accuracy of person re-ID. Recently, some studies have utilized GANs to tackle the domain adaptation task by transferring person images of the source domain to the style of the target domain, and have achieved state-of-the-art results in person re-ID.
Funding: Supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (Grant No. 2022D01B186 and No. 2022D01B05).
Abstract: The attention mechanism can extract salient features in images, which has been proved to be effective in improving the performance of person re-identification (Re-ID). However, most existing attention modules have two shortcomings. On the one hand, they mostly use global average pooling to generate context descriptors without highlighting the guiding role of salient information in descriptor generation, which limits the representation ability of the final attention mask. On the other hand, the design of most attention modules is complicated, which greatly increases the computational cost of the model. To solve these problems, this paper proposes an attention module called the self-supervised recalibration (SR) block, which introduces both global and local information through adaptive weighted fusion to generate a more refined attention mask. In particular, a special "Squeeze-Excitation" (SE) unit is designed in the SR block to further process the generated intermediate masks, both to apply nonlinearities to the features and to constrain the resulting computation by controlling the number of channels. Furthermore, we combine the SR block with the most commonly used ResNet-50 to construct an instantiation model and verify its effectiveness on multiple Re-ID datasets; in particular, the mean Average Precision (mAP) on the Occluded-Duke dataset exceeds that of the state-of-the-art (SOTA) algorithm by 4.49%.
Abstract: As an emerging visual task, vehicle re-identification refers to the identification of the same vehicle across multiple cameras. Herein, we propose a novel vehicle re-identification method that uses an improved ResNet-50 architecture and utilizes the topology information of a surveillance network to rerank the final results. In the training stage, we apply several data augmentation approaches to expand our training data and increase their diversity in a cost-effective manner. We modify the original ResNet-50 architecture by adding non-local blocks to implement the attention mechanism and replacing part of the batch normalization operations with instance batch normalization. After obtaining preliminary results from the proposed model, we optimize the final results with a reranking algorithm whose core function is to raise the similarity scores of all images on the path along which the vehicle is most likely to appear. Compared with most existing state-of-the-art methods, our method is lighter, requires less data annotation, and offers competitive performance.
Funding: Supported by the National Natural Science Foundation of China (61471154, 61876057) and the Key Research and Development Program of Anhui Province-Special Project of Strengthening Science and Technology Police (202004D07020012).
Abstract: Person re-identification is a prevalent technology deployed in intelligent surveillance. There have been remarkable achievements in person re-identification methods based on the assumption that all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of the captured pedestrian images inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from the LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. First, we use the multi-resolution analysis principle of the wavelet transform to separately process the low-frequency and high-frequency regions of LR images, which restores the lost detail information of LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR images and HR images to obtain pedestrian-invariant features at various resolutions. Extensive qualitative and quantitative experiments across four benchmark datasets verify the superiority of the proposed approach.