10 articles found
Robust Background Subtraction Method via Low-Rank and Structured Sparse Decomposition (cited by: 1)
1
Authors: Minsheng Ma, Ruimin Hu, Shihong Chen, Jing Xiao, Zhongyuan Wang. China Communications (SCIE, CSCD), 2018, Issue 7, pp. 156-167 (12 pages)
Background subtraction is a challenging problem in surveillance scenes. Although low-rank and sparse decomposition (LRSD) methods offer an appropriate framework for background modeling, they fail to account for the image's local structure, which is useful for this problem. We therefore propose a background subtraction method via low-rank and SILTP-based structured sparse decomposition, named LRSSD. In this method, a novel SILTP-inducing sparsity norm is introduced to enhance the structured presentation of the foreground region. As an aid, saliency detection is employed to provide a rough shape and location of the foreground. The final refined foreground is decided jointly by the sparse component and the attention map. Experimental results on different datasets show its superiority over the competing methods, especially under noise and changing illumination.
Keywords: background subtraction; LRSD; structured sparse; SILTP
Individualization of Head Related Impulse Responses Using Division Analysis (cited by: 1)
2
Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Cheng Yang, Lian Meng. China Communications (SCIE, CSCD), 2018, Issue 5, pp. 92-103 (12 pages)
For Virtual Reality (VR) to be truly immersive, it needs convincing sound to match. Because anthropometric measurements vary between individuals, individualized customization is needed to obtain convincing sound. In this paper, we propose a simple and effective method for modeling the relationships between anthropometric measurements and the Head-Related Impulse Response (HRIR). Since the relationship between anthropometric measurements and different parts of the HRIR is complicated, we divide the HRIRs into small segments and carry out regression analysis between the anthropometric measurements and each segment to establish a relationship model. The results of objective simulation and subjective tests indicate that the model can generate individualized HRIRs from a series of anthropometric measurements. With the individualized HRIRs, a more accurate sense of acoustic localization is obtained than with non-individualized HRIRs.
Keywords: HRIR; individualization; division analysis
Interpolation Method of Head-Related Transfer Functions Based on Common-Pole/Zero Modeling
3
Authors: Wei Chen, Xiaochen Wang, Ruimin Hu, Gang Li, Weiping Tu. China Communications (SCIE, CSCD), 2020, Issue 10, pp. 170-182 (13 pages)
The head-related transfer function (HRTF) carries the cues for human auditory localization, which makes it an essential component of virtual auditory display technology. In practice, HRTF interpolation is necessary for virtual auditory display systems to achieve high spatial resolution. Traditional geometry-based interpolation methods are generally constrained by the spatial distribution of the reference HRTFs. When the spatial distribution is sparse, the accuracy of interpolation decreases significantly. Therefore, an interpolation method using the common-pole/zero model and a fitting neural network is proposed. First, we propose a common-pole/zero model to represent HRTFs across multiple subjects, in which the low-dimensional features of the measured HRTFs are extracted. Then, for a new spatial direction, we predict the corresponding low-dimensional HRTF with a fitting neural network. Finally, we reconstruct the high-dimensional HRTF from the predicted low-dimensional HRTF. The simulation results suggest that the proposed method outperforms other interpolation methods such as Linear_AMBC, Bilinear_AMBC, and the Combination method.
Keywords: HRTF; interpolation; fitting neural network; common-pole/zero model
Unequal Error Protection Based on Expanding Window Fountain for Object-Based 3D Audio
4
Authors: YANG Cheng, HU Ruimin, SONG Yucheng, SU Liuyue, WANG Xiaochen, CHEN Wei. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2017, Issue 4, pp. 323-328 (6 pages)
This paper proposes an unequal error protection (UEP) coding method, based on the expanding window fountain (EWF), to improve the transmission performance of three-dimensional (3D) audio. Unlike other transmission schemes, which apply equal error protection (EEP) when transmitting 3D audio objects, the method extracts the important audio object, gives it more protection, and gives comparatively less protection to the normal audio objects. Objective and subjective experiments show that the proposed UEP method achieves better performance than the equal error protection method: the bit error rate (BER) of the important audio object decreases from 10^(-3) to 10^(-4), and the subjective quality of UEP is better than that of EEP by 14%.
Keywords: object-based 3D audio; unequal error protection; equal error protection
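The expanding-window idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' encoder: the function name `ewf_encode_symbol`, the two-window layout, the uniform degree choice (a real fountain code would typically use a soliton-style degree distribution), and all parameter names are assumptions made for illustration. The important symbols are placed first in the source block; each encoded symbol is drawn either from the small window holding only the important symbols or from the expanded window holding everything, so the important symbols take part in more encoded symbols and are more likely to be recovered.

```python
import random


def ewf_encode_symbol(source, important_count, p_important_window, rng, max_degree=3):
    """Produce one encoded symbol as (source indices used, XOR of those symbols).

    Hypothetical two-window layout: source[:important_count] are the important
    symbols; the second window expands to cover the whole source block.
    """
    # Choose the window: the important prefix, or the full (expanded) block.
    if rng.random() < p_important_window:
        window_size = important_count
    else:
        window_size = len(source)
    # Simplified assumption: uniform degree instead of a soliton distribution.
    degree = rng.randint(1, min(max_degree, window_size))
    indices = sorted(rng.sample(range(window_size), degree))
    value = 0
    for i in indices:
        value ^= source[i]
    return indices, value


if __name__ == "__main__":
    rng = random.Random(0)
    block = [5, 9, 12, 7, 3, 8]  # first two entries stand in for the important object
    for _ in range(5):
        print(ewf_encode_symbol(block, 2, 0.5, rng))
```

Raising `p_important_window` shifts protection toward the important prefix at the expense of the remaining symbols, which is the trade-off that unequal error protection tunes.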
Pedestrian Attributes Recognition in Surveillance Scenarios with Hierarchical Multi-Task CNN Models (cited by: 2)
5
Authors: Wenhua Fang, Jun Chen, Ruimin Hu. China Communications (SCIE, CSCD), 2018, Issue 12, pp. 208-219 (12 pages)
Pedestrian attributes recognition is a very important problem in video surveillance and video forensics. Traditional methods assume that the pedestrian attributes are independent and design handcrafted features for each one. In this paper, we propose a joint hierarchical multi-task learning algorithm that learns the relationships among attributes in order to better recognize pedestrian attributes in still images using convolutional neural networks (CNN). We divide the attributes into local and global ones according to spatial and semantic relations, and then learn semantic attributes through a hierarchical multi-task CNN model in which each CNN in the first layer predicts one group of local attributes and the CNN in the second layer predicts the global attributes. Our multi-task learning framework allows each CNN model to simultaneously share visual knowledge among different groups of attribute categories. Extensive experiments are conducted on two popular and challenging benchmarks in surveillance scenarios, namely the PETA and RAP pedestrian attributes datasets. On both benchmarks, our framework achieves results superior to the state-of-the-art methods, with 88.2% on PETA and 83.25% on RAP, respectively.
Keywords: attributes recognition; CNN; multi-task learning
High Quality Audio Object Coding Framework Based on Non-Negative Matrix Factorization (cited by: 1)
6
Authors: Tingzhao Wu, Ruimin Hu, Xiaochen Wang, Shanfa Ke, Jinshan Wang. China Communications (SCIE, CSCD), 2017, Issue 9, pp. 32-41 (10 pages)
Object-based audio coding is the main technique of audio scene coding. It can effectively reconstruct each object trajectory and also provides sufficient flexibility for personalized audio scene reconstruction, so it has attracted increasing attention. However, existing object-based techniques have poor sound quality because of the low frequency-domain resolution of their parameters. In order to achieve high-quality audio object coding, we propose a new coding framework that introduces the non-negative matrix factorization (NMF) method. We extract object parameters with high resolution to improve sound quality, and apply NMF to parameter coding to reduce the high bitrate caused by the high resolution. The experimental results show that the proposed framework can improve the coding quality by 25%, so it provides a better solution for encoding audio scenes in a more flexible and higher-quality way.
Keywords: object-based audio coding; non-negative matrix factorization; audio scene coding
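As a rough illustration of how NMF can shrink a non-negative parameter matrix, the sketch below factorizes a matrix V into W and H with the standard multiplicative update rules for the Frobenius objective. It is a generic textbook NMF in plain Python, not the coding framework of the paper; the function names, the rank, and the toy matrix are assumptions chosen for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative V (rows x cols) into W (rows x rank) and H (rank x cols)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(rows)]
    return w, h


if __name__ == "__main__":
    params = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]  # toy rank-1 matrix
    w, h = nmf(params, rank=1)
    print(frobenius_error(params, w, h))
```

Storing W and H takes rank × (rows + cols) values instead of rows × cols, which is the kind of saving that can offset the bitrate cost of high-resolution parameters when a small rank suffices.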
Urban Scene Semantic Segmentation with Insufficient Labeled Data
7
Authors: Qi Zheng, Jun Chen, Peng Huang, Ruimin Hu. China Communications (SCIE, CSCD), 2019, Issue 11, pp. 212-221 (10 pages)
Semantic segmentation of urban scenes is an enabling factor for a wide range of applications. With the development of deep learning in recent years, semantic segmentation tasks using high-capacity models have achieved considerable success on large datasets. However, the pixel-level annotation process, especially for urban scene images with various objects, is tedious and labor intensive. Meanwhile, the scale of the unlabeled data, which is currently easy to collect, is often much larger than that of the labeled data. Thus, using the abundant unlabeled data to compensate for the loss that the segmentation model suffers from insufficient labeled data is of great interest. In this paper, we propose a semi-supervised method based on reinforcement learning that captures contextual information from the unlabeled data to improve a model trained on small-scale labeled data. Both quantitative and qualitative experiments show the effectiveness of the proposed method.
Keywords: semantic segmentation; semi-supervised learning; reinforcement learning
Remote sensing tuning: A survey
8
Authors: Dongshuo Yin, Ting-Feng Zhao, Deng-Ping Fan, Shutao Li, Bo Du, Xian Sun, Shi-Min Hu. Computational Visual Media, 2025, Issue 5, pp. 897-937 (41 pages)
Large models have accelerated the development of intelligent interpretation in remote sensing. Many remote sensing foundation models (RSFM) have emerged in recent years, sparking a new wave of deep learning in this field. Fine-tuning techniques serve as a bridge between remote sensing downstream tasks and advanced foundation models. As RSFMs become more powerful, fine-tuning techniques are expected to lead the next research frontier in numerous critical remote sensing applications. Advanced fine-tuning techniques can reduce the data and computational resource requirements during the downstream adaptation process. Current fine-tuning techniques for remote sensing are still in their early stages, leaving considerable room for optimization and application. To elucidate the current development and future trends of remote sensing fine-tuning techniques, this survey offers a comprehensive overview of recent research. Specifically, it summarizes the applications and innovations of each work and categorizes recent remote sensing fine-tuning techniques into six types: adapter-based, prompt-based, reparameterization-based, hybrid methods, partial tuning, and improved tuning.
Keywords: remote sensing; deep learning; foundation models; fine-tuning; pre-training
Shape-intensity knowledge distillation for robust medical image segmentation
9
Authors: Wenhui DONG, Bo DU, Yongchao XU. Frontiers of Computer Science, 2025, Issue 9, pp. 123-136 (14 pages)
Many medical image segmentation methods have achieved impressive results. Yet most existing methods do not take into account the shape-intensity prior information. This may lead to implausible segmentation results, in particular for images from unseen datasets. In this paper, we propose a novel approach to incorporate joint shape-intensity prior information into the segmentation network. Specifically, we first train a segmentation network (regarded as the teacher network) on class-wise averaged training images to extract valuable shape-intensity information, which is then transferred via knowledge distillation to a student segmentation network with the same architecture as the teacher. In this way, the student network, regarded as the final segmentation model, can effectively integrate the shape-intensity prior information, yielding more accurate segmentation results. Despite its simplicity, experiments on five medical image segmentation tasks of different modalities demonstrate that the proposed Shape-Intensity Knowledge Distillation (SIKD) consistently improves several baseline models (including the recent MaxStyle and SAMed) under intra-dataset evaluation, and significantly improves the cross-dataset generalization ability. The source code will be publicly available after acceptance.
Keywords: medical image segmentation; knowledge distillation; shape-intensity prior; deep neural network
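The teacher-to-student transfer described above relies on knowledge distillation. The abstract does not state which distillation loss SIKD uses, so the sketch below shows only the common temperature-softened KL-divergence loss for a single pixel's class logits; the function names, the temperature value, and the example logits are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss; assumed here, not taken from the paper.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]  # hypothetical per-pixel class logits
    student = [3.0, 2.0, 1.0]
    print(distillation_loss(student, teacher))
```

In a segmentation setting this term would typically be averaged over all pixels and added to the ordinary supervised segmentation loss of the student.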
Learning a generalizable re-identification model from unlabelled data with domain-agnostic expert
10
Authors: Fangyi Liu, Mang Ye, Bo Du. Visual Intelligence, 2024, Issue 1, pp. 337-349 (13 pages)
In response to real-world scenarios, the domain generalization (DG) problem has spurred considerable research in person re-identification (ReID). This challenge arises when the target domain, which is significantly different from the source domains, remains unknown. However, the performance of current DG ReID relies heavily on labor-intensive source domain annotations. Considering the potential of unlabeled data, we investigate unsupervised domain generalization (UDG) in ReID. Our goal is to create a model that can generalize from unlabeled source domains to semantically retrieve images in an unseen target domain. To address this, we propose a new approach that trains a domain-agnostic expert (DaE) for unsupervised domain-generalizable person ReID. This involves independently training multiple experts to account for label space inconsistencies between source domains. At the same time, the DaE captures domain-generalizable information for testing. Our experiments demonstrate the effectiveness of this method for learning generalizable features under the UDG setting and its superiority over state-of-the-art techniques. We will make our code and models available for public use.
Keywords: domain generalization (DG); unlabeled source domains; label space inconsistencies; domain-agnostic expert (DaE)