Color inconsistency between views is an important problem to be solved in multi-view video applications, such as free viewpoint television and other three-dimensional video systems. In this paper, a coding-oriented multi-view video color correction method that is combined with multi-view video coding is proposed. We first separate foreground and background in the first group of pictures (GOP) by using the SKIP coding mode. Then, by transferring the means and standard deviations of the backgrounds, color correction is performed for each frame in the GOP, and multi-view video coding is performed and used to renew the backgrounds. Experimental results show that the proposed method can obtain better performance in color correction and multi-view video coding.
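The mean and standard deviation transfer mentioned in this abstract can be illustrated with a minimal sketch. It assumes the background mask is already available (the abstract derives it from the SKIP coding mode, which is not reproduced here), and the function names `background_stats` and `correct_view` are hypothetical, not taken from the paper.

```python
import numpy as np


def background_stats(image, background_mask):
    """Return per-channel mean and standard deviation over background pixels."""
    pixels = image.astype(np.float64)[background_mask]  # shape (N, channels)
    return pixels.mean(axis=0), pixels.std(axis=0)


def correct_view(source, source_mask, ref_mean, ref_std):
    """Shift and scale `source` so its background statistics match the reference.

    Assumption: a per-channel linear transfer; the paper's exact color space
    and update rule are not given in the abstract.
    """
    src_mean, src_std = background_stats(source, source_mask)
    # Guard against flat channels to avoid division by zero.
    safe_std = np.where(src_std < 1e-8, 1.0, src_std)
    # No clipping here; clip to the valid pixel range before display or coding.
    return (source.astype(np.float64) - src_mean) * (ref_std / safe_std) + ref_mean
```

After correction, the background of the source view has the same per-channel mean and standard deviation as the reference background, while foreground pixels are transformed with the same linear map.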
Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation with the minimum cost path between target image histogram and source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method can obtain better subjective and objective performance in color correction.
The rate and distortion of Id-slice do not fit the globally linear relationship on a logarithmic scale. Lagrange multiplier selection methods based on the globally linear approximate relationship are neither efficient nor optimal for multi-view video coding (MVC). To improve the coding efficiency of MVC, a local curve fitting based Lagrange multiplier selection method is proposed in this paper, where Lagrange multipliers are selected according to the local slopes of the approximate curves. Experimental results showed that the proposed method improves the coding efficiency. Up to 2.5 dB gain was achieved at low bitrates.
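As a rough illustration of selecting a Lagrange multiplier from a local slope, the sketch below fits a straight line to the nearest few rate-distortion samples in the log-log domain and converts that slope into λ = −dD/dR. The window size, the straight-line local fit, and the function name `local_lagrange_multiplier` are assumptions made for illustration; the abstract does not specify the paper's fitting procedure.

```python
import math


def local_lagrange_multiplier(rates, distortions, rate_point, window=3):
    """Estimate lambda = -dD/dR at `rate_point` from nearby (R, D) samples.

    Distortions must be positive (for example MSE, not PSNR in dB).
    """
    log_target = math.log(rate_point)
    # Keep only the `window` samples closest to the operating point (log scale).
    nearest = sorted(
        zip(rates, distortions),
        key=lambda rd: abs(math.log(rd[0]) - log_target),
    )[:window]
    xs = [math.log(r) for r, _ in nearest]
    ys = [math.log(d) for _, d in nearest]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Least-squares slope of ln D against ln R over the local window.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # Fitted distortion at the operating point.
    d_at_point = math.exp(y_mean + slope * (log_target - x_mean))
    # Chain rule: dD/dR = (d ln D / d ln R) * D / R.
    return -slope * d_at_point / rate_point
```

Because only the nearest samples are used, the estimate follows the curve's local behavior instead of a single global line through all operating points.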
The current multi-view video coding (MVC) reference model of the Joint Video Team (JVT) does not provide efficient rate control schemes. This paper presents a rate control algorithm for MVC by improving the quadratic rate-distortion (R-D) model. We reasonably allocate bit-rate among views based on the correlation analysis. The proposed algorithm consists of three levels to control the rate bits more accurately, of which the frame layer allocates bits according to the frame complexity and the temporal activity. Extensive experiments show that the proposed algorithm can control the bit rate efficiently.
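For context, the classic quadratic R-D model that rate control schemes of this kind start from can be sketched as follows. This shows only the textbook form R = a·MAD/Q + b·MAD/Q², not the paper's improved model; the parameter names `a`, `b`, `mad`, and `q_step` and both function names are illustrative assumptions.

```python
import math


def quadratic_rd_bits(q_step, mad, a, b):
    """Bits predicted by the classic quadratic model for quantization step `q_step`."""
    return a * mad / q_step + b * mad / (q_step ** 2)


def solve_q_step(target_bits, mad, a, b):
    """Invert the quadratic model: find the step size that yields `target_bits`."""
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    if abs(b) < 1e-12:
        # Model degenerates to the first-order form R = a * MAD / Q.
        return a * mad / target_bits
    # Substitute x = 1/Q and solve b*MAD*x^2 + a*MAD*x - R = 0 for x > 0.
    discriminant = (a * mad) ** 2 + 4.0 * b * mad * target_bits
    if discriminant < 0:
        raise ValueError("no real solution for these model parameters")
    inv_q = (-a * mad + math.sqrt(discriminant)) / (2.0 * b * mad)
    return 1.0 / inv_q
```

In a frame-layer controller, `target_bits` would come from the bit allocation step and `mad` from a complexity prediction for the frame, with `a` and `b` refreshed from previously coded frames.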
A novel color compensation method for multi-view video coding (MVC) is proposed, which efficiently exploits the inter-view dependencies between views in the presence of color mismatch caused by the diversity of cameras. A color compensation model is developed in RGB channels and then extended to YCbCr channels for practical use. A modified inter-view reference picture is constructed based on the color compensation model, which is more similar to the coding picture than the original inter-view reference picture. Moreover, the color compensation factors can be derived in both encoder and decoder, so no additional data need to be transmitted to the decoder. The experimental results show that the proposed method improves the coding efficiency of MVC and maintains good subjective quality.
Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders practical application of MVC. An efficient fast mode decision method using mode complexity is proposed to reduce the computational complexity. In the proposed method, mode complexity is first computed by using the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is highly likely to be the optimal mode, the mode complexity is always checked in advance against a predefined threshold to provide an efficient early termination opportunity. If this early termination condition is not met, the MBs are classified into three mode types according to the value of mode complexity, i.e., simple mode, medium mode and complex mode, to speed up the encoding process by reducing the number of variable block modes that must be checked. Furthermore, for simple and medium mode regions, the rate distortion (RD) cost of mode 16×16 in the temporal prediction direction is compared with that of the disparity prediction direction, to determine in advance whether the optimal prediction direction is the temporal prediction direction, so that unnecessary disparity estimation can be skipped. Experimental results show that the proposed method is able to significantly reduce the computational load by 78.79% and the total bit rate by 0.07% on average, while only incurring a negligible loss of PSNR (about 0.04 dB on average), compared with the full mode decision (FMD) in the reference software of MVC.
The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user’s viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary to use multi-view systems in IBR to make audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
With the continuous advancement of unmanned technology in various application domains, the development and deployment of blind-spot-free panoramic video systems have gained increasing importance. Such systems are particularly critical in battlefield environments, where advanced panoramic video processing and wireless communication technologies are essential to enable remote control and autonomous operation of unmanned ground vehicles (UGVs). However, conventional video surveillance systems suffer from several limitations, including limited field of view, high processing latency, low reliability, excessive resource consumption, and significant transmission delays. These shortcomings impede the widespread adoption of UGVs in battlefield settings. To overcome these challenges, this paper proposes a novel multi-channel video capture and stitching system designed for real-time video processing. The system integrates the Speeded-Up Robust Features (SURF) algorithm and the Fast Library for Approximate Nearest Neighbors (FLANN) algorithm to execute essential operations such as feature detection, descriptor computation, image matching, homography estimation, and seamless image fusion. The fused panoramic video is then encoded and assembled to produce a seamless output devoid of stitching artifacts and shadows. Furthermore, H.264 video compression is employed to reduce the data size of the video stream without sacrificing visual quality. Using the Real-Time Streaming Protocol (RTSP), the compressed stream is transmitted efficiently, supporting real-time remote monitoring and control of UGVs in dynamic battlefield environments. Experimental results indicate that the proposed system achieves high stability, flexibility, and low latency. With a wireless link latency of 30 ms, the end-to-end video transmission latency remains around 140 ms, enabling smooth video communication. The system can tolerate packet loss rates (PLR) of up to 20% while maintaining usable video quality (with latency around 200 ms). These properties make it well-suited for mobile communication scenarios demanding high real-time video performance.
Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data, and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, there has been increasing attention on generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of ego vehicles. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, frequently suffer from limited generalization and depend on substantial input data. Meanwhile, 2D generative models, though capable of producing unknown scenes, still have room for improvement in terms of coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module is presented to improve video consistency by extracting the global context of each frame, calculating relationships of frames using these global representations, and fusing frame contexts accordingly. Moreover, we propose an innovative attention mechanism that computes relations of pixels within each frame and pixels in the corresponding window range of the initial frame. Extensive experiments show that our approach surpasses various state-of-the-art models in driving video generation, and the introduced modules contribute significantly to model performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, which facilitates on-demand simulation to expedite algorithm development.
Background: This study aims to investigate the underlying mechanisms between parental marital conflict and adolescent short video dependence by constructing a chain mediation model, focusing on the mediating roles of experiential avoidance and emotional disturbance (anxiety, depression, and stress). Methods: Conducted in January 2025, the research recruited 4125 adolescents from multiple Chinese provinces through convenience sampling; after data cleaning, 3957 valid participants (1959 males, 1998 females) were included. Using a cross-sectional design, measures included parental marital conflict, experiential avoidance, anxiety, depression, stress, and short video dependence. Results: Pearson correlation analysis revealed significant positive correlations among all variables. Mediation analysis using the SPSS PROCESS macro showed that parental marital conflict directly predicted short video dependence (β=0.269, p<0.001), and also significantly predicted experiential avoidance (β=0.519, p<0.001), anxiety (β=0.072, p<0.001), depression (β=0.067, p<0.001), and stress (β=0.048, p<0.05). Experiential avoidance further predicted anxiety (β=0.521, p<0.001), depression (β=0.489, p<0.001), stress (β=0.408, p<0.001), and short video dependence (β=0.244, p<0.001). While both anxiety (β=0.050, p<0.05) and depression (β=0.116, p<0.001) positively predicted short video dependence, stress did not (β=0.019, p=0.257). Overall, experiential avoidance, anxiety, depression, and stress significantly mediated the relationship between parental marital conflict and short video dependence. Conclusion: These findings confirm that parental marital conflict not only directly influences adolescent short video dependence but also operates through a chain mediation pathway involving experiential avoidance and emotional disturbance, highlighting central psychological mechanisms and providing theoretical support for integrated mental health and behavioral interventions.
Background: In the Chinese context, the impact of short video applications on the psychological well-being of older adults is contested. While often examined through a pathological lens of addiction, this perspective may overlook paradoxical, context-dependent positive outcomes. Therefore, the main objective of this study is to challenge the traditional Compensatory Internet Use Theory by proposing and testing a chained mediation model that explores a paradoxical pathway from social support to life satisfaction via problematic social media use. Methods: Data were collected between July and August 2025 via the Credamo online survey platform, yielding 384 valid responses from Chinese older adults aged 60 and above. Key constructs were assessed using the Social Support Rating Scale (SSRS), Bergen Social Media Addiction Scale (BSMAS), Simplified UCLA Loneliness Scale, and Satisfaction with Life Scale (SWLS). A chained mediation model was tested using stepwise regression and non-parametric bootstrapping (5000 resamples), controlling for age, gender, household income, and health status. Results: The analysis revealed a paradoxical pathway, which was clarified by a key statistical suppression effect. Social support significantly and positively predicted problematic usage (β=0.157, p=0.002). After controlling for the suppressor effect of social support, problematic usage in turn negatively predicted social connectedness (β=−0.177, p<0.001). Finally, reduced social connectedness, reflecting a state of solitude, positively predicted life satisfaction (β=−0.227, p<0.001). Conclusion: The findings suggest that for older adults with sufficient offline social support, these resources may serve a “social empowerment” function. This empowerment allows behaviors measured as “problematic usage” to be theoretically reframed as a form of “deep immersive entertainment”. This immersion appears to occur alongside a state of “high-quality solitude”, which ultimately is associated with higher life satisfaction. This study provides a novel, non-pathological theoretical perspective on the consequences of high engagement with emerging social media, offering empirical grounds for non-abstinence-based intervention strategies.
Multi-view video coding (MVC) comprises rich 3D information and is widely used in new visual media, such as 3DTV and free viewpoint TV (FTV). However, even with mainstream computer manufacturers migrating to multi-core processors, the huge computational requirement of MVC currently prohibits its wide use in consumer markets. In this paper, we demonstrate the design and implementation of the first parallel MVC system on the Cell Broadband Engine™ processor, which is a state-of-the-art multi-core processor. We propose an adaptive, data-driven task-dispatching algorithm that operates at the frame level for MVC, and implement a parallel multi-view video decoder with a modified H.264/AVC codec on a real machine. This approach provides scalable speedup (up to 16 times on sixteen cores) through proper local store management, utilization of code locality and SIMD improvement. Decoding speed, speedup and utilization rate of cores are reported in the experimental results.
New video applications, such as 3D video and free viewpoint video, require efficient compression of multi-view video. In addition to temporal redundancy, exploiting the inter-view redundancy is crucial to improve the performance of multi-view video coding. In this paper, we present a novel method to construct the optimal inter-view prediction structure for multi-view video coding using simulated annealing. In the proposed model, the design of the prediction structure is converted to the arrangement of coding order. Then, a simulated annealing algorithm is employed to minimize the total cost for obtaining the best coding order. This method is applicable to arbitrary irregular camera arrangements. As the experimental results reveal, the annealing process converges to satisfactory results rapidly, and the generated optimal prediction structure outperforms the reference prediction structure of the joint multi-view video model (JMVM) with PSNR gains of 0.1-0.8 dB.
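The idea of searching over coding orders with simulated annealing can be sketched as below. The cost model is a hypothetical stand-in: the first view in the order pays an intra cost, and every later view pays the cheapest prediction cost from any view coded before it. The abstract does not describe the paper's actual cost function, so `order_cost`, `anneal_coding_order`, and the cooling parameters are assumptions for illustration only.

```python
import math
import random


def order_cost(order, intra_cost, pred_cost):
    """Total cost of coding views in `order` (hypothetical cost model)."""
    total = intra_cost[order[0]]
    for i in range(1, len(order)):
        view = order[i]
        # Predict each view from the cheapest already-coded reference view.
        total += min(pred_cost[ref][view] for ref in order[:i])
    return total


def anneal_coding_order(intra_cost, pred_cost, steps=2000,
                        t_start=10.0, cooling=0.995, seed=0):
    """Search for a low-cost coding order by swapping two positions per step."""
    rng = random.Random(seed)
    current = list(range(len(intra_cost)))
    current_cost = order_cost(current, intra_cost, pred_cost)
    best, best_cost = current[:], current_cost
    temperature = t_start
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_cost = order_cost(candidate, intra_cost, pred_cost)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse orders with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temperature = max(temperature * cooling, 1e-6)
    return best, best_cost
```

Because the search only permutes the order and reads costs from a matrix, it makes no assumption about camera geometry, which is consistent with the abstract's claim that the approach applies to irregular camera arrangements.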
Distributed video coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. The novel uplink-friendly DVC, which offers low-complexity, low-power, and low-cost video encoding, has attracted growing research interest. In this paper, a new method based on multiple view geometry is presented for spatial side information generation in uncalibrated video sensor networks. The trifocal tensor encapsulates all the geometric relations among three views that are independent of scene structure; it can be computed from image correspondences alone without requiring knowledge of the motion or calibration. Simulation results show that trifocal tensor-based spatial side information improves the rate-distortion performance over motion compensation based interpolation side information by a maximum gap of around 2 dB. Fusion then merges the different side information (temporal and spatial) in order to improve the quality of the final side information. Simulation results show that the fusion yields a rate-distortion gain of about 0.4 dB.
Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex, high-dimensional data that single-view methods cannot capture. Traditional fuzzy clustering techniques, such as Fuzzy C-Means (FCM), face significant challenges in handling uncertainty and the dependencies between different views. To overcome these limitations, we introduce a new multi-view fuzzy clustering approach that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data, aiming to enhance clustering accuracy and robustness, termed Multi-view Picture Fuzzy Clustering (MPFC). In particular, the picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels: membership degrees, neutral degrees, and refusal degrees. This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models. Meanwhile, dual-anchor graphs exploit the similarity relationships between data points and integrate information across views. This combination improves stability, scalability, and robustness when handling noisy and heterogeneous data. Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency, outperforming traditional methods. Specifically, the MPFC algorithm demonstrates outstanding clustering performance on a variety of datasets, attaining a Purity (PUR) score of 0.6440 and an Accuracy (ACC) score of 0.6213 for the 3 Sources dataset, underscoring its robustness and efficiency. The proposed approach significantly contributes to fields such as pattern recognition, multi-view relational data analysis, and large-scale clustering problems. Future work will focus on extending the method for semi-supervised multi-view clustering, aiming to enhance adaptability, scalability, and performance in real-world applications.
The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC’s effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. By integrating multi-view information into phenotypic prediction, a multi-view best linear unbiased prediction (MVBLUP) method is proposed in this paper. To measure the importance of multiple data views, the differential evolution algorithm with an early stopping mechanism is used, by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we perform empirical experiments on four multi-view datasets in different crops. Compared to the single-view method, the prediction accuracy of the MVBLUP method has improved by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
Funding (coding-oriented multi-view video color correction): the National Natural Science Foundation of China (No. 60672073, No. 60872094); the Program for New Century Excellent Talents in University (NCET-06-0537); the Key Project of Chinese Ministry of Education (No. 206059); the Scientific Research Fund of Zhejiang Provincial Education Department (No. 20070962); and the Natural Science Foundation of Ningbo (No. 2008A610016).
Funding (multi-view video color correction using dynamic programming): supported by the National Natural Science Foundation of China (60672073); the Program for New Century Excellent Talents in University (NCET-06-0537); the Natural Science Foundation of Ningbo (2008A610016); and the K.C. Wong Magna Fund in Ningbo University.
Funding (local curve fitting based Lagrange multiplier selection): Project (Nos. 60505017 and 60534070) supported by the National Natural Science Foundation of China.
Funding (rate control algorithm for MVC): supported by the National Natural Science Foundation of China (Grant Nos. 60832003, 60672052, 60902085, 60972137); the Key Project of Shanghai Municipal Education Commission (Grant No. 09ZZ90); the Natural Science Foundation of Shanghai (Grant No. 09ZR1412500); the Innovation Foundation of Shanghai University (Grant Nos. 10YZ09, SHUCX091061); and the Shuguang Plan of Shanghai Education Development Foundation (Grant No. 06SG43).
Funding: Project supported by the National Natural Science Foundation of China (No. 60772134) and the Innovation Foundation of Xidian University, China (No. Chuang 05018).
Abstract: A novel color compensation method for multi-view video coding (MVC) is proposed, which efficiently exploits the inter-view dependencies in the presence of color mismatch caused by differences between cameras. A color compensation model is developed in RGB channels and then extended to YCbCr channels for practical use. A modified inter-view reference picture is constructed based on the color compensation model, which is more similar to the picture being coded than the original inter-view reference picture. Moreover, the color compensation factors can be derived in both the encoder and the decoder, so no additional data need to be transmitted to the decoder. The experimental results show that the proposed method improves the coding efficiency of MVC and maintains good subjective quality.
Funding: Project (08Y29-7) supported by the Transportation Science and Research Program of Jiangsu Province, China; Project (201103051) supported by the Major Infrastructure Program of the Health Monitoring System Hardware Platform Based on Sensor Network Node, China; Project (61100111) supported by the National Natural Science Foundation of China; Project (BE2011169) supported by the Scientific and Technical Supporting Program of Jiangsu Province, China.
Abstract: Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders practical application of MVC. An efficient fast mode decision method using mode complexity is proposed to reduce the computational complexity. In the proposed method, mode complexity is first computed using the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is highly likely to be the optimal mode, mode complexity is first checked against a predefined threshold to provide an efficient early termination opportunity. If this early termination condition is not met, MBs are classified into three mode types according to the value of mode complexity, i.e., simple mode, medium mode and complex mode, to speed up the encoding process by reducing the number of variable block modes that must be checked. Furthermore, for simple and medium mode regions, the rate distortion (RD) cost of mode 16×16 in the temporal prediction direction is compared with that in the disparity prediction direction, to determine in advance whether the optimal prediction direction is temporal, so that unnecessary disparity estimation can be skipped. Experimental results show that the proposed method reduces the computational load by 78.79% and the total bit rate by 0.07% on average, while incurring only a negligible loss of PSNR (about 0.04 dB on average), compared with the full mode decision (FMD) in the reference software of MVC.
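The decision flow described in this abstract can be sketched as follows. The threshold names, the ordering of the thresholds, the class labels as strings, and the rule that complex MBs always run disparity estimation are assumptions made for illustration; the abstract does not give the actual threshold values or the exact comparison rule.

```python
def classify_macroblock(mode_complexity, t_direct, t_simple, t_medium):
    """Map a mode-complexity value to a decision class (hypothetical thresholds)."""
    if not (t_direct <= t_simple <= t_medium):
        raise ValueError("thresholds must be non-decreasing")
    if mode_complexity < t_direct:
        return "direct"   # early termination: keep direct mode, stop searching
    if mode_complexity < t_simple:
        return "simple"
    if mode_complexity < t_medium:
        return "medium"
    return "complex"

def needs_disparity_estimation(mode_class, rd_temporal_16x16, rd_disparity_16x16):
    """Decide whether disparity estimation is still worth running.

    Assumption: for simple and medium MBs, disparity estimation is skipped
    when the 16x16 temporal RD cost is not worse than the disparity one.
    """
    if mode_class == "direct":
        return False
    if mode_class == "complex":
        return True
    return rd_disparity_16x16 < rd_temporal_16x16

cls = classify_macroblock(3.0, t_direct=1.0, t_simple=4.0, t_medium=8.0)
print(cls, needs_disparity_estimation(cls, 120.0, 150.0))
```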
Funding: Project (No. 511568) supported by the European Commission within Framework Program 6 with the acronym 3DTV.
Abstract: The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user’s viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
Abstract: Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary when using multi-view systems in IBR, so that audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses the scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 72334003); the National Key Research and Development Program of China (Grant No. 2022YFB2702804); the Shandong Key Research and Development Program (Grant No. 2020ZLYS09); the Jinan Program (Grant No. 2021GXRC084-2).
Abstract: With the continuous advancement of unmanned technology in various application domains, the development and deployment of blind-spot-free panoramic video systems have gained increasing importance. Such systems are particularly critical in battlefield environments, where advanced panoramic video processing and wireless communication technologies are essential to enable remote control and autonomous operation of unmanned ground vehicles (UGVs). However, conventional video surveillance systems suffer from several limitations, including limited field of view, high processing latency, low reliability, excessive resource consumption, and significant transmission delays. These shortcomings impede the widespread adoption of UGVs in battlefield settings. To overcome these challenges, this paper proposes a novel multi-channel video capture and stitching system designed for real-time video processing. The system integrates the Speeded-Up Robust Features (SURF) algorithm and the Fast Library for Approximate Nearest Neighbors (FLANN) algorithm to execute essential operations such as feature detection, descriptor computation, image matching, homography estimation, and seamless image fusion. The fused panoramic video is then encoded and assembled to produce a seamless output devoid of stitching artifacts and shadows. Furthermore, H.264 video compression is employed to reduce the data size of the video stream without sacrificing visual quality. Using the Real-Time Streaming Protocol (RTSP), the compressed stream is transmitted efficiently, supporting real-time remote monitoring and control of UGVs in dynamic battlefield environments. Experimental results indicate that the proposed system achieves high stability, flexibility, and low latency. With a wireless link latency of 30 ms, the end-to-end video transmission latency remains around 140 ms, enabling smooth video communication. The system can tolerate packet loss rates (PLR) of up to 20% while maintaining usable video quality (with latency around 200 ms). These properties make it well-suited for mobile communication scenarios demanding high real-time video performance.
Funding: Funded by the Research Project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data, and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
Funding: Supported by the Cultivation Program for Major Scientific Research Projects of Harbin Institute of Technology (ZDXMPY20180109).
Abstract: Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, there has been increasing attention on generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of ego vehicles. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, frequently suffer from limited generalization and depend on substantial input data. Meanwhile, 2D generative models, though capable of producing unknown scenes, still have room for improvement in terms of coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module is presented to improve video consistency by extracting the global context of each frame, calculating relationships between frames using these global representations, and fusing frame contexts accordingly. Moreover, we propose an attention mechanism that computes relations between pixels within each frame and pixels in the corresponding window range of the initial frame. Extensive experiments show that our approach surpasses various state-of-the-art models in driving video generation, and the introduced modules contribute significantly to model performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, which facilitates on-demand simulation to expedite algorithm development.
Abstract: Background: This study aims to investigate the underlying mechanisms between parental marital conflict and adolescent short video dependence by constructing a chain mediation model, focusing on the mediating roles of experiential avoidance and emotional disturbance (anxiety, depression, and stress). Methods: Conducted in January 2025, the research recruited 4125 adolescents from multiple Chinese provinces through convenience sampling; after data cleaning, 3957 valid participants (1959 males, 1998 females) were included. Using a cross-sectional design, measures included parental marital conflict, experiential avoidance, anxiety, depression, stress, and short video dependence. Results: Pearson correlation analysis revealed significant positive correlations among all variables. Mediation analysis using the SPSS PROCESS macro showed that parental marital conflict directly predicted short video dependence (β=0.269, p<0.001), and also significantly predicted experiential avoidance (β=0.519, p<0.001), anxiety (β=0.072, p<0.001), depression (β=0.067, p<0.001), and stress (β=0.048, p<0.05). Experiential avoidance further predicted anxiety (β=0.521, p<0.001), depression (β=0.489, p<0.001), stress (β=0.408, p<0.001), and short video dependence (β=0.244, p<0.001). While both anxiety (β=0.050, p<0.05) and depression (β=0.116, p<0.001) positively predicted short video dependence, stress did not (β=0.019, p=0.257). Overall, experiential avoidance, anxiety, depression, and stress significantly mediated the relationship between parental marital conflict and short video dependence. Conclusion: These findings confirm that parental marital conflict not only directly influences adolescent short video dependence but also operates through a chain mediation pathway involving experiential avoidance and emotional disturbance, highlighting central psychological mechanisms and providing theoretical support for integrated mental health and behavioral interventions.
Funding: Funded by the Guangxi Philosophy and Social Science Research Project, grant number 24XWC002.
Abstract: Background: In the Chinese context, the impact of short video applications on the psychological well-being of older adults is contested. While often examined through a pathological lens of addiction, this perspective may overlook paradoxical, context-dependent positive outcomes. Therefore, the main objective of this study is to challenge the traditional Compensatory Internet Use Theory by proposing and testing a chained mediation model that explores a paradoxical pathway from social support to life satisfaction via problematic social media use. Methods: Data were collected between July and August 2025 via the Credamo online survey platform, yielding 384 valid responses from Chinese older adults aged 60 and above. Key constructs were assessed using the Social Support Rating Scale (SSRS), Bergen Social Media Addiction Scale (BSMAS), Simplified UCLA Loneliness Scale, and Satisfaction with Life Scale (SWLS). A chained mediation model was tested using stepwise regression and non-parametric bootstrapping (5000 resamples), controlling for age, gender, household income, and health status. Results: The analysis revealed a paradoxical pathway, which was clarified by a key statistical suppression effect. Social support significantly and positively predicted problematic usage (β=0.157, p=0.002). After controlling for the suppressor effect of social support, problematic usage in turn negatively predicted social connectedness (β=−0.177, p<0.001). Finally, reduced social connectedness, reflecting a state of solitude, positively predicted life satisfaction (β=−0.227, p<0.001). Conclusion: The findings suggest that for older adults with sufficient offline social support, these resources may serve a “social empowerment” function. This empowerment allows behaviors measured as “problematic usage” to be theoretically reframed as a form of “deep immersive entertainment”. This immersion appears to occur alongside a state of “high-quality solitude”, which ultimately is associated with higher life satisfaction. This study provides a novel, non-pathological theoretical perspective on the consequences of high engagement with emerging social media, offering empirical grounds for non-abstinence-based intervention strategies.
Funding: Supported partially by the National Natural Science Foundation of China (Grant No. 60503063); the National High-Tech Research & Development Program of China (Grant No. 2006AA01Z321); the National Basic Research Program of China (Grant No. 2006CB303103).
Abstract: Multi-view video coding (MVC) comprises rich 3D information and is widely used in new visual media, such as 3DTV and free viewpoint TV (FTV). However, even with mainstream computer manufacturers migrating to multi-core processors, the huge computational requirement of MVC currently prohibits its wide use in consumer markets. In this paper, we demonstrate the design and implementation of the first parallel MVC system on the Cell Broadband Engine™ processor, a state-of-the-art multi-core processor. We propose an adaptive, data-driven task-dispatching algorithm that operates at the frame level for MVC, and implement a parallel multi-view video decoder with a modified H.264/AVC codec on a real machine. This approach provides scalable speedup (up to 16 times on sixteen cores) through proper local store management, utilization of code locality and SIMD improvement. Decoding speed, speedup and core utilization rate are reported in the experimental results.
Funding: Project supported by the National Natural Science Foundation of China (No. 60802013) and the Zhejiang Provincial Natural Science Foundation of China (No. Y106574).
Abstract: New video applications, such as 3D video and free viewpoint video, require efficient compression of multi-view video. In addition to temporal redundancy, exploiting the inter-view redundancy is crucial to improve the performance of multi-view video coding. In this paper, we present a novel method to construct the optimal inter-view prediction structure for multi-view video coding using simulated annealing. In the proposed model, the design of the prediction structure is converted to the arrangement of coding order. Then, a simulated annealing algorithm is employed to minimize the total cost and obtain the best coding order. This method is applicable to arbitrary irregular camera arrangements. Experimental results show that the annealing process converges rapidly to satisfactory results and that the generated prediction structure outperforms the reference prediction structure of the joint multi-view video model (JMVM) by 0.1-0.8 dB in PSNR.
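Searching for a coding order with simulated annealing can be sketched as below. The cost model is a stand-in: it assumes a hypothetical matrix `cost[a][b]` giving the cost of predicting view `b` from view `a`, and sums it over consecutive views in the order. The swap move, the geometric cooling schedule, and all parameter values are likewise illustrative assumptions rather than the design used in the paper.

```python
import math
import random

def total_cost(order, cost):
    """Sum of the pairwise costs between consecutive views in the coding order."""
    return sum(cost[a][b] for a, b in zip(order, order[1:]))

def anneal_coding_order(cost, start_temp=10.0, cooling=0.995, steps=2000, seed=0):
    """Return (best_order, best_cost) found by simulated annealing over permutations."""
    rng = random.Random(seed)
    n = len(cost)  # requires at least two views
    current = list(range(n))
    current_cost = total_cost(current, cost)
    best, best_cost = current[:], current_cost
    temp = start_temp
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # propose swapping two positions
        candidate = current[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        delta = total_cost(candidate, cost) - current_cost
        # Always accept improvements; accept worse orders with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp *= cooling
    return best, best_cost

# Toy example: cameras on a line, cost grows with the distance between views.
positions = [0, 3, 1, 2]
toy_cost = [[abs(p - q) for q in positions] for p in positions]
print(anneal_coding_order(toy_cost))
```

Keeping track of the best order seen so far means the returned cost is never worse than that of the starting order, even when the walk accepts uphill moves.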
Abstract: Distributed video coding (DVC) is a new video coding approach based on the Wyner-Ziv theorem. Uplink-friendly DVC, which offers low-complexity, low-power, and low-cost video encoding, has attracted growing research interest. In this paper, a new method based on multiple view geometry is presented for spatial side information generation in uncalibrated video sensor networks. The trifocal tensor encapsulates all the geometric relations among three views that are independent of scene structure; it can be computed from image correspondences alone without requiring knowledge of the motion or calibration. Simulation results show that trifocal tensor-based spatial side information improves the rate-distortion performance over motion compensation based interpolation side information by a maximum gap of around 2 dB. Fusion then merges the temporal and spatial side information to improve the quality of the final side information. Simulation results show a rate-distortion gain of about 0.4 dB.
Funding: Funded by the Research Project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex, high-dimensional data that single-view methods cannot capture. Traditional fuzzy clustering techniques, such as Fuzzy C-Means (FCM), face significant challenges in handling uncertainty and the dependencies between different views. To overcome these limitations, we introduce a new multi-view fuzzy clustering approach, termed Multi-view Picture Fuzzy Clustering (MPFC), that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data, aiming to enhance clustering accuracy and robustness. In particular, picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels: membership degrees, neutral degrees, and refusal degrees. This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models. Meanwhile, dual-anchor graphs exploit the similarity relationships between data points and integrate information across views. This combination improves stability, scalability, and robustness when handling noisy and heterogeneous data. Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency, outperforming traditional methods. Specifically, the MPFC algorithm attains a Purity (PUR) score of 0.6440 and an Accuracy (ACC) score of 0.6213 on the 3 Sources dataset, underscoring its robustness and efficiency. The proposed approach significantly contributes to fields such as pattern recognition, multi-view relational data analysis, and large-scale clustering problems. Future work will focus on extending the method to semi-supervised multi-view clustering, aiming to enhance adaptability, scalability, and performance in real-world applications.
Funding: Supported by the research on key technologies for monitoring and identifying drug abuse of anesthetic drugs and psychotropic drugs, and intervention for addiction (No. 2023YFC3304200); the program of a study on the diagnosis of addiction to synthetic cannabinoids and methods of assessing the risk of abuse (No. 2022YFC3300905); the program of Ab initio design and generation of AI models for small molecule ligands based on target structures (No. 2022PE0AC03), ZHIJIANG LAB.
Abstract: The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC’s effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Funding: Supported by the National Natural Science Foundation of China (32122066, 32201855) and STI2030-Major Projects (2023ZD04076).
Abstract: Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. By integrating multi-view information into phenotypic prediction, a multi-view best linear unbiased prediction (MVBLUP) method is proposed in this paper. To measure the importance of multiple data views, the differential evolution algorithm with an early stopping mechanism is used, by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we perform empirical experiments on four multi-view datasets in different crops. Compared to the single-view method, the prediction accuracy of the MVBLUP method improved by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
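One simple way to picture a multi-view kinship matrix is as a weighted combination of per-view kinship matrices, with the weights expressing view importance. The sketch below assumes exactly that form, a normalized weighted sum; the abstract does not state how the matrices are combined, and the weights here are hand-picked placeholders rather than values found by differential evolution.

```python
def combine_kinship(kinships, weights):
    """Return the normalized weighted sum of per-view kinship matrices.

    kinships: list of square matrices (lists of lists) of equal size, one per view.
    weights: non-negative view-importance weights (hypothetical values).
    """
    if not kinships or len(kinships) != len(weights):
        raise ValueError("need one weight per kinship matrix")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    total = sum(weights)
    n = len(kinships[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(kinships, weights):
        share = weight / total  # normalize so the shares sum to one
        for i in range(n):
            for j in range(n):
                combined[i][j] += share * matrix[i][j]
    return combined

view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 1.0], [1.0, 1.0]]
print(combine_kinship([view_a, view_b], [1.0, 3.0]))
```

An optimizer such as differential evolution would then search over the weights, scoring each candidate by the prediction accuracy of the resulting model on held-out data.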