Multi-View Stereo(MVS)is a pivotal technique in computer vision for reconstructing 3D models from multiple images by estimating depth maps.However,the reconstruction performance is hindered by visibility challenges,su...Multi-View Stereo(MVS)is a pivotal technique in computer vision for reconstructing 3D models from multiple images by estimating depth maps.However,the reconstruction performance is hindered by visibility challenges,such as occlusions and non-overlapping regions.In this paper,we propose an innovative visibility-aware framework to address these issues.Central to our method is an Epipolar Line-based Transformer(ELT)module,which capitalizes on the epipolar line correspondence and candidate matching features between images to enhance the feature representation and correlation robustness.Furthermore,we propose a novel Supervised Visibility Estimation(SVE)module that estimates high-precision visibility maps,transcending the constraints of previous methods that rely on indirect supervision.By integrating these modules,our method achieves state-of-the-art results on the benchmarks and demonstrates its capability to perform high-quality reconstructions even in challenging regions.The code will be released at https://github.com/npucvr/ETV-MVS.展开更多
High-resolution sub-meter satellite data play an increasingly crucial role in the 3D real-scene China construction initiative.Current research on 3D reconstruction using high-resolution satellite data primarily focuse...High-resolution sub-meter satellite data play an increasingly crucial role in the 3D real-scene China construction initiative.Current research on 3D reconstruction using high-resolution satellite data primarily focuses on two approaches:Multi-stereo fusion and multi-view matching.While algorithms based on these two methodologies for multi-view image 3D reconstruction have reached relative maturity,no systematic comparison has been conducted specifically on satellite data to evaluate the relative merits of multi-stereo fusion versus multi-view matching methods.This paper conducts a comparative analysis of the practical accuracy of both approaches using high-resolution satellite datasets from diverse geographical regions.To ensure fairness in accuracy comparison,both methodologies employ non-local dense matching for cost optimization.Results demonstrate that the multi-stereo fusion method outperforms multi-view matching in all evaluation metrics,exhibiting approximately 1.2%higher average matching accuracy and 10.7%superior elevation precision in the experimental datasets.Therefore,for 3D modeling applications using satellite data,we recommend adopting the multi-stereo fusion approach for digital surface model(DSM)product generation.展开更多
In multi-view stereo,unreliable matching in low-textured regions has a negative impact on the completeness of reconstructed models.Since the photometric consistency of low-textured regions is not discriminative under ...In multi-view stereo,unreliable matching in low-textured regions has a negative impact on the completeness of reconstructed models.Since the photometric consistency of low-textured regions is not discriminative under a local window,non-local information provided by the Markov Random Field(MRF)model can alleviate the matching ambiguity but is limited in continuous space with high computational complexity.Owing to its sampling and propagation strategy,PatchMatch multi-view stereo methods have advantages in terms of optimizing the continuous labeling problem.In this paper,we propose a novel method to address this problem,namely the Coarse-Hypotheses Guided Non-Local PAtchMatch Multi-View Stereo(CNLPA-MVS),which takes the advantages of both MRF-based non-local methods and PatchMatch multi-view stereo and compensates for their defects mutually.First,we combine dynamic programing(DP)and sequential propagation along scanlines in parallel to perform CNLPA-MVS,thereby obtaining the optimal depth and normal hypotheses.Second,we introduce coarse inference within a universal window provided by winner-takes-all to eliminate the stripe artifacts caused by DP and improve completeness.Third,we add a local consistency strategy based on the hypotheses of similar color pixels sharing approximate values into CNLPA-MVS for further improving completeness.CNLPA-MVS was validated on public benchmarks and achieved state-of-the-art performance with high completeness.展开更多
Learning-based multi-view stereo(MVS)algorithms have demonstrated great potential for depth estimation in recent years.However,they still struggle to estimate accurate depth in texture-less planar regions,which limits...Learning-based multi-view stereo(MVS)algorithms have demonstrated great potential for depth estimation in recent years.However,they still struggle to estimate accurate depth in texture-less planar regions,which limits their reconstruction perform-ance in man-made scenes.In this paper,we propose PlaneStereo,a new framework that utilizes planar prior to facilitate the depth estim-ation.Our key intuition is that pixels inside a plane share the same set of plane parameters,which can be estimated collectively using in-formation inside the whole plane.Specifically,our method first segments planes in the reference image,and then fits 3D plane paramet-ers for each segmented plane by solving a linear system using high-confidence depth predictions inside the plane.This allows us to recov-er the plane parameters accurately,which can be converted to accurate depth values for each point in the plane,improving the depth prediction for low-textured local regions.This process is fully differentiable and can be integrated into existing learning-based MVS al-gorithms.Experiments show that using our method consistently improves the performance of existing stereo matching and MVS al-gorithms on DeMoN and ScanNet datasets,achieving state-of-the-art performance.展开更多
In this paper,we present a practical method for reconstructing the bidirectional reflectance distribution function(BRDF)from multiple images of a real object composed of a homogeneous material.The key idea is that the...In this paper,we present a practical method for reconstructing the bidirectional reflectance distribution function(BRDF)from multiple images of a real object composed of a homogeneous material.The key idea is that the BRDF can be sampled after geometry estimation using multi-view stereo(MVS)techniques.Our contribution is selection of reliable samples of lighting,surface normal,and viewing directions for robustness against estimation errors of MVS.Our method is quantitatively evaluated using synthesized images and its effectiveness is shown via real-world experiments.展开更多
Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex,high-dimensional data that single-view methods cannot capture.Traditional fuzzy cl...Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex,high-dimensional data that single-view methods cannot capture.Traditional fuzzy clustering techniques,such as Fuzzy C-Means(FCM),face significant challenges in handling uncertainty and the dependencies between different views.To overcome these limitations,we introduce a new multi-view fuzzy clustering approach that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data,aiming to enhance clustering accuracy and robustness,termed Multi-view Picture Fuzzy Clustering(MPFC).In particular,the picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels:membership degrees,neutral degrees,and refusal degrees.This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models.Meanwhile,dual-anchor graphs exploit the similarity relationships between data points and integrate information across views.This combination improves stability,scalability,and robustness when handling noisy and heterogeneous data.Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency,outperforming traditional methods.Specifically,the MPFC algorithm demonstrates outstanding clustering performance on a variety of datasets,attaining a Purity(PUR)score of 0.6440 and an Accuracy(ACC)score of 0.6213 for the 3 Sources dataset,underscoring its robustness and efficiency.The proposed approach significantly contributes to fields such as pattern recognition,multi-view relational data analysis,and large-scale clustering problems.Future work will focus on extending the method for semi-supervised multi-view clustering,aiming to enhance adaptability,scalability,and performance in real-world applications.展开更多
The accurate prediction of drug absorption,distribution,metabolism,excretion,and toxicity(ADMET)properties represents a crucial step in early drug development for reducing failure risk.Current deep learning approaches...The accurate prediction of drug absorption,distribution,metabolism,excretion,and toxicity(ADMET)properties represents a crucial step in early drug development for reducing failure risk.Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks.This research proposes molecular properties prediction with parallel-view and collaborative learning(MolP-PC),a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints(MFs),2D molecular graphs,and 3D geometric representations,incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions.Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks,with its multi-task learning(MTL)mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks.Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization.A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC’s effective generalization in predicting key pharmacokinetic parameters such as half-life(T0.5)and clearance(CL),indicating its practical utility in drug modeling.However,the model exhibits a tendency to underestimate volume of distribution(VD),indicating potential for improvement in analyzing compounds with high tissue distribution.This study presents an efficient and interpretable approach for ADMET property prediction,establishing a novel framework for molecular optimization and risk assessment in drug development.展开更多
Phenotypic prediction is a promising strategy for accelerating plant breeding.Data from multiple sources(called multi-view data)can provide complementary information to characterize a biological object from various as...Phenotypic prediction is a promising strategy for accelerating plant breeding.Data from multiple sources(called multi-view data)can provide complementary information to characterize a biological object from various aspects.By integrating multi-view information into phenotypic prediction,a multi-view best linear unbiased prediction(MVBLUP)method is proposed in this paper.To measure the importance of multiple data views,the differential evolution algorithm with an early stopping mechanism is used,by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction.To further illustrate the characteristics of MVBLUP,we perform the empirical experiments on four multi-view datasets in different crops.Compared to the single-view method,the prediction accuracy of the MVBLUP method has improved by 0.038–0.201 on average.The results demonstrate that the MVBLUP is an effective integrative prediction method for multi-view data.展开更多
The morphological description of wear particles in lubricating oil is crucial for wear state monitoring and fault diagnosis in aero-engines.Accurately and comprehensively acquiring three-dimensional(3D)morphological d...The morphological description of wear particles in lubricating oil is crucial for wear state monitoring and fault diagnosis in aero-engines.Accurately and comprehensively acquiring three-dimensional(3D)morphological data of these particles has became a key focus in wear debris analysis.Herein,we develop a novel multi-view polarization-sensitive optical coherence tomography(PS-OCT)method to achieve accurate 3D morphology detection and reconstruction of aero-engine lubricant wear particles,effectively resolving occlusion-induced information loss while enabling material-specific characterization.The particle morphology is captured by multi-view imaging,followed by filtering,sharpening,and contour recognition.The method integrates advanced registration algorithms with Poisson reconstruction to generate high-precision 3D models.This approach not only provides accurate 3D morphological reconstruction but also mitigates information loss caused by particle occlusion,ensuring model completeness.Furthermore,by collecting polarization characteristics of typical metals and their oxides in aero-engine lubricants,this work comprehensively characterizes and comparatively analyzes particle polarization properties using Stokes vectors,polarization uniformity,and cumulative phase retardation,and obtains a three-dimensional model containing polarization information.Ultimately,the proposed method enables multidimensional information acquisition for the reliable identification of abrasive particle types.展开更多
Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data,while the learned representation is difficult to maintain the underlying structure hidden in the origin s...Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data,while the learned representation is difficult to maintain the underlying structure hidden in the origin samples,especially the high-order neighbor relationship between samples.To overcome the above challenges,this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model.We creatively integrate the multi-order proximity graph structures of different views into the self-expressive layer by a multi-order neighborhood fusion module.By this design,the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix;then,we can obtain an optimal global affinity matrix where each connected node belongs to one cluster.In addition,the discriminative constraint between views is designed to further improve the clustering performance.A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods.The code is available at https://github.com/songzuolong/MNF-MDSC(accessed on 25 December 2024).展开更多
The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show...The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data,as they rely on a single-dimensional membership value.To overcome these limitations,we propose an auto-weighted multi-view neutrosophic fuzzy clustering(AW-MVNFC)algorithm.Our method leverages the neutrosophic framework,an extension of fuzzy sets,to explicitly model imprecision and ambiguity through three membership degrees.The core novelty of AWMVNFC lies in a hierarchical weighting strategy that adaptively learns the contributions of both individual data views and the importance of each feature within a view.Through a unified objective function,AW-MVNFC jointly optimizes the neutrosophic membership assignments,cluster centers,and the distributions of view and feature weights.Comprehensive experiments conducted on synthetic and real-world datasets demonstrate that our algorithm achieves more accurate and stable clustering than existing methods,demonstrating its effectiveness in handling the complexities of multi-view data.展开更多
With technological advancements,virtual reality(VR),once limited to high-end professional applications,is rapidly expanding into entertainment and broader consumer domains.However,the inherent contradiction between mo...With technological advancements,virtual reality(VR),once limited to high-end professional applications,is rapidly expanding into entertainment and broader consumer domains.However,the inherent contradiction between mobile hardware computing power and the demand for high-resolution,high-refresh-rate rendering has intensified,leading to critical bottlenecks,including frame latency and power overload,which constrain large-scale applications of VR systems.This study systematically analyzes four key technologies for efficient VR rendering:(1)foveated rendering,which dynamically reduces rendering precision in peripheral regions based on the physiological characteristics of the human visual system(HVS),thereby significantly decreasing graphics computation load;(2)stereo rendering,optimized through consistent stereo rendering acceleration algorithms;(3)cloud rendering,utilizing object-based decomposition and illumination-based decomposition for distributed resource scheduling;and(4)low-power rendering,integrating parameter-optimized rendering,super-resolution technology,and frame-generation technology to enhance mobile energy efficiency.Through a systematic review of the core principles and optimization approaches of these technologies,this study establishes research benchmarks for developing efficient VR systems that achieve high fidelity and low latency while providing further theoretical support for the engineering implementation and industrial advancement of VR rendering technologies.展开更多
Drug repurposing offers a promising alternative to traditional drug development and significantly re-duces costs and timelines by identifying new therapeutic uses for existing drugs.However,the current approaches ofte...Drug repurposing offers a promising alternative to traditional drug development and significantly re-duces costs and timelines by identifying new therapeutic uses for existing drugs.However,the current approaches often rely on limited data sources and simplistic hypotheses,which restrict their ability to capture the multi-faceted nature of biological systems.This study introduces adaptive multi-view learning(AMVL),a novel methodology that integrates chemical-induced transcriptional profiles(CTPs),knowledge graph(KG)embeddings,and large language model(LLM)representations,to enhance drug repurposing predictions.AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning(MVL),matrix factorization,and ensemble optimization techniques to integrate heterogeneous multi-source data.Comprehensive evaluations on benchmark datasets(Fdata-set,Cdataset,and Ydataset)and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art(SOTA)methods,achieving superior accuracy in predicting drug-disease associations across multiple metrics.Literature-based validation further confirmed the model's predictive capabilities,with seven out of the top ten predictions corroborated by post-2011 evidence.To promote transparency and reproducibility,all data and codes used in this study were open-sourced,providing resources for pro-cessing CTPs,KG,and LLM-based similarity calculations,along with the complete AMVL algorithm and benchmarking procedures.By unifying diverse data modalities,AMVL offers a robust and scalable so-lution for accelerating drug discovery,fostering advancements in translational medicine and integrating multi-omics data.We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.展开更多
Drone swarm systems,equipped with photoelectric imaging and intelligent target perception,are essential for reconnaissance and strike missions in complex and high-risk environments.They excel in information sharing,an...Drone swarm systems,equipped with photoelectric imaging and intelligent target perception,are essential for reconnaissance and strike missions in complex and high-risk environments.They excel in information sharing,anti-jamming capabilities,and combat performance,making them critical for future warfare.However,varied perspectives in collaborative combat scenarios pose challenges to object detection,hindering traditional detection algorithms and reducing accuracy.Limited angle-prior data and sparse samples further complicate detection.This paper presents the Multi-View Collaborative Detection System,which tackles the challenges of multi-view object detection in collaborative combat scenarios.The system is designed to enhance multi-view image generation and detection algorithms,thereby improving the accuracy and efficiency of object detection across varying perspectives.First,an observation model for three-dimensional targets through line-of-sight angle transformation is constructed,and a multi-view image generation algorithm based on the Pix2Pix network is designed.For object detection,YOLOX is utilized,and a deep feature extraction network,BA-RepCSPDarknet,is developed to address challenges related to small target scale and feature extraction challenges.Additionally,a feature fusion network NS-PAFPN is developed to mitigate the issue of deep feature map information loss in UAV images.A visual attention module(BAM)is employed to manage appearance differences under varying angles,while a feature mapping module(DFM)prevents fine-grained feature loss.These advancements lead to the development of BA-YOLOX,a multi-view object detection network model suitable for drone platforms,enhancing accuracy and effectively targeting small objects.展开更多
With the rapid progress of the artificial intelligence(AI)technology and mobile internet,3D hand pose estimation has become critical to various intelligent application areas,e.g.,human-computer interaction.To avoid th...With the rapid progress of the artificial intelligence(AI)technology and mobile internet,3D hand pose estimation has become critical to various intelligent application areas,e.g.,human-computer interaction.To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation,this paper proposes a novel multi-modal multi-view(MMV)3D hand pose estimation system,which introduces a registration before translation(RT)-translation before registration(TR)jointed conditional generative adversarial network(cGAN)to train a multi-modal registration network,and then employs the multi-modal feature fusion to achieve high-quality estimation,with low hardware and software costs both in data acquisition and processing.Experimental results demonstrate that the MMV system is effective and feasible in various scenarios.It is promising for the MMV system to be used in broad intelligent application areas.展开更多
Accurate digital terrain models(DTMs)are essential for a wide range of geospatial and environmental applications,yet their derivation in forested regions remains a significant challenge.Existing global DTMs,typically ...Accurate digital terrain models(DTMs)are essential for a wide range of geospatial and environmental applications,yet their derivation in forested regions remains a significant challenge.Existing global DTMs,typically generated from satellite stereo photogrammetry or interferometric synthetic aperture radar(InSAR),fail to accurately capture understory terrain due to limited penetration capabilities,resulting in elevation overestimation in densely vegetated areas.While airborne light detection and ranging(LiDAR)can provide high-accuracy DTMs,its limited spatial coverage and high acquisition cost hinder large-scale applications.Thus,there is an urgent need for a scalable and cost-effective approach to extract DTMs directly from satellite-derived digital surface models(DSMs).In this study,we propose a simple,interpretable understory terrain extraction method that utilizes canopy height data from Global Ecosystem Dynamics Investigation(GEDI)and Ice,Cloud,and Land Elevation Satellite-2(ICESat-2)to construct a tree height surface model,which is then subtracted from the stereo-derived DSM to generate the final DTM.By directly incorporating LiDAR constraints,the method avoids error propagation from multiple heterogeneous datasets and reduces reliance on ancillary inputs,ensuring ease of implementation and broad applicability.In contrast to machine learning-based terrain modeling methods,which are often prone to overfitting and data bias,the proposed approach is simple,interpretable,and robust across diverse forested landscapes.The accuracy of the resulting DTM was validated against airborne LiDAR reference data and compared with both the Copernicus Digital Elevation Model(DEM)and the forest and buildings removed DEM(FABDEM),a global bare-earth elevation model corrected for vegetation bias.The results indicate that the proposed DTM consistently outperforms the Copernicus DEM(CopDEM)and achieves accuracy comparable to FABDEM.In addition,its finer spatial resolution of 1 m,compared to the 30 m resolution of FABDEM,allows for more detailed terrain representation and better capture of fine-scale variation.This advantage is most pronounced in gently to moderately sloped areas,where the proposed DTM shows clearly higher accuracy than both the CopDEM and FABDEM.The results confirm that high-resolution DTMs can be effectively extracted from DSMs using spaceborne LiDAR constraints,offering a scalable solution for terrain modeling in forested environments where airborne LiDAR is unavailable.To illustrate the potential utility of the proposed DTM,we applied it to a fire risk mapping application based on topographic parameters such as slope,aspect,and elevation.This case highlights how improved terrain representation can support geospatial hazard assessments.展开更多
As the location of the wheel center is the key to accurately measuring the wheelbase, the wheelbase difference and the wheel static radius, a high-precision wheel center detection method based on stereo vision is prop...As the location of the wheel center is the key to accurately measuring the wheelbase, the wheelbase difference and the wheel static radius, a high-precision wheel center detection method based on stereo vision is proposed. First, according to the prior information, the contour of the wheel hub is extracted and fitted as an ellipse curve, and the ellipse fitting equation can be obtained. Then, a new un-tangent constraint is adopted to improve the ellipse matching precision. Finally, the 3D coordinates of the wheel center can be reconstructed by the spatial circle projection algorithm with low time complexity and high measurement accuracy. Simulation experiments verify that compared with the ellipse center reconstruction algorithm and the planar constraint optimization algorithm, the proposed method can acquire the 3D coordinates of the spatial circle more exactly. Furthermore, the measurements of the wheelbase, the wheelbase difference and the wheel static radius for three types of vehicles demonstrate the effectiveness of the proposed method for wheel center detection.展开更多
基金supported by the National Natural Science Foundation of China(No.62271410)the Fundamental Research Funds for the Central Universities.
文摘Multi-View Stereo(MVS)is a pivotal technique in computer vision for reconstructing 3D models from multiple images by estimating depth maps.However,the reconstruction performance is hindered by visibility challenges,such as occlusions and non-overlapping regions.In this paper,we propose an innovative visibility-aware framework to address these issues.Central to our method is an Epipolar Line-based Transformer(ELT)module,which capitalizes on the epipolar line correspondence and candidate matching features between images to enhance the feature representation and correlation robustness.Furthermore,we propose a novel Supervised Visibility Estimation(SVE)module that estimates high-precision visibility maps,transcending the constraints of previous methods that rely on indirect supervision.By integrating these modules,our method achieves state-of-the-art results on the benchmarks and demonstrates its capability to perform high-quality reconstructions even in challenging regions.The code will be released at https://github.com/npucvr/ETV-MVS.
文摘High-resolution sub-meter satellite data play an increasingly crucial role in the 3D real-scene China construction initiative.Current research on 3D reconstruction using high-resolution satellite data primarily focuses on two approaches:Multi-stereo fusion and multi-view matching.While algorithms based on these two methodologies for multi-view image 3D reconstruction have reached relative maturity,no systematic comparison has been conducted specifically on satellite data to evaluate the relative merits of multi-stereo fusion versus multi-view matching methods.This paper conducts a comparative analysis of the practical accuracy of both approaches using high-resolution satellite datasets from diverse geographical regions.To ensure fairness in accuracy comparison,both methodologies employ non-local dense matching for cost optimization.Results demonstrate that the multi-stereo fusion method outperforms multi-view matching in all evaluation metrics,exhibiting approximately 1.2%higher average matching accuracy and 10.7%superior elevation precision in the experimental datasets.Therefore,for 3D modeling applications using satellite data,we recommend adopting the multi-stereo fusion approach for digital surface model(DSM)product generation.
基金supported by the National Natural Science Foundation of China under Grant Nos.61732015,61932018,and 61472349the National Key Research and Development Program of China under Grant No.2017YFB0202203.
文摘In multi-view stereo,unreliable matching in low-textured regions has a negative impact on the completeness of reconstructed models.Since the photometric consistency of low-textured regions is not discriminative under a local window,non-local information provided by the Markov Random Field(MRF)model can alleviate the matching ambiguity but is limited in continuous space with high computational complexity.Owing to its sampling and propagation strategy,PatchMatch multi-view stereo methods have advantages in terms of optimizing the continuous labeling problem.In this paper,we propose a novel method to address this problem,namely the Coarse-Hypotheses Guided Non-Local PAtchMatch Multi-View Stereo(CNLPA-MVS),which takes the advantages of both MRF-based non-local methods and PatchMatch multi-view stereo and compensates for their defects mutually.First,we combine dynamic programing(DP)and sequential propagation along scanlines in parallel to perform CNLPA-MVS,thereby obtaining the optimal depth and normal hypotheses.Second,we introduce coarse inference within a universal window provided by winner-takes-all to eliminate the stripe artifacts caused by DP and improve completeness.Third,we add a local consistency strategy based on the hypotheses of similar color pixels sharing approximate values into CNLPA-MVS for further improving completeness.CNLPA-MVS was validated on public benchmarks and achieved state-of-the-art performance with high completeness.
文摘Learning-based multi-view stereo(MVS)algorithms have demonstrated great potential for depth estimation in recent years.However,they still struggle to estimate accurate depth in texture-less planar regions,which limits their reconstruction perform-ance in man-made scenes.In this paper,we propose PlaneStereo,a new framework that utilizes planar prior to facilitate the depth estim-ation.Our key intuition is that pixels inside a plane share the same set of plane parameters,which can be estimated collectively using in-formation inside the whole plane.Specifically,our method first segments planes in the reference image,and then fits 3D plane paramet-ers for each segmented plane by solving a linear system using high-confidence depth predictions inside the plane.This allows us to recov-er the plane parameters accurately,which can be converted to accurate depth values for each point in the plane,improving the depth prediction for low-textured local regions.This process is fully differentiable and can be integrated into existing learning-based MVS al-gorithms.Experiments show that using our method consistently improves the performance of existing stereo matching and MVS al-gorithms on DeMoN and ScanNet datasets,achieving state-of-the-art performance.
基金partly supported by JSPS KAKENHI JP15K16027,JP26700013,JP15H05918,JP19H04138,JST CREST JP179423the Foundation for Nara Institute of Science and Technology.
文摘In this paper,we present a practical method for reconstructing the bidirectional reflectance distribution function(BRDF)from multiple images of a real object composed of a homogeneous material.The key idea is that the BRDF can be sampled after geometry estimation using multi-view stereo(MVS)techniques.Our contribution is selection of reliable samples of lighting,surface normal,and viewing directions for robustness against estimation errors of MVS.Our method is quantitatively evaluated using synthesized images and its effectiveness is shown via real-world experiments.
基金funded by the Research Project:THTETN.05/24-25,VietnamAcademy of Science and Technology.
文摘Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex,high-dimensional data that single-view methods cannot capture.Traditional fuzzy clustering techniques,such as Fuzzy C-Means(FCM),face significant challenges in handling uncertainty and the dependencies between different views.To overcome these limitations,we introduce a new multi-view fuzzy clustering approach that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data,aiming to enhance clustering accuracy and robustness,termed Multi-view Picture Fuzzy Clustering(MPFC).In particular,the picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels:membership degrees,neutral degrees,and refusal degrees.This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models.Meanwhile,dual-anchor graphs exploit the similarity relationships between data points and integrate information across views.This combination improves stability,scalability,and robustness when handling noisy and heterogeneous data.Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency,outperforming traditional methods.Specifically,the MPFC algorithm demonstrates outstanding clustering performance on a variety of datasets,attaining a Purity(PUR)score of 0.6440 and an Accuracy(ACC)score of 0.6213 for the 3 Sources dataset,underscoring its robustness and efficiency.The proposed approach significantly contributes to fields such as pattern recognition,multi-view relational data analysis,and large-scale clustering problems.Future work will focus on extending the method for semi-supervised multi-view clustering,aiming to enhance adaptability,scalability,and performance in real-world applications.
基金supported by the research on key technologies for monitoring and identifying drug abuse of anesthetic drugs and psychotropic drugs,and intervention for addiction(No.2023YFC3304200)the program of a study on the diagnosis of addiction to synthetic cannabinoids and methods of assessing the risk of abuse(No.2022YFC3300905)+1 种基金the program of Ab initio design and generation of AI models for small molecule ligands based on target structures(No.2022PE0AC03)ZHIJIANG LAB.
文摘The accurate prediction of drug absorption,distribution,metabolism,excretion,and toxicity(ADMET)properties represents a crucial step in early drug development for reducing failure risk.Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks.This research proposes molecular properties prediction with parallel-view and collaborative learning(MolP-PC),a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints(MFs),2D molecular graphs,and 3D geometric representations,incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions.Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks,with its multi-task learning(MTL)mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks.Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization.A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC’s effective generalization in predicting key pharmacokinetic parameters such as half-life(T0.5)and clearance(CL),indicating its practical utility in drug modeling.However,the model exhibits a tendency to underestimate volume of distribution(VD),indicating potential for improvement in analyzing compounds with high tissue distribution.This study presents an efficient and interpretable approach for ADMET property prediction,establishing a novel framework for molecular optimization and risk assessment in drug development.
基金supported by National Natural Science Foundation of China(32122066,32201855)STI2030—Major Projects(2023ZD04076).
文摘Phenotypic prediction is a promising strategy for accelerating plant breeding.Data from multiple sources(called multi-view data)can provide complementary information to characterize a biological object from various aspects.By integrating multi-view information into phenotypic prediction,a multi-view best linear unbiased prediction(MVBLUP)method is proposed in this paper.To measure the importance of multiple data views,the differential evolution algorithm with an early stopping mechanism is used,by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction.To further illustrate the characteristics of MVBLUP,we perform the empirical experiments on four multi-view datasets in different crops.Compared to the single-view method,the prediction accuracy of the MVBLUP method has improved by 0.038–0.201 on average.The results demonstrate that the MVBLUP is an effective integrative prediction method for multi-view data.
文摘The morphological description of wear particles in lubricating oil is crucial for wear state monitoring and fault diagnosis in aero-engines.Accurately and comprehensively acquiring three-dimensional(3D)morphological data of these particles has became a key focus in wear debris analysis.Herein,we develop a novel multi-view polarization-sensitive optical coherence tomography(PS-OCT)method to achieve accurate 3D morphology detection and reconstruction of aero-engine lubricant wear particles,effectively resolving occlusion-induced information loss while enabling material-specific characterization.The particle morphology is captured by multi-view imaging,followed by filtering,sharpening,and contour recognition.The method integrates advanced registration algorithms with Poisson reconstruction to generate high-precision 3D models.This approach not only provides accurate 3D morphological reconstruction but also mitigates information loss caused by particle occlusion,ensuring model completeness.Furthermore,by collecting polarization characteristics of typical metals and their oxides in aero-engine lubricants,this work comprehensively characterizes and comparatively analyzes particle polarization properties using Stokes vectors,polarization uniformity,and cumulative phase retardation,and obtains a three-dimensional model containing polarization information.Ultimately,the proposed method enables multidimensional information acquisition for the reliable identification of abrasive particle types.
基金supported by the National Key R&D Program of China(2023YFC3304600).
文摘Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data,while the learned representation is difficult to maintain the underlying structure hidden in the origin samples,especially the high-order neighbor relationship between samples.To overcome the above challenges,this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model.We creatively integrate the multi-order proximity graph structures of different views into the self-expressive layer by a multi-order neighborhood fusion module.By this design,the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix;then,we can obtain an optimal global affinity matrix where each connected node belongs to one cluster.In addition,the discriminative constraint between views is designed to further improve the clustering performance.A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods.The code is available at https://github.com/songzuolong/MNF-MDSC(accessed on 25 December 2024).
文摘The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations.However,traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data,as they rely on a single-dimensional membership value.To overcome these limitations,we propose an auto-weighted multi-view neutrosophic fuzzy clustering(AW-MVNFC)algorithm.Our method leverages the neutrosophic framework,an extension of fuzzy sets,to explicitly model imprecision and ambiguity through three membership degrees.The core novelty of AWMVNFC lies in a hierarchical weighting strategy that adaptively learns the contributions of both individual data views and the importance of each feature within a view.Through a unified objective function,AW-MVNFC jointly optimizes the neutrosophic membership assignments,cluster centers,and the distributions of view and feature weights.Comprehensive experiments conducted on synthetic and real-world datasets demonstrate that our algorithm achieves more accurate and stable clustering than existing methods,demonstrating its effectiveness in handling the complexities of multi-view data.
基金Supported by the National Key R&D Program of China under grant No.2022YFB3303203the National Natural Science Foundation of China under grant No.62272275.
文摘With technological advancements,virtual reality(VR),once limited to high-end professional applications,is rapidly expanding into entertainment and broader consumer domains.However,the inherent contradiction between mobile hardware computing power and the demand for high-resolution,high-refresh-rate rendering has intensified,leading to critical bottlenecks,including frame latency and power overload,which constrain large-scale applications of VR systems.This study systematically analyzes four key technologies for efficient VR rendering:(1)foveated rendering,which dynamically reduces rendering precision in peripheral regions based on the physiological characteristics of the human visual system(HVS),thereby significantly decreasing graphics computation load;(2)stereo rendering,optimized through consistent stereo rendering acceleration algorithms;(3)cloud rendering,utilizing object-based decomposition and illumination-based decomposition for distributed resource scheduling;and(4)low-power rendering,integrating parameter-optimized rendering,super-resolution technology,and frame-generation technology to enhance mobile energy efficiency.Through a systematic review of the core principles and optimization approaches of these technologies,this study establishes research benchmarks for developing efficient VR systems that achieve high fidelity and low latency while providing further theoretical support for the engineering implementation and industrial advancement of VR rendering technologies.
基金supported by the National Natural Science Foundation of China(Grant No.:62101087)the China Postdoctoral Science Foundation(Grant No.:2021MD703942)+2 种基金the Chongqing Postdoctoral Research Project Special Funding,China(Grant No.:2021XM2016)the Science Foundation of Chongqing Municipal Commission of Education,China(Grant No.:KJQN202100642)the Chongqing Natural Science Foundation,China(Grant No.:cstc2021jcyj-msxmX0834).
文摘Drug repurposing offers a promising alternative to traditional drug development and significantly re-duces costs and timelines by identifying new therapeutic uses for existing drugs.However,the current approaches often rely on limited data sources and simplistic hypotheses,which restrict their ability to capture the multi-faceted nature of biological systems.This study introduces adaptive multi-view learning(AMVL),a novel methodology that integrates chemical-induced transcriptional profiles(CTPs),knowledge graph(KG)embeddings,and large language model(LLM)representations,to enhance drug repurposing predictions.AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning(MVL),matrix factorization,and ensemble optimization techniques to integrate heterogeneous multi-source data.Comprehensive evaluations on benchmark datasets(Fdata-set,Cdataset,and Ydataset)and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art(SOTA)methods,achieving superior accuracy in predicting drug-disease associations across multiple metrics.Literature-based validation further confirmed the model's predictive capabilities,with seven out of the top ten predictions corroborated by post-2011 evidence.To promote transparency and reproducibility,all data and codes used in this study were open-sourced,providing resources for pro-cessing CTPs,KG,and LLM-based similarity calculations,along with the complete AMVL algorithm and benchmarking procedures.By unifying diverse data modalities,AMVL offers a robust and scalable so-lution for accelerating drug discovery,fostering advancements in translational medicine and integrating multi-omics data.We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
基金supported by the Natural Science Foundation of China,Grant No.62103052.
文摘Drone swarm systems,equipped with photoelectric imaging and intelligent target perception,are essential for reconnaissance and strike missions in complex and high-risk environments.They excel in information sharing,anti-jamming capabilities,and combat performance,making them critical for future warfare.However,varied perspectives in collaborative combat scenarios pose challenges to object detection,hindering traditional detection algorithms and reducing accuracy.Limited angle-prior data and sparse samples further complicate detection.This paper presents the Multi-View Collaborative Detection System,which tackles the challenges of multi-view object detection in collaborative combat scenarios.The system is designed to enhance multi-view image generation and detection algorithms,thereby improving the accuracy and efficiency of object detection across varying perspectives.First,an observation model for three-dimensional targets through line-of-sight angle transformation is constructed,and a multi-view image generation algorithm based on the Pix2Pix network is designed.For object detection,YOLOX is utilized,and a deep feature extraction network,BA-RepCSPDarknet,is developed to address challenges related to small target scale and feature extraction challenges.Additionally,a feature fusion network NS-PAFPN is developed to mitigate the issue of deep feature map information loss in UAV images.A visual attention module(BAM)is employed to manage appearance differences under varying angles,while a feature mapping module(DFM)prevents fine-grained feature loss.These advancements lead to the development of BA-YOLOX,a multi-view object detection network model suitable for drone platforms,enhancing accuracy and effectively targeting small objects.
文摘With the rapid progress of the artificial intelligence(AI)technology and mobile internet,3D hand pose estimation has become critical to various intelligent application areas,e.g.,human-computer interaction.To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation,this paper proposes a novel multi-modal multi-view(MMV)3D hand pose estimation system,which introduces a registration before translation(RT)-translation before registration(TR)jointed conditional generative adversarial network(cGAN)to train a multi-modal registration network,and then employs the multi-modal feature fusion to achieve high-quality estimation,with low hardware and software costs both in data acquisition and processing.Experimental results demonstrate that the MMV system is effective and feasible in various scenarios.It is promising for the MMV system to be used in broad intelligent application areas.
基金supported by the National Key Research and Development Program of China(Nos.SQ2022YFB3900026 and 2022YFB3903305)supported by the Leading Talents of Guangdong Pearl River Talent Program(No.2021CX02S024)the Guangdong S&T programme(No.2024B1212050011).
文摘Accurate digital terrain models(DTMs)are essential for a wide range of geospatial and environmental applications,yet their derivation in forested regions remains a significant challenge.Existing global DTMs,typically generated from satellite stereo photogrammetry or interferometric synthetic aperture radar(InSAR),fail to accurately capture understory terrain due to limited penetration capabilities,resulting in elevation overestimation in densely vegetated areas.While airborne light detection and ranging(LiDAR)can provide high-accuracy DTMs,its limited spatial coverage and high acquisition cost hinder large-scale applications.Thus,there is an urgent need for a scalable and cost-effective approach to extract DTMs directly from satellite-derived digital surface models(DSMs).In this study,we propose a simple,interpretable understory terrain extraction method that utilizes canopy height data from Global Ecosystem Dynamics Investigation(GEDI)and Ice,Cloud,and Land Elevation Satellite-2(ICESat-2)to construct a tree height surface model,which is then subtracted from the stereo-derived DSM to generate the final DTM.By directly incorporating LiDAR constraints,the method avoids error propagation from multiple heterogeneous datasets and reduces reliance on ancillary inputs,ensuring ease of implementation and broad applicability.In contrast to machine learning-based terrain modeling methods,which are often prone to overfitting and data bias,the proposed approach is simple,interpretable,and robust across diverse forested landscapes.The accuracy of the resulting DTM was validated against airborne LiDAR reference data and compared with both the Copernicus Digital Elevation Model(DEM)and the forest and buildings removed DEM(FABDEM),a global bare-earth elevation model corrected for vegetation bias.The results indicate that the proposed DTM consistently outperforms the Copernicus DEM(CopDEM)and achieves accuracy comparable to FABDEM.In addition,its finer spatial resolution of 1 m,compared to the 30 m resolution of FABDEM,allows for more detailed terrain representation and better capture of fine-scale variation.This advantage is most pronounced in gently to moderately sloped areas,where the proposed DTM shows clearly higher accuracy than both the CopDEM and FABDEM.The results confirm that high-resolution DTMs can be effectively extracted from DSMs using spaceborne LiDAR constraints,offering a scalable solution for terrain modeling in forested environments where airborne LiDAR is unavailable.To illustrate the potential utility of the proposed DTM,we applied it to a fire risk mapping application based on topographic parameters such as slope,aspect,and elevation.This case highlights how improved terrain representation can support geospatial hazard assessments.
基金The National Natural Science Foundation of China(No.61272223)the National Key Scientific Apparatus Development of Special Item(No.2012YQ170003-5)
文摘As the location of the wheel center is the key to accurately measuring the wheelbase, the wheelbase difference and the wheel static radius, a high-precision wheel center detection method based on stereo vision is proposed. First, according to the prior information, the contour of the wheel hub is extracted and fitted as an ellipse curve, and the ellipse fitting equation can be obtained. Then, a new un-tangent constraint is adopted to improve the ellipse matching precision. Finally, the 3D coordinates of the wheel center can be reconstructed by the spatial circle projection algorithm with low time complexity and high measurement accuracy. Simulation experiments verify that compared with the ellipse center reconstruction algorithm and the planar constraint optimization algorithm, the proposed method can acquire the 3D coordinates of the spatial circle more exactly. Furthermore, the measurements of the wheelbase, the wheelbase difference and the wheel static radius for three types of vehicles demonstrate the effectiveness of the proposed method for wheel center detection.