Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
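The abstract above reports a Purity score and mentions must-link and cannot-link constraints without defining either. The following is a minimal sketch using the standard textbook definition of purity and a simple count of violated pairwise constraints; the function names `purity` and `count_violations` are hypothetical, and this is not the SS-MPFC objective function, which the abstract does not spell out.

```python
from collections import Counter


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority true class of their cluster."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("need at least one sample")
    # Count the true classes that occur inside each cluster.
    per_cluster = {}
    for truth, cluster in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    # Each cluster contributes the size of its largest true class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)


def count_violations(cluster_ids, must_link, cannot_link):
    """Number of pairwise constraints that a hard cluster assignment breaks."""
    # A must-link pair is broken when its two samples land in different clusters.
    broken_must = sum(1 for i, j in must_link if cluster_ids[i] != cluster_ids[j])
    # A cannot-link pair is broken when its two samples share a cluster.
    broken_cannot = sum(1 for i, j in cannot_link if cluster_ids[i] == cluster_ids[j])
    return broken_must + broken_cannot
```

A fuzzy method would normally penalize constraint violations softly inside the objective; the hard count here is only meant to make the two constraint types concrete.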
An autostereoscopic display composed of a directional backlight, an image display panel, a striped half-wave plate, and a polarized lenticular lens array is proposed. The directional backlight, which emits parallel light, can redirect the cones of light to the lenticular lens array and reduce the chromatic spatial-interference effect. The striped half-wave plate, located in front of the image display panel, transforms the polarization direction of the light from the directional backlight into two mutually perpendicular directions. The polarized lenticular lens array not only divides the light from the left and right view images and sends it to the left and right eyes but also reduces the crosstalk of the stereoscopic images. The proposed autostereoscopic display can produce high-quality stereoscopic images without crosstalk at the optimal viewing distance.
High-resolution non-emissive displays based on electrochromic tungsten oxides (WOx) are crucial for future near-eye virtual/augmented reality interactions, given their impressive attributes such as high environmental stability, ideal outdoor readability, and low energy consumption. However, the limited intrinsic structure of inorganic materials has presented a significant challenge in achieving precise patterning/pixelation at the micron scale. Here, we successfully developed direct photolithography for WOx nanoparticles based on in situ photo-induced ligand exchange. This strategy enabled us to achieve ultra-high resolution efficiently (line width < 4 μm, the best resolution for reported inorganic electrochromic materials). Additionally, the resulting device exhibited impressive electrochromic performance, such as fast response (< 1 s at 0 V), high coloration efficiency (119.5 cm² C⁻¹), good optical modulation (55.9%), and durability (> 3600 cycles), as well as promising applications in electronic logos, pixelated displays, flexible electronics, etc. The success and advancements presented here are expected to inspire and accelerate research and development (R&D) in high-resolution non-emissive displays and other ultra-fine micro-electronics.
It is of great scientific significance to construct a 3D dynamic structural color with a special color effect based on the microlens array. However, the problems of imperfect mechanisms and poor color quality need to be solved. A method of 3D structural color turning on periodic metasurfaces fabricated by the microlens array and self-assembly technology was proposed in this study. In the experiment, polydimethylsiloxane (PDMS) flexible film was used as a substrate, and SiO2 microspheres were scraped into grooves of the PDMS film to form 3D photonic crystal structures. By adjusting the number of blade-coating passes and the microsphere concentration, high-saturation structural color micropatterns were obtained. These films were then matched with microlens arrays to produce dynamic graphics with iridescent effects. The results showed that two blade-coating passes and a SiO2 microsphere concentration of 50% are the best conditions. This method demonstrates the potential for wide application in anti-counterfeiting printing and ultra-high-resolution displays.
Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex, high-dimensional data that single-view methods cannot capture. Traditional fuzzy clustering techniques, such as Fuzzy C-Means (FCM), face significant challenges in handling uncertainty and the dependencies between different views. To overcome these limitations, we introduce a new multi-view fuzzy clustering approach that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data, aiming to enhance clustering accuracy and robustness, termed Multi-view Picture Fuzzy Clustering (MPFC). In particular, the picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels: membership degrees, neutral degrees, and refusal degrees. This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models. Meanwhile, dual-anchor graphs exploit the similarity relationships between data points and integrate information across views. This combination improves stability, scalability, and robustness when handling noisy and heterogeneous data. Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency, outperforming traditional methods. Specifically, the MPFC algorithm demonstrates outstanding clustering performance on a variety of datasets, attaining a Purity (PUR) score of 0.6440 and an Accuracy (ACC) score of 0.6213 for the 3 Sources dataset, underscoring its robustness and efficiency. The proposed approach significantly contributes to fields such as pattern recognition, multi-view relational data analysis, and large-scale clustering problems. Future work will focus on extending the method for semi-supervised multi-view clustering, aiming to enhance adaptability, scalability, and performance in real-world applications.
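To make the picture fuzzy degrees mentioned above concrete, here is a minimal sketch of a single picture fuzzy element under the commonly used definition: positive, neutral, and negative degrees each lie in [0, 1], their sum is at most 1, and the refusal degree is whatever remains. The class name `PictureFuzzyDegree` is hypothetical, and the abstract does not say whether MPFC parameterizes its degrees in exactly this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PictureFuzzyDegree:
    """One element's degrees in a picture fuzzy set (assumed standard definition)."""

    positive: float
    neutral: float
    negative: float

    def __post_init__(self):
        # Each individual degree must be a valid membership value.
        for name in ("positive", "neutral", "negative"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} degree must lie in [0, 1], got {value}")
        # The three degrees together may not exceed 1 (small float tolerance).
        if self.positive + self.neutral + self.negative > 1.0 + 1e-12:
            raise ValueError("positive + neutral + negative must be at most 1")

    @property
    def refusal(self) -> float:
        # Refusal is the share not assigned to any of the three degrees.
        return 1.0 - (self.positive + self.neutral + self.negative)
```

For example, `PictureFuzzyDegree(0.5, 0.2, 0.1)` leaves a refusal degree of about 0.2, which an ordinary fuzzy membership value cannot express.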
The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. By integrating multi-view information into phenotypic prediction, a multi-view best linear unbiased prediction (MVBLUP) method is proposed in this paper. To measure the importance of multiple data views, the differential evolution algorithm with an early stopping mechanism is used, by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we perform empirical experiments on four multi-view datasets in different crops. Compared to the single-view method, the prediction accuracy of the MVBLUP method has improved by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
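The abstract above does not state how the per-view kinship matrices are merged once the view importances are known. One plausible reading, shown here purely as an assumption, is a normalized weighted sum. The function name `combine_kinship` is hypothetical, and the weights are supplied by hand rather than found by differential evolution with early stopping as in the paper.

```python
def combine_kinship(kinship_by_view, weights):
    """Weighted average of square per-view kinship matrices (assumed fusion rule)."""
    if not weights or len(kinship_by_view) != len(weights):
        raise ValueError("need one weight per view")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    # Normalize so the view weights sum to 1.
    normalized = [w / total for w in weights]
    n = len(kinship_by_view[0])
    for matrix in kinship_by_view:
        if len(matrix) != n or any(len(row) != n for row in matrix):
            raise ValueError("all kinship matrices must be square and the same size")
    # Combine the views entry by entry.
    return [
        [sum(w * m[i][j] for w, m in zip(normalized, kinship_by_view)) for j in range(n)]
        for i in range(n)
    ]
```

With two views weighted 1 and 3, the second view contributes three quarters of every entry of the combined matrix, which would then take the place of the single kinship matrix in a BLUP model.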
Super-fine electrohydrodynamic inkjet (SIJ) printing of perovskite nanocrystal (PNC) colloid ink exhibits significant potential in the fabrication of high-resolution color conversion microstructure arrays for full-color micro-LED displays. However, the impact of solvent on both the printing process and the morphology of SIJ-printed PNC color conversion microstructures remains underexplored. In this study, we prepared samples of CsPbBr3 PNC colloid inks in various solvents and investigated the solvent's impact on SIJ-printed PNC microstructures. Our findings reveal that the boiling point of the solvent is crucial to the SIJ printing process of PNC colloid inks. Only when the boiling point of the solvent falls in the optimal range can regularly positioned, micron-scale, conical PNC microstructures be successfully printed. Below this optimal range, the ink cannot be ejected from the nozzle, while above this range, irregularly positioned microstructures with nanoscale height and coffee-ring-like morphology are produced. Based on these observations, high-resolution color conversion PNC microstructures were effectively prepared using SIJ printing of PNC colloid ink dispersed in dimethylbenzene solvent.
The morphological description of wear particles in lubricating oil is crucial for wear state monitoring and fault diagnosis in aero-engines. Accurately and comprehensively acquiring three-dimensional (3D) morphological data of these particles has become a key focus in wear debris analysis. Herein, we develop a novel multi-view polarization-sensitive optical coherence tomography (PS-OCT) method to achieve accurate 3D morphology detection and reconstruction of aero-engine lubricant wear particles, effectively resolving occlusion-induced information loss while enabling material-specific characterization. The particle morphology is captured by multi-view imaging, followed by filtering, sharpening, and contour recognition. The method integrates advanced registration algorithms with Poisson reconstruction to generate high-precision 3D models. This approach not only provides accurate 3D morphological reconstruction but also mitigates information loss caused by particle occlusion, ensuring model completeness. Furthermore, by collecting polarization characteristics of typical metals and their oxides in aero-engine lubricants, this work comprehensively characterizes and comparatively analyzes particle polarization properties using Stokes vectors, polarization uniformity, and cumulative phase retardation, and obtains a 3D model containing polarization information. Ultimately, the proposed method enables multidimensional information acquisition for the reliable identification of abrasive particle types.
Virtual reality (VR) is regarded as the next-generation display platform for immersive human-computer interaction. To solve the long-standing problem of vergence-accommodation conflict in VR, varifocal displays based on the diffractive Pancharatnam–Berry lens (PBL) are considered one of the most promising approaches, with great compatibility with current display architectures. However, the diffractive nature of the PBL leads to serious chromatic aberrations in optical systems, which deteriorate the image quality and discourage practical use.
Traditional sheep identification is based on ear tags. However, the application of ear tags not only causes stress to the animals but also leads to loss of ear tags, which affects the correct recognition of sheep identity. In contrast, the acquisition of sheep face images offers the advantages of being non-invasive and stress-free for the animals. Nevertheless, the existing convolutional neural network-based sheep face identification model is prone to the issue of inadequate refinement, which makes its implementation on farms challenging. To address this issue, this study presents a novel sheep face recognition model that employs advanced feature fusion techniques and precise image segmentation strategies. The images were preprocessed and accurately segmented using deep learning techniques, with a dataset constructed containing sheep face images from multiple viewpoints (left, front, and right faces). In particular, the model employs a segmentation algorithm to delineate the sheep face region accurately, utilizes the Improved Convolutional Block Attention Module (I-CBAM) to emphasize the salient features of the sheep face, and achieves multi-scale fusion of the features through a Feature Pyramid Network (FPN). This process ensures that the features captured from different viewpoints can be efficiently integrated to enhance recognition accuracy. Furthermore, the model ensures the precise delineation of sheep facial contours by streamlining the image segmentation procedure, thereby establishing a robust basis for the precise identification of sheep identity. The findings demonstrate that the recognition accuracy of the Sheep Face Mask Region-based Convolutional Neural Network (SFMask RCNN) model has been enhanced by 9.64% to 98.65% in comparison to the original model. The method offers a novel technological approach to the management of animal identity in the context of sheep husbandry.
Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data, but the learned representation struggles to preserve the underlying structure hidden in the original samples, especially the high-order neighbor relationships between samples. To overcome these challenges, this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model. We integrate the multi-order proximity graph structures of different views into the self-expressive layer through a multi-order neighborhood fusion module. With this design, the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix; then, we can obtain an optimal global affinity matrix where each connected node belongs to one cluster. In addition, a discriminative constraint between views is designed to further improve the clustering performance. A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods. The code is available at https://github.com/songzuolong/MNF-MDSC (accessed on 25 December 2024).
Balancing high display performance with energy efficiency is crucial for global sustainability. Lowering operating frequencies, such as enabling 1 Hz operation in fringe-field switching (FFS) liquid crystal displays, reduces power consumption but is hindered by image flicker. While negative dielectric anisotropy liquid crystals (nLCs) mitigate flicker, their high driving voltages and production costs limit adoption. Positive dielectric anisotropy liquid crystals (pLCs) offer lower operating voltages, faster response times, and broader applicability, making them a more viable alternative. This study introduces a novel approach to minimizing flexoelectric effects in pLCs by investigating how single components influence flexoelectric behavior in mixtures through an effective experimental methodology. Two innovative measurement techniques, (1) flexoelectric coefficient difference analysis and (2) displacement-current measurement (DCM), are presented, marking the first application of DCM for verifying flexoelectric effects. The proposed system eliminates uncertainties associated with previous methods, providing a reliable framework for selecting liquid crystal components with minimal flexoelectric effects while preserving key electro-optic properties. Given pLCs' higher reliability, lower production costs, and broader material selection, these advancements hold significant potential for low-power displays. We believe this work enhances flexoelectric analysis in nematic liquid crystals and contributes to sustainable innovation in the display industry, aligning with global energy-saving goals.
The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations. However, traditional fuzzy clustering algorithms show limitations with the inherent uncertainty and imprecision of such data, as they rely on a single-dimensional membership value. To overcome these limitations, we propose an auto-weighted multi-view neutrosophic fuzzy clustering (AW-MVNFC) algorithm. Our method leverages the neutrosophic framework, an extension of fuzzy sets, to explicitly model imprecision and ambiguity through three membership degrees. The core novelty of AW-MVNFC lies in a hierarchical weighting strategy that adaptively learns both the contribution of each data view and the importance of each feature within a view. Through a unified objective function, AW-MVNFC jointly optimizes the neutrosophic membership assignments, cluster centers, and the distributions of view and feature weights. Comprehensive experiments conducted on synthetic and real-world datasets demonstrate that our algorithm achieves more accurate and stable clustering than existing methods, confirming its effectiveness in handling the complexities of multi-view data.
Eco-friendly quantum-dot light-emitting diodes (QLEDs), which employ colloidal quantum dots (QDs) such as InP and ZnSe, stand out due to their low toxicity, color purity, and high efficiency. Significant advancements have recently been made in the performance of cadmium-free QLEDs. However, several challenges persist in the industrialization of eco-friendly QLED displays. For instance, (1) the poor performance, characterized by low photoluminescence quantum yield (PLQY), unstable ligands, and charge imbalance, cannot be effectively addressed with a single strategy; (2) the degradation mechanism, involving emission quenching, morphological inhomogeneity, and field-enhanced electron delocalization, remains unclear; (3) techniques for color patterning, such as optical lithography and transfer printing, are lacking. Herein, we review the technological breakthroughs that endeavor to tackle the above challenges associated with cadmium-free QLED displays. We begin by reviewing the evolution, architecture, and operational characteristics of eco-friendly QLEDs, highlighting the photoelectric properties of QDs, carrier transport layer stability, and device lifetime. Subsequently, we focus not only on the latest insights into device degradation mechanisms but also on the remarkable technological progress in color patterning techniques. To conclude, we provide a synthesis of the promising prospects, current challenges, potential solutions, and emerging research trends for QLED displays.
At present, naked-eye three-dimensional (3D) display technology still has some drawbacks, such as low brightness uniformity, high crosstalk, low light efficiency, short viewing distance, and difficult manufacturing. Based on the principle of naked-eye 3D display and Fresnel optical theory, this paper designs a Fresnel lens array and a star-shaped liquid crystal display (LCD) switch for the unit LCD screen to achieve low crosstalk and high brightness uniformity in an autostereoscopic 3D display. The unit parameters of a 139.7 cm 4K autostereoscopic 3D display are provided and optimized with the TracePro software. The results show that the image crosstalk is less than 2% at a viewing distance of 2.50 m when the pitch of the Fresnel lens on the exit surface is 0.304 mm, the width of each serration of the Fresnel lens is 0.0234 mm, the length of the Fresnel lens is 2.87 mm, and the star-shaped LCD switch has a center height of 0.030 mm, a center length of 0.040 mm, and a width of 0.050 mm. The non-uniformity of image brightness across different positions is also improved.
The evolution of display backplane technologies has been driven by the relentless pursuit of higher form factor and superior performance coupled with lower power consumption. Current state-of-the-art backplane technologies based on amorphous Si, poly Si, and IGZO face challenges in meeting the requirements of next-generation displays, including larger dimensions, higher refresh rates, increased pixel density, greater brightness, and reduced power consumption. In this context, 2D chalcogenides have emerged as promising candidates for thin-film transistors (TFTs) in display backplanes, offering advantages such as high mobility, low leakage current, mechanical robustness, and transparency. This comprehensive review explores the significance of 2D chalcogenides as materials for TFTs in next-generation display backplanes. We delve into the structural characteristics, electronic properties, and synthesis methods of 2D chalcogenides, emphasizing scalable growth strategies that are relevant to large-area display backplanes. Additionally, we discuss mechanical flexibility and strain engineering, which are crucial for the development of flexible displays. Performance enhancement strategies for 2D chalcogenide TFTs are explored, encompassing techniques in device engineering and geometry optimization, while considering scaling over a large area. Active-matrix implementation of 2D TFTs in various applications is also explored, benchmarking device performance on a large scale, which is a necessary aspect of TFTs used in display backplanes. Furthermore, the latest developments in the integration of 2D chalcogenide TFTs with different display technologies, such as OLED, quantum dot, and MicroLED displays, are reviewed in detail. Finally, challenges and opportunities in the field are discussed with a brief insight into emerging trends and research directions.
Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, the current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven out of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study have been open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
Drone swarm systems, equipped with photoelectric imaging and intelligent target perception, are essential for reconnaissance and strike missions in complex and high-risk environments. They excel in information sharing, anti-jamming capabilities, and combat performance, making them critical for future warfare. However, varied perspectives in collaborative combat scenarios pose challenges to object detection, hindering traditional detection algorithms and reducing accuracy. Limited angle-prior data and sparse samples further complicate detection. This paper presents the Multi-View Collaborative Detection System, which tackles the challenges of multi-view object detection in collaborative combat scenarios. The system is designed to enhance multi-view image generation and detection algorithms, thereby improving the accuracy and efficiency of object detection across varying perspectives. First, an observation model for three-dimensional targets through line-of-sight angle transformation is constructed, and a multi-view image generation algorithm based on the Pix2Pix network is designed. For object detection, YOLOX is utilized, and a deep feature extraction network, BA-RepCSPDarknet, is developed to address challenges related to small target scale and feature extraction. Additionally, a feature fusion network, NS-PAFPN, is developed to mitigate the issue of deep feature map information loss in UAV images. A visual attention module (BAM) is employed to manage appearance differences under varying angles, while a feature mapping module (DFM) prevents fine-grained feature loss. These advancements lead to the development of BA-YOLOX, a multi-view object detection network model suitable for drone platforms, enhancing accuracy and effectively targeting small objects.
With the rapid progress of artificial intelligence (AI) technology and the mobile internet, 3D hand pose estimation has become critical to various intelligent application areas, e.g., human-computer interaction. To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation, this paper proposes a novel multi-modal multi-view (MMV) 3D hand pose estimation system, which introduces a joint registration-before-translation (RT) and translation-before-registration (TR) conditional generative adversarial network (cGAN) to train a multi-modal registration network, and then employs multi-modal feature fusion to achieve high-quality estimation, with low hardware and software costs in both data acquisition and processing. Experimental results demonstrate that the MMV system is effective and feasible in various scenarios. The MMV system is promising for use in a broad range of intelligent application areas.
Funding: Funded by the Research Project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data, and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
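The Purity score quoted in the abstract above is a standard clustering metric: each cluster is credited with its most frequent ground-truth class, and the credited samples are divided by the total sample count. The following is a minimal sketch of that definition, not the authors' evaluation code; the function name `purity` and the toy land-cover labels are hypothetical.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority class of their cluster."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("true_labels and cluster_ids must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    # Count ground-truth classes inside each predicted cluster.
    by_cluster = defaultdict(Counter)
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster[cluster][label] += 1

    # Each cluster contributes the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


# Hypothetical toy example: six pixels, three classes, two clusters.
truth = ["water", "water", "urban", "urban", "forest", "forest"]
clusters = [0, 0, 0, 1, 1, 1]
score = purity(truth, clusters)
```

Purity never decreases as the number of clusters grows, which is one reason abstracts such as this one report it alongside Accuracy.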
Funding: Project supported by the National High Technology Research and Development Program of China (Grant No. 2012AA03A301), the National Natural Science Foundation of China (Grant No. 60932007), the Postdoctoral Science Programs Foundation of the Ministry of Education of China (Grant No. 0110032110029), and the Key Projects in the Tianjin Science & Technology Pillar Program, China (Grant No. 11ZCKFGX02000).
Abstract: An autostereoscopic display composed of a directional backlight, an image display panel, a striped half-wave plate, and a polarized lenticular lens array is proposed. The directional backlight, which emits parallel light, redirects the cones of light to the lenticular lens array and reduces the chromatic spatial-interference effect. The striped half-wave plate, located in front of the image display panel, transforms the polarization direction of the light from the directional backlight into two mutually perpendicular directions. The polarized lenticular lens array not only separates the light of the left-view and right-view images and sends it to the left and right eyes, respectively, but also reduces the crosstalk of the stereoscopic images. The proposed autostereoscopic display can produce high-quality stereoscopic images without crosstalk at the optimal viewing distance.
Funding: Supported by the National Key R&D Program of China (2022YFB3606501, 2022YFB3602902), the Key Projects of National Natural Science Foundation of China (62234004), the National Natural Science Foundation of China (U23A2092), the Pioneer and Leading Goose R&D Program of Zhejiang (2024C01191, 2024C01092), the Innovation and Entrepreneurship Team of Zhejiang Province (2021R01003), the Ningbo Key Technologies R&D Program (2022Z085), the Ningbo 3315 Programme (2020A-01-B), the YONGJIANG Talent Introduction Programme (2021A-038-B, 2021A-159-G), the “Innovation Yongjiang 2035” Key R&D Programme (2024Z146), the Ningbo JiangBei District public welfare science and technology project (2022C07), the China National Postdoctoral Program for Innovative Talents (grant no. BX20240391), and the China Postdoctoral Science Foundation (grant no. 2023M743623).
Abstract: High-resolution non-emissive displays based on electrochromic tungsten oxides (WOx) are crucial for future near-eye virtual/augmented reality interactions, given their impressive attributes such as high environmental stability, ideal outdoor readability, and low energy consumption. However, the limited intrinsic structure of inorganic materials has presented a significant challenge in achieving precise patterning/pixelation at the micron scale. Here, we successfully developed direct photolithography for WOx nanoparticles based on in situ photo-induced ligand exchange. This strategy enabled us to achieve ultra-high resolution efficiently (line width < 4 μm, the best resolution for reported inorganic electrochromic materials). Additionally, the resulting device exhibited impressive electrochromic performance, such as fast response (< 1 s at 0 V), high coloration efficiency (119.5 cm² C⁻¹), good optical modulation (55.9%), and durability (> 3600 cycles), as well as promising applications in electronic logos, pixelated displays, flexible electronics, etc. The success and advancements presented here are expected to inspire and accelerate research and development (R&D) in high-resolution non-emissive displays and other ultra-fine micro-electronics.
Abstract: It is of great scientific significance to construct a 3D dynamic structural color with a special color effect based on the microlens array. However, the problems of imperfect mechanisms and poor color quality need to be solved. A method of 3D structural color turning on periodic metasurfaces fabricated by the microlens array and self-assembly technology was proposed in this study. In the experiment, polydimethylsiloxane (PDMS) flexible film was used as a substrate, and SiO2 microspheres were scraped into grooves of the PDMS film to form 3D photonic crystal structures. By adjusting the number of blade-coating passes and the microsphere concentration, high-saturation structural color micropatterns were obtained. These films were then matched with microlens arrays to produce dynamic graphics with iridescent effects. The results showed that two blade-coating passes and a SiO2 microsphere concentration of 50% are the best conditions. This method demonstrates the potential for wide application in anti-counterfeiting printing and ultra-high-resolution displays.
Funding: Funded by the Research Project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Multi-view clustering is a critical research area in computer science aimed at effectively extracting meaningful patterns from complex, high-dimensional data that single-view methods cannot capture. Traditional fuzzy clustering techniques, such as Fuzzy C-Means (FCM), face significant challenges in handling uncertainty and the dependencies between different views. To overcome these limitations, we introduce a new multi-view fuzzy clustering approach, termed Multi-view Picture Fuzzy Clustering (MPFC), that integrates picture fuzzy sets with a dual-anchor graph method for multi-view data, aiming to enhance clustering accuracy and robustness. In particular, picture fuzzy set theory extends the capability to represent uncertainty by modeling three membership levels: membership degrees, neutral degrees, and refusal degrees. This allows for a more flexible representation of uncertain and conflicting data than traditional fuzzy models. Meanwhile, dual-anchor graphs exploit the similarity relationships between data points and integrate information across views. This combination improves stability, scalability, and robustness when handling noisy and heterogeneous data. Experimental results on several benchmark datasets demonstrate significant improvements in clustering accuracy and efficiency, outperforming traditional methods. Specifically, the MPFC algorithm demonstrates outstanding clustering performance on a variety of datasets, attaining a Purity (PUR) score of 0.6440 and an Accuracy (ACC) score of 0.6213 for the 3 Sources dataset, underscoring its robustness and efficiency. The proposed approach significantly contributes to fields such as pattern recognition, multi-view relational data analysis, and large-scale clustering problems. Future work will focus on extending the method for semi-supervised multi-view clustering, aiming to enhance adaptability, scalability, and performance in real-world applications.
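To make the picture fuzzy representation mentioned in the abstract above more concrete, here is a minimal sketch of a single picture fuzzy value under the commonly used definition: positive, neutral, and negative degrees each lie in [0, 1], their sum is at most 1, and the refusal degree is whatever remains. This is an assumption about the underlying set theory only; the abstract does not spell out how MPFC parameterizes its memberships, and the class name `PictureFuzzyValue` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PictureFuzzyValue:
    """One element's degrees in a picture fuzzy set (illustrative only)."""

    positive: float
    neutral: float
    negative: float

    def __post_init__(self):
        degrees = (self.positive, self.neutral, self.negative)
        if any(d < 0.0 or d > 1.0 for d in degrees):
            raise ValueError("each degree must lie in [0, 1]")
        # Small tolerance so that floating-point round-off is not rejected.
        if sum(degrees) > 1.0 + 1e-9:
            raise ValueError("degrees must sum to at most 1")

    @property
    def refusal(self) -> float:
        """Degree left over after positive, neutral, and negative are assigned."""
        return max(0.0, 1.0 - (self.positive + self.neutral + self.negative))


# Hypothetical example: a sample that mostly belongs to a cluster.
value = PictureFuzzyValue(positive=0.6, neutral=0.2, negative=0.1)
leftover = value.refusal
```

An ordinary fuzzy membership is the special case with no neutral degree and no refusal, which is why the picture fuzzy form can express hesitation that a single membership value cannot.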
Funding: Supported by the research on key technologies for monitoring and identifying drug abuse of anesthetic drugs and psychotropic drugs, and intervention for addiction (No. 2023YFC3304200), the program of a study on the diagnosis of addiction to synthetic cannabinoids and methods of assessing the risk of abuse (No. 2022YFC3300905), the program of Ab initio design and generation of AI models for small molecule ligands based on target structures (No. 2022PE0AC03), and ZHIJIANG LAB.
Abstract: The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Funding: Supported by the National Natural Science Foundation of China (32122066, 32201855) and STI2030-Major Projects (2023ZD04076).
Abstract: Phenotypic prediction is a promising strategy for accelerating plant breeding. Data from multiple sources (called multi-view data) can provide complementary information to characterize a biological object from various aspects. By integrating multi-view information into phenotypic prediction, a multi-view best linear unbiased prediction (MVBLUP) method is proposed in this paper. To measure the importance of multiple data views, the differential evolution algorithm with an early stopping mechanism is used, by which we obtain a multi-view kinship matrix and then incorporate it into the BLUP model for phenotypic prediction. To further illustrate the characteristics of MVBLUP, we perform empirical experiments on four multi-view datasets in different crops. Compared to the single-view method, the prediction accuracy of the MVBLUP method has improved by 0.038–0.201 on average. The results demonstrate that MVBLUP is an effective integrative prediction method for multi-view data.
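The abstract above does not say how the per-view importances are turned into a single multi-view kinship matrix. One plausible reading, shown here purely as an assumption, is a convex weighted sum of the per-view kinship matrices, with the weights being the quantities a search procedure such as differential evolution would tune. The function name `combine_kinship` and the toy matrices are hypothetical, and the weight search itself is not shown.

```python
def combine_kinship(kinships, weights):
    """Convex combination of same-sized square kinship matrices (nested lists).

    ASSUMPTION: a weighted sum is only one way to fuse views; the abstract
    does not confirm that MVBLUP uses exactly this form.
    """
    if not kinships or len(kinships) != len(weights):
        raise ValueError("need one weight per kinship matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Normalize so the weights sum to 1.
    normalized = [w / total for w in weights]
    n = len(kinships[0])
    return [
        [sum(w * k[i][j] for w, k in zip(normalized, kinships)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy views for two individuals.
genomic_view = [[1.0, 0.2], [0.2, 1.0]]
expression_view = [[1.0, 0.6], [0.6, 1.0]]
combined = combine_kinship([genomic_view, expression_view], weights=[3.0, 1.0])
```

With weights 3:1 the genomic view contributes 75% of each entry, so a view judged more informative dominates the fused relationship estimate.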
Funding: Supported by the National Natural Science Foundation of China (No. 62374142), the Fundamental Research Funds for the Central Universities (Nos. 20720220085 and 20720240064), the External Cooperation Program of Fujian (No. 2022I0004), the Major Science and Technology Project of Xiamen in China (No. 3502Z20191015), and the Xiamen Natural Science Foundation Youth Project (No. 3502Z202471002).
Abstract: Super-fine electrohydrodynamic inkjet (SIJ) printing of perovskite nanocrystal (PNC) colloid ink exhibits significant potential in the fabrication of high-resolution color conversion microstructure arrays for full-color micro-LED displays. However, the impact of solvent on both the printing process and the morphology of SIJ-printed PNC color conversion microstructures remains underexplored. In this study, we prepared samples of CsPbBr3 PNC colloid inks in various solvents and investigated the solvent's impact on SIJ-printed PNC microstructures. Our findings reveal that the boiling point of the solvent is crucial to the SIJ printing process of PNC colloid inks. Only when the boiling point of the solvent falls within the optimal range can regularly positioned, micron-scale, conical PNC microstructures be successfully printed. Below this optimal range, the ink cannot be ejected from the nozzle; above this range, irregularly positioned microstructures with nanoscale height and coffee-ring-like morphology are produced. Based on these observations, high-resolution color conversion PNC microstructures were effectively prepared using SIJ printing of PNC colloid ink dispersed in dimethylbenzene solvent.
Abstract: The morphological description of wear particles in lubricating oil is crucial for wear state monitoring and fault diagnosis in aero-engines. Accurately and comprehensively acquiring three-dimensional (3D) morphological data of these particles has become a key focus in wear debris analysis. Herein, we develop a novel multi-view polarization-sensitive optical coherence tomography (PS-OCT) method to achieve accurate 3D morphology detection and reconstruction of aero-engine lubricant wear particles, effectively resolving occlusion-induced information loss while enabling material-specific characterization. The particle morphology is captured by multi-view imaging, followed by filtering, sharpening, and contour recognition. The method integrates advanced registration algorithms with Poisson reconstruction to generate high-precision 3D models. This approach not only provides accurate 3D morphological reconstruction but also mitigates information loss caused by particle occlusion, ensuring model completeness. Furthermore, by collecting polarization characteristics of typical metals and their oxides in aero-engine lubricants, this work comprehensively characterizes and comparatively analyzes particle polarization properties using Stokes vectors, polarization uniformity, and cumulative phase retardation, and obtains a 3D model containing polarization information. Ultimately, the proposed method enables multidimensional information acquisition for the reliable identification of abrasive particle types.
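The abstract above characterizes particles with Stokes vectors. As a small illustration of what a Stokes vector encodes, the sketch below computes the degree of polarization, a standard quantity defined as the length of the polarized part (S1, S2, S3) divided by the total intensity S0. It is not taken from the paper's processing pipeline, and the function name and sample values are hypothetical.

```python
import math


def degree_of_polarization(s0, s1, s2, s3):
    """Return sqrt(S1^2 + S2^2 + S3^2) / S0 for a Stokes vector (S0, S1, S2, S3)."""
    if s0 <= 0:
        raise ValueError("total intensity S0 must be positive")
    return math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0


# Hypothetical measurements.
fully_polarized = degree_of_polarization(1.0, 1.0, 0.0, 0.0)   # horizontal linear light
unpolarized = degree_of_polarization(1.0, 0.0, 0.0, 0.0)       # no polarized component
partially = degree_of_polarization(2.0, 0.6, 0.8, 0.0)         # half of the intensity is polarized
```

Values near 1 indicate that the returned light keeps a well-defined polarization state, while values near 0 indicate depolarization.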
Funding: National Natural Science Foundation of China (62405021, U24A20304) and Beijing Nova Program (20240484557).
Abstract: Virtual reality (VR) is regarded as the next-generation display platform for immersive human-computer interaction. To solve the long-standing problem of vergence-accommodation conflict in VR, varifocal displays based on the diffractive Pancharatnam–Berry lens (PBL) are considered one of the most promising approaches, with great compatibility with current display architectures. However, the diffractive nature of the PBL leads to serious chromatic aberrations in optical systems, which degrade the image quality and discourage its practical use.
Funding: Fundamental Research Funds for Inner Mongolia Directly Affiliated Universities (Grant No. BR221032) and the First Class Disciplines Research Special Project (Grant No. YLXKZX-NND-009).
Abstract: Traditional sheep identification is based on ear tags. However, the application of ear tags not only causes stress to the animals but also leads to loss of ear tags, which affects the correct recognition of sheep identity. In contrast, the acquisition of sheep face images offers the advantages of being non-invasive and stress-free for the animals. Nevertheless, existing convolutional neural network-based sheep face identification models are prone to inadequate refinement, which makes their implementation on farms challenging. To address this issue, this study presents a novel sheep face recognition model that employs advanced feature fusion techniques and precise image segmentation strategies. The images were preprocessed and accurately segmented using deep learning techniques, and a dataset was constructed containing sheep face images from multiple viewpoints (left, front, and right faces). In particular, the model employs a segmentation algorithm to delineate the sheep face region accurately, utilizes the Improved Convolutional Block Attention Module (I-CBAM) to emphasize the salient features of the sheep face, and achieves multi-scale fusion of the features through a Feature Pyramid Network (FPN). This process ensures that the features captured from different viewpoints can be efficiently integrated to enhance recognition accuracy. Furthermore, the model ensures the precise delineation of sheep facial contours by streamlining the image segmentation procedure, thereby establishing a robust basis for the precise identification of sheep identity. The findings demonstrate that the recognition accuracy of the Sheep Face Mask Region-based Convolutional Neural Network (SFMask RCNN) model has been enhanced by 9.64% to 98.65% in comparison to the original model. The method offers a novel technological approach to the management of animal identity in sheep husbandry.
Funding: Supported by the National Key R&D Program of China (2023YFC3304600).
Abstract: Existing multi-view deep subspace clustering methods aim to learn a unified representation from multi-view data, but the learned representation struggles to preserve the underlying structure hidden in the original samples, especially the high-order neighbor relationships between samples. To overcome these challenges, this paper proposes a novel multi-order neighborhood fusion based multi-view deep subspace clustering model. We integrate the multi-order proximity graph structures of different views into the self-expressive layer through a multi-order neighborhood fusion module. With this design, the multi-order Laplacian matrix supervises the learning of the view-consistent self-representation affinity matrix; we can then obtain an optimal global affinity matrix in which each connected node belongs to one cluster. In addition, a discriminative constraint between views is designed to further improve the clustering performance. A range of experiments on six public datasets demonstrates that the method performs better than other advanced multi-view clustering methods. The code is available at https://github.com/songzuolong/MNF-MDSC (accessed on 25 December 2024).
Funding: Supported by the Basic Science Research Program through the National Research Foundation (NRF) of Korea, funded by the Ministry of Science and ICT (MSIT), Korea [2022R1A2C2091671], and by the ITECH R&D Program of MOTIE/KEIT (Ministry of Trade, Industry & Energy/Korea Evaluation Institute of Industrial Technology) [20016808].
Abstract: Balancing high display performance with energy efficiency is crucial for global sustainability. Lowering operating frequencies, such as enabling 1 Hz operation in fringe-field switching (FFS) liquid crystal displays, reduces power consumption but is hindered by image flicker. While negative dielectric anisotropy liquid crystals (nLCs) mitigate flicker, their high driving voltages and production costs limit adoption. Positive dielectric anisotropy liquid crystals (pLCs) offer lower operating voltages, faster response times, and broader applicability, making them a more viable alternative. This study introduces a novel approach to minimizing flexoelectric effects in pLCs by investigating how single components influence flexoelectric behavior in mixtures through an effective experimental methodology. Two innovative measurement techniques, (1) flexoelectric coefficient difference analysis and (2) displacement-current measurement (DCM), are presented, marking the first application of DCM for verifying flexoelectric effects. The proposed system eliminates uncertainties associated with previous methods, providing a reliable framework for selecting liquid crystal components with minimal flexoelectric effects while preserving key electro-optic properties. Given pLCs' higher reliability, lower production costs, and broader material selection, these advancements hold significant potential for low-power displays. We believe this work enhances flexoelectric analysis in nematic liquid crystals and contributes to sustainable innovation in the display industry, aligning with global energy-saving goals.
Abstract: The increasing prevalence of multi-view data has made multi-view clustering a crucial technique for discovering latent structures from heterogeneous representations. However, traditional fuzzy clustering algorithms struggle with the inherent uncertainty and imprecision of such data, as they rely on a single-dimensional membership value. To overcome these limitations, we propose an auto-weighted multi-view neutrosophic fuzzy clustering (AW-MVNFC) algorithm. Our method leverages the neutrosophic framework, an extension of fuzzy sets, to explicitly model imprecision and ambiguity through three membership degrees. The core novelty of AW-MVNFC lies in a hierarchical weighting strategy that adaptively learns both the contribution of each data view and the importance of each feature within a view. Through a unified objective function, AW-MVNFC jointly optimizes the neutrosophic membership assignments, cluster centers, and the distributions of view and feature weights. Comprehensive experiments conducted on synthetic and real-world datasets show that our algorithm achieves more accurate and stable clustering than existing methods, demonstrating its effectiveness in handling the complexities of multi-view data.
Funding: Supported by the Research Projects of Department of Education of Guangdong Province (024CJPT002), the Special Project of Guangdong Provincial Department of Education in Key Areas (No. 6021210075K), and the Shenzhen Polytechnic University Research Fund (No. 6024310006K).
Abstract: Eco-friendly quantum-dot light-emitting diodes (QLEDs), which employ colloidal quantum dots (QDs) such as InP and ZnSe, stand out due to their low toxicity, color purity, and high efficiency. Significant advancements have recently been made in the performance of cadmium-free QLEDs. However, several challenges persist in the industrialization of eco-friendly QLED displays: (1) the poor performance, characterized by low photoluminescence quantum yield (PLQY), unstable ligands, and charge imbalance, cannot be effectively addressed with a single strategy; (2) the degradation mechanism, involving emission quenching, morphological inhomogeneity, and field-enhanced electron delocalization, remains unclear; (3) techniques for color patterning, such as optical lithography and transfer printing, are lacking. Herein, we review the technological breakthroughs that endeavor to tackle the above challenges associated with cadmium-free QLED displays. We begin by reviewing the evolution, architecture, and operational characteristics of eco-friendly QLEDs, highlighting the photoelectric properties of QDs, carrier transport layer stability, and device lifetime. Subsequently, we focus not only on the latest insights into device degradation mechanisms but also on the remarkable technological progress in color patterning techniques. To conclude, we provide a synthesis of the promising prospects, current challenges, potential solutions, and emerging research trends for QLED displays.
Funding: Supported by the 2022 Fujian Provincial Young and Middle-aged Teacher Education and Research Project (Science and Technology) (No. JAT220468) and the Xiamen Natural Science Foundation (No. 3502Z20227334).
Abstract: At present, naked-eye three-dimensional (3D) display technology still has some drawbacks, such as low brightness uniformity, high crosstalk, low light efficiency, short viewing distance, and manufacturing difficulty. Based on the principle of naked-eye 3D display and Fresnel optical theory, this paper designs a Fresnel lens array and a star-shaped liquid crystal display (LCD) switch for the unit LCD screen to achieve low crosstalk and high brightness uniformity in the autostereoscopic 3D display. The unit parameters of a 139.7 cm 4K autostereoscopic 3D display are provided and optimized with the TracePro software. The results show that when the pitch of the Fresnel lens on the exit surface is 0.304 mm, the width of each serration of the Fresnel lens is 0.0234 mm, the length of the Fresnel lens is 2.87 mm, the center height of the star-shaped LCD switch is 0.030 mm, its center length is 0.040 mm, and its width is 0.050 mm, the image crosstalk is less than 2% at a viewing distance of 2.50 m. The variation of image brightness across different positions is also reduced.
Funding: Supported in part by the National Research Foundation of Korea (Grant Nos. RS-2024-00448809, RS-2025-00517255, and 2021M3H4A1A02056037) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2020R1A6A1A03040516).
Abstract: The evolution of display backplane technologies has been driven by the relentless pursuit of higher form factor and superior performance coupled with lower power consumption. Current state-of-the-art backplane technologies based on amorphous Si, poly Si, and IGZO face challenges in meeting the requirements of next-generation displays, including larger dimensions, higher refresh rates, increased pixel density, greater brightness, and reduced power consumption. In this context, 2D chalcogenides have emerged as promising candidates for thin-film transistors (TFTs) in display backplanes, offering advantages such as high mobility, low leakage current, mechanical robustness, and transparency. This comprehensive review explores the significance of 2D chalcogenides as materials for TFTs in next-generation display backplanes. We delve into the structural characteristics, electronic properties, and synthesis methods of 2D chalcogenides, emphasizing scalable growth strategies that are relevant to large-area display backplanes. Additionally, we discuss mechanical flexibility and strain engineering, which are crucial for the development of flexible displays. Performance enhancement strategies for 2D chalcogenide TFTs are explored, encompassing techniques in device engineering and geometry optimization while considering scaling over a large area. Active-matrix implementation of 2D TFTs in various applications is also explored, benchmarking device performance on a large scale, which is a necessary aspect of TFTs used in display backplanes. Furthermore, the latest developments in the integration of 2D chalcogenide TFTs with different display technologies, such as OLED, quantum dot, and MicroLED displays, are reviewed in detail. Finally, challenges and opportunities in the field are discussed with a brief insight into emerging trends and research directions.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62101087), the China Postdoctoral Science Foundation (Grant No. 2021MD703942), the Chongqing Postdoctoral Research Project Special Funding, China (Grant No. 2021XM2016), the Science Foundation of Chongqing Municipal Commission of Education, China (Grant No. KJQN202100642), and the Chongqing Natural Science Foundation, China (Grant No. cstc2021jcyj-msxmX0834).
Abstract: Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, the current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven out of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study were open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
Funding: Supported by the Natural Science Foundation of China, Grant No. 62103052.
Abstract: Drone swarm systems, equipped with photoelectric imaging and intelligent target perception, are essential for reconnaissance and strike missions in complex and high-risk environments. They excel in information sharing, anti-jamming capabilities, and combat performance, making them critical for future warfare. However, varied perspectives in collaborative combat scenarios pose challenges to object detection, hindering traditional detection algorithms and reducing accuracy. Limited angle-prior data and sparse samples further complicate detection. This paper presents the Multi-View Collaborative Detection System, which tackles the challenges of multi-view object detection in collaborative combat scenarios. The system is designed to enhance multi-view image generation and detection algorithms, thereby improving the accuracy and efficiency of object detection across varying perspectives. First, an observation model for three-dimensional targets through line-of-sight angle transformation is constructed, and a multi-view image generation algorithm based on the Pix2Pix network is designed. For object detection, YOLOX is utilized, and a deep feature extraction network, BA-RepCSPDarknet, is developed to address challenges related to small target scale and feature extraction. Additionally, a feature fusion network, NS-PAFPN, is developed to mitigate the loss of deep feature map information in UAV images. A visual attention module (BAM) is employed to manage appearance differences under varying angles, while a feature mapping module (DFM) prevents fine-grained feature loss. These advancements lead to the development of BA-YOLOX, a multi-view object detection network model suitable for drone platforms that enhances accuracy and effectively targets small objects.
Abstract: With the rapid progress of artificial intelligence (AI) technology and the mobile internet, 3D hand pose estimation has become critical to various intelligent application areas, e.g., human-computer interaction. To avoid the low accuracy of single-modal estimation and the high complexity of traditional multi-modal 3D estimation, this paper proposes a novel multi-modal multi-view (MMV) 3D hand pose estimation system, which introduces a registration before translation (RT)-translation before registration (TR) jointed conditional generative adversarial network (cGAN) to train a multi-modal registration network, and then employs multi-modal feature fusion to achieve high-quality estimation with low hardware and software costs in both data acquisition and processing. Experimental results demonstrate that the MMV system is effective and feasible in various scenarios. The MMV system is promising for use in a broad range of intelligent application areas.