As China's high-speed railway technology advances, high-speed trains have emerged as a pivotal mode of transportation, instrumental in facilitating passenger and freight mobility while fostering robust regional economic and trade interactions. Nonetheless, the safety of train operations remains a paramount concern, prompting extensive research into the dynamic behavior of critical components, which is essential to ensuring seamless and secure transportation services. This article begins by comprehensively reviewing the current landscape and evolutionary trajectory of dynamic model analysis for both traditional bearings and axle box bearings. Emphasis is placed on the influence of diverse bearing fault types on the system's kinematic state and on the research methodologies employed in developing multi-physics field coupling models. Subsequently, it reviews investigations of various wheel and track impairments grounded in the dynamic modeling of the bearing-vehicle coupling system. The combined effect of wheel-rail excitation and axle box bearing faults on the system's performance is also examined. In conclusion, the article underscores the inadequacy of current multi-source fault diagnosis methodologies in tackling the intricacies of complex train operating environments, highlighting this as a pressing and vital research agenda for the future.
The spatial offset of bridges has a significant impact on the safety, comfort, and durability of high-speed railway (HSR) operations, so it is crucial to detect the spatial offset of operational HSR bridges rapidly and effectively. Drive-by monitoring of uneven bridge settlement shows significant potential owing to its practicality, cost-effectiveness, and efficiency. However, existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source, low detection accuracy, and an inability to identify lateral bridge deformations. This paper proposes a novel drive-by inspection method for the spatial offset of HSR bridges based on multi-source data fusion from a comprehensive inspection train. First, dung beetle optimizer-variational mode decomposition was employed to achieve adaptive decomposition of non-stationary dynamic signals and to explore the hidden temporal relationships in the data. Subsequently, a long short-term memory neural network was developed to fuse features of the multi-source signals and accurately predict the spatial settlement of HSR bridges. A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model, and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated. Finally, the reliability of the proposed drive-by inspection method was further validated by analyzing actual measurement data obtained from a comprehensive inspection train. The findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridges, supporting their long-term operational safety.
Domain adaptation aims to reduce the distribution gap between the training data (source domain) and the target data. This enables effective predictions even for domains not seen during training. However, most conventional domain adaptation methods assume a single source domain, making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets. To address this limitation, recent research has focused on Multi-Source Domain Adaptation (MSDA), which aims to learn effectively from multiple source domains. In this paper, we propose Efficient Domain Transition for Multi-source (EDTM), a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches: (1) integrating knowledge across different source domains and (2) aligning label distributions between source and target domains. EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain. To further stabilize the learning process and improve performance, we incorporate imitation learning into the training of the target model. In addition, Maximum Classifier Discrepancy (MCD) is employed to align class-wise label distributions between the source and target domains. Experiments were conducted using Digits-Five, one of the most representative benchmark datasets for MSDA. The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy. Notably, EDTM achieved significantly higher performance on target domains such as the Modified National Institute of Standards and Technology with blended background images (MNIST-M) and Street View House Numbers (SVHN) datasets, demonstrating enhanced generalization compared to baseline approaches. Furthermore, an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework, highlighting the importance of each module in achieving optimal performance.
Benthic habitat mapping has emerged in recent years as a discipline in the international marine field, providing an effective tool for marine spatial planning, marine ecological management, and decision-making applications. Seabed sediment classification is one of the main components of seabed habitat mapping. Because remote sensing imaging quality varies and the acoustic measurement range is limited, a single data source does not fully reflect the substrate type; we therefore propose a high-precision seabed habitat sediment classification method that integrates data from multiple sources. Based on WorldView-2 multispectral remote sensing imagery and multibeam bathymetry data, we constructed a random forest (RF) classifier with optimal feature selection. A seabed sediment classification experiment integrating optical and acoustic remote sensing data was carried out in the shallow-water area of Wuzhizhou Island, Hainan, South China. Different seabed sediment types, such as sand, seagrass, and coral reefs, were effectively identified, with an overall classification accuracy of 92%. Experimental results show that the RF classifier optimized by feature selection over fused multi-source remote sensing data outperformed simple combinations of data sources, improving the accuracy of seabed sediment classification. The proposed method can therefore be applied effectively to high-precision seabed sediment classification and habitat mapping around islands and reefs.
Accurate estimation of understory terrain is of significant scientific importance for maintaining ecosystem balance and conserving biodiversity. Traditional forest topographic inversion methods treat the entire forest as the inversion unit and therefore represent spatial heterogeneity inadequately. To address this, this study proposes a differentiated modeling approach by forest type based on refined land cover classification. Taking Puerto Rico and Maryland as study areas, a multi-dimensional feature system is constructed by integrating multi-source remote sensing data: ICESat-2 spaceborne LiDAR is used to obtain benchmark values for understory terrain, topographic factors such as slope and aspect are extracted from SRTM data, and vegetation cover characteristics are analyzed using Landsat-8 multispectral imagery. The study incorporates forest type as a classification modeling condition and applies the random forest algorithm to build differentiated topographic inversion models. Experimental results indicate that, compared to traditional whole-area modeling (RMSE = 5.06 m), forest type-based classification modeling significantly improves the accuracy of understory terrain estimation (RMSE = 2.94 m), validating the effectiveness of spatial heterogeneity modeling. Further sensitivity analysis reveals that canopy structure parameters (with RMSE variation reaching 4.11 m) exert a stronger regulatory effect on estimation accuracy than forest cover, providing important theoretical support for optimizing remote sensing models of forest topography.
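To put the two RMSE figures quoted in this abstract side by side, the following minimal Python sketch computes a root-mean-square error and the relative reduction between a baseline and an improved value. The function names `rmse` and `relative_reduction` are illustrative and do not come from the paper; only the two RMSE values (5.06 m and 2.94 m) are taken from the abstract.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# RMSE values reported in the abstract, in metres.
whole_area_rmse = 5.06
per_forest_type_rmse = 2.94
print(f"{relative_reduction(whole_area_rmse, per_forest_type_rmse):.1%}")  # prints 41.9%
```

Under these numbers, modeling each forest type separately corresponds to roughly a 42% lower RMSE than whole-area modeling.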
To elucidate the fracturing mechanism of deep hard rock under complex disturbance environments, this study investigates the dynamic failure behavior of pre-damaged granite subjected to multi-source dynamic disturbances. Blasting vibration monitoring was conducted in a deep-buried drill-and-blast tunnel to characterize in-situ dynamic loading conditions. Subsequently, true triaxial compression tests incorporating multi-source disturbances were performed using a self-developed wide-low-frequency true triaxial system to simulate disturbance accumulation and damage evolution in granite. The results demonstrate that combined dynamic disturbances and unloading damage significantly accelerate strength degradation and trigger shear-slip failure along preferentially oriented blast-induced fractures, with strength reductions of up to 16.7%. Layered failure was observed on the free surface of pre-damaged granite under biaxial loading, indicating a disturbance-induced fracture localization mechanism. Time-stress-fracture-energy coupling fields were constructed to reveal the spatiotemporal characteristics of fracture evolution. Critical precursor frequency bands (105-150, 185-225, and 300-325 kHz) were identified, which serve as diagnostic signatures of impending failure. A dynamic instability mechanism driven by multi-source disturbance superposition and pre-damage evolution was established. Furthermore, a grouting-based wave-absorption control strategy was proposed to mitigate deep dynamic disasters by attenuating disturbance amplitude and reducing excitation frequency.
SiO₂ inverse opal photonic crystals (PCs) with a three-dimensional macroporous structure were fabricated by the sacrificial template method, followed by infiltration of a pyrene derivative, 1-(pyren-8-yl)but-3-en-1-amine (PEA), to obtain a formaldehyde (FA)-sensitive, fluorescence-enhanced sensing film. The specific aza-Cope rearrangement between the allylamine group of PEA and FA generates a strongly fluorescent product emitting at approximately 480 nm, so we chose a PC whose blue stopband edge overlaps this emission wavelength. By virtue of the fluorescence enhancement arising from the slow photon effect of the PC, FA was detected with high selectivity and sensitivity. The limit of detection (LoD) was calculated to be 1.38 nmol/L. Furthermore, fast detection of FA (within 1 min) is achieved owing to the interconnected three-dimensional macroporous structure of the inverse opal PC and its high specific surface area. The prepared sensing film can be used to detect FA in air, aquatic products, and living cells. The close agreement between the measured FA content of indoor air and the reading of an FA detector, the recovery rate of 101.5% for FA in aquatic products, and fast fluorescence imaging within 2 min in living cells demonstrate the reliability and accuracy of the method in practical applications.
With the development of cloud computing and machine learning, users can upload their data to the cloud for machine learning model training. However, dishonest clouds may infer user data, resulting in data leakage. Previous schemes have achieved secure outsourced computing, but they suffer from low computational accuracy, difficulty handling heterogeneously distributed data from multiple sources, and high computational cost, which lead to poor user experience and expensive cloud computing costs. To address these problems, we propose a multi-precision, multi-sourced, and multi-key outsourcing neural network training scheme. First, we design a multi-precision functional encryption computation based on Euclidean division. Second, we design an outsourced model training algorithm based on multi-precision functional encryption with multi-sourced heterogeneity. Finally, we conduct experiments on three datasets. The results indicate that our framework achieves an accuracy improvement of 6% to 30%. Additionally, it offers a memory space optimization of 1.0×2^24 times compared to the previous best approach.
Accurate monitoring of track irregularities helps to improve vehicle operation quality and to formulate appropriate track maintenance strategies. Existing methods rely on complex signal processing algorithms and lack multi-source data analysis. Driven by multi-source measurement data, including axle box, bogie frame, and carbody accelerations, this paper proposes a track irregularities monitoring network (TIMNet) based on deep learning methods. TIMNet uses the feature extraction capability of convolutional neural networks and the sequence mapping capability of the long short-term memory model to explore the mapping relationship between vehicle accelerations and track irregularities. The particle swarm optimization algorithm is used to optimize the network parameters, so that both vertical and lateral track irregularities can be accurately identified in the time and spatial domains. The effectiveness and superiority of the proposed TIMNet are analyzed under different simulation conditions using a vehicle dynamics model. Field tests are conducted to demonstrate the applicability of TIMNet in quantitatively monitoring vertical and lateral track irregularities. Furthermore, comparative tests show that TIMNet achieves a better goodness of fit and timeliness in monitoring track irregularities (vertical R² of 0.91, lateral R² of 0.84, and time cost of 10 ms) than other classical regression models. The tests also show that TIMNet has better anti-interference ability than other regression models.
In the heterogeneous power Internet of Things (IoT) environment, data signals are acquired to support different business systems in realizing advanced intelligent applications, and these signals are massive, multi-source, and heterogeneous. Reliable perception of information and efficient transmission of energy in multi-source heterogeneous environments are crucial issues. Compressive sensing (CS), as an effective method of signal compression and transmission, can accurately recover the original signal from very few samples. In this paper, we study a new method for reconstructing multi-source heterogeneous data signals in the power IoT based on compressive sensing. Building on traditional compressive sensing for direct recovery of multi-source heterogeneous signals, we exploit the interference subspace information to design the measurement matrix, which directly and effectively eliminates interference during measurement. The measurement matrix is optimized by minimizing its average cross-coherence, which further improves the reconstruction performance of the new method. Finally, the effectiveness of the new method is verified under different parameter settings and different multi-source heterogeneous signal cases using orthogonal matching pursuit (OMP) and sparsity adaptive matching pursuit (SAMP), covering practical settings in which prior information on signal sparsity is and is not used.
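The abstract mentions minimizing the average cross-coherence of the measurement matrix but does not define the quantity. As a hedged illustration only, the Python sketch below assumes one common definition: the mean absolute normalized inner product over all distinct column pairs. The function name `average_coherence` and this definition are assumptions, not the paper's formulation; a lower value means the columns are closer to mutually orthogonal.

```python
import math


def average_coherence(matrix):
    """Mean absolute normalized inner product over distinct column pairs.

    `matrix` is a list of rows with at least two non-zero columns.
    Assumed definition for illustration; the paper may use another one.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[r][c] for r in range(n_rows)] for c in range(n_cols)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    total, pairs = 0.0, 0
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            total += abs(dot) / (norms[i] * norms[j])
            pairs += 1
    return total / pairs


# Orthogonal columns give zero coherence; a third, correlated column raises it.
print(average_coherence([[1.0, 0.0], [0.0, 1.0]]))
print(average_coherence([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]))
```

An optimizer of the kind the abstract describes would adjust the matrix entries to drive this value down while keeping the interference-suppression constraint; that step is not shown here.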
This paper discusses in depth the causes of gear howling noise, the identification and analysis of multi-source excitation, the transmission path of dynamic noise, simulation and experimental research, case analysis, and optimization effects, aiming to provide guidance and reference for researchers in the field.
As the intelligent transformation of energy systems accelerates, monitoring equipment operating status and optimizing production processes in thermal power plants face the challenge of integrating multi-source heterogeneous data. Given the heterogeneous characteristics of physical sensor data (temperature, vibration, and pressure) generated by boilers, steam turbines, and other key equipment, together with real-time operating condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize collaborative analysis of physical sensor data, simulation results, and expert knowledge. The data fusion module combines the Kalman filter, wavelet transform, and Bayesian estimation to solve the problems of time series alignment and dimensional differences in the data. Simulation results show that data fusion accuracy can be improved to more than 98% and that calculation delay can be kept within 500 ms. The data analysis module integrates the Dymola simulation model and the AERMOD pollutant diffusion model and supports cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and data consistency verification accuracy reaches 99.5%.
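The abstract names the Kalman filter as one ingredient of the fusion module without giving its configuration. The following Python sketch shows only the textbook scalar Kalman filter for a slowly varying quantity, such as a temperature reading; the function name `kalman_1d`, the parameter names, and the default values are hypothetical and are not taken from the platform described above.

```python
def kalman_1d(measurements, meas_var, process_var=0.0, x0=0.0, p0=1e6):
    """Scalar Kalman filter for a (nearly) constant state.

    measurements: noisy readings of the same quantity
    meas_var:     measurement noise variance (assumed known)
    process_var:  process noise variance added at each step
    x0, p0:       initial estimate and its variance (large p0 = vague prior)
    Returns the list of filtered estimates, one per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: state model is x_k = x_{k-1}
        gain = p / (p + meas_var)  # Kalman gain
        x = x + gain * (z - x)     # update estimate with the innovation
        p = (1.0 - gain) * p       # update estimate variance
        estimates.append(x)
    return estimates


readings = [10.0, 12.0, 11.0]
print(kalman_1d(readings, meas_var=1.0)[-1])
```

With zero process noise and a vague prior, the final estimate approaches the sample mean of the readings, which is a quick sanity check on the update equations. A plant-scale system would use a vector state and sensor-specific noise models.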
Multi-source data fusion provides high-precision spatial situational awareness essential for analyzing granular urban social activities. This study used Shanghai's catering industry as a case study, leveraging electronic reviews and consumer data sourced from third-party restaurant platforms collected in 2021. By performing weighted processing on two-dimensional point-of-interest (POI) data, clustering hotspots of high-dimensional restaurant data were identified. A hierarchical network of restaurant hotspots was constructed following the Central Place Theory (CPT) framework, while the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation in multi-scale processes. These findings suggest the necessity of enhancing the spatial balance of Shanghai's urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery. Such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as the restaurant industry. At a finer spatial scale, the distribution of restaurant hotspots demonstrates a polycentric and symmetric spatial pattern, with a developmental trend radiating outward along the city's ring roads. This trend can be attributed to the efforts of restaurants to establish connections with other urban functional spaces, leading to the reconfiguration of urban spaces, expansion of restaurant-dedicated land use, and the reorganization of associated commercial activities. The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.
Taking the Ming Tombs Forest Farm in Beijing as the research object, this study applied multi-source data fusion and GIS heat-map overlay analysis. It systematically collected bird observation point data from the Global Biodiversity Information Facility (GBIF), population distribution data from the Oak Ridge National Laboratory (ORNL) in the United States, and, through literature research and field investigations, information on the tree species composition of forest areas suitable for birds and the forest geography of the Ming Tombs Forest Farm. Using GIS technology, bird observation points and population distribution data were spatially processed to identify suitable bird-watching areas in different seasons. These areas were then classified into grades, from unsuitable to highly suitable, according to the suitability value range. The findings indicated significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm. The north side of the reservoir was generally a core area with high suitability in all seasons. The deep-aged broad-leaved mixed forests supported the overlapping ecological niches of various bird species, such as Zosterops simplex and Urocissa erythrorhyncha. In contrast, the shallow forest-edge pure coniferous forests and mixed forests were more suitable for specialized species such as Carduelis sinica. The southern urban area and the core area of the mausoleums had relatively low suitability owing to ecological fragmentation or human interference. Based on these results, this paper proposed a three-level protection framework of "core area conservation, buffer zone management, and isolation zone construction" and a spatio-temporally coordinated human-bird co-existence strategy. It also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces, restoring ecological corridors, and integrating cultural heritage elements. This research provides an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.
As coal mining progresses to greater depths, controlling the stability of surrounding rock in deep roadways has become an increasingly complex challenge. Although four-dimensional (4D) support theoretically offers unique advantages in maintaining the stability of rock mass, the disaster evolution processes and multi-source information response characteristics in deep roadways with 4D support remain unclear. Consequently, a large-scale physical model testing system and self-designed 4D support components were employed to conduct similarity model tests on the surrounding rock failure process under unsupported (U-1), traditional bolt-mesh-cable support (T-2), and 4D support (4D-R-3) conditions. Combined with multi-source monitoring techniques, including stress–strain, digital image correlation (DIC), acoustic emission (AE), microseismic (MS), parallel electric (PE), and electromagnetic radiation (EMR), the mechanical behavior and multi-source information responses were comprehensively analyzed. The results show that the peak stress and displacement of the models are positively correlated with the support strength. The multi-source information exhibits distinct response characteristics under different supports. The response frequency, energy, and fluctuations of AE, MS, and EMR signals, along with the apparent resistivity (AR) high-resistivity zone, follow the trend U-1 > T-2 > 4D-R-3. Furthermore, multi-source information exhibits significant differences in sensitivity across different phases. The AE, MS, and EMR signals respond actively to rock mass activity in each phase, whereas AR signals are sensitive only to fracture propagation during the plastic yield and failure phases. In summary, the 4D support significantly enhances the bearing capacity and plastic deformation of the models, while substantially reducing the frequency, energy, and fluctuations of multi-source signals.
The exponential growth of video content has driven significant advancements in video summarization techniques in recent years. Breakthroughs in deep learning have been particularly transformative, enabling more effective detection of key information and creating new possibilities for video synopsis. To summarize recent progress and accelerate research in this field, this paper provides a comprehensive review of deep learning-based video summarization methods developed over the past decade. We begin by examining the research landscape of video abstraction technologies and identifying core challenges in video summarization. Subsequently, we systematically analyze prevailing deep learning frameworks and methodologies employed in current video summarization systems, offering researchers a clear roadmap of the field's evolution. Unlike previous review works, we first classify research papers based on the structural hierarchy of the video (from frame-level to shot-level to video-level), then further categorize them according to the summary backbone model (feature extraction and spatiotemporal modeling). This approach provides a more systematic and hierarchical organization of the literature. Following this review, we summarize the benchmark datasets and evaluation metrics commonly employed in the field. Finally, we analyze persistent challenges and propose directions for future research, providing a forward-looking perspective on video summarization technologies. This systematic literature review is of great reference value to new researchers exploring the fields of deep learning and video summarization.
With the continuous advancement of unmanned technology in various application domains, the development and deployment of blind-spot-free panoramic video systems have gained increasing importance. Such systems are particularly critical in battlefield environments, where advanced panoramic video processing and wireless communication technologies are essential to enable remote control and autonomous operation of unmanned ground vehicles (UGVs). However, conventional video surveillance systems suffer from several limitations, including limited field of view, high processing latency, low reliability, excessive resource consumption, and significant transmission delays. These shortcomings impede the widespread adoption of UGVs in battlefield settings. To overcome these challenges, this paper proposes a novel multi-channel video capture and stitching system designed for real-time video processing. The system integrates the Speeded-Up Robust Features (SURF) algorithm and the Fast Library for Approximate Nearest Neighbors (FLANN) algorithm to execute essential operations such as feature detection, descriptor computation, image matching, homography estimation, and seamless image fusion. The fused panoramic video is then encoded and assembled to produce a seamless output devoid of stitching artifacts and shadows. Furthermore, H.264 video compression is employed to reduce the data size of the video stream without sacrificing visual quality. Using the Real-Time Streaming Protocol (RTSP), the compressed stream is transmitted efficiently, supporting real-time remote monitoring and control of UGVs in dynamic battlefield environments. Experimental results indicate that the proposed system achieves high stability, flexibility, and low latency. With a wireless link latency of 30 ms, the end-to-end video transmission latency remains around 140 ms, enabling smooth video communication. The system can tolerate packet loss rates (PLR) of up to 20% while maintaining usable video quality (with latency around 200 ms). These properties make it well-suited for mobile communication scenarios demanding high real-time video performance.
Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, there has been increasing attention on generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of ego vehicles. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, frequently suffer from limited generalization and depend on substantial input data. Meanwhile, 2D generative models, though capable of producing unknown scenes, still have room for improvement in terms of coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module is presented to improve video consistency by extracting the global context of each frame, calculating relationships between frames using these global representations, and fusing frame contexts accordingly. Moreover, we propose an innovative attention mechanism that computes relations between pixels within each frame and pixels in the corresponding window range of the initial frame. Extensive experiments show that our approach surpasses various state-of-the-art models in driving video generation, and the introduced modules contribute significantly to model performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, which facilitates on-demand simulation to expedite algorithm development.
Video emotion recognition is widely used because it aligns with the temporal characteristics of human emotional expression, but existing models have significant shortcomings. On the one hand, modeling global temporal dependency with Transformer multi-head self-attention incurs high computational overhead and suffers from feature similarity. On the other hand, fixed-size convolution kernels are often used, which perceive emotional regions of different scales poorly. Therefore, this paper proposes a video emotion recognition model that combines multi-scale region-aware convolution with temporal interactive sampling. Spatially, multi-branch large-kernel stripe convolution is used to perceive emotional region features at different scales, and attention weights are generated for each scale feature. Temporally, multi-layer odd-even down-sampling is performed on the time series, and odd-even sub-sequence interaction is performed to address feature similarity while reducing computational cost, since convolution overhead scales linearly with the sampled length. The model was evaluated on CMU-MOSI, CMU-MOSEI, and Hume Reaction, reaching Acc-2 of 83.4%, 85.2%, and 81.2%, respectively. The experimental results show that the model can significantly improve the accuracy of emotion recognition.
Background: This study aims to investigate the underlying mechanisms between parental marital conflict and adolescent short video dependence by constructing a chain mediation model, focusing on the mediating roles of experiential avoidance and emotional disturbance (anxiety, depression, and stress). Methods: Conducted in January 2025, the research recruited 4125 adolescents from multiple Chinese provinces through convenience sampling; after data cleaning, 3957 valid participants (1959 males, 1998 females) were included. Using a cross-sectional design, measures included parental marital conflict, experiential avoidance, anxiety, depression, stress, and short video dependence. Results: Pearson correlation analysis revealed significant positive correlations among all variables. Mediation analysis using the SPSS PROCESS macro showed that parental marital conflict directly predicted short video dependence (β=0.269, p<0.001), and also significantly predicted experiential avoidance (β=0.519, p<0.001), anxiety (β=0.072, p<0.001), depression (β=0.067, p<0.001), and stress (β=0.048, p<0.05). Experiential avoidance further predicted anxiety (β=0.521, p<0.001), depression (β=0.489, p<0.001), stress (β=0.408, p<0.001), and short video dependence (β=0.244, p<0.001). While both anxiety (β=0.050, p<0.05) and depression (β=0.116, p<0.001) positively predicted short video dependence, stress did not (β=0.019, p=0.257). Overall, experiential avoidance, anxiety, depression, and stress significantly mediated the relationship between parental marital conflict and short video dependence. Conclusion: These findings confirm that parental marital conflict not only directly influences adolescent short video dependence but also operates through a chain mediation pathway involving experiential avoidance and emotional disturbance, highlighting central psychological mechanisms and providing theoretical support for integrated mental health and behavioral interventions.
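As a reading aid for the chain mediation model above, the following Python sketch multiplies the reported standardized path coefficients along each pathway. This is only the textbook product-of-coefficients point estimate; the abstract does not report indirect effects or their confidence intervals, so these products are illustrative arithmetic and not results from the study. The variable and function names are hypothetical.

```python
from math import prod

# Standardized path coefficients copied from the abstract.
CONFLICT_TO_EA = 0.519          # parental marital conflict -> experiential avoidance
EA_TO_DEPENDENCE = 0.244        # experiential avoidance -> short video dependence
EA_TO_EMOTION = {"anxiety": 0.521, "depression": 0.489, "stress": 0.408}
EMOTION_TO_DEPENDENCE = {"anxiety": 0.050, "depression": 0.116, "stress": 0.019}


def indirect_effect(*paths: float) -> float:
    """Product-of-coefficients point estimate for one mediation pathway."""
    return prod(paths)


# Simple mediation: conflict -> experiential avoidance -> dependence.
via_ea = indirect_effect(CONFLICT_TO_EA, EA_TO_DEPENDENCE)

# Chain mediation: conflict -> experiential avoidance -> emotion -> dependence.
chain = {
    emotion: indirect_effect(
        CONFLICT_TO_EA, EA_TO_EMOTION[emotion], EMOTION_TO_DEPENDENCE[emotion]
    )
    for emotion in EA_TO_EMOTION
}

if __name__ == "__main__":
    print(f"via experiential avoidance only: {via_ea:.4f}")
    for emotion, effect in chain.items():
        print(f"chain via {emotion}: {effect:.4f}")
```

The stress pathway yields the smallest product, consistent with the non-significant stress-to-dependence coefficient (p=0.257) reported above. Significance of any indirect effect would normally be judged from bootstrapped confidence intervals, which this sketch does not compute.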
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12393783, 12302067, 12172235, 52072249); Joint Funds of the National Natural Science Foundation of China (Grant No. U24A2003); College Education Scientific Research Project of Hebei Province (Grant No. JZX2024006); Central Guiding Local Scientific and Technological Development Funding Project (Grant No. 246Z2206G); the Key Research Project of China State Railway Group Co., Ltd. (Grant No. N2024T009); S&T Program of Hebei (Grant No. 21567622H).
Abstract: As China's high-speed railway technology advances, high-speed trains have emerged as a pivotal mode of transportation, instrumental in facilitating passenger and freight mobility while fostering robust regional economic and trade interactions. Nonetheless, the safety of train operations remains a paramount concern, prompting extensive research into the dynamic behavior of critical components, which is essential to ensuring seamless and secure transportation services. This article begins by comprehensively reviewing the current landscape and evolutionary trajectory of dynamic model analysis for both traditional bearings and axle box bearings. Emphasis is placed on the influence of diverse bearing fault types on the system's kinematic state, and on the research methodologies employed in developing multi-physics field coupling models. Subsequently, it reviews investigations of various wheel and track impairments, grounded in the dynamic modeling of the bearing-vehicle coupling system. The combined effect of wheel-rail excitation and axle box bearing faults on the system's performance is also elucidated. Finally, the article underscores the inadequacy of current multi-source fault diagnosis methodologies in tackling the intricacies of complex train operating environments, thereby highlighting its significance as a pressing and vital research agenda for the future.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 52178100).
Abstract: The spatial offset of bridges has a significant impact on the safety, comfort, and durability of high-speed railway (HSR) operations, so it is crucial to rapidly and effectively detect the spatial offset of operational HSR bridges. Drive-by monitoring of uneven bridge settlement demonstrates significant potential due to its practicality, cost-effectiveness, and efficiency. However, existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source, low detection accuracy, and the inability to identify lateral deformations of bridges. This paper proposes a novel drive-by inspection method for the spatial offset of HSR bridges based on multi-source data fusion from a comprehensive inspection train. First, dung beetle optimizer-variational mode decomposition was employed to achieve adaptive decomposition of non-stationary dynamic signals and to explore the hidden temporal relationships in the data. Subsequently, a long short-term memory neural network was developed to achieve feature fusion of multi-source signals and accurate prediction of the spatial settlement of HSR bridges. A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model, and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated. Finally, the reliability of the proposed drive-by inspection method was further validated by analyzing actual measurement data obtained from a comprehensive inspection train. The research findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridges, ensuring the long-term operational safety of HSR bridges.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2024-00406320); the Institute of Information & Communications Technology Planning & Evaluation (IITP)-Innovative Human Resource Development for Local Intellectualization Program Grant funded by the Korea government (MSIT) (IITP-2026-RS-2023-00259678).
Abstract: Domain adaptation aims to reduce the distribution gap between the training data (source domain) and the target data. This enables effective predictions even for domains not seen during training. However, most conventional domain adaptation methods assume a single source domain, making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets. To address this limitation, recent research has focused on Multi-Source Domain Adaptation (MSDA), which aims to learn effectively from multiple source domains. In this paper, we propose Efficient Domain Transition for Multi-source (EDTM), a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches: (1) integrating knowledge across different source domains and (2) aligning label distributions between source and target domains. EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain. To further stabilize the learning process and improve performance, we incorporate imitation learning into the training of the target model. In addition, Maximum Classifier Discrepancy (MCD) is employed to align class-wise label distributions between the source and target domains. Experiments were conducted using Digits-Five, one of the most representative benchmark datasets for MSDA. The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy. Notably, EDTM achieved significantly higher performance on target domains such as the Modified National Institute of Standards and Technology with blended background images (MNIST-M) and Street View House Numbers (SVHN) datasets, demonstrating enhanced generalization compared to baseline approaches. Furthermore, an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework, highlighting the importance of each module in achieving optimal performance.
Funding: Supported by the National Natural Science Foundation of China (Nos. 42376185, 41876111); the Shandong Provincial Natural Science Foundation (No. ZR2023MD073).
Abstract: Benthic habitat mapping has emerged in recent years as a discipline in the international marine field, providing an effective tool for marine spatial planning, marine ecological management, and decision-making applications. Seabed sediment classification is one of the main components of seabed habitat mapping. Because remote sensing imaging quality varies and acoustic measurement range is limited, a single data source does not fully reflect the substrate type; we therefore propose a high-precision seabed habitat sediment classification method that integrates data from multiple sources. Based on WorldView-2 multi-spectral remote sensing image data and multibeam bathymetry data, we constructed a random forest (RF) classifier with optimal feature selection. A seabed sediment classification experiment integrating optical remote sensing and acoustic remote sensing data was carried out in the shallow water area of Wuzhizhou Island, Hainan, South China. Different seabed sediment types, such as sand, seagrass, and coral reefs, were effectively identified, with an overall classification accuracy of 92%. Experimental results show that the RF classifier optimized by feature selection on fused multi-source remote sensing data performed better than simple combinations of data sources, improving the accuracy of seabed sediment classification. Therefore, the method proposed in this paper can be effectively applied to high-precision seabed sediment classification and habitat mapping around islands and reefs.
Funding: Supported by the National Natural Science Foundation of China (42401488, 42071351); the National Key Research and Development Program of China (2020YFA0608501, 2017YFB0504204); the Liaoning Revitalization Talents Program (XLYC1802027); the Talent Recruited Program of the Chinese Academy of Science (Y938091); the Project Supported Discipline Innovation Team of the Liaoning Technical University (LNTU20TD-23); the Liaoning Province Doctoral Research Initiation Fund Program (2023-BS-202); the Basic Research Projects of Liaoning Department of Education (JYTQN2023202).
Abstract: Accurate estimation of understory terrain has significant scientific importance for maintaining ecosystem balance and biodiversity conservation. Traditional forest topographic inversion methods treat the entire forest as the inversion unit and therefore represent spatial heterogeneity inadequately. To address this, the study proposes a differentiated modeling approach by forest type, based on refined land cover classification. Taking Puerto Rico and Maryland as study areas, a multi-dimensional feature system is constructed by integrating multi-source remote sensing data: ICESat-2 spaceborne LiDAR is used to obtain benchmark values for understory terrain, topographic factors such as slope and aspect are extracted from SRTM data, and vegetation cover characteristics are analyzed using Landsat-8 multispectral imagery. This study incorporates forest type as a classification modeling condition and applies the random forest algorithm to build differentiated topographic inversion models. Experimental results indicate that, compared to traditional whole-area modeling methods (RMSE=5.06 m), forest type-based classification modeling significantly improves the accuracy of understory terrain estimation (RMSE=2.94 m), validating the effectiveness of spatial heterogeneity modeling. Further sensitivity analysis reveals that canopy structure parameters (with RMSE variation reaching 4.11 m) exert a stronger regulatory effect on estimation accuracy than forest cover, providing important theoretical support for optimizing remote sensing models of forest topography.
Funding: Supported by the National Key R&D Program of China (No. 2023YFB2603602); the National Natural Science Foundation of China (Nos. 52222810 and 52178383).
Abstract: To elucidate the fracturing mechanism of deep hard rock under complex disturbance environments, this study investigates the dynamic failure behavior of pre-damaged granite subjected to multi-source dynamic disturbances. Blasting vibration monitoring was conducted in a deep-buried drill-and-blast tunnel to characterize in-situ dynamic loading conditions. Subsequently, true triaxial compression tests incorporating multi-source disturbances were performed using a self-developed wide-low-frequency true triaxial system to simulate disturbance accumulation and damage evolution in granite. The results demonstrate that combined dynamic disturbances and unloading damage significantly accelerate strength degradation and trigger shear-slip failure along preferentially oriented blast-induced fractures, with strength reductions up to 16.7%. Layered failure was observed on the free surface of pre-damaged granite under biaxial loading, indicating a disturbance-induced fracture localization mechanism. Time-stress-fracture-energy coupling fields were constructed to reveal the spatiotemporal characteristics of fracture evolution. Critical precursor frequency bands (105-150, 185-225, and 300-325 kHz) were identified, which serve as diagnostic signatures of impending failure. A dynamic instability mechanism driven by multi-source disturbance superposition and pre-damage evolution was established. Furthermore, a grouting-based wave-absorption control strategy was proposed to mitigate deep dynamic disasters by attenuating disturbance amplitude and reducing excitation frequency.
Funding: Supported by the National Natural Science Foundation of China (21663032 and 22061041); the Open Sharing Platform for Scientific and Technological Resources of Shaanxi Province (2021PT-004); the National Innovation and Entrepreneurship Training Program for College Students of China (S202110719044).
Abstract: SiO₂ inverse opal photonic crystals (PC) with a three-dimensional macroporous structure were fabricated by the sacrificial template method, followed by infiltration of a pyrene derivative, 1-(pyren-8-yl)but-3-en-1-amine (PEA), to obtain a formaldehyde (FA)-sensitive and fluorescence-enhanced sensing film. Utilizing the specific Aza-Cope rearrangement reaction between the allylamine of PEA and FA, which generates a strongly fluorescent product emitting at approximately 480 nm, we chose a PC whose blue stopband edge overlapped with the fluorescence emission wavelength. By virtue of the fluorescence enhancement derived from the slow photon effect of the PC, FA was detected with high selectivity and sensitivity. The limit of detection (LoD) was calculated to be 1.38 nmol/L. Furthermore, fast detection of FA (within 1 min) is realized due to the interconnected three-dimensional macroporous structure of the inverse opal PC and its high specific surface area. The prepared sensing film can be used for the detection of FA in air, aquatic products, and living cells. The close agreement between the measured FA content in indoor air and the result from an FA detector, the recovery rate of 101.5% for detecting FA in aquatic products, and fast fluorescence imaging within 2 min for living cells demonstrate the reliability and accuracy of our method in practical applications.
Funding: Supported by the Natural Science Foundation of China (Nos. 62303126, 62362008; author Z.Z; https://www.nsfc.gov.cn/, accessed on 20 December 2024); Major Scientific and Technological Special Project of Guizhou Province ([2024]014); Guizhou Provincial Science and Technology Projects (No. ZK[2022]General149; author Z.Z; https://kjt.guizhou.gov.cn/, accessed on 20 December 2024); the Open Project of the Key Laboratory of Computing Power Network and Information Security, Ministry of Education (Grant 2023ZD037; author Z.Z; https://www.gzu.edu.cn/, accessed on 20 December 2024); Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2024B25; author Z.Z; https://www.gzu.edu.cn/, accessed on 20 December 2024).
Abstract: Due to the development of cloud computing and machine learning, users can upload their data to the cloud for machine learning model training. However, dishonest clouds may infer user data, resulting in user data leakage. Previous schemes have achieved secure outsourced computing, but they suffer from low computational accuracy, difficulty handling heterogeneously distributed data from multiple sources, and high computational cost, which result in extremely poor user experience and expensive cloud computing costs. To address the above problems, we propose a multi-precision, multi-sourced, and multi-key outsourcing neural network training scheme. First, we design a multi-precision functional encryption computation based on Euclidean division. Second, we design the outsourcing model training algorithm based on multi-precision functional encryption with multi-sourced heterogeneity. Finally, we conduct experiments on three datasets. The results indicate that our framework achieves an accuracy improvement of 6% to 30%. Additionally, it offers a memory space optimization of 1.0×2^24 times compared to the previous best approach.
Funding: Supported by the Sichuan Science and Technology Program (Nos. 2024JDRC0100 and 2023YFQ0091); the National Natural Science Foundation of China (Nos. U21A20167 and 52475138); the Scientific Research Foundation of the State Key Laboratory of Rail Transit Vehicle System (No. 2024RVL-T08).
Abstract: Accurate monitoring of track irregularities helps improve vehicle operation quality and formulate appropriate track maintenance strategies. Existing methods rely on complex signal processing algorithms and lack multi-source data analysis. Driven by multi-source measurement data, including the axle box, bogie frame, and carbody accelerations, this paper proposes a track irregularities monitoring network (TIMNet) based on deep learning methods. TIMNet uses the feature extraction capability of convolutional neural networks and the sequence mapping capability of the long short-term memory model to explore the mapping relationship between vehicle accelerations and track irregularities. The particle swarm optimization algorithm is used to optimize the network parameters, so that both the vertical and lateral track irregularities can be accurately identified in the time and spatial domains. The effectiveness and superiority of the proposed TIMNet are analyzed under different simulation conditions using a vehicle dynamics model. Field tests are conducted to demonstrate the applicability of the proposed TIMNet in quantitatively monitoring vertical and lateral track irregularities. Furthermore, comparative tests show that TIMNet achieves a better fit and timeliness in monitoring track irregularities (vertical R² of 0.91, lateral R² of 0.84, and time cost of 10 ms) than other classical regression models. The tests also show that TIMNet has better anti-interference ability than other regression models.
Funding: Supported by the National Natural Science Foundation of China (12174350); Science and Technology Project of State Grid Henan Electric Power Company (5217Q0240008).
Abstract: In the heterogeneous power internet of things (IoT) environment, data signals are acquired to support different business systems in realizing advanced intelligent applications, and these signals are massive, multi-source, and heterogeneous. Reliable perception of information and efficient transmission of energy in multi-source heterogeneous environments are crucial issues. Compressive sensing (CS), as an effective method of signal compression and transmission, can accurately recover the original signal from very few samples. In this paper, we study a new method for reconstructing multi-source heterogeneous data signals in the power IoT based on compressive sensing. Building on traditional compressive sensing, which recovers multi-source heterogeneous signals directly, we make full use of interference subspace information to design the measurement matrix, which directly and effectively eliminates the interference while the measurement is being made. The measurement matrix is optimized by minimizing its average cross-coherence, which further improves the reconstruction performance of the new method. Finally, the effectiveness of the new method is verified under different parameter settings and different multi-source heterogeneous data signal cases, using orthogonal matching pursuit (OMP) for the practical setting in which prior information on signal sparsity is available and sparsity adaptive matching pursuit (SAMP) for the setting in which it is not.
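To make the two ingredients named in this abstract concrete, the sketch below shows a generic orthogonal matching pursuit (OMP) recovery and an average cross-coherence measure in Python with NumPy. It is a minimal illustration under stated assumptions, not the paper's method: the interference-subspace design of the measurement matrix is not reproduced, the function names `omp` and `average_coherence` are hypothetical, and "average cross-coherence" is assumed here to mean the mean absolute inner product between distinct normalized columns.

```python
import numpy as np


def average_coherence(A: np.ndarray) -> float:
    """Mean absolute inner product between distinct unit-normalized columns.

    Assumed definition; the abstract does not spell out its own.
    """
    cols = A / np.linalg.norm(A, axis=0, keepdims=True)
    gram = np.abs(cols.T @ cols)
    n = gram.shape[0]
    off_diagonal_sum = gram.sum() - np.trace(gram)
    return float(off_diagonal_sum / (n * (n - 1)))


def omp(A: np.ndarray, y: np.ndarray, sparsity: int, tol: float = 1e-10):
    """Greedy OMP: needs the sparsity level as prior information."""
    support: list[int] = []
    coef = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(sparsity):
        if np.linalg.norm(residual) <= tol:
            break  # measurements already explained
        # Pick the unused column most correlated with the residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected columns jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    if support:
        x_hat[support] = coef
    return x_hat, support


# Tiny hand-checkable case: 2 measurements, 3 unknowns, 1 nonzero entry.
A_demo = np.array([[1.0, 0.0, 2 ** -0.5],
                   [0.0, 1.0, 2 ** -0.5]])
x_true = np.array([0.0, 0.0, 2.0])
x_rec, picked = omp(A_demo, A_demo @ x_true, sparsity=1)
```

In the demo, the third column has the largest correlation with the measurements, so one OMP step selects it and the least-squares fit returns the original coefficient. A lower average coherence generally makes such greedy selection more reliable, which is the usual motivation for optimizing the measurement matrix.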
Abstract: This paper discusses in depth the causes of gear howling noise, the identification and analysis of multi-source excitation, the transmission path of dynamic noise, simulation and experimental research, case analysis, and optimization effects, aiming to provide guidance and reference for relevant researchers.
Abstract: With the accelerating intelligent transformation of energy systems, the monitoring of equipment operation status and the optimization of production processes in thermal power plants face the challenge of multi-source heterogeneous data integration. In view of the heterogeneous characteristics of physical sensor data (including temperature, vibration, and pressure) generated by boilers, steam turbines, and other key equipment, and of real-time working condition data from the SCADA system, this paper proposes a multi-source heterogeneous data fusion and analysis platform for thermal power plants based on edge computing and deep learning. By constructing a multi-level fusion architecture, the platform adopts a dynamic weight allocation strategy and a 5D digital twin model to realize the collaborative analysis of physical sensor data, simulation calculation results, and expert knowledge. The data fusion module combines the Kalman filter, wavelet transform, and Bayesian estimation to solve the problems of time series alignment and dimension difference. Simulation results show that the data fusion accuracy can be improved to more than 98%, and the calculation delay can be controlled within 500 ms. The data analysis module integrates the Dymola simulation model and the AERMOD pollutant diffusion model and supports the cascade analysis of boiler combustion efficiency prediction and flue gas emission monitoring; the system response time is less than 2 seconds, and data consistency verification accuracy reaches 99.5%.
Funding: Under the auspices of the Key Program of National Natural Science Foundation of China (No. 42030409).
Abstract: Multi-source data fusion provides high-precision spatial situational awareness essential for analyzing granular urban social activities. This study used Shanghai’s catering industry as a case study, leveraging electronic reviews and consumer data collected in 2021 from third-party restaurant platforms. By performing weighted processing on two-dimensional point-of-interest (POI) data, clustering hotspots of high-dimensional restaurant data were identified. A hierarchical network of restaurant hotspots was constructed following the Central Place Theory (CPT) framework, while the Geo-Informatic Tupu method was employed to resolve the challenges posed by network deformation in multi-scale processes. These findings suggest the necessity of enhancing the spatial balance of Shanghai’s urban centers by moderately increasing the number and service capacity of suburban centers at the urban periphery. Such measures would contribute to a more optimized urban structure and facilitate the outward dispersion of comfort-oriented facilities such as the restaurant industry. At a finer spatial scale, the distribution of restaurant hotspots demonstrates a polycentric and symmetric spatial pattern, with a developmental trend radiating outward along the city’s ring roads. This trend can be attributed to the efforts of restaurants to establish connections with other urban functional spaces, leading to the reconfiguration of urban spaces, expansion of restaurant-dedicated land use, and the reorganization of associated commercial activities. The results validate the existence of a polycentric urban structure in Shanghai but also highlight the instability of the restaurant hotspot network during cross-scale transitions.
Funding: Sponsored by Beijing Youth Innovation Talent Support Program for Urban Greening and Landscaping: The 2024 Special Project for Promoting High-Quality Development of Beijing’s Landscaping through Scientific and Technological Innovation (KJCXQT202410).
Abstract: Taking the Ming Tombs Forest Farm in Beijing as the research object, this research applied multi-source data fusion and GIS heat-map overlay analysis. It systematically collected bird observation point data from the Global Biodiversity Information Facility (GBIF) and population distribution data from the Oak Ridge National Laboratory (ORNL) in the United States, together with information on the tree species composition of forest areas suitable for birds and the forest geographical information of the Ming Tombs Forest Farm, obtained from literature research and field investigations. Using GIS technology, spatial processing was carried out on bird observation points and population distribution data to identify suitable bird-watching areas in different seasons. Then, according to the suitability value range, these areas were classified into different grades (from unsuitable to highly suitable). The research findings indicated that there was significant spatial heterogeneity in the bird-watching suitability of the Ming Tombs Forest Farm. The north side of the reservoir was generally a core area with high suitability in all seasons. The deep-aged broad-leaved mixed forests supported the overlapping co-existence of the ecological niches of various bird species, such as Zosterops simplex and Urocissa erythrorhyncha. In contrast, the shallow forest-edge coniferous pure forests and mixed forests were more suitable for specialized species like Carduelis sinica. The southern urban area and the core area of the mausoleums had relatively low suitability due to ecological fragmentation or human interference. Based on these results, this paper proposed a three-level protection framework of “core area conservation, buffer zone management, isolation zone construction” and a spatio-temporal coordinated human-bird co-existence strategy. It was also suggested that the human-bird co-existence space could be optimized through measures such as constructing sound and light buffer interfaces, restoring ecological corridors, and integrating cultural heritage elements. This research provided an operational technical approach and decision-making support for the scientific planning of bird-watching sites and the coordination of ecological protection and tourism development.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U22A20598 and 52104107); the "Qinglan Project" of Jiangsu Colleges and Universities; Young Elite Scientists Sponsorship Program of Jiangsu Province (Grant No. TJ-2023-086).
Abstract: As coal mining progresses to greater depths, controlling the stability of surrounding rock in deep roadways has become an increasingly complex challenge. Although four-dimensional (4D) support theoretically offers unique advantages in maintaining the stability of rock mass, the disaster evolution processes and multi-source information response characteristics in deep roadways with 4D support remain unclear. Consequently, a large-scale physical model testing system and self-designed 4D support components were employed to conduct similarity model tests on the surrounding rock failure process under unsupported (U-1), traditional bolt-mesh-cable support (T-2), and 4D support (4D-R-3) conditions. Combined with multi-source monitoring techniques, including stress–strain, digital image correlation (DIC), acoustic emission (AE), microseismic (MS), parallel electric (PE), and electromagnetic radiation (EMR), the mechanical behavior and multi-source information responses were comprehensively analyzed. The results show that the peak stress and displacement of the models are positively correlated with the support strength. The multi-source information exhibits distinct response characteristics under different supports. The response frequency, energy, and fluctuations of AE, MS, and EMR signals, along with the apparent resistivity (AR) high-resistivity zone, follow the trend U-1>T-2>4D-R-3. Furthermore, multi-source information exhibits significant differences in sensitivity across different phases. The AE, MS, and EMR signals respond actively to rock mass activity in each phase. However, AR signals are only sensitive to fracture propagation during the plastic yield and failure phases. In summary, the 4D support significantly enhances the bearing capacity and plastic deformation of the models, while substantially reducing the frequency, energy, and fluctuations of multi-source signals.
Funding: Supported by UKRI (EP/Z000025/1); Horizon Europe Programme under the MSCA grant for the ACMod project (101130271).
Abstract: The exponential growth of video content has driven significant advancements in video summarization techniques in recent years. Breakthroughs in deep learning have been particularly transformative, enabling more effective detection of key information and creating new possibilities for video synopsis. To summarize recent progress and accelerate research in this field, this paper provides a comprehensive review of deep learning-based video summarization methods developed over the past decade. We begin by examining the research landscape of video abstraction technologies and identifying core challenges in video summarization. Subsequently, we systematically analyze prevailing deep learning frameworks and methodologies employed in current video summarization systems, offering researchers a clear roadmap of the field's evolution. Unlike previous review works, we first classify research papers based on the structural hierarchy of the video (from frame-level to shot-level to video-level), then further categorize them according to the summary backbone model (feature extraction and spatiotemporal modeling). This approach provides a more systematic and hierarchical organization of the literature. Following this comprehensive review, we summarize the benchmark datasets and evaluation metrics commonly employed in the field. Finally, we analyze persistent challenges and propose directions for future research, providing a forward-looking perspective on video summarization technologies. This systematic literature review is of great reference value to new researchers exploring the fields of deep learning and video summarization.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 72334003), the National Key Research and Development Program of China (Grant No. 2022YFB2702804), the Shandong Key Research and Development Program (Grant No. 2020ZLYS09), and the Jinan Program (Grant No. 2021GXRC084-2).
Abstract: With the continuous advancement of unmanned technology in various application domains, the development and deployment of blind-spot-free panoramic video systems have gained increasing importance. Such systems are particularly critical in battlefield environments, where advanced panoramic video processing and wireless communication technologies are essential to enable remote control and autonomous operation of unmanned ground vehicles (UGVs). However, conventional video surveillance systems suffer from several limitations, including limited field of view, high processing latency, low reliability, excessive resource consumption, and significant transmission delays. These shortcomings impede the widespread adoption of UGVs in battlefield settings. To overcome these challenges, this paper proposes a novel multi-channel video capture and stitching system designed for real-time video processing. The system integrates the Speeded-Up Robust Features (SURF) algorithm and the Fast Library for Approximate Nearest Neighbors (FLANN) algorithm to execute essential operations such as feature detection, descriptor computation, image matching, homography estimation, and seamless image fusion. The fused panoramic video is then encoded and assembled to produce a seamless output devoid of stitching artifacts and shadows. Furthermore, H.264 video compression is employed to reduce the data size of the video stream without sacrificing visual quality. Using the Real-Time Streaming Protocol (RTSP), the compressed stream is transmitted efficiently, supporting real-time remote monitoring and control of UGVs in dynamic battlefield environments. Experimental results indicate that the proposed system achieves high stability, flexibility, and low latency. With a wireless link latency of 30 ms, the end-to-end video transmission latency remains around 140 ms, enabling smooth video communication. The system can tolerate packet loss rates (PLR) of up to 20% while maintaining usable video quality (with latency around 200 ms). These properties make it well-suited for mobile communication scenarios demanding high real-time video performance.
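The homography estimation step named in the abstract above produces a 3x3 matrix that maps pixel coordinates from one camera image into the coordinate frame of a neighbouring image. The following is a minimal sketch of how such a matrix is applied to a single point; it does not reproduce the paper's implementation. The function name `apply_homography` and the example matrices are illustrative assumptions, and the estimation itself (from SURF/FLANN matches) is not shown.

```python
def apply_homography(H, point):
    """Map a 2D pixel (x, y) through a 3x3 homography H (row-major nested lists).

    Hypothetical helper for illustration only; H would normally come from a
    homography estimation step on matched feature points.
    """
    x, y = point
    # Multiply H by the homogeneous vector (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (xh / w, yh / w)


# Assumed example: a pure horizontal shift of 100 pixels between two cameras.
H_shift = [[1.0, 0.0, 100.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
shifted = apply_homography(H_shift, (10.0, 20.0))
```

A homography is defined only up to scale, so multiplying every entry of `H` by the same non-zero constant leaves the mapped point unchanged.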
Funding: Supported by the Cultivation Program for Major Scientific Research Projects of Harbin Institute of Technology (ZDXMPY20180109).
Abstract: Scalable simulation leveraging real-world data plays an essential role in advancing autonomous driving, owing to its efficiency and applicability in both training and evaluating algorithms. Consequently, there has been increasing attention on generating highly realistic and consistent driving videos, particularly those involving viewpoint changes guided by the control commands or trajectories of ego vehicles. However, current reconstruction approaches, such as Neural Radiance Fields and 3D Gaussian Splatting, frequently suffer from limited generalization and depend on substantial input data. Meanwhile, 2D generative models, though capable of producing unknown scenes, still have room for improvement in terms of coherence and visual realism. To overcome these challenges, we introduce GenScene, a world model that synthesizes front-view driving videos conditioned on trajectories. A new temporal module is presented to improve video consistency by extracting the global context of each frame, calculating relationships of frames using these global representations, and fusing frame contexts accordingly. Moreover, we propose an innovative attention mechanism that computes relations of pixels within each frame and pixels in the corresponding window range of the initial frame. Extensive experiments show that our approach surpasses various state-of-the-art models in driving video generation, and the introduced modules contribute significantly to model performance. This work establishes a new paradigm for goal-oriented video synthesis in autonomous driving, which facilitates on-demand simulation to expedite algorithm development.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62272236 and 62376128, and in part by the Natural Science Foundation of Jiangsu Province under Grants BK20201136 and BK20191401.
Abstract: Video emotion recognition is widely used because it aligns with the temporal characteristics of human emotional expression, but existing models have significant shortcomings. On the one hand, modeling global temporal dependency with Transformer multi-head self-attention suffers from high computational overhead and feature similarity. On the other hand, fixed-size convolution kernels are often used, which have weak perception ability for emotional regions of different scales. Therefore, this paper proposes a video emotion recognition model that combines multi-scale region-aware convolution with temporal interactive sampling. In terms of space, multi-branch large-kernel stripe convolution is used to perceive emotional region features at different scales, and attention weights are generated for each scale feature. In terms of time, multi-layer odd-even down-sampling is performed on the time series, and odd-even sub-sequence interaction is performed to solve the problem of feature similarity, while reducing computational costs due to the linear relationship between sampling and convolution overhead. The model was evaluated on CMU-MOSI, CMU-MOSEI, and Hume Reaction, where Acc-2 reached 83.4%, 85.2%, and 81.2%, respectively. The experimental results show that the model can significantly improve the accuracy of emotion recognition.
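The multi-layer odd-even down-sampling described in the abstract above can be pictured with a small sketch. This is only an illustration under stated assumptions: the abstract does not specify the interaction operator, so the additive mixing in `interact`, the weight `alpha`, and all function names are hypothetical stand-ins, and scalar values stand in for per-frame feature vectors.

```python
def split_even_odd(frames):
    """Split a sequence into its even-indexed and odd-indexed sub-sequences."""
    return frames[0::2], frames[1::2]


def interact(even, odd, alpha=0.5):
    """Hypothetical interaction: each sub-sequence is updated with the other.

    The additive form and alpha are assumptions made for illustration; the
    abstract does not state the actual operator used by the model.
    """
    n = min(len(even), len(odd))  # drop a trailing unpaired element, if any
    mixed_even = [even[i] + alpha * odd[i] for i in range(n)]
    mixed_odd = [odd[i] + alpha * even[i] for i in range(n)]
    return mixed_even, mixed_odd


def downsample_tree(frames, levels):
    """Apply the split-and-interact step repeatedly.

    Each level halves the length of every sub-sequence and doubles their count.
    """
    layers = [list(frames)]
    for _ in range(levels):
        next_layers = []
        for seq in layers:
            even, odd = split_even_odd(seq)
            next_layers.extend(interact(even, odd))
        layers = next_layers
    return layers


# Eight scalar "frame features" split over two levels.
leaves = downsample_tree([float(t) for t in range(8)], levels=2)
```

With 8 input steps and 2 levels, the sketch yields 4 sub-sequences of length 2, so any later per-sequence operation works on shorter inputs at each level.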
Abstract: Background: This study aims to investigate the underlying mechanisms between parental marital conflict and adolescent short video dependence by constructing a chain mediation model, focusing on the mediating roles of experiential avoidance and emotional disturbance (anxiety, depression, and stress). Methods: Conducted in January 2025, the research recruited 4125 adolescents from multiple Chinese provinces through convenience sampling; after data cleaning, 3957 valid participants (1959 males, 1998 females) were included. Using a cross-sectional design, measures included parental marital conflict, experiential avoidance, anxiety, depression, stress, and short video dependence. Results: Pearson correlation analysis revealed significant positive correlations among all variables. Mediation analysis using the SPSS PROCESS macro showed that parental marital conflict directly predicted short video dependence (β=0.269, p<0.001), and also significantly predicted experiential avoidance (β=0.519, p<0.001), anxiety (β=0.072, p<0.001), depression (β=0.067, p<0.001), and stress (β=0.048, p<0.05). Experiential avoidance further predicted anxiety (β=0.521, p<0.001), depression (β=0.489, p<0.001), stress (β=0.408, p<0.001), and short video dependence (β=0.244, p<0.001). While both anxiety (β=0.050, p<0.05) and depression (β=0.116, p<0.001) positively predicted short video dependence, stress did not (β=0.019, p=0.257). Overall, experiential avoidance, anxiety, depression, and stress significantly mediated the relationship between parental marital conflict and short video dependence. Conclusion: These findings confirm that parental marital conflict not only directly influences adolescent short video dependence but also operates through a chain mediation pathway involving experiential avoidance and emotional disturbance, highlighting central psychological mechanisms and providing theoretical support for integrated mental health and behavioral interventions.
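To make the chain mediation structure concrete, the path coefficients reported above can be multiplied along each pathway, which is the usual product-of-coefficients point estimate of an indirect effect. The sketch below uses only the β values quoted in the abstract; the dictionary keys and function name are illustrative assumptions, and the products are not the study's reported indirect effects, which PROCESS would typically estimate with bootstrap confidence intervals.

```python
from math import prod

# Path coefficients quoted in the abstract (key names are illustrative labels).
beta = {
    ("conflict", "avoidance"): 0.519,
    ("conflict", "depression"): 0.067,
    ("avoidance", "anxiety"): 0.521,
    ("avoidance", "depression"): 0.489,
    ("avoidance", "dependence"): 0.244,
    ("anxiety", "dependence"): 0.050,
    ("depression", "dependence"): 0.116,
}


def indirect_effect(*path):
    """Product of the coefficients along consecutive variables in `path`."""
    return prod(beta[(a, b)] for a, b in zip(path, path[1:]))


# Single-mediator pathway: conflict -> avoidance -> dependence.
via_avoidance = indirect_effect("conflict", "avoidance", "dependence")

# Chain pathways: conflict -> avoidance -> emotional disturbance -> dependence.
via_avoidance_depression = indirect_effect(
    "conflict", "avoidance", "depression", "dependence")
via_avoidance_anxiety = indirect_effect(
    "conflict", "avoidance", "anxiety", "dependence")
```

Under this calculation the chain through depression comes out larger than the chain through anxiety, in line with depression having the larger coefficient on short video dependence (0.116 versus 0.050).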