Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution (HSI SR) has been an attractive research topic in computer vision. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information and cannot guarantee efficient extraction of image information. In this paper, a multidimensional features network (MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial, spectral, and frequency features of HSI. Spatial features contain rich local details, spectral features contain the information in and correlation between spectral bands, and frequency features reflect the global information of the image and can be used to obtain the global context of HSI. Fusing the three kinds of features can better guide image super-resolution and yield higher-quality high-resolution hyperspectral images. In MFNet, a frequency feature extraction module (FFEM) extracts the frequency features. On this basis, a multidimensional features extraction module (MFEM) is designed to learn and fuse the multidimensional features. Experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
Multi-Object Tracking (MOT) is a fundamental but computationally demanding task in computer vision, with particular challenges arising in occluded and densely populated environments. While contemporary tracking systems have made considerable progress, persistent limitations, notably frequent occlusion-induced identity switches and tracking inaccuracies, continue to impede reliable real-world deployment. This work introduces a tracking framework that enhances association robustness through a two-stage matching paradigm combining spatial and appearance features. The proposed framework employs: (1) a Height Modulated and Scale Adaptive Spatial Intersection-over-Union (HMSIoU) metric for improved spatial correspondence estimation across variable object scales and partial occlusions; (2) a feature extraction module generating discriminative appearance descriptors for identity maintenance; and (3) a recovery association mechanism for refining matches between unassociated tracks and detections. Comprehensive evaluation on the standard MOT17 and MOT20 benchmarks demonstrates significant improvements in tracking consistency, with state-of-the-art performance across key metrics including HOTA (64), MOTA (80.7), IDF1 (79.8), and IDs (1379). These results substantiate the efficacy of the Cue-Tracker framework in complex real-world scenarios characterized by occlusions and crowd interactions.
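The HMSIoU metric in the abstract above builds on Intersection-over-Union, but the abstract does not give its height-modulated, scale-adaptive form. The sketch below therefore shows only the plain IoU that such a metric extends; the function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, not the paper's definition.

```python
def iou(box_a, box_b):
    """Plain Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2).

    This is the baseline overlap measure only; it is NOT the HMSIoU of the abstract.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero when the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

In a two-stage matcher, a score like this would typically fill a track-by-detection cost matrix before appearance features are consulted.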
Remote sensing cross-modal image-text retrieval (RSCIR) can flexibly and subjectively retrieve remote sensing images from query text and has recently attracted growing research attention. However, as the number of parameters in visual-language pre-training models grows, direct transfer learning consumes substantial computational and storage resources. Moreover, recently proposed parameter-efficient transfer learning methods mainly focus on reconstructing channel features and ignore the spatial features that are vital for modeling key entity relationships. To address these issues, we design an efficient transfer learning framework for RSCIR based on spatial feature efficient reconstruction (SPER). A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships. The spatial adapter spatially reconstructs the features in the backbone with few parameters while incorporating prior information from the channel dimension. We conduct quantitative and qualitative experiments on two commonly used RSCIR datasets. Compared with traditional methods, our approach improves the sumR metric by 3%-11%. Compared with methods that fine-tune all parameters, our method trains less than 1% of the parameters while maintaining an overall performance of about 96%.
In view of the weak ability of convolutional neural networks to explicitly learn spatial invariance, and the probabilistic loss of discriminative features caused by occlusion and background interference in pedestrian re-identification tasks, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the misalignment of pedestrian spatial features. Then the network is divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch to extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses differential pooling to obtain differential features, which are fused with the global features to learn the relationship between pedestrian local and global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID, and CUHK03. The experimental results are better than those of the compared methods, which verifies the effectiveness of the proposed method.
Multimodal sentiment analysis aims to understand emotions from text, speech, and video data. However, current methods often overlook the dominant role of text and suffer from feature loss during integration. Given the varying importance of each modality across different contexts, a central challenge in multimodal sentiment analysis lies in maximizing the use of rich intra-modal features while minimizing information loss during fusion. To address these limitations, we propose a novel framework that integrates spatial position encoding and fusion embedding modules. In our model, text is treated as the core modality, while speech and video features are selectively incorporated through a position-aware fusion process. The spatial position encoding strategy preserves the internal structural information of the speech and visual modalities, enabling the model to capture localized intra-modal dependencies that are often overlooked. This design enhances the richness and discriminative power of the fused representation, enabling more accurate and context-aware sentiment prediction. Finally, we conduct comprehensive evaluations on two widely recognized standard datasets in the field, CMU-MOSI and CMU-MOSEI, to validate the performance of the proposed model. The experimental results demonstrate that our model performs well on sentiment analysis tasks.
Recent advancements in smart-meter technology are transforming traditional power systems into intelligent smart grids, which offer substantial benefits across social, environmental, and economic dimensions. To realize these advantages, fine-grained collection and analysis of smart meter data is essential. However, the high dimensionality and volume of such time series present significant challenges, including increased computational load, data transmission overhead, latency, and complexity in real-time analysis. This study proposes a novel, computationally efficient framework for feature extraction and selection tailored to smart meter time-series data. The approach begins with an extensive offline analysis, in which features are derived from multiple domains (time, frequency, and statistical) to capture diverse signal characteristics. Various feature sets are fused and evaluated using robust machine learning classifiers to identify the most informative combinations for automated appliance categorization. The best-performing fused feature set is further refined using Analysis of Variance (ANOVA) to identify the most discriminative features. The mathematical models used to compute the selected features are optimized so that the features can be extracted efficiently during online processing. Moreover, a notable dimension reduction is achieved, which facilitates data storage, transmission, and post-processing. Next, a specifically designed LogitBoost (LB) based ensemble of Random Forest base learners is used for automated classification. The proposed solution achieves a high classification accuracy (97.93%) on the nine-class problem and a 17.33-fold dimension reduction with minimal front-end computational requirements, making it well suited for real-world applications in smart grid environments.
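The ANOVA refinement step described above can be illustrated with a one-way F-statistic computed per feature: a feature whose class means differ strongly relative to its within-class spread scores higher. This is a minimal sketch under assumptions; the names `anova_f` and `rank_features` and the toy data layout are hypothetical, and the paper's actual selection thresholds are not stated in the abstract.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a single feature.

    `groups` holds one list of feature values per class.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(feature_groups):
    """Sort feature names from most to least discriminative by F-statistic."""
    scores = {name: anova_f(groups) for name, groups in feature_groups.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Keeping only the top-ranked names is one simple way to obtain the kind of dimension reduction the abstract reports.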
In modern warfare, radar countermeasures are becoming increasingly fierce, and the timing and pattern of enemy jamming change more randomly. It is challenging for a radar to identify jamming efficiently and obtain precise parameter information, particularly at low signal-to-noise ratio (SNR). In this paper, an approach to intelligent recognition and parameter estimation of complex jamming based on joint time-frequency distribution features is proposed to address this issue. First, a joint algorithm based on YOLOv5 convolutional neural networks (CNNs) is proposed to classify the jamming signal and obtain preliminary parameter estimates. Then, an accurate algorithm for estimating key jamming parameters is constructed by combining the chi-square statistical test, feature region search, position regression, spectrum interpolation, and related techniques; it accurately estimates the jamming carrier frequency, relative delay, Doppler frequency shift, and other parameters. Finally, simulation and real-data verification show that the approach improves complex jamming recognition and parameter estimation at low SNR, with a recognition rate of up to 98% at −15 dB SNR.
To address the low classification accuracy and poor use of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method based on Gabor spatial texture features, nonparametric weighted spectral features, and sparse representation classification (Gabor–NWSF and SRC), abbreviated GNWSF–SRC. The proposed GNWSF–SRC method first combines the Gabor spatial features and nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification is obtained by analyzing the reconstruction error. We apply the proposed method to two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods.
The spatial composition of the natural environment and settlements in the Three Gorges region along the Yangtze River was analyzed from a macro perspective, emphasizing the interdependence among buildings, landform, and waterscape, the relationship between buildings and landscape, and the integration of nature and human culture. Then the spatial features of folk houses were analyzed, with special attention to their "upward", "grey", and dynamic characteristics. The courtyard-type residence and stilted building in South China were taken as examples to explain their exterior spatial characteristics, and the interior spatial features were analyzed in terms of the pursuit of courtyard layout, the preference for courtyard space, and the emphasis on central room space. The paper reveals the builders' rational thinking about the natural environment and living place as conveyed through traditional folk houses, as well as the practical value of this architectural style in the special natural environment of the Three Gorges region. It also explains the artistic achievements arising from the integration of architecture and environment, aiming to provide references for urban and living environment construction in this region during the "Post Three-Gorges Project Era".
The 41-year (1961-2001) seasonal Z index series of 25 representative weather stations are investigated using EOF, FFT, continuous wavelet transformation (CWT), and orthogonal wavelet transformation (OWT). The results show that: (1) Fujian drought/flood (DF) has a significant 2-3 year cycle for the periods 1965-1975 and the 1990s; (2) the pattern that represents the opposite DF trend between the southern and northern parts has 1-year and 3-4 year cycles since the mid-1980s; (3) EOF3, which denotes the reverse change between the middle-west region and other areas, has a significant 1-2 year cycle from 1985 to 1998 and a 9-13 year cycle since the 1980s; (4) there is an obvious drought trend over the last 40 years (especially in the 1990s), which is more pronounced in the south (east) than in the north (west); (5) the 1960s and 1980s are relatively wet phases, and the 1970s and 1990s are drought spells.
Time series anomaly detection is crucial in various industrial applications for identifying unusual behaviors in time series data. Because anomaly events are difficult to annotate, time series reconstruction has become a prevalent approach for unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, enabling 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN performs better than competing methods in time series anomaly detection.
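The abstract above does not say how the Fourier transform maps a series into 2D space. One common way to do this, assumed here purely for illustration, is to find the dominant period from the spectrum and fold the series into rows of that length, so that each row is one cycle. The names `dominant_period` and `fold_to_2d` are hypothetical, and a naive DFT stands in for an FFT to keep the sketch dependency-free.

```python
import cmath
import math


def dominant_period(series):
    """Period (in samples) of the strongest non-DC frequency, via a naive O(n^2) DFT."""
    n = len(series)
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(series))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return round(n / best_k)


def fold_to_2d(series, period):
    """Reshape a 1D series into rows of one period each (any trailing remainder is dropped)."""
    rows = len(series) // period
    return [series[r * period:(r + 1) * period] for r in range(rows)]


# Usage: a sine with period 8 folds into a grid whose rows are identical cycles.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
grid = fold_to_2d(signal, dominant_period(signal))
```

Once folded, variation along a row reflects within-period structure and variation down a column reflects change across periods, which is what makes 2D feature extractors applicable.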
Historically, landslides have been the primary type of geological disaster worldwide. The stability of reservoir banks is primarily affected by rainfall and reservoir water level fluctuations. Moreover, the stability of reservoir banks changes with the long-term dynamics of external disaster-causing factors. Thus, assessing the time-varying reliability of reservoir landslides remains a challenge. In this paper, a machine learning (ML) based approach is proposed to analyze the long-term reliability of reservoir bank landslides in spatially variable soils through time series prediction. The study systematically investigates the prediction performance of three ML algorithms: multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM). Additionally, the effects of data quantity and data ratio on the predictive power of deep learning models are considered. The results show that all three ML models can accurately depict the changes in the time-varying failure probability of reservoir landslides. The CNN model outperforms both the MLP and LSTM models in predicting the failure probability. Furthermore, selecting the right data ratio can improve the accuracy of the failure probability predicted by ML models.
The main purpose of nonlinear time series analysis, which is based on phase space reconstruction theory, is to study how to transform the response signal into a reconstructed phase space in order to extract dynamic feature information, and to provide an effective approach for nonlinear signal analysis and fault diagnosis of nonlinear dynamic systems. It has already become an important branch of nonlinear science. However, traditional methods cannot extract chaos features automatically and require human participation throughout the process. A new method is put forward that extracts chaos features from nonlinear time series automatically. First, the time delay τ is determined by the autocorrelation method; second, the embedding dimension m and correlation dimension D are computed; third, the maximum Lyapunov exponent λmax is computed; finally, the chaos degree Dch of the Poincaré map and the non-circle degree Dnc and non-order degree Dno of the quasi-phase orbit are calculated. Chaos feature extraction is important for fault diagnosis of nonlinear systems based on nonlinear chaos features. Examples show the validity of the proposed method.
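The first step above, choosing the time delay by the autocorrelation method, can be sketched as follows. The abstract does not state the selection rule; the sketch assumes the common convention of taking the first lag at which the autocorrelation falls below 1/e. The function names and the `threshold` parameter are hypothetical.

```python
import math


def autocorrelation(series, lag):
    """Normalized autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


def time_delay(series, threshold=math.exp(-1)):
    """First lag whose autocorrelation drops below `threshold` (assumed 1/e rule)."""
    for lag in range(1, len(series)):
        if autocorrelation(series, lag) < threshold:
            return lag
    return None  # the autocorrelation never fell below the threshold


# Usage: delay for a sampled sine wave with a period of 40 samples.
wave = [math.sin(2 * math.pi * t / 40) for t in range(400)]
delay = time_delay(wave)
```

The resulting delay would then be used to build the delay vectors from which the embedding dimension and correlation dimension are estimated.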
As one of the eight Taihang passes, Fukou Xing is located in the south of the Taihang Mountains and has historically been an important passage between Shanxi and Hebei. Taking traditional settlements in the Fukou Xing region as the research object, and using the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM), a remotely sensed elevation data set, together with a GIS platform, this paper makes a quantitative study of traditional settlement space in the mountain environment of this region. It studies spatial parameters including elevation, terrain, aspect, and boundary, and observes and summarizes the spatial features. In addition, based on local chronicles of the Ming and Qing dynasties, it cross-checks the quantitative conclusions against qualitative understanding, analyzes the evolution of traditional settlements in the Fukou Xing region, and finally obtains new understandings of their spatial features.
As a special outcome of urbanization, mega-towns not only play an important role in socio-economic development but are also important contributors to urbanization. Based on a spatial database of mega-towns in China, this paper explores the spatial distribution features and growth mechanisms of China's 238 mega-towns using the nearest neighbour distance method, kernel density estimation, regression analysis, global autocorrelation, local autocorrelation, and other spatial analysis methods. The results on spatial distribution show that: (1) on the national scale, the 238 mega-towns are mainly gathered in the southeast coastal areas of China, forming two core spatial agglomerations, several secondary ones, and a southeast coastal agglomeration belt; (2) on the regional scale, each economic region's index is less than 1, indicating that mega-towns in each region tend to be spatially agglomerated, which is closely related to the regional development level and the number of mega-towns; (3) on the provincial scale, 68% of provincial-level units in China show a spatial agglomeration of mega-towns, only one province has a random distribution, and the number of mega-towns in the evenly distributed provinces is generally small. The growth of mega-towns is determined by a combination of natural and human factors, including topography, location, economy, population, traffic, and national policy. This paper uses the digital elevation model (DEM), location advantage, economic density, population density, and highway density as the corresponding quantitative indicators. Local autocorrelation analysis shows that each of these factors has a certain influence on the spatial growth of mega-towns and that together they shape it. In the future, provinces and cities, especially those in the central and western regions of China, should make full use of mega-town functions to promote their socioeconomic development.
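The regional-scale finding above relies on an index below 1 signalling agglomeration, which matches the usual reading of the nearest neighbour index: observed mean nearest-neighbour distance divided by the distance expected under a random pattern of the same density. The sketch below assumes that this is the index meant; the function name and the planar coordinates are hypothetical, and no edge correction is applied.

```python
import math


def nearest_neighbour_index(points, area):
    """Clark-Evans style ratio: < 1 suggests clustering, about 1 random, > 1 dispersion."""
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)  # mean distance for a random pattern of equal density
    return observed / expected


# Usage: four tightly packed towns versus four evenly spaced towns in a 10 x 10 region.
clustered = nearest_neighbour_index([(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1)], area=100)
dispersed = nearest_neighbour_index([(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)], area=100)
```

For real town locations the coordinates would need to be projected to a planar system first, since the formula assumes Euclidean distances.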
Obstructive Sleep Apnea (OSA) is a respiratory syndrome caused by insufficient airflow through the respiratory tract or by respiratory arrest during sleep, and sometimes by reduced oxygen saturation. The aim of this paper is to analyze a person's respiratory signal to distinguish normal breathing activity from sleep apnea (SA) activity. In the proposed method, time domain and frequency domain features of the respiration signal obtained from a PPG device are extracted. These features are fed to a Classification and Regression Tree (CART)-Particle Swarm Optimization (PSO) classifier, which classifies the signal as a normal breathing signal or a sleep apnea signal. The proposed method is validated using performance metrics such as sensitivity, specificity, accuracy, and F1 score, applying the time domain and frequency domain features separately. Additionally, the performance of the CART-PSO (CPSO) classification algorithm is evaluated by comparing its measures with those of existing classification algorithms. The effect of the PSO algorithm on the classifier is also validated by varying the PSO parameters.
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information on the feature map and improving the target detection ability of the model. Second, a multi atrous spatial pyramid pooling (MASPP) module is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, improving the model's handling of multi-scale objects. The experimental results show that the improved YOLOv8 model achieves a detection accuracy of 92% and a mean average precision (mAP) of 87.9% on the DIOR dataset, which are 3.5% and 1.7% higher, respectively, than those of the original model. This shows that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has improved.
Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have focused predominantly on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The authors' approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
The theoretical positioning accuracy of multilateration (MLAT) with the time difference of arrival (TDOA) algorithm is very high. However, there are some problems in practical applications. Here we analyze the location performance of the time sum of arrival (TSOA) algorithm in terms of the root mean square error (RMSE) and geometric dilution of precision (GDOP) in an additive white Gaussian noise (AWGN) environment. The TSOA localization model is constructed, and with it the distribution of the location ambiguity region is presented for four base stations. The location performance analysis then starts from the four-base-station case by calculating the variation of RMSE and GDOP. Subsequently, the way the performance of the TSOA location algorithm changes is shown as location parameters such as the number and layout of base stations are varied, revealing the location characteristics and performance of TSOA. The trends in RMSE and GDOP demonstrate the anti-noise performance and robustness of the TSOA localization algorithm. The anti-noise performance of TSOA can be used to reduce the blind zone and the false location rate of MLAT systems.
Extreme weather events such as persistent high temperatures, heavy rains, and sudden cold waves in Shanxi Province, China, have brought great losses and disasters to people's production and life. It is of great practical significance to study the temporal and spatial distribution characteristics of extreme weather events and the circulation background field. We selected daily high temperature data (≥35°C), daily minimum temperature data, and daily precipitation data (≥50 mm) from 109 meteorological stations in Shanxi Province from 1981 to 2010. A period in which the temperature is ≥35°C for more than 3 days is defined as a high temperature extreme weather event; a station with 24-hour cumulative precipitation ≥50 mm on a given day (20:00 to 20:00, Beijing time) is defined as having rainstorm weather; and cold air activity in which the daily minimum temperature drops by more than 8°C in 24 hours, or by 10°C in 48 hours, with a daily minimum temperature ≤4°C, is defined as a cold weather process. We count the number of such extreme weather events (persistent high temperatures, heavy rains, and cold wave frosts) in Shanxi and statistically analyze their temporal and spatial distribution characteristics, trends, and general circulation background. We identify the common features of the large-scale circulation background field across the various extreme weather events and summarize the large-scale circulation characteristics of these events. The results provide some reference for future weather forecasting.
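The high temperature event definition above amounts to finding runs of consecutive hot days in a station's daily maximum series. The sketch below is one way to express that, under stated assumptions: the name `heat_events` is hypothetical, and because "more than 3 days" could mean either at least 3 or at least 4 consecutive days, the run length is left as the `min_days` parameter rather than fixed.

```python
def heat_events(daily_max, threshold=35.0, min_days=3):
    """Return (start, end) index pairs of runs where daily_max >= threshold
    for at least `min_days` consecutive days."""
    events = []
    start = None
    # A sentinel below any threshold closes a run that reaches the end of the record.
    for i, temp in enumerate(list(daily_max) + [float("-inf")]):
        if temp >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_days:
                events.append((start, i - 1))
            start = None
    return events


# Usage: twelve days of daily maximum temperature (degrees Celsius) at one station.
temps = [33, 36, 35, 37, 34, 36, 36, 30, 35, 35, 35, 35]
found = heat_events(temps)
```

The rainstorm and cold weather criteria could be flagged in the same style, by thresholding daily precipitation and by differencing the daily minimum series over 24 and 48 hours.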
Funding (MFNet hyperspectral image super-resolution paper): supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. GK249909299001-036), the National Key Research and Development Program of China (No. 2023YFB4502803), and the Zhejiang Provincial Natural Science Foundation of China (No. LDT23F01014F01).
Abstract: Multi-Object Tracking (MOT) is a fundamental but computationally demanding task in computer vision, with particular challenges arising in occluded and densely populated environments. While contemporary tracking systems have made considerable progress, persistent limitations, notably frequent occlusion-induced identity switches and tracking inaccuracies, continue to impede reliable real-world deployment. This work introduces a tracking framework that enhances association robustness through a two-stage matching paradigm combining spatial and appearance features. The proposed framework employs: (1) a Height Modulated and Scale Adaptive Spatial Intersection-over-Union (HMSIoU) metric for improved spatial correspondence estimation across variable object scales and partial occlusions; (2) a feature extraction module that generates discriminative appearance descriptors for identity maintenance; and (3) a recovery association mechanism for refining matches between unassociated tracks and detections. Comprehensive evaluation on the standard MOT17 and MOT20 benchmarks demonstrates significant improvements in tracking consistency, with state-of-the-art performance across key metrics including HOTA (64), MOTA (80.7), IDF1 (79.8), and IDs (1379). These results substantiate the efficacy of the Cue-Tracker framework in complex real-world scenarios characterized by occlusions and crowd interactions.
Funding: Supported by the National Key R&D Program of China (No. 2022ZD0118402).
Abstract: Remote sensing cross-modal image-text retrieval (RSCIR) can flexibly and subjectively retrieve remote sensing images using query text, and has recently attracted growing attention from researchers. However, as the number of parameters in visual-language pre-training models increases, direct transfer learning consumes substantial computational and storage resources. Moreover, recently proposed parameter-efficient transfer learning methods mainly focus on the reconstruction of channel features, ignoring the spatial features that are vital for modeling key entity relationships. To address these issues, we design an efficient transfer learning framework for RSCIR based on spatial feature efficient reconstruction (SPER). A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships. The spatial adapter spatially reconstructs the features in the backbone with few parameters while incorporating prior information from the channel dimension. We conduct quantitative and qualitative experiments on two commonly used RSCIR datasets. Compared with traditional methods, our approach achieves an improvement of 3%-11% in the sumR metric. Compared with methods that fine-tune all parameters, our method trains less than 1% of the parameters while maintaining about 96% of the overall performance.
Funding: Supported by the Foshan Science and Technology Innovation Team Project (No. FS0AA-KJ919-4402-0060) and the National Natural Science Foundation of China (No. 62263018).
Abstract: Convolutional neural networks have a weak ability to explicitly learn spatial invariance, and occlusion and background interference in pedestrian re-identification tasks cause a probabilistic loss of discriminative features. To address both problems, a person re-identification method combining spatial feature learning and multi-granularity feature fusion is proposed. First, an attention spatial transformation network (A-STN) is proposed to learn spatial features and solve the misalignment of pedestrian spatial features. The network is then divided into a global branch, a local coarse-grained fusion branch, and a local fine-grained fusion branch, which extract pedestrian global features, coarse-grained fusion features, and fine-grained fusion features, respectively. The global branch enriches the global features by fusing different pooling features. The local coarse-grained fusion branch uses overlay pooling to enhance each local feature while learning the correlation between multi-granularity features. The local fine-grained fusion branch uses differential pooling to obtain differential features, which are fused with global features to learn the relationship between pedestrian local features and pedestrian global features. Finally, the proposed method is evaluated on three public datasets: Market1501, DukeMTMC-ReID and CUHK03. The experimental results are better than those of the compared methods, which verifies the effectiveness of the proposed method.
Funding: Supported by the Collaborative Tackling Project of the Yangtze River Delta SciTech Innovation Community (Nos. 2024CSJGG01503, 2024CSJGG01500), the Guangxi Key Research and Development Program (No. AB24010317), and the Jiangxi Provincial Key Laboratory of Electronic Data Control and Forensics (Jiangxi Police College) (No. 2025JXJYKFJJ002).
Abstract: Multimodal sentiment analysis aims to understand emotions from text, speech, and video data. However, current methods often overlook the dominant role of text and suffer from feature loss during integration. Given the varying importance of each modality across different contexts, a central challenge in multimodal sentiment analysis is to make maximal use of rich intra-modal features while minimizing information loss during fusion. To address these limitations, we propose a framework that integrates spatial position encoding and fusion embedding modules. In our model, text is treated as the core modality, while speech and video features are selectively incorporated through a position-aware fusion process. The spatial position encoding strategy preserves the internal structural information of the speech and visual modalities, enabling the model to capture localized intra-modal dependencies that are often overlooked. This design enhances the richness and discriminative power of the fused representation, enabling more accurate and context-aware sentiment prediction. Finally, we evaluate the proposed model on two widely used standard datasets, CMU-MOSI and CMU-MOSEI. The experimental results demonstrate that the model performs well on sentiment analysis tasks.
Abstract: Recent advancements in smart-meter technology are transforming traditional power systems into intelligent smart grids, which offer substantial benefits across social, environmental, and economic dimensions. To realize these advantages, fine-grained collection and analysis of smart meter data is essential. However, the high dimensionality and volume of such time series present significant challenges, including increased computational load, data transmission overhead, latency, and complexity in real-time analysis. This study proposes a computationally efficient framework for feature extraction and selection tailored to smart meter time-series data. The approach begins with an extensive offline analysis, in which features are derived from multiple domains (time, frequency, and statistical) to capture diverse signal characteristics. Various feature sets are fused and evaluated using robust machine learning classifiers to identify the most informative combinations for automated appliance categorization. The best-performing fused feature set is further refined using Analysis of Variance (ANOVA) to identify the most discriminative features. The mathematical models used to compute the selected features are optimized so that the features can be extracted efficiently during online processing. Moreover, a notable dimension reduction is achieved, which facilitates data storage, transmission, and post-processing. Finally, a specifically designed LogitBoost (LB) based ensemble of Random Forest base learners is used for automated classification. The proposed solution achieves a high classification accuracy (97.93%) on a nine-class problem and a 17.33-fold dimension reduction with minimal front-end computational requirements, making it well suited for real-world applications in smart grid environments.
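The ANOVA refinement step mentioned in this abstract can be illustrated with a minimal Python sketch. It computes the one-way ANOVA F-statistic of each feature across appliance classes and ranks features by that score. The function names `anova_f` and `rank_features`, the feature names, and the toy data are hypothetical; the abstract does not state how the authors threshold or select features, so this shows only the generic statistic.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature, given its values per class."""
    k = len(groups)                       # number of classes
    n = sum(len(g) for g in groups)       # total number of samples
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of samples around their own class mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        return float("inf")               # no within-class spread at all
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(features, labels):
    """Rank feature names by F-statistic, most discriminative first.

    features: dict mapping a (hypothetical) feature name to a list of values
    labels:   list of class ids, aligned with the value lists
    """
    scores = {}
    for name, values in features.items():
        by_class = {}
        for value, label in zip(values, labels):
            by_class.setdefault(label, []).append(value)
        scores[name] = anova_f(list(by_class.values()))
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    features = {
        "mean_power": [1, 2, 3, 2, 3, 4, 5, 6, 7],   # separates the classes
        "noise":      [1, 5, 3, 2, 4, 3, 3, 1, 5],   # same mean in every class
    }
    print(rank_features(features, labels))
```

A feature whose class means differ strongly relative to its within-class spread receives a large F value and is kept; a feature with identical class means scores zero.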
Funding: Supported by the Shandong Provincial Natural Science Foundation (ZR2020MF015) and the Aerospace Technology Group Stability Support Project (ZY0110020009).
Abstract: In modern warfare, radar countermeasures are becoming increasingly fierce, and the timing and patterns of enemy jamming change more and more randomly. It is challenging for a radar to identify jamming efficiently and obtain precise parameter information, particularly at low signal-to-noise ratio (SNR). In this paper, an approach to intelligent recognition and parameter estimation of complex jamming, based on joint time-frequency distribution features, is proposed to address this issue. First, a joint algorithm based on YOLOv5 convolutional neural networks (CNNs) is proposed to classify the jamming signal and produce a preliminary parameter estimate. Then, an accurate estimation algorithm for key jamming parameters is constructed by combining the chi-square statistical test, feature region search, position regression, spectrum interpolation, and related techniques, which yields accurate estimates of the jamming carrier frequency, relative delay, Doppler frequency shift, and other parameters. Finally, simulation and real-data verification show that the approach improves complex jamming recognition and parameter estimation at low SNR, with a recognition rate of up to 98% at −15 dB SNR.
Funding: Supported by the National Natural Science Foundation of China (No. 61275010), the Ph.D. Programs Foundation of the Ministry of Education of China (No. 20132304110007), the Heilongjiang Natural Science Foundation (No. F201409), and the Fundamental Research Funds for the Central Universities (No. HEUCFD1410).
Abstract: To address the low classification accuracy and poor use of spatial information in traditional hyperspectral image classification methods, we propose a new hyperspectral image classification method based on Gabor spatial texture features, nonparametric weighted spectral features, and sparse representation classification (Gabor-NWSF and SRC), abbreviated GNWSF-SRC. The proposed GNWSF-SRC method first combines the Gabor spatial features and the nonparametric weighted spectral features to describe the hyperspectral image, and then applies the sparse representation method. Finally, the classification result is obtained by analyzing the reconstruction error. We apply the proposed method to two typical hyperspectral data sets with different percentages of training samples. Theoretical analysis and simulation demonstrate that the proposed method improves the classification accuracy and Kappa coefficient compared with traditional classification methods.
Funding: Supported by the Humanities Fund of the Ministry of Education (09YJAZH047) and the Scientific Research and Development Program of Yichang City (A09302-27).
Abstract: The spatial composition of the natural environment and settlements in the Three Gorges region along the Yangtze River is analyzed from a macro perspective, emphasizing the interdependence among buildings, landform and waterscape, the relationship between buildings and landscape, and the integration of nature and human culture. The spatial features of folk houses are then analyzed, with special attention to their "upward", "grey", and dynamic characteristics. The courtyard-type residence and the stilted building in South China are taken as examples to explain their exterior spatial characteristics, and the interior spatial features are analyzed in terms of the pursuit of courtyard layout, the preference for courtyard space, and the emphasis on central room space. The paper reveals the builders' rational thinking about the natural environment and living place as conveyed through traditional folk houses, as well as the practical value of this architectural style in the special natural environment of the Three Gorges region. It also explains the artistic achievements arising from the integration of architecture and environment, aiming to provide a reference for urban and living environment construction in this region during the "Post Three-Gorges Project Era".
Funding: Project from the Ministry of Science and Technology of China (2001DIB20116) and open project for KLME of Nanjing Institute of Meteorology (KJS02108).
Abstract: Seasonal Z-index series covering 41 years (1961-2001) from 25 representative weather stations are investigated by means of EOF, FFT, continuous wavelet transformation (CWT) and orthogonal wavelet transformation (OWT). The results show that: (1) Fujian drought/flood (DF) has a significant 2-3-year cycle for the periods 1965-1975 and the 1990s; (2) the pattern that represents the opposite DF trend between the southern and northern parts has 1-year and 3-4-year cycles since the mid-1980s; (3) EOF3, which denotes the reverse change between the middle-west region and other areas, has a significant 1-2-year cycle for the period from 1985 to 1998 and a 9-13-year cycle since the 1980s; (4) there is an obvious drought trend over the last 40 years (especially in the 1990s), which is more pronounced in the south (east) than in the north (west); (5) the 1960s and 1980s are relatively wet phases, and the 1970s and 1990s are drought spells.
Funding: Supported in part by the National Natural Science Foundation of China (Grants 62376172, 62006163, 62376043), in part by the National Postdoctoral Program for Innovative Talents (Grant BX20200226), and in part by the Sichuan Science and Technology Planning Project (Grants 2022YFSY0047, 2022YFQ0014, 2023ZYD0143, 2022YFH0021, 2023YFQ0020, 24QYCX0354, 24NSFTD0025).
Abstract: Time series anomaly detection is crucial in various industrial applications for identifying unusual behaviors within time series data. Because anomaly events are difficult to annotate, time series reconstruction has become a prevalent approach to unsupervised anomaly detection. However, effectively learning representations and achieving accurate detection remain challenging due to the intricate temporal patterns and dependencies in real-world time series. In this paper, we propose a cross-dimension attentive feature fusion network for time series anomaly detection, referred to as CAFFN. Specifically, a series and feature mixing block is introduced to learn representations in 1D space. Additionally, a fast Fourier transform is employed to convert the time series into 2D space, enabling 2D feature extraction. Finally, a cross-dimension attentive feature fusion mechanism is designed that adaptively integrates features across different dimensions for anomaly detection. Experimental results on real-world time series datasets demonstrate that CAFFN outperforms competing methods in time series anomaly detection.
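The abstract does not say how the Fourier transform maps a series into 2D space. One common way to do this, assumed here and not confirmed as the CAFFN design, is to find the dominant period from the spectrum and fold the series into rows of that length. The Python sketch below uses a naive discrete Fourier transform in place of a true FFT so that it needs only the standard library; `dominant_period` and `fold_to_2d` are hypothetical names.

```python
import cmath
import math


def dominant_period(x):
    """Estimate the dominant period (in samples) from the DFT magnitude.

    A naive O(n^2) DFT stands in for an FFT here purely to keep the sketch
    dependency-free; the mean is removed so the DC term cannot dominate.
    """
    n = len(x)
    mean = sum(x) / n
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) > best_amp:
            best_k, best_amp = k, abs(coeff)
    return round(n / best_k)


def fold_to_2d(x, period):
    """Fold a 1D series into rows of length `period`, dropping any remainder."""
    rows = len(x) // period
    return [x[r * period:(r + 1) * period] for r in range(rows)]


if __name__ == "__main__":
    series = [math.sin(2 * math.pi * t / 8) for t in range(64)]
    p = dominant_period(series)
    grid = fold_to_2d(series, p)
    print(p, len(grid), len(grid[0]))
```

In the folded layout, each row holds one cycle and each column holds the same phase across cycles, which is what makes 2D feature extractors applicable to a 1D signal.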
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52308340), the Innovative Projects of Universities in Guangdong (Grant No. 2022KTSCX208), and the Sichuan Transportation Science and Technology Project (Grant No. 2018-ZL-01).
Abstract: Historically, landslides have been the primary type of geological disaster worldwide. The stability of reservoir banks is primarily affected by rainfall and reservoir water level fluctuations, and it changes with the long-term dynamics of external disaster-causing factors. Assessing the time-varying reliability of reservoir landslides therefore remains a challenge. In this paper, a machine learning (ML) based approach is proposed to analyze the long-term reliability of reservoir bank landslides in spatially variable soils through time series prediction. This study systematically investigates the prediction performance of three ML algorithms: multilayer perceptron (MLP), convolutional neural network (CNN), and long short-term memory (LSTM). Additionally, the effects of data quantity and data ratio on the predictive power of the deep learning models are considered. The results show that all three ML models can accurately depict the changes in the time-varying failure probability of reservoir landslides. The CNN model outperforms both the MLP and LSTM models in predicting the failure probability. Furthermore, selecting a suitable data ratio can improve the accuracy of the failure probability predicted by ML models.
Abstract: Nonlinear time series analysis is based on phase-space reconstruction theory. Its main purpose is to study how to transform a response signal into a reconstructed phase space in order to extract dynamic feature information, and to provide an effective approach for nonlinear signal analysis and fault diagnosis of nonlinear dynamic systems. It has already become an important branch of nonlinear science. However, traditional methods cannot extract chaos features automatically and require human participation throughout the process. A new method is put forward that extracts chaos features from nonlinear time series automatically. First, the time delay τ is determined by the autocorrelation method; second, the embedding dimension m and the correlation dimension D are computed; third, the maximum Lyapunov exponent λmax is computed; finally, the chaos degree Dch of the Poincaré map, and the non-circle degree Dnc and non-order degree Dno of the quasi-phase orbit, are calculated. Chaos feature extraction is important for fault diagnosis of nonlinear systems based on nonlinear chaos features. Examples show the validity of the proposed method.
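The first step of the procedure in this abstract, choosing the time delay by the autocorrelation method and rebuilding the phase space, can be sketched in Python as follows. The abstract does not give the stopping rule, so the sketch assumes a common convention: take the first lag at which the autocorrelation falls to 1/e of its zero-lag value. The function names and the `threshold` parameter are hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Normalized autocorrelation of x at the given lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


def estimate_delay(x, threshold=1 / math.e, max_lag=None):
    """Return the first lag whose autocorrelation is <= threshold.

    The 1/e threshold is an assumed convention, not taken from the abstract.
    Returns None if no lag up to max_lag meets the threshold.
    """
    max_lag = max_lag or len(x) // 2
    for lag in range(1, max_lag + 1):
        if autocorrelation(x, lag) <= threshold:
            return lag
    return None


def delay_embed(x, m, tau):
    """Build delay vectors [x[i], x[i+tau], ..., x[i+(m-1)*tau]]."""
    n_vectors = len(x) - (m - 1) * tau
    return [[x[i + k * tau] for k in range(m)] for i in range(n_vectors)]


if __name__ == "__main__":
    signal = [math.sin(2 * math.pi * i / 40) for i in range(400)]
    tau = estimate_delay(signal)
    vectors = delay_embed(signal, m=3, tau=tau)
    print(tau, len(vectors))
```

The embedding dimension m is passed in by hand here; the abstract computes it in a separate step whose method is not specified.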
Funding: Sponsored by a project of the National Natural Science Foundation (51608007) and the "Young Top-notch Talent Support Plan" of North China University of Technology.
Abstract: Fukou Xing, one of the eight Taihang passes, is located in the south of the Taihang Mountains and has historically been an important passage between Shanxi and Hebei. Taking the traditional settlements of the Fukou Xing region as the research object, and using the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM), a remote sensing elevation data set, together with a GIS platform, this paper makes a quantitative study of traditional settlement space in the mountain environment of this region. It examines spatial parameters including elevation, terrain, aspect, and boundary, and summarizes the observed spatial features. In addition, based on local chronicles of the Ming and Qing dynasties, it cross-checks the quantitative conclusions against qualitative understanding, analyzes the evolution of traditional settlements in the Fukou Xing region, and arrives at a new understanding of their spatial features.
Funding: Strategic Priority Research Program of the Chinese Academy of Sciences, No. XDA19040402; National Natural Science Foundation of China, No. 41771180, No. 41661144023, No. 41701165.
Abstract: As a special outcome of urbanization, mega-towns not only play an important role in socio-economic development but are also important contributors to urbanization. Based on a spatial database of mega-towns in China, this paper explores the spatial distribution features and growth mechanisms of China's 238 mega-towns using the nearest neighbour distance method, kernel density estimation, regression analysis, global autocorrelation, local autocorrelation and other spatial analysis methods. The spatial distribution results show that: (1) on the national scale, the existing 238 mega-towns are mainly gathered in the southeast coastal areas of China, forming two spatial core agglomerations, several secondary ones and a southeast coastal agglomeration belt; (2) on the regional scale, each economic region's index is less than 1, indicating that mega-towns in each region tend to be spatially agglomerated, which is closely related to the regional development level and the number of mega-towns; (3) on the provincial scale, 68% of provincial-level units in China show a spatial agglomeration of mega-towns, only one province has a random distribution, and the number of mega-towns in the evenly distributed provinces is generally small. The growth of mega-towns is determined by a combination of natural and human factors, including topography, location, economy, population, traffic, and national policy. This paper uses the digital elevation model (DEM), location advantage, economic density, population density, and highway density as the corresponding quantitative indicators. Local autocorrelation analysis shows that these factors all have a certain influence on the spatial growth of mega-towns and jointly shape it. In the future, provinces and cities, especially those in central and western China, should make full use of mega-town functions to promote their socioeconomic development.
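The "index less than 1" reading in result (2) is consistent with the standard nearest neighbour index, which is presumably what the nearest neighbour distance method produces here. A minimal Python sketch under that assumption: the index is the observed mean nearest-neighbour distance divided by the value expected for a random pattern of the same density, 0.5 / sqrt(n / A). Values below 1 suggest clustering, values near 1 a random pattern, and values above 1 dispersion. The function name, the planar coordinates, and the study-area value are illustrative only.

```python
import math


def nearest_neighbour_index(points, area):
    """Nearest neighbour index for planar points within a study area.

    points: list of (x, y) tuples in the same length unit as sqrt(area)
    area:   size of the study region
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    observed = total / n
    expected = 0.5 / math.sqrt(n / area)   # mean distance under randomness
    return observed / expected


if __name__ == "__main__":
    clustered = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1)]
    dispersed = [(2.5, 2.5), (7.5, 2.5), (2.5, 7.5), (7.5, 7.5)]
    print(nearest_neighbour_index(clustered, area=100.0))
    print(nearest_neighbour_index(dispersed, area=100.0))
```

This sketch ignores edge corrections and map projection, both of which matter for real town coordinates.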
Abstract: Obstructive Sleep Apnea (OSA) is a respiratory syndrome caused by insufficient airflow through the respiratory tract or by respiratory arrest during sleep, and sometimes by reduced oxygen saturation. The aim of this paper is to analyze a person's respiratory signal in order to distinguish normal breathing activity from Sleep Apnea (SA) activity. In the proposed method, time domain and frequency domain features are extracted from the respiration signal obtained from a PPG device. These features are fed to a Classification and Regression Tree (CART)-Particle Swarm Optimization (PSO) classifier, which classifies the signal as a normal breathing signal or a sleep apnea signal. The proposed method is validated in terms of sensitivity, specificity, accuracy and F1 score by applying the time domain and frequency domain features separately. Additionally, the performance of the CART-PSO (CPSO) classification algorithm is evaluated by comparing its measures with those of existing classification algorithms. The effect of the PSO algorithm on the classifier is also validated by varying the PSO parameters.
Funding: Supported by the National Natural Science Foundation of China (No. 62241109) and the Tianjin Science and Technology Commissioner Project (No. 20YDTPJC01110).
Abstract: An improved model based on you only look once version 8 (YOLOv8) is proposed to address the low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, which enriches the semantic and spatial information on the feature map and improves the target detection ability of the model. Second, a multi atrous spatial pyramid pooling (MASPP) module is designed using atrous convolution and a feature pyramid structure to extract multi-scale features, improving the model's ability to handle multi-scale objects. The experimental results show that, on the DIOR dataset, the detection accuracy of the improved YOLOv8 model is 92% and its mean average precision (mAP) is 87.9%, which are 3.5% and 1.7% higher, respectively, than those of the original model. This demonstrates that the proposed model improves detection and classification of multi-scale targets in optical remote sensing images.
Funding: Shenzhen Institute of Artificial Intelligence and Robotics for Society, Grant/Award Number: AC01202201003-02; GuangDong Basic and Applied Basic Research Foundation, Grant/Award Number: 2024A1515010252; Longgang District Shenzhen's "Ten Action Plan" for Supporting Innovation Projects, Grant/Award Number: LGKCSDPT2024002.
Abstract: Audio-visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial-temporal relationships exhibited by audio-visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. Recent studies have predominantly focused on extracting features from diverse neural network structures, neglecting the acquisition of semantically meaningful regions and crucial components within audio-visual data. The authors present a feature pyramid attention network (FPANet) for audio-visual scene understanding, which extracts semantically significant characteristics from audio-visual data. The approach builds multi-scale hierarchical features of sound spectrograms and visual images using a feature pyramid representation and localises the semantically relevant regions with a feature pyramid attention module (FPAM). A dimension alignment (DA) strategy is employed to align feature maps from multiple layers, a pyramid spatial attention (PSA) to spatially locate essential regions, and a pyramid channel attention (PCA) to pinpoint significant temporal frames. Experiments on visual scene classification (VSC), audio scene classification (ASC), and AVSC tasks demonstrate that FPANet achieves performance on par with state-of-the-art (SOTA) approaches, with a 95.9 F1-score on the ADVANCE dataset and a relative improvement of 28.8%. Visualisation results show that FPANet can prioritise semantically meaningful areas in audio-visual signals.
Funding: Supported by the Joint Civil Aviation Fund of the National Natural Science Foundation of China (Nos. U1533108 and U1233112).
Abstract: The theoretical positioning accuracy of multilateration (MLAT) with the time difference of arrival (TDOA) algorithm is very high, but problems arise in practical applications. Here we analyze the location performance of the time sum of arrival (TSOA) algorithm in terms of the root mean square error (RMSE) and the geometric dilution of precision (GDOP) in an additive white Gaussian noise (AWGN) environment. The TSOA localization model is constructed and used to present the distribution of the location ambiguity region with four base stations. The location performance analysis then starts from the four-base-station case by calculating the variation of the RMSE and GDOP. Subsequently, the way the performance of the TSOA location algorithm changes is shown as location parameters such as the number and layout of base stations are varied, which reveals the characteristics and performance of TSOA location. The trends of the RMSE and GDOP demonstrate the anti-noise performance and robustness of the TSOA localization algorithm. The anti-noise performance of TSOA can be used to reduce the blind zone and the false location rate of MLAT systems.
Abstract: Extreme weather events such as persistent high temperatures, heavy rains and sudden cold waves in Shanxi Province, China, have caused great losses and disasters to people's production and life. It is of great practical significance to study the temporal and spatial distribution characteristics of extreme weather events and their circulation background field. We selected daily high temperature data (≥35°C), daily minimum temperature data and daily precipitation data (≥50 mm) from 109 meteorological stations in Shanxi Province from 1981 to 2010. A period in which the temperature is ≥35°C for more than 3 days is defined as a high temperature extreme weather event. A station with 24-hour cumulative precipitation ≥50 mm on a given day (20:00 to 20:00, Beijing time) is defined as having rainstorm weather. Cold air activity in which the daily minimum temperature drops by more than 8°C in 24 hours, or by 10°C in 48 hours, with a daily minimum temperature of ≤4°C, is defined as a cold weather process. We count the extreme weather events in Shanxi, including persistent high temperatures, heavy rains and cold weather with frost, and statistically analyze their temporal and spatial distribution characteristics, their trends and the general circulation background. We identify the common features of the large-scale circulation background field across the various types of extreme weather events and summarize the large-scale circulation characteristics of such events. The results provide a reference for future weather forecasting.
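The three event definitions in this abstract translate directly into threshold tests on daily station series. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: "more than 3 days" is read as at least 4 consecutive days, the 24-hour drop is taken as strictly greater than 8°C, and the 48-hour drop as at least 10°C. All function names and the sample values are hypothetical.

```python
def high_temp_events(tmax, threshold=35.0, min_days=4):
    """Return (start, end) index pairs of runs with tmax >= threshold.

    min_days=4 encodes one reading of "for more than 3 days" (an assumption).
    """
    events, start = [], None
    for i, t in enumerate(tmax):
        if t >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_days:
                events.append((start, i - 1))
            start = None
    # Close a run that lasts until the end of the series.
    if start is not None and len(tmax) - start >= min_days:
        events.append((start, len(tmax) - 1))
    return events


def rainstorm_days(precip, threshold=50.0):
    """Indices of days whose 24-hour precipitation total is >= threshold (mm)."""
    return [i for i, p in enumerate(precip) if p >= threshold]


def cold_wave_days(tmin, drop24=8.0, drop48=10.0, floor=4.0):
    """Indices of days meeting the cold weather criteria on daily minimum temperature."""
    days = []
    for i in range(1, len(tmin)):
        d24 = tmin[i - 1] - tmin[i]
        d48 = tmin[i - 2] - tmin[i] if i >= 2 else float("-inf")
        if (d24 > drop24 or d48 >= drop48) and tmin[i] <= floor:
            days.append(i)
    return days


if __name__ == "__main__":
    tmax = [33, 35, 36, 37, 35.5, 30, 36, 36, 36, 20]
    tmin = [12, 3, 5, 10, 6, 0]
    precip = [0, 12.5, 50, 75.2, 49.9]
    print(high_temp_events(tmax))
    print(cold_wave_days(tmin))
    print(rainstorm_days(precip))
```

Counting the returned events per station and per year gives the kind of frequency series on which the temporal and spatial statistics described above would be computed.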