Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combin...Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal feature and global positional distribution information(PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform(3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis(PCA) method is utilized twice on the 3D SIFT descriptors of single frame and multiple frames. Simultaneously, the PDI of the interest points are computed and combined with the above features. The combined features are quantified and selected and finally tested by using the support vector machine(SVM) recognition algorithm on the public KTH dataset. The testing results have showed that the recognition rate has been significantly improved and the proposed features can more accurately describe human motion with high adaptability to scenarios.展开更多
False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading fail...False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading failures,large-scale blackouts,and significant economic losses.While detecting attacks is important,accurately localizing compromised nodes or measurements is even more critical,as it enables timely mitigation,targeted response,and enhanced system resilience beyond what detection alone can offer.Existing research typically models topological features using fixed structures,which can introduce irrelevant information and affect the effectiveness of feature extraction.To address this limitation,this paper proposes an FDIA localization model with adaptive neighborhood selection,which dynamically captures spatial dependencies of the power grid by adjusting node relationships based on data-driven similarities.The improved Transformer is employed to pre-fuse global spatial features of the graph,enriching the feature representation.To improve spatio-temporal correlation extraction for FDIA localization,the proposed model employs dilated causal convolution with a gating mechanism combined with graph convolution to capture and fuse long-range temporal features and adaptive topological features.This fully exploits the temporal dynamics and spatial dependencies inherent in the power grid.Finally,multi-source information is integrated to generate highly robust node embeddings,enhancing FDIA detection and localization.Experiments are conducted on IEEE 14,57,and 118-bus systems,and the results demonstrate that the proposed model substantially improves the accuracy of FDIA localization.Additional experiments are conducted to verify the effectiveness and robustness of the proposed model.展开更多
The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection m...The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity.This paper proposed a method for analyzing the fluidity using videos of CWS dripping processes.By integrating the temporal and spatial features of each frame in the video,a multi-cascade classifier for CWS fluidity is established.The classifier distinguishes between four levels(A,B,C,and D)based on the quality of fluidity.The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm.Subsequently,convolutional neural networks(CNN)and long short-term memory(LSTM)are utilized to further differentiate between the B and C categories which are prone to confusion.Finally,through detailed comparative experiments,the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution.The proposed method achieves an accuracy rate of over 90%in determining the fluidity of CWS,serving as a technical reference for future industrial applications.展开更多
Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LM...Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LMRB's terrestrial water storage anomaly(TWSA)and its spatio-temporal dynamics during 2002–2020.Considering the effects of natural factors and human activities,the respective contributions of climate variability and human activities to terrestrial water storage change(TWSC)were separated.Results showed that:(1)LMRB's TWSA decreased by 0.3158 cm/a.(2)TWSA showed a gradual increase in distribution from southwest of MRB to middle LMRB and from northeast of LRB to middle LMRB.TWSA positively changed in Myanmar while slightly changed in Laos and China.It negatively changed in Vietnam,Thailand and Cambodia.(3)TWSA components decreased in a descending order of soil moisture,groundwater and precipitation.(4)Natural factors had a substantial and spatial differentiated influence on TWSA over the LMRB.(5)Climate variability contributed 79%of TWSC in the LMRB while human activities contributed 21%with an increasing impact after 2008.The TWSC of upstream basin countries was found to be controlled by climate variability while Vietnam and Cambodia's TWSC has been controlled by human activities since 2012.展开更多
The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to u...The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.展开更多
In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Ela...In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.展开更多
BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features ...BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning.AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features.METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution.Pathological types were determined by histopathological examination of the surgical spe-cimens or biopsy samples.The imaging features were assessed using computed tomography,magnetic resonance imaging,and endoscopic ultrasound.Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics.RESULTS There were 320(64%)cases of pancreatic ductal adenocarcinoma,75(15%)of intraductal papillary mucinous neoplasms,50(10%)of neuroendocrine tumors,and 55(11%)of other rare types.Distinct imaging features were identified in each pathological type.Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography,whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules.Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging.Statistical analysis revealed significant correlations between specific imaging features and pathological types(P<0.001).CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features.These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.展开更多
During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 202...During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 2024 presidential election of the United States has plunged the U.S.-Europe relations into more gloomy waters,ushering in a more complex and turbulent period of adjustment.展开更多
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method f...The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.展开更多
Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB ...Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB incidence were mapped using heat maps and hierarchical clustering.Socioenvironmental influencing factors were evaluated using a Bayesian spatiotemporal conditional autoregressive(ST-CAR)model.Results Annual incidence of TB in Guangdong decreased from 91.85/100,000 in 2010 to 53.06/100,000in 2019.Spatial hotspots were found in northeastern Guangdong,particularly in Heyuan,Shanwei,and Shantou,while Shenzhen,Dongguan,and Foshan had the lowest rates in the Pearl River Delta.The STCAR model showed that the TB risk was lower with higher per capita Gross Domestic Product(GDP)[Relative Risk(RR),0.91;95%Confidence Interval(CI):0.86–0.98],more the ratio of licensed physicians and physician(RR,0.94;95%CI:0.90-0.98),and higher per capita public expenditure(RR,0.94;95%CI:0.90–0.97),with a marginal effect of population density(RR,0.86;95%CI:0.86–1.00).Conclusion The incidence of TB in Guangdong varies spatially and temporally.Areas with poor economic conditions and insufficient healthcare resources are at an increased risk of TB infection.Strategies focusing on equitable health resource distribution and economic development are the key to TB control.展开更多
Smart contracts are widely used on the blockchain to implement complex transactions,such as decentralized applications on Ethereum.Effective vulnerability detection of large-scale smart contracts is critical,as attack...Smart contracts are widely used on the blockchain to implement complex transactions,such as decentralized applications on Ethereum.Effective vulnerability detection of large-scale smart contracts is critical,as attacks on smart contracts often cause huge economic losses.Since it is difficult to repair and update smart contracts,it is necessary to find the vulnerabilities before they are deployed.However,code analysis,which requires traversal paths,and learning methods,which require many features to be trained,are too time-consuming to detect large-scale on-chain contracts.Learning-based methods will obtain detection models from a feature space compared to code analysis methods such as symbol execution.But the existing features lack the interpretability of the detection results and training model,even worse,the large-scale feature space also affects the efficiency of detection.This paper focuses on improving the detection efficiency by reducing the dimension of the features,combined with expert knowledge.In this paper,a feature extraction model Block-gram is proposed to form low-dimensional knowledge-based features from bytecode.First,the metadata is separated and the runtime code is converted into a sequence of opcodes,which are divided into segments based on some instructions(jumps,etc.).Then,scalable Block-gram features,including 4-dimensional block features and 8-dimensional attribute features,are mined for the learning-based model training.Finally,feature contributions are calculated from SHAP values to measure the relationship between our features and the results of the detection model.In addition,six types of vulnerability labels are made on a dataset containing 33,885 contracts,and these knowledge-based features are evaluated using seven state-of-the-art learning algorithms,which show that the average detection latency speeds up 25×to 650×,compared with the features extracted by N-gram,and also can enhance the interpretability of the detection model.展开更多
This study examines the effects of rapid land use changes in India,with a specific focus on Sonipat District in Haryana—a region undergoing significant urban expansion.Over the past two decades,rural landscapes in So...This study examines the effects of rapid land use changes in India,with a specific focus on Sonipat District in Haryana—a region undergoing significant urban expansion.Over the past two decades,rural landscapes in Sonipat have undergone notable transformation,as open spaces and agricultural lands are increasingly converted into residential colonies,commercial hubs,and industrial zones.While such changes reflect economic development and urban growth,they also raise critical concerns about sustainability,especially in terms of food security,groundwater depletion,and environmental degradation.The study examines land use changes between 2000 and 2024 using remote sensing techniques and spatial analysis.It further incorporates secondary data and insights from community-level interactions to assess the socio-economic and ecological impacts of this transformation.The findings indicate rising land fragmentation,loss of agricultural livelihoods,pressure on civic infrastructure,and increasing pollution—factors that threaten long-term regional sustainability.The study underscores the urgent need to reconcile urban development with environmental and social sustainability.By offering a detailed case study of Sonipat,this research contributes to the broader discourse on India’s urbanisation pathways.It aims to provide policymakers,planners,and researchers with evidence-based recommendations to manage land transitions more responsibly,promoting urban growth models that ensure ecological integrity,equitable development,and long-term resilience.展开更多
Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework...Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework for human gait classification in video sequences using deep learning(DL)fusion assisted and posterior probability-based moth flames optimization(MFO)is proposed.In the first step,the video frames are resized and finetuned by two pre-trained lightweight DL models,EfficientNetB0 and MobileNetV2.Both models are selected based on the top-5 accuracy and less number of parameters.Later,both models are trained through deep transfer learning and extracted deep features fused using a voting scheme.In the last step,the authors develop a posterior probabilitybased MFO feature selection algorithm to select the best features.The selected features are classified using several supervised learning methods.The CASIA-B publicly available dataset has been employed for the experimental process.On this dataset,the authors selected six angles such as 0°,18°,90°,108°,162°,and 180°and obtained an average accuracy of 96.9%,95.7%,86.8%,90.0%,95.1%,and 99.7%.Results demonstrate comparable improvement in accuracy and significantly minimize the computational time with recent state-of-the-art techniques.展开更多
Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution(HSI SR) has been a very attractive research topic in computer vi...Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution(HSI SR) has been a very attractive research topic in computer vision, attracting the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network(MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial,spectral, and frequency multidimensional features of HSI. Spatial features contain rich local details,spectral features contain the information and correlation between spectral bands, and frequency feature can reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution, to obtain higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module(FFEM) to extract the frequency feature. On this basis, a multidimensional features extraction module(MFEM) is designed to learn and fuse multidimensional features. In addition, experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.展开更多
Recognizing road scene context from a single image remains a critical challenge for intelligent autonomous driving systems,particularly in dynamic and unstructured environments.While recent advancements in deep learni...Recognizing road scene context from a single image remains a critical challenge for intelligent autonomous driving systems,particularly in dynamic and unstructured environments.While recent advancements in deep learning have significantly enhanced road scene classification,simultaneously achieving high accuracy,computational efficiency,and adaptability across diverse conditions continues to be difficult.To address these challenges,this study proposes HybridLSTM,a novel and efficient framework that integrates deep learning-based,object-based,and handcrafted feature extraction methods within a unified architecture.HybridLSTM is designed to classify four distinct road scene categories—crosswalk(CW),highway(HW),overpass/tunnel(OP/T),and parking(P)—by leveraging multiple publicly available datasets,including Places-365,BDD100K,LabelMe,and KITTI,thereby promoting domain generalization.The framework fuses object-level features extracted using YOLOv5 and VGG19,scene-level global representations obtained from a modified VGG19,and fine-grained texture features captured through eight handcrafted descriptors.This hybrid feature fusion enables the model to capture both semantic context and low-level visual cues,which are critical for robust scene understanding.To model spatial arrangements and latent sequential dependencies present even in static imagery,the combined features are processed through a Long Short-Term Memory(LSTM)network,allowing the extraction of discriminative patterns across heterogeneous feature spaces.Extensive experiments conducted on 2725 annotated road scene images,with an 80:20 training-to-testing split,validate the effectiveness of the proposed model.HybridLSTM achieves a classification accuracy of 96.3%,a precision of 95.8%,a recall of 96.1%,and an F1-score of 96.0%,outperforming several existing state-of-the-art methods.These results demonstrate the robustness,scalability,and generalization capability of HybridLSTM across varying environments and scene complexities.Moreover,the framework is optimized to balance classification performance with computational efficiency,making it highly suitable for real-time deployment in embedded autonomous driving systems.Future work will focus on extending the model to multi-class detection within a single frame and optimizing it further for edge-device deployments to reduce computational overhead in practical applications.展开更多
As Deepfake technology continues to evolve,the distinction between real and fake content becomes increasingly blurred.Most existing Deepfake video detectionmethods rely on single-frame facial image features,which limi...As Deepfake technology continues to evolve,the distinction between real and fake content becomes increasingly blurred.Most existing Deepfake video detectionmethods rely on single-frame facial image features,which limits their ability to capture temporal differences between frames.Current methods also exhibit limited generalization capabilities,struggling to detect content generated by unknown forgery algorithms.Moreover,the diversity and complexity of forgery techniques introduced by Artificial Intelligence Generated Content(AIGC)present significant challenges for traditional detection frameworks,whichmust balance high detection accuracy with robust performance.To address these challenges,we propose a novel Deepfake detection framework that combines a two-stream convolutional network with a Vision Transformer(ViT)module to enhance spatio-temporal feature representation.The ViT model extracts spatial features from the forged video,while the 3D convolutional network captures temporal features.The 3D convolution enables cross-frame feature extraction,allowing the model to detect subtle facial changes between frames.The confidence scores from both the ViT and 3D convolution submodels are fused at the decision layer,enabling themodel to effectively handle unknown forgery techniques.Focusing on Deepfake videos and GAN-generated images,the proposed approach is evaluated on two widely used public face forgery datasets.Compared to existing state-of-theartmethods,it achieves higher detection accuracy and better generalization performance,offering a robust solution for deepfake detection in real-world scenarios.展开更多
Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decode...Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the issue of information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features to provide superior features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the problem of insufficient constraint force in predicting results with traditional methods. This loss function enhances the accuracy of model classification predictions and improves the proximity of regression position predictions to ground truth objects. The proposed method model is evaluated on the popular dataset UCF101-24 and JHMDB-21. Experimental results demonstrate that the proposed method achieves an accuracy of 81.52% on the Frame-mAP metric, surpassing current existing methods.展开更多
Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale Electrocardiogram data and rare arrhythmia events in imbalanced datasets. These methods fail to...Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale Electrocardiogram data and rare arrhythmia events in imbalanced datasets. These methods fail to perform multi-perspective learning of temporal signals and Electrocardiogram images, nor can they fully extract the latent information within the data, falling short of the accuracy required by clinicians. Therefore, this paper proposes an innovative hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the Electrocardiogram data and improving the model’s ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.展开更多
Exploring the spatial evolution patterns of land use in creative urban tourism complexes provides theoretical and decision-making support to foster creative tourism projects.This study focuses on the Hangzhou Leisure ...Exploring the spatial evolution patterns of land use in creative urban tourism complexes provides theoretical and decision-making support to foster creative tourism projects.This study focuses on the Hangzhou Leisure Expo Garden as a case study,utilizing a land use change index model to analyze the spatial evolution characteristics and dynamic processes of creative urban tourism complexes,as well as to explore their spatial differentiation mechanisms.The analysis indicates that Hangzhou Leisure Expo Garden,initially a derelict industrial area dominated by production and residential land use,has evolved into a creative urban tourism complex with tourism comprehensive service land at its core,going through the pattern evolution processes of“constrained sprawl,”“intensive expansion,”and“random integration.”From the perspective of tourism human-land relationships,the formation of land use evolution patterns in creative urban tourism complexes results from various stakeholders(government,tourism enterprises,residents,tourists,etc.),as humanistic factors,continuously adapting to specific urban spaces,which are considered as geographical elements and have locational advantages and are oriented towards economic and social values.Based on the acquisition of stakeholder interests,the transformation of resource-disadvantaged areas into tourism advantage areas is facilitated,thereby achieving the re-creation of tourism creative space and promoting intensive spatial growth.展开更多
Remote sensing cross-modal image-text retrieval(RSCIR)can flexibly and subjectively retrieve remote sensing images utilizing query text,which has received more researchers’attention recently.However,with the increasi...Remote sensing cross-modal image-text retrieval(RSCIR)can flexibly and subjectively retrieve remote sensing images utilizing query text,which has received more researchers’attention recently.However,with the increasing volume of visual-language pre-training model parameters,direct transfer learning consumes a substantial amount of computational and storage resources.Moreover,recently proposed parameter-efficient transfer learning methods mainly focus on the reconstruction of channel features,ignoring the spatial features which are vital for modeling key entity relationships.To address these issues,we design an efficient transfer learning framework for RSCIR,which is based on spatial feature efficient reconstruction(SPER).A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships.The spatial adapter is able to spatially reconstruct the features in the backbone with few parameters while incorporating the prior information from the channel dimension.We conduct quantitative and qualitative experiments on two different commonly used RSCIR datasets.Compared with traditional methods,our approach achieves an improvement of 3%-11% in sumR metric.Compared with methods finetuning all parameters,our proposed method only trains less than 1% of the parameters,while maintaining an overall performance of about 96%.展开更多
基金supported by National Natural Science Foundation of China(No.61103123)Scientific Research Foundation for the Returned Overseas Chinese Scholars,State Education Ministry
文摘Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal feature and global positional distribution information(PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform(3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis(PCA) method is utilized twice on the 3D SIFT descriptors of single frame and multiple frames. Simultaneously, the PDI of the interest points are computed and combined with the above features. The combined features are quantified and selected and finally tested by using the support vector machine(SVM) recognition algorithm on the public KTH dataset. The testing results have showed that the recognition rate has been significantly improved and the proposed features can more accurately describe human motion with high adaptability to scenarios.
基金supported by National Key Research and Development Plan of China(No.2022YFB3103304).
文摘False Data Injection Attacks(FDIAs)pose a critical security threat to modern power grids,corrupting state estimation and enabling malicious control actions that can lead to severe consequences,including cascading failures,large-scale blackouts,and significant economic losses.While detecting attacks is important,accurately localizing compromised nodes or measurements is even more critical,as it enables timely mitigation,targeted response,and enhanced system resilience beyond what detection alone can offer.Existing research typically models topological features using fixed structures,which can introduce irrelevant information and affect the effectiveness of feature extraction.To address this limitation,this paper proposes an FDIA localization model with adaptive neighborhood selection,which dynamically captures spatial dependencies of the power grid by adjusting node relationships based on data-driven similarities.The improved Transformer is employed to pre-fuse global spatial features of the graph,enriching the feature representation.To improve spatio-temporal correlation extraction for FDIA localization,the proposed model employs dilated causal convolution with a gating mechanism combined with graph convolution to capture and fuse long-range temporal features and adaptive topological features.This fully exploits the temporal dynamics and spatial dependencies inherent in the power grid.Finally,multi-source information is integrated to generate highly robust node embeddings,enhancing FDIA detection and localization.Experiments are conducted on IEEE 14,57,and 118-bus systems,and the results demonstrate that the proposed model substantially improves the accuracy of FDIA localization.Additional experiments are conducted to verify the effectiveness and robustness of the proposed model.
基金supported by the Youth Fund of the National Natural Science Foundation of China(No.52304311)the National Natural Science Foundation of China(No.52274282)the Postdoctoral Fellowship Program of CPSF(No.GZC20233016)。
文摘The fluidity of coal-water slurry(CWS)is crucial for various industrial applications such as long-distance transportation,gasification,and combustion.However,there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity.This paper proposed a method for analyzing the fluidity using videos of CWS dripping processes.By integrating the temporal and spatial features of each frame in the video,a multi-cascade classifier for CWS fluidity is established.The classifier distinguishes between four levels(A,B,C,and D)based on the quality of fluidity.The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm.Subsequently,convolutional neural networks(CNN)and long short-term memory(LSTM)are utilized to further differentiate between the B and C categories which are prone to confusion.Finally,through detailed comparative experiments,the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution.The proposed method achieves an accuracy rate of over 90%in determining the fluidity of CWS,serving as a technical reference for future industrial applications.
基金National Natural Science Foundation of China,No.42161006Yunnan Fundamental Research Projects No.202201AT070094,No.202301BF070001-004+1 种基金Special Project for High-level Talents of Yunnan Province for Young Top Talents,No.C6213001159European Research Council(ERC)Starting-Grant STORIES,No.101040939。
文摘Due to water conflicts and allocation in the Lancang-Mekong River Basin(LMRB),the spatio-temporal differentiation of total water resources and the natural-human influence need to be clarified.This work investigated LMRB's terrestrial water storage anomaly(TWSA)and its spatio-temporal dynamics during 2002–2020.Considering the effects of natural factors and human activities,the respective contributions of climate variability and human activities to terrestrial water storage change(TWSC)were separated.Results showed that:(1)LMRB's TWSA decreased by 0.3158 cm/a.(2)TWSA showed a gradual increase in distribution from southwest of MRB to middle LMRB and from northeast of LRB to middle LMRB.TWSA positively changed in Myanmar while slightly changed in Laos and China.It negatively changed in Vietnam,Thailand and Cambodia.(3)TWSA components decreased in a descending order of soil moisture,groundwater and precipitation.(4)Natural factors had a substantial and spatial differentiated influence on TWSA over the LMRB.(5)Climate variability contributed 79%of TWSC in the LMRB while human activities contributed 21%with an increasing impact after 2008.The TWSC of upstream basin countries was found to be controlled by climate variability while Vietnam and Cambodia's TWSC has been controlled by human activities since 2012.
文摘The ability to accurately predict urban traffic flows is crucial for optimising city operations.Consequently,various methods for forecasting urban traffic have been developed,focusing on analysing historical data to understand complex mobility patterns.Deep learning techniques,such as graph neural networks(GNNs),are popular for their ability to capture spatio-temporal dependencies.However,these models often become overly complex due to the large number of hyper-parameters involved.In this study,we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks(DMST-GNODE),a framework based on ordinary differential equations(ODEs)that autonomously discovers effective spatial-temporal graph neural network(STGNN)architectures for traffic prediction tasks.The comparative analysis of DMST-GNODE and baseline models indicates that DMST-GNODE model demonstrates superior performance across multiple datasets,consistently achieving the lowest Root Mean Square Error(RMSE)and Mean Absolute Error(MAE)values,alongside the highest accuracy.On the BKK(Bangkok)dataset,it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval,maintaining this trend across 40 and 60 min.Similarly,on the PeMS08 dataset,DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min,demonstrating its effectiveness over longer periods.The Los_Loop dataset results further emphasise this model’s advantage,with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min,consistently maintaining superiority across all time intervals.These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
文摘In the article“A Lightweight Approach for Skin Lesion Detection through Optimal Features Fusion”by Khadija Manzoor,Fiaz Majeed,Ansar Siddique,Talha Meraj,Hafiz Tayyab Rauf,Mohammed A.El-Meligy,Mohamed Sharaf,Abd Elatty E.Abd Elgawad Computers,Materials&Continua,2022,Vol.70,No.1,pp.1617–1630.DOI:10.32604/cmc.2022.018621,URL:https://www.techscience.com/cmc/v70n1/44361,there was an error regarding the affiliation for the author Hafiz Tayyab Rauf.Instead of“Centre for Smart Systems,AI and Cybersecurity,Staffordshire University,Stoke-on-Trent,UK”,the affiliation should be“Independent Researcher,Bradford,BD80HS,UK”.
文摘BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide,with a poor prognosis often attributed to late diagnosis.Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning.AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features.METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution.Pathological types were determined by histopathological examination of the surgical spe-cimens or biopsy samples.The imaging features were assessed using computed tomography,magnetic resonance imaging,and endoscopic ultrasound.Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics.RESULTS There were 320(64%)cases of pancreatic ductal adenocarcinoma,75(15%)of intraductal papillary mucinous neoplasms,50(10%)of neuroendocrine tumors,and 55(11%)of other rare types.Distinct imaging features were identified in each pathological type.Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography,whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules.Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging.Statistical analysis revealed significant correlations between specific imaging features and pathological types(P<0.001).CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features.These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
文摘During Donald Trump’s first term,the“Trump Shock”brought world politics into an era of uncertainties and pulled the transatlantic alliance down to its lowest point in history.The Trump 2.0 tsunami brewed by the 2024 presidential election of the United States has plunged the U.S.-Europe relations into more gloomy waters,ushering in a more complex and turbulent period of adjustment.
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金supported by the Guangdong Provincial Clinical Research Center for Tuberculosis(No.2020B1111170014)。
文摘Objective To investigate the spatiotemporal patterns and socioeconomic factors influencing the incidence of tuberculosis(TB)in the Guangdong Province between 2010 and 2019.Method Spatial and temporal variations in TB incidence were mapped using heat maps and hierarchical clustering.Socioenvironmental influencing factors were evaluated using a Bayesian spatiotemporal conditional autoregressive(ST-CAR)model.Results Annual incidence of TB in Guangdong decreased from 91.85/100,000 in 2010 to 53.06/100,000in 2019.Spatial hotspots were found in northeastern Guangdong,particularly in Heyuan,Shanwei,and Shantou,while Shenzhen,Dongguan,and Foshan had the lowest rates in the Pearl River Delta.The STCAR model showed that the TB risk was lower with higher per capita Gross Domestic Product(GDP)[Relative Risk(RR),0.91;95%Confidence Interval(CI):0.86–0.98],more the ratio of licensed physicians and physician(RR,0.94;95%CI:0.90-0.98),and higher per capita public expenditure(RR,0.94;95%CI:0.90–0.97),with a marginal effect of population density(RR,0.86;95%CI:0.86–1.00).Conclusion The incidence of TB in Guangdong varies spatially and temporally.Areas with poor economic conditions and insufficient healthcare resources are at an increased risk of TB infection.Strategies focusing on equitable health resource distribution and economic development are the key to TB control.
基金partially supported by the National Natural Science Foundation (62272248)the Open Project Fund of State Key Laboratory of Computer Architecture,Institute of Computing Technology,Chinese Academy of Sciences (CARCHA202108,CARCH201905)+1 种基金the Natural Science Foundation of Tianjin (20JCZDJC00610)Sponsored by Zhejiang Lab (2021KF0AB04)。
文摘Smart contracts are widely used on the blockchain to implement complex transactions,such as decentralized applications on Ethereum.Effective vulnerability detection of large-scale smart contracts is critical,as attacks on smart contracts often cause huge economic losses.Since it is difficult to repair and update smart contracts,it is necessary to find the vulnerabilities before they are deployed.However,code analysis,which requires traversal paths,and learning methods,which require many features to be trained,are too time-consuming to detect large-scale on-chain contracts.Learning-based methods will obtain detection models from a feature space compared to code analysis methods such as symbol execution.But the existing features lack the interpretability of the detection results and training model,even worse,the large-scale feature space also affects the efficiency of detection.This paper focuses on improving the detection efficiency by reducing the dimension of the features,combined with expert knowledge.In this paper,a feature extraction model Block-gram is proposed to form low-dimensional knowledge-based features from bytecode.First,the metadata is separated and the runtime code is converted into a sequence of opcodes,which are divided into segments based on some instructions(jumps,etc.).Then,scalable Block-gram features,including 4-dimensional block features and 8-dimensional attribute features,are mined for the learning-based model training.Finally,feature contributions are calculated from SHAP values to measure the relationship between our features and the results of the detection model.In addition,six types of vulnerability labels are made on a dataset containing 33,885 contracts,and these knowledge-based features are evaluated using seven state-of-the-art learning algorithms,which show that the average detection latency speeds up 25×to 650×,compared with the features extracted by N-gram,and also can enhance the interpretability of the detection model.
文摘This study examines the effects of rapid land use changes in India,with a specific focus on Sonipat District in Haryana—a region undergoing significant urban expansion.Over the past two decades,rural landscapes in Sonipat have undergone notable transformation,as open spaces and agricultural lands are increasingly converted into residential colonies,commercial hubs,and industrial zones.While such changes reflect economic development and urban growth,they also raise critical concerns about sustainability,especially in terms of food security,groundwater depletion,and environmental degradation.The study examines land use changes between 2000 and 2024 using remote sensing techniques and spatial analysis.It further incorporates secondary data and insights from community-level interactions to assess the socio-economic and ecological impacts of this transformation.The findings indicate rising land fragmentation,loss of agricultural livelihoods,pressure on civic infrastructure,and increasing pollution—factors that threaten long-term regional sustainability.The study underscores the urgent need to reconcile urban development with environmental and social sustainability.By offering a detailed case study of Sonipat,this research contributes to the broader discourse on India’s urbanisation pathways.It aims to provide policymakers,planners,and researchers with evidence-based recommendations to manage land transitions more responsibly,promoting urban growth models that ensure ecological integrity,equitable development,and long-term resilience.
基金King Saud University,Grant/Award Number:RSP2024R157。
文摘Biometric characteristics are playing a vital role in security for the last few years.Human gait classification in video sequences is an important biometrics attribute and is used for security purposes.A new framework for human gait classification in video sequences using deep learning(DL)fusion assisted and posterior probability-based moth flames optimization(MFO)is proposed.In the first step,the video frames are resized and finetuned by two pre-trained lightweight DL models,EfficientNetB0 and MobileNetV2.Both models are selected based on the top-5 accuracy and less number of parameters.Later,both models are trained through deep transfer learning and extracted deep features fused using a voting scheme.In the last step,the authors develop a posterior probabilitybased MFO feature selection algorithm to select the best features.The selected features are classified using several supervised learning methods.The CASIA-B publicly available dataset has been employed for the experimental process.On this dataset,the authors selected six angles such as 0°,18°,90°,108°,162°,and 180°and obtained an average accuracy of 96.9%,95.7%,86.8%,90.0%,95.1%,and 99.7%.Results demonstrate comparable improvement in accuracy and significantly minimize the computational time with recent state-of-the-art techniques.
基金supported by the Fundamental Research Funds for the Provincial Universities of Zhejiang (No.GK249909299001-036)National Key Research and Development Program of China (No. 2023YFB4502803)Zhejiang Provincial Natural Science Foundation of China (No.LDT23F01014F01)。
文摘Due to the limitations of existing imaging hardware, obtaining high-resolution hyperspectral images is challenging. Hyperspectral image super-resolution(HSI SR) has been a very attractive research topic in computer vision, attracting the attention of many researchers. However, most HSI SR methods focus on the tradeoff between spatial resolution and spectral information, and cannot guarantee the efficient extraction of image information. In this paper, a multidimensional features network(MFNet) for HSI SR is proposed, which simultaneously learns and fuses the spatial,spectral, and frequency multidimensional features of HSI. Spatial features contain rich local details,spectral features contain the information and correlation between spectral bands, and frequency feature can reflect the global information of the image and can be used to obtain the global context of HSI. The fusion of the three features can better guide image super-resolution, to obtain higher-quality high-resolution hyperspectral images. In MFNet, we use the frequency feature extraction module(FFEM) to extract the frequency feature. On this basis, a multidimensional features extraction module(MFEM) is designed to learn and fuse multidimensional features. In addition, experimental results on two public datasets demonstrate that MFNet achieves state-of-the-art performance.
文摘Recognizing road scene context from a single image remains a critical challenge for intelligent autonomous driving systems,particularly in dynamic and unstructured environments.While recent advancements in deep learning have significantly enhanced road scene classification,simultaneously achieving high accuracy,computational efficiency,and adaptability across diverse conditions continues to be difficult.To address these challenges,this study proposes HybridLSTM,a novel and efficient framework that integrates deep learning-based,object-based,and handcrafted feature extraction methods within a unified architecture.HybridLSTM is designed to classify four distinct road scene categories—crosswalk(CW),highway(HW),overpass/tunnel(OP/T),and parking(P)—by leveraging multiple publicly available datasets,including Places-365,BDD100K,LabelMe,and KITTI,thereby promoting domain generalization.The framework fuses object-level features extracted using YOLOv5 and VGG19,scene-level global representations obtained from a modified VGG19,and fine-grained texture features captured through eight handcrafted descriptors.This hybrid feature fusion enables the model to capture both semantic context and low-level visual cues,which are critical for robust scene understanding.To model spatial arrangements and latent sequential dependencies present even in static imagery,the combined features are processed through a Long Short-Term Memory(LSTM)network,allowing the extraction of discriminative patterns across heterogeneous feature spaces.Extensive experiments conducted on 2725 annotated road scene images,with an 80:20 training-to-testing split,validate the effectiveness of the proposed model.HybridLSTM achieves a classification accuracy of 96.3%,a precision of 95.8%,a recall of 96.1%,and an F1-score of 96.0%,outperforming several existing state-of-the-art methods.These results demonstrate the robustness,scalability,and generalization capability of HybridLSTM across varying environments and scene complexities.Moreover,the framework is optimized to balance classification performance with computational efficiency,making it highly suitable for real-time deployment in embedded autonomous driving systems.Future work will focus on extending the model to multi-class detection within a single frame and optimizing it further for edge-device deployments to reduce computational overhead in practical applications.
基金supported by National Natural Science Foundation of China(Nos.62477026,62177029,61807020)Humanities and Social Sciences Research Program of the Ministry of Education of China(No.23YJAZH047)the Startup Foundation for Introducing Talent of Nanjing University of Posts and Communications under Grant NY222034.
文摘As Deepfake technology continues to evolve,the distinction between real and fake content becomes increasingly blurred.Most existing Deepfake video detectionmethods rely on single-frame facial image features,which limits their ability to capture temporal differences between frames.Current methods also exhibit limited generalization capabilities,struggling to detect content generated by unknown forgery algorithms.Moreover,the diversity and complexity of forgery techniques introduced by Artificial Intelligence Generated Content(AIGC)present significant challenges for traditional detection frameworks,whichmust balance high detection accuracy with robust performance.To address these challenges,we propose a novel Deepfake detection framework that combines a two-stream convolutional network with a Vision Transformer(ViT)module to enhance spatio-temporal feature representation.The ViT model extracts spatial features from the forged video,while the 3D convolutional network captures temporal features.The 3D convolution enables cross-frame feature extraction,allowing the model to detect subtle facial changes between frames.The confidence scores from both the ViT and 3D convolution submodels are fused at the decision layer,enabling themodel to effectively handle unknown forgery techniques.Focusing on Deepfake videos and GAN-generated images,the proposed approach is evaluated on two widely used public face forgery datasets.Compared to existing state-of-theartmethods,it achieves higher detection accuracy and better generalization performance,offering a robust solution for deepfake detection in real-world scenarios.
基金support for this work was supported by Key Lab of Intelligent and Green Flexographic Printing under Grant ZBKT202301.
文摘Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the issue of information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features to provide superior features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the problem of insufficient constraint force in predicting results with traditional methods. This loss function enhances the accuracy of model classification predictions and improves the proximity of regression position predictions to ground truth objects. The proposed method model is evaluated on the popular dataset UCF101-24 and JHMDB-21. Experimental results demonstrate that the proposed method achieves an accuracy of 81.52% on the Frame-mAP metric, surpassing current existing methods.
基金supported by The Henan Province Science and Technology Research Project(242102211046)the Key Scientific Research Project of Higher Education Institutions in Henan Province(25A520039)+1 种基金theNatural Science Foundation project of Zhongyuan Institute of Technology(K2025YB011)the Zhongyuan University of Technology Graduate Education and Teaching Reform Research Project(JG202424).
文摘Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale Electrocardiogram data and rare arrhythmia events in imbalanced datasets. These methods fail to perform multi-perspective learning of temporal signals and Electrocardiogram images, nor can they fully extract the latent information within the data, falling short of the accuracy required by clinicians. Therefore, this paper proposes an innovative hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the Electrocardiogram data and improving the model’s ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.
文摘Exploring the spatial evolution patterns of land use in creative urban tourism complexes provides theoretical and decision-making support to foster creative tourism projects.This study focuses on the Hangzhou Leisure Expo Garden as a case study,utilizing a land use change index model to analyze the spatial evolution characteristics and dynamic processes of creative urban tourism complexes,as well as to explore their spatial differentiation mechanisms.The analysis indicates that Hangzhou Leisure Expo Garden,initially a derelict industrial area dominated by production and residential land use,has evolved into a creative urban tourism complex with tourism comprehensive service land at its core,going through the pattern evolution processes of“constrained sprawl,”“intensive expansion,”and“random integration.”From the perspective of tourism human-land relationships,the formation of land use evolution patterns in creative urban tourism complexes results from various stakeholders(government,tourism enterprises,residents,tourists,etc.),as humanistic factors,continuously adapting to specific urban spaces,which are considered as geographical elements and have locational advantages and are oriented towards economic and social values.Based on the acquisition of stakeholder interests,the transformation of resource-disadvantaged areas into tourism advantage areas is facilitated,thereby achieving the re-creation of tourism creative space and promoting intensive spatial growth.
基金supported by the National Key R&D Program of China(No.2022ZD0118402)。
文摘Remote sensing cross-modal image-text retrieval(RSCIR)can flexibly and subjectively retrieve remote sensing images utilizing query text,which has received more researchers’attention recently.However,with the increasing volume of visual-language pre-training model parameters,direct transfer learning consumes a substantial amount of computational and storage resources.Moreover,recently proposed parameter-efficient transfer learning methods mainly focus on the reconstruction of channel features,ignoring the spatial features which are vital for modeling key entity relationships.To address these issues,we design an efficient transfer learning framework for RSCIR,which is based on spatial feature efficient reconstruction(SPER).A concise and efficient spatial adapter is introduced to enhance the extraction of spatial relationships.The spatial adapter is able to spatially reconstruct the features in the backbone with few parameters while incorporating the prior information from the channel dimension.We conduct quantitative and qualitative experiments on two different commonly used RSCIR datasets.Compared with traditional methods,our approach achieves an improvement of 3%-11% in sumR metric.Compared with methods finetuning all parameters,our proposed method only trains less than 1% of the parameters,while maintaining an overall performance of about 96%.