The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines the low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
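As a rough illustration of the base/detail/compensation split described in this abstract, the sketch below (an assumption for illustration, not the authors' implementation) separates an input into a smoothed low-frequency branch, a high-frequency texture branch, and a residual compensation branch; the layer sizes and module names are hypothetical.

```python
# Minimal sketch (not the paper's implementation): a multi-branch encoder that
# splits an image into low-frequency "base" and high-frequency "detail" features,
# plus a compensation branch for information missed by the first two.
import torch
import torch.nn as nn

class TwoBranchEncoder(nn.Module):
    def __init__(self, in_ch=1, feat_ch=16):
        super().__init__()
        self.base = nn.Sequential(          # smooth, low-frequency content
            nn.AvgPool2d(2), nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
        self.detail = nn.Sequential(        # edges and textures
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU())
        self.compensate = nn.Conv2d(in_ch, feat_ch, 1)  # residual information

    def forward(self, x):
        return self.base(x), self.detail(x), self.compensate(x)

enc = TwoBranchEncoder()
ir = torch.rand(1, 1, 64, 64)           # a dummy infrared patch
base, detail, comp = enc(ir)
print(base.shape, detail.shape, comp.shape)
```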
How to fully use spectral and temporal information for efficient identification of crops becomes a crucial issue since each crop has its specific seasonal dynamics. A thorough understanding of the relative usefulness of spectral and temporal features is thus essential for better organization of crop classification information. This study, taking Heilongjiang Province as the study area, aims to use time-series moderate resolution imaging spectroradiometer (MODIS) surface reflectance product (MOD09A1) data to evaluate the importance of spectral and temporal features for crop classification. In doing so, a feature selection strategy based on a separability index (SI) was first used to rank the most important spectro-temporal features for crop classification. Ten feature scenarios with different spectral and temporal variable combinations were then devised and used for crop classification with a support vector machine, and their accuracies were finally assessed with the same crop samples. The results show that the normalized difference tillage index (NDTI), land surface water index (LSWI) and enhanced vegetation index (EVI) are the most informative spectral features, and late August to early September is the most informative temporal window for identifying crops in Heilongjiang for the observed year 2011. Spectral diversity and time variety are both vital for crop classification, and their combined use can improve the accuracy by about 30% in comparison with a single image. The feature selection technique based on SI analysis is superior for achieving high crop classification accuracy (producers' accuracy of 94.03% and users' accuracy of 93.77%) with a small number of features. Increasing temporal resolution is not necessarily important for improving the classification accuracies for crops, and a relatively high classification accuracy can be achieved as long as the images associated with key phenological phases are retained.
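The abstract does not spell out the separability index, so the sketch below substitutes a simple between-class/within-class variance ratio as a stand-in to illustrate the overall workflow of ranking spectro-temporal features and classifying crops with an SVM; the synthetic data and the top-k cutoff are assumptions.

```python
# Hedged sketch of the workflow: rank spectro-temporal features by a simple
# separability score, keep the top-k, and classify crops with an SVM.
# The SI used here (between-class vs. within-class variance) is an assumption;
# the paper's exact index may differ.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 40))            # 600 samples x 40 spectro-temporal features
y = rng.integers(0, 4, size=600)          # 4 crop classes (dummy labels)

def separability_index(X, y):
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1]); within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

si = separability_index(X, y)
top_k = np.argsort(si)[::-1][:10]          # keep the 10 most separable features
Xtr, Xte, ytr, yte = train_test_split(X[:, top_k], y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(Xtr, ytr)
print("accuracy:", accuracy_score(yte, clf.predict(Xte)))
```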
Globally, diabetic retinopathy (DR) is the primary cause of blindness, affecting millions of people worldwide. This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment. Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment. However, traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level. On the other hand, models that focus on global semantic-level information might overlook critical, subtle local pathological features. To address this issue, we propose an adaptive multi-scale feature fusion network (AMSFuse), which can adaptively combine multi-scale global and local features without compromising their individual representation. Specifically, our model incorporates global features for extracting high-level contextual information from retinal images. Concurrently, local features capture fine-grained details, such as microaneurysms, hemorrhages, and exudates, which are critical for DR diagnosis. These global and local features are adaptively fused using a fusion block, followed by an Integrated Attention Mechanism (IAM) that refines the fused features by emphasizing relevant regions, thereby enhancing classification accuracy for DR classification. Our model achieves 86.3% accuracy on the APTOS dataset and 96.6% on the RFMiD dataset, both of which are comparable to state-of-the-art methods.
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer. However, this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size. Pathologists often refer to surrounding cells to identify abnormalities. To emulate this slide examination behavior, this study proposes a Multi-Scale Feature Fusion Network (MSFF-Net) for detecting cervical abnormal cells. MSFF-Net employs a Cross-Scale Pooling Model (CSPM) to effectively capture diverse features and contextual information, ranging from local details to the overall structure. Additionally, a Multi-Scale Fusion Attention (MSFA) module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales. To handle the complex environment of cervical cell images, such as cell adhesion and overlapping, the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes, thereby improving detection accuracy in such scenarios. Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision (mAP) of 63.2%, outperforming state-of-the-art methods while maintaining a relatively small number of parameters (26.8 M). This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells, contributing to more accurate and efficient cervical cancer screening.
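For readers unfamiliar with the overlap measure mentioned here, the following sketch illustrates the inner-IoU idea of scoring overlap on auxiliary boxes shrunk about their centers; it omits the CIoU center-distance and aspect-ratio terms, so it should be read as a simplified assumption rather than the paper's exact Inner-CIoU loss.

```python
# Hedged sketch of the "inner IoU" idea: shrink each box about its center by a
# ratio and measure overlap on the auxiliary (inner) boxes. This is a simplified
# reading of Inner-IoU-style losses, not the paper's exact Inner-CIoU formulation.
import torch

def inner_iou(pred, target, ratio=0.75):
    """pred, target: (N, 4) boxes as (cx, cy, w, h)."""
    def corners(b):
        cx, cy, w, h = b.unbind(-1)
        w, h = w * ratio, h * ratio          # shrink to the inner box
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    px1, py1, px2, py2 = corners(pred)
    tx1, ty1, tx2, ty2 = corners(target)
    iw = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp(min=0)
    ih = (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp(min=0)
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    return inter / (union + 1e-7)

p = torch.tensor([[50.0, 50.0, 20.0, 30.0]])
t = torch.tensor([[52.0, 49.0, 22.0, 28.0]])
print(inner_iou(p, t))       # a loss would use 1 - inner_iou (plus CIoU penalty terms)
```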
To solve the problems of redundant feature information, insignificant differences in feature representation, and low recognition accuracy for fine-grained images, an MSFResNet network model is proposed based on the ResNeXt50 model by fusing multi-scale feature information. Firstly, a multi-scale feature extraction module is designed to obtain multi-scale information on feature maps by using convolution kernels of different scales. Meanwhile, the channel attention mechanism is used to increase the network's acquisition of global information. Secondly, the feature maps processed by the multi-scale feature extraction module are fused with the deep feature maps through short links to guide the full learning of the network, thus reducing the loss of texture details in the deep network feature maps and improving network generalization ability and recognition accuracy. Finally, the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification. Experimental results show that compared with the ResNeXt50 network model, the accuracy of the MSFResNet model is improved by 6.01% on the FGVC-Aircraft common dataset. It achieves 99.13% classification accuracy on the wild mushroom dataset, which is 0.47% higher than ResNeXt50. Furthermore, heat map experiments show that the MSFResNet model significantly reduces the interference of background information, making the network focus on the location of the main body of the wild mushroom, which can effectively improve the accuracy of wild mushroom identification.
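A minimal sketch of the kind of multi-scale extraction block plus channel attention described above is given below; the kernel sizes (1, 3, 5), channel counts and squeeze-and-excitation style gate are assumptions, not the MSFResNet definition.

```python
# Hedged sketch of a multi-scale extraction block: parallel convolutions with
# different kernel sizes, concatenated and re-weighted by SE-style channel attention.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 3
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2) for k in (1, 3, 5)])
        fused_ch = branch_ch * 3
        self.se = nn.Sequential(              # channel attention gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(fused_ch, fused_ch // 4, 1), nn.ReLU(),
            nn.Conv2d(fused_ch // 4, fused_ch, 1), nn.Sigmoid())

    def forward(self, x):
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        return feats * self.se(feats)         # re-weight channels

block = MultiScaleBlock(in_ch=64, out_ch=96)
print(block(torch.rand(1, 64, 32, 32)).shape)   # torch.Size([1, 96, 32, 32])
```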
With the rapid growth of social media, the spread of fake news has become a growing problem, misleading the public and causing significant harm. As social media content is often composed of both images and text, the use of multimodal approaches for fake news detection has gained significant attention. To solve the problems existing in previous multimodal fake news detection algorithms, such as insufficient feature extraction and insufficient use of semantic relations between modalities, this paper proposes the MFFFND-Co (Multimodal Feature Fusion Fake News Detection with Co-Attention Block) model. First, the model deeply explores the textual content, image content, and frequency domain features. Then, it employs a Co-Attention mechanism for cross-modal fusion. Additionally, a semantic consistency detection module is designed to quantify semantic deviations, thereby enhancing the performance of fake news detection. Experimentally verified on two commonly used datasets, Twitter and Weibo, the model achieved F1 scores of 90.0% and 94.0%, respectively, significantly outperforming the earlier MFFFND (Multimodal Feature Fusion Fake News Detection with Attention Block) model and surpassing other baseline models. This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
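The co-attention fusion step can be illustrated with a small cross-attention sketch, shown below; the embedding dimension, head count and use of nn.MultiheadAttention are assumptions made for illustration and are not taken from the MFFFND-Co paper.

```python
# Hedged sketch of cross-modal co-attention: text features attend to image features
# and vice versa, then the two attended streams are concatenated for classification.
import torch
import torch.nn as nn

class CoAttentionFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, 2)   # real vs. fake

    def forward(self, text_feats, image_feats):
        t_attended, _ = self.txt2img(text_feats, image_feats, image_feats)
        i_attended, _ = self.img2txt(image_feats, text_feats, text_feats)
        fused = torch.cat([t_attended.mean(dim=1), i_attended.mean(dim=1)], dim=-1)
        return self.classifier(fused)

model = CoAttentionFusion()
text = torch.rand(2, 30, 256)     # batch of 2, 30 token embeddings
image = torch.rand(2, 49, 256)    # 7x7 = 49 image patch embeddings
print(model(text, image).shape)   # torch.Size([2, 2])
```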
Deep learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty. Due to its outstanding ability for high-level feature extraction, the stacked autoencoder (SAE) has been widely used to improve the model accuracy of soft sensors. However, with the increase of network layers, SAE may encounter serious information loss issues, which affect the modeling performance of soft sensors. Besides, there are typically very few labeled samples in the data set, which poses challenges for traditional neural networks. In this paper, a multi-scale feature fused stacked autoencoder (MFF-SAE) is suggested for feature representation related to the hierarchical output, in which stacked autoencoder, mutual information (MI) and multi-scale feature fusion (MFF) strategies are integrated. Based on correlation analysis between output and input variables, critical hidden variables are extracted from the original variables in each autoencoder's input layer and are correspondingly given varying weights. Besides, an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss as the network layers deepen. Then, the MFF-SAE method is designed and stacked to form deep networks. Two practical industrial processes are utilized to evaluate the performance of MFF-SAE. Simulation results indicate that, in comparison to other cutting-edge techniques, the proposed method can considerably enhance the accuracy of soft sensor modeling, reducing the root mean square error (RMSE) by 71.8%, 17.1% and 64.7%, 15.1%, respectively.
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
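Of the components listed above, Soft-NMS is a standard, self-contained step; the sketch below shows its Gaussian score-decay variant on dummy boxes. The sigma value and score threshold are illustrative defaults, not the paper's settings.

```python
# Hedged sketch of Soft-NMS (Gaussian variant): instead of discarding boxes that
# overlap a higher-scoring detection, their scores are decayed in proportion to
# the overlap. Simplified, CPU-only.
import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    boxes, scores = boxes.copy(), scores.copy()
    keep = []
    while len(boxes):
        i = scores.argmax()
        keep.append((boxes[i], scores[i]))
        ious = iou(boxes[i], np.delete(boxes, i, axis=0))
        scores = np.delete(scores, i) * np.exp(-(ious ** 2) / sigma)  # Gaussian decay
        boxes = np.delete(boxes, i, axis=0)
        alive = scores > score_thresh
        boxes, scores = boxes[alive], scores[alive]
    return keep

b = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], float)
s = np.array([0.9, 0.8, 0.7])
print(len(soft_nms(b, s)))   # overlapping boxes are down-weighted, not hard-removed
```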
Background: Accurate mapping of tree species is highly desired in the management and research of plantation forests, whose ecosystem services are currently under threat. Time-series multispectral satellite images, e.g., from Landsat-8 (L8) and Sentinel-2 (S2), have been proven useful in mapping general forest types, yet we do not know quantitatively how their spectral features (e.g., red-edge) and temporal frequency of data acquisitions (e.g., 16-day vs. 5-day) contribute to plantation forest mapping at the species level. Moreover, it is unclear to what extent the fusion of L8 and S2 will improve tree species mapping of northern plantation forests in China. Methods: We designed three sets of classification experiments (i.e., single-date, multi-date, and spectral-temporal) to evaluate the performances of L8 and S2 data for mapping keystone timber tree species in northern China. We first used seven pairs of L8 and S2 images to evaluate the performances of L8 and S2 key spectral features for separating these tree species across key growing stages. Then we extracted the spectral-temporal features from all available images at different temporal frequencies of data acquisition (i.e., L8 time series, S2 time series, and the fusion of L8 and S2) to assess the contribution of image temporal frequency to the accuracy of tree species mapping in the study area. Results: 1) S2 outperformed L8 images in all classification experiments, with or without the red edge bands (0.4%–3.4% and 0.2%–4.4% higher for overall accuracy and macro-F1, respectively); 2) NDTI (the ratio of SWIR1 minus SWIR2 to SWIR1 plus SWIR2) and Tasseled Cap coefficients were the most important features in all the classifications, and for the time-series experiments, the spectral-temporal features of red band-related vegetation indices were most useful; 3) increasing the temporal frequency of data acquisition can improve the overall accuracy of tree species mapping by up to 3.2% (from 90.1% using single-date imagery to 93.3% using the S2 time series), yet similar overall accuracies were achieved using the S2 time series (93.3%) and the fusion of S2 and L8 (93.2%). Conclusions: This study quantifies the contributions of L8 and S2 spectral and temporal features in mapping keystone tree species of northern plantation forests in China and suggests that, for mapping tree species in China's northern plantation forests, the effect of increasing the temporal frequency of data acquisition could saturate quickly after using only two images from key phenological stages.
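Since the NDTI definition is quoted directly in this abstract, a short worked example of the computation is given below; the reflectance values are dummy numbers, and the Sentinel-2 band naming (B11/B12 for SWIR1/SWIR2) is an assumption.

```python
# Worked example of the NDTI definition quoted above:
# NDTI = (SWIR1 - SWIR2) / (SWIR1 + SWIR2), computed per pixel.
# For Sentinel-2 these would typically be bands B11 (SWIR1) and B12 (SWIR2).
import numpy as np

swir1 = np.array([[0.28, 0.31], [0.25, 0.30]])   # SWIR1 reflectance (dummy values)
swir2 = np.array([[0.22, 0.24], [0.21, 0.27]])   # SWIR2 reflectance (dummy values)

ndti = (swir1 - swir2) / (swir1 + swir2 + 1e-9)
print(np.round(ndti, 3))
```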
Most existing action recognition methods mainly utilize spatio-temporal descriptors of single interest points while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal features and global positional distribution information (PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform (3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis (PCA) method is utilized twice, on the 3D SIFT descriptors of single frames and of multiple frames. Simultaneously, the PDI of the interest points is computed and combined with the above features. The combined features are quantified, selected and finally tested by using the support vector machine (SVM) recognition algorithm on the public KTH dataset. The test results show that the recognition rate is significantly improved and that the proposed features describe human motion more accurately, with high adaptability to different scenarios.
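A hedged sketch of the two-stage PCA compression and feature concatenation described above follows; the descriptor sizes, the per-frame averaging step and the random stand-in for the positional distribution information (PDI) are assumptions for illustration only.

```python
# Hedged sketch: apply PCA to per-frame 3D-SIFT descriptors, then PCA again across
# frames, and append a positional-distribution feature before SVM classification.
# Sizes are dummy assumptions; 3D-SIFT extraction itself is not shown.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_videos, n_frames, n_points, sift_dim = 40, 10, 20, 640
descs = rng.normal(size=(n_videos, n_frames, n_points, sift_dim))

pca_frame = PCA(n_components=32)                       # 1st PCA: per-frame descriptors
frame_feats = pca_frame.fit_transform(
    descs.reshape(-1, sift_dim)).reshape(n_videos, n_frames, n_points, 32).mean(axis=2)

pca_video = PCA(n_components=16)                       # 2nd PCA: across frames
video_feats = pca_video.fit_transform(frame_feats.reshape(n_videos, -1))

pdi = rng.normal(size=(n_videos, 8))                   # stand-in positional distribution info
X = np.hstack([video_feats, pdi])
y = rng.integers(0, 6, size=n_videos)                  # 6 action classes (KTH has 6)
print(SVC().fit(X, y).score(X, y))
```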
Extreme weather events such as persistent high temperatures, heavy rains or sudden cold waves in Shanxi Province, China, have brought great losses and disasters to people's production and life. It is therefore of great practical significance to study the temporal and spatial distribution characteristics of extreme weather events and the associated circulation background field. We selected daily high temperature data (≥35°C), daily minimum temperature data and daily precipitation data (≥50 mm) from 109 meteorological stations in Shanxi Province for 1981 to 2010. A period in which the temperature is ≥35°C for more than 3 days is treated as a high-temperature extreme weather event; a station whose 24-hour cumulative precipitation (20 - 20 hours, Beijing time) reaches ≥50 mm on a given day is counted as a rainstorm; and cold air activity in which the daily minimum temperature drops by more than 8°C within 24 hours, or by 10°C within 48 hours, with a daily minimum temperature of ≤4°C, is treated as a cold weather process. We statistically analyze the temporal and spatial characteristics and trends of high temperature, heavy rain and cold weather and the circulation background field, count the number of such extreme weather events in Shanxi, and identify the common features of the large-scale circulation background field shared by the various extreme weather events. By studying the temporal and spatial distribution characteristics of persistent high temperature, heavy rain and sudden cold wave frost events in Shanxi, we summarize the large-scale circulation characteristics of such extreme weather events, providing a reference for future related weather forecasting.
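Because the three event definitions above are stated as explicit thresholds, the sketch below encodes them for a single station's daily series; reading "more than 3 days" as at least four consecutive days is an assumption, and the toy data are not from the Shanxi station network.

```python
# Hedged sketch of the event definitions quoted above, applied to one station:
# high-temperature events (>= 35 C for more than 3 consecutive days), rainstorm
# days (24-h precipitation >= 50 mm), and cold processes (24-h minimum-temperature
# drop > 8 C, or 48-h drop >= 10 C, with the minimum reaching <= 4 C).
import numpy as np

tmax = np.array([34, 36, 37, 35, 36, 30, 29])     # daily maximum temperature (C)
tmin = np.array([20, 18, 9, 1, 0, -2, -1])        # daily minimum temperature (C)
rain = np.array([0, 12, 55, 3, 0, 60, 1])         # daily precipitation (mm)

def high_temp_events(tmax, thresh=35, min_days=4):   # "more than 3 days" read as >= 4
    events, run = [], 0
    for i, t in enumerate(tmax):
        run = run + 1 if t >= thresh else 0
        if run == min_days:
            events.append(i - min_days + 1)           # start index of the event
    return events

rainstorm_days = np.where(rain >= 50)[0]

def cold_processes(tmin, limit=4):
    drop24 = tmin[:-1] - tmin[1:]
    drop48 = tmin[:-2] - tmin[2:]
    days = set(np.where((drop24 > 8) & (tmin[1:] <= limit))[0] + 1)
    days |= set(np.where((drop48 >= 10) & (tmin[2:] <= limit))[0] + 2)
    return sorted(days)

print(high_temp_events(tmax), rainstorm_days.tolist(), cold_processes(tmin))
```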
To address the inaccuracy of detecting the α phase contour of TB6 titanium alloy, computer vision technology is combined with human vision mechanisms so that the spatial characteristics of the α phase can be simulated and the contour obtained accurately. An algorithm for α phase contour detection of TB6 titanium alloy fused with multi-scale fretting features is therefore proposed. Firstly, through the response of the classical receptive field model based on fretting and the suppression of the new non-classical receptive field model based on fretting, information maps of the α phase contour of TB6 titanium alloy at different scales are obtained; then the contour information map at the smallest scale is used as a benchmark, a neighborhood is constructed to judge the deviation of the contour information at other scales, and the corresponding weight values are calculated; finally, a Gaussian function is used to weight and fuse the deviation information, and the contour detection result of the TB6 titanium alloy α phase is obtained. In the Visual Studio 2013 environment, 484 metallographic images with different temperatures, strain rates, and magnifications were tested. The results show that the performance evaluation F value of the proposed algorithm is 0.915, which can effectively improve the accuracy of α phase contour detection of TB6 titanium alloy.
The rapid development of deepfake technology has led to the spread of forged audios and videos across network platforms, presenting risks for numerous countries, societies, and individuals, and posing a serious threat to cyberspace security. To address the problems of insufficient extraction of spatial features and the neglect of temporal features in deepfake video detection, we propose a detection method based on an improved CapsNet and temporal–spatial features (iCapsNet–TSF). First, the dynamic routing algorithm of CapsNet is improved through weight initialization and updating. Then, the optical flow algorithm is used to extract inter-frame temporal features of the videos to form a dataset of temporal–spatial features. Finally, the iCapsNet model is employed to fully learn the temporal–spatial features of facial videos, and the results are fused. Experimental results show that the detection accuracy of iCapsNet–TSF reaches 94.07%, 98.83%, and 98.50% on the Celeb-DF, FaceSwap, and Deepfakes datasets, respectively, displaying better performance than most existing mainstream algorithms. The iCapsNet–TSF method combines the capsule network and the optical flow algorithm, providing a novel strategy for deepfake detection, which is of great significance to the prevention of deepfake attacks and the preservation of cyberspace security.
In order to improve the model's capability of expressing features during few-shot learning, a multi-scale features prototypical network (MS-PN) algorithm is proposed. A metric learning algorithm is employed to extract image features and project them into a feature space, thus evaluating the similarity between samples based on their relative distances within the metric space. To sufficiently extract feature information from limited sample data and mitigate the impact of constrained data volume, a multi-scale feature extraction network is presented to capture data features at various scales during image feature extraction. Additionally, the position of the prototype is fine-tuned by assigning weights to data points to mitigate the influence of outliers on the experiment. The loss function integrates contrastive loss and label smoothing to bring similar data points closer and separate dissimilar data points within the metric space. Experimental evaluations are conducted on the small-sample datasets mini-ImageNet and CUB200-2011. The proposed method achieves higher classification accuracy. Specifically, in the 5-way 1-shot experiment, classification accuracy reaches 50.13% and 66.79%, respectively, on these two datasets. Moreover, in the 5-way 5-shot experiment, accuracies of 66.79% and 85.91% are observed, respectively.
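The prototype fine-tuning step can be sketched as a distance-based re-weighting of support embeddings, as below; the softmax-over-negative-distance weighting is an assumed stand-in for the paper's weighting rule, and the embedding network itself is not shown.

```python
# Hedged sketch: per-class prototypes as a weighted average of support embeddings
# (down-weighting outliers), and queries assigned to the nearest prototype.
import torch

def weighted_prototypes(support, labels, n_classes):
    protos = []
    for c in range(n_classes):
        emb = support[labels == c]                     # (k_shot, dim)
        mean = emb.mean(dim=0, keepdim=True)
        w = torch.softmax(-torch.cdist(emb, mean).squeeze(1), dim=0)  # far points get low weight
        protos.append((w.unsqueeze(1) * emb).sum(dim=0))
    return torch.stack(protos)                         # (n_classes, dim)

def classify(query, prototypes):
    return torch.cdist(query, prototypes).argmin(dim=1)

support = torch.randn(25, 64)                          # 5-way 5-shot embeddings
labels = torch.arange(5).repeat_interleave(5)
query = torch.randn(10, 64)
protos = weighted_prototypes(support, labels, n_classes=5)
print(classify(query, protos))
```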
On the basis of the arctic monthly mean sea ice extent data set for 1953-1984, the arctic region is divided into eight subregions, and analyses of empirical orthogonal functions, power spectrum and maximum entropy spectrum are made to identify the major spatial and temporal features of the sea ice fluctuations within the 32-year period. A brief physical explanation is then tentatively suggested. The results show that both seasonal and non-seasonal variations of the sea ice extent are remarkable, and its mean annual peripheral positions as well as their interannual shifting amplitudes differ considerably among the subregions. These features are primarily affected by solar radiation, ocean circulation, sea surface temperature and the maritime-continental contrast, while the non-seasonal variations are most probably affected by cosmic-geophysical factors such as earth pole shift, earth rotation oscillation and solar activity.
Face detection is applied to many tasks such as auto focus control, surveillance, user interface, and face recognition. Processing speed and detection accuracy of the face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small difference of pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that processing speed of our method is 3.66 times faster than a conventional method, adopting HOG features combined to an SVM classifier, without accuracy degradation.
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, due to the inability to effectively capture global information from images, CNNs can easily lead to loss of contours and textures in segmentation results. Notice that the transformer model can effectively capture long-range dependencies in the image, and, furthermore, combining the CNN and the transformer can effectively extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. Specifically, in the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate the phenomenon of gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then multi-head attention is introduced to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets, particularly in the context of preserving contours and textures.
In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy significantly declines when the data are occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: the Joint Interpolation Module (JI Module), the Multi-scale Temporal Convolution Network (MS-TCN), and the Suppression Graph Convolutional Network (SGCN). The JI Module completes spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources such as BML, ICT-Pollick, and ELMD as well as 1000 synthetic gaits generated using STEP-Gen technology, and ELMB, consisting of 3924 gaits, of which 1835 are labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard Emotion-Gait and ELMB datasets, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method mitigates the performance degradation caused by occlusion significantly better than other methods.
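The KNN-based joint completion in the JI Module can be pictured with the toy sketch below, which borrows a previous fully visible frame to choose neighbors and estimates the occluded joint from their average displacement; this neighborhood rule is an assumption, not the module's actual definition.

```python
# Hedged sketch of KNN-based joint completion: using a previous, fully visible frame,
# find the k visible joints closest to the occluded joint, then estimate the occluded
# joint in the current frame from those neighbors' average displacement.
import numpy as np

def knn_complete(prev_frame, curr_frame, visible, k=3):
    """prev_frame, curr_frame: (J, 2) joint coordinates; visible: (J,) mask for curr_frame."""
    completed = curr_frame.copy()
    vis_idx = np.where(visible)[0]
    for j in np.where(~visible)[0]:
        dists = np.linalg.norm(prev_frame[vis_idx] - prev_frame[j], axis=1)
        nearest = vis_idx[np.argsort(dists)[:k]]
        shift = (curr_frame[nearest] - prev_frame[nearest]).mean(axis=0)
        completed[j] = prev_frame[j] + shift       # move with its nearest neighbors
    return completed

prev = np.array([[0.0, 0.0], [0.1, 0.5], [0.0, 1.0], [0.2, 1.4], [0.1, 1.8]])
curr = prev + 0.05                                  # whole body drifts slightly
mask = np.array([True, True, True, True, False])    # head joint occluded this frame
curr[4] = np.nan                                    # occluded coordinates unknown
print(knn_complete(prev, curr, mask))
```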
The Yellow River Basin in Sichuan Province (YRS) is undergoing severe soil erosion and exacerbated ecological vulnerability, which collectively pose formidable challenges for regional water conservation (WC) and sustainable development. While effectively enhancing WC necessitates a comprehensive understanding of its driving factors and corresponding intervention strategies, existing studies have largely neglected the spatiotemporal heterogeneity of both natural and socio-economic drivers. Therefore, this study explored the spatiotemporal heterogeneity of WC drivers in the YRS using multi-scale geographically weighted regression (MGWR) and geographically and temporally weighted regression (GTWR) models from an eco-hydrological perspective. We discovered that downstream regions, which are more developed, achieved significantly better WC than upstream regions. The results also demonstrated that the influence of temperature and wind speed is consistently dominant and temporally stable due to climate stability, while the influence of vegetation shifted from negative to positive around 2010, likely indicating greater benefits from understory vegetation. Economic growth positively impacted WC in upstream regions but had a negative effect in the more developed downstream regions. These findings highlight the importance of targeted water conservation strategies, including locally appropriate revegetation, optimization of agricultural and economic structures, and the establishment of eco-compensation mechanisms for ecological conservation and sustainable development.
基金Supported by the Henan Province Key Research and Development Project(231111211300)the Central Government of Henan Province Guides Local Science and Technology Development Funds(Z20231811005)+2 种基金Henan Province Key Research and Development Project(231111110100)Henan Provincial Outstanding Foreign Scientist Studio(GZS2024006)Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan(Application and Overcoming Technical Barriers)(242103810028)。
文摘The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible images.To meet these requirements,an autoencoder-based method for infrared and visible image fusion is proposed.The encoder designed according to the optimization objective consists of a base encoder and a detail encoder,which is used to extract low-frequency and high-frequency information from the image.This extraction may lead to some information not being captured,so a compensation encoder is proposed to supplement the missing information.Multi-scale decomposition is also employed to extract image features more comprehensively.The decoder combines low-frequency,high-frequency and supplementary information to obtain multi-scale features.Subsequently,the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction.Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
基金financially supported by the Non-Profit Research Grant of the National Administration of Surveying,Mapping and Geoinformation of China (201512028)the National Natural Science Foundation of China (41271112)
文摘How to fully use spectral and temporal information for efficient identification of crops becomes a crucial issue since each crop has its specific seasonal dynamics. A thorough understanding on the relative usefulness of spectral and temporal features is thus essential for better organization of crop classification information. This study, taking Heilongjiang Province as the study area, aims to use time-series moderate resolution imaging spectroradiometer (MODIS) surface reflectance product (MOD09A1) data to evaluate the importance of spectral and temporal features for crop classification. In doing so, a feature selection strategy based on separability index (SI) was first used to rank the most important spectro-temporal features for crop classification. Ten feature scenarios with different spectral and temporal variable combinations were then devised, which were used for crop classification using the support vector machine and their accuracies were finally assessed with the same crop samples. The results show that the normalized difference tillage index (NDTI), land surface water index (LSWl) and enhanced vegetation index (EVI) are the most informative spectral features and late August to early September is the most informative temporal window for identifying crops in Heilongjiang for the observed year 2011. Spectral diversity and time variety are both vital for crop classification, and their combined use can improve the accuracy by about 30% in comparison with single image. The feature selection technique based on SI analysis is superior for achieving high crop classification accuracy (producers' accuracy of 94.03% and users' accuracy of 93.77%) with a small number of features. Increasing temporal resolution is not necessarily important for improving the classification accuracies for crops, and a relatively high classification accuracy can be achieved as long as the images associated with key phenological phrases are retained.
基金supported by the National Natural Science Foundation of China(No.62376287)the International Science and Technology Innovation Joint Base of Machine Vision and Medical Image Processing in Hunan Province(2021CB1013)the Natural Science Foundation of Hunan Province(Nos.2022JJ30762,2023JJ70016).
文摘Globally,diabetic retinopathy(DR)is the primary cause of blindness,affecting millions of people worldwide.This widespread impact underscores the critical need for reliable and precise diagnostic techniques to ensure prompt diagnosis and effective treatment.Deep learning-based automated diagnosis for diabetic retinopathy can facilitate early detection and treatment.However,traditional deep learning models that focus on local views often learn feature representations that are less discriminative at the semantic level.On the other hand,models that focus on global semantic-level information might overlook critical,subtle local pathological features.To address this issue,we propose an adaptive multi-scale feature fusion network called(AMSFuse),which can adaptively combine multi-scale global and local features without compromising their individual representation.Specifically,our model incorporates global features for extracting high-level contextual information from retinal images.Concurrently,local features capture fine-grained details,such as microaneurysms,hemorrhages,and exudates,which are critical for DR diagnosis.These global and local features are adaptively fused using a fusion block,followed by an Integrated Attention Mechanism(IAM)that refines the fused features by emphasizing relevant regions,thereby enhancing classification accuracy for DR classification.Our model achieves 86.3%accuracy on the APTOS dataset and 96.6%RFMiD,both of which are comparable to state-of-the-art methods.
基金funded by the China Chongqing Municipal Science and Technology Bureau,grant numbers 2024TIAD-CYKJCXX0121,2024NSCQ-LZX0135Chongqing Municipal Commission of Housing and Urban-Rural Development,grant number CKZ2024-87+3 种基金the Chongqing University of Technology graduate education high-quality development project,grant number gzlsz202401the Chongqing University of Technology-Chongqing LINGLUE Technology Co.,Ltd.,Electronic Information(Artificial Intelligence)graduate joint training basethe Postgraduate Education and Teaching Reform Research Project in Chongqing,grant number yjg213116the Chongqing University of Technology-CISDI Chongqing Information Technology Co.,Ltd.,Computer Technology graduate joint training base.
文摘Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer.However,this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size.Pathologists often refer to surrounding cells to identify abnormalities.To emulate this slide examination behavior,this study proposes a Multi-Scale Feature Fusion Network(MSFF-Net)for detecting cervical abnormal cells.MSFF-Net employs a Cross-Scale Pooling Model(CSPM)to effectively capture diverse features and contextual information,ranging from local details to the overall structure.Additionally,a Multi-Scale Fusion Attention(MSFA)module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales.To handle the complex environment of cervical cell images,such as cell adhesion and overlapping,the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes,thereby improving detection accuracy in such scenarios.Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision(mAP)of 63.2%,outperforming state-of-the-art methods while maintaining a relatively small number of parameters(26.8 M).This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of cervical abnormal cells,contributing to more accurate and efficient cervical cancer screening.
基金supported by National Natural Science Foundation of China(No.61862037)Lanzhou Jiaotong University Tianyou Innovation Team Project(No.TY202002)。
文摘To solve the problems of redundant feature information,the insignificant difference in feature representation,and low recognition accuracy of the fine-grained image,based on the ResNeXt50 model,an MSFResNet network model is proposed by fusing multi-scale feature information.Firstly,a multi-scale feature extraction module is designed to obtain multi-scale information on feature images by using different scales of convolution kernels.Meanwhile,the channel attention mechanism is used to increase the global information acquisition of the network.Secondly,the feature images processed by the multi-scale feature extraction module are fused with the deep feature images through short links to guide the full learning of the network,thus reducing the loss of texture details of the deep network feature images,and improving network generalization ability and recognition accuracy.Finally,the validity of the MSFResNet model is verified using public datasets and applied to wild mushroom identification.Experimental results show that compared with ResNeXt50 network model,the accuracy of the MSFResNet model is improved by 6.01%on the FGVC-Aircraft common dataset.It achieves 99.13%classification accuracy on the wild mushroom dataset,which is 0.47%higher than ResNeXt50.Furthermore,the experimental results of the thermal map show that the MSFResNet model significantly reduces the interference of background information,making the network focus on the location of the main body of wild mushroom,which can effectively improve the accuracy of wild mushroom identification.
基金supported by Communication University of China(HG23035)partly supported by the Fundamental Research Funds for the Central Universities(CUC230A013).
文摘With the rapid growth of socialmedia,the spread of fake news has become a growing problem,misleading the public and causing significant harm.As social media content is often composed of both images and text,the use of multimodal approaches for fake news detection has gained significant attention.To solve the problems existing in previous multi-modal fake news detection algorithms,such as insufficient feature extraction and insufficient use of semantic relations between modes,this paper proposes the MFFFND-Co(Multimodal Feature Fusion Fake News Detection with Co-Attention Block)model.First,the model deeply explores the textual content,image content,and frequency domain features.Then,it employs a Co-Attention mechanism for cross-modal fusion.Additionally,a semantic consistency detectionmodule is designed to quantify semantic deviations,thereby enhancing the performance of fake news detection.Experimentally verified on two commonly used datasets,Twitter and Weibo,the model achieved F1 scores of 90.0% and 94.0%,respectively,significantly outperforming the pre-modified MFFFND(Multimodal Feature Fusion Fake News Detection with Attention Block)model and surpassing other baseline models.This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
基金supported by the National Key Research and Development Program of China(2023YFB3307800)National Natural Science Foundation of China(62394343,62373155)+2 种基金Major Science and Technology Project of Xinjiang(No.2022A01006-4)State Key Laboratory of Industrial Control Technology,China(Grant No.ICT2024A26)Fundamental Research Funds for the Central Universities.
文摘Deep Learning has been widely used to model soft sensors in modern industrial processes with nonlinear variables and uncertainty.Due to the outstanding ability for high-level feature extraction,stacked autoencoder(SAE)has been widely used to improve the model accuracy of soft sensors.However,with the increase of network layers,SAE may encounter serious information loss issues,which affect the modeling performance of soft sensors.Besides,there are typically very few labeled samples in the data set,which brings challenges to traditional neural networks to solve.In this paper,a multi-scale feature fused stacked autoencoder(MFF-SAE)is suggested for feature representation related to hierarchical output,where stacked autoencoder,mutual information(MI)and multi-scale feature fusion(MFF)strategies are integrated.Based on correlation analysis between output and input variables,critical hidden variables are extracted from the original variables in each autoencoder's input layer,which are correspondingly given varying weights.Besides,an integration strategy based on multi-scale feature fusion is adopted to mitigate the impact of information loss with the deepening of the network layers.Then,the MFF-SAE method is designed and stacked to form deep networks.Two practical industrial processes are utilized to evaluate the performance of MFF-SAE.Results from simulations indicate that in comparison to other cutting-edge techniques,the proposed method may considerably enhance the accuracy of soft sensor modeling,where the suggested method reduces the root mean square error(RMSE)by 71.8%,17.1%and 64.7%,15.1%,respectively.
基金funded by the Deanship of Scientific Research at Northern Border University,Arar,Saudi Arabia through research group No.(RG-NBU-2022-1234).
文摘Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
基金supported by National Natural Science Foundation of China(Grant No.41901382)Open Fund of State Key Laboratory of Remote Sensing Science(Grant No.OFSLRSS201917)the HZAU research startup fund(No.11041810340,No.11041810341).
文摘Background:Accurate mapping of tree species is highly desired in the management and research of plantation forests,whose ecosystem services are currently under threats.Time-series multispectral satellite images,e.g.,from Landsat-8(L8)and Sentinel-2(S2),have been proven useful in mapping general forest types,yet we do not know quantitatively how their spectral features(e.g.,red-edge)and temporal frequency of data acquisitions(e.g.,16-day vs.5-day)contribute to plantation forest mapping to the species level.Moreover,it is unclear to what extent the fusion of L8 and S2 will result in improvements in tree species mapping of northern plantation forests in China.Methods:We designed three sets of classification experiments(i.e.,single-date,multi-date,and spectral-temporal)to evaluate the performances of L8 and S2 data for mapping keystone timber tree species in northern China.We first used seven pairs of L8 and S2 images to evaluate the performances of L8 and S2 key spectral features for separating these tree species across key growing stages.Then we extracted the spectral-temporal features from all available images of different temporal frequency of data acquisition(i.e.,L8 time series,S2 time series,and fusion of L8 and S2)to assess the contribution of image temporal frequency on the accuracy of tree species mapping in the study area.Results:1)S2 outperformed L8 images in all classification experiments,with or without the red edge bands(0.4%–3.4%and 0.2%–4.4%higher for overall accuracy and macro-F1,respectively);2)NDTI(the ratio of SWIR1 minus SWIR2 to SWIR1 plus SWIR2)and Tasseled Cap coefficients were most important features in all the classifications,and for time-series experiments,the spectral-temporal features of red band-related vegetation indices were most useful;3)increasing the temporal frequency of data acquisition can improve overall accuracy of tree species mapping for up to 3.2%(from 90.1%using single-date imagery to 93.3%using S2 time-series),yet similar overall accuracies were achieved using S2 time-series(93.3%)and the fusion of S2 and L8(93.2%).Conclusions:This study quantifies the contributions of L8 and S2 spectral and temporal features in mapping keystone tree species of northern plantation forests in China and suggests that for mapping tree species in China's northern plantation forests,the effects of increasing the temporal frequency of data acquisition could saturate quickly after using only two images from key phenological stages.
基金supported by National Natural Science Foundation of China(No.61103123)Scientific Research Foundation for the Returned Overseas Chinese Scholars,State Education Ministry
文摘Most of the exist action recognition methods mainly utilize spatio-temporal descriptors of single interest point while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal feature and global positional distribution information(PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points by using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform(3D SIFT) descriptors are extracted for every interest point. In order to obtain a compact description and efficient computation, the principal component analysis(PCA) method is utilized twice on the 3D SIFT descriptors of single frame and multiple frames. Simultaneously, the PDI of the interest points are computed and combined with the above features. The combined features are quantified and selected and finally tested by using the support vector machine(SVM) recognition algorithm on the public KTH dataset. The testing results have showed that the recognition rate has been significantly improved and the proposed features can more accurately describe human motion with high adaptability to scenarios.
文摘Extreme weather events such as persistent high temperatures, heavy rains or sudden cold waves in Shanxi Province in China have brought great losses and disasters to people’s production and life. It is of great practical significance to study the temporal and spatial distribution characteristics of extreme weather events and the circulation background field. We selected daily high temperature data (≥35°C), daily minimum temperature data and daily precipitation data (≥50 mm) from 109 meteorological stations in Shanxi Province, China from 1981 to 2010, then set the period in which the temperature is ≥35°C for more than 3 days as a high temperature extreme weather event, define the station in which 24 hour cumulative precipitation is ≥50 mm precipitation on a certain day (20 - 20 hours, Beijing time) as a rainstorm weather, and determine the cold air activity with daily minimum temperature dropped by more than 8°C for 24 hours, or decreased by 10°C for 48 h, and a daily minimum temperature of ≤4°C as a cold weather process. We statistically analyze the temporal and spatial characteristics and trends of high temperature, heavy rain and cold weather and the circulation background field. We count the number of extreme weather events such as persistent high temperatures, heavy rains and cold weather frosts in Shanxi, and analyze the temporal and spatial distribution characteristics, trends and general circulation background of extreme weather events. We analyze and find out the common features of the large-scale circulation background field in various extreme weather events. Through the study of the temporal and spatial distribution characteristics of extreme weather events in Shanxi, including persistent high temperature, heavy rain or sudden cold wave frost weather, we summarize the large-scale circulation characteristics of such extreme weather events. It will provide some reference for future related weather forecasting.
基金Supported by Hebei Provincial Key Laboratory for Software Engineering(Grant No.22567637H)the"Rail Vehicle Application Engineering"National International Science and Technology Cooperation Base Open Project Fund(Grant No.BMRV21KF09).
文摘Aiming at the problems of inaccuracy in detecting theαphase contour of TB6 titanium alloy.By combining computer vision technology with human vision mechanisms,the spatial characteristics of theαphase can be simulated to obtain the contour accurately.Therefore,an algorithm forαphase contour detection of TB6 titanium alloy fused with multi-scale fretting features is proposed.Firstly,through the response of the classical receptive field model based on fretting and the suppression of new non-classical receptive field model based on fretting,the information maps of theαphase contour of the TB6 titanium alloy at different scales are obtained;then the information map of the smallest scale contour is used as a benchmark,the neighborhood is constructed to judge the deviation of other scale contour information,and the corresponding weight value is calculated;finally,Gaussian function is used to weight and fuse the deviation information,and the contour detection result of TB6 titanium alloyαphase is obtained.In the Visual Studio 2013 environment,484 metallographic images with different temperatures,strain rates,and magnifications were tested.The results show that the performance evaluation F value of the proposed algorithm is 0.915,which can effectively improve the accuracy ofαphase contour detection of TB6 titanium alloy.
基金supported by the Fundamental Research Funds for the Central Universities under Grant 2020JKF101the Research Funds of Sugon under Grant 2022KY001.
文摘Rapid development of deepfake technology led to the spread of forged audios and videos across network platforms,presenting risks for numerous countries,societies,and individuals,and posing a serious threat to cyberspace security.To address the problem of insufficient extraction of spatial features and the fact that temporal features are not considered in the deepfake video detection,we propose a detection method based on improved CapsNet and temporal–spatial features(iCapsNet–TSF).First,the dynamic routing algorithm of CapsNet is improved using weight initialization and updating.Then,the optical flow algorithm is used to extract interframe temporal features of the videos to form a dataset of temporal–spatial features.Finally,the iCapsNet model is employed to fully learn the temporal–spatial features of facial videos,and the results are fused.Experimental results show that the detection accuracy of iCapsNet–TSF reaches 94.07%,98.83%,and 98.50%on the Celeb-DF,FaceSwap,and Deepfakes datasets,respectively,displaying a better performance than most existing mainstream algorithms.The iCapsNet–TSF method combines the capsule network and the optical flow algorithm,providing a novel strategy for the deepfake detection,which is of great significance to the prevention of deepfake attacks and the preservation of cyberspace security.
基金the Scientific Research Foundation of Liaoning Provincial Department of Education(No.LJKZ0139)the Program for Liaoning Excellent Talents in University(No.LR15045).
Abstract: To improve the model's ability to express features in few-shot learning, a multi-scale features prototypical network (MS-PN) algorithm is proposed. A metric learning algorithm is employed to extract image features and project them into a feature space, where the similarity between samples is evaluated by their relative distances. To extract sufficient feature information from limited sample data and mitigate the impact of the constrained data volume, a multi-scale feature extraction network is presented to capture data features at various scales during image feature extraction. Additionally, the position of each prototype is fine-tuned by assigning weights to data points, reducing the influence of outliers. The loss function integrates contrastive loss and label smoothing to bring similar data points closer together and separate dissimilar ones within the metric space. Experimental evaluations are conducted on the small-sample datasets mini-ImageNet and CUB200-2011, where the proposed method achieves higher classification accuracy: in the 5-way 1-shot setting, accuracy reaches 50.13% and 66.79% on the two datasets, respectively, and in the 5-way 5-shot setting, accuracies of 66.79% and 85.91% are observed, respectively.
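The prototypical-network step underlying this kind of metric learning can be sketched as follows (an illustrative PyTorch snippet, not the MS-PN code): class prototypes are averaged from support embeddings and queries are classified by their distance to each prototype; the multi-scale extractor and prototype re-weighting are not reproduced here.

import torch
import torch.nn.functional as F

def prototypical_logits(support_emb, support_labels, query_emb, n_way):
    """support_emb: [n_support, d]; query_emb: [n_query, d]; labels in [0, n_way)."""
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(n_way)
    ])                                          # [n_way, d] class prototypes
    dists = torch.cdist(query_emb, prototypes)  # Euclidean distances in the metric space
    return -dists                               # smaller distance -> larger logit

# The classification term can then be F.cross_entropy(logits, query_labels,
# label_smoothing=0.1), with a contrastive term added on top as described above.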
Abstract: On the basis of the Arctic monthly mean sea ice extent data set for 1953-1984, the Arctic region is divided into eight subregions, and empirical orthogonal function, power spectrum, and maximum entropy spectrum analyses are performed to identify the major spatial and temporal features of sea ice fluctuations over the 32-year period; a brief physical explanation is then tentatively suggested. The results show that both the seasonal and non-seasonal variations of the sea ice extent are remarkable, and the mean annual peripheral ice positions as well as their interannual shifting amplitudes differ considerably among the subregions. These features are primarily affected by solar radiation, ocean circulation, sea surface temperature, and the maritime-continental contrast, while the non-seasonal variations are most likely influenced by cosmic-geophysical factors such as the shift of the Earth's pole, oscillations of the Earth's rotation, and solar activity.
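A generic sketch of the empirical orthogonal function (EOF) decomposition used in such analyses, applied to a time-by-region matrix of sea ice extent; the data set itself is not reproduced and the variable layout is an assumption.

import numpy as np

def eof_analysis(X):
    """X: [n_time, n_region] array of monthly sea ice extent."""
    anomalies = X - X.mean(axis=0)               # remove the long-term mean per region
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                                    # spatial patterns, one per mode
    pcs = u * s                                  # time coefficients of each mode
    explained = (s ** 2) / np.sum(s ** 2)        # fraction of variance per mode
    return eofs, pcs, explained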
Abstract: Face detection is applied to many tasks such as auto focus control, surveillance, user interfaces, and face recognition, and both its processing speed and detection accuracy have been improved continuously. This paper describes a novel method of fast face detection with a multi-scale window search that is free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value; these characteristics enable the multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times faster than a conventional method adopting HOG features combined with an SVM classifier, without any degradation in accuracy.
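As a hedged illustration of cell-wise gradient statistics (the exact SGI definition in the paper may differ, so this is only a generic stand-in), per-cell gradient-magnitude statistics can be computed like this:

import numpy as np
import cv2

def gradient_cell_stats(gray, cell=8):
    """gray: 2-D uint8 image; returns per-cell mean/std of gradient magnitude."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    h, w = gray.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            block = mag[y:y + cell, x:x + cell]
            feats.extend([block.mean(), block.std()])   # simple statistics per cell
    return np.asarray(feats, dtype=np.float32)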
Funding: Supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Abstract: Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in the segmentation results. The transformer model, in contrast, effectively captures long-range dependencies in the image, and combining a CNN with a transformer can extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which adaptive average pooling and position encoding enhance contextual features and multi-head attention further enriches the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with respect to preserving contours and textures.
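A simplified PyTorch sketch in the spirit of the multi-scale fusion component described above (not the authors' implementation): feature maps from different scales are pooled to a common size, flattened into tokens, and fused with multi-head self-attention; position encoding is omitted here for brevity.

import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    def __init__(self, channels, pooled=8, heads=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(pooled)           # bring every scale to one spatial size
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, feature_maps):
        """feature_maps: list of [B, C, Hi, Wi] tensors from different scales."""
        tokens = [self.pool(f).flatten(2).transpose(1, 2) for f in feature_maps]
        x = torch.cat(tokens, dim=1)                       # [B, N_tokens, C]
        fused, _ = self.attn(x, x, x)                      # self-attention across all scales
        return fused + x                                   # residual connection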
Funding: Supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Abstract: In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data are occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: a Joint Interpolation Module (JI Module), a Multi-scale Temporal Convolution Network (MS-TCN), and a Suppression Graph Convolutional Network (SGCN). The JI Module completes spatially occluded skeletal joints using K-Nearest Neighbors (KNN) interpolation. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body-part features, thereby reducing the negative impact of occlusion on the recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources such as BML, ICT-Pollick, and ELMD together with 1000 synthetic gaits generated using STEP-Gen, and ELMB, consisting of 3924 gaits, 1835 of which are labeled with the emotions "Happy," "Sad," "Angry," and "Neutral." On the standard Emotion-Gait and ELMB datasets, the proposed method achieves accuracies of 0.900 and 0.896, respectively, comparable to other state-of-the-art methods. Furthermore, on the occlusion datasets, the proposed method mitigates the performance degradation caused by occlusion significantly better than the compared methods.
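A hedged sketch of the joint-completion idea behind the JI Module: occluded joints are filled from nearby visible joints with a KNN-style average. The neighbor choice (nearest visible joints in the skeleton ordering) and the array layout are assumptions for illustration, not the paper's exact procedure.

import numpy as np

def knn_fill_joints(joints, mask, k=3):
    """joints: [J, 2] 2-D coordinates; mask: [J] bool, False where a joint is occluded."""
    filled = joints.copy()
    visible = np.where(mask)[0]
    for j in np.where(~mask)[0]:
        # take the k visible joints closest to j in the skeleton index ordering
        order = visible[np.argsort(np.abs(visible - j))][:k]
        filled[j] = joints[order].mean(axis=0)   # average their coordinates
    return filled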
Funding: Supported by the State Key Laboratory of Hydraulics and Mountain River Engineering (SKHL2210), the National Natural Science Foundation of China (42171304), the Sichuan Science and Technology Program (2023YFS0380), and the Natural Science Foundation of Jiangsu Province of China (BK20242018).
Abstract: The Yellow River Basin in Sichuan Province (YRS) is undergoing severe soil erosion and exacerbated ecological vulnerability, which together pose formidable challenges for regional water conservation (WC) and sustainable development. Effectively enhancing WC requires a comprehensive understanding of its driving factors and the corresponding intervention strategies, yet existing studies have largely neglected the spatiotemporal heterogeneity of both natural and socio-economic drivers. Therefore, this study explored the spatiotemporal heterogeneity of WC drivers in the YRS using multi-scale geographically weighted regression (MGWR) and geographically and temporally weighted regression (GTWR) models from an eco-hydrological perspective. We found that the more developed downstream regions achieved significantly better WC than the upstream regions. The results also show that the influence of temperature and wind speed is consistently dominant and temporally stable owing to climatic stability, while the influence of vegetation shifted from negative to positive around 2010, likely indicating greater benefits from understory vegetation. Economic growth positively affected WC in the upstream regions but negatively in the more developed downstream regions. These findings highlight the importance of targeted water conservation strategies, including locally appropriate revegetation, optimization of agricultural and economic structures, and the establishment of eco-compensation mechanisms for ecological conservation and sustainable development.
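For readers unfamiliar with this model family, a basic geographically weighted regression (the simpler relative of the MGWR/GTWR models used above) fits a locally weighted least-squares model at each location. The sketch below is illustrative only; the bandwidth, kernel, and variable names are placeholder assumptions.

import numpy as np

def gwr_coefficients(coords, X, y, bandwidth=50.0):
    """coords: [n, 2] station coordinates; X: [n, p] predictors; y: [n] response."""
    n = len(y)
    Xd = np.hstack([np.ones((n, 1)), X])                   # add intercept column
    betas = np.zeros((n, Xd.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-(d ** 2) / (2.0 * bandwidth ** 2))     # Gaussian spatial kernel
        W = np.diag(w)
        betas[i] = np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)
    return betas                                            # local coefficients per location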