Forests are vital ecosystems that play a crucial role in sustaining life on Earth and supporting human well-being. Traditional forest mapping and monitoring methods are often costly and limited in scope, necessitating the adoption of advanced, automated approaches for improved forest conservation and management. This study explores the application of deep learning-based object detection techniques for individual tree detection in RGB satellite imagery. A dataset of 3157 images was collected and divided into training (2528), validation (495), and testing (134) sets. To enhance model robustness and generalization, data augmentation was applied to the training part of the dataset. Various YOLO-based models, including YOLOv8, YOLOv9, YOLOv10, YOLOv11, and YOLOv12, were evaluated using different hyperparameters and optimization techniques, such as stochastic gradient descent (SGD) and auto-optimization. These models were assessed in terms of detection accuracy and the number of detected trees. The highest-performing model, YOLOv12m, achieved a mean average precision (mAP@50) of 0.908, mAP@50:95 of 0.581, recall of 0.851, precision of 0.852, and an F1-score of 0.847. The results demonstrate that YOLO-based object detection offers a highly efficient, scalable, and accurate solution for individual tree detection in satellite imagery, facilitating improved forest inventory, monitoring, and ecosystem management. This study underscores the potential of AI-driven tree detection to enhance environmental sustainability and support data-driven decision-making in forestry.
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs the baseline’s 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO’s superior performance in detecting objects and handling occlusions in complex drone scenarios.
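The mAP@0.5 and mAP@0.5:0.95 figures above are built on intersection over union (IoU) between predicted and ground-truth boxes: at mAP@0.5, a detection counts as correct when its IoU with a ground-truth box is at least 0.5. The following is a minimal sketch of that overlap test only, not the EHDC-YOLO evaluation code; the `(x1, y1, x2, y2)` box layout and the names `iou` and `is_match` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count as a hit."""
    return iou(pred_box, truth_box) >= threshold


# Hypothetical boxes: the prediction covers 60% of the union, so it passes at 0.5.
print(is_match((0, 0, 10, 10), (0, 0, 10, 6)))
```

mAP@0.5:0.95 repeats the same matching at IoU thresholds from 0.5 to 0.95 and averages the results, which is why it is always the stricter of the two numbers.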
Crop yield is a crucial metric in agriculture, essential for effective sector management and improving the overall production process. This indicator is heavily influenced by numerous environmental factors, particularly those related to soil and climate, which present a challenging task due to the complex interactions involved. In this paper, we introduce a novel integrated neurosymbolic framework that combines knowledge-based approaches with sensor data for crop-yield prediction. This framework merges predictions from vectors generated by modeling environmental factors using a newly developed ontology focused on key elements and evaluates this ontology using quantitative methods, specifically representation learning techniques, along with predictions derived from remote sensing imagery. We tested our proposed methodology on a public dataset centered on corn, aiming to predict crop yield. Our developed smart model achieved promising results in terms of crop-yield prediction, with a root mean squared error (RMSE) of 1.72, outperforming the baseline models. The ontology-based approach achieved an RMSE of 1.73, while the remote sensing-based method yielded an RMSE of 1.77. This confirms the superior performance of our proposed approach over those using single modalities. This integrated neurosymbolic approach demonstrates that the fusion of statistical and symbolic artificial intelligence (AI) represents a significant advancement in agricultural applications. It is particularly effective for crop-yield prediction at the field scale, thus facilitating more informed decision-making in advanced agricultural practices. Additionally, it is acknowledged that results might be further improved by incorporating more detailed ontological knowledge and testing the model with higher-resolution imagery to enhance prediction accuracy.
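The three models above are compared by root mean squared error, RMSE = sqrt(mean((predicted - observed)^2)), where lower is better. A small sketch of the metric follows; the yield values are made up for illustration and are not taken from the corn dataset.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equally long sequences of numbers."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical yields: errors are 1, -2, 0, 1, so the mean squared error is 1.5.
observed_yield = [10.0, 12.0, 9.0, 11.0]
predicted_yield = [11.0, 10.0, 9.0, 12.0]
print(rmse(observed_yield, predicted_yield))
```

Because the errors are squared before averaging, a single large miss raises RMSE more than several small ones, which is worth keeping in mind when reading differences as small as 1.72 versus 1.73.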
Stroke is a leading cause of death and disability worldwide, significantly impairing motor and cognitive functions. Effective rehabilitation is often hindered by the heterogeneity of stroke lesions, variability in recovery patterns, and the complexity of electroencephalography (EEG) signals, which are often contaminated by artifacts. Accurate classification of motor imagery (MI) tasks, involving the mental simulation of movements, is crucial for assessing rehabilitation strategies but is challenged by overlapping neural signatures and patient-specific variability. To address these challenges, this study introduces a graph-attentive convolutional long short-term memory (LSTM) network (GACL-Net), a novel hybrid deep learning model designed to improve MI classification accuracy and robustness. GACL-Net incorporates multi-scale convolutional blocks for spatial feature extraction, attention fusion layers for adaptive feature prioritization, graph convolutional layers to model inter-channel dependencies, and bidirectional LSTM layers with attention to capture temporal dynamics. Evaluated on an open-source EEG dataset of 50 acute stroke patients performing left and right MI tasks, GACL-Net achieved 99.52% classification accuracy and 97.43% generalization accuracy under leave-one-subject-out cross-validation, outperforming existing state-of-the-art methods. Additionally, its real-time processing capability, with prediction times of 33–56 ms on a T4 GPU, underscores its clinical potential for real-time neurofeedback and adaptive rehabilitation. These findings highlight the model’s potential for clinical applications in assessing rehabilitation effectiveness and optimizing therapy plans through precise MI classification.
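Leave-one-subject-out cross-validation, mentioned above, holds out every trial of one subject as the test set and trains on the remaining subjects, so the score reflects generalization to an unseen person. A minimal sketch of the split logic follows; the subject labels and the function name `leave_one_subject_out` are hypothetical, and the study's actual pipeline is not described in the abstract.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical layout: three subjects with two trials each.
trial_subjects = ["s01", "s01", "s02", "s02", "s03", "s03"]
for subject, train_idx, test_idx in leave_one_subject_out(trial_subjects):
    print(subject, train_idx, test_idx)
```

With the 50 patients in the dataset above, this scheme would produce 50 folds, each tested on a patient the model never saw during training.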
Urban environments offer a wealth of opportunities for residents to take respite from their hectic lives. Outdoor running or jogging has become an increasingly popular option. The impacts of urban environments on outdoor running, despite some initial studies, remain underexplored. This study aims to establish an analytical framework that can holistically assess the effect of the urban environment on the healthy vitality of running. The proposed framework is applied to two modern Chinese cities, Guangzhou and Shenzhen. We construct three interpretable random forest models to explore the non-linear relationship between environmental variables and running intensity (RI) by analyzing runners' trajectories and integrating them with multi-source urban big data (e.g., street view imagery, remote sensing, and socio-economic data) across the built, natural, and social dimensions. The findings reveal that road density has the greatest impact on RI, and that social variables (e.g., population density and housing price) and natural variables (e.g., slope and humidity) all have a notable impact on outdoor running. Despite these findings, the impact of environmental variables likely changes across regions due to disparate regional construction and micro-environments, and the specific impacts and optimal thresholds change as well. Therefore, the construction of healthy cities should take the whole urban environment into account and adapt to local conditions. This study provides a comprehensive evaluation of the variables influencing healthy vitality and guides sustainable urban planning for creating running-friendly cities.
While algorithms have been created for land use in urban settings, there have been few investigations into the extraction of urban footprint (UF). To address this research gap, the study employs several widely used image classification methods, grouped into three categories, to evaluate their segmentation capabilities for extracting UF across eight cities. The results indicate that pixel-based methods excel only in clear urban environments, and their overall accuracy is not consistently high. RF and SVM perform well but lack stability in object-based UF extraction, influenced by feature selection and classifier performance. Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts. SAM excels in medium-sized urban areas but falters in intricate layouts. Integrating traditional and deep learning methods optimizes UF extraction, balancing accuracy and processing efficiency. Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.
A brain-computer interface (BCI) based on motor imagery (MI) provides additional control pathways by decoding the intentions of the brain. MI ability has great intra-individual variability, and the majority of MI-BCI systems are unable to adapt to this variability, leading to poor training effects. Therefore, prediction of MI ability is needed. In this study, we propose an MI ability predictor based on multi-frequency EEG features. To validate the performance of the predictor, a video-guided paradigm and a traditional MI paradigm are designed, and the predictor is applied to both paradigms. The results demonstrate that all subjects achieved >85% prediction precision in both applications, with a maximum of 96%. This study indicates that the predictor can accurately predict individuals’ MI ability in different states, provide a scientific basis for personalized training, and enhance the effect of MI-BCI training.
Drone photography is an essential building block of intelligent transportation, enabling wide-ranging monitoring, precise positioning, and rapid transmission. However, the high computational cost of transformer-based methods in object detection tasks hinders real-time result transmission in drone target detection applications. Therefore, we propose a mask adaptive transformer (MAT) tailored for such scenarios. Specifically, we introduce a structure that supports collaborative token sparsification in support windows, enhancing fault tolerance and reducing computational overhead. This structure comprises two modules: a binary mask strategy and adaptive window self-attention (A-WSA). The binary mask strategy focuses on significant objects in various complex scenes. The A-WSA mechanism applies self-attention in a way that balances performance against computational cost, selecting objects and isolating all contextual leakage. Extensive experiments on the challenging CarPK and VisDrone datasets demonstrate the effectiveness and superiority of the proposed method. Specifically, it achieves a mean average precision (mAP@0.5) improvement of 1.25% over the car detector based on you only look once version 5 (CD-YOLOv5) on the CarPK dataset and a 3.75% average precision (AP@0.5) improvement over the cascaded zoom-in detector (CZ Det) on the VisDrone dataset.
Rapidly obtaining spatial distribution maps of secondary disasters triggered by strong earthquakes is crucial for understanding the disaster-causing processes in the earthquake hazard chain and formulating effective emergency response measures and post-disaster reconstruction plans. On April 3, 2024, an Mw 7.4 earthquake struck offshore east of Hualien, Taiwan, China, which triggered numerous coseismic landslides in bedrock mountain regions and severe soil liquefaction in coastal areas, resulting in significant economic losses. This study utilized post-earthquake emergency data from China's high-resolution optical satellite imagery and applied a visual interpretation method to establish a partial database of secondary disasters triggered by the 2024 Hualien earthquake. A total of 5348 coseismic landslides were identified, which were primarily distributed along the eastern slopes of the Central Mountain Range watersheds. In high mountain valleys, these landslides mainly manifest as localized bedrock collapses or slope debris flows, causing extensive damage to highways and tourism facilities. Their distribution partially overlaps with the landslide concentration zones triggered by the 1999 Chi-Chi earthquake. Additionally, 6040 soil liquefaction events were interpreted, predominantly in the Hualien Port area and the lowland valleys of the Hualien River and concentrated within the IX-intensity zone. The soil liquefaction was characterized by widespread surface subsidence and sand ejections. Verified against local field investigation data in Taiwan, rapid imaging through post-earthquake remote sensing data can effectively assess the distribution of coseismic landslides and soil liquefaction within high-intensity zones. This study provides efficient and reliable data for earthquake disaster response. Moreover, the results are critical for seismic disaster mitigation in high mountain valleys and coastal lowlands.
Spartina alterniflora is now listed among the world’s 100 most dangerous invasive species, severely affecting the ecological balance of coastal wetlands. Remote sensing technologies based on deep learning enable large-scale monitoring of Spartina alterniflora, but they require large datasets and have poor interpretability. A new method is proposed to detect Spartina alterniflora from Sentinel-2 imagery. Firstly, to capture the high canopy cover and dense community characteristics of Spartina alterniflora, multi-dimensional shallow features are extracted from the imagery. Secondly, to detect different objects from satellite imagery, index features are extracted, and the statistical features of the Gray-Level Co-occurrence Matrix (GLCM) are derived using principal component analysis. Then, ensemble learning methods, including random forest, extreme gradient boosting, and light gradient boosting machine models, are employed for image classification. Meanwhile, Recursive Feature Elimination with Cross-Validation (RFECV) is used to select the best feature subset. Finally, to enhance the interpretability of the models, the best features are utilized to classify multi-temporal images, and SHapley Additive exPlanations (SHAP) is combined with these classifications to explain the model prediction process. The method is validated using Sentinel-2 imagery and previous observations of Spartina alterniflora on Chongming Island. The model that incorporates image texture features such as GLCM covariance can significantly improve the detection accuracy of Spartina alterniflora, by about 8% compared with the model without image texture features. Through multiple model comparisons and feature selection via RFECV, the selected model and eight features demonstrated good classification accuracy when applied to data from different time periods, showing that feature reduction can effectively enhance model generalization. Additionally, visualizing model decisions using SHAP revealed that the image texture feature component_1_GLCMVariance is particularly important for identifying each land cover type.
Individual Tree Detection-and-Counting (ITDC) is among the important tasks in urban areas, and numerous methods have been proposed for it. Despite their many advantages, these methods are inadequate to provide robust results because they mostly rely on direct field investigations. This paper presents a novel approach that uses high-resolution imagery and Canopy-Height-Model (CHM) data to solve the ITDC problem. The new approach is studied in six urban scenes: farmland, woodland, park, industrial land, road, and residential areas. First, it identifies tree canopy regions from high-resolution imagery using a deep learning network. It then uses the CHM data to detect treetops within the canopy regions using a local maximum algorithm and to delineate individual tree canopies using region growing. Finally, it calculates and describes the number of individual trees and tree canopies. The proposed approach is tested on data from Shanghai, China. Our results show that the individual tree detection method had an average overall accuracy of 0.953, with a precision of 0.987 for the woodland scene. Meanwhile, the R^2 value for canopy segmentation in different urban scenes is greater than 0.780 and 0.779 for canopy area and diameter size, respectively. These results confirm that the proposed method is robust enough for urban tree planning and management.
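The treetop step described above, a local maximum search on the CHM restricted to canopy regions, can be sketched as follows. This is an illustrative version under stated assumptions, not the paper's implementation: the CHM is a small list-of-lists height grid, the canopy mask stands in for the deep learning output, and `min_height` and `radius` are hypothetical parameters. The region-growing step for individual canopies is not shown.

```python
def find_treetops(chm, canopy_mask, min_height=2.0, radius=1):
    """Return (row, col) cells that are strict local height maxima inside the canopy mask."""
    rows, cols = len(chm), len(chm[0])
    treetops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if not canopy_mask[r][c] or height < min_height:
                continue
            # Heights in the square window around (r, c), excluding the cell itself.
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat-topped crowns (plateaus) are not reported.
            if all(height > n for n in neighbours):
                treetops.append((r, c))
    return treetops


# Hypothetical 5x5 CHM in metres with two crowns, peaking at 5 m and 8 m.
chm = [
    [0, 0, 0, 0, 0],
    [0, 5, 3, 0, 0],
    [0, 3, 2, 4, 0],
    [0, 0, 4, 8, 0],
    [0, 0, 0, 0, 0],
]
canopy_mask = [[h > 0 for h in row] for row in chm]
tops = find_treetops(chm, canopy_mask)
print(len(tops), tops)
```

The window radius controls how close two treetops may be: a larger radius merges neighbouring crowns into one detection, while a smaller one risks counting side branches as separate trees.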
Understanding forest health is of great importance for the conservation of the integrity of forest ecosystems. The monitoring of forest health is, therefore, indispensable for the long-term conservation of forests and their sustainable management. In this regard, evaluating the amount and quality of dead wood is of utmost interest, as they are favorable indicators of biodiversity. Remote sensing-based Machine Learning (ML) techniques have proven to be more efficient and sustainable, with unprecedented accuracy in forest inventory. However, the application of these techniques is still in its infancy with respect to dead wood mapping. This study is the first to automatically categorize individual coniferous trees (Norway spruce) into five decay stages (live, declining, dead, loose bark, and clean) from combined Airborne Laser Scanning (ALS) point clouds and color infrared (CIR) images using three different ML methods: 3D point cloud-based deep learning (KPConv), Convolutional Neural Network (CNN), and Random Forest (RF). First, CIR colorized point clouds are created by fusing the ALS point clouds and color infrared images. Then, individual tree segmentation is conducted, after which the results are further projected onto four orthogonal planes. Finally, the classification is conducted on the two datasets (3D multispectral point clouds and 2D projected images) based on the three ML algorithms. All models achieved promising results, reaching overall accuracy (OA) of up to 88.8%, 88.4%, and 85.9% for KPConv, CNN, and RF, respectively. The experimental results reveal that color information, 3D coordinates, and intensity of point clouds have a significant impact on the classification performance. The performance of our models, therefore, shows the significance of machine/deep learning for individual tree decay stage classification and landscape-wide assessment of dead wood amount and quality using modern airborne remote sensing techniques. The proposed method can serve as an important and reliable tool for monitoring biodiversity in forest ecosystems.
Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmental monitoring. Addressing the limitations of conventional convolutional neural networks, we propose an innovative transformer-based method. This method leverages transformers, which are adept at processing data sequences, to enhance cloud detection accuracy. Additionally, we introduce a Cyclic Refinement Architecture that improves the resolution and quality of feature extraction, thereby aiding in the retention of critical details often lost during cloud detection. Our extensive experimental validation shows that our approach significantly outperforms established models, excelling in high-resolution feature extraction and precise cloud segmentation. By integrating Positional Visual Transformers (PVT) with this architecture, our method advances high-resolution feature delineation and segmentation accuracy. Ultimately, our research offers a novel perspective for surmounting traditional challenges in cloud detection and contributes to the advancement of precise and dependable image analysis across various domains.
The analysis of microstates in EEG signals is a crucial technique for understanding the spatiotemporal dynamics of brain electrical activity. Traditional methods such as Atomic Agglomerative Hierarchical Clustering (AAHC), K-means clustering, Principal Component Analysis (PCA), and Independent Component Analysis (ICA) are limited by a fixed number of microstate maps and insufficient capability in cross-task feature extraction. To tackle these limitations, this study introduces a Global Map Dissimilarity (GMD)-driven density canopy K-means clustering algorithm. This approach autonomously determines the optimal number of EEG microstate topographies and employs Gaussian kernel density estimation alongside the GMD index for dynamic modeling of EEG data. Using this algorithm, the study analyzes the Motor Imagery (MI) dataset from the GigaScience database, GigaDB. The findings reveal six distinct microstates during actual right-hand movement and five microstates across other task conditions, with microstate C showing superior performance in all task states. During imagined movement, microstate A was significantly enhanced. Comparison with existing algorithms indicates a significant improvement in clustering performance by the refined method, with an average Calinski-Harabasz Index (CHI) of 35517.29 and an average Davies-Bouldin Index (DBI) of 2.57. Furthermore, an information-theoretical analysis of the microstate sequences suggests that imagined movement exhibits higher complexity and disorder than actual movement. Using the extracted microstate sequence parameters as features, the improved algorithm achieved a classification accuracy of 98.41% in EEG signal categorization for motor imagery. An accuracy of 78.183% was achieved in a four-class motor imagery task on the BCI-IV-2a dataset. These results demonstrate the potential of the algorithm in microstate analysis, offering a more effective tool for a deeper understanding of the spatiotemporal features of EEG signals.
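The GMD index that drives the clustering above compares the shape of two scalp topographies independently of their strength. In its commonly used form, each map is average-referenced and divided by its global field power (GFP), and GMD is the root mean square difference between the two normalised maps, ranging from 0 (identical topography) to 2 (inverted topography). The sketch below follows that standard definition; how the study combines it with density canopy K-means is not described in the abstract, and the electrode values are hypothetical.

```python
import math


def _normalise(topography):
    """Average-reference a map and scale it to unit global field power."""
    n = len(topography)
    mean = sum(topography) / n
    centred = [v - mean for v in topography]
    gfp = math.sqrt(sum(v * v for v in centred) / n)
    if gfp == 0:
        raise ValueError("a flat map has no defined topography")
    return [v / gfp for v in centred]


def global_map_dissimilarity(map_u, map_v):
    """GMD between two maps sampled at the same electrodes (0 = same shape, 2 = inverted)."""
    if len(map_u) != len(map_v):
        raise ValueError("maps must cover the same electrodes")
    u = _normalise(map_u)
    v = _normalise(map_v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


# Hypothetical four-electrode map, compared with a scaled copy and an inverted copy.
map_a = [1.0, -2.0, 0.5, 3.0]
print(global_map_dissimilarity(map_a, [2.0 * x for x in map_a]))
print(global_map_dissimilarity(map_a, [-x for x in map_a]))
```

Because of the GFP normalisation, a stronger or weaker copy of the same topography has a GMD of zero, which is what makes the index suitable for grouping maps by spatial pattern rather than amplitude.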
Motor imagery (MI) based electroencephalogram (EEG) represents a frontier in enabling direct neural control of external devices and advancing neural rehabilitation. This study introduces a novel time embedding technique, termed traveling-wave based time embedding, utilized as a pseudo channel to enhance the decoding accuracy of MI-EEG signals across various neural network architectures. Unlike traditional neural network methods that fail to account for individual differences in the temporal dynamics of MI-EEG, our approach captures time-related changes for different participants based on a priori knowledge. Through extensive experimentation with multiple participants, we demonstrate that this method not only improves classification accuracy but also exhibits greater adaptability to individual differences compared to the position encoding used in the Transformer architecture. Significantly, our results reveal that traveling-wave based time embedding crucially enhances decoding accuracy, particularly for participants typically considered to exhibit “EEG illiteracy”. As a novel direction in EEG research, traveling-wave based time embedding not only offers fresh insights for neural network decoding strategies but also opens new avenues for research into attention mechanisms in neuroscience and a deeper understanding of EEG signals.
Leaving no one behind is a worldwide goal, but it is difficult to make policy to address this issue because we do not have a thorough knowledge of where poverty exists and in what forms due to a lack of data, particularly in developing countries. Household interview surveys are the common way to collect such information, but conducting large-scale surveys frequently is difficult from the perspective of cost and time. Here, we show a novel method for estimating the income levels of individual buildings in urban and peri-urban rural areas. The combination of high-resolution satellite imagery and household interview survey data obtained by visiting households on the ground makes it possible to estimate income levels at a detailed scale for the first time. These data are often handled in different academic disciplines and are rarely used in combination. Using the results, we can determine the number and location of poor people at the local scale. We can also identify areas with particularly high concentrations of poor people. This information enables planning and policy making for more effective poverty reduction and disaster prevention measures tailored to local conditions. Thus, the results of this study will help developing countries to achieve sustainable development.
Imagery analysis is a commonly used analytical method in literary analysis. In Angela Carter’s work, the image of wolves is particularly prominent. Her “Werewolf Tetralogy” rewrites traditional culture and subverts traditional consciousness, and is the research object of many scholars. Starting from the analysis of the wolf image in The Company of Wolves, this paper uses Deleuze’s Becoming-Animal Theory to explore the construction of harmony between nature, humans and gender relations in The Company of Wolves.
Deep learning has been applied to motor imagery electroencephalogram (MI-EEG) classification in brain-computer systems to help people who suffer from serious neuromotor disorders. Network inefficiency and data shortage are the primary issues that researchers face and need to solve. A novel MI-EEG classification method is proposed in this paper. A plain convolutional neural network (pCNN), which contains two convolution layers, is designed to extract the temporal-spatial information of MI-EEG, and a linear interpolation-based data augmentation (LIDA) method is introduced, by which any two unrepeated trials are randomly selected to generate a new data sample. Experiments on two publicly available brain-computer interface competition datasets are conducted to confirm the structure of the pCNN and to optimize the parameters of both the pCNN and LIDA. The average classification accuracies reach 90.27% and 98.23%, and the average Kappa values are 0.805 and 0.965, respectively. The experimental results show the advantage of the proposed classification method in both accuracy and statistical consistency compared with existing methods.
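The LIDA idea above, generating a new trial from two distinct randomly chosen trials by linear interpolation, can be sketched as below. The abstract does not give the interpolation weights or say whether pairs must share a class label, so both are assumptions here: the weight is drawn uniformly from [0, 1), the caller is assumed to pass trials of a single class, and the names `interpolate_trials` and `augment` are hypothetical.

```python
import random


def interpolate_trials(trial_a, trial_b, weight):
    """Blend two trials (lists of channels, each a list of samples) sample by sample."""
    if len(trial_a) != len(trial_b):
        raise ValueError("trials must have the same number of channels")
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(channel_a, channel_b)]
        for channel_a, channel_b in zip(trial_a, trial_b)
    ]


def augment(trials, n_new, rng):
    """Create n_new synthetic trials, each from two different randomly selected trials."""
    new_trials = []
    for _ in range(n_new):
        first, second = rng.sample(range(len(trials)), 2)  # two distinct indices
        weight = rng.random()  # assumed uniform mixing weight
        new_trials.append(interpolate_trials(trials[first], trials[second], weight))
    return new_trials


# Hypothetical single-class data: three one-channel trials with two samples each.
same_class_trials = [[[0.0, 1.0]], [[2.0, 3.0]], [[4.0, 5.0]]]
synthetic = augment(same_class_trials, n_new=5, rng=random.Random(0))
print(len(synthetic))
```

Every synthetic sample lies between the two source samples it was built from, so the augmented trials stay inside the range of the recorded data rather than extrapolating beyond it.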
Funding (individual tree detection study): funding from the Horizon Europe Framework Programme (HORIZON), call Teaming for Excellence (HORIZON-WIDERA-2022-ACCESS-01-two-stage), Creation of the centre of excellence in smart forestry “Forest 4.0”, No. 101059985; funded by the European Union under the project FOREST 4.0, “Ekscelencijos centras tvariai miško bioekonomikai vystyti”, No. 10-042-P-0002.
Abstract: With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs the baseline’s 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO’s superior performance in detecting objects and handling occlusions in complex drone scenarios.
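The mAP@0.5 figures quoted in the detection abstracts count a predicted box as a true positive only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that check, not code from either paper; the function names `iou` and `is_true_positive` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max).
    """
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A detection matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

mAP@0.5:0.95 repeats the same matching at IoU thresholds from 0.5 to 0.95 and averages the resulting average precision values, which is why it is always the lower of the two numbers.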
Funding: Partially funded by JSPS KAKENHI Grant Number JP22K18004.
Abstract: Crop yield is a crucial metric in agriculture, essential for effective sector management and for improving the overall production process. This indicator is heavily influenced by numerous environmental factors, particularly those related to soil and climate, and the complex interactions among them make prediction a challenging task. In this paper, we introduce a novel integrated neurosymbolic framework that combines knowledge-based approaches with sensor data for crop-yield prediction. The framework merges two sources of predictions: those obtained from vectors generated by modeling environmental factors with a newly developed ontology focused on key elements, and those derived from remote sensing imagery. The ontology itself is evaluated using quantitative methods, specifically representation learning techniques. We tested the proposed methodology on a public dataset centered on corn, aiming to predict crop yield. The developed model achieved promising results, with a root mean squared error (RMSE) of 1.72, outperforming the baseline models. The ontology-based approach achieved an RMSE of 1.73, while the remote sensing-based method yielded an RMSE of 1.77. This confirms the superior performance of the proposed approach over those using single modalities. The integrated neurosymbolic approach demonstrates that the fusion of statistical and symbolic artificial intelligence (AI) represents a significant advancement in agricultural applications. It is particularly effective for crop-yield prediction at the field scale, thus facilitating more informed decision-making in advanced agricultural practices. The authors acknowledge that results might be further improved by incorporating more detailed ontological knowledge and by testing the model with higher-resolution imagery.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT under Grant NRF-2022R1A2C1005316.
Abstract: Stroke is a leading cause of death and disability worldwide, significantly impairing motor and cognitive functions. Effective rehabilitation is often hindered by the heterogeneity of stroke lesions, variability in recovery patterns, and the complexity of electroencephalography (EEG) signals, which are often contaminated by artifacts. Accurate classification of motor imagery (MI) tasks, involving the mental simulation of movements, is crucial for assessing rehabilitation strategies but is challenged by overlapping neural signatures and patient-specific variability. To address these challenges, this study introduces a graph-attentive convolutional long short-term memory (LSTM) network (GACL-Net), a novel hybrid deep learning model designed to improve MI classification accuracy and robustness. GACL-Net incorporates multi-scale convolutional blocks for spatial feature extraction, attention fusion layers for adaptive feature prioritization, graph convolutional layers to model inter-channel dependencies, and bidirectional LSTM layers with attention to capture temporal dynamics. Evaluated on an open-source EEG dataset of 50 acute stroke patients performing left and right MI tasks, GACL-Net achieved 99.52% classification accuracy and 97.43% generalization accuracy under leave-one-subject-out cross-validation, outperforming existing state-of-the-art methods. Additionally, its real-time processing capability, with prediction times of 33–56 ms on a T4 GPU, underscores its clinical potential for real-time neurofeedback and adaptive rehabilitation. These findings highlight the model’s potential for clinical applications in assessing rehabilitation effectiveness and optimizing therapy plans through precise MI classification.
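Leave-one-subject-out cross-validation, as named in the GACL-Net abstract, holds out every trial from one subject per fold so that the test subject is never seen during training. The sketch below shows only the splitting logic under that general definition; the function name `leave_one_subject_out` and the per-trial list of subject labels are assumptions for illustration, and nothing here reproduces the authors' pipeline.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per fold.

    subject_ids[i] is the (hypothetical) subject label of trial i.
    Each subject is held out exactly once, so the number of folds
    equals the number of distinct subjects.
    """
    for held_out in sorted(set(subject_ids)):
        # All trials from the held-out subject form the test set.
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        # Every other trial is available for training.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test
```

With 50 patients this yields 50 folds, and the reported generalization accuracy would typically be the average over those folds.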
Funding: National Natural Science Foundation of China, No. 42171455; The Hong Kong RGC Research Impact Fund, No. R5011-23; The Hong Kong General Research Fund, No. 15204121.
Abstract: Urban environments offer residents a wealth of opportunities to take respite from their hectic lives, and outdoor running or jogging has become an increasingly popular option. Despite some initial studies, the impacts of urban environments on outdoor running remain underexplored. This study aims to establish an analytical framework that can holistically assess the effect of the urban environment on the healthy vitality of running. The proposed framework is applied to two modern Chinese cities, Guangzhou and Shenzhen. We construct three interpretable random forest models to explore the non-linear relationship between environmental variables and running intensity (RI) by analyzing runners' trajectories and integrating them with multi-source urban big data (e.g., street view imagery, remote sensing, and socio-economic data) across the built, natural, and social dimensions. The findings show that road density has the greatest impact on RI, and that social variables (e.g., population density and housing price) and natural variables (e.g., slope and humidity) also have a notable impact on outdoor running. The impact of environmental variables is nonetheless likely to change across regions because of disparate regional construction and micro-environments, and the specific impacts and optimal thresholds change accordingly. The construction of healthy cities should therefore take the whole urban environment into account and adapt to local conditions. This study provides a comprehensive evaluation of the variables influencing healthy vitality and guides sustainable urban planning for creating running-friendly cities.
Abstract: While algorithms have been created for land use in urban settings, there have been few investigations into the extraction of urban footprint (UF). To address this research gap, the study employs several widely used image classification methods, grouped into three categories, to evaluate their segmentation capabilities for extracting UF across eight cities. The results indicate that pixel-based methods excel only in clear urban environments, and their overall accuracy is not consistently high. Random forest (RF) and support vector machine (SVM) classifiers perform well but lack stability in object-based UF extraction, where results are influenced by feature selection and classifier performance. Deep learning enhances feature extraction but requires powerful computing and faces challenges with complex urban layouts. SAM excels in medium-sized urban areas but falters in intricate layouts. Integrating traditional and deep learning methods optimizes UF extraction, balancing accuracy and processing efficiency. Future research should focus on adapting algorithms for diverse urban landscapes to enhance UF extraction accuracy and applicability.
Funding: Supported by the Natural Science Foundation of Hebei Province (F2024202019) and the National Natural Science Foundation of China (32201072).
Abstract: A brain-computer interface (BCI) based on motor imagery (MI) provides additional control pathways by decoding the intentions of the brain. MI ability has great intra-individual variability, and the majority of MI-BCI systems are unable to adapt to this variability, leading to poor training effects. Prediction of MI ability is therefore needed. In this study, we propose an MI ability predictor based on multi-frequency EEG features. To validate the performance of the predictor, a video-guided paradigm and a traditional MI paradigm are designed, and the predictor is applied to both paradigms. The results demonstrate that all subjects achieved >85% prediction precision in both applications, with a maximum of 96%. This study indicates that the predictor can accurately predict an individual’s MI ability in different states, provide a scientific basis for personalized training, and enhance the effect of MI-BCI training.
Abstract: Drone photography is an essential building block of intelligent transportation, enabling wide-ranging monitoring, precise positioning, and rapid transmission. However, the high computational cost of transformer-based methods in object detection tasks hinders real-time result transmission in drone target detection applications. We therefore propose a mask adaptive transformer (MAT) tailored for such scenarios. Specifically, we introduce a structure that supports collaborative token sparsification in support windows, enhancing fault tolerance and reducing computational overhead. This structure comprises two modules: a binary mask strategy and adaptive window self-attention (A-WSA). The binary mask strategy focuses on significant objects in various complex scenes. The A-WSA mechanism applies self-attention so as to balance performance against computational cost, selecting objects and isolating all contextual leakage. Extensive experiments on the challenging CarPK and VisDrone datasets demonstrate the effectiveness and superiority of the proposed method. Specifically, it achieves a mean average precision (mAP@0.5) improvement of 1.25% over the car detector based on you only look once version 5 (CD-YOLOv5) on the CarPK dataset and a 3.75% average precision (AP@0.5) improvement over the cascaded zoom-in detector (CZ Det) on the VisDrone dataset.
Funding: Funded by the Basic Research Program of the Institute of Earthquake Forecasting, China Earthquake Administration (Grant No. CEAIEF20240302), the National Natural Science Foundation of China (Grant No. 42072248), and the National Key Research and Development Program of China (Grant Nos. 2021YFC3000600 and 2019YFE0108900).
Abstract: Rapidly obtaining spatial distribution maps of secondary disasters triggered by strong earthquakes is crucial for understanding the disaster-causing processes in the earthquake hazard chain and for formulating effective emergency response measures and post-disaster reconstruction plans. On April 3, 2024, an M_W 7.4 earthquake struck offshore east of Hualien, Taiwan, China, triggering numerous coseismic landslides in bedrock mountain regions and severe soil liquefaction in coastal areas, and resulting in significant economic losses. This study used post-earthquake emergency data from China's high-resolution optical satellite imagery and applied a visual interpretation method to establish a partial database of secondary disasters triggered by the 2024 Hualien earthquake. A total of 5348 coseismic landslides were identified, distributed primarily along the eastern slopes of the Central Mountain Range watersheds. In high mountain valleys, these landslides mainly manifest as localized bedrock collapses or slope debris flows, causing extensive damage to highways and tourism facilities. Their distribution partially overlaps with the landslide concentration zones triggered by the 1999 Chi-Chi earthquake. Additionally, 6040 soil liquefaction events were interpreted, predominantly in the Hualien Port area and the lowland valleys of the Hualien River, and concentrated within the IX-intensity zone. The liquefaction was characterized by widespread surface subsidence and sand ejections. Verification against local field investigation data in Taiwan shows that rapid imaging with post-earthquake remote sensing data can effectively assess the distribution of coseismic landslides and soil liquefaction within high-intensity zones. This study provides efficient and reliable data for earthquake disaster response, and the results are critical for seismic disaster mitigation in high mountain valleys and coastal lowlands.
Funding: The National Key Research and Development Program of China under contract No. 2023YFC3008204; the National Natural Science Foundation of China under contract Nos. 41977302 and 42476217.
Abstract: Spartina alterniflora is now listed among the world’s 100 most dangerous invasive species, severely affecting the ecological balance of coastal wetlands. Remote sensing technologies based on deep learning enable large-scale monitoring of Spartina alterniflora, but they require large datasets and have poor interpretability. A new method is proposed to detect Spartina alterniflora from Sentinel-2 imagery. First, to capture the high canopy cover and dense community characteristics of Spartina alterniflora, multi-dimensional shallow features are extracted from the imagery. Second, to distinguish different objects in satellite imagery, index features are extracted, and the statistical features of the Gray-Level Co-occurrence Matrix (GLCM) are derived using principal component analysis. Then, ensemble learning methods, including random forest, extreme gradient boosting, and light gradient boosting machine models, are employed for image classification, while Recursive Feature Elimination with Cross-Validation (RFECV) is used to select the best feature subset. Finally, to enhance the interpretability of the models, the best features are used to classify multi-temporal images, and SHapley Additive exPlanations (SHAP) is combined with these classifications to explain the model prediction process. The method is validated using Sentinel-2 imagery and previous observations of Spartina alterniflora on Chongming Island. The model combining image texture features such as GLCM covariance improves the detection accuracy of Spartina alterniflora by about 8% compared with the model without image texture features. Through multiple model comparisons and feature selection via RFECV, the selected model and eight features demonstrated good classification accuracy when applied to data from different time periods, indicating that feature reduction can effectively enhance model generalization. Additionally, visualizing model decisions using SHAP revealed that the image texture feature component_1_GLCMVariance is particularly important for identifying each land cover type.
Funding: Supported by the International Research Center of Big Data for Sustainable Development Goals [Grant Number CBAS2022GSP07]; the Fundamental Research Funds for the Central Universities; the Chongqing Natural Science Foundation [Grant Number CSTB2022NSCQMSX 2069]; and the Ministry of Education of China [Grant Number 19JZD023].
Abstract: Individual Tree Detection-and-Counting (ITDC) is an important task in urban areas, and numerous methods have been proposed for it. Despite their many advantages, these methods still fail to provide robust results because they rely mostly on direct field investigations. This paper presents a novel approach that uses high-resolution imagery and Canopy-Height-Model (CHM) data to solve the ITDC problem. The approach is studied in six urban scenes: farmland, woodland, park, industrial land, road, and residential areas. First, it identifies tree canopy regions from high-resolution imagery using a deep learning network. It then uses the CHM data to detect treetops within the canopy regions with a local maximum algorithm, and to delineate individual tree canopies with region growing. Finally, it calculates and describes the number of individual trees and tree canopies. The proposed approach is tested on data from Shanghai, China. The results show that the individual tree detection method had an average overall accuracy of 0.953, with a precision of 0.987 for the woodland scene. Meanwhile, the R² value for canopy segmentation in the different urban scenes is greater than 0.780 and 0.779 for canopy area and diameter size, respectively. These results confirm that the proposed method is robust enough for urban tree planning and management.
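The treetop step described above can be illustrated with a generic local maximum filter over a canopy height raster: a cell is a treetop candidate when it is strictly higher than every neighbour in a surrounding window. This is a minimal sketch of that general technique rather than the paper's implementation; the function name `find_treetops`, the window radius, and the `min_height` cutoff are assumed values chosen only for illustration.

```python
def find_treetops(chm, window=1, min_height=2.0):
    """Return (row, col, height) for each local maximum in a CHM grid.

    chm        : list of rows of canopy heights (e.g. metres).
    window     : neighbourhood radius in cells (assumed value).
    min_height : cells lower than this are ignored (assumed value).
    """
    rows, cols = len(chm), len(chm[0])
    treetops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue  # too low to be a tree crown
            # Collect neighbours inside the window, clipped at the edges.
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat plateaus produce no treetop.
            if all(height > n for n in neighbours):
                treetops.append((r, c, height))
    return treetops
```

In a pipeline like the one the abstract outlines, the returned cells would serve as seeds for region growing, and the count of seeds gives the tree count. A larger window suppresses spurious peaks within one crown at the cost of merging closely spaced trees.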
Funding: Supported by the National Natural Science Foundation of China [Grant No. 42171361], the Research Grants Council of the Hong Kong Special Administrative Region, China [Grant No. PolyU 25211819], and The Hong Kong Polytechnic University, China [Grant Nos. 1-ZVN6, 1-ZECE].
Abstract: Understanding forest health is of great importance for conserving the integrity of forest ecosystems. Monitoring forest health is therefore indispensable for the long-term conservation of forests and their sustainable management. In this regard, evaluating the amount and quality of dead wood is of utmost interest, as both are favorable indicators of biodiversity. Remote sensing-based Machine Learning (ML) techniques have proven to be more efficient and sustainable, with unprecedented accuracy in forest inventory. However, the application of these techniques to dead wood mapping is still in its infancy. This study, for the first time, automatically categorizes individual coniferous trees (Norway spruce) into five decay stages (live, declining, dead, loose bark, and clean) from combined Airborne Laser Scanning (ALS) point clouds and color infrared (CIR) images using three different ML methods: 3D point cloud-based deep learning (KPConv), Convolutional Neural Network (CNN), and Random Forest (RF). First, CIR-colorized point clouds are created by fusing the ALS point clouds and color infrared images. Then, individual tree segmentation is conducted, after which the results are projected onto four orthogonal planes. Finally, classification is conducted on the two datasets (3D multispectral point clouds and 2D projected images) using the three ML algorithms. All models achieved promising results, reaching overall accuracy (OA) of up to 88.8%, 88.4%, and 85.9% for KPConv, CNN, and RF, respectively. The experimental results reveal that color information, 3D coordinates, and the intensity of point clouds have a significant impact on classification performance. The performance of the models therefore shows the significance of machine and deep learning for classifying the decay stages of individual trees and for landscape-wide assessment of dead wood amount and quality using modern airborne remote sensing techniques. The proposed method can serve as an important and reliable tool for monitoring biodiversity in forest ecosystems.
Funding: Funded by the Chongqing Normal University Startup Foundation for PhD (22XLB021) and supported by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2023B40).
Abstract: Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmental monitoring. Addressing the limitations of conventional convolutional neural networks, we propose an innovative transformer-based method. This method leverages transformers, which are adept at processing data sequences, to enhance cloud detection accuracy. Additionally, we introduce a Cyclic Refinement Architecture that improves the resolution and quality of feature extraction, thereby aiding in the retention of critical details often lost during cloud detection. Our extensive experimental validation shows that our approach significantly outperforms established models, excelling in high-resolution feature extraction and precise cloud segmentation. By integrating Positional Visual Transformers (PVT) with this architecture, our method advances high-resolution feature delineation and segmentation accuracy. Ultimately, our research offers a novel perspective for surmounting traditional challenges in cloud detection and contributes to the advancement of precise and dependable image analysis across various domains.
Funding: Funded by the National Natural Science Foundation of China, the Yunnan Fundamental Research Projects, the Special Project of Guangdong Province in Key Fields of Ordinary Colleges and Universities, and the Chaozhou Science and Technology Plan Project (Grant Numbers 82060329, 202201AT070108, 2023ZDZX2038, and 202201GY01, respectively).
Abstract: The analysis of microstates in EEG signals is a crucial technique for understanding the spatiotemporal dynamics of brain electrical activity. Traditional methods such as Atomic Agglomerative Hierarchical Clustering (AAHC), K-means clustering, Principal Component Analysis (PCA), and Independent Component Analysis (ICA) are limited by a fixed number of microstate maps and insufficient capability in cross-task feature extraction. To tackle these limitations, this study introduces a Global Map Dissimilarity (GMD)-driven density canopy K-means clustering algorithm. This approach autonomously determines the optimal number of EEG microstate topographies and employs Gaussian kernel density estimation alongside the GMD index for dynamic modeling of EEG data. Using this algorithm, the study analyzes the Motor Imagery (MI) dataset from the GigaScience database, GigaDB. The findings reveal six distinct microstates during actual right-hand movement and five microstates across the other task conditions, with microstate C showing superior performance in all task states. During imagined movement, microstate A was significantly enhanced. Comparison with existing algorithms indicates a significant improvement in clustering performance by the refined method, with an average Calinski-Harabasz Index (CHI) of 35517.29 and an average Davies-Bouldin Index (DBI) of 2.57. Furthermore, an information-theoretical analysis of the microstate sequences suggests that imagined movement exhibits higher complexity and disorder than actual movement. Using the extracted microstate sequence parameters as features, the improved algorithm achieved a classification accuracy of 98.41% in EEG signal categorization for motor imagery. An accuracy of 78.183% was achieved in a four-class motor imagery task on the BCI-IV-2a dataset. These results demonstrate the potential of the algorithm in microstate analysis, offering a more effective tool for a deeper understanding of the spatiotemporal features of EEG signals.
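Global Map Dissimilarity, the index that drives the clustering in the microstate abstract, is commonly defined as the root mean squared difference between two scalp topographies after each has been average-referenced and scaled by its global field power (GFP). The sketch below follows that common definition; it is not taken from the paper, and the function names and the flat list-of-electrode-values input are assumptions for illustration.

```python
import math


def _normalise(topography):
    """Average-reference a map and scale it to unit global field power."""
    n = len(topography)
    mean = sum(topography) / n
    centred = [v - mean for v in topography]
    # GFP is the standard deviation of the map across electrodes.
    gfp = math.sqrt(sum(v * v for v in centred) / n)
    if gfp == 0:
        raise ValueError("flat topography has zero global field power")
    return [v / gfp for v in centred]


def global_map_dissimilarity(map_u, map_v):
    """GMD between two topographies: 0 = same shape, 2 = inverted shape."""
    if len(map_u) != len(map_v):
        raise ValueError("maps must have the same number of electrodes")
    u = _normalise(map_u)
    v = _normalise(map_v)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))
```

Because both maps are normalised first, GMD ignores overall signal strength and compares only the spatial pattern. Microstate studies often also treat polarity-inverted maps as equivalent, which this sketch deliberately does not do.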
Abstract: Motor imagery (MI) based electroencephalogram (EEG) decoding represents a frontier in enabling direct neural control of external devices and advancing neural rehabilitation. This study introduces a novel time embedding technique, termed traveling-wave based time embedding, which is used as a pseudo channel to enhance the decoding accuracy of MI-EEG signals across various neural network architectures. Unlike traditional neural network methods, which fail to account for individual differences in the temporal dynamics of MI-EEG, our approach captures time-related changes for different participants based on a priori knowledge. Through extensive experimentation with multiple participants, we demonstrate that this method not only improves classification accuracy but also adapts better to individual differences than the position encoding used in the Transformer architecture. Significantly, our results reveal that traveling-wave based time embedding enhances decoding accuracy particularly for participants typically considered to show “EEG-illiteracy”. As a novel direction in EEG research, traveling-wave based time embedding not only offers fresh insights for neural network decoding strategies but also opens new avenues for research into attention mechanisms in neuroscience and a deeper understanding of EEG signals.
Abstract: Leaving no one behind is a worldwide goal, but it is difficult to make policy to address this issue because, owing to a lack of data, we do not have a thorough knowledge of where poverty exists and in what forms, particularly in developing countries. Household interview surveys are the common way to collect such information, but conducting large-scale surveys frequently is difficult in terms of cost and time. Here, we show a novel method for estimating the income levels of individual buildings in urban and peri-urban rural areas. Combining high-resolution satellite imagery with household interview survey data obtained by visiting households on the ground makes it possible to estimate income levels at a detailed scale for the first time. These data are often handled in different academic disciplines and are rarely used in combination. Using the results, we can determine the number and location of poor people at the local scale and identify areas with particularly high concentrations of poor people. This information enables planning and policy making for more effective poverty reduction and disaster prevention measures tailored to local conditions. The results of this study will thus help developing countries to achieve sustainable development.
Abstract: Imagery analysis is a commonly used method in literary analysis. In Angela Carter’s work, the image of wolves is particularly prominent. Her “Werewolf Tetralogy” rewrites traditional culture and subverts traditional consciousness, and has been the research object of many scholars. Starting from an analysis of the wolf image in The Company of Wolves, this paper uses Deleuze’s Becoming-Animal theory to explore the construction of harmony among nature, humans, and gender relations in The Company of Wolves.
Funding: National Natural Science Foundation of China (Nos. 62173010 and 11832003).
Abstract: Deep learning has been applied to motor imagery electroencephalogram (MI-EEG) classification in brain-computer systems to help people who suffer from serious neuromotor disorders. Network inefficiency and data shortage are the primary issues that researchers face and need to solve. A novel MI-EEG classification method is proposed in this paper. A plain convolutional neural network (pCNN), which contains two convolution layers, is designed to extract the temporal-spatial information of MI-EEG, and a linear interpolation-based data augmentation (LIDA) method is introduced, by which any two unrepeated trials are randomly selected to generate a new trial. Experiments on two publicly available brain-computer interface competition datasets are conducted to confirm the structure of the pCNN and to optimize the parameters of both the pCNN and LIDA. The average classification accuracies reach 90.27% and 98.23%, and the average Kappa values are 0.805 and 0.965, respectively. The experimental results show the advantage of the proposed classification method in both accuracy and statistical consistency compared with existing methods.
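The augmentation idea in this abstract, generating a new trial by linear interpolation between two randomly selected trials, can be sketched as follows. This is an illustrative reading only: the abstract does not give the interpolation weight, say whether pairs are restricted to one class, or specify the data layout, so the function name `lida_augment`, the default `weight=0.5`, the assumption that the caller passes trials of a single class, and the `[channel][sample]` nesting are all assumptions.

```python
import itertools
import random


def lida_augment(trials, n_new, weight=0.5, seed=None):
    """Create n_new synthetic trials by interpolating pairs of trials.

    trials : list of trials, each a list of channels, each a list of
             samples (assumed layout); all trials share one class label.
    weight : interpolation weight for the first trial (assumed value).
    No pair of trials is used more than once.
    """
    rng = random.Random(seed)
    # Every unordered pair of distinct trials is a candidate.
    pairs = list(itertools.combinations(range(len(trials)), 2))
    if n_new > len(pairs):
        raise ValueError("not enough distinct trial pairs")
    new_trials = []
    for i, j in rng.sample(pairs, n_new):  # sampling without replacement
        new_trials.append([
            [weight * a + (1.0 - weight) * b for a, b in zip(ch_i, ch_j)]
            for ch_i, ch_j in zip(trials[i], trials[j])
        ])
    return new_trials
```

Sampling pairs without replacement is one way to honour the "unrepeated" wording; with N trials per class it caps the number of synthetic trials at N(N-1)/2.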