Research indicates that microbial activity within the human body significantly influences health and is closely linked to various diseases. Accurately predicting microbe-disease interactions (MDIs) offers critical insights for disease intervention and pharmaceutical research. Current advanced AI-based technologies automatically generate robust representations of microbes and diseases, enabling effective MDI predictions. However, these models continue to face significant challenges. A major issue is their reliance on complex feature extractors and classifiers, which substantially diminishes the models' generalizability. To address this, we introduce a novel graph autoencoder framework that uses decoupled representation learning and multi-scale information fusion strategies to efficiently infer potential MDIs. First, we randomly mask portions of the input microbe-disease graph according to a Bernoulli distribution to boost self-supervised training and minimize noise-related performance degradation. Second, we employ decoupled representation learning, compelling the graph neural network (GNN) to learn the weights for each feature subspace independently, thus enhancing its expressive power. Finally, we apply multi-scale information fusion to combine the multi-layer outputs of the GNN, reducing information loss due to occlusion. Extensive experiments on public datasets demonstrate that our model significantly surpasses existing top MDI prediction models. This indicates that our model can accurately predict unknown MDIs and is likely to aid in disease discovery and precision pharmaceutical research. Code and data are accessible at: https://github.com/shmildsj/MDI-IFDRL.
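The Bernoulli masking step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bernoulli_mask_edges`, the `mask_prob` parameter, and the toy edge list are all assumptions made for illustration. Each edge of the microbe-disease graph is masked by an independent Bernoulli draw; the masked edges would then serve as reconstruction targets during self-supervised training.

```python
import random


def bernoulli_mask_edges(edges, mask_prob, seed=0):
    """Split edges into (kept, masked) using one Bernoulli(mask_prob) draw per edge.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = random.Random(seed)
    kept, masked = [], []
    for edge in edges:
        # rng.random() is uniform on [0, 1), so this branch fires with
        # probability mask_prob, independently for every edge.
        if rng.random() < mask_prob:
            masked.append(edge)
        else:
            kept.append(edge)
    return kept, masked


# Toy bipartite graph (assumed data): (microbe_id, disease_id) pairs.
edges = [(m, d) for m in range(4) for d in range(3)]
kept, masked = bernoulli_mask_edges(edges, mask_prob=0.3, seed=42)
```

The encoder would see only `kept`, while `masked` supplies the positive examples that the decoder is asked to recover.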
The identification of influential nodes in complex networks is one of the most exciting topics in network science. The latest work successfully compares each node using local connectivity and weak tie theory from a new perspective. We study the structural properties of networks in depth and extend this successful node evaluation from single-scale to multi-scale. In particular, a novel position parameter based on node transmission efficiency is proposed, which mainly depends on the shortest distances from target nodes to high-degree nodes. On this basis, the novel multi-scale information importance (MSII) method is proposed to better identify the crucial nodes by combining the network's local connectivity and global position information. In simulation comparisons, five state-of-the-art algorithms, i.e., the neighbor nodes degree algorithm (NND), betweenness centrality, closeness centrality, Katz centrality, and the k-shell decomposition method, are selected for comparison with our MSII. The results demonstrate that our method obtains superior performance in terms of robustness and spreading propagation for both real-world and artificial networks.
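The abstract above says the position parameter depends mainly on shortest distances from target nodes to high-degree nodes, but it does not give the formula. The sketch below therefore computes only that raw ingredient, under stated assumptions: `distances_to_hubs` is a hypothetical helper, "high-degree nodes" are taken to be the `num_hubs` nodes of largest degree, and distance is measured in hops on an unweighted graph via a multi-source breadth-first search.

```python
from collections import deque


def distances_to_hubs(adj, num_hubs):
    """Hop distance from every reachable node to its nearest high-degree node.

    adj maps each node to a list of neighbours (undirected, unweighted).
    Illustrative only: the MSII position parameter itself is not reproduced.
    """
    # Assumption: hubs are simply the num_hubs nodes with the largest degree.
    hubs = sorted(adj, key=lambda n: len(adj[n]), reverse=True)[:num_hubs]
    dist = {hub: 0 for hub in hubs}
    queue = deque(hubs)
    # Multi-source BFS: all hubs start at distance 0 and expand together.
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


# Toy network (assumed data): node 0 is the only hub.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
dist = distances_to_hubs(adj, num_hubs=1)
```

Nodes with small values in `dist` sit close to a hub; how these distances are weighted against local connectivity is specific to the paper and is not assumed here.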
Images taken in dim environments frequently exhibit issues such as insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on multi-scale multi-attention and a Transformer architecture. First, to address texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance structural and texture modeling capabilities. Second, to solve the misalignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions into a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluation in terms of numerical metrics and visual quality demonstrates that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which integrates the advantages of Convolutional Neural Networks and Transformers, balances computational costs, and achieves high-performance building change detection. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to enhance the distinction between true changes and background noise. CSWR integrates the Haar-based Discrete Wavelet Transform with multi-head cross-attention mechanisms, enabling cross-scale feature fusion while significantly improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets demonstrate MewCDNet's superiority over comparison methods, with F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. Furthermore, MewCDNet performs best on the multi-class SYSU dataset (F1: 82.71%), highlighting its strong generalization capability.
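The Haar-based Discrete Wavelet Transform mentioned in the abstract above is a standard operation, and one decomposition level can be sketched in plain Python. This is only an illustration of the transform itself, not of the CSWR module: the function name `haar_dwt2` and the band names are assumptions, and sub-band naming conventions (LH versus HL) differ between libraries.

```python
def haar_dwt2(image):
    """One level of the orthonormal 2D Haar transform on an even-sized 2D list.

    Returns four half-resolution bands: approximation, row-difference detail,
    column-difference detail, and diagonal detail (names are illustrative).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for i in range(0, rows, 2):
        bands = ([], [], [], [])
        for j in range(0, cols, 2):
            # Each non-overlapping 2x2 block [[a, b], [c, d]] yields one
            # coefficient per band; dividing by 2 keeps the transform orthonormal.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            bands[0].append((a + b + c + d) / 2)  # local average (low-pass)
            bands[1].append((a + b - c - d) / 2)  # top row minus bottom row
            bands[2].append((a - b + c - d) / 2)  # left column minus right column
            bands[3].append((a - b - c + d) / 2)  # diagonal difference
        approx.append(bands[0])
        row_detail.append(bands[1])
        col_detail.append(bands[2])
        diag_detail.append(bands[3])
    return approx, row_detail, col_detail, diag_detail


# Toy patch (assumed data) containing a vertical edge between its two columns.
patch = [[0, 1], [0, 1]]
approx, row_detail, col_detail, diag_detail = haar_dwt2(patch)
```

Only the column-difference band responds to the vertical edge in `patch`, which is the property that makes wavelet detail bands useful for sharpening building boundaries.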
Impact craters are important for understanding lunar geological evolution and surface erosion rates, among other functions. However, micro impact craters are numerous and their morphological characteristics are not obvious, resulting in low detection accuracy by deep learning models. Therefore, we propose a new multi-scale fusion crater detection algorithm (MSF-CDA) based on YOLO11 to improve the accuracy of lunar impact crater detection, especially for small craters with a diameter of <1 km. Using images taken by the LROC (Lunar Reconnaissance Orbiter Camera) at the Chang'e-4 (CE-4) landing area, we constructed three separate datasets for craters with diameters of 0-70 m, 70-140 m, and >140 m, and trained three submodels separately on these datasets. Additionally, we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters. To handle redundant predictions, we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results for overlapping targets across the multi-scale submodels. Our MSF-CDA method achieved high detection performance, with Precision, Recall, and F1 score of 0.991, 0.987, and 0.989, respectively, effectively addressing the problems caused by the sparse features and sample imbalance of small craters. MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and for finer geological age estimation. The strategy can also be used to detect other small objects with sparse features and sample imbalance. We detected approximately 500,000 impact craters in an area of approximately 214 km² around the CE-4 landing area. By statistically analyzing the new data, we updated the distribution function of the number and diameter of impact craters. Finally, we identified the most suitable lighting conditions for detecting impact craters by analyzing the effect of different lighting conditions on detection accuracy.
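The abstract above names a Non-Maximum Suppression with Area Filtering step but does not specify it. The following is a minimal sketch under loud assumptions: it applies ordinary greedy IoU-based NMS, preceded by a hypothetical area filter that discards boxes outside a `[min_area, max_area]` range (for example, so that each size-specific submodel contributes only craters in its own diameter band). The function names, thresholds, and box format are illustrative and are not taken from the paper.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2, ...)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def nms_with_area_filter(detections, iou_thresh, min_area, max_area):
    """Greedy NMS over detections (x1, y1, x2, y2, score) after an area filter.

    Assumption: the area filter is a simple range check applied before NMS.
    """
    candidates = [d for d in detections if min_area <= box_area(d) <= max_area]
    # Visit boxes from the highest score down; keep a box only if it does not
    # overlap an already-kept box by iou_thresh or more.
    candidates.sort(key=lambda d: d[4], reverse=True)
    kept = []
    for det in candidates:
        if all(iou(det, other) < iou_thresh for other in kept):
            kept.append(det)
    return kept


# Toy detections (assumed data) pooled from several submodels.
detections = [
    (0, 0, 10, 10, 0.90),
    (1, 1, 11, 11, 0.80),     # overlaps the first box heavily
    (20, 20, 30, 30, 0.70),
    (0, 0, 100, 100, 0.95),   # far too large for this size band
]
kept = nms_with_area_filter(detections, iou_thresh=0.5, min_area=50, max_area=500)
```

In this toy run the oversized box is removed by the area check and the second box is suppressed by the first, leaving two detections.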
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
1 General information
Journal of Geographical Sciences is an international academic journal that publishes papers of the highest quality in physical geography, natural resources, environmental sciences, geographic information sciences, remote sensing and cartography. Manuscripts come from different parts of the world.
With the growing advancement of wireless communication technologies, WiFi-based human sensing has gained increasing attention as a non-intrusive and device-free solution. Among the available signal types, Channel State Information (CSI) offers fine-grained temporal, frequency, and spatial insights into multipath propagation, making it a crucial data source for human-centric sensing. Recently, the integration of deep learning has significantly improved the robustness and automation of feature extraction from CSI in complex environments. This paper provides a comprehensive review of deep learning-enhanced human sensing based on CSI. We first outline mainstream CSI acquisition tools and their hardware specifications, then provide a detailed discussion of preprocessing methods such as denoising, time–frequency transformation, data segmentation, and augmentation. Subsequently, we categorize deep learning approaches according to sensing tasks (detection, localization, and recognition) and highlight representative models across application scenarios. Finally, we examine key challenges including domain generalization, multi-user interference, and limited data availability, and we propose future research directions involving lightweight model deployment, multimodal data fusion, and semantic-level sensing.
Single-structure deep learning fault diagnosis models suffer from insufficient feature extraction and weak fault classification capability. This paper proposes a multi-scale deep feature fusion intelligent fault diagnosis method based on information entropy. First, a normal autoencoder, denoising autoencoder, sparse autoencoder, and contractive autoencoder are used in parallel to construct a multi-scale deep neural network feature extraction structure. Second, a deep feature fusion strategy based on information entropy is proposed to obtain low-dimensional features and ensure the robustness of the model and the quality of the deep features. Finally, a deep belief network, chosen for its advantages as a probabilistic model, is used as the fault classifier to identify the faults. The effectiveness of the proposed method was verified on a gearbox test bed. Experimental results show that, compared with traditional and existing intelligent fault diagnosis methods, the proposed method can obtain representative information and features from the raw data and achieves higher classification accuracy.
Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today's Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as part of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone, because the intricate lexical structure often degrades performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character word conditional random field (CRF) (BCWC) model. It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method alongside traditional BERT embeddings. By comparing datasets with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying dilation rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments conducted on three public datasets demonstrates that the proposed method outperforms recent advanced baselines. BCWC can address the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential for applications that need more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Because lightweight, high-strength Ti/Al alloys are difficult to bond owing to their considerable disparity in properties, Al 6063 was proposed as an intermediate layer to fabricate a TC4/Al 6063/Al 7075 three-layer composite plate by explosive welding. The microscopic properties of each bonding interface were characterized by field emission scanning electron microscopy and electron backscattered diffraction (EBSD). A methodology combining finite element method-smoothed particle hydrodynamics (FEM-SPH) and molecular dynamics (MD) was proposed to analyze the forming and evolution characteristics of explosive welding interfaces at multiple scales. The results demonstrate that the bonding interface morphologies of TC4/Al 6063 and Al 6063/Al 7075 exhibit a flat and wavy configuration, without discernible defects or cracks. Grain refinement is observed in the vicinity of the two bonding interfaces. Furthermore, the degree of plastic deformation of TC4 and Al 7075 is more pronounced than that of Al 6063 in the intermediate layer. The interface morphology obtained by FEM-SPH simulation is highly similar to the experimental results. MD simulations reveal that the diffusion of interfacial elements occurs predominantly during the unloading phase, and the simulated thickness of interfacial diffusion aligns well with experimental outcomes. Introducing an intermediate layer in the explosive welding process can effectively produce high-quality titanium/aluminum alloy composite plates. Furthermore, this approach offers a multi-scale simulation strategy for the study of explosive welding bonding interfaces.
With the development of smart cities and smart technologies, parks, as functional units of the city, are facing smart transformation. The development of smart parks can help address challenges of technology integration within urban spaces and serve as testbeds for exploring smart city planning and governance models. Information models facilitate the effective integration of technology into space. Building Information Modeling (BIM) and City Information Modeling (CIM) have been widely used in urban construction. However, existing information models have limitations when applied to parks, so it is necessary to develop an information model suited to the park. This paper first traces the evolution of park smart transformation, reviews the global landscape of smart park development, and identifies key trends and persistent challenges. Addressing the particularities of parks, the concept of Park Information Modeling (PIM) is proposed. PIM leverages smart technologies such as artificial intelligence, digital twins, and collaborative sensing to help form a 'space-technology-system' smart structure, enabling systematic management of diverse park spaces, addressing the deficiency in park-level information models, and aiming to achieve scale articulation between BIM and CIM. Finally, through a detailed top-level design application case study of the Nanjing Smart Education Park in China, this paper illustrates how the PIM concept is translated into practice, showcasing its potential to provide smart management tools for park managers and enhance services for park stakeholders, although further empirical validation is required.
Improving the volumetric energy density of supercapacitors is essential for practical applications and relies heavily on the dense storage of ions in carbon-based electrodes. The functional units of a carbon-based electrode exhibit multi-scale structural characteristics, including macroscopic electrode morphologies, mesoscopic microcrystals and pores, and microscopic defects and dopants in the carbon basal plane. Therefore, the ordered combination of the multi-scale structures of a carbon electrode is crucial for achieving dense energy storage and high volumetric performance by leveraging the functions of structures at each scale. Considering that previous reviews have focused on structures at one specific scale, this review takes a multi-scale perspective and systematically discusses recent progress on the structure-performance relationship, underlying mechanisms, and directional design of carbon-based multi-scale structures, including carbon morphology, pore structure, carbon basal plane micro-environment, and electrode technology, with respect to dense energy storage and the volumetric properties of supercapacitors. We analyze in detail the effects of the morphology, pores, and micro-environment of carbon electrode materials on dense ion storage, summarize the specific effects of structures at different scales on volumetric properties together with recent research progress, and discuss the mutual influence and trade-offs between structures at different scales. In addition, the challenges and outlook for improving the dense storage and volumetric performance of carbon-based supercapacitors are analyzed, which can provide a feasible technical reference and guidance for the design and manufacture of dense carbon-based electrode materials.
Convolutional neural network (CNN)-based technologies have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. Notably, the transformer model can effectively capture long-range dependencies in an image, and combining a CNN with a transformer can effectively extract both local details and global contextual features. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet through comparative experiments on four benchmark medical image segmentation datasets, particularly with regard to preserving contours and textures.
Prediction of production decline and evaluation of the adsorbed/free gas ratio are critical for determining the lifespan and production status of shale gas wells. Traditional production prediction methods have shortcomings because of the low permeability and tightness of shale, the complex gas flow behavior arising from multi-scale gas transport regions and superimposed gas transport mechanisms, and the complex and variable production regimes of shale gas wells. Recent research has demonstrated the existence of a multi-stage isotope fractionation phenomenon during shale gas production, with the fractionation characteristics of each stage associated with the pore structure, gas in place (GIP), adsorption/desorption, and gas production process. This study presents a new approach for estimating shale gas well production and evaluating the adsorbed/free gas ratio throughout production using isotope fractionation techniques. A reservoir-scale carbon isotope fractionation (CIF) model applicable to the production process of shale gas wells was developed for the first time in this research. In contrast to the traditional model, this model improves production prediction accuracy by simultaneously fitting the gas production rate and δ¹³C₁ data, and it provides a new method for evaluating the adsorbed/free gas ratio during shale gas production. The results indicate that the diffusion and adsorption/desorption properties of the rock, the bottom-hole flowing pressure (BHP) of the gas well, and the multi-scale gas transport regions of the reservoir all affect isotope fractionation. The diffusion and adsorption/desorption parameters of the rock have the greatest effect, in the order D*/D, P_L, V_L, α, and others. We tested the universality of the four-stage isotope fractionation feature and revealed a unique isotope fractionation mechanism caused by the superimposed coupling of multi-scale gas transport regions during shale gas well production. Finally, we applied the established CIF model to a shale gas well in the Sichuan Basin, China, and calculated the estimated ultimate recovery (EUR) of the well to be 3.33×10⁸ m³; the adsorbed gas ratio during shale gas production was 1.65%, 10.03%, and 23.44% in the first, fifth, and tenth years, respectively. The findings are significant for understanding the isotope fractionation mechanism during natural gas transport in complex systems and for formulating and optimizing unconventional natural gas development strategies.
Multi-scale systems remain a classical scientific problem in fluid dynamics, biology, and other fields. In the present study, a scheme of multi-scale physics-informed neural networks (msPINNs) is proposed to solve the boundary layer flow at high Reynolds numbers without any data. The flow is divided into several regions with different scales based on Prandtl's boundary layer theory, and each region is solved with governing equations at its own scale. The method of matched asymptotic expansions is used to make the flow field continuous. A flow over a semi-infinite flat plate at a high Reynolds number is treated as a multi-scale problem because the boundary layer scale is much smaller than the outer flow scale. Comparison with reference numerical solutions shows that the msPINNs can solve the multi-scale boundary layer problem in high Reynolds number flows. This scheme can be extended to more multi-scale problems in the future.
In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: a Joint Interpolation Module (JI Module), a Multi-scale Temporal Convolution Network (MS-TCN), and a Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using K-Nearest Neighbors (KNN) interpolation. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources such as BML, ICT-Pollick, and ELMD together with 1000 synthetic gaits generated using STEP-Gen technology, and ELMB, consisting of 3924 gaits, 1835 of which are labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard Emotion-Gait and ELMB datasets, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion and achieves significantly higher accuracy than other methods.
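The joint completion idea in the abstract above (filling spatially occluded joints by KNN interpolation) can be sketched as follows. The abstract does not say how neighbours are chosen or weighted, so everything here is an assumption: `knn_fill_joint` is a hypothetical helper, "nearest" is measured in hops along an assumed skeleton adjacency, and the fill value is the unweighted mean of the k nearest observed joints.

```python
from collections import deque


def knn_fill_joint(coords, adj, missing, k):
    """Estimate an occluded joint as the mean of its k nearest observed joints.

    coords maps joint name -> (x, y), or None when the joint is occluded.
    adj maps joint name -> list of neighbouring joints in the skeleton.
    Illustrative only: hop-distance neighbours and equal weights are assumptions.
    """
    seen = {missing}
    queue = deque([missing])
    neighbours = []
    # Breadth-first search outward from the occluded joint, collecting
    # observed joints in order of increasing hop distance.
    while queue and len(neighbours) < k:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
            if coords[nxt] is not None and len(neighbours) < k:
                neighbours.append(coords[nxt])
    if not neighbours:
        raise ValueError("no observed joints reachable from the occluded joint")
    count = len(neighbours)
    return tuple(sum(point[axis] for point in neighbours) / count for axis in range(2))


# Toy arm chain (assumed data): the elbow is occluded.
adj = {
    "shoulder": ["elbow"],
    "elbow": ["shoulder", "wrist"],
    "wrist": ["elbow", "hand"],
    "hand": ["wrist"],
}
coords = {"shoulder": (0.0, 0.0), "elbow": None, "wrist": (2.0, 0.0), "hand": (3.0, 0.0)}
elbow = knn_fill_joint(coords, adj, "elbow", k=2)
```

With `k=2` the elbow is placed midway between the shoulder and the wrist; a distance-weighted mean or neighbours taken from adjacent frames would be equally plausible readings of "KNN interpolation".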
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information on the feature map and improving the target detection ability of the model. Second, a pyramid pooling module, multi atrous spatial pyramid pooling (MASPP), is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, thereby improving the model's ability to handle multi-scale objects. The experimental results show that the improved YOLOv8 model achieves a detection accuracy of 92% and a mean average precision (mAP) of 87.9% on the DIOR dataset, which are 3.5% and 1.7% higher, respectively, than those of the original model. This shows that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has been improved.
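Atrous (dilated) convolution, which the abstract above builds on, can be illustrated in one dimension. This is a generic sketch of the operation and of running several dilation rates in parallel, in the spirit of spatial pyramid pooling; it is not the MASPP module, and the function name, kernel, and rates are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation whose kernel taps are `dilation` samples apart.

    With dilation=1 this is an ordinary convolution layer without padding;
    larger dilations widen the receptive field without adding weights.
    """
    span = (len(kernel) - 1) * dilation + 1  # receptive field of one output
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


# Toy inputs (assumed data): a ramp signal and a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# Parallel branches with different dilation rates see different scales
# of the same input, which is the idea behind atrous pyramid pooling.
branches = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2, 3)}
```

On the ramp signal each branch measures the difference across a wider span as the rate grows, using the same three weights.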
Funding (microbe-disease interaction prediction study): supported by the Natural Science Foundation of Wenzhou University of Technology, China (Grant No. ky202211).
文摘Research indicates that microbe activity within the human body significantly influences health by being closely linked to various diseases.Accurately predicting microbe-disease interactions(MDIs)offers critical insights for disease intervention and pharmaceutical research.Current advanced AI-based technologies automatically generate robust representations of microbes and diseases,enabling effective MDI predictions.However,these models continue to face significant challenges.A major issue is their reliance on complex feature extractors and classifiers,which substantially diminishes the models’generalizability.To address this,we introduce a novel graph autoencoder framework that utilizes decoupled representation learning and multi-scale information fusion strategies to efficiently infer potential MDIs.Initially,we randomly mask portions of the input microbe-disease graph based on Bernoulli distribution to boost self-supervised training and minimize noise-related performance degradation.Secondly,we employ decoupled representation learning technology,compelling the graph neural network(GNN)to independently learn the weights for each feature subspace,thus enhancing its expressive power.Finally,we implement multi-scale information fusion technology to amalgamate the multi-layer outputs of GNN,reducing information loss due to occlusion.Extensive experiments on public datasets demonstrate that our model significantly surpasses existing top MDI prediction models.This indicates that our model can accurately predict unknown MDIs and is likely to aid in disease discovery and precision pharmaceutical research.Code and data are accessible at:https://github.com/shmildsj/MDI-IFDRL.
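The Bernoulli masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at the repository linked above). The function name `bernoulli_mask_edges`, the edge-list representation, and the `mask_prob` parameter are assumptions made for illustration only.

```python
import random


def bernoulli_mask_edges(edges, mask_prob, seed=None):
    """Split an edge list into kept and masked edges.

    Each edge is masked independently with probability `mask_prob`,
    i.e. one Bernoulli(mask_prob) draw per edge. The names and the
    edge-list format are hypothetical, chosen for this sketch.
    """
    if not 0.0 <= mask_prob <= 1.0:
        raise ValueError("mask_prob must lie in [0, 1]")
    rng = random.Random(seed)
    kept, masked = [], []
    for edge in edges:
        # rng.random() is uniform on [0, 1), so this is a Bernoulli draw.
        if rng.random() < mask_prob:
            masked.append(edge)
        else:
            kept.append(edge)
    return kept, masked


if __name__ == "__main__":
    # Toy microbe-disease pairs (made-up identifiers).
    toy_edges = [("m1", "d1"), ("m1", "d2"), ("m2", "d1"), ("m3", "d3")]
    visible, hidden = bernoulli_mask_edges(toy_edges, mask_prob=0.3, seed=0)
    print("visible to the encoder:", visible)
    print("held out as reconstruction targets:", hidden)
```

In a masked graph autoencoder of this general kind, the encoder would see only the kept edges and the masked edges would serve as reconstruction targets.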
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11801430, 11801200, 61877046, and 61877047).
Abstract: The identification of influential nodes in complex networks is one of the most exciting topics in network science. The latest work successfully compares each node using local connectivity and weak tie theory from a new perspective. We study the structural properties of networks in depth and extend this successful node evaluation from single-scale to multi-scale. In particular, a novel position parameter based on node transmission efficiency is proposed, which mainly depends on the shortest distances from target nodes to high-degree nodes. On this basis, the novel multi-scale information importance (MSII) method is proposed to better identify crucial nodes by combining the network's local connectivity and global position information. In simulation comparisons, five state-of-the-art algorithms, i.e., the neighbor nodes degree algorithm (NND), betweenness centrality, closeness centrality, Katz centrality, and the k-shell decomposition method, are selected for comparison with our MSII. The results demonstrate that our method obtains superior performance in terms of robustness and spreading propagation for both real-world and artificial networks.
Funding: Funded by the National Natural Science Foundation of China, grant numbers 52374156 and 62476005.
Abstract: Images taken in dim environments frequently exhibit issues such as insufficient brightness, noise, color shifts, and loss of detail. These problems pose significant challenges to dark image enhancement tasks. Current approaches, while effective in global illumination modeling, often struggle to simultaneously suppress noise and preserve structural details, especially under heterogeneous lighting. Furthermore, misalignment between luminance and color channels introduces additional challenges to accurate enhancement. In response to these difficulties, we introduce a single-stage framework, M2ATNet, built on multi-scale multi-attention and a Transformer architecture. First, to address the problems of texture blurring and residual noise, we design a multi-scale multi-attention denoising module (MMAD), which is applied separately to the luminance and color channels to enhance the structural and texture modeling capabilities. Second, to solve the misalignment of the luminance and color channels, we introduce the multi-channel feature fusion Transformer (CFFT) module, which effectively recovers dark details and corrects color shifts through cross-channel alignment and deep feature interaction. To guide the model to learn more stably and efficiently, we also fuse multiple types of loss functions to form a hybrid loss term. We extensively evaluate the proposed method on various standard datasets, including LOL-v1, LOL-v2, DICM, LIME, and NPE. Evaluations in terms of numerical metrics and visual quality demonstrate that M2ATNet consistently outperforms existing advanced approaches. Ablation studies further confirm the critical roles played by the MMAD and CFFT modules in detail preservation and visual fidelity under challenging illumination-deficient environments.
Funding: Supported by the Henan Province Key R&D Project under Grant 241111210400; the Henan Provincial Science and Technology Research Project under Grants 252102211047, 252102211062, 252102211055 and 232102210069; the Jiangsu Provincial Scheme Double Initiative Plan JSS-CBS20230474; the XJTLU RDF-21-02-008; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; and the Higher Education Teaching Reform Research and Practice Project of Henan Province under Grant 2024SJGLX0126.
Abstract: Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which integrates the advantages of Convolutional Neural Networks and Transformers, balances computational costs, and achieves high-performance building change detection. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to enhance the distinction between true changes and background noise. CSWR integrates the Haar-based Discrete Wavelet Transform with multi-head cross-attention mechanisms, enabling cross-scale feature fusion while significantly improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets demonstrate MewCDNet's superiority over comparison methods: it achieves F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. Furthermore, MewCDNet exhibits optimal performance on the multi-class SYSU dataset (F1: 82.71%), highlighting its exceptional generalization capability.
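For readers unfamiliar with the Haar-based Discrete Wavelet Transform named in this abstract, the following is a minimal one-dimensional, single-level sketch. It only illustrates the transform itself. How CSWR applies it to two-dimensional feature maps is not specified in the abstract, and the function names `haar_dwt` and `haar_idwt` are chosen here for illustration.

```python
import math


def haar_dwt(signal):
    """Single-level orthonormal Haar DWT of an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums capture the
    coarse content, scaled pairwise differences capture local changes
    such as edges.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, recovering the original sequence."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


if __name__ == "__main__":
    row = [4.0, 6.0, 10.0, 12.0]
    low, high = haar_dwt(row)
    print("approximation:", low)
    print("detail:", high)
    print("reconstructed:", haar_idwt(low, high))
```

Because the transform is invertible, splitting features into approximation and detail bands loses no information, which is one commonly cited reason for using wavelets in place of plain downsampling.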
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFF0711400), which provided valuable financial support and resources for this research on lunar geology, and the National Space Science Data Center Youth Open Project (Grant No. NSSDC2302001), which facilitated the progress of the research and provided a platform for communication and cooperation with experts in the field.
Abstract: Impact craters are important for understanding lunar geologic evolution and surface erosion rates, among other functions. However, the morphological characteristics of micro impact craters are not obvious and the craters are numerous, resulting in low detection accuracy by deep learning models. Therefore, we propose a new multi-scale fusion crater detection algorithm (MSF-CDA) based on YOLO11 to improve the accuracy of lunar impact crater detection, especially for small craters with a diameter of <1 km. Using images taken by the LROC (Lunar Reconnaissance Orbiter Camera) at the Chang'e-4 (CE-4) landing area, we constructed three separate datasets for craters with diameters of 0-70 m, 70-140 m, and >140 m. We then trained three submodels separately on these three datasets. Additionally, we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters. To handle redundant predictions, we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results for overlapping targets across the multi-scale submodels. Our new MSF-CDA method achieved high detection performance, with Precision, Recall, and F1 score values of 0.991, 0.987, and 0.989, respectively, effectively addressing the problems induced by the sparse features and sample imbalance of small craters. MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and for finer geological age estimations. This strategy can also be used to detect other small objects with sparse features and sample imbalance problems. We detected approximately 500,000 impact craters in an area of approximately 214 km² around the CE-4 landing area. By statistically analyzing the new data, we updated the distribution function of the number and diameter of impact craters. Finally, we identified the most suitable lighting conditions for detecting impact crater targets by analyzing the effect of different lighting conditions on the detection accuracy.
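As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall values stated in the abstract. The helper name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract.
    print(round(f1_score(0.991, 0.987), 3))  # prints 0.989
```

The recomputed value agrees with the F1 score of 0.989 given in the abstract.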
Abstract: With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
Abstract: 1 General information. Journal of Geographical Sciences is an international academic journal that publishes papers of the highest quality in physical geography, natural resources, environmental sciences, geographic information sciences, remote sensing and cartography. Manuscripts come from different parts of the world.
Funding: Supported by the National Natural Science Foundation of China (NSFC) under grant U23A20310.
Abstract: With the growing advancement of wireless communication technologies, WiFi-based human sensing has gained increasing attention as a non-intrusive and device-free solution. Among the available signal types, Channel State Information (CSI) offers fine-grained temporal, frequency, and spatial insights into multipath propagation, making it a crucial data source for human-centric sensing. Recently, the integration of deep learning has significantly improved the robustness and automation of feature extraction from CSI in complex environments. This paper provides a comprehensive review of deep learning-enhanced human sensing based on CSI. We first outline mainstream CSI acquisition tools and their hardware specifications, then provide a detailed discussion of preprocessing methods such as denoising, time-frequency transformation, data segmentation, and augmentation. Subsequently, we categorize deep learning approaches according to sensing tasks (namely detection, localization, and recognition) and highlight representative models across application scenarios. Finally, we examine key challenges including domain generalization, multi-user interference, and limited data availability, and we propose future research directions involving lightweight model deployment, multimodal data fusion, and semantic-level sensing.
Funding: Supported by the National Natural Science Foundation of China and Civil Aviation Administration of China Joint Funded Project (Grant No. U1733108) and the Key Project of Tianjin Science and Technology Support Program (Grant No. 16YFZCSY00860).
Abstract: Single-structure deep learning fault diagnosis models suffer from insufficient feature extraction and weak fault classification capability. This paper proposes a multi-scale deep feature fusion intelligent fault diagnosis method based on information entropy. First, a normal autoencoder, denoising autoencoder, sparse autoencoder, and contractive autoencoder are used in parallel to construct a multi-scale deep neural network feature extraction structure. Second, a deep feature fusion strategy based on information entropy is proposed to obtain low-dimensional features and ensure the robustness of the model and the quality of the deep features. Finally, a deep belief network, chosen for its advantages as a probabilistic model, is used as the fault classifier to identify the faults. The effectiveness of the proposed method was verified on a gearbox test-bed. Experimental results show that, compared with traditional and existing intelligent fault diagnosis methods, the proposed method can obtain representative information and features from the raw data and achieves higher classification accuracy.
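The abstract does not spell out how information entropy drives the feature fusion. As one plausible illustration only, the sketch below uses the generic entropy weight method, in which a feature whose values are spread evenly across samples (high entropy) receives a low weight. This is a technique swapped in for illustration and may differ from the paper's actual fusion strategy. The names `entropy_weights` and `fuse` are hypothetical.

```python
import math


def entropy_weights(samples):
    """Entropy weight method over non-negative feature columns.

    `samples` is a list of at least two rows, each a list of
    non-negative feature values. Columns with lower normalized Shannon
    entropy (more contrast between samples) receive larger weights.
    Illustrative only; not taken from the paper.
    """
    n, m = len(samples), len(samples[0])
    if n < 2:
        raise ValueError("need at least two samples")
    raw = []
    for j in range(m):
        column = [row[j] for row in samples]
        total = sum(column)
        if total == 0:
            entropy = 1.0  # an all-zero column carries no contrast
        else:
            probs = [v / total for v in column]
            entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        raw.append(1.0 - entropy)
    norm = sum(raw)
    if norm == 0:
        return [1.0 / m] * m  # every column is uniform: fall back to equal weights
    return [w / norm for w in raw]


def fuse(sample, weights):
    """Weighted sum of one sample's features into a single fused value."""
    return sum(w * v for w, v in zip(weights, sample))


if __name__ == "__main__":
    features = [[1.0, 1.0], [1.0, 3.0]]  # toy outputs of two feature extractors
    w = entropy_weights(features)
    print("weights:", w)
    print("fused:", [fuse(row, w) for row in features])
```

In this toy case the first column is identical across samples, so it receives essentially zero weight and the fused value follows the second column.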
Funding: Supported by the International Research Center of Big Data for Sustainable Development Goals under Grant No. CBAS2022GSP05, the Open Fund of State Key Laboratory of Remote Sensing Science under Grant No. 6142A01210404, and the Hubei Key Laboratory of Intelligent Geo-Information Processing under Grant No. KLIGIP-2022-B03.
Abstract: Named entity recognition (NER) is an important part of knowledge extraction and one of the main tasks in constructing knowledge graphs. In today's Chinese named entity recognition (CNER) task, the BERT-BiLSTM-CRF model is widely used and often yields notable results. However, recognizing each entity with high accuracy remains challenging. Many entities do not appear as single words but as parts of complex phrases, making it difficult to achieve accurate recognition using word embedding information alone, because the intricate lexical structure often degrades performance. To address this issue, we propose an improved Bidirectional Encoder Representations from Transformers (BERT) character word conditional random field (CRF) (BCWC) model. It incorporates a pre-trained word embedding model using the skip-gram with negative sampling (SGNS) method, alongside traditional BERT embeddings. By comparing datasets processed with different word segmentation tools, we obtain enhanced word embedding features for segmented data. These features are then processed using multi-scale convolution and iterated dilated convolutional neural networks (IDCNNs) with varying dilation rates to capture features at multiple scales and extract diverse contextual information. Additionally, a multi-attention mechanism is employed to fuse word and character embeddings. Finally, CRFs are applied to learn sequence constraints and optimize entity label annotations. A series of experiments conducted on three public datasets demonstrates that the proposed method outperforms recent advanced baselines. BCWC is capable of addressing the challenge of recognizing complex entities by combining character-level and word-level embedding information, thereby improving the accuracy of CNER. Such a model has potential for applications requiring more precise knowledge extraction, such as knowledge graph construction and information retrieval, particularly in domain-specific natural language processing tasks that require high entity recognition precision.
Funding: Opening Foundation of Key Laboratory of Explosive Energy Utilization and Control, Anhui Province (BP20240104); Graduate Innovation Program of China University of Mining and Technology (2024WLJCRCZL049); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX24_2701).
Abstract: Lightweight, high-strength Ti/Al alloys are difficult to combine because of the considerable disparity in their properties. To address this, Al 6063 was proposed as an intermediate layer to fabricate a TC4/Al 6063/Al 7075 three-layer composite plate by explosive welding. The microscopic properties of each bonding interface were elucidated through field emission scanning electron microscopy and electron backscattered diffraction (EBSD). A methodology combining the finite element method-smoothed particle hydrodynamics (FEM-SPH) and molecular dynamics (MD) was proposed for multi-scale analysis of the forming and evolution characteristics of explosive welding interfaces. The results demonstrate that the bonding interface morphologies of TC4/Al 6063 and Al 6063/Al 7075 exhibit a flat and wavy configuration, without discernible defects or cracks. Grain refinement is observed in the vicinity of the two bonding interfaces. Furthermore, the degree of plastic deformation of TC4 and Al 7075 is more pronounced than that of the Al 6063 intermediate layer. The interface morphology characteristics obtained by FEM-SPH simulation exhibit a high degree of similarity to the experimental results. MD simulations reveal that the diffusion of interfacial elements predominantly occurs during the unloading phase, and the simulated thickness of interfacial diffusion aligns well with experimental outcomes. Introducing an intermediate layer in the explosive welding process can effectively produce high-quality titanium/aluminum alloy composite plates. Furthermore, this approach offers a multi-scale simulation strategy for the study of explosive welding bonding interfaces.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42330510).
Abstract: With the development of smart cities and smart technologies, parks, as functional units of the city, are facing smart transformation. The development of smart parks can help address challenges of technology integration within urban spaces and serve as testbeds for exploring smart city planning and governance models. Information models facilitate the effective integration of technology into space. Building Information Modeling (BIM) and City Information Modeling (CIM) have been widely used in urban construction. However, existing information models have limitations when applied to parks, so it is necessary to develop an information model suited to the park. This paper first traces the evolution of park smart transformation, reviews the global landscape of smart park development, and identifies key trends and persistent challenges. Addressing the particularities of parks, the concept of Park Information Modeling (PIM) is proposed. PIM leverages smart technologies such as artificial intelligence, digital twins, and collaborative sensing to help form a 'space-technology-system' smart structure, enabling systematic management of diverse park spaces, addressing the deficiency in park-level information models, and aiming to achieve scale articulation between BIM and CIM. Finally, through a detailed top-level design application case study of the Nanjing Smart Education Park in China, this paper illustrates how the PIM concept is translated into practice, showcasing its potential to provide smart management tools for park managers and enhance services for park stakeholders, although further empirical validation is required.
Funding: Funded by the Joint Fund for Regional Innovation and Development of the National Natural Science Foundation of China (U21A20143), the National Science Fund for Excellent Young Scholars (52322607), and the Excellent Youth Foundation of Heilongjiang Scientific Committee (YQ2022E028).
Abstract: Improving the volumetric energy density of supercapacitors is essential for practical applications, and it relies heavily on the dense storage of ions in carbon-based electrodes. The functional units of a carbon-based electrode exhibit multi-scale structural characteristics, including macroscopic electrode morphologies, mesoscopic microcrystals and pores, and microscopic defects and dopants in the carbon basal plane. Therefore, the ordered combination of the multi-scale structures of a carbon electrode is crucial for achieving dense energy storage and high volumetric performance by leveraging the functions of structures at each scale. Considering that previous reviews have focused more on structures of carbon electrodes at a specific scale, this review takes a multi-scale perspective and systematically discusses recent progress regarding the structure-performance relationship, underlying mechanisms, and directional design of carbon-based multi-scale structures, including carbon morphology, pore structure, carbon basal plane micro-environment, and electrode technology, in relation to the dense energy storage and volumetric properties of supercapacitors. We analyze in detail the effects of the morphology, pores, and micro-environment of carbon electrode materials on dense ion storage, summarize the specific effects of structures at different scales on volumetric properties together with recent research progress, and discuss the mutual influence and trade-offs between structures at various scales. In addition, the challenges and outlook for improving the dense storage and volumetric performance of carbon-based supercapacitors are analyzed, which can provide a feasible technical reference and guidance for the design and manufacture of dense carbon-based electrode materials.
Funding: Supported by the Natural Science Foundation of the Anhui Higher Education Institutions of China (Grant Nos. 2023AH040149 and 2024AH051915), the Anhui Provincial Natural Science Foundation (Grant No. 2208085MF168), the Science and Technology Innovation Tackle Plan Project of Maanshan (Grant No. 2024RGZN001), and the Scientific Research Fund Project of Anhui Medical University (Grant No. 2023xkj122).
Abstract: Technologies based on convolutional neural networks (CNNs) have been widely used in medical image segmentation because of their strong representation and generalization abilities. However, because CNNs cannot effectively capture global information from images, they can easily lose contours and textures in segmentation results. Notably, the transformer model can effectively capture long-range dependencies in the image, and combining the CNN and the transformer can effectively extract both local details and global contextual features of the image. Motivated by this, we propose a multi-branch and multi-scale attention network (M2ANet) for medical image segmentation, whose architecture consists of three components. In the first component, we construct an adaptive multi-branch patch module for parallel extraction of image features to reduce the information loss caused by downsampling. In the second component, we apply a residual block to the well-known convolutional block attention module to enhance the network's ability to recognize important image features and alleviate gradient vanishing. In the third component, we design a multi-scale feature fusion module, in which we adopt adaptive average pooling and position encoding to enhance contextual features, and then introduce multi-head attention to further enrich the feature representation. Finally, we validate the effectiveness and feasibility of the proposed M2ANet method through comparative experiments on four benchmark medical image segmentation datasets, particularly with regard to preserving contours and textures.
Funding: Supported by the Natural Science Foundation of China (Grant No. 42302170), the National Postdoctoral Innovative Talent Support Program (Grant No. BX20220062), the CNPC Innovation Fund (Grant No. 2022DQ02-0104), the National Science Foundation of Heilongjiang Province of China (Grant No. YQ2023D001), the Postdoctoral Science Foundation of Heilongjiang Province of China (Grant No. LBH-Z22091), and the Natural Science Foundation of Shandong Province (Grant No. ZR2022YQ30).
Abstract: Prediction of production decline and evaluation of the adsorbed/free gas ratio are critical for determining the lifespan and production status of shale gas wells. Traditional production prediction methods have shortcomings because of the low permeability and tightness of shale, the complex gas flow behavior arising from multi-scale gas transport regions and the superposition of multiple gas transport mechanisms, and the complex and variable production regimes of shale gas wells. Recent research has demonstrated the existence of a multi-stage isotope fractionation phenomenon during shale gas production, with the fractionation characteristics of each stage associated with the pore structure, gas in place (GIP), adsorption/desorption, and gas production process. This study presents a new approach for estimating shale gas well production and evaluating the adsorbed/free gas ratio throughout production using isotope fractionation techniques. A reservoir-scale carbon isotope fractionation (CIF) model applicable to the production process of shale gas wells was developed for the first time in this research. In contrast to the traditional model, this model improves production prediction accuracy by simultaneously fitting the gas production rate and δ¹³C₁ data, and it provides a new method for evaluating the adsorbed/free gas ratio during shale gas production. The results indicate that the diffusion and adsorption/desorption properties of the rock, the bottom-hole flowing pressure (BHP) of the gas well, and the multi-scale gas transport regions of the reservoir all affect isotope fractionation, with the rock's diffusion and adsorption/desorption parameters having the greatest effect, ranked in the order D*/D, PL, VL, α, and others. We tested the universality of the four-stage isotope fractionation feature and revealed a unique isotope fractionation mechanism caused by the superimposed coupling of multi-scale gas transport regions during shale gas well production. Finally, we applied the established CIF model to a shale gas well in the Sichuan Basin, China, and calculated the estimated ultimate recovery (EUR) of the well to be 3.33×10⁸ m³; the adsorbed gas ratio during shale gas production was 1.65%, 10.03%, and 23.44% in the first, fifth, and tenth years, respectively. The findings are significant for understanding the isotope fractionation mechanism during natural gas transport in complex systems and for formulating and optimizing unconventional natural gas development strategies.
Abstract: Multi-scale systems remain a classical scientific problem in fluid dynamics, biology, and other fields. In the present study, a scheme of multi-scale physics-informed neural networks (msPINNs) is proposed to solve the boundary layer flow at high Reynolds numbers without any data. The flow is divided into several regions with different scales based on Prandtl's boundary layer theory. Different regions are solved with governing equations at different scales. The method of matched asymptotic expansions is used to make the flow field continuous. Flow over a semi-infinite flat plate at a high Reynolds number is treated as a multi-scale problem because the boundary layer scale is much smaller than the outer flow scale. The results are compared with reference numerical solutions, showing that msPINNs can solve the multi-scale boundary layer problem in high Reynolds number flows. This scheme can be extended to more multi-scale problems in the future.
Funding: Supported by the National Natural Science Foundation of China (62272049, 62236006, 62172045) and the Key Projects of Beijing Union University (ZKZD202301).
Abstract: In recent years, gait-based emotion recognition has been widely applied in the field of computer vision. However, existing gait emotion recognition methods typically rely on complete human skeleton data, and their accuracy declines significantly when the data is occluded. To enhance the accuracy of gait emotion recognition under occlusion, this paper proposes a Multi-scale Suppression Graph Convolutional Network (MS-GCN). The MS-GCN consists of three main components: a Joint Interpolation Module (JI Module), a Multi-scale Temporal Convolution Network (MS-TCN), and a Suppression Graph Convolutional Network (SGCN). The JI Module completes the spatially occluded skeletal joints using the K-Nearest Neighbors (KNN) interpolation method. The MS-TCN employs convolutional kernels of various sizes to comprehensively capture the emotional information embedded in the gait, compensating for the temporal occlusion of gait information. The SGCN extracts more non-prominent human gait features by suppressing the extraction of key body part features, thereby reducing the negative impact of occlusion on emotion recognition results. The proposed method is evaluated on two comprehensive datasets: Emotion-Gait, containing 4227 real gaits from sources such as BML, ICT-Pollick, and ELMD, and 1000 synthetic gaits generated using STEP-Gen technology; and ELMB, consisting of 3924 gaits, of which 1835 are labeled with emotions such as "Happy," "Sad," "Angry," and "Neutral." On the standard datasets Emotion-Gait and ELMB, the proposed method achieved accuracies of 0.900 and 0.896, respectively, attaining performance comparable to other state-of-the-art methods. Furthermore, on occlusion datasets, the proposed method significantly mitigates the performance degradation caused by occlusion, achieving significantly higher accuracy than other methods.
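The abstract does not detail how the JI Module applies KNN interpolation. As a hedged illustration of the general idea only, the sketch below performs generic KNN imputation: it finds the K complete reference poses closest to the observed pose on its visible joints and fills each occluded joint with the average of those neighbors. The function name `knn_impute_pose`, the use of `None` to mark occluded joints, and the reliance on a library of complete reference poses are all assumptions made for this sketch.

```python
import math


def knn_impute_pose(pose, reference_poses, k=3):
    """Fill occluded joints (None) from the k nearest complete poses.

    `pose` is a list of coordinate tuples, with None for occluded joints.
    `reference_poses` is a non-empty list of complete poses with the
    same joint order. Distances use only the joints visible in `pose`.
    Hypothetical helper; not the paper's JI Module.
    """
    visible = [i for i, joint in enumerate(pose) if joint is not None]
    if not visible:
        raise ValueError("at least one joint must be visible")

    def distance(ref):
        return math.sqrt(sum(
            (a - b) ** 2
            for i in visible
            for a, b in zip(pose[i], ref[i])
        ))

    nearest = sorted(reference_poses, key=distance)[:k]
    filled = list(pose)
    for i, joint in enumerate(pose):
        if joint is None:
            dims = len(nearest[0][i])
            # Average the neighbors' coordinates for the missing joint.
            filled[i] = tuple(
                sum(ref[i][d] for ref in nearest) / len(nearest)
                for d in range(dims)
            )
    return filled


if __name__ == "__main__":
    observed = [(0.0, 0.0), None]  # second joint is occluded
    library = [
        [(0.0, 0.0), (1.0, 1.0)],
        [(0.0, 0.1), (1.0, 3.0)],
        [(5.0, 5.0), (9.0, 9.0)],
    ]
    print(knn_impute_pose(observed, library, k=2))
```

With `k=2`, the distant third reference pose is ignored and the occluded joint is set to the mean of the two close matches.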
Funding: Supported by the National Natural Science Foundation of China (No. 62241109) and the Tianjin Science and Technology Commissioner Project (No. 20YDTPJC01110).
Abstract: An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature map and improving the target detection ability of the model. Second, a pyramid pooling module, multi atrous spatial pyramid pooling (MASPP), is designed by combining atrous convolution with a feature pyramid structure to extract multi-scale features, improving the model's ability to handle multi-scale objects. The experimental results show that the detection accuracy of the improved YOLOv8 model on the DIOR dataset is 92% and its mean average precision (mAP) is 87.9%, which are 3.5% and 1.7% higher, respectively, than those of the original model. This demonstrates that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has been improved.
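The atrous (dilated) convolution underlying MASPP can be illustrated in one dimension. The sketch below is a toy in plain Python, not the paper's module: it applies the same kernel at several dilation rates, showing how the receptive field grows as (k - 1) * d + 1 for kernel size k and dilation d without adding parameters. The names `dilated_conv1d` and `multi_rate_features` and the chosen rates are assumptions for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated cross-correlation.

    Kernel taps are spaced `dilation` samples apart, so a kernel of
    size k covers (k - 1) * dilation + 1 input samples.
    """
    span = (len(kernel) - 1) * dilation + 1
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


def multi_rate_features(signal, kernel, dilations=(1, 2, 3)):
    """Apply one kernel at several dilation rates, keyed by rate.

    This mimics, in miniature, the parallel branches of an atrous
    spatial pyramid; the rates here are arbitrary example values.
    """
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    for rate, out in multi_rate_features(x, [1, 1, 1]).items():
        print(f"dilation {rate}: {out}")
```

Each branch sees the input at a different scale. In a real pyramid pooling module the branch outputs would be padded to a common size and concatenated along the channel dimension.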