Remote sensing (RS) provides laser scanning measurements, aerial photos, and high-resolution satellite images, which are used to extract a range of traffic-related and road-related features. RS has weaknesses, however, such as traffic fluctuations on small time scales that can distort the accuracy of predicted road and traffic features. This article introduces an Optimal Deep Learning for Traffic Critical Prediction Model on High-Resolution Remote Sensing Images (ODLTCP-HRRSI) to resolve these issues. The ODLTCP-HRRSI technique mainly aims to forecast critical traffic in smart cities. To this end, the model performs two major processes. In the first stage, it employs a convolutional neural network with an auto-encoder (CNN-AE) model for productive and accurate traffic flow prediction. Next, the hyperparameters of the CNN-AE model are tuned via the Bayesian adaptive direct search optimization (BADSO) algorithm. The experimental outcomes demonstrate the enhanced performance of the ODLTCP-HRRSI technique over recent approaches, with a maximum accuracy of 98.23%.
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β = 0.5), Visual Geometry Group (VGG) perceptual loss (α = 1), and adversarial loss (γ = 0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
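The loss combination described in this abstract can be illustrated with a minimal sketch. It assumes the common Charbonnier form sqrt(d² + ε²) averaged over pixels and uses the fixed weights quoted above (β = 0.5, α = 1, γ = 0.1); the value of `eps`, the function names, and the flat-list pixel representation are assumptions for illustration, not the authors' implementation. The perceptual and adversarial terms are passed in as precomputed numbers.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values.

    eps is a hypothetical smoothing constant; the abstract does not state it.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def total_generator_loss(charbonnier, perceptual, adversarial,
                         beta=0.5, alpha=1.0, gamma=0.1):
    """Weighted sum of the three loss terms with the weights quoted in the abstract."""
    return beta * charbonnier + alpha * perceptual + gamma * adversarial


if __name__ == "__main__":
    pixel_term = charbonnier_loss([0.2, 0.8, 0.5], [0.25, 0.7, 0.5])
    print(total_generator_loss(pixel_term, perceptual=0.3, adversarial=0.9))
```

Unlike a plain L2 loss, the Charbonnier penalty grows roughly linearly for large residuals, which is one common reason it is preferred when outliers and noise are present.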
High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multihead self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is proposed at the end of the encoder to refine the encoder outputs further, thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
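For readers unfamiliar with the mIoU metric quoted in several of these abstracts, the following is a minimal sketch of how it is usually computed from a confusion matrix. The function name and the row = ground truth, column = prediction convention are assumptions for illustration; individual papers may exclude classes or average differently.

```python
def mean_iou(confusion):
    """Mean Intersection over Union from a square confusion matrix.

    confusion[i][j] counts pixels of true class i predicted as class j.
    Classes that never occur (zero union) are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    if not ious:
        raise ValueError("confusion matrix contains no labelled pixels")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    print(mean_iou([[3, 1], [1, 3]]))
```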
The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. A hierarchical feature fusion network (HFFNet) was proposed. First, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process helped restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
This research systematically investigates urban three-dimensional greening layout optimization and smart eco-city construction using deep learning and remote sensing technology. An improved U-Net++ architecture combined with multi-source remote sensing data achieved high-precision recognition of urban three-dimensional greening with 92.8% overall accuracy. Analysis of spatiotemporal evolution patterns in Shanghai, Hangzhou, and Nanjing revealed that three-dimensional greening shows a development trend from demonstration to popularization, with a 16.5% annual growth rate. The study quantitatively assessed the ecological benefits of various three-dimensional greening types. Results indicate that modular vertical greening and intensive roof gardens yield the highest ecological benefits, while climbing-type vertical greening and extensive roof gardens offer optimal benefit-cost ratios. Integration of multiple forms generates a 15-22% synergistic enhancement. Compared with traditional planning, the multi-objective optimization-based layout achieved a 27.5% increase in carbon sequestration, a 32.6% improvement in temperature regulation, a 35.8% enhancement in stormwater management, and a 42.3% rise in biodiversity index. Three pilot projects validated that actual ecological benefits reached 90.3-102.3% of predicted values. Multi-scenario simulations indicate that optimized layouts can reduce urban heat island intensity by 15.2-18.7%, increase carbon neutrality contribution to 8.6-10.2%, and decrease stormwater runoff peaks by 25.3-32.6%. The findings provide technical methods for urban three-dimensional greening optimization and smart eco-city construction, promoting sustainable urban development.
In response to challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the issue of high resource consumption hindering model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
The large-scale acquisition and widespread application of remote sensing image data have led to increasingly severe challenges in information security and privacy protection during transmission and storage. Urban remote sensing images, characterized by complex content and well-defined structures, are particularly vulnerable to malicious attacks and information leakage. To address this issue, the author proposes an encryption method based on the enhanced single-neuron dynamical system (ESNDS). ESNDS generates high-quality pseudo-random sequences with complex dynamics and intense sensitivity to initial conditions, which drive a multi-stage cipher structure comprising permutation, ring-wise diffusion, and mask perturbation. Using representative GF-2 Panchromatic and Multispectral Scanner (PMS) urban scenes, the author conducts systematic evaluations in terms of inter-pixel correlation, information entropy, histogram uniformity, and number of pixels change rate (NPCR)/unified average changing intensity (UACI). The results demonstrate that the proposed scheme effectively resists statistical analysis, differential attacks, and known-plaintext attacks while maintaining competitive computational efficiency for high-resolution urban images. In addition, the cipher is lightweight and hardware-friendly, integrates readily with on-board and ground processing, and thus offers tangible engineering utility for real-time, large-volume remote sensing data protection.
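The NPCR and UACI metrics named in this abstract follow widely used definitions, sketched below for two cipher images given as flat lists of 8-bit pixel values. The function name and the flat-list representation are assumptions for illustration; this is not the author's evaluation code.

```python
def npcr_uaci(cipher_a, cipher_b, max_value=255):
    """Return (NPCR, UACI) in percent for two equally sized cipher images.

    NPCR: share of pixel positions whose values differ.
    UACI: mean absolute difference, normalised by the maximum pixel value.
    """
    if len(cipher_a) != len(cipher_b) or not cipher_a:
        raise ValueError("images must be non-empty and equally sized")
    n = len(cipher_a)
    changed = sum(1 for a, b in zip(cipher_a, cipher_b) if a != b)
    intensity = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    npcr = 100.0 * changed / n
    uaci = 100.0 * intensity / (max_value * n)
    return npcr, uaci


if __name__ == "__main__":
    # Two toy 2x2 "cipher images" flattened row by row.
    print(npcr_uaci([0, 0, 0, 0], [0, 255, 0, 255]))
```

In a differential-attack test, the two inputs would typically be the ciphertexts of two plaintext images that differ in a single pixel.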
Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detailed and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges feature vector angles and reduces cosine similarity, weakening the attention mechanism's ability to identify the same class of ground objects. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule to replace the detailed feature representation of these objects. Then, the Laplacian operator of the information granule is applied to extract the edge features of the object as represented by the information granule. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% for mOA, 82.64% for mF1, and 71.50% for mIoU on the GID dataset. Experimental results show that the model effectively handles high-resolution remote sensing images. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
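The link this abstract draws between larger feature-vector angles and lower cosine similarity can be checked with a small sketch. The vectors are made-up two-dimensional examples and the function names are assumptions for illustration; this is unrelated to the code in the linked repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def angle_degrees(u, v):
    """Angle between two vectors in degrees, clamped against rounding error."""
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.degrees(math.acos(cos))


if __name__ == "__main__":
    reference = [1.0, 0.0]
    for other in ([1.0, 0.1], [1.0, 1.0], [0.0, 1.0]):
        # As the angle widens, the similarity drops toward zero.
        print(round(angle_degrees(reference, other), 1),
              round(cosine_similarity(reference, other), 3))
```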
The advancement of imaging resolution has made the impact of multi-frequency composite jitter in satellite platforms on non-collinear time delay and integration (TDI) charge-coupled device (CCD) imaging systems increasingly critical. Moreover, the accuracy of jitter detection is constrained by the limited inter-chip overlap region inherent to non-collinear TDI CCDs. To address these challenges, a multi-frequency jitter detection method is proposed, achieving sub-pixel level error extraction. Furthermore, a multi-frequency jitter fitting approach utilizing a scale-adjustable sliding window is introduced. For composite multi-frequency jitter, spectral analysis decomposes the relative jitter error curve, while the scale-adjustable sliding window enables frequency-division fitting and modeling. Validation experiments using Gaofen-8 (GF-8) remote sensing satellite imagery detected jitter at 0.65, 20, and 100 Hz in the cross-track direction and at 0.5, 100, and 120 Hz in the along-track direction, demonstrating the method's precision in detecting platform jitter at sub-pixel accuracy (<0.2 pixels) and its efficacy in fitting and modeling for non-collinear TDI CCD imaging systems subject to multi-frequency jitter.
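The spectral-analysis step mentioned in this abstract can be illustrated on synthetic data. The sketch below applies a plain discrete Fourier transform to a made-up jitter curve and reports the strongest frequency components; the sampling rate, amplitudes, and function name are assumptions for illustration, and the paper's sliding-window fitting is not reproduced.

```python
import cmath
import math


def dominant_frequencies(samples, sample_rate_hz, top_k=3):
    """Return the top_k strongest frequencies (Hz) of a real signal via a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant offset
    spectrum = []
    for k in range(1, n // 2):  # positive-frequency bins below Nyquist
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        spectrum.append((abs(coeff), k * sample_rate_hz / n))
    spectrum.sort(reverse=True)
    return sorted(freq for _, freq in spectrum[:top_k])


if __name__ == "__main__":
    rate = 400  # hypothetical sampling rate in Hz
    # Synthetic jitter curve (pixels): 20 Hz and 100 Hz components over one second.
    signal = [0.10 * math.sin(2 * math.pi * 20 * i / rate)
              + 0.05 * math.sin(2 * math.pi * 100 * i / rate)
              for i in range(rate)]
    print(dominant_frequencies(signal, rate, top_k=2))
```

A naive DFT costs O(n²) operations, which is acceptable for a short illustration; real jitter curves would normally go through an FFT routine.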
The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods relying on encoding and decoding frameworks. In response, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
This paper introduces a lightweight remote sensing image dehazing network called multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions, rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10^9 multiply-accumulate operations (MACs), which is less than one-tenth of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
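The atmospheric scattering model referenced in this abstract is commonly written as I = J·t + A·(1 − t), where I is the hazy observation, J the clear scene radiance, t the transmission, and A the global airlight. The per-pixel sketch below shows the forward model and its algebraic inversion; the function names and the lower bound `t_min` are assumptions for illustration and do not describe how the PNFF module uses the model.

```python
def add_haze(clear, transmission, airlight):
    """Forward model: hazy intensity I = J * t + A * (1 - t) for one pixel."""
    return clear * transmission + airlight * (1.0 - transmission)


def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Invert the model: J = (I - A) / max(t, t_min) + A.

    t_min is a hypothetical floor that avoids dividing by a near-zero transmission.
    """
    t = max(transmission, t_min)
    return (hazy - airlight) / t + airlight


if __name__ == "__main__":
    hazy_pixel = add_haze(clear=0.2, transmission=0.5, airlight=1.0)
    print(hazy_pixel, remove_haze(hazy_pixel, transmission=0.5, airlight=1.0))
```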
This paper studies secured access for image remote sensing networks. Each sensor in the network faces an information security risk when wirelessly uploading image information to the central collection point. To enhance the sensing quality of the remote upload, a passive reflection surface technique is employed. If an eavesdropper near the sensor keeps accessing the same network, it may receive the same image from the sensor. Our goal is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the sensor's uploading power, thereby enhancing the security of the remote sensing images. To achieve this goal, the secured energy efficiency performance is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, secured access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable and energy efficiency maximization as the objective, subject to a required maximum data rate for the eavesdropper. Finally, an analytical expression is derived for the optimum uploading power. Numerical simulations verify the design approach.
This study aimed to establish a forest resources management information system for forest farms based on B/S-structured WebGIS, taking the trial forest farm of the Hunan Academy of Forestry as the study area and using forest resources field survey data, ETM+ remote sensing data, and basic geographical information data as research material. The work covered the extraction of forest resource data in the forest farm, requirement analysis of the system functions, and the establishment of the required software and hardware environment, with the aim of realizing management, query, editing, analysis, statistics, and other functions for forest resources information in order to manage the forest resources.
Landslides, collapses, and cracks are the main types of geological hazards and constantly threaten human life and property. In emergency surveying and mapping, field investigation and manual recognition are time-consuming and laborious, while identifying ground hazards from satellite images suffers from time lag, low resolution, and difficulty in selecting maps on demand. In this paper, photogrammetry at a resolution of 10 cm per pixel is carried out with a DJ 4 UAV over a geological hazard-prone area of Taohuagou, Shanxi Province, China. A digital orthophoto model (DOM), a digital surface model (DSM), and a three-dimensional point cloud model (3DPCM) are generated for this region. Visual interpretation methods are proposed that identify cracks from the DOM (main) with the 3DPCM (auxiliary), and landslides and collapses from the 3DPCM (main) with the DOM and DSM (auxiliary). Based on the low-altitude UAV remote sensing images, the shape characteristics, geological characteristics, and distribution of the identified hazards are analyzed. The results show that, with low-altitude UAV remote sensing images, combining main and auxiliary data can quickly and accurately identify landslides, collapses, and cracks: the accuracy of crack identification is 93%, and the accuracy of landslide and collapse identification is 100%. The hazards mainly occur in silty clay and mudstone geology and are greatly affected by slope foot excavation. This study can play an important role in recognizing sudden hazards from low-altitude UAV remote sensing images.
Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we expand our MSFCN to three dimensions using a three-dimensional (3D) CNN, capable of harnessing each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves 60.366% on the WHDLD dataset and 75.127% on the GID dataset in terms of the mIoU index, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
This paper presents a bathymetry inversion method using single-frame fine-resolution optical remote sensing imagery based on ocean-wave refraction and shallow-water wave theory. First, the relationship among water depth, wavelength, and wave radian frequency in shallow water was deduced based on shallow-water wave theory. Considering the complex wave distribution in the optical remote sensing imagery, Fast Fourier Transform (FFT) and spatial profile measurements were applied to measure the wavelengths. Then, the wave radian frequency was calculated by analyzing the long-distance fluctuation in the wavelength, which solved a key problem in obtaining the wave radian frequency from a single-frame image. A case study was conducted for Sanya Bay of Hainan Island, China. Single-frame fine-resolution optical remote sensing imagery from the QuickBird satellite was used to invert the bathymetry without external input parameters. The resulting digital elevation model (DEM) was evaluated against a sea chart with a scale of 1:25 000. The root-mean-square error of the inverted bathymetry was 1.07 m, and the relative error was 16.2%. The proposed method therefore has the advantage of requiring no true depths or environmental parameters, and it is feasible for mapping the bathymetry of shallow coastal water.
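The relationship among depth, wavelength, and radian frequency that this abstract refers to can be illustrated with the standard linear dispersion relation ω² = g·k·tanh(k·h), with wavenumber k = 2π/L. Solving for depth gives h = atanh(ω² / (g·k)) / k. Whether the paper uses exactly this form or a shallow-water approximation is not stated in the abstract, so the sketch below is an assumption for illustration, as are the function name and the value of g.

```python
import math


def depth_from_wave(wavelength_m, omega_rad_s, g=9.81):
    """Water depth h (m) from wavelength L and radian frequency omega.

    Uses omega^2 = g * k * tanh(k * h) with k = 2 * pi / L, so
    h = atanh(omega^2 / (g * k)) / k.
    """
    k = 2.0 * math.pi / wavelength_m
    ratio = omega_rad_s ** 2 / (g * k)
    if not 0.0 < ratio < 1.0:
        # ratio >= 1 means the wave no longer "feels" the bottom (deep water).
        raise ValueError("no finite depth satisfies the dispersion relation")
    return math.atanh(ratio) / k


if __name__ == "__main__":
    # Round trip on made-up values: a 50 m wave over 5 m of water.
    k = 2.0 * math.pi / 50.0
    omega = math.sqrt(9.81 * k * math.tanh(k * 5.0))
    print(depth_from_wave(50.0, omega))
```

The guard on `ratio` reflects the physical limit of the approach: once the wavelength is short relative to the depth, the surface wave pattern carries no usable depth information.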
Wetland research has become a hot spot linking multiple disciplines. Wetland classification and mapping is the basis for wetland research. It is difficult to generate wetland data sets using traditional methods because of the low accessibility of wetlands; hence, remote sensing data have become one of the primary data sources in wetland research. This paper presents a case study conducted at the core area of Honghe National Nature Reserve in the Sanjiang Plain, Northeast China. In this study, three images, acquired by airship, Thematic Mapper (TM), and SPOT 5, were selected to produce wetland maps at three different wetland landscape levels. After assessing the classification accuracies of the three maps, we compared the different wetland mapping results: 11 plant communities for the airship image, 6 plant ecotypes for the TM image, and 9 landscape classifications for the SPOT 5 image. We discussed the different characteristics of the hierarchical ecosystem classifications based on the spatial scales of the different images. The results indicate that the spatial scales of remote sensing data have an important link to the hierarchies of wetland plant ecosystems displayed on the wetland landscape maps. The richness of wetland landscape information derived from an image closely relates to its spatial resolution. This study can enrich the ecological classification methods and mapping techniques dealing with the spatial scales of different remote sensing images. With a better understanding of classification accuracies in mapping wetlands using different scales of remote sensing data, we can devise an appropriate approach for dealing with the scale issue of remote sensing images.
Remote sensing image registration is still a challenging task owing to the significant influence of nonlinear differences between remote sensing images. To solve this problem, this paper proposes a novel approach to feature-based remote sensing image registration. There are two key contributions: 1) we put forward an improved strategy of composite nonlinear diffusion filtering according to the scale factors in multi-scale space, and 2) we design a multi-scale pyramid space with gradually decreasing resolution. In addition, a binary code string serves as the feature descriptor to improve matching efficiency. Extensive experiments on feature extraction and feature registration are performed on different categories of remote sensing image datasets. The experimental results demonstrate the superiority of the proposed scheme over other classical algorithms in terms of correct matching ratio, accuracy, and computational efficiency.
Building detection plays an important role in urban planning, smart cities, and military applications. Aiming at the problem of the high overlapping ratio of detection frames for dense building detection in high-resolution remote sensing images, we present an effective YOLOv3 framework, corner regression-based YOLOv3 (Correg-YOLOv3), to localize dense buildings accurately. This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss term for building vertex offsets relative to the center point of the bounding box. By extending the output dimensions, the trained model is able to output the rectangular bounding boxes and the building vertices simultaneously. Finally, we evaluate the performance of Correg-YOLOv3 on our self-produced data set and provide a qualitative and quantitative comparative analysis. The experimental results show high performance in precision (96.45%), recall rate (95.75%), F1 score (96.10%), and average precision (98.05%), which are 2.73%, 5.4%, 4.1%, and 4.73% higher than those of YOLOv3. Therefore, our proposed algorithm effectively tackles the problem of dense building detection in high-resolution images.
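The F1 score reported in this abstract is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the quoted precision and recall as a consistency check; the function name is an assumption for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as quoted in the abstract, in percent.
    print(round(f1_score(96.45, 95.75), 2))
```

Rounded to two decimals, the result agrees with the 96.10% F1 score stated in the abstract.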
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03), the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04), and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β = 0.5), Visual Geometry Group (VGG) perceptual loss (α = 1), and adversarial loss (γ = 0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
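To make the loss terms in this abstract concrete, the following is a minimal sketch of a Charbonnier penalty and a weighted sum of the three losses. The function names, the smoothing constant `eps`, and the use of fixed (rather than dynamically adjusted) weights are assumptions for illustration; the abstract only names the loss and the values β = 0.5, α = 1, γ = 0.1.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values.

    eps is an assumed smoothing constant; the abstract does not state one.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equally long")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def total_loss(charbonnier, perceptual, adversarial,
               beta=0.5, alpha=1.0, gamma=0.1):
    """Weighted sum of the three loss terms, using the weights quoted above.

    Fixed weights are a simplification of the 'dynamic weighting' described.
    """
    return beta * charbonnier + alpha * perceptual + gamma * adversarial


# Hypothetical pixel values, only to show the call pattern.
pixel_term = charbonnier_loss([0.2, 0.8, 0.5], [0.25, 0.7, 0.5])
combined = total_loss(pixel_term, perceptual=0.4, adversarial=0.9)
```

For large residuals the penalty behaves like an L1 loss, and near zero it is smooth like an L2 loss, which is the usual motivation for choosing it over either.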
Funding: Provided by the Science Research Project of Hebei Education Department under grant No. BJK2024115.
Abstract: High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multihead self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is proposed at the end of the encoder to refine the encoder outputs further, thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to transmit semantic information progressively from deep to shallow layers. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
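Several abstracts in this list report mean Intersection over Union (mIoU). As a reminder of how that metric is usually computed, here is a small sketch working from a confusion matrix. The function name and the toy matrix are illustrative assumptions, and the evaluation protocol of any particular paper (for example, which classes are ignored) may differ.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: truth, columns: prediction).

    Classes that never occur in truth or prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp     # pixels wrongly labeled c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no samples")
    return sum(ious) / len(ious)


# Toy two-class example with made-up counts.
toy_miou = mean_iou([[3, 1],
                     [1, 5]])
```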
Funding: Supported by the National Natural Science Foundation of China (No. 52374155) and the Anhui Provincial Natural Science Foundation (No. 2308085MF218).
Abstract: The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. A hierarchical feature fusion network (HFFNet) is proposed to address them. First, a combination of transformer and CNN architectures is employed to extract features from images of varying resolutions, and the extracted features are processed independently. Subsequently, the transformer and CNN features are fused under the guidance of features from different sources. This fusion helps restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module is designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results show that HFFNet has superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and that its cross-linking index is better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
Abstract: This research systematically investigates urban three-dimensional greening layout optimization and smart eco-city construction using deep learning and remote sensing technology. An improved U-Net++ architecture combined with multi-source remote sensing data achieved high-precision recognition of urban three-dimensional greening with 92.8% overall accuracy. Analysis of spatiotemporal evolution patterns in Shanghai, Hangzhou, and Nanjing revealed that three-dimensional greening shows a development trend from demonstration to popularization, with a 16.5% annual growth rate. The study quantitatively assessed the ecological benefits of various three-dimensional greening types. Results indicate that modular vertical greening and intensive roof gardens yield the highest ecological benefits, while climbing-type vertical greening and extensive roof gardens offer optimal benefit-cost ratios. Integration of multiple forms generates 15-22% synergistic enhancement. Compared with traditional planning, the multi-objective optimization-based layout achieved a 27.5% increase in carbon sequestration, a 32.6% improvement in temperature regulation, a 35.8% enhancement in stormwater management, and a 42.3% rise in the biodiversity index. Three pilot projects validated that actual ecological benefits reached 90.3-102.3% of predicted values. Multi-scenario simulations indicate that optimized layouts can reduce urban heat island intensity by 15.2-18.7%, increase the carbon neutrality contribution to 8.6-10.2%, and decrease stormwater runoff peaks by 25.3-32.6%. The findings provide technical methods for urban three-dimensional greening optimization and smart eco-city construction, promoting sustainable urban development.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 52472334, U2368204).
Abstract: In response to challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the issue of high resource consumption hindering model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Abstract: Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Abstract: The large-scale acquisition and widespread application of remote sensing image data have led to increasingly severe challenges in information security and privacy protection during transmission and storage. Urban remote sensing images, characterized by complex content and well-defined structures, are particularly vulnerable to malicious attacks and information leakage. To address this issue, the author proposes an encryption method based on the enhanced single-neuron dynamical system (ESNDS). ESNDS generates high-quality pseudo-random sequences with complex dynamics and intense sensitivity to initial conditions, which drive a multi-stage cipher structure comprising permutation, ring-wise diffusion, and mask perturbation. Using representative GF-2 Panchromatic and Multispectral Scanner (PMS) urban scenes, the author conducts systematic evaluations in terms of inter-pixel correlation, information entropy, histogram uniformity, and number of pixels change rate (NPCR)/unified average changing intensity (UACI). The results demonstrate that the proposed scheme effectively resists statistical analysis, differential attacks, and known-plaintext attacks while maintaining competitive computational efficiency for high-resolution urban images. In addition, the cipher is lightweight and hardware-friendly, integrates readily with on-board and ground processing, and thus offers tangible engineering utility for real-time, large-volume remote sensing data protection.
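The NPCR and UACI metrics mentioned in this abstract have standard definitions for comparing two cipher images. The sketch below computes both for 8-bit images flattened into lists. The function name and sample values are assumptions for illustration and are not taken from the paper.

```python
def npcr_uaci(cipher_a, cipher_b, max_value=255):
    """Return (NPCR %, UACI %) for two equally sized cipher images.

    NPCR: share of pixel positions whose values differ.
    UACI: mean absolute difference, normalized by the maximum pixel value.
    """
    if len(cipher_a) != len(cipher_b) or not cipher_a:
        raise ValueError("images must be non-empty and of equal size")
    n = len(cipher_a)
    changed = sum(1 for a, b in zip(cipher_a, cipher_b) if a != b)
    intensity = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * changed / n, 100.0 * intensity / (max_value * n)


# Made-up 4-pixel "images", only to show the call pattern.
npcr, uaci = npcr_uaci([0, 255, 10, 10], [255, 255, 10, 20])
```

In differential-attack tests the two inputs are typically ciphertexts of plaintexts that differ in a single pixel, so values near the theoretical ideals for random images indicate strong diffusion.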
Funding: Supported by the National Natural Science Foundation of China (62462040), the Yunnan Fundamental Research Projects (202501AT070345), the Major Science and Technology Projects in Yunnan Province (202202AD080013), the Sichuan Provincial Key Laboratory of Philosophy and Social Science Key Program on Language Intelligence Special Education (YYZN-2024-1), and the Photosynthesis Fund Class A (ghfund202407010460).
Abstract: Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detailed and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges the angles between feature vectors and reduces cosine similarity, weakening the attention mechanism's ability to identify ground objects of the same class. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule to replace the detailed feature representation of these objects. Then, the Laplacian operator of the information granule is applied to extract the edge features of the object as represented by the information granule. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% mOA, 82.64% mF1, and 71.50% mIoU on the GID dataset. Experimental results show that the model effectively handles high-resolution remote sensing images. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
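The argument that a dispersed embedding distribution lowers cosine similarity can be illustrated with a short sketch. The function name and the toy 2-D vectors are assumptions for illustration; real embeddings are high-dimensional, and this is not the model's attention code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Two embeddings of the same class: tightly clustered vs. dispersed.
tight = cosine_similarity([1.0, 0.0], [1.0, 0.1])      # small angle, close to 1
dispersed = cosine_similarity([1.0, 0.0], [1.0, 1.0])  # 45 degrees, about 0.707
```

The wider the angle between two same-class embeddings, the smaller their similarity score, which is the effect the information granule is meant to counteract.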
Abstract: The advancement of imaging resolution has made the impact of multi-frequency composite jitter in satellite platforms on non-collinear time delay and integration (TDI) charge-coupled device (CCD) imaging systems increasingly critical. Moreover, the accuracy of jitter detection is constrained by the limited inter-chip overlap region inherent to non-collinear TDI CCDs. To address these challenges, a multi-frequency jitter detection method is proposed, achieving sub-pixel level error extraction. Furthermore, a multi-frequency jitter fitting approach utilizing a scale-adjustable sliding window is introduced. For composite multi-frequency jitter, spectral analysis decomposes the relative jitter error curve, while the scale-adjustable sliding window enables frequency-division fitting and modeling. Validation experiments using Gaofen-8 (GF-8) remote sensing satellite imagery detected jitter at 0.65, 20, and 100 Hz in the cross-track direction and at 0.5, 100, and 120 Hz in the along-track direction, demonstrating the method's precision in detecting platform jitter at sub-pixel accuracy (<0.2 pixels) and its efficacy in fitting and modeling for non-collinear TDI CCD imaging systems subject to multi-frequency jitter.
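As a rough illustration of how spectral analysis separates a composite jitter curve into frequency components, the sketch below applies a plain discrete Fourier transform to a synthetic signal. Everything here is assumed for illustration: the function name, the 400 Hz sampling rate, the amplitudes, and the threshold. It does not reproduce the paper's detection or sliding-window fitting method.

```python
import cmath
import math


def dominant_frequencies(samples, sample_rate_hz, min_amplitude):
    """Return (frequency in Hz, amplitude) pairs whose DFT amplitude meets the threshold."""
    n = len(samples)
    found = []
    for k in range(1, n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        amplitude = 2.0 * abs(coeff) / n  # one-sided amplitude of bin k
        if amplitude >= min_amplitude:
            found.append((k * sample_rate_hz / n, amplitude))
    return found


# Synthetic error curve (in pixels) with 20 Hz and 100 Hz components, sampled for 1 s.
sample_rate_hz = 400.0
curve = [0.15 * math.sin(2 * math.pi * 20 * i / sample_rate_hz)
         + 0.05 * math.sin(2 * math.pi * 100 * i / sample_rate_hz)
         for i in range(400)]
peaks = dominant_frequencies(curve, sample_rate_hz, min_amplitude=0.01)
```

Resolving a component as low as 0.65 Hz would need a much longer observation window than this one-second toy signal, since frequency resolution is the inverse of the record length.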
Abstract: The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods relying on encoding and decoding frameworks. In response, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
Abstract: This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions, rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10^9 multiply-accumulate operations (MACs), which is less than one-tenth of that of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
Funding: Supported in part by the Jiangsu Province High Level “333” Program (0401206044), the National Natural Science Foundation of China (61801243, 62072255), the Program for Scientific Research Foundation for Talented Scholars of Jinling Institute of Technology (JIT-B-202031), the University Incubator Foundation of Jinling Institute of Technology (JIT-FHXM-202110), the Open Project of Fujian Provincial Key Lab. of Network Security and Cryptology (NSCL-KF2021-02), the Open Foundation of National Railway Intelligence Transportation System Engineering Tech. Research Center (RITS2021KF02), and the China Postdoctoral Science Foundation (2019M651914).
Abstract: This paper studies secured access for remote sensing image networks. Each sensor in such a network faces an information security risk when wirelessly uploading image information to the central collection point. To enhance the sensing quality of the remote upload, a passive reflection surface technique is employed. If an eavesdropper near the sensor keeps accessing the same network, it may receive the same image from the sensor. Our goal is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the sensor's uploading power, thereby enhancing the security of the remote sensing images. To achieve this goal, the secured energy efficiency performance is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, secured access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable and the objective of maximizing energy efficiency while satisfying any required maximum data rate of the eavesdropper. Finally, an analytical expression for the optimum uploading power is derived. Numerical simulations verify the design approach.
Abstract: This study aimed to establish a forest resources management information system for forest farms based on B/S-structured WebGIS. The trial forest farm of the Hunan Academy of Forestry served as the study area, and forest resources field survey data, ETM+ remote sensing data, and basic geographical information data served as the research material. The work comprised extraction of forest resource data for the forest farm, requirement analysis of the system functions, and establishment of the required software and hardware environment, with the aim of realizing management, query, editing, analysis, statistics, and other functions for forest resources information.
Funding: Supported by the National Natural Science Foundation of China (Award Number: 51704205), Key R&D Plan projects in Shanxi Province of China (Award Number: 201803D31044), the Education Department Natural Science Foundation in Guizhou of China (Award Number: KY (2017) 097), and the High-Level Talents Fund of Guizhou University of Engineering Science (Award Number: G2015005).
Abstract: Landslides, collapses, and cracks are the main types of geological hazards, and they constantly threaten human life and property. In emergency surveying and mapping, field investigation and manual recognition are time-consuming and laborious, while identifying ground hazards from satellite images suffers from time lag, low resolution, and difficulty in obtaining imagery on demand. In this paper, photogrammetry at a resolution of 10 cm per pixel is carried out with a DJ 4 UAV over a geological hazard-prone area of Taohuagou, Shanxi Province, China. A digital orthophoto model (DOM), a digital surface model (DSM), and a three-dimensional point cloud model (3DPCM) are generated for this region. Two visual interpretation methods are proposed: cracks are interpreted from the DOM (main) with the 3DPCM (auxiliary), and landslides and collapses are interpreted from the 3DPCM (main) with the DOM and DSM (auxiliary). Based on the low-altitude UAV remote sensing images, the shape characteristics, geological characteristics, and distribution of the identified hazards are analyzed. The results show that, with low-altitude UAV remote sensing images, combining main and auxiliary data can quickly and accurately identify landslides, collapses, and cracks: the accuracy of crack identification is 93%, and the accuracy of landslide and collapse identification is 100%. The hazards mainly occur in silty clay and mudstone geology and are greatly affected by slope foot excavation. This study can play a significant role in recognizing sudden hazards from low-altitude UAV remote sensing images.
Funding: Supported by the National Natural Science Foundation of China [grant number 41671452].
Abstract: Although the Convolutional Neural Network (CNN) has shown great potential for land cover classification, the frequently used single-scale convolution kernel limits the scope of information extraction. Therefore, we propose a Multi-Scale Fully Convolutional Network (MSFCN) with a multi-scale convolutional kernel as well as a Channel Attention Block (CAB) and a Global Pooling Module (GPM) to exploit discriminative representations from two-dimensional (2D) satellite images. Meanwhile, to explore the ability of the proposed MSFCN on spatio-temporal images, we extend our MSFCN to three dimensions using a three-dimensional (3D) CNN, capable of harnessing each land cover category's time series interaction from the reshaped spatio-temporal remote sensing images. To verify the effectiveness of the proposed MSFCN, we conduct experiments on two spatial datasets and two spatio-temporal datasets. The proposed MSFCN achieves an mIoU of 60.366% on the WHDLD dataset and 75.127% on the GID dataset, while the figures for the two spatio-temporal datasets are 87.753% and 77.156%. Extensive comparative experiments and ablation studies demonstrate the effectiveness of the proposed MSFCN.
Funding: The Public Science and Technology Research Fund Project of Ocean under contract No. 201105001; the National Natural Science Foundation of China under contract No. 41576174; the Public Science and Technology Research Fund Project of Surveying, Mapping and Geoinformation under contract No. 201512030.
Abstract: This paper presents a bathymetry inversion method using single-frame fine-resolution optical remote sensing imagery based on ocean-wave refraction and shallow-water wave theory. First, the relationship among water depth, wavelength, and wave radian frequency in shallow water was deduced based on shallow-water wave theory. Considering the complex wave distribution in the optical remote sensing imagery, Fast Fourier Transform (FFT) and spatial profile measurements were applied to measure the wavelengths. Then, the wave radian frequency was calculated by analyzing the long-distance fluctuation in the wavelength, which solved a key problem in obtaining the wave radian frequency from a single-frame image. A case study was conducted for Sanya Bay of Hainan Island, China. Single-frame fine-resolution optical remote sensing imagery from the QuickBird satellite was used to invert the bathymetry without external input parameters. The resulting digital elevation model (DEM) was evaluated against a sea chart with a scale of 1:25 000. The root-mean-square error of the inverted bathymetry was 1.07 m, and the relative error was 16.2%. Therefore, the proposed method has the advantage of requiring no true depths or environmental parameters, and it is feasible for mapping the bathymetry of shallow coastal water.
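The relationship among depth, wavelength, and radian frequency referred to above is commonly written as the linear wave dispersion relation, ω² = g·k·tanh(k·h) with k = 2π/L. The sketch below inverts it for depth. Using this specific relation is an assumption (the abstract does not give the formula), and the function names and sample numbers are illustrative.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def angular_frequency(depth_m, wavelength_m):
    """Forward model: omega from depth and wavelength via linear dispersion."""
    k = 2.0 * math.pi / wavelength_m
    return math.sqrt(G * k * math.tanh(k * depth_m))


def depth_from_wave(wavelength_m, omega):
    """Invert omega^2 = g*k*tanh(k*h) for the water depth h in metres."""
    k = 2.0 * math.pi / wavelength_m
    ratio = omega ** 2 / (G * k)
    if not 0.0 < ratio < 1.0:
        # ratio >= 1 corresponds to deep water, where depth cannot be recovered
        raise ValueError("wave is not depth-limited; no finite depth solution")
    return math.atanh(ratio) / k


# Round trip on made-up values: a 60 m wavelength observed over 5 m of water.
omega = angular_frequency(5.0, 60.0)
recovered_depth = depth_from_wave(60.0, omega)
```

The inversion only works where the wave "feels" the bottom, which is consistent with the method being aimed at shallow coastal water.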
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40871241, 40771170) and the National High Technology Research and Development Program of China (No. 2007AA12Z176).
Abstract: Wetland research has become a hot spot linking multiple disciplines. Wetland classification and mapping is the basis for wetland research. It is difficult to generate wetland data sets using traditional methods because of the low accessibility of wetlands, hence remote sensing data have become one of the primary data sources in wetland research. This paper presents a case study conducted at the core area of Honghe National Nature Reserve in the Sanjiang Plain, Northeast China. In this study, three images, acquired by airship, Thematic Mapper (TM), and SPOT 5, were selected to produce wetland maps at three different wetland landscape levels. After assessing the classification accuracies of the three maps, we compared the different wetland mapping results: 11 plant communities from the airship image, 6 plant ecotypes from the TM image, and 9 landscape classifications from the SPOT 5 image. We discussed the different characteristics of the hierarchical ecosystem classifications based on the spatial scales of the different images. The results indicate that the spatial scales of remote sensing data have an important link to the hierarchies of wetland plant ecosystems displayed on the wetland landscape maps. The richness of wetland landscape information derived from an image closely relates to its spatial resolution. This study can enrich the ecological classification methods and mapping techniques dealing with the spatial scales of different remote sensing images. With a better understanding of classification accuracies in mapping wetlands using different scales of remote sensing data, we can devise an appropriate approach for dealing with the scale issue of remote sensing images.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61640412 and 61762052), the Natural Science Foundation of Jiangxi Province (No. 20192BAB207021), and the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ170633 and GJJ170632).
Abstract: Remote sensing image registration is still a challenging task owing to the significant influence of nonlinear differences between remote sensing images. To solve this problem, this paper proposes a novel approach to feature-based remote sensing image registration. There are two key contributions: 1) we put forward an improved strategy of composite nonlinear diffusion filtering according to the scale factors in multi-scale space, and 2) we design a multi-scale pyramid space with gradually decreasing resolution. In addition, a binary code string serves as the feature descriptor to improve matching efficiency. Extensive experiments on feature extraction and feature registration are performed on different categories of remote sensing image datasets. The experimental results demonstrate the superiority of our proposed scheme compared with other classical algorithms in terms of correct matching ratio, accuracy, and computation efficiency.
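Binary descriptors are typically fast to match because they can be compared with the Hamming distance, a bitwise XOR followed by a bit count. The sketch below shows that general idea with descriptors stored as Python integers. It is an assumption that the paper matches its descriptors this way, and the names and bit patterns are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def best_match(query: int, candidates: list) -> int:
    """Index of the candidate descriptor closest to the query in Hamming distance."""
    if not candidates:
        raise ValueError("no candidate descriptors to match against")
    return min(range(len(candidates)),
               key=lambda i: hamming_distance(query, candidates[i]))


# Toy 4-bit descriptors; real descriptors are usually hundreds of bits long.
match_index = best_match(0b1011, [0b0000, 0b1010, 0b0100])
```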
Funding: National Natural Science Foundation of China (No. 41871305); National Key Research and Development Program of China (No. 2017YFC0602204); Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUGQY1945); Open Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education and the Fundamental Research Funds for the Central Universities (No. GLAB2019ZR02); Open Fund of Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, China (No. KF-2020-05-068).
Abstract: Building detection plays an important role in urban planning, smart cities, and military applications. To address the high overlap ratio of detection boxes in dense building detection on high-resolution remote sensing images, we present an effective YOLOv3 framework, corner regression-based YOLOv3 (Correg-YOLOv3), to localize dense buildings accurately. This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss term on building vertex offsets relative to the center point of the bounding box. By extending the output dimensions, the trained model is able to output the rectangular bounding boxes and the building vertices simultaneously. Finally, we evaluate the performance of Correg-YOLOv3 on our self-produced data set and provide a qualitative and quantitative comparative analysis. The experimental results show high performance in precision (96.45%), recall (95.75%), F1 score (96.10%), and average precision (98.05%), which are 2.73%, 5.4%, 4.1%, and 4.73% higher than those of YOLOv3, respectively. Therefore, our proposed algorithm effectively tackles the problem of dense building detection in high-resolution images.
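The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below performs that check. The function name is illustrative, and the inputs are the percentages quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        raise ValueError("precision and recall cannot both be zero")
    return 2.0 * precision * recall / (precision + recall)


# Precision 96.45% and recall 95.75%, as quoted in the abstract.
reported_f1 = f1_score(96.45, 95.75)
```

Rounded to two decimals, the result agrees with the 96.10% stated in the abstract.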