Conventional change detection approaches are mainly based on per-pixel processing, which ignores the sub-pixel spectral variation resulting from spectral mixture. Especially for medium-resolution remote sensing images used in urban land-cover change monitoring, the land use/cover components within a single pixel are usually complicated and heterogeneous because of the limited spatial resolution. Thus, traditional hard detection methods based on the pure-pixel assumption may inevitably lead to a high level of omission and commission errors, degrading the overall accuracy of change detection. To address this issue and exploit spectral variation at the sub-pixel level, a novel change detection scheme is designed based on spectral mixture analysis and decision-level fusion. A nonlinear spectral mixture model is selected for spectral unmixing, and change detection is implemented at the sub-pixel level by investigating subtle inner-pixel changes and combining multiple composition evidences. The proposed method is tested on multi-temporal Landsat Thematic Mapper and China–Brazil Earth Resources Satellite remote sensing images for land-cover change detection over urban areas. The effectiveness of the proposed approach is confirmed in terms of several accuracy indices in comparison with two pixel-based change detection methods (i.e., change vector analysis and a principal component analysis-based method). In particular, the proposed sub-pixel change detection approach not only provides binary change information but also characterizes change direction and intensity, which greatly extends the semantic meaning of the detected change targets.
Background: Diabetic macular edema is a prevalent retinal condition and a leading cause of visual impairment among diabetic patients. Early detection of affected areas is beneficial for effective diagnosis and treatment. Traditionally, diagnosis relies on optical coherence tomography imaging interpreted by ophthalmologists. However, this manual image interpretation is often slow and subjective. Therefore, developing automated segmentation for macular edema images is essential to improve diagnostic efficiency and accuracy. Methods: To improve clinical diagnostic efficiency and accuracy, we propose a SegNet network structure integrated with a convolutional block attention module (CBAM). This network introduces a multi-scale input module, the CBAM attention mechanism, and skip connections. The multi-scale input module enhances the network’s perceptual capabilities, while the lightweight CBAM effectively fuses relevant features across channels and spatial dimensions, allowing for better learning of varying information levels. Results: Experimental results demonstrate that the proposed network achieves an IoU of 80.127% and an accuracy of 99.162%. Compared to the traditional segmentation network, this model has fewer parameters, faster training and testing speed, and superior performance on semantic segmentation tasks, indicating its high practical applicability. Conclusion: The C-SegNet proposed in this study enables accurate segmentation of diabetic macular edema lesion images, which facilitates quicker diagnosis for healthcare professionals.
Reversible data hiding (RDH) enables secret data embedding while preserving complete cover image recovery, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used in multi-stego images provides good image quality but often results in low embedding capacity. To address this limitation, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order. Four secret bits are embedded into each block’s maximum pixel value, while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is also applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data. Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches, advancing the field of reversible steganography.
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) Replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) A multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to synergize pixel-level accuracy and perceptual quality; (3) A multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
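The loss combination named in the abstract above can be illustrated with a minimal sketch. It assumes the common Charbonnier form sqrt(d² + ε²) and a fixed weighted sum with the stated weights (α=1, β=0.5, γ=0.1); the abstract describes the weighting as dynamic, which is not modeled here. The function names, the `eps` value, and the scalar placeholders for the perceptual and adversarial terms are hypothetical and not taken from the paper.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values.

    eps is an assumed smoothing constant; the abstract does not state one.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


def joint_loss(charbonnier, perceptual, adversarial,
               alpha=1.0, beta=0.5, gamma=0.1):
    """Fixed weighted sum using the weights quoted in the abstract.

    perceptual and adversarial are plain numbers here; in a real SRGAN
    they would come from a VGG feature comparison and a discriminator.
    """
    return alpha * perceptual + beta * charbonnier + gamma * adversarial


if __name__ == "__main__":
    pixel_term = charbonnier_loss([0.2, 0.8, 0.5], [0.25, 0.7, 0.5])
    print(joint_loss(pixel_term, perceptual=0.3, adversarial=0.9))
```

Unlike an L2 penalty, the Charbonnier term grows roughly linearly for large residuals, which is the usual argument for its robustness to outlier pixels.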
This study conducted computer-aided image analysis of land use and land cover in Xilin River Basin, Inner Mongolia, using 4 sets of Landsat TM/ETM+ images acquired on July 31, 1987, August 11, 1991, September 27, 1997 and May 23, 2000, respectively. Primarily, 17 sub-class land cover types were recognized, including nine grassland types at community level: F.sibiricum steppe, S.baicalensis steppe, A.chinensis+ forbs steppe, A.chinensis+ bunchgrass steppe, A.chinensis+ Ar.frigida steppe, S.grandis+ A.chinensis steppe, S.grandis+ bunchgrass steppe, S.krylavii steppe, Ar.frigida steppe and eight non-grassland types: active cropland, harvested cropland, urban area, wetland, desertified land, saline and alkaline land, cloud, water body + cloud shadow. To eliminate the classification error existing among different sub-types of the same gross type, the 17 sub-class land cover types were grouped into five gross types: meadow grassland, temperate grassland, desert grassland, cropland and non-grassland. The overall classification accuracy of the five land cover types was 81.0% for 1987, 81.7% for 1991, 80.1% for 1997 and 78.2% for 2000.
Flood disasters can have a serious impact on people's production and lives, and can cause huge losses in lives and property security. Based on multi-source remote sensing data, this study established decision tree classification rules through multi-source and multi-temporal feature fusion, classified ground objects before the disaster and extracted flood information in the disaster area based on optical images during the disaster, so as to achieve rapid acquisition of the disaster situation of each disaster-bearing object. In the case of Qianliang Lake, which suffered from flooding in 2020, the results show that decision tree classification algorithms based on multi-temporal features can effectively integrate multi-temporal and multispectral information to overcome the shortcomings of single-temporal image classification and achieve ground-truth object classification.
Alzheimer’s Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer’s diagnosis from MRI scans. Image enhancement is done using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To make classification more robust, the design integrates class weighting techniques and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model gives a 98.88% accuracy rate and a 0.9614 MCC score. We also performed a 10-fold cross-validation experiment with an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC score of 0.9429 across folds. The proposed framework outperforms the state-of-the-art models with a 98% weighted F1-score while decreasing misdiagnoses for every AD stage. The confusion matrix analysis shows that the model clearly separates the AD progression stages. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer’s diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
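Two ingredients named in the abstract above, class weighting and the Matthews Correlation Coefficient, can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the MCC shown is the binary form computed from confusion counts (the paper evaluates several AD stages, which needs the multi-class generalization), and the inverse-frequency weighting rule n / (k · count) is one common heuristic that the abstract does not confirm. All function names are hypothetical.

```python
import math
from collections import Counter


def binary_mcc(tp, tn, fp, fn):
    """Binary Matthews Correlation Coefficient from confusion counts.

    Returns 0.0 when any marginal is empty (denominator is zero),
    which is the usual convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def inverse_frequency_weights(labels):
    """Assumed weighting rule: weight_c = n / (k * count_c).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    print(binary_mcc(tp=45, tn=40, fp=5, fn=10))
    print(inverse_frequency_weights(["mild", "mild", "mild", "moderate"]))
```

MCC uses all four confusion-matrix cells, so it stays informative on imbalanced data where plain accuracy can look high for a trivial classifier.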
Satellite images are considered reliable data that preserve land cover information. In the field of remote sensing, these images allow relevant analyses of changes in space over time through the use of computer tools. In this study, we applied the “discriminant” change detection algorithm and verified its effectiveness in multi-temporal studies. We also determined the change in forest dynamics in the Ikongo district of Madagascar between 2000 and 2015. For processing, we used Landsat TM satellite images for the years 2000, 2005 and 2010, as well as ETM+ for 2015. The analyses showed that between 2000 and 2005, 1.4% of natural forest disappeared, and between 2005 and 2010, forest degradation was 1.8%. Between 2010 and 2015, about 0.5% of the natural forest conserved in 2010 disappeared. Furthermore, we found that the discriminant algorithm is considerably efficient in terms of monitoring the dynamics of forest cover change.
Colombo port and Hambantota port in Sri Lanka play a key role in transiting and supporting the shipping trade of "the 21st-Century Maritime Silk Road". In recent years, Chinese enterprises have made huge investments in the infrastructure construction of Colombo port and Hambantota port. The construction progress and development trend of the two ports have been attracting the attention of Chinese investment enterprises and society. In this paper, multi-temporal high spatial resolution remote sensing images are used to monitor the infrastructure construction of Colombo port and Hambantota port from 2010 to 2017. According to the interpreted infrastructure information of the two ports, construction of the international container terminal of Colombo and of Hambantota port has been completed. By the end of 2017, the international container terminal of Colombo had built 28.8 ha of container yards and 32.6 ha of roads. South of the international container terminal of Colombo, 62.2 ha of reclaimed area was built for the planned port city. In Hambantota port, 77 ha of container yards, 48 ha of roads and 2.9 ha of oil storage areas were constructed during this period. Meanwhile, the analysis of the potential storage capacity of Colombo port and Hambantota port shows that the throughput of Colombo port may increase by 3 million tons per year, while the throughput of Hambantota port will exceed its designed 2.5 million tons per year. These analysis results provide a useful reference for Chinese investment enterprises and for related research on "the Belt and Road".
The use of unmanned aerial vehicles (UAVs) for forest monitoring has grown significantly in recent years, providing information with high spatial resolution and temporal versatility. UAVs with multispectral sensors allow the use of indices such as the normalized difference vegetation index (NDVI), which indicates the vigor, physiological stress and photosynthetic activity of vegetation. This study aimed to analyze the spectral responses and variations of NDVI in tree crowns, as well as their correlation with climatic factors over the course of one year. The study area encompassed a 1.6-ha site in Durango, Mexico, where Pinus cembroides, Pinus engelmannii, and Quercus grisea coexist. Multispectral images were acquired with a UAV, and information on meteorological variables was obtained from the NASA/POWER database. An ANOVA explored possible differences in NDVI among the three species. Pearson correlation was performed to identify the linear relationship between NDVI and meteorological variables. Significant differences in NDVI values were found at the genus level (Pinus and Quercus), possibly related to the physiological features of the species and their phenology. Quercus grisea had the lowest NDVI values throughout the year, which may be attributed to its sensitivity to relative humidity and temperature. Although the use of a UAV with a multispectral sensor for NDVI monitoring allowed genera differentiation, in more complex forest analyses hyperspectral and LiDAR sensors should be integrated, and other vegetation indices should be considered.
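The two computations at the core of the abstract above, NDVI and its Pearson correlation with a meteorological variable, can be sketched as follows. NDVI uses the standard definition (NIR − Red) / (NIR + Red). The reflectance and temperature values in the usage example are made-up placeholders, not data from the study, and the function names are hypothetical.

```python
import math


def ndvi(nir, red):
    """Standard NDVI: (NIR - Red) / (NIR + Red).

    Returns 0.0 when both bands are zero to avoid division by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Placeholder monthly crown reflectances (NIR, Red) and temperatures.
    monthly_bands = [(0.45, 0.10), (0.50, 0.08), (0.40, 0.12), (0.35, 0.15)]
    monthly_ndvi = [ndvi(nir, red) for nir, red in monthly_bands]
    monthly_temp = [18.0, 21.5, 16.0, 12.5]
    print(pearson(monthly_ndvi, monthly_temp))
```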
Constrained by a complex imaging mechanism and unusual visual appearance, change detection with synthetic aperture radar (SAR) images has been a difficult research topic, especially in urban areas. Although existing studies have extended from bi-temporal data pairs to multi-temporal datasets to derive richer information, two problems remain in practical applications. First, change indicators constructed from incoherent features alone cannot characterize the changed objects accurately. Second, the results of pixel-level methods are usually presented as a noisy binary map, making the spatial change unintuitive and the temporal change of a single pixel meaningless. In this study, we propose an unsupervised man-made object change detection framework using both coherent and incoherent features derived from multi-temporal SAR images. The coefficients of variation of the time-series incoherent features and the man-made object index (MOI) defined with coherent features are first combined to identify the initial change pixels. Afterwards, an improved spatiotemporal clustering algorithm is developed based on density-based spatial clustering of applications with noise (DBSCAN) and dynamic time warping (DTW), which transforms the initial results into noiseless object-level patches and takes the cluster center as a representative of the man-made object to determine the change pattern of each patch. An experiment with a stack of 10 TerraSAR-X images in Stripmap mode demonstrated that this method is effective in urban scenes and is potentially applicable to wide-area change detection.
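The first step described in the abstract above uses the coefficient of variation of each pixel's time series as a change indicator. A minimal sketch of that single ingredient follows; it assumes the population standard deviation divided by the mean, and the threshold value is purely illustrative. The paper combines this indicator with the MOI from coherent features, which is not reproduced here, and the function names are hypothetical.

```python
import math


def coefficient_of_variation(series):
    """Population standard deviation divided by the mean of a time series."""
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    if mean == 0:
        # Non-negative intensities with zero mean are all zero: no variation.
        return 0.0
    variance = sum((v - mean) ** 2 for v in series) / n
    return math.sqrt(variance) / mean


def initial_change_mask(pixel_series, threshold=0.25):
    """Flag pixels whose temporal variability exceeds an assumed threshold.

    pixel_series is a list with one intensity time series per pixel.
    """
    return [coefficient_of_variation(s) > threshold for s in pixel_series]


if __name__ == "__main__":
    stable = [1.0, 1.1, 0.9, 1.0]    # nearly constant backscatter
    changed = [0.2, 0.2, 1.5, 1.6]   # abrupt jump mid-series
    print(initial_change_mask([stable, changed]))
```

A high coefficient of variation only says that a pixel varied over time; deciding when and how it changed is what the later clustering stage in the abstract addresses.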
As a widely consumed and influential natural plant beverage, tea is widely planted in subtropical and tropical areas all over the world. Affected by (sub)tropical climate characteristics, the underlying surface of the tea distribution area is extremely complex, with a variety of vegetation types. In addition, tea distribution is scattered and fragmented in most of China. Therefore, it is difficult to obtain accurate tea information based on coarse-resolution remote sensing data and existing feature extraction methods. This study proposed a boundary-enhanced, object-oriented random forest method on the basis of high-resolution GF-2 and multi-temporal Sentinel-2 data. This method uses multispectral indices, textures, vegetation indices, and variation characteristics of time-series NDVI from the multi-temporal Sentinel-2 imagery to obtain abundant features related to the growth of tea plantations. To reduce feature redundancy and computation time, a feature elimination algorithm based on Mean Decrease Accuracy (MDA) was used to generate the optimal feature set. Considering the serious boundary inconsistency problem caused by the complex and fragmented land cover types, the high-resolution GF-2 image was segmented with the MultiResolution Segmentation (MRS) algorithm to assist the segmentation of Sentinel-2, which contributes to delineating meaningful objects and enhancing the reliability of the boundaries of tea plantations. Finally, the object-oriented random forest method was used to extract tea information based on the optimal feature combination in Jingmai Mountain, Yunnan Province. The resulting tea plantation map had high accuracy, with a 95.38% overall accuracy and a 0.91 kappa coefficient. We conclude that the proposed method is effective for mapping tea plantations in highly heterogeneous mountainous areas and has the potential for mapping tea plantations over large areas.
AIM: To find an effective contrast enhancement method for retinal images that supports effective segmentation of retinal features. METHODS: A novel image preprocessing method that used neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) to improve retinal image contrast was suggested to aid in the accurate identification of retinal disorders and improve the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to effectively denoise the images without compromising important retinal structures. The novel NICLAHE algorithm was inspired by the classical CLAHE algorithm, but enhanced it by selecting the clip limits and tile sizes dynamically relative to the pixel values in an image, as opposed to using fixed values. It was evaluated on the DRIVE and high-resolution fundus (HRF) datasets using conventional quality measures. RESULTS: The proposed preprocessing technique was applied to two retinal image databases, DRIVE and HRF, with four quality metrics: root mean square error (RMSE), peak signal to noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique outperformed the traditional enhancement methods on both datasets. To assess the compatibility of the method with automated diagnosis, a deep learning framework named ResNet was applied to the segmentation of retinal blood vessels. Sensitivity, specificity, precision and accuracy were used to analyse the performance. NICLAHE-enhanced images outperformed the traditional techniques on both datasets with improved accuracy. CONCLUSION: NICLAHE provides better results than traditional methods, with less error and improved contrast-related values. The enhanced images were subsequently evaluated by sensitivity, specificity, precision, and accuracy, which yielded better results on both datasets.
The presence of a positive deep surgical margin in tongue squamous cell carcinoma (TSCC) significantly elevates the risk of local recurrence. Therefore, a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection. In this study, we integrate Raman imaging technology with an artificial intelligence (AI) generative model, proposing an innovative approach for intraoperative margin status diagnosis. This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images, which are then transformed into hematoxylin-eosin (H&E)-stained histopathological images using an AI generative model for histopathological diagnosis. The generated H&E-stained images clearly illustrate the tissue’s pathological conditions. Independently reviewed by three pathologists, the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%. Notably, it outperforms current clinical practices, especially in TSCC with positive lymph node metastasis or moderately differentiated grades. This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations, promising a versatile diagnostic tool beyond TSCC.
The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles (UAVs) has emerged as a prominent research focus. Due to the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, a You Only Look Once v8 (YOLOv8) model is modified in four aspects: the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening. The resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection indices of the proposed model are significantly improved without increasing the number of model parameters and with only limited growth in computation. Moreover, this model also has the best performance compared to other detection models, demonstrating its advancement within this category of tasks.
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which are used to extract low-frequency and high-frequency information from the image. This extraction may leave some information uncaptured, so a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, the attention strategy and fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Breast cancer is one of the major causes of death in women. Early diagnosis is important for screening and for controlling the mortality rate. Thus, a computer-aided diagnosis system is highly desirable for diagnosing breast cancer at an early stage. Ultrasound is an important examination technique for breast cancer diagnosis due to its low cost. Recently, many learning-based techniques have been introduced to classify breast cancer using the breast ultrasound imaging dataset (BUSI); however, manual handling is not easy and is time-consuming. The authors propose an EfficientNet-integrated ResNet deep network and an XAI-based framework for accurately classifying breast cancer (malignant and benign). In the initial step, data augmentation is performed to increase the number of training samples. For this purpose, three pixel-flip mathematical equations are introduced: horizontal, vertical, and 90°. Next, two pretrained deep learning models are employed, with some layers skipped, and fine-tuned. Both fine-tuned models are then trained using a deep transfer learning process, and features are extracted from the deeper layer. Explainable artificial intelligence is used to analyse the performance of the trained models. After that, a new feature selection technique based on the cuckoo search algorithm, called cuckoo search controlled standard error mean, is proposed. This technique selects the best features, which are fused using a new parallel zero-padding maximum correlated coefficient method. In the end, the selection algorithm is applied again to the fused feature vector, which is classified using machine learning algorithms. The experimental process of the proposed framework is conducted on the publicly available BUSI dataset and obtained 98.4% and 98% accuracy in two different experiments. A comparison of the proposed framework with recent techniques also shows improved accuracy. In addition, the proposed framework required less execution time than the original deep learning models.
Objective: Early prediction of response before neoadjuvant chemotherapy (NAC) is crucial for personalized treatment plans for locally advanced breast cancer patients. We aim to develop a multi-task model using multiscale whole slide image (WSI) features to predict the response to breast cancer NAC more finely. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of the weakly-supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pathological complete response (pCR) to NAC. Our approach models two-by-two feature interactions across scales by employing concatenate fusion of single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent predictive performance for the prediction of treatment response, with area under the receiver operating characteristic curves (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806−0.933] in the internal testing set and 0.841 (95% CI: 0.814−0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763−0.964) in the internal testing set and 0.821 (95% CI: 0.763−0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754−0.903) and 0.821 (95% CI: 0.692−0.949) in treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P<0.05). Heatmaps were employed to interpret the decision-making basis of the model. Furthermore, exploration of the biological basis showed that high DLMM scores were associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
Kinship verification is a key biometric recognition task that determines biological relationships based on physical features. Traditional methods predominantly use facial recognition, leveraging established techniques and extensive datasets. However, recent research has highlighted ear recognition as a promising alternative, offering advantages in robustness against variations in facial expressions, aging, and occlusions. Despite its potential, a significant challenge in ear-based kinship verification is the lack of large-scale datasets necessary for training deep learning models effectively. To address this challenge, we introduce the EarKinshipVN dataset, a novel and extensive collection of ear images designed specifically for kinship verification. This dataset consists of 4876 high-resolution color images from 157 multiracial families across different regions, forming 73,220 kinship pairs. EarKinshipVN, a diverse and large-scale dataset, advances kinship verification research using ear features. Furthermore, we propose the Mixer Attention Inception (MAI) model, an improved architecture that enhances feature extraction and classification accuracy. The MAI model fuses Inceptionv4 and MLP Mixer, integrating four attention mechanisms to enhance spatial and channel-wise feature representation. Experimental results demonstrate that MAI significantly outperforms traditional backbone architectures. It achieves an accuracy of 98.71%, surpassing Vision Transformer models while reducing computational complexity by up to 95% in parameter usage. These findings suggest that ear-based kinship verification, combined with an optimized deep learning model and a comprehensive dataset, holds significant promise for biometric applications.
In this paper,the spatio-temporal variation and propagation direction of coal fire were studied in the Jharia Coalfield(JCF),India during 2006–2015 through satellite-based night-time land surface temperature(LST)imag...In this paper,the spatio-temporal variation and propagation direction of coal fire were studied in the Jharia Coalfield(JCF),India during 2006–2015 through satellite-based night-time land surface temperature(LST)imaging.The LST was retrieved from Advanced Spaceborne Thermal Emission and Reflection Radiometer(ASTER)night-time thermal-infrared data by a robust split-window algorithm based on scene-specific regression coefficients,band-specific hybrid emissivity,and night-time atmospheric transmittance.The LST-profile-based coal fire detection algorithm was formulated through statistical analysis of the LST values along multiple transects across diverse coal fire locations in the JCF in order to compute date-specific threshold temperatures for separating thermally-anomalous and background pixels.This algorithm efficiently separates surface fire,subsurface fire,and thermally-anomalous transitional pixels.During the observation period,it was noticed that the coal fire area increased significantly,which resulted from new coal fire at many places owing to extensive opencast-mining operations.It was observed that the fire propagation occurred primarily along the dip direction of the coal seams.At places,lateral-propagation of limited spatial extent was also observed along the strike direction possibly due to spatial continuity of the coal seams along strike.Moreover,the opencast-mining activities carried out during 2009–2015 and the structurally weak planes facilitated the fire propagation.展开更多
Funding: Partially supported by the National Natural Science Foundation of China (No. 41171323); the Jiangsu Provincial Natural Science Foundation (No. BK2012018); the Key Laboratory of Geo-Informatics of National Administration of Surveying, Mapping and Geoinformation of China (No. 201109); the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD); and the Fundamental Research Funds for the Central Universities.
Abstract: Conventional change detection approaches are mainly based on per-pixel processing, which ignores the sub-pixel spectral variation resulting from spectral mixture. In medium-resolution remote sensing images used for urban land-cover change monitoring in particular, the land use/cover components within a single pixel are usually complicated and heterogeneous because of the limited spatial resolution. Traditional hard detection methods based on the pure-pixel assumption may therefore produce high omission and commission errors, degrading the overall accuracy of change detection. To address this issue and exploit spectral variation at the sub-pixel level, a novel change detection scheme is designed based on spectral mixture analysis and decision-level fusion. A nonlinear spectral mixture model is selected for spectral unmixing, and change detection is implemented at the sub-pixel level by investigating subtle inner-pixel changes and combining multiple pieces of composition evidence. The proposed method is tested on multi-temporal Landsat Thematic Mapper and China–Brazil Earth Resources Satellite remote sensing images for land-cover change detection over urban areas. Its effectiveness is confirmed in terms of several accuracy indices against two pixel-based change detection methods (change vector analysis and a principal component analysis-based method). In particular, the proposed sub-pixel approach provides not only binary change information but also a characterization of change direction and intensity, which greatly extends the semantic meaning of the detected change targets.
Funding: Supported by the Guangdong Pharmaceutical University 2024 Higher Education Research Projects (GKP202403, GMP202402) and the Guangdong Pharmaceutical University College Students' Innovation and Entrepreneurship Training Programs (Grant Nos. 202504302033, 202504302034, 202504302036, and 202504302244).
Abstract: Background: Diabetic macular edema is a prevalent retinal condition and a leading cause of visual impairment among diabetic patients. Early detection of affected areas is beneficial for effective diagnosis and treatment. Traditionally, diagnosis relies on optical coherence tomography imaging interpreted by ophthalmologists. However, this manual image interpretation is often slow and subjective. Developing automated segmentation for macular edema images is therefore essential to improve diagnostic efficiency and accuracy. Methods: To improve clinical diagnostic efficiency and accuracy, we propose a SegNet network structure integrated with a convolutional block attention module (CBAM). The network introduces a multi-scale input module, the CBAM attention mechanism, and skip connections. The multi-scale input module enhances the network's perceptual capabilities, while the lightweight CBAM effectively fuses relevant features across channel and spatial dimensions, allowing better learning of information at varying levels. Results: Experimental results demonstrate that the proposed network achieves an IoU of 80.127% and an accuracy of 99.162%. Compared with the traditional segmentation network, the model has fewer parameters, faster training and testing speed, and superior performance on semantic segmentation tasks, indicating high practical applicability. Conclusion: The C-SegNet proposed in this study enables accurate segmentation of diabetic macular edema lesion images, which facilitates quicker diagnosis for healthcare professionals.
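The two scores reported above measure different things, and a small example can make the gap between them concrete. The sketch below is not the authors' evaluation code; it assumes binary lesion masks flattened to 0/1 lists, and the function names `iou` and `pixel_accuracy` are illustrative.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so report 1.0 instead of dividing by zero.
    return inter / union if union else 1.0


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference label."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    correct = sum(1 for p, t in zip(pred, truth) if bool(p) == bool(t))
    return correct / len(truth)


# Hypothetical 10-pixel example: one true lesion pixel, two predicted.
truth = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(iou(pred, truth), pixel_accuracy(pred, truth))
```

When the lesion covers only a small part of the image, background pixels dominate the accuracy score, so accuracy can stay high while IoU is much lower.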
Funding: Funded by the University of Transport and Communications (UTC) under grant number T2025-CN-004.
Abstract: Reversible data hiding (RDH) enables secret data embedding while preserving complete cover image recovery, making it crucial for applications requiring image integrity. The pixel value ordering (PVO) technique used in multi-stego images provides good image quality but often results in low embedding capability. To address these challenges, this paper proposes a high-capacity RDH scheme based on PVO that generates three stego images from a single cover image. The cover image is partitioned into non-overlapping blocks with pixels sorted in ascending order. Four secret bits are embedded into each block's maximum pixel value, while three additional bits are embedded into the second-largest value when the pixel difference exceeds a predefined threshold. A similar embedding strategy is applied to the minimum side of the block, including the second-smallest pixel value. This design enables each block to embed up to 14 bits of secret data. Experimental results demonstrate that the proposed method achieves significantly higher embedding capacity and improved visual quality compared to existing triple-stego RDH approaches, advancing the field of reversible steganography.
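The 14-bit figure follows from adding the per-pixel bit counts stated above (4 + 3 on the maximum side, 4 + 3 on the minimum side). The sketch below only tallies those counts; it does not implement the embedding itself. The block size and the exact threshold test are not given in the abstract, so the block dimensions are hypothetical parameters and the threshold outcome is passed in as a boolean flag.

```python
def block_capacity_bits(extra_on_max_side, extra_on_min_side):
    """Bits carried by one block, using the counts stated in the abstract.

    The flags say whether the threshold condition (not specified here)
    allowed the second-largest / second-smallest pixel to be used.
    """
    bits = 4 + 4              # largest and smallest pixel values
    if extra_on_max_side:
        bits += 3             # second-largest pixel value
    if extra_on_min_side:
        bits += 3             # second-smallest pixel value
    return bits


def image_capacity_upper_bound(height, width, block_h, block_w):
    """Upper bound in bits if every non-overlapping block reached 14 bits.

    block_h and block_w are assumptions for illustration only.
    """
    blocks = (height // block_h) * (width // block_w)
    return blocks * block_capacity_bits(True, True)


print(block_capacity_bits(True, True))            # both sides use the extra pixel
print(image_capacity_upper_bound(512, 512, 4, 4)) # hypothetical 4x4 blocks
```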
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03); the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04); and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting Charbonnier loss (β = 0.5), Visual Geometry Group (VGG) perceptual loss (α = 1), and adversarial loss (γ = 0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
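The loss combination described above can be sketched in a few lines. This is an illustrative sketch, not the authors' training code: it uses the common Charbonnier form sqrt(d² + ε²), the value of `eps` is an assumption, the weights are fixed at the stated values (the abstract describes them as dynamically weighted), and the perceptual and adversarial terms are passed in as precomputed numbers.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2) over paired pixel values.

    eps is an assumed smoothing constant; the abstract does not give one.
    """
    if len(pred) != len(target):
        raise ValueError("inputs must have the same length")
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)


def total_loss(charbonnier, perceptual, adversarial,
               alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum using the weights stated in the abstract:
    beta for Charbonnier, alpha for VGG perceptual, gamma for adversarial."""
    return beta * charbonnier + alpha * perceptual + gamma * adversarial


pixel_term = charbonnier_loss([0.2, 0.8, 0.5], [0.0, 1.0, 0.5])
print(total_loss(pixel_term, perceptual=0.3, adversarial=0.7))
```

For large errors the Charbonnier penalty grows roughly like the absolute error, and near zero it stays smooth, which is the usual reason it is preferred over plain L1 or L2 terms.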
Funding: Knowledge Innovation Project of CAS (No. KZCX02-308); the NASA Land Use and Land Cover Change Program (No. NAG5-11160).
Abstract: This study conducted computer-aided image analysis of land use and land cover in the Xilin River Basin, Inner Mongolia, using four sets of Landsat TM/ETM+ images acquired on July 31, 1987, August 11, 1991, September 27, 1997 and May 23, 2000. Initially, 17 sub-class land cover types were recognized, including nine grassland types at community level (F.sibiricum steppe, S.baicalensis steppe, A.chinensis + forbs steppe, A.chinensis + bunchgrass steppe, A.chinensis + Ar.frigida steppe, S.grandis + A.chinensis steppe, S.grandis + bunchgrass steppe, S.krylavii steppe, Ar.frigida steppe) and eight non-grassland types (active cropland, harvested cropland, urban area, wetland, desertified land, saline and alkaline land, cloud, water body + cloud shadow). To eliminate classification errors among different sub-types of the same gross type, the 17 sub-class land cover types were grouped into five gross types: meadow grassland, temperate grassland, desert grassland, cropland and non-grassland. The overall classification accuracy of the five land cover types was 81.0% for 1987, 81.7% for 1991, 80.1% for 1997 and 78.2% for 2000.
Abstract: Flood disasters can have a serious impact on people's production and lives, and can cause huge losses of life and property. Based on multi-source remote sensing data, this study established decision tree classification rules through multi-source and multi-temporal feature fusion, classified ground objects before the disaster, and extracted flood information in the disaster area from optical images acquired during the disaster, so as to rapidly determine the disaster situation of each disaster-bearing object. In the case of Qianliang Lake, which was flooded in 2020, the results show that decision tree classification algorithms based on multi-temporal features can effectively integrate multi-temporal and multispectral information, overcome the shortcomings of single-temporal image classification, and achieve ground-truth object classification.
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2025-02-01295.
Abstract: Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that significantly affects cognitive function, making early and accurate diagnosis essential. Traditional Deep Learning (DL)-based approaches often struggle with low-contrast MRI images, class imbalance, and suboptimal feature extraction. This paper develops a hybrid DL system that combines MobileNetV2 with adaptive classification methods to improve Alzheimer's diagnosis from MRI scans. Image enhancement is performed using Contrast-Limited Adaptive Histogram Equalization (CLAHE) and Enhanced Super-Resolution Generative Adversarial Networks (ESRGAN). To make classification more robust, the design integrates class weighting techniques and a Matthews Correlation Coefficient (MCC)-based evaluation method. The trained and validated model achieves 98.88% accuracy and an MCC score of 0.9614. We also performed a 10-fold cross-validation experiment, with an average accuracy of 96.52% (±1.51), a loss of 0.1671, and an MCC score of 0.9429 across folds. The proposed framework outperforms state-of-the-art models with a 98% weighted F1-score while reducing misdiagnoses for every AD stage. The confusion matrix analysis shows that the model separates the AD progression stages clearly. These results validate the effectiveness of hybrid DL models with adaptive preprocessing for early and reliable Alzheimer's diagnosis, contributing to improved computer-aided diagnosis (CAD) systems in clinical practice.
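Reporting MCC alongside accuracy matters when classes are imbalanced, and a short example shows why. The sketch below uses the standard binary MCC formula; the study itself classifies several AD stages, so this two-class version is a simplification for illustration, and the counts are made up.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical imbalanced case: a model that labels every sample positive.
print(accuracy(tp=90, tn=0, fp=10, fn=0))  # high accuracy
print(mcc(tp=90, tn=0, fp=10, fn=0))       # no better than chance
```

A classifier that ignores the minority class can still score high accuracy, while its MCC stays at zero.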
Abstract: Satellite images are considered reliable data that preserve land cover information. In remote sensing, these images allow relevant analyses of changes in space over time through the use of computer tools. In this study, we applied the "discriminant" change detection algorithm and verified its effectiveness in multi-temporal studies. We also determined the change in forest dynamics in the Ikongo district of Madagascar between 2000 and 2015. We used Landsat TM satellite images for 2000, 2005 and 2010, and ETM+ images for 2015. The analyses show that 1.4% of natural forest disappeared between 2000 and 2005, that forest degradation was 1.8% between 2005 and 2010, and that about 0.5% of the natural forest conserved in 2010 disappeared between 2010 and 2015. Furthermore, we found that the discriminant algorithm is considerably efficient for monitoring the dynamics of forest cover change.
Funding: Key Program of Chinese Academy of Sciences (No. ZDRW-ZS-2016-6-3-4); Strategic Priority Research Program of Chinese Academy of Sciences (No. XDA20030302).
Abstract: Colombo port and Hambantota port in Sri Lanka play a key role in transiting and supporting the shipping trade of "the 21st-Century Maritime Silk Road". In recent years, Chinese enterprises have made huge investments in the infrastructure construction of both ports, and their construction progress and development trends have attracted the attention of Chinese investment enterprises and society. In this paper, multi-temporal high spatial resolution remote sensing images are used to monitor the infrastructure construction of Colombo port and Hambantota port from 2010 to 2017. According to the interpreted infrastructure information, construction of the international container terminal of Colombo and of Hambantota port has been completed. By the end of 2017, the international container terminal of Colombo had built 28.8 ha of container yards and 32.6 ha of roads. South of the terminal, 62.2 ha of reclamation area was built for the planned port city. In Hambantota port, 77 ha of container yards, 48 ha of roads and 2.9 ha of oil storage areas were constructed during this period. Meanwhile, the analysis of the potential storage capacity of the two ports shows that the throughput of Colombo port may increase by 3 million tons per year, while the throughput of Hambantota port will exceed its designed 2.5 million tons per year. These results provide a useful reference for Chinese investment enterprises and for research related to "the Belt and Road".
Funding: Supported by the National Council of Science and Technology of Mexico (CONACyT), which provided financial support through scholarships for postgraduate studies to J.L.G.S. (815176) and M.R.C. (507523).
Abstract: The use of unmanned aerial vehicles (UAV) for forest monitoring has grown significantly in recent years, providing information with high spatial resolution and temporal versatility. UAV with multispectral sensors allow the use of indexes such as the normalized difference vegetation index (NDVI), which indicates the vigor, physiological stress and photosynthetic activity of vegetation. This study aimed to analyze the spectral responses and variations of NDVI in tree crowns, as well as their correlation with climatic factors, over the course of one year. The study area encompassed a 1.6-ha site in Durango, Mexico, where Pinus cembroides, Pinus engelmannii, and Quercus grisea coexist. Multispectral images were acquired with UAV, and meteorological variables were obtained from the NASA/POWER database. An ANOVA explored possible differences in NDVI among the three species. Pearson correlation was used to identify the linear relationship between NDVI and meteorological variables. Significant differences in NDVI values were found at the genus level (Pinus and Quercus), possibly related to the physiological features of the species and their phenology. Quercus grisea had the lowest NDVI values throughout the year, which may be attributed to its sensitivity to relative humidity and temperature. Although UAV with a multispectral sensor allowed genera to be differentiated through NDVI monitoring, more complex forest analyses should integrate hyperspectral and LiDAR sensors and consider other vegetation indexes.
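The two computations named above, NDVI and Pearson correlation, are simple enough to sketch directly. The snippet below uses the standard definitions; it is not the study's processing chain, and the reflectance and humidity values are invented for illustration.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        raise ValueError("NIR + red must be non-zero")
    return (nir - red) / total


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical monthly crown reflectances (NIR, red) and relative humidity (%).
monthly = [(0.45, 0.10), (0.50, 0.08), (0.40, 0.12), (0.55, 0.07)]
humidity = [40.0, 55.0, 35.0, 60.0]
series = [ndvi(nir, red) for nir, red in monthly]
print(series)
print(pearson(series, humidity))
```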
Funding: Supported by the National Natural Science Foundation of China (41774006); the Comparative Study of Geo-environment and Geohazards in the Yangtze River Delta and the Red River Delta Project; and the Shanghai Science and Technology Development Foundation (20dz1201200).
Abstract: Constrained by a complex imaging mechanism and an unusual visual appearance, change detection with synthetic aperture radar (SAR) images has been a difficult research topic, especially in urban areas. Although existing studies have extended from bi-temporal data pairs to multi-temporal datasets to derive richer information, two problems remain in practical applications. First, change indicators constructed from incoherent features alone cannot characterize changed objects accurately. Second, the results of pixel-level methods are usually presented as a noisy binary map, making the spatial change unintuitive and the temporal change of a single pixel meaningless. In this study, we propose an unsupervised man-made object change detection framework using both coherent and incoherent features derived from multi-temporal SAR images. The coefficients of variation of time-series incoherent features and the man-made object index (MOI) defined with coherent features are first combined to identify the initial change pixels. Afterwards, an improved spatiotemporal clustering algorithm is developed based on density-based spatial clustering of applications with noise (DBSCAN) and dynamic time warping (DTW), which transforms the initial results into noiseless object-level patches and takes the cluster center as a representative of the man-made object to determine the change pattern of each patch. An experiment with a stack of 10 TerraSAR-X images in Stripmap mode demonstrated that this method is effective in urban scenes and is potentially applicable to wide-area change detection.
Funding: National Natural Science Foundation of China (No. 41830110); National Key Research Development Program of China (No. 2018YFC1503603); Key Laboratory of Land Satellite Remote Sensing Application, Ministry of Natural Resources of the People's Republic of China (No. KLSMNR-202106); Water Conservancy Science and Technology Project of Jiangsu Province, China (No. 2020061); Natural Science Foundation of Jiangsu Province, China (No. 20180779).
Abstract: As a widely consumed and influential natural plant beverage, tea is planted in subtropical and tropical areas all over the world. Because of (sub)tropical climate characteristics, the underlying surface of tea-growing areas is extremely complex, with a variety of vegetation types. In addition, tea plantations are scattered and fragmented in most of China. It is therefore difficult to obtain accurate tea information from coarse-resolution remote sensing data and existing feature extraction methods. This study proposes a boundary-enhanced, object-oriented random forest method based on high-resolution GF-2 and multi-temporal Sentinel-2 data. The method uses multispectral indexes, textures, vegetation indices, and the variation characteristics of time-series NDVI from the multi-temporal Sentinel-2 imagery to obtain abundant features related to the growth of tea plantations. To reduce feature redundancy and computation time, a feature elimination algorithm based on Mean Decrease Accuracy (MDA) was used to generate the optimal feature set. Considering the serious boundary inconsistency caused by complex and fragmented land cover types, the high-resolution GF-2 image was segmented with the MultiResolution Segmentation (MRS) algorithm to assist the segmentation of Sentinel-2, which helps delineate meaningful objects and makes the boundaries of tea plantations more reliable. Finally, the object-oriented random forest method was used to extract tea information based on the optimal feature combination in Jingmai Mountain, Yunnan Province. The resulting tea plantation map had high accuracy, with an overall accuracy of 95.38% and a kappa coefficient of 0.91. We conclude that the proposed method is effective for mapping tea plantations in highly heterogeneous mountainous areas and has potential for mapping tea plantations over large areas.
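Overall accuracy and the kappa coefficient are both derived from a confusion matrix, and the relation between them is easy to show. The sketch below uses the standard definitions with an invented two-class matrix (tea, non-tea); it does not reproduce the study's figures.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical matrix: rows are reference classes, columns are mapped classes.
cm = [[45, 5],
      [5, 45]]
print(overall_accuracy(cm), cohen_kappa(cm))
```

Kappa is lower than overall accuracy because it discounts the agreement a random labelling with the same class proportions would achieve.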
Abstract: AIM: To find an effective contrast enhancement method for retinal images that supports effective segmentation of retinal features. METHODS: A novel image preprocessing method based on neighbourhood-based improved contrast limited adaptive histogram equalization (NICLAHE) was proposed to improve retinal image contrast, aid the accurate identification of retinal disorders, and improve the visibility of fine retinal structures. Additionally, a minimal-order filter was applied to denoise the images without compromising important retinal structures. The NICLAHE algorithm was inspired by the classical CLAHE algorithm but enhances it by selecting the clip limits and tile sizes dynamically, relative to the pixel values in an image, instead of using fixed values. It was evaluated on the DRIVE and high-resolution fundus (HRF) datasets using conventional quality measures. RESULTS: The proposed preprocessing technique was applied to two retinal image databases, DRIVE and HRF, with four quality metrics: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), root mean square contrast (RMSC), and overall contrast. The technique outperformed traditional enhancement methods on both datasets. To assess the compatibility of the method with automated diagnosis, a deep learning framework named ResNet was applied to the segmentation of retinal blood vessels. Sensitivity, specificity, precision and accuracy were used to analyse the performance. NICLAHE-enhanced images outperformed the traditional techniques on both datasets, with improved accuracy. CONCLUSION: NICLAHE provides better results than traditional methods, with less error and improved contrast-related values. When the enhanced images are subsequently evaluated by sensitivity, specificity, precision, and accuracy, they yield better results on both datasets.
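Two of the quality metrics named above, RMSE and PSNR, are directly related, which the following sketch makes explicit. It uses the standard definitions on flat lists of pixel values and assumes 8-bit images (peak value 255); the abstract does not say which image pairs were compared, so the inputs here are placeholders.

```python
import math


def rmse(a, b):
    """Root mean square error between two equal-length pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    err = rmse(a, b)
    if err == 0:
        return math.inf   # identical images
    return 20.0 * math.log10(max_value / err)


reference = [10, 20, 30, 40]
processed = [12, 22, 28, 38]
print(rmse(reference, processed), psnr(reference, processed))
```

Because PSNR is a logarithmic function of RMSE, lower error and higher PSNR always move together for a fixed peak value.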
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 82272955 and 22203057) and the Natural Science Foundation of Fujian Province (Grant No. 2021J011361).
Abstract: The presence of a positive deep surgical margin in tongue squamous cell carcinoma (TSCC) significantly elevates the risk of local recurrence. Therefore, a prompt and precise intraoperative assessment of margin status is imperative to ensure thorough tumor resection. In this study, we integrate Raman imaging technology with an artificial intelligence (AI) generative model, proposing an innovative approach for intraoperative margin status diagnosis. This method utilizes Raman imaging to swiftly and non-invasively capture tissue Raman images, which are then transformed into hematoxylin-eosin (H&E)-stained histopathological images using an AI generative model for histopathological diagnosis. The generated H&E-stained images clearly illustrate the tissue's pathological conditions. Independently reviewed by three pathologists, the overall diagnostic accuracy for distinguishing between tumor tissue and normal muscle tissue reaches 86.7%. Notably, it outperforms current clinical practices, especially in TSCC with positive lymph node metastasis or moderately differentiated grades. This advancement highlights the potential of AI-enhanced Raman imaging to significantly improve intraoperative assessments and surgical margin evaluations, promising a versatile diagnostic tool beyond TSCC.
Abstract: The application of deep learning for target detection in aerial images captured by Unmanned Aerial Vehicles (UAV) has emerged as a prominent research focus. Because of the considerable distance between UAVs and the photographed objects, coupled with complex shooting environments, existing models often struggle to achieve accurate real-time target detection. In this paper, the You Only Look Once v8 (YOLOv8) model is modified in four respects (the detection head, the up-sampling module, the feature extraction module, and the parameter optimization of positive sample screening), and the resulting YOLO-S3DT model is proposed to improve the detection of small targets in aerial images. Experimental results show that all detection indexes of the proposed model improve significantly without increasing the number of model parameters and with limited growth in computation. Moreover, the model performs best among the detection models compared, demonstrating its advancement within this category of tasks.
Funding: Supported by the Henan Province Key Research and Development Project (231111211300); the Central Government of Henan Province Guides Local Science and Technology Development Funds (Z20231811005); the Henan Province Key Research and Development Project (231111110100); the Henan Provincial Outstanding Foreign Scientist Studio (GZS2024006); and the Henan Provincial Joint Fund for Scientific and Technological Research and Development Plan (Application and Overcoming Technical Barriers) (242103810028).
Abstract: The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image, respectively. Because this extraction may miss some information, a compensation encoder is proposed to supplement it. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Abstract: Breast cancer is one of the major causes of death in women. Early diagnosis is important for screening and for controlling the mortality rate, so a computer-aided diagnosis system for early-stage breast cancer is highly desirable. Ultrasound is an important examination technique for breast cancer diagnosis owing to its low cost. Recently, many learning-based techniques have been introduced to classify breast cancer using the breast ultrasound imaging (BUSI) dataset; however, manual handling is difficult and time-consuming. The authors propose an EfficientNet-integrated ResNet deep network and an XAI-based framework for accurately classifying breast cancer (malignant and benign). In the initial step, data augmentation is performed to increase the number of training samples. For this purpose, three pixel-flip mathematical equations are introduced: horizontal, vertical, and 90°. Next, two pretrained deep learning models are modified by skipping some layers and then fine-tuned. Both fine-tuned models are trained using deep transfer learning, and features are extracted from the deeper layer. Explainable artificial intelligence is used to analyse the performance of the trained models. After that, a new feature selection technique based on the cuckoo search algorithm, called cuckoo search controlled standard error mean, is proposed. This technique selects the best features, which are fused using a new parallel zero-padding maximum correlated coefficient features approach. Finally, the selection algorithm is applied again to the fused feature vector, which is classified using machine learning algorithms. The proposed framework is evaluated on the publicly available BUSI dataset and obtains 98.4% and 98% accuracy in two different experiments. A comparison with recent techniques shows improved accuracy. In addition, the proposed framework executed in less time than the original deep learning models.
Funding: Supported by the National Natural Science Foundation of China (No. 82371933); the National Natural Science Foundation of Shandong Province of China (No. ZR2021MH120); the Taishan Scholars Project (No. tsqn202211378); and the Shandong Provincial Natural Science Foundation for Excellent Young Scholars (No. ZR2024YQ075).
Abstract: Objective: Early prediction of response before neoadjuvant chemotherapy (NAC) is crucial for personalized treatment planning in patients with locally advanced breast cancer. We aim to develop a multi-task model using multi-scale whole slide image (WSI) features to predict the response to breast cancer NAC more finely. Methods: This work collected 1,670 whole slide images for the training and validation sets, internal testing sets, external testing sets, and prospective testing sets of a weakly-supervised deep learning-based multi-task model (DLMM) for predicting treatment response and pCR to NAC. Our approach models pairwise feature interactions across scales by concatenation-based fusion of single-scale feature representations, and controls the expressiveness of each representation via a gating-based attention mechanism. Results: In the retrospective analysis, DLMM exhibited excellent performance in predicting treatment response, with areas under the receiver operating characteristic curve (AUCs) of 0.869 [95% confidence interval (95% CI): 0.806–0.933] in the internal testing set and 0.841 (95% CI: 0.814–0.867) in the external testing sets. For the pCR prediction task, DLMM reached AUCs of 0.865 (95% CI: 0.763–0.964) in the internal testing set and 0.821 (95% CI: 0.763–0.878) in the pooled external testing set. In the prospective testing study, DLMM also demonstrated favorable predictive performance, with AUCs of 0.829 (95% CI: 0.754–0.903) and 0.821 (95% CI: 0.692–0.949) for treatment response and pCR prediction, respectively. DLMM significantly outperformed the baseline models in all testing sets (P < 0.05). Heatmaps were employed to interpret the decision-making basis of the model. Furthermore, biological basis exploration found that high DLMM scores were associated with immune-related pathways and cells in the microenvironment. Conclusions: The DLMM represents a valuable tool that aids clinicians in selecting personalized treatment strategies for breast cancer patients.
Abstract: Kinship verification is a key biometric recognition task that determines biological relationships based on physical features. Traditional methods predominantly use facial recognition, leveraging established techniques and extensive datasets. However, recent research has highlighted ear recognition as a promising alternative that is more robust to variations in facial expression, aging, and occlusion. Despite this potential, a significant challenge in ear-based kinship verification is the lack of the large-scale datasets needed to train deep learning models effectively. To address this challenge, we introduce the EarKinshipVN dataset, a novel and extensive collection of ear images designed specifically for kinship verification. The dataset consists of 4,876 high-resolution color images from 157 multiracial families across different regions, forming 73,220 kinship pairs. As a diverse, large-scale dataset, EarKinshipVN advances kinship verification research using ear features. Furthermore, we propose the Mixer Attention Inception (MAI) model, an improved architecture that enhances feature extraction and classification accuracy. The MAI model fuses Inceptionv4 and MLP Mixer and integrates four attention mechanisms to enhance spatial and channel-wise feature representation. Experimental results demonstrate that MAI significantly outperforms traditional backbone architectures. It achieves an accuracy of 98.71%, surpassing Vision Transformer models while using up to 95% fewer parameters. These findings suggest that ear-based kinship verification, combined with an optimized deep learning model and a comprehensive dataset, holds significant promise for biometric applications.
Abstract: In this paper, the spatio-temporal variation and propagation direction of coal fire in the Jharia Coalfield (JCF), India, during 2006–2015 were studied through satellite-based night-time land surface temperature (LST) imaging. The LST was retrieved from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) night-time thermal-infrared data by a robust split-window algorithm based on scene-specific regression coefficients, band-specific hybrid emissivity, and night-time atmospheric transmittance. The LST-profile-based coal fire detection algorithm was formulated through statistical analysis of the LST values along multiple transects across diverse coal fire locations in the JCF, in order to compute date-specific threshold temperatures for separating thermally anomalous pixels from background pixels. This algorithm efficiently separates surface fire, subsurface fire, and thermally anomalous transitional pixels. During the observation period, the coal fire area increased significantly as new coal fires broke out at many places owing to extensive opencast mining operations. Fire propagation occurred primarily along the dip direction of the coal seams. In places, lateral propagation of limited spatial extent was also observed along the strike direction, possibly due to the spatial continuity of the coal seams along strike. Moreover, the opencast mining activities carried out during 2009–2015 and the structurally weak planes facilitated the fire propagation.
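The idea of a date-specific threshold can be sketched as follows. The abstract does not state the statistic used, so this example assumes a common convention (background mean plus `k` standard deviations, with `k` chosen by the analyst) purely for illustration; the function names and the sample temperatures are hypothetical and do not come from the paper. It performs only a binary split and does not reproduce the paper's separation into surface, subsurface, and transitional classes.

```python
from statistics import mean, pstdev


def date_threshold(background_lst, k=2.0):
    """Assumed rule: threshold = background mean + k * population std deviation.

    background_lst: LST samples (e.g. in degrees Celsius) taken along a
    transect over ground believed to be unaffected by fire on one date.
    """
    return mean(background_lst) + k * pstdev(background_lst)


def flag_anomalous(lst_values, threshold):
    """Mark each pixel as thermally anomalous (True) or background (False)."""
    return [t > threshold for t in lst_values]


# Hypothetical night-time LST samples for a single acquisition date.
background = [20.0, 20.0, 24.0, 24.0]
threshold = date_threshold(background, k=2.0)
flags = flag_anomalous([21.0, 25.5, 31.0], threshold)
```

Because the threshold is recomputed from each scene's own background statistics, seasonal differences in ambient temperature between acquisition dates do not shift the classification.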