Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of the patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across various modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. Initially, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation have been applied to reduce the parameter count and enhance the inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, researchers have implemented hardware-aware acceleration strategies, including the use of TensorRT and ONNX-based model compression, to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
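The cross-attention step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy token vectors, and the choice of MRI tokens as queries and PET tokens as keys and values are all assumptions made for illustration. It shows plain scaled dot-product attention, without the learned projections, multiple heads, or linear/sparse variants the abstract mentions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query row (one modality) attends over all key rows (another
    modality) and returns a weighted average of the value rows.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# Hypothetical toy features: 2 MRI tokens attend over 3 PET tokens.
mri_tokens = [[1.0, 0.0], [0.0, 1.0]]
pet_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(mri_tokens, pet_tokens, pet_tokens)
```

Each fused row has the dimensionality of the value vectors and is a convex combination of them, which is why attention can align features across modalities that have different numbers of tokens.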
BACKGROUND Congestive hepatopathy, also known as nutmeg liver, is liver damage secondary to chronic heart failure (HF). Its morphological characteristics on medical imaging have not been defined and remain unclear. AIM To leverage machine learning to capture imaging features of congestive hepatopathy using incidentally acquired computed tomography (CT) scans. METHODS We retrospectively analyzed 179 chronic HF patients who underwent echocardiography and CT within one year. Right HF severity was classified into three grades. Liver CT images at the paraumbilical vein level were used to develop a ResNet-based machine learning model to predict tricuspid regurgitation (TR) severity. Model accuracy was compared with that of six gastroenterology and four radiology experts. RESULTS Of the included patients, 120 were male (mean age: 73.1 ± 14.4 years). The machine learning model's accuracy in predicting TR severity from a single CT image was significantly higher than the average accuracy of the experts. The model was found to be exceptionally reliable for predicting severe TR. CONCLUSION Deep learning models, particularly those using ResNet architectures, can help identify morphological changes associated with TR severity, aiding in early liver dysfunction detection in patients with HF, thereby improving outcomes.
Corporate image is the external manifestation of a company’s cultural and spiritual essence, as well as the overall impression formed through its interactions with the public. Huawei, as a successful multinational enterprise, has established a robust corporate image in the international market through technological innovation and brand building. Moreover, Huawei’s development is closely aligned with national policies and strategies, making it a representative enterprise for showcasing China’s technological independence and national image. This study examines Huawei’s English press releases on product launches published between 2022 and 2024 and conducts a comparative analysis with similar materials from Apple’s official website. Based on Fairclough’s three-dimensional discourse analysis model, this research explores the linguistic features of Huawei’s corporate image construction from the perspectives of text, discourse practice, and social practice. The findings reveal that Huawei has successfully constructed a corporate image that emphasizes technological innovation, prioritizes user needs, and underscores its identity as a national enterprise. This study not only sheds light on Huawei’s strategies for image construction in international competition but also provides a valuable reference for Chinese enterprises in their cultural communication and brand building during the globalization process.
Applying visual grammar theory, this study examines representational, interactive, and compositional meanings of the giant panda in Western media cartoons related to China from 1999 to the present. Distinct phases in the panda’s representation were identified and illustrated with cartoons from major Western media. These phases trace the shift of the panda’s cartoon image from a symbol of peace and friendliness to a politicized emblem of China’s international stance. Key visual trends, such as transitivity, color symbolism, scale enlargement, and increasing compositional complexity, embody the panda’s role in shaping China’s global image and its function in international discourse. These trends reflect the panda’s transformation into a contested symbol, which mediates between China’s self-representation and Western perceptions of its geopolitical rise. By situating the analysis within the context of China’s growing global influence, this study contributes to visual and media studies, demonstrating how cultural symbols are recontextualized to reflect and shape geopolitical narratives.
Early correction of childhood malocclusion is the timely management of morphological, structural, and functional abnormalities at different dentomaxillofacial developmental stages. The selection of appropriate imaging examinations, together with comprehensive radiological diagnosis and analysis, plays an important role in early correction of childhood malocclusion. This expert consensus is a collaborative effort by multidisciplinary experts in dentistry across the nation based on the current clinical evidence, aiming to provide general guidance on the selection of appropriate imaging examinations and on comprehensive and accurate imaging assessment for patients receiving early orthodontic treatment.
BACKGROUND Pancreatic cancer remains one of the most lethal malignancies worldwide, with a poor prognosis often attributed to late diagnosis. Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning. AIM To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features. METHODS We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution. Pathological types were determined by histopathological examination of the surgical specimens or biopsy samples. The imaging features were assessed using computed tomography, magnetic resonance imaging, and endoscopic ultrasound. Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics. RESULTS There were 320 (64%) cases of pancreatic ductal adenocarcinoma, 75 (15%) of intraductal papillary mucinous neoplasms, 50 (10%) of neuroendocrine tumors, and 55 (11%) of other rare types. Distinct imaging features were identified in each pathological type. Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography, whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules. Neuroendocrine tumors often appear as hypervascular lesions in contrast-enhanced imaging. Statistical analysis revealed significant correlations between specific imaging features and pathological types (P < 0.001). CONCLUSION This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features. These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
Confocal laser endomicroscopy (CLE) has become an indispensable tool in the diagnosis and detection of gastrointestinal (GI) diseases due to its high-resolution and high-contrast imaging capabilities. However, the early-stage imaging changes of gastrointestinal disorders are often subtle, and traditional medical image analysis methods rely heavily on manual interpretation, which is time-consuming, subject to observer variability, and inefficient for accurate lesion identification across large-scale image datasets. With the introduction of artificial intelligence (AI) technologies, AI-driven CLE image analysis systems can automatically extract pathological features and have demonstrated significant clinical value in lesion recognition, classification diagnosis, and malignancy prediction of GI diseases. These systems greatly enhance diagnostic efficiency and early detection capabilities. This review summarizes the applications of AI-assisted CLE in GI diseases, analyzes the limitations of current technologies, and explores future research directions. It is expected that the deep integration of AI and confocal imaging technologies will provide strong support for precision diagnosis and personalized treatment in the field of gastrointestinal disorders.
Algal blooms, the spread of algae on the surface of water bodies, have adverse effects not only on aquatic ecosystems but also on human life. The adverse effects of harmful algal blooms (HABs) necessitate a convenient solution for detection and monitoring. Unmanned aerial vehicles (UAVs) have recently emerged as a tool for algal bloom detection, efficiently providing on-demand images at high spatiotemporal resolutions. This study developed an image processing method for algal bloom area estimation from aerial images (obtained from the internet) captured using UAVs. As a remote sensing method of HAB detection, analysis, and monitoring, a combination of histogram and texture analyses was used to efficiently estimate the area of HABs. Statistical features like entropy (using the Kullback-Leibler method) were emphasized with the aid of a gray-level co-occurrence matrix. The results showed that the orthogonal images demonstrated fewer errors, and the morphological filter best detected algal blooms in real time, with a precision of 80%. This study provided efficient image processing approaches using on-board UAVs for HAB monitoring.
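The texture analysis described here can be sketched as follows. This is an illustrative assumption rather than the study's pipeline: it builds a gray-level co-occurrence matrix (GLCM) for a single pixel offset and computes plain Shannon entropy from it, whereas the abstract refers to a Kullback-Leibler based entropy whose exact form is not given. The function names and the tiny test images are hypothetical.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one offset.

    image is a list of rows of integer gray levels. The result maps
    (level_a, level_b) pairs to their relative frequency.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy(probabilities):
    """Shannon entropy (in bits) of the co-occurrence distribution."""
    return -sum(p * math.log2(p) for p in probabilities.values() if p > 0)


flat_patch = [[0, 0], [0, 0]]      # uniform texture, e.g. open water
mixed_patch = [[0, 1], [1, 0]]     # alternating texture
flat_entropy = glcm_entropy(glcm(flat_patch))
mixed_entropy = glcm_entropy(glcm(mixed_patch))
```

A uniform patch yields zero entropy, while a patch with varied neighboring gray levels yields a higher value, which is the property that makes GLCM entropy usable as a texture cue.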
BACKGROUND Subchorionic hematoma (SCH) is a common complication in early pregnancy characterized by the accumulation of blood between the uterine wall and the chorionic membrane. SCH can lead to adverse pregnancy outcomes such as miscarriage, preterm birth, and other complications. Early detection and accurate assessment of SCH are crucial for appropriate management and improved pregnancy outcomes. AIM To evaluate the diagnostic efficacy of virtual organ computer-assisted analysis (VOCAL) in measuring the volume ratio of SCH to gestational sac (GS), combined with serum progesterone, for early pregnancy outcomes in patients with SCH. METHODS A total of 153 patients with SCH in their first-trimester pregnancies between 6 and 11 wk were enrolled. All patients were followed up until a gestational age of 20 wk. The parameters of transvaginal two-dimensional ultrasound, including the circumference of SCH (Cs), surface area of SCH (Ss), circumference of GS (Cg), and surface area of GS (Sg), and the parameters of VOCAL with transvaginal three-dimensional ultrasound, including the three-dimensional volume of SCH (3DVs) and GS (3DVg), were recorded. The size of the SCH and its ratio to the GS size (Cs/Cg, Ss/Sg, 3DVs/3DVg) were recorded and compared. RESULTS Compared with the normal pregnancy group, the adverse pregnancy group had higher Cs/Cg, Ss/Sg, and 3DVs/3DVg ratios (P < 0.05). A 3DVs/3DVg cutoff of 0.220 gave the highest predictive performance for adverse pregnancy outcomes, with an AUC of 0.767, a sensitivity of 70.2%, and a specificity of 75%. VOCAL-measured 3DVs/3DVg combined with serum progesterone gave a diagnostic AUC of 0.824 for early pregnancy outcome in SCH patients, with a high sensitivity of 82.1% and a specificity of 72.1%, and the difference between the AUCs was significant. CONCLUSION VOCAL-measured 3DVs/3DVg effectively quantifies the severity of SCH, while combined serum progesterone better predicts adverse pregnancy outcomes.
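How a ratio cutoff such as 0.220 translates into sensitivity, specificity, and AUC can be sketched as follows. The data are invented toy values, not the study's measurements, and the function names are hypothetical. The AUC is computed with the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it and may differ from the software the authors used.

```python
def sens_spec(ratios, adverse, cutoff):
    """Sensitivity and specificity when ratio >= cutoff predicts an adverse outcome."""
    pairs = list(zip(ratios, adverse))
    tp = sum(1 for r, a in pairs if a and r >= cutoff)
    fn = sum(1 for r, a in pairs if a and r < cutoff)
    tn = sum(1 for r, a in pairs if not a and r < cutoff)
    fp = sum(1 for r, a in pairs if not a and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(ratios, adverse):
    """Probability that a random adverse case has a higher ratio than a
    random normal case, counting ties as one half."""
    pos = [r for r, a in zip(ratios, adverse) if a]
    neg = [r for r, a in zip(ratios, adverse) if not a]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 3DVs/3DVg ratios and outcomes (True = adverse pregnancy outcome).
toy_ratios = [0.10, 0.15, 0.25, 0.30, 0.18, 0.40]
toy_adverse = [False, False, False, True, True, True]
sensitivity, specificity = sens_spec(toy_ratios, toy_adverse, cutoff=0.22)
toy_auc = auc(toy_ratios, toy_adverse)
```

Sweeping the cutoff over the observed ratios and keeping the one with the best trade-off between the two rates is a common way such a threshold is chosen; the abstract does not state which criterion was used.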
In the context of the accelerated pace of daily life and the development of e-commerce, online shopping is a mainstream way for consumers to access products and services. To understand their emotional expressions in different shopping experience scenarios, this paper presents a sentiment analysis method that combines e-commerce review keyword-generated images with a hybrid machine learning-based model, in which Word2Vec-TextRank is used to extract keywords that act as the inputs for generating the related images by generative Artificial Intelligence (AI). Subsequently, a hybrid Convolutional Neural Network and Support Vector Machine (CNNSVM) model is applied for sentiment classification of those keyword-generated images. For method validation, a random sample of 5000 reviews from Amazon was analyzed. With superior keyword extraction capability, the proposed method achieves an accuracy of up to 97.13% on sentiment classification. Such performance demonstrates the advantages of the text-to-image approach, providing a unique perspective for sentiment analysis of e-commerce review data compared to existing works. Thus, the proposed method enhances the reliability and insights of customer feedback surveys, and could also establish a novel direction in similar cases, such as social media monitoring and market trend research.
The Ki67 index (KI) is a standard clinical marker for tumor proliferation; however, its application is hindered by intratumoral heterogeneity. In this study, we used digital image analysis to comprehensively analyze Ki67 heterogeneity and distribution patterns in breast carcinoma. Using Smart Pathology software, we digitized and analyzed 42 excised breast carcinoma Ki67 slides. Boxplots, histograms, and heat maps were generated to illustrate the KI distribution. We found that 30% of cases (13/42) exhibited discrepancies between global and hotspot KI when using a 14% KI threshold for classification. Patients with higher global or hotspot KI values displayed greater heterogeneity. Ki67 distribution patterns were categorized as randomly distributed (52%, 22/42), peripheral (43%, 18/42), and centered (5%, 2/42). Our sampling simulator indicated that analyzing more than 10 high-power fields was typically required to accurately estimate global KI, with sampling size being correlated with heterogeneity. In conclusion, using digital image analysis in whole-slide images allows for comprehensive Ki67 profile assessment, shedding light on heterogeneity and distribution patterns. This spatial information can facilitate KI surveys of breast cancer and other malignancies.
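The idea behind a field-sampling simulator and the global versus hotspot comparison can be sketched as below. This is a guess at the general approach, not the authors' simulator: the per-field cell counts are invented, the function names are hypothetical, and taking the hotspot as the single field with the highest index is an assumption.

```python
import random
import statistics


def global_ki(fields):
    """Ki67 index over a set of fields; each field is (positive_cells, total_cells)."""
    return sum(p for p, _ in fields) / sum(t for _, t in fields)


def hotspot_ki(fields):
    """Ki67 index of the single most proliferative field (assumed hotspot definition)."""
    return max(p / t for p, t in fields)


def discordant(fields, threshold=0.14):
    """True when global and hotspot KI fall on different sides of the threshold."""
    return (global_ki(fields) >= threshold) != (hotspot_ki(fields) >= threshold)


def mean_abs_error(fields, n_fields, trials=2000, seed=0):
    """Average error of the global KI estimate when only n_fields are sampled."""
    rng = random.Random(seed)
    truth = global_ki(fields)
    errors = [abs(global_ki(rng.sample(fields, n_fields)) - truth) for _ in range(trials)]
    return statistics.mean(errors)


# Hypothetical heterogeneous slide with six high-power fields.
slide = [(5, 100), (40, 100), (10, 100), (60, 100), (8, 100), (30, 100)]
error_one_field = mean_abs_error(slide, 1)
error_five_fields = mean_abs_error(slide, 5)
```

On a heterogeneous slide the estimate from a single field is far from the global value, and the error shrinks as more fields are sampled, which is consistent with the abstract's observation that the required sampling size depends on heterogeneity.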
Objective To analyze the differences in the correlation of tongue image indicators among patients with benign lung nodules and lung cancer. Methods From July 1, 2020 to March 31, 2022, clinical information of lung cancer patients and benign lung nodule patients was collected at the Oncology Department of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine and the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, respectively. We obtained tongue images from patients with benign lung nodules and lung cancer using the TFDA-1 digital tongue diagnosis instrument, and analyzed these images with the TDAS V2.0 software. The extracted indicators included color space parameters in the Lab system for both the tongue body (TB) and tongue coating (TC) (TB/TC-L, TB/TC-a, and TB/TC-b), textural parameters [TB/TC-contrast (CON), TB/TC-angular second moment (ASM), TB/TC-entropy (ENT), and TB/TC-MEAN], as well as TC parameters (perAll and perPart). The bivariate correlation of TB and TC features was analyzed using Pearson’s or Spearman’s correlation analysis, and the overall correlation was analyzed using canonical correlation analysis (CCA). Results Samples from 307 patients with benign lung nodules and 276 lung cancer patients were included after excluding outliers and extreme values. Simple correlation analysis indicated that the correlation of TB-L with TC-L, TB-b with TC-b, and TB-b with perAll in the lung cancer group was higher than that in the benign nodules group. Moreover, the correlation of TB-a with TC-a, TB-a with perAll, and the texture parameters of the TB (TB-CON, TB-ASM, TB-ENT, and TB-MEAN) with the texture parameters of the TC (TC-CON, TC-ASM, TC-ENT, and TC-MEAN) in the benign nodules group was higher than that in the lung cancer group. CCA further demonstrated a strong correlation between the TB and TC parameters in the lung cancer group, with the first and second pairs of typical variables in the benign nodules and lung cancer groups indicating correlation coefficients of 0.918 and 0.817 (P < 0.05), and 0.940 and 0.822 (P < 0.05), respectively. Conclusion Benign lung nodule and lung cancer patients exhibited differences in correlation in the L, a, and b values of the TB and TC, as well as the perAll value of the TC, and the texture parameters (TB/TC-CON, TB/TC-ASM, TB/TC-ENT, and TB/TC-MEAN) between the TB and TC. Additionally, there were differences in the overall correlation of the TB and TC between the two groups. Objective tongue diagnosis indicators can effectively assist in the diagnosis of benign lung nodules and lung cancer, thereby providing a scientific basis for the early detection, diagnosis, and treatment of lung cancer.
Recognizing the variation of genetic resources is the first step in selection. One of the most important variations in grain crops is the uniformity of seed grain weight, which can be converted into seed size. However, measuring it on a large scale has been challenging because of the high labor cost and time required. The current study used an image analysis technique to measure the grain seed area of about 100 seeds per accession in 64 germplasm accessions of Tartary buckwheat (Fagopyrum tataricum) to study variation among and within them. To understand the nature of the variation, skewness and kurtosis analysis of the probability density function curve for seed area was used. As a result, a large variation among and within accessions was found. This means that the seed sizes within an accession are not uniform in this cleistogamous species due to its non-uniform flowering time. This implies that seed size should be considered an important factor for the germplasm enhancement program.
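The skewness and kurtosis analysis mentioned above can be sketched with moment-based formulas. The seed-area values are invented for illustration, the function name is hypothetical, and the abstract does not say which estimator was used; this version uses population (biased) moments and reports excess kurtosis, so a normal distribution would score 0 on both measures.

```python
def skewness_and_kurtosis(values):
    """Population skewness and excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical seed areas (mm^2) for two accessions.
uniform_accession = [1.0, 2.0, 3.0, 4.0, 5.0]
skewed_accession = [1.0, 1.0, 1.0, 1.0, 10.0]
uniform_skew, uniform_kurt = skewness_and_kurtosis(uniform_accession)
skewed_skew, skewed_kurt = skewness_and_kurtosis(skewed_accession)
```

A positive skew indicates a tail of unusually large seeds within an accession, and a strongly negative excess kurtosis indicates a flat, spread-out distribution, both of which signal non-uniform seed size.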
This work focuses on methods and procedures for three-dimensional (3D) characterization of the pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Based on in-house 3D image analysis programs developed in Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean diameter of pores, and the effective pore size (d50) increase with increasing particle size. A lognormal or Gaussian distribution is the most suitable model for the pore size distribution. The degree of connectivity, investigated on the basis of a cluster-labeling algorithm, also increases approximately with increasing particle size.
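The porosity and connectivity calculations can be sketched on a tiny binary volume. The study used in-house Matlab programs that are not shown in the abstract; this Python sketch is only an illustration. The function names are hypothetical, 6-connectivity is an assumed neighborhood, and defining the degree of connectivity as the share of pore voxels in the largest cluster is an assumption.

```python
from collections import deque


def porosity(volume):
    """Fraction of voxels that are pore (1) in a binary 3D volume."""
    voxels = [v for plane in volume for row in plane for v in row]
    return sum(voxels) / len(voxels)


def label_clusters(volume):
    """Sizes of 6-connected pore clusters, found by breadth-first search."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    sizes = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if volume[z][y][x] != 1 or (z, y, x) in seen:
                    continue
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                size = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and volume[n[0]][n[1]][n[2]] == 1:
                            seen.add(n)
                            queue.append(n)
                sizes.append(size)
    return sizes


def connectivity(volume):
    """Share of pore voxels that belong to the largest cluster."""
    sizes = label_clusters(volume)
    return max(sizes) / sum(sizes) if sizes else 0.0


# Hypothetical 2x2x2 volume: one two-voxel pore cluster and one isolated pore voxel.
toy_volume = [[[1, 1], [0, 0]], [[0, 0], [0, 1]]]
toy_porosity = porosity(toy_volume)
toy_connectivity = connectivity(toy_volume)
```

On real tomography data the binary volume would come from thresholding the reconstructed slices, a step this sketch leaves out.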
To develop a quick, accurate, and noise-resistant automated image registration technique for infrared images, the wavelet analysis technique was used to extract the feature points in two images, followed by compensation of the input image for the angle difference between them. A hierarchical feature matching algorithm was adopted to get the final transform parameters between the two images. The simulation results for two infrared images show that the method can register images effectively, quickly, and accurately, and is resistant to noise to some extent.
The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, representing improvements compared to the baseline original results. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance compared to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
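The per-class and weighted-average F1-Scores reported above follow from standard definitions, sketched here on invented labels. The toy predictions and function names are hypothetical and do not reproduce the study's numbers. Weighting each class by its share of the true labels is one common convention and is assumed here.

```python
def f1_per_class(y_true, y_pred, labels):
    """F1 = 2*TP / (2*TP + FP + FN) for each label."""
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


def weighted_f1(y_true, y_pred, labels):
    """Average of per-class F1 weighted by each class's share of true labels."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores[label] * y_true.count(label) / len(y_true) for label in labels)


rock_types = ["igneous", "metamorphic", "sedimentary"]
toy_true = ["igneous", "igneous", "metamorphic", "sedimentary", "sedimentary", "sedimentary"]
toy_pred = ["igneous", "metamorphic", "metamorphic", "sedimentary", "sedimentary", "igneous"]
toy_scores = f1_per_class(toy_true, toy_pred, rock_types)
toy_weighted = weighted_f1(toy_true, toy_pred, rock_types)
```

Comparing these scores between a baseline run and a run trained with one augmentation applied is how the effect of a technique such as Equalize or Distort would be quantified.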
Machine learning (ML) is increasingly applied for medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, eye, etc., to identify specific flaws/diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method to select the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. Therefore, the more congruent pixels are sorted in the descending order of the selection, which identifies better regions than the distribution. Now, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. Therefore, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Image segmentation is attracting increasing attention in the field of medical image analysis. Given its widespread use across various medical applications, ensuring and improving segmentation accuracy has become a crucial topic of research. With advances in deep learning, researchers have developed numerous methods that combine Transformers and convolutional neural networks (CNNs) to create highly accurate models for medical image segmentation. However, efforts to further enhance accuracy by developing larger and more complex models or training with more extensive datasets significantly increase computational resource consumption. To address this problem, we propose BiCLIP-nnFormer (the prefix "Bi" refers to the use of two distinct CLIP models), a virtual multimodal instrument that leverages CLIP models to enhance the segmentation performance of a medical segmentation model, nnFormer. Since the two CLIP models (PMC-CLIP and CoCa-CLIP) are pre-trained on large datasets, they do not require additional training, thus conserving computation resources. These models are used offline to extract image and text embeddings from medical images. These embeddings are then processed by the proposed 3D CLIP adapter, which adapts the CLIP knowledge for segmentation tasks by fine-tuning. Finally, the adapted embeddings are fused with feature maps extracted from the nnFormer encoder for generating predicted masks. This process enriches the representation capabilities of the feature maps by integrating global multimodal information, leading to more precise segmentation predictions. We demonstrate the superiority of BiCLIP-nnFormer and the effectiveness of using CLIP models to enhance nnFormer through experiments on two public datasets, namely the Synapse multi-organ segmentation dataset (Synapse) and the Automatic Cardiac Diagnosis Challenge dataset (ACDC), as well as a self-annotated lung multi-category segmentation dataset (LMCS).
The growing spectrum of Generative Adversarial Network (GAN) applications in medical imaging, cyber security, data augmentation, and remote sensing makes a critical review of GANs increasingly necessary. Earlier reviews that targeted a particular GAN architecture or a specific application area were narrow in scope and lacked a systematic comparative analysis of the models’ performance metrics. Numerous reviews do not apply standardized frameworks, leaving gaps in the evaluation of GAN efficiency, training stability, and suitability for specific tasks. In this work, a systematic review of GAN models is developed in detail using the PRISMA framework to fill this gap by structurally evaluating GAN architectures. A wide variety of GAN models are discussed in this review, from the basic Conditional GAN, Wasserstein GAN, and Deep Convolutional GAN to many specialized models, such as EVAGAN, FCGAN, and SIF-GAN, for applications across domains like fault diagnosis, network security, medical imaging, and image segmentation. The PRISMA methodology systematically filters relevant studies by inclusion and exclusion criteria to ensure transparency and replicability in the review process. Hence, all models are assessed relative to specific performance metrics such as accuracy, stability, and computational efficiency. There are multiple benefits to using the PRISMA approach in this setup. Not only does this help in finding optimal models suitable for various applications, but it also provides an explicit framework for comparing GAN performance. In addition, diverse types of GAN are included to ensure a comprehensive view of the state-of-the-art techniques. This work is essential not only in terms of its results but also because it guides the direction of future research by pinpointing which types of applications require which GAN architectures, works to improve model selection for specific tasks, and points out areas for further research on the development and application of GANs.
Funding: Supported by the Deanship of Research and Graduate Studies at King Khalid University under Small Research Project grant number RGP1/139/45.
Abstract: Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography, Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, combining data from multiple modalities is difficult because of disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities because they lack the capacity to model global relationships. This research presents a novel approach for examining multi-modal medical imagery using a transformer-based system. The framework employs self-attention and cross-attention mechanisms to synchronize and integrate features across modalities. Additionally, it shows resilience to variations in noise and image quality, making it adaptable for real-time clinical use. To address the computational hurdles linked to transformer models, particularly in real-time clinical applications in resource-constrained environments, several optimization techniques have been integrated to boost scalability and efficiency. First, a streamlined transformer architecture was adopted to minimize the computational load while maintaining model effectiveness. Methods such as model pruning, quantization, and knowledge distillation were applied to reduce the parameter count and increase inference speed. Furthermore, efficient attention mechanisms such as linear or sparse attention were employed to alleviate the substantial memory and processing requirements of traditional self-attention operations. For further deployment optimization, hardware-aware acceleration strategies, including TensorRT and ONNX-based model compression, were implemented to ensure efficient execution on edge devices. These optimizations allow the approach to function effectively in real-time clinical settings, ensuring viability even in environments with limited resources. Future research directions include integrating non-imaging data to facilitate personalized treatment and enhancing computational efficiency for implementation in resource-limited environments. This study highlights the transformative potential of transformer models in multi-modal medical imaging, offering improvements in diagnostic accuracy and patient care outcomes.
Funding: Supported by Grant-in-Aid for Research on Hepatitis from the Japan Agency for Medical Research and Development, No. 24fk0210128h0002; Grant-in-Aid for Scientific Research, No. KAKENHI-23K07372.
Abstract: BACKGROUND: Congestive hepatopathy, also known as nutmeg liver, is liver damage secondary to chronic heart failure (HF). Its morphological characteristics on medical imaging have not been defined and remain unclear. AIM: To leverage machine learning to capture imaging features of congestive hepatopathy using incidentally acquired computed tomography (CT) scans. METHODS: We retrospectively analyzed 179 chronic HF patients who underwent echocardiography and CT within one year. Right HF severity was classified into three grades. Liver CT images at the paraumbilical vein level were used to develop a ResNet-based machine learning model to predict tricuspid regurgitation (TR) severity. Model accuracy was compared with that of six gastroenterology and four radiology experts. RESULTS: Of the included patients, 120 were male (mean age: 73.1±14.4 years). The accuracy of the machine learning model in predicting TR severity from a single CT image was significantly higher than the average accuracy of the experts. The model was found to be exceptionally reliable for predicting severe TR. CONCLUSION: Deep learning models, particularly those using ResNet architectures, can help identify morphological changes associated with TR severity, aiding in early liver dysfunction detection in patients with HF, thereby improving outcomes.
Abstract: Corporate image is the external manifestation of a company’s cultural and spiritual essence, as well as the overall impression formed through its interactions with the public. Huawei, as a successful multinational enterprise, has established a robust corporate image in the international market through technological innovation and brand building. Moreover, Huawei’s development is closely aligned with national policies and strategies, making it a representative enterprise for showcasing China’s technological independence and national image. This study examines Huawei’s English press releases on product launches published between 2022 and 2024 and conducts a comparative analysis with similar materials from Apple’s official website. Based on Fairclough’s three-dimensional discourse analysis model, this research explores the linguistic features of Huawei’s corporate image construction from the perspectives of text, discourse practice, and social practice. The findings reveal that Huawei has successfully constructed a corporate image that emphasizes technological innovation, prioritizes user needs, and underscores its identity as a national enterprise. This study not only sheds light on Huawei’s strategies for image construction in international competition but also provides a valuable reference for Chinese enterprises in their cultural communication and brand building during the globalization process.
Funding: Supported by the Wuhan University Undergraduate Project of Innovation and Entrepreneurship Training “The Evolution of Cartoon Images of Pandas in Western Media’s China-Related News From the Perspective of Multimodal Theory” (Project Number: S202410486013).
Abstract: Applying visual grammar theory, this study examines the representational, interactive, and compositional meanings of the giant panda in Western media cartoons related to China from 1999 to the present. Distinct phases in the panda’s representation were identified and illustrated with cartoons from major Western media. These phases trace the shift of the panda cartoon image from a symbol of peace and friendliness to a politicized emblem of China’s international stance. Key visual trends, such as transitivity, color symbolism, scale enlargement, and increasing compositional complexity, embody the panda’s role in shaping China’s global image and its function in international discourse. These trends reflect the panda’s transformation into a contested symbol, which mediates between China’s self-representation and Western perceptions of its geopolitical rise. By situating the analysis within the context of China’s growing global influence, this study contributes to visual and media studies, demonstrating how cultural symbols are recontextualized to reflect and shape geopolitical narratives.
Funding: Supported by the National Natural Science Foundation of China (No. 82201135); "2015" Cultivation Program for Reserve Talents for Academic Leaders of Nanjing Stomatological School, Medical School of Nanjing University (No. 0223A204).
Abstract: Early correction of childhood malocclusion means the timely management of morphological, structural, and functional abnormalities at different stages of dentomaxillofacial development. The selection of appropriate imaging examinations and comprehensive radiological diagnosis and analysis play an important role in early correction of childhood malocclusion. This expert consensus is a collaborative effort by multidisciplinary experts in dentistry across the nation, based on current clinical evidence, aiming to provide general guidance on selecting appropriate imaging examinations and on comprehensive and accurate imaging assessment for patients receiving early orthodontic treatment.
Abstract: BACKGROUND: Pancreatic cancer remains one of the most lethal malignancies worldwide, with a poor prognosis often attributed to late diagnosis. Understanding the correlation between pathological type and imaging features is crucial for early detection and appropriate treatment planning. AIM: To retrospectively analyze the relationship between different pathological types of pancreatic cancer and their corresponding imaging features. METHODS: We retrospectively analyzed the data of 500 patients diagnosed with pancreatic cancer between January 2010 and December 2020 at our institution. Pathological types were determined by histopathological examination of the surgical specimens or biopsy samples. The imaging features were assessed using computed tomography, magnetic resonance imaging, and endoscopic ultrasound. Statistical analyses were performed to identify significant associations between pathological types and specific imaging characteristics. RESULTS: There were 320 (64%) cases of pancreatic ductal adenocarcinoma, 75 (15%) of intraductal papillary mucinous neoplasms, 50 (10%) of neuroendocrine tumors, and 55 (11%) of other rare types. Distinct imaging features were identified in each pathological type. Pancreatic ductal adenocarcinoma typically presents as a hypodense mass with poorly defined borders on computed tomography, whereas intraductal papillary mucinous neoplasms present as characteristic cystic lesions with mural nodules. Neuroendocrine tumors often appear as hypervascular lesions on contrast-enhanced imaging. Statistical analysis revealed significant correlations between specific imaging features and pathological types (P<0.001). CONCLUSION: This study demonstrated a strong association between the pathological types of pancreatic cancer and imaging features. These findings can enhance the accuracy of noninvasive diagnosis and guide personalized treatment approaches.
Funding: Supported by Interdisciplinary Program of Shanghai Jiao Tong University, No. YG2024 LC01; National Natural Science Foundation of China, No. 62406190.
Abstract: Confocal laser endomicroscopy (CLE) has become an indispensable tool in the diagnosis and detection of gastrointestinal (GI) diseases due to its high-resolution and high-contrast imaging capabilities. However, the early-stage imaging changes of gastrointestinal disorders are often subtle, and traditional medical image analysis methods rely heavily on manual interpretation, which is time-consuming, subject to observer variability, and inefficient for accurate lesion identification across large-scale image datasets. With the introduction of artificial intelligence (AI) technologies, AI-driven CLE image analysis systems can automatically extract pathological features and have demonstrated significant clinical value in lesion recognition, classification diagnosis, and malignancy prediction of GI diseases. These systems greatly enhance diagnostic efficiency and early detection capabilities. This review summarizes the applications of AI-assisted CLE in GI diseases, analyzes the limitations of current technologies, and explores future research directions. It is expected that the deep integration of AI and confocal imaging technologies will provide strong support for precision diagnosis and personalized treatment in the field of gastrointestinal disorders.
Abstract: Algal blooms, the spread of algae on the surface of water bodies, have adverse effects not only on aquatic ecosystems but also on human life. The adverse effects of harmful algal blooms (HABs) necessitate a convenient solution for detection and monitoring. Unmanned aerial vehicles (UAVs) have recently emerged as a tool for algal bloom detection, efficiently providing on-demand images at high spatiotemporal resolutions. This study developed an image processing method for estimating algal bloom area from aerial images (obtained from the internet) captured using UAVs. As a remote sensing method for HAB detection, analysis, and monitoring, a combination of histogram and texture analyses was used to efficiently estimate the area of HABs. Statistical features like entropy (using the Kullback-Leibler method) were emphasized with the aid of a gray-level co-occurrence matrix. The results showed that the orthogonal images produced fewer errors, and that the morphological filter best detected algal blooms in real time, with a precision of 80%. This study provided efficient image processing approaches using on-board UAVs for HAB monitoring.
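As a rough illustration of the texture step mentioned in this abstract, the following Python sketch builds a gray-level co-occurrence matrix (GLCM) for a tiny image and computes its Shannon entropy. The abstract does not give the authors' exact entropy formulation (it refers to a Kullback-Leibler method), so the plain Shannon entropy, the pixel offset, and the function names `glcm` and `glcm_entropy` are assumptions made here for illustration only.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence matrix as {(gray_i, gray_j): probability}.

    `image` is a list of rows of integer gray levels; (dx, dy) is the
    pixel offset between the two members of each pair (assumed here).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy(p):
    """Shannon entropy (in bits) of a normalized co-occurrence matrix."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


# Hypothetical 2x2 patches: a flat patch versus a checkerboard.
flat = [[3, 3], [3, 3]]
checker = [[0, 1], [1, 0]]
print(glcm_entropy(glcm(flat)), glcm_entropy(glcm(checker)))
```

A uniform patch gives zero entropy, while a textured patch gives a higher value, which is the property a texture-based bloom detector would rely on.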
Abstract: BACKGROUND: Subchorionic hematoma (SCH) is a common complication in early pregnancy characterized by the accumulation of blood between the uterine wall and the chorionic membrane. SCH can lead to adverse pregnancy outcomes such as miscarriage, preterm birth, and other complications. Early detection and accurate assessment of SCH are crucial for appropriate management and improved pregnancy outcomes. AIM: To evaluate the diagnostic efficacy of virtual organ computer-assisted analysis (VOCAL) in measuring the volume ratio of SCH to gestational sac (GS), combined with serum progesterone, for early pregnancy outcomes in patients with SCH. METHODS: A total of 153 patients with SCH in first-trimester pregnancies between 6 and 11 wk were enrolled. All patients were followed up until a gestational age of 20 wk. The parameters of transvaginal two-dimensional ultrasound, including the circumference of SCH (Cs), surface area of SCH (Ss), circumference of GS (Cg), and surface area of GS (Sg), and the parameters of VOCAL with transvaginal three-dimensional ultrasound, including the three-dimensional volume of SCH (3DVs) and GS (3DVg), were recorded. The size of the SCH and its ratio to the GS size (Cs/Cg, Ss/Sg, 3DVs/3DVg) were recorded and compared. RESULTS: Compared with the normal pregnancy group, the adverse pregnancy group had higher Cs/Cg, Ss/Sg, and 3DVs/3DVg ratios (P<0.05). A 3DVs/3DVg cutoff of 0.220 gave the highest predictive performance for adverse pregnancy outcomes, with an AUC of 0.767, a sensitivity of 70.2%, and a specificity of 75%. VOCAL-measured 3DVs/3DVg combined with serum progesterone gave a diagnostic AUC of 0.824 for early pregnancy outcome in SCH patients, with a sensitivity of 82.1% and a specificity of 72.1%; the difference between the AUCs was significant. CONCLUSION: VOCAL-measured 3DVs/3DVg effectively quantifies the severity of SCH, while its combination with serum progesterone better predicts adverse pregnancy outcomes.
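To make the cutoff-based metrics in this abstract concrete, here is a minimal Python sketch of how sensitivity and specificity follow from applying a threshold to a ratio such as 3DVs/3DVg. The data values below are invented for illustration, and treating a ratio at or above the cutoff as a positive prediction is an assumption; the abstract only reports the cutoff value of 0.220.

```python
def sensitivity_specificity(ratios, adverse, cutoff=0.220):
    """Sensitivity and specificity of the rule `ratio >= cutoff`.

    `ratios` holds one measured ratio per patient and `adverse` the
    matching outcome flags (True = adverse pregnancy outcome).
    """
    pairs = list(zip(ratios, adverse))
    tp = sum(1 for r, a in pairs if a and r >= cutoff)      # true positives
    fn = sum(1 for r, a in pairs if a and r < cutoff)       # false negatives
    tn = sum(1 for r, a in pairs if not a and r < cutoff)   # true negatives
    fp = sum(1 for r, a in pairs if not a and r >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy cohort, not data from the study.
toy_ratios = [0.10, 0.30, 0.25, 0.15, 0.40, 0.05]
toy_adverse = [False, True, True, True, False, False]
sens, spec = sensitivity_specificity(toy_ratios, toy_adverse)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sweeping `cutoff` over all observed ratios and plotting sensitivity against 1 minus specificity would trace the ROC curve whose area the abstract reports as the AUC.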
Funding: Supported in part by the Guangzhou Science and Technology Plan Project under Grants 2024B03J1361, 2023B03J1327, and 2023A04J0361; in part by the Open Fund Project of Hubei Province Key Laboratory of Occupational Hazard Identification and Control under Grant OHIC2023Y10; in part by the Guangdong Province Ordinary Colleges and Universities Young Innovative Talents Project under Grant 2023KQNCX036; in part by the Special Fund for Science and Technology Innovation Strategy of Guangdong Province (Climbing Plan) under Grant pdjh2024a226; in part by the Key Discipline Improvement Project of Guangdong Province under Grant 2022ZDJS015; and in part by the Research Fund of Guangdong Polytechnic Normal University under Grants 22GPNUZDJS17 and 2022SDKYA015.
Abstract: With the accelerated pace of daily life and the development of e-commerce, online shopping has become a mainstream way for consumers to access products and services. To understand consumers' emotional expressions in different shopping experience scenarios, this paper presents a sentiment analysis method that combines images generated from e-commerce review keywords with a hybrid machine learning-based model, in which Word2Vec-TextRank is used to extract keywords that act as the inputs for generating the related images by generative Artificial Intelligence (AI). Subsequently, a hybrid Convolutional Neural Network and Support Vector Machine (CNNSVM) model is applied for sentiment classification of those keyword-generated images. For method validation, 5000 randomly selected reviews from Amazon were analyzed. With superior keyword extraction capability, the proposed method achieves an accuracy of up to 97.13% on sentiment classification. This performance demonstrates the advantages of the text-to-image approach, providing a unique perspective for sentiment analysis of e-commerce review data compared to existing works. Thus, the proposed method enhances the reliability and insights of customer feedback surveys, and may also establish a novel direction in similar cases, such as social media monitoring and market trend research.
Abstract: The Ki67 index (KI) is a standard clinical marker for tumor proliferation; however, its application is hindered by intratumoral heterogeneity. In this study, we used digital image analysis to comprehensively analyze Ki67 heterogeneity and distribution patterns in breast carcinoma. Using Smart Pathology software, we digitized and analyzed 42 excised breast carcinoma Ki67 slides. Boxplots, histograms, and heat maps were generated to illustrate the KI distribution. We found that 30% of cases (13/42) exhibited discrepancies between global and hotspot KI when using a 14% KI threshold for classification. Patients with higher global or hotspot KI values displayed greater heterogeneity. Ki67 distribution patterns were categorized as randomly distributed (52%, 22/42), peripheral (43%, 18/42), and centered (5%, 2/42). Our sampling simulator indicated that analyzing more than 10 high-power fields was typically required to accurately estimate global KI, with the required sampling size being correlated with heterogeneity. In conclusion, using digital image analysis in whole-slide images allows for comprehensive Ki67 profile assessment, shedding light on heterogeneity and distribution patterns. This spatial information can facilitate KI surveys of breast cancer and other malignancies.
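The sampling idea in this abstract can be sketched in a few lines of Python. The sketch below is not the authors' simulator: it assumes that each high-power field contributes equally to the global KI, that the hotspot KI is the highest single-field value, and that a KI at or above 14% is classified as high. The field values and all function names are hypothetical.

```python
import random


def classify(ki, threshold=14.0):
    """Label a KI percentage as 'high' or 'low' (>= threshold is assumed)."""
    return "high" if ki >= threshold else "low"


def simulate_sampling_error(field_ki, n_fields, trials=2000, seed=0):
    """Mean absolute error of a KI estimate from `n_fields` random fields.

    The global KI is taken as the mean over all fields, which assumes
    every field holds a similar number of tumor cells.
    """
    rng = random.Random(seed)
    global_ki = sum(field_ki) / len(field_ki)
    total_error = 0.0
    for _ in range(trials):
        sample = rng.sample(field_ki, n_fields)  # sampling without replacement
        total_error += abs(sum(sample) / n_fields - global_ki)
    return total_error / trials


# Hypothetical heterogeneous slide: 30 low-KI fields and 10 hotspot fields.
fields = [5] * 30 + [40] * 10
global_ki = sum(fields) / len(fields)
hotspot_ki = max(fields)
print(classify(global_ki), classify(hotspot_ki))  # the two labels disagree
for n in (2, 5, 10, 20):
    print(n, round(simulate_sampling_error(fields, n), 2))
```

On this toy slide the global and hotspot classifications differ, and the estimation error shrinks as more fields are sampled, which mirrors the qualitative findings the abstract reports.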
Funding: National Natural Science Foundation of China (82305090); Science and Technology Commission of Shanghai Municipality (22YF1448900); Shanghai Municipal Health Commission (20234Y0168).
Abstract: Objective: To analyze the differences in the correlation of tongue image indicators among patients with benign lung nodules and lung cancer. Methods: From July 1, 2020 to March 31, 2022, clinical information of lung cancer patients and benign lung nodule patients was collected at the Oncology Department of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine and the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, respectively. We obtained tongue images from patients with benign lung nodules and lung cancer using the TFDA-1 digital tongue diagnosis instrument, and analyzed these images with the TDAS V2.0 software. The extracted indicators included color space parameters in the Lab system for both the tongue body (TB) and tongue coating (TC) (TB/TC-L, TB/TC-a, and TB/TC-b); textural parameters [TB/TC-contrast (CON), TB/TC-angular second moment (ASM), TB/TC-entropy (ENT), and TB/TC-MEAN]; as well as TC parameters (perAll and perPart). The bivariate correlation of TB and TC features was analyzed using Pearson’s or Spearman’s correlation analysis, and the overall correlation was analyzed using canonical correlation analysis (CCA). Results: Samples from 307 patients with benign lung nodules and 276 lung cancer patients were included after excluding outliers and extreme values. Simple correlation analysis indicated that the correlation of TB-L with TC-L, TB-b with TC-b, and TB-b with perAll in the lung cancer group was higher than that in the benign nodules group. Moreover, the correlation of TB-a with TC-a, TB-a with perAll, and the texture parameters of the TB (TB-CON, TB-ASM, TB-ENT, and TB-MEAN) with the texture parameters of the TC (TC-CON, TC-ASM, TC-ENT, and TC-MEAN) in the benign nodules group was higher than in the lung cancer group. CCA further demonstrated a strong correlation between the TB and TC parameters in the lung cancer group, with the first and second pairs of canonical variables in the benign nodules and lung cancer groups showing correlation coefficients of 0.918 and 0.817 (P<0.05), and 0.940 and 0.822 (P<0.05), respectively. Conclusion: Benign lung nodule and lung cancer patients exhibited differences in correlation in the L, a, and b values of the TB and TC, as well as the perAll value of the TC, and the texture parameters (TB/TC-CON, TB/TC-ASM, TB/TC-ENT, and TB/TC-MEAN) between the TB and TC. Additionally, there were differences in the overall correlation of the TB and TC between the two groups. Objective tongue diagnosis indicators can effectively assist in the diagnosis of benign lung nodules and lung cancer, thereby providing a scientific basis for the early detection, diagnosis, and treatment of lung cancer.
Funding: Supported by a grant from the Standardization and Integration of Resources Information for Seed-Cluster in Hub-Spoke Material Bank Program (Project No. PJ01587004), Rural Development Administration, Republic of Korea.
Abstract: Recognizing the variation of genetic resources is the first step in selection. One of the most important variations in grain crops is the uniformity of seed grain weight, which can be converted into seed size. However, measuring it on a large scale has been challenging because of the high labor cost and time required. The current study used an image analysis technique to measure the seed area of about 100 seeds per accession in 64 germplasm accessions of Tartary buckwheat (Fagopyrum tataricum) to study variation among and within them. To understand the nature of the variation, skewness and kurtosis analysis of the probability density function curve for seed area was used. As a result, a large variation among and within accessions was found. This means that seed sizes within an accession are not uniform in this cleistogamous species due to its non-uniform flowering time. This implies that seed size should be considered an important factor in germplasm enhancement programs.
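For readers unfamiliar with the two shape statistics named in this abstract, the following Python sketch computes moment-based skewness and excess kurtosis for a list of seed areas. The abstract does not say which estimator the authors used, so the population (biased) moment formulas here are an assumption, and the sample values are invented.

```python
def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    Uses population moments m_k = mean((x - mean)^k); skewness is
    m3 / m2^1.5 and excess kurtosis is m4 / m2^2 - 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical seed areas (mm^2) for one accession with a few large seeds.
seed_areas = [10.1, 10.4, 10.2, 10.3, 10.0, 10.2, 12.8, 13.1]
skew, kurt = skewness_kurtosis(seed_areas)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

A positive skewness indicates a tail of unusually large seeds within an accession, which is one way non-uniform seed size could show up in the distribution.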
Funding: Projects (50934002, 51074013, 51304076, 51104100) supported by the National Natural Science Foundation of China; Project (IRT0950) supported by the Program for Changjiang Scholars Innovative Research Team in Universities, China; Project (2012M510007) supported by China Postdoctoral Science Foundation.
Abstract: This work focuses on methods and procedures for three-dimensional (3D) characterization of pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Based on in-house 3D image analysis programs developed in Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean pore diameter, and the effective pore size (d50) increase with increasing particle size. A lognormal or Gaussian distribution is most suitable for modeling the pore size distribution. The degree of connectivity, investigated with a cluster-labeling algorithm, also increases approximately with increasing particle size.
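The two quantities named in this abstract, volume porosity and cluster-labeling connectivity, can be illustrated on a tiny binary voxel grid. The Python sketch below is not the authors' Matlab program: the 6-neighbor rule, the definition of connectivity as the share of pore voxels in the largest cluster, and the toy volume are all assumptions chosen for illustration.

```python
from collections import deque

# Face-sharing (6-connected) neighbor offsets in (z, y, x) order.
NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def porosity(volume):
    """Fraction of voxels that are pore (1) rather than solid (0)."""
    voxels = [v for plane in volume for row in plane for v in row]
    return sum(voxels) / len(voxels)


def cluster_sizes(volume):
    """Sizes of 6-connected pore clusters, found by breadth-first labeling."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    seen, sizes = set(), []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if volume[z][y][x] != 1 or (z, y, x) in seen:
                    continue
                seen.add((z, y, x))
                queue, size = deque([(z, y, x)]), 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in NEIGHBORS:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and volume[n[0]][n[1]][n[2]] == 1:
                            seen.add(n)
                            queue.append(n)
                sizes.append(size)
    return sizes


def connectivity(volume):
    """Share of pore voxels in the largest cluster (assumed definition)."""
    sizes = cluster_sizes(volume)
    return max(sizes) / sum(sizes) if sizes else 0.0


# Hypothetical 2x2x2 volume: one 3-voxel pore cluster and one isolated pore.
toy = [[[1, 1], [0, 0]], [[1, 0], [0, 1]]]
print(porosity(toy), cluster_sizes(toy), connectivity(toy))
```

On real tomography data the same idea would be applied to a thresholded image stack, where an array library with a built-in labeling routine would be far faster than this pure-Python loop.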
Abstract: To develop a quick, accurate, and noise-resistant automated image registration technique for infrared images, wavelet analysis was used to extract feature points in two images, followed by compensation of the input image for the angle difference between them. A hierarchical feature matching algorithm was adopted to obtain the final transform parameters between the two images. Simulation results for two infrared images show that the method can register images effectively, quickly, and accurately, and is resistant to noise to some extent.
Abstract: The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-Score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, all improvements over the baseline results on the original images. Moreover, the weighted average F1-Score across all classes and techniques is 0.9886, indicating an enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-Score, with an F1-Score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance compared to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings of this study can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
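Since the results in this abstract are reported as per-class and weighted-average F1-Scores, a short Python sketch of those two calculations may help. The confusion counts and class supports below are invented for illustration and are not taken from the study; the function names are likewise hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 for one class from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_f1(per_class):
    """Support-weighted mean of per-class F1 values.

    `per_class` is a list of (f1, support) pairs, where support is the
    number of true samples of that class.
    """
    total = sum(support for _, support in per_class)
    return sum(f1 * support for f1, support in per_class) / total


# Hypothetical counts for three rock classes: (tp, fp, fn, support).
counts = {
    "igneous": (95, 3, 5, 100),
    "metamorphic": (88, 6, 12, 100),
    "sedimentary": (48, 4, 2, 50),
}
scores = [(f1_score(tp, fp, fn), support) for tp, fp, fn, support in counts.values()]
print([round(f1, 4) for f1, _ in scores], round(weighted_f1(scores), 4))
```

Weighting by support means that classes with more thin-section images contribute more to the averaged score, which matters when the rock classes are imbalanced.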
Funding: Funded by the Deanship of Scientific Research at King Khalid University through a large group Research Project under grant number RGP2/421/45; supported via funding from Prince Sattam bin Abdulaziz University, project number (PSAU/2024/R/1446); supported by the Researchers Supporting Project Number (UM-DSR-IG-2023-07), Almaarefa University, Riyadh, Saudi Arabia; supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. 2021R1F1A1055408).
Abstract: Machine learning (ML) is increasingly applied to medical image processing with appropriate learning paradigms. These applications include analyzing images of various organs, such as the brain, lung, and eye, to identify specific flaws or diseases for diagnosis. The primary concern of ML applications is the precise selection of flexible image features for pattern detection and region classification. Most of the extracted image features are irrelevant and lead to an increase in computation time. Therefore, this article uses an analytical learning paradigm to design a Congruent Feature Selection Method that selects the most relevant image features. This process trains the learning paradigm using similarity and correlation-based features over different textural intensities and pixel distributions. The similarity between the pixels over the various distribution patterns with high indexes is recommended for disease diagnosis. Later, the correlation based on intensity and distribution is analyzed to improve the feature selection congruency. The more congruent pixels are then sorted in descending order of selection, which identifies better regions than the distribution alone. Next, the learning paradigm is trained using intensity and region-based similarity to maximize the chances of selection. As a result, the probability of feature selection, regardless of the textures and medical image patterns, is improved. This process enhances the performance of ML applications for different medical image processing tasks. The proposed method improves the accuracy, precision, and training rate by 13.19%, 10.69%, and 11.06%, respectively, compared to other models for the selected dataset. The mean error and selection time are also reduced by 12.56% and 13.56%, respectively, compared to the same models and dataset.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 6240072655); the Hubei Provincial Key Research and Development Program (Grant No. 2023BCB151); the Wuhan Natural Science Foundation Exploration Program (Chenguang Program, Grant No. 2024040801020202); the Natural Science Foundation of Hubei Province of China (Grant No. 2025AFB148).
Abstract: Image segmentation is attracting increasing attention in the field of medical image analysis. Given its widespread use across various medical applications, ensuring and improving segmentation accuracy has become a crucial topic of research. With advances in deep learning, researchers have developed numerous methods that combine Transformers and convolutional neural networks (CNNs) to create highly accurate models for medical image segmentation. However, efforts to further enhance accuracy by developing larger and more complex models or training with more extensive datasets significantly increase computational resource consumption. To address this problem, we propose BiCLIP-nnFormer (the prefix "Bi" refers to the use of two distinct CLIP models), a virtual multimodal instrument that leverages CLIP models to enhance the segmentation performance of a medical segmentation model, nnFormer. Since the two CLIP models (PMC-CLIP and CoCa-CLIP) are pre-trained on large datasets, they do not require additional training, thus conserving computation resources. These models are used offline to extract image and text embeddings from medical images. These embeddings are then processed by the proposed 3D CLIP adapter, which adapts the CLIP knowledge for segmentation tasks by fine-tuning. Finally, the adapted embeddings are fused with feature maps extracted from the nnFormer encoder to generate predicted masks. This process enriches the representation capabilities of the feature maps by integrating global multimodal information, leading to more precise segmentation predictions. We demonstrate the superiority of BiCLIP-nnFormer and the effectiveness of using CLIP models to enhance nnFormer through experiments on two public datasets, namely the Synapse multi-organ segmentation dataset (Synapse) and the Automatic Cardiac Diagnosis Challenge dataset (ACDC), as well as a self-annotated lung multi-category segmentation dataset (LMCS).
Abstract: The growing range of Generative Adversarial Network (GAN) applications in medical imaging, cyber security, data augmentation, and remote sensing makes a critical review of GANs increasingly necessary. Earlier reviews that targeted a particular GAN architecture or a specific application area were narrow in scope and lacked a systematic comparative analysis of the models’ performance metrics. Many reviews do not apply standardized frameworks, leaving gaps in the evaluation of GAN efficiency, training stability, and suitability for specific tasks. In this work, a systematic review of GAN models using the PRISMA framework is developed in detail to fill this gap by structurally evaluating GAN architectures. A wide variety of GAN models are discussed, from the basic Conditional GAN, Wasserstein GAN, and Deep Convolutional GAN to many specialized models, such as EVAGAN, FCGAN, and SIF-GAN, for applications across domains like fault diagnosis, network security, medical imaging, and image segmentation. The PRISMA methodology systematically filters relevant studies by inclusion and exclusion criteria to ensure transparency and replicability in the review process. All models are assessed against specific performance metrics such as accuracy, stability, and computational efficiency. There are multiple benefits to using the PRISMA approach in this setting: it helps in finding optimal models for various applications, and it provides an explicit framework for comparing GAN performance. In addition, diverse types of GAN are included to ensure a comprehensive view of the state-of-the-art techniques. This work is valuable not only for its results but also because it guides future research by pinpointing which types of applications require which GAN architectures, improves task-specific model selection, and points out areas for further research on the development and application of GANs.