The satellite-based augmentation system(SBAS)provides differential and integrity augmentation services for life safety fields of aviation and navigation.However,the signal structure of SBAS is public,which incurs a ri...The satellite-based augmentation system(SBAS)provides differential and integrity augmentation services for life safety fields of aviation and navigation.However,the signal structure of SBAS is public,which incurs a risk of spoofing attacks.To improve the anti-spoofing capability of the SBAS,European Union and the United States conduct research on navigation message authentication,and promote the standardization of SBAS message authentication.For the development of Beidou satellite-based augmentation system(BDSBAS),this paper proposes navigation message authentication based on the Chinese commercial cryptographic standards.Firstly,this paper expounds the architecture and principles of the SBAS message authentication,and then carries out the design of timed efficient streaming losstolerant authentication scheme(TESLA)and elliptic curve digital signature algorithm(ECDSA)authentication schemes based on Chinese commercial cryptographic standards,message arrangement and the design of over-the-air rekeying(OTAR)message.Finally,this paper conducts a theoretical analysis of the time between authentications(TBA)and maximum authentication latency(MAL)for L5 TESLA-I and L5 ECDSA-Q,and further simulates the reception time of OTAR message,TBA and MAL from the aspects of OTAR message weight and demodulation error rate.The simulation results can provide theoretical supports for the standardization of BDSBAS message authentication.展开更多
Objective expertise evaluation of individuals,as a prerequisite stage for team formation,has been a long-term desideratum in large software development companies.With the rapid advancements in machine learning methods...Objective expertise evaluation of individuals,as a prerequisite stage for team formation,has been a long-term desideratum in large software development companies.With the rapid advancements in machine learning methods,based on reliable existing data stored in project management tools’datasets,automating this evaluation process becomes a natural step forward.In this context,our approach focuses on quantifying software developer expertise by using metadata from the task-tracking systems.For this,we mathematically formalize two categories of expertise:technology-specific expertise,which denotes the skills required for a particular technology,and general expertise,which encapsulates overall knowledge in the software industry.Afterward,we automatically classify the zones of expertise associated with each task a developer has worked on using Bidirectional Encoder Representations from Transformers(BERT)-like transformers to handle the unique characteristics of project tool datasets effectively.Finally,our method evaluates the proficiency of each software specialist across already completed projects from both technology-specific and general perspectives.The method was experimentally validated,yielding promising results.展开更多
Many spore-forming Bacillus species can cause serious human diseases,because of accidental Bacillusspore infection.Thus,developing an identification strategy with both high sensitivity and specificity is greatly in de...Many spore-forming Bacillus species can cause serious human diseases,because of accidental Bacillusspore infection.Thus,developing an identification strategy with both high sensitivity and specificity is greatly in demand.In this work,we proposed a novel approach named multi-head self-attention mechanism-guided neural network Raman platform to identify living Bacillus spores within a single-cell resolution.The multi-head self-attention mechanism-guided neural network Raman platform was created by combining single-cell Raman spectroscopy,convolutional neural network(CNN),and multi-head self-attention mechanism.To address the limited size of the original spectra dataset,Gaussian noise-based spectra augmentation was employed to increase the number of single-cell Raman spectra datasets for CNN training.Owing to the assistance of both spectra augmentation and multi-head self-attention mechanism,the obtained prediction accuracy of five Bacillus spore species was further improved from 92.29±0.82%to 99.43±0.15%.To figure out the spectra differences covered by the multi-head self-attention mechanism-guided CNN,the relative classification weight from typical Raman bands was visualized via multi-head self-attention mechanism curve.In the process of spectra augmentation from 0 to 1000,the distribution of relative classification weight varied from a discrete state to a more concentrated phase.More importantly,these highlighted four Raman bands(1017,1449,1576,and 1660 cm^(-1))were assigned large weights,showing that the spectra differences in the Raman bands produced the largest contribution to prediction accuracy.It can be foreseen that,our proposed sorting platform has great potential in accurately identifying Bacillus and its related genera species at a single-cell level.展开更多
This paper focuses on self-supervised video representation learning.Most existing approaches follow the contrastive learning pipeline to construct positive and negative pairs by sampling different clips.However,this f...This paper focuses on self-supervised video representation learning.Most existing approaches follow the contrastive learning pipeline to construct positive and negative pairs by sampling different clips.However,this formulation tends to bias the static background and has difficulty establishing global temporal structures.The major reason is that the positive pairs,i.e.,different clips sampled from the same video,have limited temporal receptive fields,and usually share similar backgrounds but differ in motions.To address these problems,we propose a framework to jointly utilize local clips and global videos to learn from detailed region-level correspondence as well as general long-term temporal relations.Based on a set of designed controllable augmentations,we implement accurate appearance and motion pattern alignment through soft spatio-temporal region contrast.Our formulation avoids the low-level redundancy shortcut with an adversarial mutual information minimization objective to improve the generalization ability.Moreover,we introduce local-global temporal order dependency to further bridge the gap between clip-level and video-level representations for robust temporal modeling.Extensive experiments demonstrate that our framework is superior on three video benchmarks in action recognition and video retrieval,and captures more accurate temporal dynamics.展开更多
Lung cancer continues to be a leading cause of cancer-related deaths worldwide,emphasizing the critical need for improved diagnostic techniques.Early detection of lung tumors significantly increases the chances of suc...Lung cancer continues to be a leading cause of cancer-related deaths worldwide,emphasizing the critical need for improved diagnostic techniques.Early detection of lung tumors significantly increases the chances of successful treatment and survival.However,current diagnostic methods often fail to detect tumors at an early stage or to accurately pinpoint their location within the lung tissue.Single-model deep learning technologies for lung cancer detection,while beneficial,cannot capture the full range of features present in medical imaging data,leading to incomplete or inaccurate detection.Furthermore,it may not be robust enough to handle the wide variability in medical images due to different imaging conditions,patient anatomy,and tumor characteristics.To overcome these disadvantages,dual-model or multi-model approaches can be employed.This research focuses on enhancing the detection of lung cancer by utilizing a combination of two learning models:a Convolutional Neural Network(CNN)for categorization and the You Only Look Once(YOLOv8)architecture for real-time identification and pinpointing of tumors.CNNs automatically learn to extract hierarchical features from raw image data,capturing patterns such as edges,textures,and complex structures that are crucial for identifying lung cancer.YOLOv8 incorporates multiscale feature extraction,enabling the detection of tumors of varying sizes and scales within a single image.This is particularly beneficial for identifying small or irregularly shaped tumors that may be challenging to detect.Furthermore,through the utilization of cutting-edge data augmentation methods,such as Deep Convolutional Generative Adversarial Networks(DCGAN),the suggested approach can handle the issue of limited data and boost the models’ability to learn from diverse and comprehensive datasets.The combined method not only improved accuracy and localization but also ensured efficient real-time processing,which is crucial for practical clinical applications.The CNN achieved an accuracy of 97.67%in classifying lung tissues into healthy and cancerous categories.The YOLOv8 model achieved an Intersection over Union(IoU)score of 0.85 for tumor localization,reflecting high precision in detecting and marking tumor boundaries within the images.Finally,the incorporation of synthetic images generated by DCGAN led to a 10%improvement in both the CNN classification accuracy and YOLOv8 detection performance.展开更多
Photonic platforms are gradually emerging as a promising option to encounter the ever-growing demand for artificial intelligence,among which photonic time-delay reservoir computing(TDRC)is widely anticipated.While suc...Photonic platforms are gradually emerging as a promising option to encounter the ever-growing demand for artificial intelligence,among which photonic time-delay reservoir computing(TDRC)is widely anticipated.While such a computing paradigm can only employ a single photonic device as the nonlinear node for data processing,the performance highly relies on the fading memory provided by the delay feedback loop(FL),which sets a restriction on the extensibility of physical implementation,especially for highly integrated chips.Here,we present a simplified photonic scheme for more flexible parameter configurations leveraging the designed quasi-convolution coding(QC),which completely gets rid of the dependence on FL.Unlike delay-based TDRC,encoded data in QC-based RC(QRC)enables temporal feature extraction,facilitating augmented memory capabilities.Thus,our proposed QRC is enabled to deal with time-related tasks or sequential data without the implementation of FL.Furthermore,we can implement this hardware with a low-power,easily integrable vertical-cavity surface-emitting laser for high-performance parallel processing.We illustrate the concept validation through simulation and experimental comparison of QRC and TDRC,wherein the simpler-structured QRC outperforms across various benchmark tasks.Our results may underscore an auspicious solution for the hardware implementation of deep neural networks.展开更多
Objective The study of medicine formulas is a core component of traditional Chinese medicine(TCM),yet traditional learning methods often lack interactivity and contextual understanding,making it challenging for beginn...Objective The study of medicine formulas is a core component of traditional Chinese medicine(TCM),yet traditional learning methods often lack interactivity and contextual understanding,making it challenging for beginners to grasp the intricate composition rules of formulas.To address this gap,we introduce Formula-S,a situated visualization method for TCM formula learning in augmented reality(AR)and evaluate its performance.This study aims to evaluate the effectiveness of Formula-S in enhancing TCM formula learning for beginners by comparing it with traditional text-based formula learning and web-based visualization.Methods Formula-S is an interactive AR tool designed for TCM formula learning,featuring three modes(3D,Web,and Table).The dataset included TCM formulas and herb properties extracted from authoritative references,including textbook and the SymMap database.In Formula-S,the hierarchical visualization of the formulas as herbal medicine compositions,is linked to the multidimensional herb attribute visualization and embedded in the real world,where real herb samples are presented.To evaluate its effectiveness,a controlled study(n=30)was conducted.Participants who had no formal TCM knowledge were tasked with herbal medicine identification,formula composition,and recognition.In the study,participants interacted with the AR tool through HoloLens 2.Data were collected on both task performance(accuracy and response time)and user experience,with a focus on task efficiency,accuracy,and user preference across the different learning modes.Results The situated visualization method of Formula-S had comparable accuracy to other methods but shorter response time for herbal formula learning tasks.Regarding user experience,our new approach demonstrated the highest system usability and lowest task load,effectively reducing cognitive load and allowing users to complete tasks with greater ease and efficiency.Participants reported that Formula-S enhanced their learning experience through its intuitive interface and immersive AR environment,suggesting this approach offers usability advantages for TCM education.Conclusions The situated visualization method in Formula-S offers more efficient and accurate searching capabilities compared to traditional and web-based methods.Additionally,it provides superior contextual understanding of TCM formulas,making it a promising new solution for TCM learning.展开更多
The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and hist...The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.展开更多
Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-a...Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-aided diagnosis system is highly required.Ultrasound is an important examination technique for breast cancer diagnosis due to its low cost.Recently,many learning-based techniques have been introduced to classify breast cancer using breast ultrasound imaging dataset(BUSI)datasets;however,the manual handling is not an easy process and time consuming.The authors propose an EfficientNet-integrated ResNet deep network and XAI-based framework for accurately classifying breast cancer(malignant and benign).In the initial step,data augmentation is performed to increase the number of training samples.For this purpose,three-pixel flip mathematical equations are introduced:horizontal,vertical,and 90°.Later,two pretrained deep learning models were employed,skipped some layers,and fine-tuned.Both fine-tuned models are later trained using a deep transfer learning process and extracted features from the deeper layer.Explainable artificial intelligence-based analysed the performance of trained models.After that,a new feature selection technique is proposed based on the cuckoo search algorithm called cuckoo search controlled standard error mean.This technique selects the best features and fuses using a new parallel zeropadding maximum correlated coefficient features.In the end,the selection algorithm is applied again to the fused feature vector and classified using machine learning algorithms.The experimental process of the proposed framework is conducted on a publicly available BUSI and obtained 98.4%and 98%accuracy in two different experiments.Comparing the proposed framework is also conducted with recent techniques and shows improved accuracy.In addition,the proposed framework was executed less than the original deep learning models.展开更多
Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks.Due to the diversity and robustness constraints of the data,data augmenta...Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks.Due to the diversity and robustness constraints of the data,data augmentation(DA)methods are utilised to expand dataset diversity and scale.However,due to the complex and distinct characteristics of LiDAR point cloud data from different platforms(such as missile-borne and vehicular LiDAR data),directly applying traditional 2D visual domain DA methods to 3D data can lead to networks trained using this approach not robustly achieving the corresponding tasks.To address this issue,the present study explores DA for missile-borne LiDAR point cloud using a Monte Carlo(MC)simulation method that closely resembles practical application.Firstly,the model of multi-sensor imaging system is established,taking into account the joint errors arising from the platform itself and the relative motion during the imaging process.A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is proposed,underpinned by an analysis of combined errors between different modal sensors,achieving high-quality augmentation of point cloud data.The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper.Comparative experiments between the proposed point cloud DA algorithm and the current state-of-the-art algorithms in point cloud detection and single object tracking tasks demonstrate that the proposed method can improve the network performance obtained from unaugmented datasets by over 17.3%and 17.9%,surpassing SOTA performance of current point cloud DA algorithms.展开更多
Solar cell defects exhibit significant variations and multiple types,with some defect data being difficult to acquire or having small scales,posing challenges in terms of small sample and small target in defect detect...Solar cell defects exhibit significant variations and multiple types,with some defect data being difficult to acquire or having small scales,posing challenges in terms of small sample and small target in defect detection for solar cells.In order to address this issue,this paper proposes a multi-step approach for detecting the complex defects of solar cells.First,individual cell plates are extracted from electroluminescence images for block-by-block detection.Then,StyleGAN2-Ada is utilized for generative adversarial networks data augmentation to expand the number of defect samples in small sample defects.Finally,the fake dataset is combined with real dataset,and the improved YOLOv5 model is trained on this mixed dataset.Experimental results demonstrate that the proposed method achieves a superior performance in detecting the defects with small sample and small target,with the final recall rate reaching 99.7%,an increase of 3.9% compared with the unimproved model.Additionally,the precision and mean average precision are increased by 3.4% and 3.5%,respectively.Moreover,the experiments demonstrate that the improved network training on the mixed dataset can effectively enhance the detection performance of the model.The combination of these approaches significantly improves the network’s ability to detect solar cell defects.展开更多
Phantom limb pain(PLP)is not only a physical pain experience but also poses a significant challenge to mental health and quality of life.Currently,the mechanism of PLP treatment is still unclear,and there are many met...Phantom limb pain(PLP)is not only a physical pain experience but also poses a significant challenge to mental health and quality of life.Currently,the mechanism of PLP treatment is still unclear,and there are many methods with varying effects.This article starts with the application research of extended reality technology in PLP treatment,through describing the application of its branch technologies(virtual reality,augmented reality,and mixed reality technology),to lay the foundation for subsequent research,in the hope of finding advanced and effective treatment methods,and providing a basis for future product transformation.展开更多
Data augmentation plays an important role in training deep neural model by expanding the size and diversity of the dataset.Initially,data augmentation mainly involved some simple transformations of images.Later,in ord...Data augmentation plays an important role in training deep neural model by expanding the size and diversity of the dataset.Initially,data augmentation mainly involved some simple transformations of images.Later,in order to increase the diversity and complexity of data,more advanced methods appeared and evolved to sophisticated generative models.However,these methods required a mass of computation of training or searching.In this paper,a novel training-free method that utilises the Pre-Trained Segment Anything Model(SAM)model as a data augmentation tool(PTSAM-DA)is proposed to generate the augmented annotations for images.Without the need for training,it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations.In this way,annotations are augmented more ingenious than simple manipulations without incurring huge computation for training a data augmentation model.Multiple comparative experiments on three datasets are conducted,including an in-house dataset,ADE20K and COCO2017.On this in-house dataset,namely Agricultural Plot Segmentation Dataset,maximum improvements of 3.77%and 8.92%are gained in two mainstream metrics,mIoU and mAcc,respectively.Consequently,large vision models like SAM are proven to be promising not only in image segmentation but also in data augmentation.展开更多
In order to address the widespread data shortage problem in battery research,this paper proposes a generative adversarial network model that combines it with deep convolutional networks,the Wasserstein distance,and th...In order to address the widespread data shortage problem in battery research,this paper proposes a generative adversarial network model that combines it with deep convolutional networks,the Wasserstein distance,and the gradient penalty to achieve data augmentation.To lower the threshold for implementing the proposed method,transfer learning is further introduced.The W-DC-GAN-GP-TL framework is thereby formed.This framework is evaluated on 3 different publicly available datasets to judge the quality of generated data.Through visual comparisons and the examination of two visualization methods(probability density function(PDF)and principal component analysis(PCA)),it is demonstrated that the generated data is hard to distinguish from the real data.The application of generated data for training a battery state model using transfer learning is further evaluated.Specifically,Bi-GRU-based and Transformer-based methods are implemented on 2 separate datasets for estimating state of health(SOH)and state of charge(SOC),respectively.The results indicate that the proposed framework demonstrates satisfactory performance in different scenarios:for the data replacement scenario,where real data are removed and replaced with generated data,the state estimator accuracy decreases only slightly;for the data enhancement scenario,the estimator accuracy is further improved.The estimation accuracy of SOH and SOC is as low as 0.69%and 0.58%root mean square error(RMSE)after applying the proposed framework.This framework provides a reliable method for enriching battery measurement data.It is a generalized framework capable of generating a variety of time series data.展开更多
Peri-implant keratinized mucosa(PIKM)augmentation refers to surgical procedures aimed at increasing the width of PIKM.Consensus reports emphasize the necessity of maintaining a minimum width of PIKM to ensure long-ter...Peri-implant keratinized mucosa(PIKM)augmentation refers to surgical procedures aimed at increasing the width of PIKM.Consensus reports emphasize the necessity of maintaining a minimum width of PIKM to ensure long-term peri-implant health.Currently,several surgical techniques have been validated for their effectiveness in increasing PIKM.However,the selection and application of PIKM augmentation methods may present challenges for dental practitioners due to heterogeneity in surgical techniques,variations in clinical scenarios,and anatomical differences.Therefore,clear guidelines and considerations for PIKM augmentation are needed.This expert consensus focuses on the commonly employed surgical techniques for PIKM augmentation and the factors influencing their selection at second-stage surgery.It aims to establish a standardized framework for assessing,planning,and executing PIKM augmentation procedures,with the goal of offering evidence-based guidance to enhance the predictability and success of PIKM augmentation.展开更多
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones...Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network;HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.展开更多
Existing semi-supervisedmedical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch.However,current copy-paste methods have three limitations:(1)t...Existing semi-supervisedmedical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch.However,current copy-paste methods have three limitations:(1)training the model solely with copy-paste mixed pictures from labeled and unlabeled input loses a lot of labeled information;(2)low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data;(3)the segmentation performance in low-contrast and local regions is less than optimal.We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy(SADT),which enhances feature diversity and learns high-quality features to overcome these problems.To be more precise,SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data,which prevents the loss of rare labeled data.We introduce a bi-directional copy-pastemask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision.For the mixed images,Deep-Shallow Spatial Contrastive Learning(DSSCL)is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas.In this procedure,the features retrieved by the Student Network are subjected to a random feature perturbation technique.On two openly available datasets,extensive trials show that our proposed SADT performs much better than the state-ofthe-art semi-supervised medical segmentation techniques.Using only 10%of the labeled data for training,SADT was able to acquire a Dice score of 90.10%on the ACDC(Automatic Cardiac Diagnosis Challenge)dataset.展开更多
Augmented reality(AR)is an emerging dynamic technology that effectively supports education across different levels.The increased use of mobile devices has an even greater impact.As the demand for AR applications in ed...Augmented reality(AR)is an emerging dynamic technology that effectively supports education across different levels.The increased use of mobile devices has an even greater impact.As the demand for AR applications in education continues to increase,educators actively seek innovative and immersive methods to engage students in learning.However,exploring these possibilities also entails identifying and overcoming existing barriers to optimal educational integration.Concurrently,this surge in demand has prompted the identification of specific barriers,one of which is three-dimensional(3D)modeling.Creating 3D objects for augmented reality education applications can be challenging and time-consuming for the educators.To address this,we have developed a pipeline that creates realistic 3D objects from the two-dimensional(2D)photograph.Applications for augmented and virtual reality can then utilize these created 3D objects.We evaluated the proposed pipeline based on the usability of the 3D object and performance metrics.Quantitatively,with 117 respondents,the co-creation team was surveyed with openended questions to evaluate the precision of the 3D object created by the proposed photogrammetry pipeline.We analyzed the survey data using descriptive-analytical methods and found that the proposed pipeline produces 3D models that are positively accurate when compared to real-world objects,with an average mean score above 8.This study adds new knowledge in creating 3D objects for augmented reality applications by using the photogrammetry technique;finally,it discusses potential problems and future research directions for 3D objects in the education sector.展开更多
Retinal blood vessel segmentation is crucial for diagnosing ocular and cardiovascular diseases.Although the introduction of U-Net in 2015 by Olaf Ronneberger significantly advanced this field,yet issues like limited t...Retinal blood vessel segmentation is crucial for diagnosing ocular and cardiovascular diseases.Although the introduction of U-Net in 2015 by Olaf Ronneberger significantly advanced this field,yet issues like limited training data,imbalance data distribution,and inadequate feature extraction persist,hindering both the segmentation performance and optimal model generalization.Addressing these critical issues,the DEFFA-Unet is proposed featuring an additional encoder to process domain-invariant pre-processed inputs,thereby improving both richer feature encoding and enhanced model generalization.A feature filtering fusion module is developed to ensure the precise feature filtering and robust hybrid feature fusion.In response to the task-specific need for higher precision where false positives are very costly,traditional skip connections are replaced with the attention-guided feature reconstructing fusion module.Additionally,innovative data augmentation and balancing methods are proposed to counter data scarcity and distribution imbalance,further boosting the robustness and generalization of the model.With a comprehensive suite of evaluation metrics,extensive validations on four benchmark datasets(DRIVE,CHASEDB1,STARE,and HRF)and an SLO dataset(IOSTAR),demonstrate the proposed method’s superiority over both baseline and state-of-the-art models.Particularly the proposed method significantly outperforms the compared methods in cross-validation model generalization.展开更多
基金supported by National Natural Science Foundation of China:Space-based occultation detection with ground-based GNSS atmospheric horizontal gradient model(41904033).
文摘The satellite-based augmentation system(SBAS)provides differential and integrity augmentation services for life safety fields of aviation and navigation.However,the signal structure of SBAS is public,which incurs a risk of spoofing attacks.To improve the anti-spoofing capability of the SBAS,European Union and the United States conduct research on navigation message authentication,and promote the standardization of SBAS message authentication.For the development of Beidou satellite-based augmentation system(BDSBAS),this paper proposes navigation message authentication based on the Chinese commercial cryptographic standards.Firstly,this paper expounds the architecture and principles of the SBAS message authentication,and then carries out the design of timed efficient streaming losstolerant authentication scheme(TESLA)and elliptic curve digital signature algorithm(ECDSA)authentication schemes based on Chinese commercial cryptographic standards,message arrangement and the design of over-the-air rekeying(OTAR)message.Finally,this paper conducts a theoretical analysis of the time between authentications(TBA)and maximum authentication latency(MAL)for L5 TESLA-I and L5 ECDSA-Q,and further simulates the reception time of OTAR message,TBA and MAL from the aspects of OTAR message weight and demodulation error rate.The simulation results can provide theoretical supports for the standardization of BDSBAS message authentication.
基金supported by the project“Romanian Hub for Artificial Intelligence-HRIA”,Smart Growth,Digitization and Financial Instruments Program,2021–2027,MySMIS No.334906.
文摘Objective expertise evaluation of individuals,as a prerequisite stage for team formation,has been a long-term desideratum in large software development companies.With the rapid advancements in machine learning methods,based on reliable existing data stored in project management tools’datasets,automating this evaluation process becomes a natural step forward.In this context,our approach focuses on quantifying software developer expertise by using metadata from the task-tracking systems.For this,we mathematically formalize two categories of expertise:technology-specific expertise,which denotes the skills required for a particular technology,and general expertise,which encapsulates overall knowledge in the software industry.Afterward,we automatically classify the zones of expertise associated with each task a developer has worked on using Bidirectional Encoder Representations from Transformers(BERT)-like transformers to handle the unique characteristics of project tool datasets effectively.Finally,our method evaluates the proficiency of each software specialist across already completed projects from both technology-specific and general perspectives.The method was experimentally validated,yielding promising results.
基金partially supported by the National Natural Science Foundation of China(62075137)the Guangdong Basic and Applied Basic Research Foundation(2023A1515140161)+3 种基金the Guangxi Natural Science Foundation of China(2021JJB 110003)the Dongguan Science and Technology of Social Development Program(20231800936312)the high-level talent program of Dongguan University of Technology(No.221110080)the Sanming Project of Medicine in Shenzhen(No.SZSM202103014).
文摘Many spore-forming Bacillus species can cause serious human diseases,because of accidental Bacillusspore infection.Thus,developing an identification strategy with both high sensitivity and specificity is greatly in demand.In this work,we proposed a novel approach named multi-head self-attention mechanism-guided neural network Raman platform to identify living Bacillus spores within a single-cell resolution.The multi-head self-attention mechanism-guided neural network Raman platform was created by combining single-cell Raman spectroscopy,convolutional neural network(CNN),and multi-head self-attention mechanism.To address the limited size of the original spectra dataset,Gaussian noise-based spectra augmentation was employed to increase the number of single-cell Raman spectra datasets for CNN training.Owing to the assistance of both spectra augmentation and multi-head self-attention mechanism,the obtained prediction accuracy of five Bacillus spore species was further improved from 92.29±0.82%to 99.43±0.15%.To figure out the spectra differences covered by the multi-head self-attention mechanism-guided CNN,the relative classification weight from typical Raman bands was visualized via multi-head self-attention mechanism curve.In the process of spectra augmentation from 0 to 1000,the distribution of relative classification weight varied from a discrete state to a more concentrated phase.More importantly,these highlighted four Raman bands(1017,1449,1576,and 1660 cm^(-1))were assigned large weights,showing that the spectra differences in the Raman bands produced the largest contribution to prediction accuracy.It can be foreseen that,our proposed sorting platform has great potential in accurately identifying Bacillus and its related genera species at a single-cell level.
基金supported in part by the National Natural Science Foundation of China(No.62325109,U21B2013).
文摘This paper focuses on self-supervised video representation learning.Most existing approaches follow the contrastive learning pipeline to construct positive and negative pairs by sampling different clips.However,this formulation tends to bias the static background and has difficulty establishing global temporal structures.The major reason is that the positive pairs,i.e.,different clips sampled from the same video,have limited temporal receptive fields,and usually share similar backgrounds but differ in motions.To address these problems,we propose a framework to jointly utilize local clips and global videos to learn from detailed region-level correspondence as well as general long-term temporal relations.Based on a set of designed controllable augmentations,we implement accurate appearance and motion pattern alignment through soft spatio-temporal region contrast.Our formulation avoids the low-level redundancy shortcut with an adversarial mutual information minimization objective to improve the generalization ability.Moreover,we introduce local-global temporal order dependency to further bridge the gap between clip-level and video-level representations for robust temporal modeling.Extensive experiments demonstrate that our framework is superior on three video benchmarks in action recognition and video retrieval,and captures more accurate temporal dynamics.
文摘Lung cancer continues to be a leading cause of cancer-related deaths worldwide,emphasizing the critical need for improved diagnostic techniques.Early detection of lung tumors significantly increases the chances of successful treatment and survival.However,current diagnostic methods often fail to detect tumors at an early stage or to accurately pinpoint their location within the lung tissue.Single-model deep learning technologies for lung cancer detection,while beneficial,cannot capture the full range of features present in medical imaging data,leading to incomplete or inaccurate detection.Furthermore,it may not be robust enough to handle the wide variability in medical images due to different imaging conditions,patient anatomy,and tumor characteristics.To overcome these disadvantages,dual-model or multi-model approaches can be employed.This research focuses on enhancing the detection of lung cancer by utilizing a combination of two learning models:a Convolutional Neural Network(CNN)for categorization and the You Only Look Once(YOLOv8)architecture for real-time identification and pinpointing of tumors.CNNs automatically learn to extract hierarchical features from raw image data,capturing patterns such as edges,textures,and complex structures that are crucial for identifying lung cancer.YOLOv8 incorporates multiscale feature extraction,enabling the detection of tumors of varying sizes and scales within a single image.This is particularly beneficial for identifying small or irregularly shaped tumors that may be challenging to detect.Furthermore,through the utilization of cutting-edge data augmentation methods,such as Deep Convolutional Generative Adversarial Networks(DCGAN),the suggested approach can handle the issue of limited data and boost the models’ability to learn from diverse and comprehensive datasets.The combined method not only improved accuracy and localization but also ensured efficient real-time processing,which is crucial for practical clinical applications.The CNN achieved an accuracy of 97.67%in classifying lung tissues into healthy and cancerous categories.The YOLOv8 model achieved an Intersection over Union(IoU)score of 0.85 for tumor localization,reflecting high precision in detecting and marking tumor boundaries within the images.Finally,the incorporation of synthetic images generated by DCGAN led to a 10%improvement in both the CNN classification accuracy and YOLOv8 detection performance.
基金National Natural Science Foundation of China(62171305,62405206,62004135,62001317,62111530301)Natural Science Foundation of Jiangsu Province(BK20240778,BK20241917)+3 种基金State Key Laboratory of Advanced Optical Communication Systems and Networks,China(2023GZKF08)China Postdoctoral Science Foundation(2024M752314)Postdoctoral Fellowship Program of CPSF(GZC20231883)Innovative and Entrepreneurial Talent Program of Jiangsu Province(JSSCRC2021527).
文摘Photonic platforms are gradually emerging as a promising option to encounter the ever-growing demand for artificial intelligence,among which photonic time-delay reservoir computing(TDRC)is widely anticipated.While such a computing paradigm can only employ a single photonic device as the nonlinear node for data processing,the performance highly relies on the fading memory provided by the delay feedback loop(FL),which sets a restriction on the extensibility of physical implementation,especially for highly integrated chips.Here,we present a simplified photonic scheme for more flexible parameter configurations leveraging the designed quasi-convolution coding(QC),which completely gets rid of the dependence on FL.Unlike delay-based TDRC,encoded data in QC-based RC(QRC)enables temporal feature extraction,facilitating augmented memory capabilities.Thus,our proposed QRC is enabled to deal with time-related tasks or sequential data without the implementation of FL.Furthermore,we can implement this hardware with a low-power,easily integrable vertical-cavity surface-emitting laser for high-performance parallel processing.We illustrate the concept validation through simulation and experimental comparison of QRC and TDRC,wherein the simpler-structured QRC outperforms across various benchmark tasks.Our results may underscore an auspicious solution for the hardware implementation of deep neural networks.
文摘Objective The study of medicine formulas is a core component of traditional Chinese medicine(TCM),yet traditional learning methods often lack interactivity and contextual understanding,making it challenging for beginners to grasp the intricate composition rules of formulas.To address this gap,we introduce Formula-S,a situated visualization method for TCM formula learning in augmented reality(AR)and evaluate its performance.This study aims to evaluate the effectiveness of Formula-S in enhancing TCM formula learning for beginners by comparing it with traditional text-based formula learning and web-based visualization.Methods Formula-S is an interactive AR tool designed for TCM formula learning,featuring three modes(3D,Web,and Table).The dataset included TCM formulas and herb properties extracted from authoritative references,including textbook and the SymMap database.In Formula-S,the hierarchical visualization of the formulas as herbal medicine compositions,is linked to the multidimensional herb attribute visualization and embedded in the real world,where real herb samples are presented.To evaluate its effectiveness,a controlled study(n=30)was conducted.Participants who had no formal TCM knowledge were tasked with herbal medicine identification,formula composition,and recognition.In the study,participants interacted with the AR tool through HoloLens 2.Data were collected on both task performance(accuracy and response time)and user experience,with a focus on task efficiency,accuracy,and user preference across the different learning modes.Results The situated visualization method of Formula-S had comparable accuracy to other methods but shorter response time for herbal formula learning tasks.Regarding user experience,our new approach demonstrated the highest system usability and lowest task load,effectively reducing cognitive load and allowing users to complete tasks with greater ease and efficiency.Participants reported that Formula-S enhanced their learning experience through its intuitive interface and immersive AR environment,suggesting this approach offers usability advantages for TCM education.Conclusions The situated visualization method in Formula-S offers more efficient and accurate searching capabilities compared to traditional and web-based methods.Additionally,it provides superior contextual understanding of TCM formulas,making it a promising new solution for TCM learning.
文摘The integration of image analysis through deep learning(DL)into rock classification represents a significant leap forward in geological research.While traditional methods remain invaluable for their expertise and historical context,DL offers a powerful complement by enhancing the speed,objectivity,and precision of the classification process.This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks(CNNs)for geological image analysis,particularly in the classification of igneous,metamorphic,and sedimentary rock types from rock thin section(RTS)images.This study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision.Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities,achieving an F1-Score of 0.9869 for igneous rocks,0.9884 for metamorphic rocks,and 0.9929 for sedimentary rocks,representing improvements compared to the baseline original results.Moreover,the weighted average F1-Score across all classes and techniques is 0.9886,indicating an enhancement.Conversely,methods like Distort lead to decreased accuracy and F1-Score,with an F1-Score of 0.949 for igneous rocks,0.954 for metamorphic rocks,and 0.9416 for sedimentary rocks,exacerbating the performance compared to the baseline.The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results.The findings of this study can benefit various fields,including remote sensing,mineral exploration,and environmental monitoring,by enhancing the accuracy of geological image analysis both for scientific research and industrial applications.
文摘Breast cancer is one of the major causes of deaths in women.However,the early diagnosis is important for screening and control the mortality rate.Thus for the diagnosis of breast cancer at the early stage,a computer-aided diagnosis system is highly required.Ultrasound is an important examination technique for breast cancer diagnosis due to its low cost.Recently,many learning-based techniques have been introduced to classify breast cancer using breast ultrasound imaging dataset(BUSI)datasets;however,the manual handling is not an easy process and time consuming.The authors propose an EfficientNet-integrated ResNet deep network and XAI-based framework for accurately classifying breast cancer(malignant and benign).In the initial step,data augmentation is performed to increase the number of training samples.For this purpose,three-pixel flip mathematical equations are introduced:horizontal,vertical,and 90°.Later,two pretrained deep learning models were employed,skipped some layers,and fine-tuned.Both fine-tuned models are later trained using a deep transfer learning process and extracted features from the deeper layer.Explainable artificial intelligence-based analysed the performance of trained models.After that,a new feature selection technique is proposed based on the cuckoo search algorithm called cuckoo search controlled standard error mean.This technique selects the best features and fuses using a new parallel zeropadding maximum correlated coefficient features.In the end,the selection algorithm is applied again to the fused feature vector and classified using machine learning algorithms.The experimental process of the proposed framework is conducted on a publicly available BUSI and obtained 98.4%and 98%accuracy in two different experiments.Comparing the proposed framework is also conducted with recent techniques and shows improved accuracy.In addition,the proposed framework was executed less than the original deep learning models.
基金Postgraduate Innovation Top notch Talent Training Project of Hunan Province,Grant/Award Number:CX20220045Scientific Research Project of National University of Defense Technology,Grant/Award Number:22-ZZCX-07+2 种基金New Era Education Quality Project of Anhui Province,Grant/Award Number:2023cxcysj194National Natural Science Foundation of China,Grant/Award Numbers:62201597,62205372,1210456foundation of Hefei Comprehensive National Science Center,Grant/Award Number:KY23C502。
文摘Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks.Due to the diversity and robustness constraints of the data,data augmentation(DA)methods are utilised to expand dataset diversity and scale.However,due to the complex and distinct characteristics of LiDAR point cloud data from different platforms(such as missile-borne and vehicular LiDAR data),directly applying traditional 2D visual domain DA methods to 3D data can lead to networks trained using this approach not robustly achieving the corresponding tasks.To address this issue,the present study explores DA for missile-borne LiDAR point cloud using a Monte Carlo(MC)simulation method that closely resembles practical application.Firstly,the model of multi-sensor imaging system is established,taking into account the joint errors arising from the platform itself and the relative motion during the imaging process.A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is proposed,underpinned by an analysis of combined errors between different modal sensors,achieving high-quality augmentation of point cloud data.The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper.Comparative experiments between the proposed point cloud DA algorithm and the current state-of-the-art algorithms in point cloud detection and single object tracking tasks demonstrate that the proposed method can improve the network performance obtained from unaugmented datasets by over 17.3%and 17.9%,surpassing SOTA performance of current point cloud DA algorithms.
文摘Solar cell defects exhibit significant variations and multiple types,with some defect data being difficult to acquire or having small scales,posing challenges in terms of small sample and small target in defect detection for solar cells.In order to address this issue,this paper proposes a multi-step approach for detecting the complex defects of solar cells.First,individual cell plates are extracted from electroluminescence images for block-by-block detection.Then,StyleGAN2-Ada is utilized for generative adversarial networks data augmentation to expand the number of defect samples in small sample defects.Finally,the fake dataset is combined with real dataset,and the improved YOLOv5 model is trained on this mixed dataset.Experimental results demonstrate that the proposed method achieves a superior performance in detecting the defects with small sample and small target,with the final recall rate reaching 99.7%,an increase of 3.9% compared with the unimproved model.Additionally,the precision and mean average precision are increased by 3.4% and 3.5%,respectively.Moreover,the experiments demonstrate that the improved network training on the mixed dataset can effectively enhance the detection performance of the model.The combination of these approaches significantly improves the network’s ability to detect solar cell defects.
文摘Phantom limb pain(PLP)is not only a physical pain experience but also poses a significant challenge to mental health and quality of life.Currently,the mechanism of PLP treatment is still unclear,and there are many methods with varying effects.This article starts with the application research of extended reality technology in PLP treatment,through describing the application of its branch technologies(virtual reality,augmented reality,and mixed reality technology),to lay the foundation for subsequent research,in the hope of finding advanced and effective treatment methods,and providing a basis for future product transformation.
基金Natural Science Foundation of Zhejiang Province,Grant/Award Number:LY23F020025Science and Technology Commissioner Program of Huzhou,Grant/Award Number:2023GZ42Sichuan Provincial Science and Technology Support Program,Grant/Award Numbers:2023ZHCG0005,2023ZHCG0008。
文摘Data augmentation plays an important role in training deep neural model by expanding the size and diversity of the dataset.Initially,data augmentation mainly involved some simple transformations of images.Later,in order to increase the diversity and complexity of data,more advanced methods appeared and evolved to sophisticated generative models.However,these methods required a mass of computation of training or searching.In this paper,a novel training-free method that utilises the Pre-Trained Segment Anything Model(SAM)model as a data augmentation tool(PTSAM-DA)is proposed to generate the augmented annotations for images.Without the need for training,it obtains prompt boxes from the original annotations and then feeds the boxes to the pre-trained SAM to generate diverse and improved annotations.In this way,annotations are augmented more ingenious than simple manipulations without incurring huge computation for training a data augmentation model.Multiple comparative experiments on three datasets are conducted,including an in-house dataset,ADE20K and COCO2017.On this in-house dataset,namely Agricultural Plot Segmentation Dataset,maximum improvements of 3.77%and 8.92%are gained in two mainstream metrics,mIoU and mAcc,respectively.Consequently,large vision models like SAM are proven to be promising not only in image segmentation but also in data augmentation.
基金funded by the Bavarian State Ministry of Science,Research and Art(Grant number:H.2-F1116.WE/52/2)。
文摘In order to address the widespread data shortage problem in battery research,this paper proposes a generative adversarial network model that combines it with deep convolutional networks,the Wasserstein distance,and the gradient penalty to achieve data augmentation.To lower the threshold for implementing the proposed method,transfer learning is further introduced.The W-DC-GAN-GP-TL framework is thereby formed.This framework is evaluated on 3 different publicly available datasets to judge the quality of generated data.Through visual comparisons and the examination of two visualization methods(probability density function(PDF)and principal component analysis(PCA)),it is demonstrated that the generated data is hard to distinguish from the real data.The application of generated data for training a battery state model using transfer learning is further evaluated.Specifically,Bi-GRU-based and Transformer-based methods are implemented on 2 separate datasets for estimating state of health(SOH)and state of charge(SOC),respectively.The results indicate that the proposed framework demonstrates satisfactory performance in different scenarios:for the data replacement scenario,where real data are removed and replaced with generated data,the state estimator accuracy decreases only slightly;for the data enhancement scenario,the estimator accuracy is further improved.The estimation accuracy of SOH and SOC is as low as 0.69%and 0.58%root mean square error(RMSE)after applying the proposed framework.This framework provides a reliable method for enriching battery measurement data.It is a generalized framework capable of generating a variety of time series data.
基金supported by the Natural Science Foundation of Sichuan Province(grant number:25NSFSC0265).
文摘Peri-implant keratinized mucosa(PIKM)augmentation refers to surgical procedures aimed at increasing the width of PIKM.Consensus reports emphasize the necessity of maintaining a minimum width of PIKM to ensure long-term peri-implant health.Currently,several surgical techniques have been validated for their effectiveness in increasing PIKM.However,the selection and application of PIKM augmentation methods may present challenges for dental practitioners due to heterogeneity in surgical techniques,variations in clinical scenarios,and anatomical differences.Therefore,clear guidelines and considerations for PIKM augmentation are needed.This expert consensus focuses on the commonly employed surgical techniques for PIKM augmentation and the factors influencing their selection at second-stage surgery.It aims to establish a standardized framework for assessing,planning,and executing PIKM augmentation procedures,with the goal of offering evidence-based guidance to enhance the predictability and success of PIKM augmentation.
基金supported by the National Natural Science Foundation of China(Nos.62276204 and 62203343)the Fundamental Research Funds for the Central Universities(No.YJSJ24011)+1 种基金the Natural Science Basic Research Program of Shanxi,China(Nos.2022JM-340 and 2023-JC-QN-0710)the China Postdoctoral Science Foundation(Nos.2020T130494 and 2018M633470).
文摘Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network;HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
基金supported by the Natural Science Foundation of China(No.41804112,author:Chengyun Song).
文摘Existing semi-supervisedmedical image segmentation algorithms use copy-paste data augmentation to correct the labeled-unlabeled data distribution mismatch.However,current copy-paste methods have three limitations:(1)training the model solely with copy-paste mixed pictures from labeled and unlabeled input loses a lot of labeled information;(2)low-quality pseudo-labels can cause confirmation bias in pseudo-supervised learning on unlabeled data;(3)the segmentation performance in low-contrast and local regions is less than optimal.We design a Stochastic Augmentation-Based Dual-Teaching Auxiliary Training Strategy(SADT),which enhances feature diversity and learns high-quality features to overcome these problems.To be more precise,SADT trains the Student Network by using pseudo-label-based training from Teacher Network 1 and supervised learning with labeled data,which prevents the loss of rare labeled data.We introduce a bi-directional copy-pastemask with progressive high-entropy filtering to reduce data distribution disparities and mitigate confirmation bias in pseudo-supervision.For the mixed images,Deep-Shallow Spatial Contrastive Learning(DSSCL)is proposed in the feature spaces of Teacher Network 2 and the Student Network to improve the segmentation capabilities in low-contrast and local areas.In this procedure,the features retrieved by the Student Network are subjected to a random feature perturbation technique.On two openly available datasets,extensive trials show that our proposed SADT performs much better than the state-ofthe-art semi-supervised medical segmentation techniques.Using only 10%of the labeled data for training,SADT was able to acquire a Dice score of 90.10%on the ACDC(Automatic Cardiac Diagnosis Challenge)dataset.
文摘Augmented reality(AR)is an emerging dynamic technology that effectively supports education across different levels.The increased use of mobile devices has an even greater impact.As the demand for AR applications in education continues to increase,educators actively seek innovative and immersive methods to engage students in learning.However,exploring these possibilities also entails identifying and overcoming existing barriers to optimal educational integration.Concurrently,this surge in demand has prompted the identification of specific barriers,one of which is three-dimensional(3D)modeling.Creating 3D objects for augmented reality education applications can be challenging and time-consuming for the educators.To address this,we have developed a pipeline that creates realistic 3D objects from the two-dimensional(2D)photograph.Applications for augmented and virtual reality can then utilize these created 3D objects.We evaluated the proposed pipeline based on the usability of the 3D object and performance metrics.Quantitatively,with 117 respondents,the co-creation team was surveyed with openended questions to evaluate the precision of the 3D object created by the proposed photogrammetry pipeline.We analyzed the survey data using descriptive-analytical methods and found that the proposed pipeline produces 3D models that are positively accurate when compared to real-world objects,with an average mean score above 8.This study adds new knowledge in creating 3D objects for augmented reality applications by using the photogrammetry technique;finally,it discusses potential problems and future research directions for 3D objects in the education sector.
文摘Retinal blood vessel segmentation is crucial for diagnosing ocular and cardiovascular diseases.Although the introduction of U-Net in 2015 by Olaf Ronneberger significantly advanced this field,yet issues like limited training data,imbalance data distribution,and inadequate feature extraction persist,hindering both the segmentation performance and optimal model generalization.Addressing these critical issues,the DEFFA-Unet is proposed featuring an additional encoder to process domain-invariant pre-processed inputs,thereby improving both richer feature encoding and enhanced model generalization.A feature filtering fusion module is developed to ensure the precise feature filtering and robust hybrid feature fusion.In response to the task-specific need for higher precision where false positives are very costly,traditional skip connections are replaced with the attention-guided feature reconstructing fusion module.Additionally,innovative data augmentation and balancing methods are proposed to counter data scarcity and distribution imbalance,further boosting the robustness and generalization of the model.With a comprehensive suite of evaluation metrics,extensive validations on four benchmark datasets(DRIVE,CHASEDB1,STARE,and HRF)and an SLO dataset(IOSTAR),demonstrate the proposed method’s superiority over both baseline and state-of-the-art models.Particularly the proposed method significantly outperforms the compared methods in cross-validation model generalization.