Lead-free double perovskites have gained recognition as leading luminescent materials owing to their environmental friendliness, high chemical stability, structural adjustability, and excellent photoelectric properties. However, the poor modulation of their emission restricts their applications, and it is highly desirable to explore stable and efficient double perovskites with multimode luminescence and adjustable spectra for multifunctional photoelectric applications. Herein, rare earth ion Ln^(3+) (Er^(3+) and Ho^(3+))-doped Cs_(2)NaYCl_(6):Sb^(3+) crystals were synthesized by a simple solvothermal route. X-ray diffraction (XRD) patterns, energy-dispersive spectroscopy (EDS), X-ray photoelectron spectroscopy (XPS), and elemental mapping images demonstrate that the Sb^(3+), Er^(3+), and Ho^(3+) ions were homogeneously incorporated into the Cs_(2)NaYCl_(6) crystals. As anticipated, the emission spectra of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) are composed of two bands: a broad blue band derives from self-trapped excitons (STEs) in [SbCl_(6)]^(3-) octahedra, while a group of emission peaks stems from the f-f transitions of Ln^(3+) ions. The emission colors of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors can be tuned over a wide range by modulating the doping concentrations of Ln^(3+) ions. Efficient energy transfer from the STE to Ln^(3+) is the key to achieving efficient and tunable emission in Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) samples. Interestingly, besides down-conversion luminescence, Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) also exhibits the characteristic up-conversion luminescence of Ln^(3+) under near-infrared (NIR) excitation, suggesting potential applicability in multimode anti-counterfeiting and information encryption. Furthermore, light-emitting diodes (LEDs) assembled from Cs_(2)NaYCl_(6):Sb^(3+) and Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors display bright blue, green, and red emissions under a forward bias current, which indicates that the as-obtained double perovskite materials may have great potential in solid-state lighting and optoelectronic devices.
This work introduces special states for light in multimode fibers featuring strongly enhanced or reduced correlations between output fields in the presence of environmental temperature fluctuations. Using an experimentally measured multi-temperature transmission matrix, a set of temperature principal modes that are resilient to disturbances caused by temperature fluctuations can be generated. Reversing this concept also allows the construction of temperature anti-principal modes, whose output profiles are more susceptible to temperature influences than the unmodulated wavefront. Despite changes in the length of the multimode fiber within the temperature-fluctuating region, the proposed approach remains capable of robustly controlling the temperature response within the fiber. To illustrate the practicality of the proposed special states, a learning-empowered fiber specklegram temperature sensor based on temperature anti-principal mode sensitization is proposed. This sensor outperforms traditional approaches in terms of resolution and accuracy. These novel states are anticipated to have wide-ranging applications in fiber communication, sensing, imaging, and spectroscopy, and to serve as a source of inspiration for the discovery of other novel states.
We propose and demonstrate an ultra-compact 1310/1550 nm wavelength multiplexer/demultiplexer assisted by a subwavelength grating (SWG) and designed with the particle swarm optimization (PSO) algorithm on the silicon-on-insulator (SOI) platform. The demultiplexing function for the 1310 nm and 1550 nm wavelengths is implemented through the self-imaging effect of a multimode interference (MMI) coupler. Three parallel SWG-based slots are then inserted into the MMI section so that the effective refractive index of the modes can be engineered and the beat length thereby adjusted. Importantly, these three SWG slots significantly reduce the length of the device, which is much shorter than that of traditional MMI-based wavelength demultiplexers. Finally, the PSO algorithm is used to optimize the equivalent refractive index and width of the SWG within a certain range to achieve the best performance of the wavelength demultiplexer. The device footprint is verified to be only 2 × 30.68 μm^(2), and 1 dB bandwidths larger than 120 nm are obtained at the 1310 nm and 1550 nm wavelengths. Meanwhile, the transmission spectrum shows that the insertion loss (IL) values are below 0.47 dB at both wavelengths, with extinction ratio (ER) values above 12.65 dB. This inverse design approach proves efficient in increasing bandwidth and reducing device length.
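The PSO step described in this abstract can be illustrated with a minimal, generic sketch. It is not the authors' implementation: the objective `toy_loss`, its target values (2.5 and 0.2), the parameter bounds, and all hyperparameters are hypothetical stand-ins, since the real figure of merit would come from an electromagnetic simulation of the device.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # per-particle best positions
    best_val = [objective(p) for p in pos]      # per-particle best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move and clamp to the allowed parameter range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_loss(params):
    """Hypothetical stand-in for a simulated device figure of merit."""
    equivalent_index, width_um = params
    return (equivalent_index - 2.5) ** 2 + (width_um - 0.2) ** 2
```

Calling `pso_minimize(toy_loss, [(1.5, 3.5), (0.05, 0.5)])` searches two bounded parameters, mirroring the abstract's optimization of equivalent refractive index and SWG width "in a certain range"; in practice each objective evaluation would be far more expensive than this toy function.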
To improve the integration of locomotion and operation, this paper presents an integrated leg-arm quadruped robot (ILQR) that has a reconfigurable joint. First, the reconfigurable joint is designed and assembled at the end of the leg-arm chain. When the robot performs a task, reconfiguration and mode switching can be achieved using this joint. In contrast to traditional quadruped robots, this robot can stack in a designated area to minimize its occupied volume in a nonworking state. Kinematic and dynamic models are established to evaluate the mechanical properties in multiple modes. The working modes of the robot are classified into deployable mode, locomotion mode, and operation mode. Based on the stability margin and mechanical modeling, switching between the modes is analyzed and evaluated. Finally, prototype experiments verify the multimode functions and switching stability of the robot, and provide a design method for integrating and performing multiple modes in quadruped robots with deployable characteristics.
Optical endoscopy has become an essential diagnostic and therapeutic approach in modern biomedicine for directly observing organs and tissues deep inside the human body, enabling non-invasive, rapid diagnosis and treatment. Optical fiber endoscopy is highly competitive among various endoscopic imaging techniques due to its high flexibility, compact structure, excellent resolution, and resistance to electromagnetic interference. Over the past decade, endoscopes based on a single multimode optical fiber (MMF) have attracted widespread research interest due to their potential to significantly reduce the footprint of optical fiber endoscopes and enhance imaging capabilities. In comparison with other imaging principles of MMF endoscopes, the scanning imaging method based on the wavefront shaping technique is highly developed and provides benefits including excellent imaging contrast, broad applicability to complex imaging scenarios, and good compatibility with various well-established scanning imaging modalities. In this review, various technical routes to achieve light focusing through MMF and procedures to conduct the scanning imaging of MMF endoscopes are introduced. The advancements in imaging performance enhancements, integrations of various imaging modalities with MMF scanning endoscopes, and applications are summarized. Challenges specific to this endoscopic imaging technology are analyzed, and potential remedies and avenues for future developments are discussed.
Multimodal sensor fusion can make full use of the advantages of various sensors, compensate for the shortcomings of a single sensor, achieve information verification or information security through information redundancy, and improve the reliability and safety of the system. Artificial intelligence (AI), referring to the simulation of human intelligence in machines that are programmed to think and learn like humans, represents a pivotal frontier in modern scientific research. With the continuous development and promotion of AI technology in the Sensor 4.0 age, multimodal sensor fusion is becoming more intelligent and automated, and is expected to go further in the future. In this context, this review article takes a comprehensive look at recent progress on AI-enhanced multimodal sensors and their integrated devices and systems. Based on the concepts and principles of sensor technologies and AI algorithms, the theoretical underpinnings, technological breakthroughs, and practical applications of AI-enhanced multimodal sensors in fields such as robotics, healthcare, and environmental monitoring are highlighted. A comparative study of dual/tri-modal sensors with and without AI technologies (especially machine learning and deep learning) highlights the potential of AI to improve sensor performance, data processing, and decision-making capabilities. Furthermore, the review analyzes the challenges and opportunities afforded by AI-enhanced multimodal sensors and offers a prospective outlook on forthcoming advancements.
Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm, called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to fuse the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, an encoder structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net) is employed to extract the corresponding spatial convective characteristics from satellite, radar, and lightning observations. Finally, using a multimodal deep fusion module based on multi-head cross-attention, the temporal features of wind speed at each automatic weather station are incorporated into the spatial features to obtain a classification of thunderstorm wind gusts every 10 minutes. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides grid products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
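For readers unfamiliar with the verification scores quoted above, the sketch below computes the critical success index and a false alarm score from contingency-table counts. The abstract does not define its "false alarm rate"; the code assumes the false alarm ratio commonly used in forecast verification, and the counts in the usage note are invented for illustration.

```python
def critical_success_index(hits, misses, false_alarms):
    """CSI = hits / (hits + misses + false alarms)."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def false_alarm_ratio(hits, false_alarms):
    """FAR = false alarms / (hits + false alarms).

    Assumption: this is the score the abstract calls "false alarm rate".
    """
    denom = hits + false_alarms
    return false_alarms / denom if denom else float("nan")
```

With hypothetical counts of 77 hits, 13 misses, and 10 false alarms, `critical_success_index(77, 13, 10)` gives 0.77, the same value as the CSI reported for TGNet, although the paper's actual counts are not given in the abstract.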
Multimodal deep learning has emerged as a key paradigm in contemporary medical diagnostics, advancing precision medicine by enabling integration of and learning from diverse data sources. The exponential growth of high-dimensional healthcare data, encompassing genomic, transcriptomic, and other omics profiles, as well as radiological imaging and histopathological slides, makes this approach increasingly important because, when examined separately, these data sources offer only a fragmented picture of intricate disease processes. Multimodal deep learning leverages the complementary properties of multiple data modalities to enable more accurate prognostic modeling, more robust disease characterization, and improved treatment decision-making. This review provides a comprehensive overview of the current state of multimodal deep learning approaches in medical diagnosis. We classify and examine important application domains, such as (1) radiology, where automated report generation and lesion detection are facilitated by image-text integration; (2) histopathology, where fusion models improve tumor classification and grading; and (3) multi-omics, where molecular subtypes and latent biomarkers are revealed through cross-modal learning. We provide an overview of representative research, methodological advancements, and clinical consequences for each domain. Additionally, we critically analyze the fundamental issues preventing wider adoption, including computational complexity (particularly in training scalable, multi-branch networks), data heterogeneity (resulting from modality-specific noise, resolution variations, and inconsistent annotations), and the challenge of maintaining significant cross-modal correlations during fusion. These problems impede not only performance and generalizability but also interpretability, which is crucial for clinical trust and use. Lastly, we outline important areas for future research, including the development of standardized protocols for harmonizing data, the creation of lightweight and interpretable fusion architectures, the integration of real-time clinical decision support systems, and the promotion of cooperation for federated multimodal learning. Our goal is to provide researchers and clinicians with a concise overview of the field's present state, enduring constraints, and promising directions for further research.
Sleep monitoring is an important part of health management because sleep quality is crucial for the restoration of human health. However, current commercial polysomnography products are cumbersome because of their connecting wires, and state-of-the-art flexible sensors still interfere with sleep because they must be attached to the body. Herein, we develop a flexible, integrated multimodal sensing patch based on hydrogel and demonstrate its application in unconstrained sleep monitoring. The patch comprises a bottom hydrogel-based dual-mode pressure–temperature sensing layer and a top electrospun nanofiber-based non-contact detection layer as one integrated device. The hydrogel core substrate exhibits high toughness and water retention, and multimodal sensing of temperature, pressure, and non-contact proximity is realized through different sensing mechanisms with no crosstalk interference. The multimodal sensing function is verified in a simulated real-world scenario in which a robotic hand grasps objects, validating its practicability. Multiple multimodal sensing patches integrated at different locations on a pillow are assembled for intelligent sleep monitoring. Versatile human–pillow interaction information and its evolution over time are acquired and analyzed by a one-dimensional convolutional neural network. Tracking of head movement and recognition of bad patterns that may lead to poor sleep are achieved, which provides a promising approach for sleep monitoring.
The detection of the state of polarization (SOP) of light is essential for many optical applications. However, cost-effective SOP measurement is a challenge due to the complexity of conventional methods and the poor transferability of new methods. We propose a straightforward, low-cost, and portable SOP measurement system based on multimode fiber speckle. A convolutional neural network is utilized to establish the mapping relationship between speckle and Stokes parameters. The lowest root-mean-square error of the estimated SOP on the Poincaré sphere is 0.0042. This method is distinguished by its low cost, clear structure, and applicability to different wavelengths with high precision. The proposed method is of great value in polarization-related applications.
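As background for the quantities named in this abstract, the sketch below computes Stokes parameters from a Jones vector and a root-mean-square error between estimated and reference Stokes vectors. It is illustrative only: the sign convention chosen for S3 and the exact error definition are assumptions, since the abstract specifies neither, and the paper's network would predict the Stokes parameters from speckle images rather than from field amplitudes.

```python
import math


def stokes_from_jones(ex, ey):
    """Stokes parameters (S0, S1, S2, S3) of a fully polarized field.

    Assumption: S3 = -2 * Im(Ex * conj(Ey)); other texts flip this sign.
    """
    ex, ey = complex(ex), complex(ey)
    cross = ex * ey.conjugate()
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    s1 = abs(ex) ** 2 - abs(ey) ** 2
    s2 = 2.0 * cross.real
    s3 = -2.0 * cross.imag
    return s0, s1, s2, s3


def stokes_rmse(estimated, reference):
    """RMSE over all components of paired (S1, S2, S3) vectors.

    Assumption: one plausible reading of an error "on the Poincare sphere".
    """
    squared = [(e - r) ** 2
               for est, ref in zip(estimated, reference)
               for e, r in zip(est, ref)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, horizontally polarized light, `stokes_from_jones(1, 0)`, maps to S0 = 1, S1 = 1 with S2 and S3 equal to zero, which is the point where the S1 axis meets the Poincaré sphere.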
Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in multimodal fine-grained sentiment analysis research that combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Most existing JMASA models encode text and image features only at a basic level and often neglect in-depth analysis of unimodal intrinsic features; this insufficient learning of intra-modal features may lead to low accuracy in aspect term extraction and poor sentiment prediction. To address this problem, we propose a Text-Image Feature Fine-grained Learning (TIFFL) model for JMASA. First, we construct an enhanced adjacency matrix of word dependencies and adopt a graph convolutional network to learn the syntactic structure features of the text, which addresses the problem of context interference when identifying different aspect terms. Then, adjective-noun pairs extracted from the image are introduced to make the semantic representation of visual features more intuitive, which addresses the problem of ambiguous semantic extraction during image feature learning. In this way, model performance on aspect term extraction and sentiment polarity prediction can be further improved. Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA, MATE, and MASC, validating the effectiveness of the proposed methods.
A hateful meme is a multimodal medium that combines images and text. The potential hate content of hateful memes has caused serious problems for social media security. The current hateful meme classification task faces significant data scarcity, and direct fine-tuning of large-scale pre-trained models often leads to severe overfitting. In addition, it is challenging to understand the underlying relationship between text and images in hateful memes. To address these issues, we propose a multimodal hateful meme classification model named LABF, which is based on low-rank adapter layers and bidirectional gated feature fusion. First, low-rank adapter layers are adopted to learn the feature representation of the new dataset. This is achieved by introducing a small number of additional parameters while retaining the prior knowledge of the CLIP model, which effectively alleviates overfitting. Second, a bidirectional gated feature fusion mechanism is designed to dynamically adjust the interaction weights of text and image features to achieve finer cross-modal fusion. Experimental results show that the method significantly outperforms existing methods on two public datasets, verifying its effectiveness and robustness.
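The claim that low-rank adapters add only "a small number of additional parameters" can be made concrete with a little arithmetic. The sketch assumes the common low-rank factorization in which a frozen weight matrix of shape d_out × d_in is augmented by the product of two trainable matrices of shapes d_out × r and r × d_in; the abstract does not state LABF's actual adapter design, layer sizes, or rank, so the numbers in the usage note are hypothetical.

```python
def low_rank_adapter_param_counts(d_out, d_in, rank):
    """Return (frozen_full_params, trainable_adapter_params) for one layer.

    Assumes an update of the form W + B @ A with B: d_out x rank and
    A: rank x d_in, which is an illustrative choice, not LABF's stated design.
    """
    full = d_out * d_in
    adapter = rank * (d_out + d_in)
    return full, adapter
```

For a hypothetical 768 × 768 projection with rank 8, `low_rank_adapter_param_counts(768, 768, 8)` returns `(589824, 12288)`, so the trainable adapter holds roughly 2% as many parameters as the frozen matrix. That small trainable footprint is the usual argument for why such adapters are less prone to overfitting on scarce data.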
Multimodal Aspect-Based Sentiment Analysis (MABSA) aims to detect sentiment polarity toward specific aspects by leveraging both textual and visual inputs. However, existing models suffer from weak aspect-image alignment, modality imbalance dominated by textual signals, and limited reasoning for implicit or ambiguous sentiments requiring external knowledge. To address these issues, we propose a unified framework named Gated-Linear Aspect-Aware Multimodal Sentiment Network (GLAMSNet). First, an input encoding module is employed to construct modality-specific and aspect-aware representations. Subsequently, we introduce an image–aspect correlation matching module to provide hierarchical supervision for visual-textual alignment. Building upon these components, we further design a Gated-Linear Aspect-Aware Fusion (GLAF) module to enhance aspect-aware representation learning by adaptively filtering irrelevant textual information and refining semantic alignment under aspect guidance. Additionally, an External Language Model Knowledge-Guided mechanism is integrated to incorporate sentiment-aware prior knowledge from GPT-4o, enabling robust semantic reasoning, especially under noisy or ambiguous inputs. Experiments on the Twitter-15 and Twitter-17 datasets demonstrate that the proposed model outperforms most state-of-the-art methods, achieving 79.36% accuracy and 74.72% F1-score, and 74.31% accuracy and 72.01% F1-score, respectively.
Gliomas are the most common malignant tumors in the central nervous system and are known for their inherent diversity and propensity to invade surrounding tissue. These features pose significant challenges in diagnosing and treating these tumors. Magnetic resonance imaging (MRI) has not only remained at the forefront of glioma management but has also evolved significantly with the advent of multimodal MRI. The rise of multimodal MRI represents a pivotal advance, as it integrates diverse MRI sequences and advanced techniques to offer a comprehensive, multidimensional view of the complexities of glioma pathology, encompassing structural, functional, and even molecular imaging. This holistic approach gives clinicians a deeper understanding of tumor characteristics, enabling more precise diagnoses, tailored treatment strategies, and enhanced monitoring capabilities, ultimately improving patient outcomes. Looking ahead, the integration of artificial intelligence (AI) with MRI data promises greater precision in glioma diagnosis and therapy, enabling more sophisticated analyses that fully leverage all aspects of multimodal MRI. In summary, with the continuous advancement of multimodal MRI techniques and future deep integration with AI, glioma care is poised to evolve toward increasingly personalized, precise, and efficacious strategies.
With the rapid growth of the Internet and social media, information is widely disseminated in multimodal forms, such as text and images, where discriminatory content can manifest in various ways. Discrimination detection techniques for multilingual and multimodal data can identify potential discriminatory behavior and help foster a more equitable and inclusive cyberspace. However, existing methods often struggle in complex contexts and multilingual environments. To address these challenges, this paper proposes an innovative detection method, using image and multilingual text encoders to separately extract features from different modalities. It continuously updates a historical feature memory bank, aggregates the Top-K most similar samples, and utilizes a Gated Recurrent Unit (GRU) to integrate current and historical features, generating enhanced feature representations with stronger semantic expressiveness to improve the model's ability to capture discriminatory signals. Experimental results demonstrate that the proposed method exhibits superior discriminative power and detection accuracy in multilingual and multimodal contexts, offering a reliable and effective solution for identifying discriminatory content.
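The memory-bank step in this abstract (keep historical features, retrieve the Top-K most similar ones, aggregate them) can be sketched in a few lines. Everything here is an assumption made for illustration: the class name `FeatureMemoryBank`, cosine similarity as the similarity measure, first-in-first-out eviction, and mean aggregation are not stated in the abstract, and the GRU that fuses current and historical features is deliberately left out.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FeatureMemoryBank:
    """Hypothetical fixed-capacity bank of past feature vectors."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest entry once capacity is reached.
        self.items = deque(maxlen=capacity)

    def add(self, feature):
        self.items.append(list(feature))

    def top_k_mean(self, query, k):
        """Average the k stored features most similar to `query`."""
        ranked = sorted(self.items,
                        key=lambda f: cosine_similarity(query, f),
                        reverse=True)[:k]
        if not ranked:
            return None
        dim = len(ranked[0])
        return [sum(f[d] for f in ranked) / len(ranked) for d in range(dim)]
```

In a full model, the vector returned by `top_k_mean` would be passed together with the current sample's feature to a recurrent unit; here it simply shows how historical context can be summarized for one query.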
In this study, we present a small, integrated jumping-crawling robot capable of intermittent jumping and self-resetting. Compared to robots with a single mode of locomotion, this multi-modal robot exhibits enhanced obstacle-surmounting capabilities. To achieve this, the robot employs a novel combination of a jumping module and a crawling module. The jumping module features improved energy storage capacity and an active clutch. Within the constraints of structural robustness, the jumping module maximizes the explosive power of the linear spring by utilizing the mechanical advantage of a closed-loop mechanism and controls the energy flow of the jumping module through an active clutch mechanism. Furthermore, inspired by the limb movements of tortoises during crawling and self-righting, a single-degree-of-freedom spatial four-bar crawling mechanism was designed to enable crawling, steering, and resetting functions. To demonstrate its practicality, the integrated jumping-crawling robot was tested in a laboratory environment for functions such as jumping, crawling, self-resetting, and steering. Experimental results confirmed the feasibility of the proposed integrated jumping-crawling robot.
With the popularization of social media, stickers have become an important tool for young students to express themselves and resist mainstream culture, owing to their unique visual and emotional expressiveness. Most existing studies focus on the negative impacts of spoof stickers while paying insufficient attention to their positive functions. From the perspective of multimodal metaphor, this paper uses methods such as virtual ethnography and image-text analysis to clarify the connotation of stickers, understand the evolution of their digital dissemination forms, and explore the multiple functions of subcultural stickers in social interactions between teachers and students. Young students use stickers to convey emotions and information. Their expressive function, social function, and cultural metaphor function build on one another progressively. This not only shapes students' values but also promotes self-expression and teacher-student interaction. It also reminds teachers that stickers can be used to correct students' negative thoughts, achieving the effect of "cultivating and influencing people through culture."
The human gastrointestinal (GI) tract is affected by numerous disorders. If not detected in the early stages, they may result in severe consequences such as organ failure or the development of cancer and, in extreme cases, become life-threatening. Endoscopy is a specialised imaging technique used to examine the GI tract. However, physicians might overlook certain irregular morphologies during the examination because of the continuous monitoring of the video recording. Recent advancements in artificial intelligence have led to the development of high-performance AI-based systems, which are well suited to computer-assisted diagnosis. Numerous limitations in endoscopic image analysis, including visual similarities between infected and healthy areas, retrieval of irrelevant features, and imbalanced testing and training datasets, reduce performance accuracy. To address these challenges, we propose a framework for analysing gastrointestinal tract images that provides a more robust and secure model, thereby reducing the chances of misclassification. Compared to single-model solutions, the proposed methodology improves performance by integrating diverse models and optimizing feature fusion using a dual-branch CNN-transformer architecture. The proposed approach employs a dual-branch feature extraction mechanism in which features are extracted using Extended BEiT in the first branch and EfficientNet-B5 in the second branch. Additionally, cross-entropy loss is used to measure the prediction error at both branches, followed by model stacking. This multimodal framework outperforms existing approaches across multiple metrics, achieving 94.12% accuracy, recall, and F1-score, as well as 94.15% precision on the Kvasir dataset. Furthermore, the model reduced the false negative rate to 5.88%, enhancing its ability to minimize misdiagnosis. These results highlight the adaptability of the proposed work to clinical practice, where it can provide fast and accurate diagnostic assistance crucial for improving the early diagnosis of diseases in the gastrointestinal tract.
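The reported false negative rate follows directly from the reported recall, since the false negative rate is the complement of recall. The helper below is a worked check of that arithmetic only; the function name is ours.

```python
def false_negative_rate(recall):
    """FNR = FN / (FN + TP) = 1 - recall, with recall given as a fraction."""
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be a fraction between 0 and 1")
    return 1.0 - recall
```

With the abstract's recall of 94.12%, `false_negative_rate(0.9412)` evaluates to about 0.0588, matching the stated 5.88%.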
Background: Irregular heartbeats can have serious health implications if left undetected and untreated for an extended period of time. Methods: This study leverages machine learning (ML) techniques to classify electrocardiogram (ECG) heartbeats, comparing traditional feature-based ML methods with innovative image-based approaches. The dataset underwent rigorous preprocessing, including down-sampling, frequency filtering, beat segmentation, and normalization. Two methodologies were explored: (1) handcrafted feature extraction, utilizing metrics like heart rate variability and RR distances with LightGBM classifiers, and (2) image transformation of ECG signals using Gramian Angular Field (GAF), Markov Transition Field (MTF), and Recurrence Plot (RP), enabling multimodal input for convolutional neural networks (CNNs). The Synthetic Minority Oversampling Technique (SMOTE) addressed data imbalance, significantly improving minority-class metrics. Results: The handcrafted feature approach achieved notable performance, with LightGBM excelling in precision and recall. Image-based classification further enhanced outcomes, with a custom Inception-based CNN attaining an 85% F1 score and 97% accuracy using combined GAF, MTF, and RP transformations. Statistical analyses confirmed the significance of these improvements. Conclusion: This work highlights the potential of ML for detecting cardiac irregularities, demonstrating that combining advanced preprocessing, feature engineering, and state-of-the-art neural networks can improve classification accuracy. These findings contribute to advancing AI-driven diagnostic tools, offering promising implications for cardiovascular healthcare.
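Of the three image transformations named in this abstract, the Gramian Angular Field is the simplest to show. The sketch below builds the summation variant (GASF) for a short series: rescale to [-1, 1], map each value to an angle with arccos, then fill the matrix with cos(phi_i + phi_j). Whether the study used the summation or difference variant, and how it rescaled the beats, is not stated, so both choices are assumptions.

```python
import math


def gramian_angular_summation_field(series):
    """Return the GASF matrix (list of lists) of a 1-D numeric series."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1]; clamp to guard against rounding overshoot.
    scaled = [min(1.0, max(-1.0, 2.0 * (x - lo) / (hi - lo) - 1.0))
              for x in series]
    # Polar encoding: each value becomes an angle in [0, pi].
    phi = [math.acos(s) for s in scaled]
    # Entry (i, j) is the cosine of the summed angles.
    return [[math.cos(a + b) for b in phi] for a in phi]
```

An N-sample heartbeat segment becomes an N × N image in which entry (i, j) encodes the relationship between samples i and j; images like this one can then be stacked with other transforms as channels for a CNN.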
In the context of digitalization,course resources exhibit multimodal characteristics,covering various forms such as text,images,and videos.Course knowledge and learning resources are becoming increasingly diverse,prov...In the context of digitalization,course resources exhibit multimodal characteristics,covering various forms such as text,images,and videos.Course knowledge and learning resources are becoming increasingly diverse,providing favorable conditions for students’in-depth and efficient learning.Against this backdrop,how to scientifically apply emerging technologies to automatically collect,process,and integrate digital learning resources such as voices,videos,and courseware texts,and better innovate the organization and presentation forms of course knowledge has become an important development direction for“artificial intelligence+education.”This article elaborates on the elements and characteristics of knowledge graphs,analyzes the construction steps of knowledge graphs,and explores the construction methods of multimodal course knowledge graphs from aspects such as dataset collection,course knowledge ontology identification,knowledge discovery,and association,providing references for the intelligent application of online open courses.展开更多
Funding: Project supported by the Natural Science Foundation of Hebei Province (B2023201108, B2024201076); Science Fund for Creative Research Groups of Natural Science Foundation of Hebei Province (B2021201038); 333 Talent Project Fund of Hebei Province (C20221015); National High-End Foreign Expert Recruitment Plan (G2022003007L); Hebei Province Higher Education Science and Technology Research Project (JZX2023001); Hebei Province Innovation Capability Enhancement Plan Project (22567632H).
Abstract: Lead-free double perovskites have gained recognition as top luminescent materials due to their environmental friendliness, high chemical stability, structural adjustability, and excellent photoelectric properties. However, the poor modulation of emission restricts their applications, and it is highly desirable to explore stable and efficient double perovskites with multimode luminescence and adjustable spectra for multifunctional photoelectric applications. Herein, rare earth ion Ln^(3+) (Er^(3+) and Ho^(3+))-doped Cs_(2)NaYCl_(6):Sb^(3+) crystals were synthesized by a simple solvothermal route. X-ray diffraction (XRD) patterns, energy-dispersive spectroscopy (EDS), X-ray photoelectron spectroscopy (XPS), and elemental mapping images demonstrate that the Sb^(3+), Er^(3+), and Ho^(3+) ions have been homogeneously incorporated into the Cs_(2)NaYCl_(6) crystals. As anticipated, the emission spectra of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) are composed of two bands. One broad blue band derives from self-trapped excitons (STEs) in [SbCl_(6)]^(3-) octahedra, while another group of emission peaks stems from the f-f transitions of Ln^(3+) ions. The emission colors of Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors can be tuned over a wide range by modulating the doping concentrations of Ln^(3+) ions. The efficient energy transfer from STEs to Ln^(3+) is the key to achieving efficient and tunable emission in Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) samples. Interestingly, Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) can also exhibit characteristic up-conversion luminescence of Ln^(3+) under near-infrared (NIR) excitation in addition to down-conversion luminescence, revealing that the materials may have potential applicability in multimode anti-counterfeiting and information encryption. Furthermore, light-emitting diodes (LEDs) assembled from Cs_(2)NaYCl_(6):Sb^(3+) and Cs_(2)NaYCl_(6):Sb^(3+)/Ln^(3+) phosphors display dazzling blue, green, and red emissions under a forward bias current, which indicates that the as-obtained double perovskite materials may have great potential in solid-state lighting and optoelectronic devices.
Funding: Financial support from the National Natural Science Foundation of China (62075132 and 92050202); Natural Science Foundation of Shanghai (22ZR1443100).
Abstract: This work introduces special states for light in multimode fibers featuring strongly enhanced or reduced correlations between output fields in the presence of environmental temperature fluctuations. Using an experimentally measured multi-temperature transmission matrix, a set of temperature principal modes that exhibit resilience to disturbances caused by temperature fluctuations can be generated. Reversing this concept also allows the construction of temperature anti-principal modes, with output profiles more susceptible to temperature influences than the unmodulated wavefront. Despite changes in the length of the multimode fiber within the temperature-fluctuating region, the proposed approach remains capable of robustly controlling the temperature response within the fiber. To illustrate the practicality of the proposed special states, a learning-empowered fiber specklegram temperature sensor based on temperature anti-principal mode sensitization is proposed. This sensor outperforms traditional approaches in terms of resolution and accuracy. These novel states are anticipated to have wide-ranging applications in fiber communication, sensing, imaging, and spectroscopy, and to serve as a source of inspiration for the discovery of other novel states.
Funding: Supported by the National Natural Science Foundation of China (No. 61505160); the Innovation Capability Support Program of Shaanxi (No. 2018KJXX-042); the Natural Science Basic Research Program of Shaanxi (No. 2019JM-084); the State Key Laboratory of Transient Optics and Photonics (No. SKLST202108); the Graduate Innovation and Practical Ability Training Project of Xi’an Shiyou University (No. YCS22213190).
Abstract: We propose and demonstrate an ultra-compact 1310/1550 nm wavelength multiplexer/demultiplexer assisted by subwavelength gratings (SWGs) and designed with the particle swarm optimization (PSO) algorithm on the silicon-on-insulator (SOI) platform. The demultiplexing function for the 1310 nm and 1550 nm wavelengths is implemented through the self-imaging effect of a multimode interference (MMI) coupler. Three parallel SWG-based slots are then inserted into the MMI section so that the effective refractive indices of the modes can be engineered and the beat length thereby adjusted. Importantly, these three SWG slots significantly reduce the length of the device, which is much shorter than traditional MMI-based wavelength demultiplexers. Finally, the PSO algorithm is used to optimize the equivalent refractive index and width of the SWG within a given range to achieve the best performance of the wavelength demultiplexer. The device footprint is verified to be only 2 × 30.68 μm^(2), and 1 dB bandwidths larger than 120 nm are obtained at both the 1310 nm and 1550 nm wavelengths. Meanwhile, the transmission spectrum shows that the insertion loss (IL) values are below 0.47 dB at both wavelengths, with extinction ratio (ER) values above 12.65 dB. This inverse design approach proves efficient in increasing bandwidth and reducing device length.
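The role of the beat length in the abstract above can be made concrete with the standard two-mode relation L_pi = lambda / (2 (n0 - n1)), where n0 and n1 are the effective indices of the two lowest-order modes. The sketch below is only an illustration: the function name and the effective-index values are hypothetical and are not taken from the paper.

```python
def beat_length(wavelength_um, n_eff0, n_eff1):
    """Beat length L_pi = lambda / (2 * (n0 - n1)) of the two lowest-order modes.

    Lengths are in micrometres; n_eff0 and n_eff1 are effective indices.
    """
    delta_n = n_eff0 - n_eff1
    if delta_n <= 0:
        raise ValueError("expected n_eff0 > n_eff1")
    return wavelength_um / (2.0 * delta_n)


# Hypothetical effective indices, chosen for illustration only.
l_pi_1310 = beat_length(1.31, 2.90, 2.80)
l_pi_1550 = beat_length(1.55, 2.85, 2.75)
print(l_pi_1310, l_pi_1550)
```

Because L_pi scales inversely with the index difference, engineering the effective indices (as the SWG slots are described as doing) changes the beat length at each wavelength and hence the device length needed to separate them.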
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52375003, 52205006); National Key R&D Program of China (Grant No. 2019YFB1309600).
Abstract: To improve locomotion and operation integration, this paper presents an integrated leg-arm quadruped robot (ILQR) that has a reconfigurable joint. First, the reconfigurable joint is designed and assembled at the end of the leg-arm chain. When the robot performs a task, reconfiguration and mode switching can be achieved using this joint. In contrast to traditional quadruped robots, this robot can stack in a designated area to minimize the occupied volume in a nonworking state. Kinematic and dynamic models are established to evaluate the mechanical properties in multiple modes. All working modes of the robot are classified as deployable mode, locomotion mode, or operation mode. Based on the stability margin and mechanical modeling, switching between modes is analyzed and evaluated. Finally, prototype experiments verify the functions and switching stability of the multimode robot and provide a design method for integrating multiple modes in quadruped robots with deployable characteristics.
Funding: Supported by the National Natural Science Foundation of China (62135007 and 61925502).
Abstract: Optical endoscopy has become an essential diagnostic and therapeutic approach in modern biomedicine for directly observing organs and tissues deep inside the human body, enabling non-invasive, rapid diagnosis and treatment. Optical fiber endoscopy is highly competitive among various endoscopic imaging techniques due to its high flexibility, compact structure, excellent resolution, and resistance to electromagnetic interference. Over the past decade, endoscopes based on a single multimode optical fiber (MMF) have attracted widespread research interest due to their potential to significantly reduce the footprint of optical fiber endoscopes and enhance imaging capabilities. Compared with other imaging principles of MMF endoscopes, the scanning imaging method based on the wavefront shaping technique is highly developed and provides benefits including excellent imaging contrast, broad applicability to complex imaging scenarios, and good compatibility with various well-established scanning imaging modalities. In this review, various technical routes to achieve light focusing through MMF and procedures to conduct the scanning imaging of MMF endoscopes are introduced. Advances in imaging performance, integrations of various imaging modalities with MMF scanning endoscopes, and applications are summarized. Challenges specific to this endoscopic imaging technology are analyzed, and potential remedies and avenues for future developments are discussed.
Funding: Supported by the National Natural Science Foundation of China (No. 62404111); Natural Science Foundation of Jiangsu Province (No. BK20240635); Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 24KJB510025); Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications (No. NY223157 and NY223156); Opening Project of Advanced Integrated Circuit Package and Testing Research Center of Jiangsu Province (No. NTIKFJJ202303).
Abstract: Multimodal sensor fusion can make full use of the advantages of various sensors, make up for the shortcomings of a single sensor, achieve information verification or information security through information redundancy, and improve the reliability and safety of the system. Artificial intelligence (AI), referring to the simulation of human intelligence in machines that are programmed to think and learn like humans, represents a pivotal frontier in modern scientific research. With the continuous development and promotion of AI technology in the Sensor 4.0 age, multimodal sensor fusion is becoming more and more intelligent and automated, and is expected to go further in the future. In this context, this review article takes a comprehensive look at recent progress on AI-enhanced multimodal sensors and their integrated devices and systems. Based on the concepts and principles of sensor technologies and AI algorithms, the theoretical underpinnings, technological breakthroughs, and practical applications of AI-enhanced multimodal sensors in fields such as robotics, healthcare, and environmental monitoring are highlighted. A comparative study of dual/tri-modal sensors with and without AI technologies (especially machine learning and deep learning) highlights the potential of AI to improve sensor performance, data processing, and decision-making capabilities. Furthermore, the review analyzes the challenges and opportunities afforded by AI-enhanced multimodal sensors and offers an outlook on forthcoming advancements.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFC3004104); the National Natural Science Foundation of China (Grant No. U2342204); the Innovation and Development Program of the China Meteorological Administration (Grant No. CXFZ2024J001); the Open Research Project of the Key Open Laboratory of Hydrology and Meteorology of the China Meteorological Administration (Grant No. 23SWQXZ010); the Science and Technology Plan Project of Zhejiang Province (Grant No. 2022C03150); the Open Research Fund Project of Anyang National Climate Observatory (Grant No. AYNCOF202401); the Open Bidding for Selecting the Best Candidates Program (Grant No. CMAJBGS202318).
Abstract: Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm, called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to fuse the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, an encoder structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net) is employed to extract the corresponding spatial convective characteristics of satellite, radar, and lightning observations. Finally, using a multimodal deep fusion module based on multi-head cross-attention, the temporal features of wind speed at each automatic weather station are incorporated into the spatial features to obtain a classification of thunderstorm wind gusts every 10 minutes. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides gridded products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
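For readers unfamiliar with the verification scores quoted above, the following sketch computes them from a 2 × 2 contingency table using the standard forecast-verification definitions. It assumes that "false alarm rate" in the abstract means the false alarm ratio FA / (hits + FA), which the abstract does not state; the function name and the counts in the usage line are hypothetical.

```python
def contingency_scores(hits, misses, false_alarms):
    """Return (CSI, FAR, POD) from event counts.

    CSI = hits / (hits + misses + false_alarms)   critical success index
    FAR = false_alarms / (hits + false_alarms)    false alarm ratio
    POD = hits / (hits + misses)                  probability of detection
    """
    if hits + misses + false_alarms == 0:
        raise ValueError("no events were observed or forecast")
    csi = hits / (hits + misses + false_alarms)
    far = false_alarms / (hits + false_alarms) if hits + false_alarms else 0.0
    pod = hits / (hits + misses) if hits + misses else 0.0
    return csi, far, pod


# Hypothetical counts, chosen only so that the CSI equals 0.77.
csi, far, pod = contingency_scores(hits=77, misses=13, false_alarms=10)
print(csi, far, pod)
```

The CSI ignores correct negatives, which is why it is commonly preferred over plain accuracy for rare events such as wind gusts.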
Abstract: Multimodal deep learning has emerged as a key paradigm in contemporary medical diagnostics, advancing precision medicine by enabling integration and learning from diverse data sources. The exponential growth of high-dimensional healthcare data, encompassing genomic, transcriptomic, and other omics profiles, as well as radiological imaging and histopathological slides, makes this approach increasingly important because, when examined separately, these data sources offer only a fragmented picture of intricate disease processes. Multimodal deep learning leverages the complementary properties of multiple data modalities to enable more accurate prognostic modeling, more robust disease characterization, and improved treatment decision-making. This review provides a comprehensive overview of the current state of multimodal deep learning approaches in medical diagnosis. We classify and examine important application domains, such as (1) radiology, where automated report generation and lesion detection are facilitated by image-text integration; (2) histopathology, where fusion models improve tumor classification and grading; and (3) multi-omics, where molecular subtypes and latent biomarkers are revealed through cross-modal learning. We provide an overview of representative research, methodological advancements, and clinical consequences for each domain. Additionally, we critically analyze the fundamental issues preventing wider adoption, including computational complexity (particularly in training scalable, multi-branch networks), data heterogeneity (resulting from modality-specific noise, resolution variations, and inconsistent annotations), and the challenge of maintaining significant cross-modal correlations during fusion. These problems impede not only performance and generalizability but also interpretability, which is crucial for clinical trust and use. Lastly, we outline important areas for future research, including the development of standardized protocols for harmonizing data, the creation of lightweight and interpretable fusion architectures, the integration of real-time clinical decision support systems, and the promotion of cooperation for federated multimodal learning. Our goal is to provide researchers and clinicians with a concise overview of the field’s present state, enduring constraints, and exciting directions for further research.
Funding: Supported by the National Key Research and Development Program of China under Grant (2024YFE0100400); Taishan Scholars Project Special Funds (tsqn202312035); the open research foundation of State Key Laboratory of Integrated Chips and Systems; the Tianjin Science and Technology Plan Project (No. 22JCZDJC00630); the Higher Education Institution Science and Technology Research Project of Hebei Province (No. JZX2024024); Jinan City-University Integrated Development Strategy Project under Grant (JNSX2023017).
Abstract: Sleep monitoring is an important part of health management because sleep quality is crucial for the restoration of human health. However, current commercial polysomnography products are cumbersome because of their connecting wires, and state-of-the-art flexible sensors are still obtrusive because they must be attached to the body. Herein, we develop a flexible, integrated multimodal sensing patch based on hydrogel and demonstrate its application in unconstrained sleep monitoring. The patch comprises a bottom hydrogel-based dual-mode pressure–temperature sensing layer and a top electrospun nanofiber-based non-contact detection layer as one integrated device. The hydrogel core substrate exhibits strong toughness and water retention, and the multimodal sensing of temperature, pressure, and non-contact proximity is realized through different sensing mechanisms with no crosstalk interference. The multimodal sensing function is verified in a simulated real-world scenario, in which a robotic hand grasps objects, to validate its practicability. Multiple multimodal sensing patches integrated at different locations on a pillow are assembled for intelligent sleep monitoring. Versatile human–pillow interaction information and its evolution over time are acquired and analyzed by a one-dimensional convolutional neural network. Head movement tracking and recognition of bad patterns that may lead to poor sleep are achieved, providing a promising approach for sleep monitoring.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2021YFB2800902); the National Natural Science Foundation of China (Grant No. 62225110); the Key Research and Development Program of Hubei Province (No. 2022BAA001); the Innovation Fund of WNLO.
Abstract: The detection of the state of polarization (SOP) of light is essential for many optical applications. However, cost-effective SOP measurement is a challenge due to the complexity of conventional methods and the poor transferability of new methods. We propose a straightforward, low-cost, and portable SOP measurement system based on multimode fiber speckle. A convolutional neural network is utilized to establish the mapping between speckle patterns and Stokes parameters. The lowest root-mean-square error of the estimated SOP on the Poincaré sphere is 0.0042. This method is distinguished by its low cost, clear structure, and applicability to different wavelengths with high precision. The proposed method is of great value in polarization-related applications.
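To illustrate the target quantities of such a regression, the sketch below computes normalized Stokes parameters (a point on the Poincaré sphere) from a Jones vector and a root-mean-square error between estimated and reference points. It is a minimal sketch under stated assumptions: the sign of S3 depends on the handedness convention, and the abstract does not say how the error is averaged, so the per-component averaging used here is an assumption. Both function names are hypothetical.

```python
import math


def stokes_from_jones(ex, ey):
    """Normalized Stokes parameters (S1, S2, S3) of a fully polarized field.

    ex, ey are complex field amplitudes. The sign of S3 follows one common
    convention and would flip under the opposite handedness convention.
    """
    ex, ey = complex(ex), complex(ey)
    s0 = abs(ex) ** 2 + abs(ey) ** 2
    if s0 == 0:
        raise ValueError("zero-intensity field has no defined polarization")
    cross = ex * ey.conjugate()
    s1 = (abs(ex) ** 2 - abs(ey) ** 2) / s0
    s2 = 2.0 * cross.real / s0
    s3 = -2.0 * cross.imag / s0
    return (s1, s2, s3)


def rmse(estimates, targets):
    """Root-mean-square error over all samples and Stokes components."""
    squared = [
        (e - t) ** 2
        for est, tgt in zip(estimates, targets)
        for e, t in zip(est, tgt)
    ]
    return math.sqrt(sum(squared) / len(squared))


horizontal = stokes_from_jones(1, 0)
diagonal = stokes_from_jones(1 / math.sqrt(2), 1 / math.sqrt(2))
print(horizontal, diagonal, rmse([horizontal], [diagonal]))
```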
Funding: Supported by the Science and Technology Project of Henan Province (No. 222102210081).
Abstract: Joint Multimodal Aspect-based Sentiment Analysis (JMASA) is a significant task in multimodal fine-grained sentiment analysis that combines two subtasks: Multimodal Aspect Term Extraction (MATE) and Multimodal Aspect-oriented Sentiment Classification (MASC). Currently, most existing models for JMASA encode text and image features only at a basic level and often neglect in-depth analysis of unimodal intrinsic features, which may lead to low accuracy in aspect term extraction and poor sentiment prediction because intra-modal features are insufficiently learned. To address this problem, we propose a Text-Image Feature Fine-grained Learning (TIFFL) model for JMASA. First, we construct an enhanced adjacency matrix of word dependencies and adopt a graph convolutional network to learn the syntactic structure features of the text, which addresses the problem of context interference when identifying different aspect terms. Then, adjective-noun pairs extracted from the image are introduced to make the semantic representation of visual features more intuitive, which addresses the problem of ambiguous semantic extraction during image feature learning. As a result, model performance in aspect term extraction and sentiment polarity prediction can be further improved. Experiments on two Twitter benchmark datasets demonstrate that TIFFL achieves competitive results for JMASA, MATE, and MASC, validating the effectiveness of the proposed methods.
Funding: Supported by the Funding for Research on the Evolution of Cyberbullying Incidents and Intervention Strategies (24BSH033); Discipline Innovation and Talent Introduction Bases in Higher Education Institutions (B20087).
Abstract: A hateful meme is a multimodal medium that combines images and text. The potentially hateful content of such memes has caused serious problems for social media security. The hateful meme classification task currently faces significant data scarcity, and direct fine-tuning of large-scale pre-trained models often leads to severe overfitting. In addition, it is challenging to understand the underlying relationship between text and images in hateful memes. To address these issues, we propose a multimodal hateful meme classification model named LABF, which is based on low-rank adapter layers and bidirectional gated feature fusion. First, low-rank adapter layers are adopted to learn the feature representation of the new dataset. This is achieved by introducing a small number of additional parameters while retaining the prior knowledge of the CLIP model, which effectively alleviates overfitting. Second, a bidirectional gated feature fusion mechanism is designed to dynamically adjust the interaction weights of text and image features to achieve finer cross-modal fusion. Experimental results show that the method significantly outperforms existing methods on two public datasets, verifying its effectiveness and robustness.
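The idea of adding a small number of trainable parameters to a frozen layer can be sketched as follows. This is a generic low-rank adapter written in plain Python, assuming the common formulation y = W x + B (A x) with B initialized to zero; it is not the LABF implementation, and every name in it is hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def adapted_forward(w_frozen, a_down, b_up, x):
    """Frozen linear layer plus a trainable low-rank update: W x + B (A x).

    w_frozen: d_out x d_in, kept fixed (pre-trained weights)
    a_down:   r x d_in, trainable down-projection
    b_up:     d_out x r, trainable up-projection (zero at initialization)
    """
    base = matvec(w_frozen, x)
    delta = matvec(b_up, matvec(a_down, x))
    return [p + q for p, q in zip(base, delta)]


def param_counts(d_out, d_in, rank):
    """Return (frozen parameters, trainable adapter parameters)."""
    return d_out * d_in, rank * (d_in + d_out)


w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]                 # rank 1
b_zero = [[0.0], [0.0]]          # zero init: adapter leaves the output unchanged
print(adapted_forward(w, a, b_zero, [1.0, 2.0]))
print(param_counts(512, 512, 4))
```

With a zero-initialized up-projection the adapted layer starts out identical to the pre-trained one, which is one reason such adapters tend to preserve prior knowledge when training data are scarce.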
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62476216 and 62273272; in part by the Key Research and Development Program of Shaanxi Province under Grant 2024GX-YBXM-146; in part by the Scientific Research Program Funded by Education Department of Shaanxi Provincial Government under Grant 23JP091; and in part by the Youth Innovation Team of Shaanxi Universities.
Abstract: Multimodal Aspect-Based Sentiment Analysis (MABSA) aims to detect sentiment polarity toward specific aspects by leveraging both textual and visual inputs. However, existing models suffer from weak aspect-image alignment, modality imbalance dominated by textual signals, and limited reasoning for implicit or ambiguous sentiments that require external knowledge. To address these issues, we propose a unified framework named Gated-Linear Aspect-Aware Multimodal Sentiment Network (GLAMSNet). First, an input encoding module is employed to construct modality-specific and aspect-aware representations. Subsequently, we introduce an image–aspect correlation matching module to provide hierarchical supervision for visual-textual alignment. Building upon these components, we further design a Gated-Linear Aspect-Aware Fusion (GLAF) module to enhance aspect-aware representation learning by adaptively filtering irrelevant textual information and refining semantic alignment under aspect guidance. Additionally, an External Language Model Knowledge-Guided mechanism is integrated to incorporate sentiment-aware prior knowledge from GPT-4o, enabling robust semantic reasoning, especially under noisy or ambiguous inputs. Experiments on the Twitter-15 and Twitter-17 datasets demonstrate that the proposed model outperforms most state-of-the-art methods, achieving 79.36% accuracy and 74.72% F1-score on Twitter-15 and 74.31% accuracy and 72.01% F1-score on Twitter-17.
Funding: Funded by the Zhejiang Traditional Chinese Medicine Science and Technology Plan Project, grant number 2023ZL073; the Key Science and Technology Plan of the Co-construction Project of the National Traditional Chinese Medicine Administration Science and Technology Department and Zhejiang Province Traditional Chinese Medicine Administration, grant number GZY-ZJ-KJ-24021.
Abstract: Gliomas are the most common malignant tumors in the central nervous system and are known for their inherent diversity and propensity to invade surrounding tissue. These features pose significant challenges in diagnosis and treatment. Magnetic resonance imaging (MRI) has not only remained at the forefront of glioma management but has also evolved significantly with the advent of multimodal MRI. Multimodal MRI integrates diverse MRI sequences and advanced techniques to offer a comprehensive, multidimensional view of glioma pathology, encompassing structural, functional, and even molecular imaging. This holistic approach gives clinicians a deeper understanding of tumor characteristics, enabling more precise diagnoses, tailored treatment strategies, and enhanced monitoring capabilities, ultimately improving patient outcomes. Looking ahead, the integration of artificial intelligence (AI) with MRI data promises greater precision in glioma diagnosis and therapy, enabling more sophisticated analyses that fully leverage all aspects of multimodal MRI. In summary, with the continuous advancement of multimodal MRI techniques and future deep integration with AI, glioma care is poised to evolve toward increasingly personalized, precise, and efficacious strategies.
Funding: Funded by the Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education [KLCS20240210].
Abstract: With the rapid growth of the Internet and social media, information is widely disseminated in multimodal forms, such as text and images, where discriminatory content can manifest in various ways. Discrimination detection techniques for multilingual and multimodal data can identify potential discriminatory behavior and help foster a more equitable and inclusive cyberspace. However, existing methods often struggle in complex contexts and multilingual environments. To address these challenges, this paper proposes a detection method that uses image and multilingual text encoders to separately extract features from different modalities. It continuously updates a historical feature memory bank, aggregates the Top-K most similar samples, and uses a Gated Recurrent Unit (GRU) to integrate current and historical features, generating enhanced feature representations with stronger semantic expressiveness to improve the model’s ability to capture discriminatory signals. Experimental results demonstrate that the proposed method exhibits superior discriminative power and detection accuracy in multilingual and multimodal contexts, offering a reliable and effective solution for identifying discriminatory content.
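The retrieval step described above (select the Top-K most similar historical features and aggregate them) might look like the sketch below. It assumes cosine similarity and a plain mean as the aggregation; the abstract specifies neither, and the GRU that merges current and historical features is omitted. All names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def top_k_aggregate(query, memory_bank, k):
    """Average the k memory-bank features most similar to the query."""
    if not memory_bank or k <= 0:
        raise ValueError("need a non-empty memory bank and k >= 1")
    ranked = sorted(memory_bank, key=lambda f: cosine(query, f), reverse=True)
    nearest = ranked[:k]
    return [sum(column) / len(nearest) for column in zip(*nearest)]


bank = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
print(top_k_aggregate([1.0, 0.0], bank, k=2))
```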
Funding: Supported by the National Natural Science Foundation of China (No. 51375383).
Abstract: In this study, we present a small, integrated jumping-crawling robot capable of intermittent jumping and self-resetting. Compared to robots with a single mode of locomotion, this multi-modal robot exhibits enhanced obstacle-surmounting capabilities. To achieve this, the robot employs a novel combination of a jumping module and a crawling module. The jumping module features improved energy storage capacity and an active clutch. Within the constraints of structural robustness, the jumping module maximizes the explosive power of the linear spring by utilizing the mechanical advantage of a closed-loop mechanism and controls its energy flow through an active clutch mechanism. Furthermore, inspired by the limb movements of tortoises during crawling and self-righting, a single-degree-of-freedom spatial four-bar crawling mechanism was designed to enable crawling, steering, and resetting functions. To demonstrate its practicality, the integrated jumping-crawling robot was tested in a laboratory environment for functions such as jumping, crawling, self-resetting, and steering. Experimental results confirmed the feasibility of the proposed robot.
Abstract: With the popularization of social media, stickers have become an important tool for young students to express themselves and resist mainstream culture due to their unique visual and emotional expressiveness. Most existing studies focus on the negative impacts of spoof stickers while paying insufficient attention to their positive functions. From the perspective of multimodal metaphor, this paper uses methods such as virtual ethnography and image-text analysis to clarify the connotation of stickers, understand the evolution of their digital dissemination forms, and explore the multiple functions of subcultural stickers in social interactions between teachers and students. Young students use stickers to convey emotions and information. The expressive, social, and cultural metaphor functions of stickers build on one another progressively. This not only shapes students’ values but also promotes self-expression and teacher-student interaction. It also reminds teachers that stickers can be used to correct students’ negative thoughts, achieving the effect of “cultivating and influencing people through culture.”
Funding: Appreciation to Prince Sattam bin Abdulaziz University for funding this research work through project number (PSAU/2024/01/30782).
Abstract: The human gastrointestinal (GI) tract is affected by numerous disorders. If not detected in the early stages, they may result in severe consequences such as organ failure or the development of cancer and, in extreme cases, become life-threatening. Endoscopy is a specialised imaging technique used to examine the GI tract. However, physicians may overlook certain irregular morphologies during the examination because the video recording must be monitored continuously. Recent advancements in artificial intelligence have led to the development of high-performance AI-based systems that are well suited to computer-assisted diagnosis. Endoscopic image analysis nevertheless faces several limitations that reduce accuracy, including visual similarities between infected and healthy areas, retrieval of irrelevant features, and imbalanced testing and training datasets. To address these challenges, we propose a framework for analysing gastrointestinal tract images that provides a more robust and secure model, thereby reducing the chances of misclassification. Compared to single-model solutions, the proposed methodology improves performance by integrating diverse models and optimizing feature fusion using a dual-branch CNN-transformer architecture. The approach employs a dual-branch feature extraction mechanism in which the first branch extracts features using Extended BEiT and the second branch uses EfficientNet-B5. Additionally, cross-entropy loss is used to measure the prediction error at both branches, followed by model stacking. This multimodal framework outperforms existing approaches across multiple metrics, achieving 94.12% accuracy, recall, and F1-score, as well as 94.15% precision on the Kvasir dataset. Furthermore, the model reduced the false negative rate to 5.88%, enhancing its ability to minimize misdiagnosis. These results highlight the adaptability of the proposed work in clinical practice, where it can provide fast and accurate diagnostic assistance crucial for improving the early diagnosis of diseases in the gastrointestinal tract.
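The reported false negative rate is consistent with the reported recall, because FNR = FN / (TP + FN) = 1 - recall. The sketch below illustrates that relationship only. The function names are hypothetical, the counts in the usage lines are illustrative and not taken from the study, and the abstract does not state how recall was averaged across classes.

```python
def recall_and_fnr(true_positives: int, false_negatives: int) -> tuple:
    """Return (recall, false negative rate) as fractions in [0, 1]."""
    positives = true_positives + false_negatives
    if positives <= 0:
        raise ValueError("need at least one positive sample")
    recall = true_positives / positives
    # FNR is the complement of recall over the same positive samples.
    return recall, 1.0 - recall


def fnr_from_recall_percent(recall_percent: float) -> float:
    """Convert a recall given in percent to an FNR in percent (2 decimals)."""
    if not 0.0 <= recall_percent <= 100.0:
        raise ValueError("recall must be between 0 and 100 percent")
    return round(100.0 - recall_percent, 2)


# Illustrative counts only (not from the paper): 16 hits, 1 miss.
recall, fnr = recall_and_fnr(16, 1)

# Recall figure quoted in the abstract.
print(fnr_from_recall_percent(94.12))  # 5.88
```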
Abstract: Background: Irregular heartbeats can have serious health implications if left undetected and untreated for an extended period. Methods: This study uses machine learning (ML) techniques to classify electrocardiogram (ECG) heartbeats, comparing traditional feature-based ML methods with image-based approaches. The dataset underwent rigorous preprocessing, including down-sampling, frequency filtering, beat segmentation, and normalization. Two methodologies were explored: (1) handcrafted feature extraction, using metrics such as heart rate variability and RR distances with LightGBM classifiers, and (2) image transformation of ECG signals using Gramian Angular Field (GAF), Markov Transition Field (MTF), and Recurrence Plot (RP), enabling multimodal input for convolutional neural networks (CNNs). The Synthetic Minority Oversampling Technique (SMOTE) addressed data imbalance, significantly improving minority-class metrics. Results: The handcrafted feature approach achieved notable performance, with LightGBM excelling in precision and recall. Image-based classification further improved outcomes, with a custom Inception-based CNN attaining an 85% F1 score and 97% accuracy using combined GAF, MTF, and RP transformations. Statistical analyses confirmed the significance of these improvements. Conclusion: This work highlights the potential of ML for detecting cardiac irregularities, demonstrating that combining advanced preprocessing, feature engineering, and state-of-the-art neural networks can improve classification accuracy. These findings contribute to advancing AI-driven diagnostic tools, with promising implications for cardiovascular healthcare.
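To illustrate how a one-dimensional beat can become an image, here is a minimal sketch of the standard Gramian Angular Summation Field: rescale the samples to [-1, 1], map each sample to an angle with arccos, and fill a matrix with cos(phi_i + phi_j). It is not the study's code. The function name and the three-sample "beat" are hypothetical, and the abstract does not say whether the summation or the difference variant of GAF was used.

```python
import math


def gramian_angular_summation_field(signal):
    """Return the GASF matrix (list of lists) for a 1-D sequence of numbers."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must contain at least two distinct values")
    # Min-max rescale into [-1, 1] so that arccos is defined for every sample.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) encodes the summed angle of samples i and j.
    return [[math.cos(a + b) for b in angles] for a in angles]


# Hypothetical toy segment; a real beat would have hundreds of samples.
toy_beat = [0.0, 0.5, 1.0]
image = gramian_angular_summation_field(toy_beat)
```

The result is a symmetric square matrix whose side equals the number of samples, so it can be treated as a single-channel image and stacked with other transformations as input channels.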
Funding: University-level Scientific Research Project in Natural Sciences, "Research on the Retrieval Method of Multimodal First-Class Course Teaching Content Based on Knowledge Graph Collaboration" (GKY-2024KYYBK-31).
Abstract: In the context of digitalization, course resources exhibit multimodal characteristics, covering various forms such as text, images, and videos. Course knowledge and learning resources are becoming increasingly diverse, providing favorable conditions for students' in-depth and efficient learning. Against this backdrop, how to scientifically apply emerging technologies to automatically collect, process, and integrate digital learning resources such as voice recordings, videos, and courseware texts, and how to better innovate the organization and presentation of course knowledge, has become an important development direction for "artificial intelligence + education." This article elaborates on the elements and characteristics of knowledge graphs, analyzes the construction steps of knowledge graphs, and explores the construction methods of multimodal course knowledge graphs from aspects such as dataset collection, course knowledge ontology identification, knowledge discovery, and association, providing references for the intelligent application of online open courses.
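As a rough illustration of what a multimodal course knowledge graph might store, the sketch below keeps (subject, relation, object) triples that link a course concept both to other concepts and to resources of different modalities. Everything in it is an assumption made for illustration: the class name, the relation names, and the file names are hypothetical and do not come from the article.

```python
from collections import defaultdict


class CourseKnowledgeGraph:
    """Toy triple store mapping each subject to its (relation, object) pairs."""

    def __init__(self):
        self._out = defaultdict(set)

    def add(self, subject, relation, obj):
        """Record one (subject, relation, object) triple."""
        self._out[subject].add((relation, obj))

    def objects(self, subject, relation):
        """Return the sorted objects linked to subject by relation."""
        return sorted(o for r, o in self._out.get(subject, ()) if r == relation)


# Hypothetical course content used only to show the structure.
kg = CourseKnowledgeGraph()
kg.add("Binary Search", "prerequisite", "Arrays")
kg.add("Binary Search", "explained_by_video", "lecture_03.mp4")
kg.add("Binary Search", "explained_by_text", "slides_03.pdf")

print(kg.objects("Binary Search", "explained_by_video"))  # ['lecture_03.mp4']
```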