The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven out of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study were open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focuses on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data while the others use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets, respectively.
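The hard-voting step named in the HAR abstract above can be illustrated with a minimal sketch. It assumes each stream has already produced one class label per video; the function name `hard_vote` and the tie-breaking rule (first label seen wins) are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def hard_vote(stream_predictions):
    """Majority vote over per-stream label lists (hypothetical helper).

    stream_predictions: list of lists, one inner list of labels per stream,
    all of equal length. Ties go to the label encountered first.
    """
    n_samples = len(stream_predictions[0])
    voted = []
    for i in range(n_samples):
        # Collect the i-th prediction from every stream.
        votes = [preds[i] for preds in stream_predictions]
        # most_common(1) returns the label with the highest vote count.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Four streams (e.g. two RGB, two optical flow) voting on two videos.
streams = [["run", "jump"], ["run", "walk"], ["jump", "walk"], ["run", "walk"]]
print(hard_vote(streams))
```

Hard voting only needs the predicted labels, so streams with differently calibrated scores can be combined without rescaling.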
Spatio-temporal cellular network traffic prediction at the wide-area level plays an important role in resource reconfiguration, traffic scheduling, and intrusion detection, thus potentially supporting the connected intelligence of the sixth generation of mobile communications technology (6G). However, existing studies focus only on the spatio-temporal modeling of traffic data for a single network service, such as short message, call, or Internet. This is not conducive to accurate prediction of traffic data, which is characterised by diverse network services, spatio-temporality, and supersize volume. To address this issue, a novel multi-task deep learning framework is developed for citywide cellular network traffic prediction. Functionally, this framework mainly consists of a dual modular feature sharing layer and a multi-task learning layer (DMFS-MT). The former aims at mining long-term spatio-temporal dependencies and local spatio-temporal fluctuation trends in data, respectively, via a new combination of a convolutional gated recurrent unit (ConvGRU) and a 3-dimensional convolutional neural network (3D-CNN). For the latter, each task predicts service-specific traffic data based on a fully connected network. On the real-world Telecom Italia dataset, simulation results demonstrate the effectiveness of our proposal through prediction performance measures, spatial pattern comparison, and statistical distribution verification.
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
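For readers unfamiliar with the two error metrics reported in the traffic abstract above, the standard definitions of RMSE and MAE can be sketched as follows. The function names and the toy numbers are illustrative only; the "accuracy" figure quoted in the abstract is not defined there, so it is not reproduced here.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Toy example: observed vs. predicted traffic speeds (made-up values).
observed = [30.0, 42.0, 55.0, 61.0]
predicted = [32.0, 40.0, 55.0, 65.0]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because RMSE squares each residual before averaging, it penalises large individual errors more heavily than MAE, which is why the two are usually reported together.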
Machine learning has emerged as a key approach in wildfire risk prediction research. However, in practical applications, the scarcity of data for specific regions often hinders model performance, with models trained on region-specific data struggling to generalize due to differences in data distributions. While traditional methods based on expert knowledge tend to generalize better across regions, they are limited in leveraging multi-source data effectively, resulting in suboptimal predictive accuracy. This paper addresses this challenge by exploring how accumulated domain expertise in wildfire prediction can reduce model reliance on large volumes of high-quality data. An active learning algorithm based on XGBoost is proposed for wildfire risk assessment that autonomously identifies low-confidence predictions and seeks re-labeling through a human-in-the-loop or physics-based correction approach. The corrected data is reintegrated into the model, effectively preventing catastrophic forgetting. Experimental results demonstrate that the proposed human-in-the-loop approach significantly enhances labeling accuracy and predictive performance while preserving the model's ability to generalize. These findings highlight the value of incorporating human expertise into machine learning models, offering a practical solution to mitigate data quality challenges and improve model reliability in wildfire risk prediction.
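The select-then-relabel loop described in the wildfire abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say how confidence is measured, so the rule used here (predicted probability within a margin of 0.5), the `margin` parameter, and the `oracle` callback standing in for the human expert or physics-based correction are all hypothetical.

```python
def select_low_confidence(probabilities, margin=0.2):
    """Return indices of predictions whose probability lies within
    `margin` of 0.5 (assumed definition of "low confidence")."""
    return [i for i, p in enumerate(probabilities) if abs(p - 0.5) < margin]


def relabel(labels, indices, oracle):
    """Return a copy of `labels` with the selected entries replaced by
    the oracle's answer (human expert or physics-based rule)."""
    corrected = list(labels)
    for i in indices:
        corrected[i] = oracle(i)
    return corrected


# Predicted fire-risk probabilities from some classifier (made-up values).
probs = [0.95, 0.55, 0.10, 0.42]
labels = [1, 1, 0, 0]
uncertain = select_low_confidence(probs)
# Hypothetical expert who marks both uncertain samples as high risk.
print(uncertain, relabel(labels, uncertain, lambda i: 1))
```

In a full active learning loop the corrected labels would be merged back into the training set and the model refit; that retraining step depends on the classifier used and is omitted here.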
Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale ECG data and rare arrhythmia events in imbalanced datasets. These methods neither perform multi-perspective learning of temporal signals and ECG images nor fully extract the latent information within the data, falling short of the accuracy required by clinicians. Therefore, this paper proposes an innovative hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the ECG data and improving the model's ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.
Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy. Therefore, we develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms baselines. The source code will be available at https://github.com/didid5/MVTL-LSR.
Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data, and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
The dockless bike-sharing system has rapidly expanded worldwide and has been widely used as an intermodal transport to connect with public transportation. However, higher flexibility may cause an imbalance between supply and demand during daily operation, especially around metro stations. A stable and efficient rebalancing model requires spatio-temporal usage patterns as fundamental inputs. Therefore, understanding the spatio-temporal patterns and their correlates is important for optimizing and rescheduling bike-sharing systems. This study proposes a dynamic time warping distance-based two-dimensional clustering method to quantify spatio-temporal patterns of dockless shared bikes in Wuhan and further applies the multiclass explainable boosting machine to explore the main factors related to these patterns. The results reveal six patterns on weekdays and four patterns on weekends. Three patterns show an imbalance of arrival and departure flow in the morning and evening peak hours, while these phenomena become less intensive on weekends. Road density, living service facility density, and residential density are the top influencing factors on both weekdays and weekends, which means that the comprehensive impact of built-up environment attraction, facility suitability, and riding demand leads to the different usage patterns. Nonlinear influence exists universally, and the probability of a certain pattern varies in different value ranges of variables. When the densities of living facilities and roads are moderate and the relationship between jobs and housing is relatively balanced, balanced usage of dockless shared bikes can be effectively promoted while maintaining high riding flow. The spatio-temporal patterns can identify associated problems such as imbalance or lack of users, which could be mitigated by corresponding solutions. The relative importance and nonlinear effects help planners prioritize strategies and identify effective ranges for different patterns to promote the usage and efficiency of the bike-sharing system.
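The dynamic time warping (DTW) distance underlying the clustering method in the bike-sharing abstract above can be sketched with the textbook dynamic-programming recurrence. This is the classic unconstrained DTW on one-dimensional sequences with absolute difference as the local cost; the paper's two-dimensional variant and any windowing it may use are not described in the abstract and are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning the first
    i items of `a` with the first j items of `b`.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Two hourly usage profiles with the same shape but a stretched plateau
# (made-up values); DTW aligns them at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Unlike a pointwise Euclidean distance, DTW tolerates shifts and stretches in time, which is why it suits usage curves whose peaks occur at slightly different hours.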
This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed, which can equivalently decompose the global Q value of the traffic system into the local values of several regions. Based on the framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. In order to achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. Experiments in synthetic, real-world, and city-level scenarios show that the proposed RegionSTLight converges more quickly, is more stable, and obtains better asymptotic performance compared to state-of-the-art models.
False data injection attacks (FDIAs) can affect the state estimation of the power grid by tampering with the measured values of power grid data, thereby destroying the stable operation of the smart grid. Existing work usually trains a detection model by fusing data-driven features from diverse power data streams. Data-driven features, however, cannot effectively capture the differences between noisy data and attack samples. As a result, slight noise disturbances in the power grid may cause a large number of false detections of FDIAs. To address this problem, this paper designs a deep collaborative self-attention network to achieve robust FDIA detection, in which the spatio-temporal features of cascaded FDIAs are fully integrated. First, a high-order Chebyshev polynomial-based graph convolution module is designed to effectively aggregate the spatial information between grid nodes, and a spatial self-attention mechanism is used to dynamically assign attention weights to each node, which guides the network to pay more attention to the node information that is conducive to FDIA detection. Furthermore, a bi-directional Long Short-Term Memory (LSTM) network is introduced to conduct time series modeling and long-term dependence analysis for power grid data, using a temporal self-attention mechanism to describe the time correlation of data and assign different weights to different time steps. The designed deep collaborative network can effectively mine subtle perturbations from spatio-temporal feature information, efficiently distinguish power grid noise from FDIAs, and adapt to diverse attack intensities. Extensive experiments demonstrate that our method achieves efficient detection performance on actual load data from the New York Independent System Operator (NYISO) in the IEEE 14, IEEE 39, and IEEE 118 bus systems, and outperforms state-of-the-art FDIA detection schemes in terms of detection accuracy and robustness.
Automatic detection of student engagement levels from videos, which is a spatio-temporal classification problem, is crucial for enhancing the quality of online education. This paper addresses this challenge by proposing four novel hybrid end-to-end deep learning models designed for the automatic detection of student engagement levels in e-learning videos. The evaluation of these models utilizes the DAiSEE dataset, a public repository capturing student affective states in e-learning scenarios. The initial model integrates EfficientNetV2-L with a Gated Recurrent Unit (GRU) and attains an accuracy of 61.45%. Subsequently, the second model combines EfficientNetV2-L with a bidirectional GRU (Bi-GRU), yielding an accuracy of 61.56%. The third and fourth models leverage a fusion of EfficientNetV2-L with Long Short-Term Memory (LSTM) and bidirectional LSTM (Bi-LSTM), achieving accuracies of 62.11% and 61.67%, respectively. Our findings demonstrate the viability of these models in effectively discerning student engagement levels, with the EfficientNetV2-L+LSTM model emerging as the most proficient, reaching an accuracy of 62.11%. This study underscores the potential of hybrid spatio-temporal networks in automating the detection of student engagement, thereby contributing to advancements in online education quality.
The outbreak and subsequent recurring waves of COVID-19 pose threats to emergency management and people's daily life, while large-scale spatio-temporal epidemiological data have proved useful in epidemic surveillance. Nonetheless, some challenges remain to be addressed in terms of multi-source heterogeneous data fusion, deep mining, and comprehensive applications. Spatio-Temporal Artificial Intelligence (STAI) technology, which focuses on integrating spatially related time-series data, artificial intelligence models, and digital tools to provide intelligent computing platforms and applications, opens up new opportunities for scientific epidemic control. To this end, we leverage STAI and long-term experience in location-based intelligent services in this work. Specifically, we devise and develop a STAI-driven digital infrastructure, namely the WAYZ Disease Control Intelligent Platform (WDCIP), which consists of a systematic framework for building pipelines from automatic spatio-temporal data collection and processing to AI-based analysis and inference, providing appropriate applications serving various epidemic scenarios. According to the platform implementation logic, our work can be summarized from three aspects: (1) a STAI-driven integrated system; (2) a hybrid GNN-based approach for hierarchical risk assessment (as the core algorithm of WDCIP); and (3) comprehensive applications for social epidemic containment. This work makes a pivotal contribution to facilitating the aggregation and full utilization of spatio-temporal epidemic data from multiple sources, where the real-time human mobility data generated by high-precision mobile positioning plays a vital role in sensing the spread of the epidemic. So far, WDCIP has accumulated more than 200 million users who have been served in life convenience and decision-making during the pandemic.
With the advancement of human-computer interaction, surface electromyography (sEMG)-based gesture recognition has garnered increasing attention. However, effectively utilizing the spatio-temporal dependencies in sEMG signals and integrating multiple key features remain significant challenges for existing techniques. To address this issue, we propose a model named the Two-Stream Hybrid Spatio-Temporal Fusion Network (TS-HSTFNet). Specifically, we design a dynamic spatio-temporal graph convolution module that employs an adaptive dynamic adjacency matrix to fully explore the spatial dynamic patterns in the sEMG signals. Additionally, a spatio-temporal attention fusion module is designed to fully utilize the potential correlations among multiple features for the final fusion. The results indicate that the proposed TS-HSTFNet model achieves 84.96% and 88.08% accuracy on the Ninapro DB2 and Ninapro DB5 datasets, respectively, demonstrating high precision in gesture recognition. Our work emphasizes the importance of extracting spatio-temporal features in gesture recognition and provides a novel approach for multi-source information fusion.
Traditional sheep identification is based on ear tags. However, the application of ear tags not only causes stress to the animals but also leads to loss of ear tags, which affects the correct recognition of sheep identity. In contrast, the acquisition of sheep face images offers the advantages of being non-invasive and stress-free for the animals. Nevertheless, the extant convolutional neural network-based sheep face identification model is prone to the issue of inadequate refinement, which renders its implementation on farms challenging. To address this issue, this study presents a novel sheep face recognition model that employs advanced feature fusion techniques and precise image segmentation strategies. The images were preprocessed and accurately segmented using deep learning techniques, with a dataset constructed containing sheep face images from multiple viewpoints (left, front, and right faces). In particular, the model employs a segmentation algorithm to delineate the sheep face region accurately, utilizes the Improved Convolutional Block Attention Module (I-CBAM) to emphasize the salient features of the sheep face, and achieves multi-scale fusion of the features through a Feature Pyramid Network (FPN). This process guarantees that the features captured from disparate viewpoints can be efficiently integrated to enhance recognition accuracy. Furthermore, the model guarantees the precise delineation of sheep facial contours by streamlining the image segmentation procedure, thereby establishing a robust basis for the precise identification of sheep identity. The findings demonstrate that the recognition accuracy of the Sheep Face Mask Region-based Convolutional Neural Network (SFMask RCNN) model has been enhanced by 9.64% to 98.65% in comparison to the original model. The method offers a novel technological approach to the management of animal identity in the context of sheep husbandry.
Large-scale machinery operated in a coordinated manner in earthworks for mining poses high safety risks. Efficient scheduling of such machinery, factoring in safety constraints, could save time and significantly improve overall safety. This paper develops a model of automated equipment scheduling in mining earthworks and presents a scheduling algorithm based on deep reinforcement learning with spatio-temporal safety constraints. The algorithm not only performed well on safety parameters, but also outperformed randomized instances of various sizes set against real mining applications. Further, the study reveals that responsiveness to spatio-temporal safety constraints noticeably increases as the scheduling size increases. This method provides noticeable improvements to safe automated scheduling in mining.
Multi-view learning is an emerging field that aims to enhance learning performance by leveraging multiple views or sources of data across various domains. By integrating information from diverse perspectives, multi-view learning methods effectively enhance accuracy, robustness, and generalization capabilities. The existing research on multi-view learning can be broadly categorized into four groups in the survey based on the tasks it encompasses, namely multi-view classification approaches, multi-view semi-supervised classification approaches, multi-view clustering approaches, and multi-view semi-supervised clustering approaches. Despite its potential advantages, multi-view learning poses several challenges, including view inconsistency, view complementarity, optimal view fusion, the curse of dimensionality, scalability, limited labels, and generalization across domains. Nevertheless, these challenges have not discouraged researchers from exploring the potential of multi-view learning. It continues to be an active and promising research area, capable of effectively addressing complex real-world problems.
The interactions between drugs and microbes affecting microbial abundance can lead to various diseases or reduce the effectiveness of pharmaceutical treatments. Traditional Microbe-Drug Association (MDA) determination through biological assays is time-consuming and costly. With the accumulation of MDA data, computational methods have become a promising approach to infer potential MDAs. Although existing methods focus on predicting whether a drug interacts with a microbe, they can rarely infer whether a drug promotes or inhibits the abundance of a given microbe. Moreover, the extreme imbalance among abundance-promoted, abundance-inhibited, and non-impacted cases remains a challenge for computational prediction methods. To address these issues, we propose a framework for predicting the imbalanced Impact of Drugs on Microbial Abundance by leveraging Multi-view Learning and Data Augmentation, named IDMA-MLDA. IDMA-MLDA employs a novel method of transforming a bipartite graph into a hypergraph, uses hypergraph convolutions to capture high-order vertex neighborhoods (macro-view), and employs graph neural networks to learn individual features of drugs and microbes (micro-view). It integrates features from both the macro-view and the micro-view to obtain more comprehensive representations, incorporates a data augmentation module to handle class imbalance, and uses a multilayer perceptron to predict the impact of drugs on microbial abundance. We demonstrate the superiority of IDMA-MLDA through comparisons with six baseline methods, and ablation studies affirm the contributions of each key module in IDMA-MLDA's prediction. Furthermore, a comprehensive literature review verifies the abundance types of twelve MDAs predicted by IDMA-MLDA.
To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from perspectives of the ...To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from perspectives of the time domain, frequency domain and time-frequency domain are extracted through the Fourier transform, Hilbert transform and empirical mode decomposition (EMD).Then, the random forest model (RF) is applied to select features which are highly correlated with the bearing operating state. Subsequently, the selected features are fused via the autoencoder (AE) to further reduce the redundancy. Finally, the effectiveness of the fused features is evaluated by the support vector machine (SVM). The experimental results indicate that the proposed method based on the multi-view feature fusion can effectively reflect the difference in the state of the rolling bearing, and improve the accuracy of fault diagnosis.展开更多
Funding: Supported by the research on key technologies for monitoring and identifying drug abuse of anesthetic drugs and psychotropic drugs, and intervention for addiction (No. 2023YFC3304200); the program of a study on the diagnosis of addiction to synthetic cannabinoids and methods of assessing the risk of abuse (No. 2022YFC3300905); and the program of Ab initio design and generation of AI models for small molecule ligands based on target structures (No. 2022PE0AC03) ZHIJIANG LAB.
Abstract: The accurate prediction of drug absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties represents a crucial step in early drug development for reducing failure risk. Current deep learning approaches face challenges with data sparsity and information loss due to single-molecule representation limitations and isolated predictive tasks. This research proposes molecular properties prediction with parallel-view and collaborative learning (MolP-PC), a multi-view fusion and multi-task deep learning framework that integrates 1D molecular fingerprints (MFs), 2D molecular graphs, and 3D geometric representations, incorporating an attention-gated fusion mechanism and multi-task adaptive learning strategy for precise ADMET property predictions. Experimental results demonstrate that MolP-PC achieves optimal performance in 27 of 54 tasks, with its multi-task learning (MTL) mechanism significantly enhancing predictive performance on small-scale datasets and surpassing single-task models in 41 of 54 tasks. Additional ablation studies and interpretability analyses confirm the significance of multi-view fusion in capturing multi-dimensional molecular information and enhancing model generalization. A case study examining the anticancer compound Oroxylin A demonstrates MolP-PC's effective generalization in predicting key pharmacokinetic parameters such as half-life (T0.5) and clearance (CL), indicating its practical utility in drug modeling. However, the model exhibits a tendency to underestimate volume of distribution (VD), indicating potential for improvement in analyzing compounds with high tissue distribution. This study presents an efficient and interpretable approach for ADMET property prediction, establishing a novel framework for molecular optimization and risk assessment in drug development.
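The abstract above names an attention-gated fusion of 1D, 2D, and 3D molecular views but does not give its form. The following is only a minimal sketch of one common gating scheme (softmax-weighted sum of view embeddings), not the MolP-PC implementation; the function names, the `gate_vector` parameter, and the toy embeddings are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(views, gate_vector):
    """Fuse equally sized view embeddings with softmax attention weights.

    Each view is scored by its dot product with gate_vector (a stand-in
    for learned gate parameters); the fused embedding is the weighted sum.
    """
    scores = [sum(x * g for x, g in zip(view, gate_vector)) for view in views]
    weights = softmax(scores)
    dim = len(views[0])
    fused = [sum(w * view[k] for w, view in zip(weights, views)) for k in range(dim)]
    return fused, weights


# Toy embeddings standing in for fingerprint, graph, and geometry encoders.
fingerprint_view = [1.0, 0.0]
graph_view = [0.0, 1.0]
geometry_view = [1.0, 1.0]

fused, weights = gated_fusion(
    [fingerprint_view, graph_view, geometry_view], gate_vector=[0.0, 0.0]
)
```

With an all-zero gate vector every view receives the same weight, so the fused vector reduces to the plain average; a trained gate would instead shift weight toward the more informative view for each molecule.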
Funding: Supported by the National Natural Science Foundation of China (Grant No.: 62101087); the China Postdoctoral Science Foundation (Grant No.: 2021MD703942); the Chongqing Postdoctoral Research Project Special Funding, China (Grant No.: 2021XM2016); the Science Foundation of Chongqing Municipal Commission of Education, China (Grant No.: KJQN202100642); and the Chongqing Natural Science Foundation, China (Grant No.: cstc2021jcyj-msxmX0834).
Abstract: Drug repurposing offers a promising alternative to traditional drug development and significantly reduces costs and timelines by identifying new therapeutic uses for existing drugs. However, the current approaches often rely on limited data sources and simplistic hypotheses, which restrict their ability to capture the multi-faceted nature of biological systems. This study introduces adaptive multi-view learning (AMVL), a novel methodology that integrates chemical-induced transcriptional profiles (CTPs), knowledge graph (KG) embeddings, and large language model (LLM) representations to enhance drug repurposing predictions. AMVL incorporates an innovative similarity matrix expansion strategy and leverages multi-view learning (MVL), matrix factorization, and ensemble optimization techniques to integrate heterogeneous multi-source data. Comprehensive evaluations on benchmark datasets (Fdataset, Cdataset, and Ydataset) and the large-scale iDrug dataset demonstrate that AMVL outperforms state-of-the-art (SOTA) methods, achieving superior accuracy in predicting drug-disease associations across multiple metrics. Literature-based validation further confirmed the model's predictive capabilities, with seven out of the top ten predictions corroborated by post-2011 evidence. To promote transparency and reproducibility, all data and code used in this study were open-sourced, providing resources for processing CTPs, KG, and LLM-based similarity calculations, along with the complete AMVL algorithm and benchmarking procedures. By unifying diverse data modalities, AMVL offers a robust and scalable solution for accelerating drug discovery, fostering advancements in translational medicine and integrating multi-omics data. We aim to inspire further innovations in multi-source data integration and support the development of more precise and efficient strategies for advancing drug discovery and translational medicine.
Funding: This work was supported by Universiti Sains Malaysia (USM) under FRGS grant number FRGS/1/2020/TK03/USM/02/1 and by the School of Computer Sciences, USM.
Abstract: Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focused on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data and the other two use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with 96.30% and 90.07% accuracies on the UCF101 and HMDB51 datasets, respectively.
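The ensemble step in the abstract above aggregates stream outputs by hard voting. As a hedged illustration only, assuming "max hard voting" means that each stream casts one label per clip and the most frequent label wins, the idea can be sketched as follows; the function name and the toy labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label per sample.

    predictions: one list of predicted labels per model (stream),
    all of the same length.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) yields the label with the highest vote count.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Four hypothetical streams (two RGB, two optical flow), three video clips.
stream_predictions = [
    ["run", "jump", "walk"],
    ["run", "walk", "walk"],
    ["jump", "jump", "walk"],
    ["run", "jump", "run"],
]
final_labels = hard_vote(stream_predictions)
```

Hard voting ignores each stream's confidence; a soft-voting variant would average class probabilities instead, which the abstract does not describe.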
Funding: Supported in part by the Science and Technology Project of Hebei Education Department (No. ZD2021088) and in part by the S&T Major Project of the Science and Technology Ministry of China (No. 2017YFE0135700).
Abstract: Spatio-temporal cellular network traffic prediction at the wide-area level plays an important role in resource reconfiguration, traffic scheduling, and intrusion detection, thus potentially supporting connected intelligence of the sixth generation of mobile communications technology (6G). However, existing studies focus only on the spatio-temporal modeling of traffic data of a single network service, such as short message, call, or Internet. This is not conducive to accurate prediction of traffic data characterised by diverse network services, spatio-temporality, and supersize volume. To address this issue, a novel multi-task deep learning framework is developed for citywide cellular network traffic prediction. Functionally, this framework mainly consists of a dual modular feature sharing layer and a multi-task learning layer (DMFS-MT). The former aims at mining long-term spatio-temporal dependencies and local spatio-temporal fluctuation trends in data, respectively, via a new combination of convolutional gated recurrent unit (ConvGRU) and 3-dimensional convolutional neural network (3D-CNN). For the latter, each task predicts service-specific traffic data based on a fully connected network. On the real-world Telecom Italia dataset, simulation results demonstrate the effectiveness of our proposal through prediction performance measures, spatial pattern comparison, and statistical distribution verification.
Abstract: The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
Funding: Funded by the Natural Science Foundation of Guangxi Province (Grant AB24010157); the Research Project of the Sichuan Forestry and Grassland Bureau (Grants G202206012 and G202206012-2); the National Natural Science Foundation of China (Grants 32471878, 62373081, U2330206, U2230206 and 62173068); and the Sichuan Science and Technology Program (Grants 2024NSFSC1483, 2024ZYD0156, 2023NSFC1962 and DQ202412).
Abstract: Machine learning has emerged as a key approach in wildfire risk prediction research. However, in practical applications, the scarcity of data for specific regions often hinders model performance, with models trained on region-specific data struggling to generalize due to differences in data distributions. While traditional methods based on expert knowledge tend to generalize better across regions, they are limited in leveraging multi-source data effectively, resulting in suboptimal predictive accuracy. This paper addresses this challenge by exploring how accumulated domain expertise in wildfire prediction can reduce model reliance on large volumes of high-quality data. An active learning algorithm based on XGBoost is proposed for wildfire risk assessment; it autonomously identifies low-confidence predictions and seeks re-labeling through a human-in-the-loop or physics-based correction approach. The corrected data is reintegrated into the model, effectively preventing catastrophic forgetting. Experimental results demonstrate that the proposed human-in-the-loop approach significantly enhances labeling accuracy and predictive performance and preserves the model's ability to generalize. These findings highlight the value of incorporating human expertise into machine learning models, offering a practical solution to mitigate data quality challenges and improve model reliability in wildfire risk prediction.
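The selection step described above (flag low-confidence predictions, route them to an expert or physics-based check, then feed the corrected labels back) can be sketched in a few lines. This is an illustrative outline under stated assumptions rather than the paper's algorithm: the model is taken to output a binary fire probability, confidence is taken as the larger of the two class probabilities, and the 0.7 threshold, the function names, and the toy data are all hypothetical. The XGBoost model itself is omitted.

```python
def split_by_confidence(fire_probabilities, threshold=0.7):
    """Split sample indices into confident and uncertain predictions.

    Confidence of a binary prediction is max(p, 1 - p).
    """
    confident, uncertain = [], []
    for index, p in enumerate(fire_probabilities):
        confidence = max(p, 1.0 - p)
        (confident if confidence >= threshold else uncertain).append(index)
    return confident, uncertain


def apply_corrections(labels, corrections):
    """Return a copy of labels with expert-corrected entries replaced."""
    updated = list(labels)
    for index, corrected in corrections.items():
        updated[index] = corrected
    return updated


# Hypothetical model outputs for four grid cells.
probabilities = [0.95, 0.55, 0.10, 0.40]
predicted = [1 if p >= 0.5 else 0 for p in probabilities]

confident, uncertain = split_by_confidence(probabilities)

# Stand-in for the human-in-the-loop step: an expert relabels uncertain cells.
expert_labels = {1: 0, 3: 1}
training_labels = apply_corrections(predicted, expert_labels)
```

In a full loop the corrected labels would be merged with the earlier training set before refitting, so that previously learned regions remain represented.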
Funding: Supported by the Henan Province Science and Technology Research Project (242102211046); the Key Scientific Research Project of Higher Education Institutions in Henan Province (25A520039); the Natural Science Foundation project of Zhongyuan Institute of Technology (K2025YB011); and the Zhongyuan University of Technology Graduate Education and Teaching Reform Research Project (JG202424).
Abstract: Electrocardiogram (ECG) analysis is critical for detecting arrhythmias, but traditional methods struggle with large-scale ECG data and rare arrhythmia events in imbalanced datasets. These methods neither perform multi-perspective learning of temporal signals and ECG images nor fully extract the latent information within the data, falling short of the accuracy required by clinicians. Therefore, this paper proposes an innovative hybrid multimodal spatiotemporal neural network to address these challenges. The model employs a multimodal data augmentation framework integrating visual and signal-based features to enhance the classification performance of rare arrhythmias in imbalanced datasets. Additionally, the spatiotemporal fusion module incorporates a spatiotemporal graph convolutional network to jointly model temporal and spatial features, uncovering complex dependencies within the ECG data and improving the model's ability to represent complex patterns. In experiments conducted on the MIT-BIH arrhythmia dataset, the model achieved 99.95% accuracy, 99.80% recall, and a 99.78% F1 score. The model was further validated for generalization using the clinical INCART arrhythmia dataset, and the results demonstrated its effectiveness in terms of both generalization and robustness.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023) of Shenzhen Science and Technology Innovation Committee; the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Natural Science Foundation of Jiangsu Province (No. BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038 and SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); the Henan Province Science and Technology Research (222102310322); and the Jiangsu Students' Innovation and Entrepreneurship Training Program (202110304096Y).
Abstract: Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy; we therefore develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms baselines. The source code will be available on https://github.com/didid5/MVTL-LSR.
Funding: Funded by the Research Project THTETN.05/24-25, Vietnam Academy of Science and Technology.
Abstract: Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its outstanding robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
Funding: Supported by the National Key Research and Development Program of China [grant number 2017YFB0503601].
Abstract: The dockless bike-sharing system has rapidly expanded worldwide and has been widely used as an intermodal transport mode to connect with public transportation. However, higher flexibility may cause an imbalance between supply and demand during daily operation, especially around metro stations. A stable and efficient rebalancing model requires spatio-temporal usage patterns as fundamental inputs. Therefore, understanding the spatio-temporal patterns and their correlates is important for optimizing and rescheduling bike-sharing systems. This study proposed a dynamic time warping distance-based two-dimensional clustering method to quantify spatio-temporal patterns of dockless shared bikes in Wuhan and further applied the multiclass explainable boosting machine to explore the main related factors of these patterns. The results reveal six patterns on weekdays and four patterns on weekends. Three patterns show an imbalance of arrival and departure flow in the morning and evening peak hours, while these phenomena become less intensive on weekends. Road density, living service facility density, and residential density are the top influencing factors on both weekdays and weekends, which means that the comprehensive impact of built-up environment attraction, facility suitability, and riding demand leads to the different usage patterns. Nonlinear influence universally exists, and the probability of a certain pattern varies across different value ranges of the variables. When the densities of living facilities and roads are moderate and the relationship between jobs and housing is relatively balanced, balanced usage of dockless shared bikes can be effectively promoted while maintaining high riding flow. The spatio-temporal patterns can identify associated problems such as imbalance or lack of users, which could be mitigated by corresponding solutions. The relative importance and nonlinear effects help planners prioritize strategies and identify effective ranges for different patterns to promote the usage and efficiency of the bike-sharing system.
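The clustering method above rests on the dynamic time warping (DTW) distance between usage curves. The sketch below shows only the classic DTW recurrence with an absolute-difference local cost, not the study's two-dimensional clustering; the function name and the toy hourly counts are hypothetical.

```python
def dtw_distance(series_a, series_b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning the first i
    points of series_a with the first j points of series_b.
    """
    n, m = len(series_a), len(series_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # series_b point matched again
                cost[i][j - 1],      # series_a point matched again
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


# Two hypothetical hourly departure curves with the same shape, one delayed.
station_a = [2, 8, 9, 3, 1]
station_b = [2, 2, 8, 9, 3, 1]
shifted_distance = dtw_distance(station_a, station_b)
```

Because DTW allows a point to be matched more than once, the delayed curve aligns with the original at zero cost, which is why it suits usage profiles whose peaks occur at slightly different times.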
Funding: Supported by the National Science and Technology Major Project (2021ZD0112702); the National Natural Science Foundation (NNSF) of China (62373100, 62233003); and the Natural Science Foundation of Jiangsu Province of China (BK20202006).
Abstract: This article studies the effective traffic signal control problem of multiple intersections in a city-level traffic system. A novel regional multi-agent cooperative reinforcement learning algorithm called RegionSTLight is proposed to improve traffic efficiency. First, a regional multi-agent Q-learning framework is proposed, which can equivalently decompose the global Q value of the traffic system into the local values of several regions. Based on the framework and the idea of human-machine cooperation, a dynamic zoning method is designed to divide the traffic network into several strongly coupled regions according to real-time traffic flow densities. In order to achieve better cooperation inside each region, a lightweight spatio-temporal fusion feature extraction network is designed. The experiments in synthetic, real-world, and city-level scenarios show that the proposed RegionSTLight converges more quickly, is more stable, and obtains better asymptotic performance compared to state-of-the-art models.
Funding: Supported in part by the Research Fund of Guangxi Key Lab of Multi-Source Information Mining & Security (MIMS21-M-02).
Abstract: A false data injection attack (FDIA) can affect the state estimation of the power grid by tampering with the measured values of power grid data, thereby destroying the stable operation of the smart grid. Existing work usually trains a detection model by fusing the data-driven features from diverse power data streams. Data-driven features, however, cannot effectively capture the differences between noisy data and attack samples. As a result, slight noise disturbances in the power grid may cause a large number of false detections of FDIAs. To address this problem, this paper designs a deep collaborative self-attention network to achieve robust FDIA detection, in which the spatio-temporal features of cascaded FDIAs are fully integrated. First, a high-order Chebyshev polynomial-based graph convolution module is designed to effectively aggregate the spatial information between grid nodes, and a spatial self-attention mechanism is used to dynamically assign attention weights to each node, which guides the network to pay more attention to the node information that is conducive to FDIA detection. Furthermore, a bi-directional Long Short-Term Memory (LSTM) network is introduced to conduct time series modeling and long-term dependence analysis for power grid data, and a temporal self-attention mechanism is used to describe the time correlation of the data and assign different weights to different time steps. The designed deep collaborative network can effectively mine subtle perturbations from spatio-temporal feature information, efficiently distinguish power grid noise from FDIAs, and adapt to diverse attack intensities. Extensive experiments demonstrate that our method obtains efficient detection performance on actual load data from the New York Independent System Operator (NYISO) in the IEEE 14, IEEE 39, and IEEE 118 bus systems, and outperforms state-of-the-art FDIA detection schemes in terms of detection accuracy and robustness.
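The Chebyshev polynomial-based graph convolution mentioned above is commonly built on the recurrence T_0(L)x = x, T_1(L)x = Lx, T_k(L)x = 2L T_{k-1}(L)x - T_{k-2}(L)x. The following is a minimal sketch of that recurrence for a single signal, assuming the graph Laplacian has already been rescaled to the range the polynomials expect; it is not the paper's module, and the function names, the toy matrix, and the coefficients `thetas` are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def chebyshev_graph_filter(scaled_laplacian, signal, thetas):
    """Compute sum_k thetas[k] * T_k(L) x via the Chebyshev recurrence."""
    t_prev = list(signal)  # T_0(L) x = x
    output = [thetas[0] * v for v in t_prev]
    if len(thetas) == 1:
        return output
    t_curr = mat_vec(scaled_laplacian, signal)  # T_1(L) x = L x
    output = [o + thetas[1] * v for o, v in zip(output, t_curr)]
    for theta in thetas[2:]:
        # T_k(L) x = 2 L T_{k-1}(L) x - T_{k-2}(L) x
        lt = mat_vec(scaled_laplacian, t_curr)
        t_next = [2.0 * a - b for a, b in zip(lt, t_prev)]
        output = [o + theta * v for o, v in zip(output, t_next)]
        t_prev, t_curr = t_curr, t_next
    return output


# Toy two-node example; the matrix stands in for a rescaled Laplacian.
toy_laplacian = [[0.0, 1.0], [1.0, 0.0]]
node_signal = [1.0, 0.0]
filtered = chebyshev_graph_filter(toy_laplacian, node_signal, thetas=[1.0, 1.0, 1.0])
```

An order-K filter of this kind mixes information from nodes up to K hops away, which is one reason higher orders are attractive for capturing how an injected measurement error spreads across neighbouring buses.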
Abstract: Automatic detection of student engagement levels from videos, which is a spatio-temporal classification problem, is crucial for enhancing the quality of online education. This paper addresses this challenge by proposing four novel hybrid end-to-end deep learning models designed for the automatic detection of student engagement levels in e-learning videos. The evaluation of these models utilizes the DAiSEE dataset, a public repository capturing student affective states in e-learning scenarios. The initial model integrates EfficientNetV2-L with a Gated Recurrent Unit (GRU) and attains an accuracy of 61.45%. Subsequently, the second model combines EfficientNetV2-L with a bidirectional GRU (Bi-GRU), yielding an accuracy of 61.56%. The third and fourth models leverage a fusion of EfficientNetV2-L with Long Short-Term Memory (LSTM) and bidirectional LSTM (Bi-LSTM), achieving accuracies of 62.11% and 61.67%, respectively. Our findings demonstrate the viability of these models in effectively discerning student engagement levels, with the EfficientNetV2-L+LSTM model emerging as the most proficient, reaching an accuracy of 62.11%. This study underscores the potential of hybrid spatio-temporal networks in automating the detection of student engagement, thereby contributing to advancements in online education quality.
Funding: Supported by the Shanghai Municipal Science and Technology Major Project [grant number 2021SHZDZX0100] and the Fundamental Research Funds for the Central Universities [grant number 2021SHZDZX0100].
Abstract: The outbreak and subsequent recurring waves of COVID-19 pose threats to emergency management and people's daily life, while large-scale spatio-temporal epidemiological data have proven useful in epidemic surveillance. Nonetheless, some challenges remain to be addressed in terms of multi-source heterogeneous data fusion, deep mining, and comprehensive applications. Spatio-Temporal Artificial Intelligence (STAI) technology, which focuses on integrating spatially related time-series data, artificial intelligence models, and digital tools to provide intelligent computing platforms and applications, opens up new opportunities for scientific epidemic control. To this end, we leverage STAI and long-term experience in location-based intelligent services in this work. Specifically, we devise and develop a STAI-driven digital infrastructure, namely the WAYZ Disease Control Intelligent Platform (WDCIP), which consists of a systematic framework for building pipelines from automatic spatio-temporal data collection and processing to AI-based analysis and inference implementation, providing appropriate applications that serve various epidemic scenarios. According to the platform implementation logic, our work can be summarized in three aspects: (1) a STAI-driven integrated system; (2) a hybrid GNN-based approach for hierarchical risk assessment (as the core algorithm of WDCIP); and (3) comprehensive applications for social epidemic containment. This work makes a pivotal contribution to facilitating the aggregation and full utilization of spatio-temporal epidemic data from multiple sources, where the real-time human mobility data generated by high-precision mobile positioning plays a vital role in sensing the spread of the epidemic. So far, WDCIP has accumulated more than 200 million users who have been served in life convenience and decision-making during the pandemic.
Funding: Funded by the Key Research and Development Plan of Shaanxi Province, "Human robot interaction technology and implementation of bionic robotic arm based on remote operation" (2023-ZDLGY-24).
Abstract: With the advancement of human-computer interaction, surface electromyography (sEMG)-based gesture recognition has garnered increasing attention. However, effectively utilizing the spatio-temporal dependencies in sEMG signals and integrating multiple key features remain significant challenges for existing techniques. To address this issue, we propose a model named the Two-Stream Hybrid Spatio-Temporal Fusion Network (TS-HSTFNet). Specifically, we design a dynamic spatio-temporal graph convolution module that employs an adaptive dynamic adjacency matrix to fully explore the spatial dynamic patterns in the sEMG signals. Additionally, a spatio-temporal attention fusion module is designed to fully utilize the potential correlations among multiple features for the final fusion. The results indicate that the proposed TS-HSTFNet model achieves 84.96% and 88.08% accuracy on the Ninapro DB2 and Ninapro DB5 datasets, respectively, demonstrating high precision in gesture recognition. Our work emphasizes the importance of extracting spatio-temporal features in gesture recognition and provides a novel approach for multi-source information fusion.
Funding: Fundamental Research Funds for Inner Mongolia Directly Affiliated Universities (Grant No. BR221032); the First Class Disciplines Research Special Project (Grant No. YLXKZX-NND-009).
Abstract: Traditional sheep identification is based on ear tags. However, the application of ear tags not only causes stress to the animals but also leads to loss of ear tags, which affects the correct recognition of sheep identity. In contrast, the acquisition of sheep face images offers the advantages of being non-invasive and stress-free for the animals. Nevertheless, existing convolutional neural network-based sheep face identification models are prone to inadequate refinement, which makes their implementation on farms challenging. To address this issue, this study presents a novel sheep face recognition model that employs advanced feature fusion techniques and precise image segmentation strategies. The images were preprocessed and accurately segmented using deep learning techniques, with a dataset constructed containing sheep face images from multiple viewpoints (left, front, and right faces). In particular, the model employs a segmentation algorithm to delineate the sheep face region accurately, utilizes the Improved Convolutional Block Attention Module (I-CBAM) to emphasize the salient features of the sheep face, and achieves multi-scale fusion of the features through a Feature Pyramid Network (FPN). This process ensures that the features captured from different viewpoints can be efficiently integrated to enhance recognition accuracy. Furthermore, the model ensures the precise delineation of sheep facial contours by streamlining the image segmentation procedure, thereby establishing a robust basis for the precise identification of sheep identity. The findings demonstrate that the recognition accuracy of the Sheep Face Mask Region-based Convolutional Neural Network (SFMask RCNN) model has been enhanced by 9.64% to 98.65% in comparison to the original model. The method offers a novel technological approach to the management of animal identity in the context of sheep husbandry.
Funding: National Natural Science Foundation of China (Grant Nos. 72171092, 52192664 and 71821001); Natural Science Fund for Distinguished Young Scholars of Hubei Province, China (Grant No. 2021CFA091).
Abstract: Large-scale machinery operated in a coordinated manner in earthworks for mining poses high safety risks. Efficient scheduling of such machinery, factoring in safety constraints, could save time and significantly improve overall safety. This paper develops a model of automated equipment scheduling in mining earthworks and presents a scheduling algorithm based on deep reinforcement learning with spatio-temporal safety constraints. The algorithm not only performed well on safety parameters, but also outperformed randomized instances of various sizes set against real mining applications. Further, the study reveals that responsiveness to spatio-temporal safety constraints noticeably increases as the scheduling size increases. This method provides noticeable improvements to safe automated scheduling in mining.
Funding: Supported in part by the Major Key Project of PCL, China (PCL2023AS7-1 and PCL2023A09); in part by the National Key R&D Program of China (2023YFA1011601); in part by the National Natural Science Foundation of China (Grant Nos. 62106224 and U21A20478); and in part by the Guangzhou Science and Technology Plan Project (2024A04J3749).
Abstract: Multi-view learning is an emerging field that aims to enhance learning performance by leveraging multiple views or sources of data across various domains. By integrating information from diverse perspectives, multi-view learning methods effectively enhance accuracy, robustness, and generalization capabilities. The survey broadly categorizes existing research on multi-view learning into four groups based on the tasks it encompasses, namely multi-view classification approaches, multi-view semi-supervised classification approaches, multi-view clustering approaches, and multi-view semi-supervised clustering approaches. Despite its potential advantages, multi-view learning poses several challenges, including view inconsistency, view complementarity, optimal view fusion, the curse of dimensionality, scalability, limited labels, and generalization across domains. Nevertheless, these challenges have not discouraged researchers from exploring the potential of multi-view learning. It continues to be an active and promising research area, capable of effectively addressing complex real-world problems.
Funding: Supported by the National Natural Science Foundation of China (No. 62372375), the Shaanxi Province Key R&D Program (No. 2023-YBSF-114), and the CAAI-Huawei MindSpore Open Fund (No. CAAIXSJLJJ-2022-035A).
Abstract: Interactions between drugs and microbes that affect microbial abundance can lead to various diseases or reduce the effectiveness of pharmaceutical treatments. Traditional Microbe-Drug Association (MDA) determination through biological assays is time-consuming and costly. With the accumulation of MDA data, computational methods have become a promising approach to infer potential MDAs. Although existing methods focus on predicting whether a drug interacts with a microbe, they can rarely infer whether a drug promotes or inhibits the abundance of a given microbe. Moreover, the extreme imbalance among abundance-promoted, abundance-inhibited, and non-impacted cases remains a challenge for computational prediction methods. To address these issues, we propose a framework for predicting the imbalanced Impact of Drugs on Microbial Abundance by leveraging Multi-view Learning and Data Augmentation, named IDMA-MLDA. IDMA-MLDA employs a novel method of transforming a bipartite graph into a hypergraph, uses hypergraph convolutions to capture high-order vertex neighborhoods (macro-view), and employs graph neural networks to learn individual features of drugs and microbes (micro-view). It integrates features from both the macro-view and the micro-view to obtain more comprehensive representations, incorporates a data augmentation module to handle class imbalance, and uses a multilayer perceptron to predict the impact of drugs on microbial abundance. We demonstrate the superiority of IDMA-MLDA through comparisons with six baseline methods, and ablation studies affirm the contribution of each key module to IDMA-MLDA’s predictions. Furthermore, a comprehensive literature review verifies the abundance types of twelve MDAs predicted by IDMA-MLDA.
Funding: The National Natural Science Foundation of China (No. 51875100).
Abstract: To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. First, multi-view features from the time domain, frequency domain, and time-frequency domain are extracted through the Fourier transform, Hilbert transform, and empirical mode decomposition (EMD). Then, a random forest (RF) model is applied to select features that are highly correlated with the bearing operating state. Subsequently, the selected features are fused via an autoencoder (AE) to further reduce redundancy. Finally, the effectiveness of the fused features is evaluated by a support vector machine (SVM). The experimental results indicate that the proposed multi-view feature fusion method can effectively reflect differences in the state of the rolling bearing and improve the accuracy of fault diagnosis.
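The first step of the pipeline described above (extracting features from more than one view of a vibration signal) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the particular features (RMS, peak, kurtosis, dominant frequency, spectral centroid), and the use of a naive discrete Fourier transform are assumptions chosen for illustration, and the Hilbert transform, EMD, RF selection, AE fusion, and SVM stages are omitted.

```python
import cmath
import math


def time_domain_features(signal):
    """Hypothetical time-domain view: RMS, peak amplitude, and kurtosis."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    var = sum((x - mean) ** 2 for x in signal) / n
    kurtosis = (sum((x - mean) ** 4 for x in signal) / n) / var ** 2 if var > 0 else 0.0
    return {"rms": rms, "peak": peak, "kurtosis": kurtosis}


def frequency_domain_features(signal, sample_rate):
    """Hypothetical frequency-domain view from a naive O(n^2) DFT."""
    n = len(signal)
    bins = range(1, n // 2)  # skip the DC bin, keep positive frequencies
    mags = [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in bins
    ]
    freqs = [k * sample_rate / n for k in bins]
    total = sum(mags)
    dominant = freqs[mags.index(max(mags))]
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total if total > 0 else 0.0
    return {"dominant_freq": dominant, "spectral_centroid": centroid}


def multi_view_features(signal, sample_rate):
    """Concatenate the per-view features into one flat feature vector."""
    views = [
        time_domain_features(signal),
        frequency_domain_features(signal, sample_rate),
    ]
    return [value for view in views for value in view.values()]


# Synthetic 50 Hz tone sampled at 1 kHz, standing in for a bearing vibration signal.
sample_rate = 1000
signal = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(200)]
features = multi_view_features(signal, sample_rate)
```

In a fuller pipeline, a vector like `features` would be computed per signal segment and then passed to feature selection and fusion; a library FFT would replace the naive DFT for signals of realistic length.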