Autism spectrum disorder (ASD) is a highly heterogeneous neurodevelopmental disorder. Early diagnosis and intervention are crucial for improving outcomes. Traditional single-modality diagnostic methods are subjective, limited, and struggle to reveal the underlying pathological mechanisms. In contrast, multimodal data analysis integrates behavioral, physiological, and neuroimaging information with advanced machine-learning and deep-learning algorithms to overcome these limitations. In this review, we surveyed the recent pediatric ASD literature, highlighting artificial intelligence-driven diagnostic techniques, multimodal data fusion strategies, and emerging trends in ASD assessment. We included studies that integrated two or more modalities and summarized their fusion levels, learning paradigms, tasks, datasets, and metrics. Multimodal approaches outperform single-modality baselines in classification, severity estimation, and subtyping by leveraging complementary information and reducing modality-specific biases. They also enhance diagnostic accuracy and comprehensiveness, enabling early screening of ASD, symptom subtyping, severity assessment, and personalized interventions. Advances in multimodal fusion techniques have promoted progress in precision medicine for the treatment of ASD.
[Objective] Accurate prediction of tomato growth height is crucial for optimizing production environments in smart farming. However, current prediction methods predominantly rely on empirical, mechanistic, or learning-based models that use either image data or environmental data. These methods fail to fully leverage multi-modal data to capture the diverse aspects of plant growth. [Methods] To address this limitation, a two-stage phenotypic feature extraction (PFE) model was developed, based on the deep learning algorithms recurrent neural network (RNN) and long short-term memory (LSTM). The model integrated environment and plant information to provide a holistic view of the growth process and employed phenotypic and temporal feature extractors to capture both types of features. This enabled a deeper understanding of the interaction between tomato plants and their environment, ultimately leading to highly accurate predictions of growth height. [Results and Discussions] The experimental results showed the model's effectiveness. When predicting the next two days from the past five days, the PFE-based RNN and LSTM models achieved mean absolute percentage errors (MAPE) of 0.81% and 0.40%, respectively, significantly lower than the 8.00% MAPE of the large language model (LLM) and the 6.72% MAPE of the Transformer-based model. In longer-term predictions (4 days ahead from the past 10 days, and 12 days ahead from the past 30 days), the PFE-RNN model continued to outperform the other two baseline models, with MAPE of 2.66% and 14.05%, respectively. [Conclusions] The proposed method, which leverages phenotypic-temporal collaboration, shows great potential for intelligent, data-driven management of tomato cultivation, making it a promising approach for enhancing the efficiency and precision of smart tomato planting management.
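The tomato-height results above are reported as mean absolute percentage error (MAPE). As a reminder of what that figure measures, here is a minimal sketch using the standard MAPE definition; the function name and the sample heights are illustrative assumptions, not values from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Standard definition: mean of |actual - predicted| / |actual|, times 100.
    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical plant heights in centimetres (made-up numbers for illustration).
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_percent = mape(observed, forecast)
```

Because each error is divided by the observed value, MAPE is scale-free, which is why it allows comparison across prediction horizons with different height ranges.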
Forecasting energy demand is essential for optimizing energy generation and effectively predicting power system needs. Recently, many researchers have developed various models on tabular datasets to enhance the effectiveness of demand prediction, including neural networks, machine learning, deep learning, and advanced architectures such as CNN and LSTM. However, research on CNN models has struggled to provide reliable outcomes due to insufficient dataset sizes, repeated investigations, and inappropriate baseline selection. To address these challenges, we propose a Tabular data-based Lightweight Convolutional Neural Network (TLCNN) model for predicting energy demand. It frames the problem as a regression task that effectively captures complex data trends for accurate forecasting. The BanE-16 dataset is preprocessed using normalization techniques for categorical and numerical data before training the model. The proposed approach dynamically selects relevant features through a two-dimensional convolutional structure that improves adaptability. The model's performance is evaluated using MSE, MAE, and Accuracy metrics. Experimental results show that TLCNN achieves a 10.89% lower MSE than traditional ML algorithms, demonstrating superior predictive capability. Additionally, TLCNN's lightweight structure enhances generalization while reducing computational costs, making it suitable for real-world energy forecasting tasks. This study contributes to energy informatics by introducing an optimized deep-learning framework that improves demand prediction by ensuring robustness and adaptability for tabular data.
For a multi-mode radar working in the modern electronic battlefield, different working states of a single radar are prone to being classified as multiple emitters when traditional classification methods are used to process intercepted signals, which has a negative effect on signal classification. A classification method based on spatial data mining is presented to address this challenge. Inspired by the idea of spatial data mining, the method applies a nuclear field to depict the distribution of pulse samples in feature space, and uncovers the hidden cluster information by analyzing the distribution characteristics. In addition, a membership-degree criterion is established to quantify the correlation among all classes, which ensures the classification accuracy of signal samples. Numerical experiments show that the presented method can effectively prevent different working states of a multi-mode emitter from being classified as several emitters, and achieves higher classification accuracy.
Real industrial processes have multiple operating modes, and the collected data follow a complex multimodal distribution. Most traditional process monitoring methods are therefore no longer applicable, because they presume that the sampled data obey a single Gaussian or non-Gaussian distribution. To solve these problems, a novel weighted local standardization (WLS) strategy is proposed to standardize the multimodal data; it eliminates the multi-mode characteristics of the collected data and normalizes them into a unimodal distribution. After a detailed analysis of the proposed preprocessing strategy, a new algorithm that combines the WLS strategy with support vector data description (SVDD) is put forward for multi-mode process monitoring. Unlike strategies that build multiple local models, the developed method contains only a single model and requires no prior knowledge of the multi-mode process. To demonstrate its validity, the method is applied to a numerical example and to the Tennessee Eastman (TE) process. The simulation results show that the WLS strategy is very effective at standardizing multimodal data, and that the WLS-SVDD monitoring method has great advantages over the traditional SVDD and over PCA combined with a local standardization strategy (LNS-PCA) in multi-mode process monitoring.
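The idea of standardizing each sample against its local neighbourhood, so that data from several operating modes collapse onto one distribution, can be illustrated with a small sketch. The abstract does not describe the weighting scheme of WLS, so the code below is an assumption-laden simplification: plain unweighted standardization using the k nearest neighbours, with a hypothetical function name and made-up data.

```python
import math


def local_standardize(samples, k):
    """Standardize each sample by the mean and standard deviation of its
    k nearest neighbours (Euclidean distance, the sample itself included).

    Unweighted simplification for illustration; not the paper's WLS weights.
    """
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    dim = len(samples[0])
    result = []
    for x in samples:
        # k closest samples to x; x itself is at distance 0 and always included.
        neighbours = sorted(samples, key=lambda y: math.dist(x, y))[:k]
        means = [sum(n[j] for n in neighbours) / k for j in range(dim)]
        stds = [
            math.sqrt(sum((n[j] - means[j]) ** 2 for n in neighbours) / k)
            for j in range(dim)
        ]
        # Guard against zero spread in a neighbourhood.
        result.append(
            [(x[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dim)]
        )
    return result


# Two operating modes centred near 1 and near 101 (made-up one-dimensional data).
data = [[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]]
scaled = local_standardize(data, k=3)
```

After scaling, corresponding points in the two modes receive the same standardized value, so a single monitoring model can be fitted to the pooled data without knowing the mode labels.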
Mill vibration is a common problem in rolling production, which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases. The existing vibration prediction models do not consider the features contained in the data, resulting in limited improvement of model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. In the model, the long-term and short-term modal features of multi-dimensional data are considered, and appropriate prediction algorithms are selected for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the model is applied to predict mill vibration for the first time. The experimental results show that the correlation coefficient (R^2) of the proposed model is 92.5% and the root-mean-square error (RMSE) is 0.0011, which significantly improves the modeling accuracy compared with existing models. The proposed model is also suitable for the hot rolling process, providing a new method for the prediction of strip rolling vibration.
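The two figures quoted for the vibration model, R^2 and RMSE, can be computed as in the sketch below. It assumes R^2 denotes the usual coefficient of determination (the abstract calls it a correlation coefficient and does not give the formula); the function names and the sample series are illustrative only.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up vibration amplitudes, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
modelled = [1.0, 2.0, 3.0, 5.0]
print(rmse(measured, modelled), r_squared(measured, modelled))
```

RMSE is expressed in the units of the signal, while R^2 is unitless, so the two are usually reported together as in the abstract.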
To address the difficulty of detecting distant and hard objects, which is caused by the sparseness and insufficient semantic information of LiDAR point clouds, a 3D object detection network with adaptive multi-modal data fusion is proposed; it makes use of multi-neighborhood voxel information and image information. First, an improved ResNet is designed that preserves the structural information of distant and hard objects in low-resolution feature maps, which makes it more suitable for the detection task. Meanwhile, the semantics of each image feature map are enhanced with semantic information from all subsequent feature maps. Second, multi-neighborhood context information with different receptive field sizes is extracted to compensate for the sparseness of the point cloud, which improves the ability of voxel features to represent the spatial structure and semantic information of objects. Finally, a multi-modal feature adaptive fusion strategy is proposed that uses learnable weights to express the contribution of each modality's features to the detection task, and voxel attention further enhances the fused feature representation of effective target objects. Experimental results on the KITTI benchmark show that this method outperforms VoxelNet by remarkable margins, increasing AP by 8.78% and 5.49% on the medium and hard difficulty levels. The method also achieves better detection performance than many mainstream multi-modal methods, improving AP by 1% over MVX-Net on the medium and hard difficulty levels.
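The fusion step, in which learnable weights express how much each modality contributes, can be sketched generically. The abstract does not specify the fusion module, so this is a minimal illustration under stated assumptions: one scalar logit per modality, normalized by a softmax, then a weighted sum of equally sized feature vectors. All names and numbers are hypothetical, and the logits would be learned during training in a real network.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scalars."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(features, logits):
    """Weighted sum of per-modality feature vectors.

    features: list of equal-length vectors, one per modality.
    logits:   one scalar per modality (fixed here, learnable in practice).
    """
    weights = softmax(logits)
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


# Hypothetical voxel and image features for one location.
voxel_feature = [1.0, 2.0]
image_feature = [3.0, 4.0]
fused = fuse_features([voxel_feature, image_feature], logits=[0.0, 0.0])
```

With equal logits both modalities contribute equally; raising one logit shifts the fused feature towards that modality.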
To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized adaptively using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
To improve the traffic scheduling capability of operator data center networks, an analysis prediction and online scheduling mechanism (APOS) is designed that considers both the network structure and the network traffic in the operator data center. The Fibonacci tree optimization algorithm (FTO) is embedded into the analysis prediction and online scheduling stages, and an FTO traffic scheduling strategy is proposed. By exploiting the global optimization and multi-modal optimization advantages of FTO, the optimal traffic scheduling solution and many suboptimal solutions can be obtained. The experimental results show that the FTO traffic scheduling strategy can schedule traffic in data center networks reasonably and effectively improve load balancing in the operator data center network.
In this study, we present a Transformer-based encoder model to predict Alzheimer's Disease (AD) progression from longitudinal multi-modal patient data. Our model, Longitudinal Survival Model for AD (LSM-AD), leverages rich temporal patterns present in sequences of patient visits, integrating multi-modal data, such as cognitive assessments and Magnetic Resonance Imaging (MRI) biomarkers, to compute accurate diagnostic predictions. We conduct an empirical evaluation across two patient groups, Cognitively Normal (CN) individuals and those with Mild Cognitive Impairment (MCI), tracking their progression for up to five follow-up years. Our results indicate that incorporating longer patient histories can yield superior performance compared to relying solely on a single visit, emphasizing the importance of historical context in improving predictive accuracy. Additionally, we show that the choice of the prediction head, training loss function, and method for handling input missingness can significantly impact the quality of predictions. Notably, LSM-AD can improve the Area Under the Receiver Operating Characteristic (AUROC) curve by up to 15% over the previous state of the art when MRI biomarkers serve as the sole longitudinal feature. Our findings reinforce the value of multi-modal longitudinal data in evaluating patients, demonstrating its potential to improve early detection and monitoring of AD progression. Our code is available at https://github.com/batuhankmkaraman/LSM-AD.
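Large language models (LLMs) show great potential for generating tabular data, but the data they generate often fail to preserve the dependencies between columns accurately. To address this problem, TabProLLM, a method based on LLM probability prompts, is proposed; it generates the numerical and categorical columns of a table separately. A Gaussian mixture model (GMM) is used to split the probability density curve of each numerical column into several normal distributions, and probability prompts built from these distributions are given to the LLM to generate numerical-column data. For categorical columns, a numerical column is taken as the reference for partitioning, the conditional probability distribution of each category over the different numerical intervals is computed, and prompts built from these conditional distributions are used to generate categorical-column data. During prompt generation, indicators such as correlation coefficients are also introduced to verify that the dependencies among variables in the generated data match the correlation patterns of the original data. Experiments on 10 public datasets show that, while preserving data privacy, TabProLLM improves performance by about 18% on several fidelity metrics from the SDMetrics tool, including RangeCoverage, CategoryCoverage, KSComplement, and TVComplement. Its CorrelationSimilarity score is essentially on par with that of the best model, TabDDPM, and about 4.1% higher than that of GPT-4o with mean-variance prompts. In the privacy evaluation, TabProLLM's DCR and NNDR (5th percentile) scores rank best and second best overall.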
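The categorical-column step, which partitions on a numerical column and computes the conditional probability of each category within each numerical interval, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the cut points are supplied by hand rather than derived from a GMM, and the function names, column name, and prompt wording are all hypothetical.

```python
import bisect
from collections import Counter, defaultdict


def conditional_category_probs(numeric_values, categories, cut_points):
    """Estimate P(category | numeric interval) from paired column values.

    cut_points must be sorted; interval i holds values v with
    cut_points[i-1] <= v < cut_points[i] (open-ended at both extremes).
    """
    counts = defaultdict(Counter)
    for value, category in zip(numeric_values, categories):
        interval = bisect.bisect_right(cut_points, value)
        counts[interval][category] += 1
    probs = {}
    for interval, counter in counts.items():
        total = sum(counter.values())
        probs[interval] = {cat: n / total for cat, n in counter.items()}
    return probs


def build_prompt(column_name, probs):
    """Render the conditional distributions as text (hypothetical wording)."""
    lines = [f"Generate values for the categorical column '{column_name}'."]
    for interval in sorted(probs):
        parts = ", ".join(
            f"{cat}: {p:.2f}" for cat, p in sorted(probs[interval].items())
        )
        lines.append(f"Interval {interval}: {parts}")
    return "\n".join(lines)


# Made-up rows: a numeric column paired with a categorical column.
probs = conditional_category_probs([1, 2, 8, 9], ["a", "a", "a", "b"], [5])
prompt = build_prompt("grade", probs)
```

Conditioning the categorical distribution on the numerical interval is what lets the prompt carry cross-column dependency information rather than only marginal frequencies.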
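To address the drop in transaction rates caused by inaccurate price prediction on online freight platforms, a model for local context learning with TabPFN based on shingling retrieval (ShinglingPFN) is proposed, built on the tabular prior-data fitted network (TabPFN). First, the model uses the w-Shingling retrieval algorithm to match the historical orders most similar to the order being predicted, building locally relevant context data. Then, a pretrained TabPFN model instance is loaded and initialized, and the selected orders are fed to it so that TabPFN learns the relationship between freight features and freight charges from this context. Finally, the freight-charge prediction for the sample is output. The results show that ShinglingPFN reduces the mean absolute error (MAE) by 30.98% compared with a random forest (RF) model. A global sensitivity analysis further enhances the model's interpretability. ShinglingPFN can support platforms in making decisions when optimizing pricing strategies.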
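The retrieval step of ShinglingPFN, which selects the historical orders most similar to the order being priced, rests on w-shingling. Below is a minimal sketch of that general technique: each record is turned into a set of w-token shingles and records are ranked by Jaccard similarity. How the paper tokenizes orders is not stated, so whitespace tokenization, the function names, and the sample orders are assumptions.

```python
def shingles(tokens, w):
    """Return the set of contiguous w-token shingles of a token list."""
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve_similar(query, history, w=2, top_k=3):
    """Rank historical records by shingle overlap with the query."""
    query_shingles = shingles(query.split(), w)
    scored = [
        (jaccard(query_shingles, shingles(record.split(), w)), record)
        for record in history
    ]
    # Stable sort keeps the original order among equally similar records.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


# Made-up freight orders encoded as whitespace-separated fields.
past_orders = [
    "steel Shanghai to Wuhan 25t",
    "coal Beijing to Harbin 20t",
    "steel Shanghai to Nanjing 20t",
]
context = retrieve_similar("steel Shanghai to Wuhan 20t", past_orders, w=2, top_k=2)
```

The retrieved records would then serve as the in-context training set passed to the pretrained tabular model, which is the local context learning idea described in the abstract.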
The Saraikistan area of Pakistan (South Punjab and its surroundings) lies in central Pakistan. It exposes Triassic-Jurassic to Recent sedimentary strata of marine and terrestrial origin. Most of the Mesozoic and Early Cenozoic is represented by marine strata with rare terrestrial deposits, while the Late Cenozoic is represented by continental fluvial deposits. The area hosts significant mineral deposits, and their development can play a significant role in the development of the Saraikistan region and ultimately of Pakistan. Data on recently discovered biotas of Cambrian to Miocene age are tabulated for quick reference. The Mesozoic biotas show a prominent paleobiogeographic link with Gondwana, and the Cenozoic biotas with Eurasia. The phylogeny and hypodigm of Poripuchian titanosaurs from India and Pakistan are also touched on.
The fasteners used in railway tracks are susceptible to defects because of their intricate composition, and foreign objects are frequently observed on the track bed in an open environment. Both types of defects pose potential threats to high-speed trains, so timely and accurate track inspection is necessary. Most existing automatic inspection methods rely on visible-light data alone, and their performance is affected by complex environments. Furthermore, because of the single information dimension, detection accuracy is low for similar, occluded, and small objects. To address these issues, this paper proposes a track defect detection method based on dynamic multi-modal fusion and enhanced perception of challenging objects. First, in light of the differences in the representation dimensions of multi-modal information, a dynamic weighted multi-modal feature fusion module is proposed. The fused multi-modal features are assigned weights and then multiplied with the extracted single-modal features at multiple levels, adaptively adjusting the response of the fusion features. Second, a novel stepwise multi-scale convolution feature aggregation module is proposed for challenging objects. It employs depthwise separable convolution and cross-scale aggregation over different receptive fields to enhance feature extraction and reuse, thereby reducing the progressive loss of effective information. Extensive experiments on the constructed RGBD dataset demonstrate the efficacy of the proposed method in comparison with eight established methods, covering both single-modal and multi-modal approaches.
Funding (autism spectrum disorder review): supported by the National Key Research and Development Program of China (Research Grant Number: 2023YFC3603600).
Funding (multi-mode radar classification): supported by the National Natural Science Foundation of China (61371172); the International S&T Cooperation Program of China (2015DFR10220); the Ocean Engineering Project of National Key Laboratory Foundation (1213); and the Fundamental Research Funds for the Central Universities (HEUCF1608).
Funding (WLS-SVDD multi-mode process monitoring): Project (61374140) supported by the National Natural Science Foundation of China.
Funding (cold rolling vibration prediction): Project (2023JH26-10100002) supported by the Liaoning Science and Technology Major Project, China; Projects (U21A20117, 52074085) supported by the National Natural Science Foundation of China; Project (2022JH2/101300008) supported by the Liaoning Applied Basic Research Program Project, China; Project (22567612H) supported by the Hebei Provincial Key Laboratory Performance Subsidy Project, China.
Funding (3D object detection with multi-modal fusion): National Youth Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); Jiangsu University Superior Discipline Construction Project.
Funding (APOS traffic scheduling): supported by the National Natural Science Foundation of China (No. 62163036).
Funding: Funded by the National Institutes of Health (NIH), USA (Nos. R01AG053949 and R01MH130899), and the National Science Foundation (NSF) CAREER award, USA (No. 1748377).
Abstract: In this study, we present a Transformer-based encoder model to predict Alzheimer's Disease (AD) progression from longitudinal multi-modal patient data. Our model, Longitudinal Survival Model for AD (LSM-AD), leverages rich temporal patterns present in sequences of patient visits, integrating multi-modal data such as cognitive assessments and Magnetic Resonance Imaging (MRI) biomarkers to compute accurate diagnostic predictions. We conduct an empirical evaluation across two patient groups, Cognitively Normal (CN) individuals and those with Mild Cognitive Impairment (MCI), tracking their progression for up to five follow-up years. Our results indicate that incorporating longer patient histories can yield superior performance compared to relying solely on a single visit, emphasizing the importance of historical context in improving predictive accuracy. Additionally, we show that the choice of prediction head, training loss function, and method for handling input missingness can significantly impact the quality of predictions. Notably, LSM-AD can improve the Area Under the Receiver Operating Characteristic curve (AUROC) by up to 15% over the previous state of the art when MRI biomarkers serve as the sole longitudinal feature. Our findings reinforce the value of multi-modal longitudinal data in evaluating patients, demonstrating its potential to improve early detection and monitoring of AD progression. Our code is available at https://github.com/batuhankmkaraman/LSM-AD.
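Abstract: To address the decline in transaction rates caused by inaccurate price prediction on online freight platforms, a model for local context learning with the tabular prior-data fitted network (TabPFN) based on Shingling retrieval (ShinglingPFN) is proposed. First, the model uses the w-Shingling retrieval algorithm to match, from historical order data, the orders most similar to the order to be predicted, constructing locally relevant context data. Then, a pretrained TabPFN model instance is loaded and initialized, and the selected order data are fed into the model so that TabPFN learns the association between freight features and freight charges from this context. Finally, the freight charge prediction for the freight sample is output. The results show that the ShinglingPFN model reduces the mean absolute error (MAE) by 30.98% compared with the random forest (RF) model. Global sensitivity analysis further enhances the interpretability of the model. The ShinglingPFN model can provide decision support for platforms optimizing their pricing strategies.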
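The retrieval step in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the representation of each order as a list of tokens, the use of Jaccard similarity between shingle sets as the matching score, and the names `shingles`, `jaccard`, and `retrieve_similar` are assumptions. The subsequent TabPFN step is deliberately left out.

```python
def shingles(tokens, w=2):
    """Return the set of contiguous w-token shingles of a token sequence."""
    if len(tokens) < w:
        # Sequences shorter than w yield a single shingle (or none if empty).
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar(query, history, k=3, w=2):
    """Return indices of the k historical orders most similar to the query.

    query:   token list describing the order to be priced (assumed format)
    history: list of token lists, one per historical order
    Ties are broken by the original order of the history list.
    """
    q = shingles(query, w)
    scored = [(jaccard(q, shingles(h, w)), i) for i, h in enumerate(history)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:k]]
```

The rows selected by `retrieve_similar` would then serve as the local context passed to the pretrained tabular model, so that the prediction is conditioned only on orders that resemble the query.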
Abstract: The Saraikistan area (South Punjab and surroundings) is located in central Pakistan. The area exposes Triassic-Jurassic to Recent marine and terrestrial sedimentary strata. Most of the Mesozoic and Early Cenozoic is represented by marine strata with rare terrestrial deposits, while the Late Cenozoic is represented by continental fluvial deposits. The area hosts significant mineral deposits, whose development can play a significant role in the development of the Saraikistan region and ultimately of Pakistan. Data on recently discovered biotas of Cambrian to Miocene age are tabulated for quick reference. Mesozoic biotas show a prominent paleobiogeographic link with Gondwana, and Cenozoic biotas with Eurasia. The phylogeny and hypodigm of Poripuchian titanosaurs from India and Pakistan are also touched on.
Funding: Funded by the Beijing Natural Science Foundation (grant number L241078).
Abstract: The fasteners used in railway tracks are susceptible to defects because of their intricate composition, and foreign objects are frequently observed on the track bed in open environments. These two types of defects pose potential threats to high-speed trains, necessitating timely and accurate track inspection. Most existing automatic inspection methods rely on visible-light data alone, and their performance degrades in complex environments. Furthermore, because the information has only a single dimension, detection accuracy is low for similar, occluded, and small objects. To address these issues, this paper proposes a track defect detection method based on dynamic multi-modal fusion and enhanced perception of challenging objects. First, in light of the differences in the representation dimensions of multi-modal information, a dynamic weighted multi-modal feature fusion module is proposed. The fused multi-modal features are assigned weights and then multiplied with the extracted single-modal features at multiple levels, adaptively adjusting the response of the fused features. Second, a novel stepwise multi-scale convolution feature aggregation module is proposed for challenging objects. It employs depthwise separable convolution and cross-scale aggregation over different receptive fields to enhance feature extraction and reuse, thereby reducing the progressive loss of useful information. Experimental results on the constructed RGBD dataset demonstrate the efficacy of the proposed method in comparison with eight established methods, including both single-modal and multi-modal methods.