Abstract: Since chemical processes are highly non-linear and multiscale, it is vital to deeply mine the multiscale coupling relationships embedded in massive process data for the prediction and anomaly tracing of crucial process parameters and production indicators. While the integrated method of adaptive signal decomposition combined with time-series models can effectively predict process variables, it has limitations in capturing the high-frequency details of the operating state when applied to complex chemical processes. In light of this, a novel Multiscale Multi-radius Multi-step Convolutional Neural Network (Msrt Net) is proposed for mining spatiotemporal multiscale information. First, the industrial data from the Fluid Catalytic Cracking (FCC) process are decomposed using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) to extract the multi-energy-scale information of the feature subset. Then, convolution kernels with varying stride and padding structures are established to decouple the long-period operation information encapsulated within the multi-energy-scale data. Finally, a reconciliation network is trained to reconstruct the multiscale prediction results and obtain the final output. Msrt Net is initially assessed for its capability to untangle the spatiotemporal multiscale relationships among variables in the Tennessee Eastman Process (TEP). Subsequently, its performance is evaluated in predicting product yield for a 2.80×10^6 t/a FCC unit, taking diesel and gasoline yield as examples. In conclusion, Msrt Net can decouple and effectively extract spatiotemporal multiscale information from chemical process data and achieves a reduction of approximately 30% in prediction error compared with other time-series models. Furthermore, its robustness and transferability underscore its promising potential for broader applications.
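The three stages described above can be illustrated with a minimal sketch (not the authors' code): CEEMDAN decomposition of one process variable (here via the PyEMD package, an assumption), parallel 1-D convolution branches with different kernel sizes, strides, and paddings standing in for the "multi-radius, multi-step" kernels, and a small reconciliation network that merges the per-scale predictions. The layer sizes and branch configuration are illustrative only.

```python
import torch
import torch.nn as nn
from PyEMD import CEEMDAN   # assumed: CEEMDAN from the EMD-signal / PyEMD package

class MultiRadiusStepBranch(nn.Module):
    """One scale-specific branch: conv with its own kernel/stride/padding, then a scalar prediction."""
    def __init__(self, in_ch, hidden, kernel, stride, padding):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, hidden, kernel_size=kernel, stride=stride, padding=padding),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis so branches can be concatenated
            nn.Flatten(),
            nn.Linear(hidden, 1),      # per-scale yield prediction
        )

    def forward(self, x):
        return self.net(x)

class MsrtSketch(nn.Module):
    """Parallel branches with different receptive fields, merged by a reconciliation MLP."""
    def __init__(self, in_ch, hidden=16):
        super().__init__()
        configs = [(3, 1, 1), (5, 2, 2), (9, 4, 4)]   # illustrative (kernel, stride, padding) per scale
        self.branches = nn.ModuleList(
            MultiRadiusStepBranch(in_ch, hidden, k, s, p) for k, s, p in configs)
        self.reconcile = nn.Sequential(nn.Linear(len(configs), 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, x):                       # x: (batch, channels, time)
        scales = torch.cat([b(x) for b in self.branches], dim=1)
        return self.reconcile(scales)

# Decompose one measured variable into IMFs and stack them as input channels.
signal = torch.randn(1024).numpy()              # placeholder for one FCC process variable
imfs = CEEMDAN()(signal)                        # (n_imf, 1024) multi-energy-scale components
x = torch.tensor(imfs, dtype=torch.float32).unsqueeze(0)
print(MsrtSketch(in_ch=x.shape[1])(x).shape)    # torch.Size([1, 1])
```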
Funding: Anhui Province College Natural Science Fund Key Project of China (KJ2020ZD77); the Project of Education Department of Anhui Province (KJ2020A0379).
Abstract: Objective In tongue diagnosis, the location, color, and distribution of spots can be used to speculate on the viscera involved and the severity of the heat evil. This work focuses on the image analysis methods of artificial intelligence (AI) to study the spotted tongue recognition of traditional Chinese medicine (TCM). Methods A model of spotted tongue recognition and extraction is designed based on the principles of image deep learning and instance segmentation. The model includes multiscale feature map generation, region proposal searching, and target region recognition. Firstly, a deep convolutional network is used to build multiscale low- and high-abstraction feature maps, after which a target candidate box generation algorithm and selection strategy are used to select high-quality target candidate regions. Finally, a classification network is used to classify the target regions and calculate the target region pixels, yielding the region segmentation of the spotted tongue. Under non-standard illumination conditions, various tongue images were taken with mobile phones, and experiments were conducted. Results The spotted tongue recognition achieved an area under the curve (AUC) of 92.40%, an accuracy of 84.30%, a sensitivity of 88.20%, a specificity of 94.19%, a recall of 88.20%, a regional pixel accuracy (PA) of 73.00%, a mean pixel accuracy (mPA) of 73.00%, an intersection over union (IoU) of 60.00%, and a mean intersection over union (mIoU) of 56.00%. Conclusion The results of the study verify that the model is suitable for application in a TCM tongue diagnosis system. Spotted tongue recognition via a multiscale convolutional neural network (CNN) would help to improve spot classification and the accurate extraction of spot-area pixels, as well as provide a practical method for the intelligent tongue diagnosis of TCM.
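A hedged sketch of the described pipeline (multiscale feature maps, region proposals, per-region classification, and pixel counting) is shown below using torchvision's Mask R-CNN as a stand-in instance-segmentation model; the paper's exact network and the spot-annotated tongue data are not available here, and in practice the heads would first be fine-tuned on annotated spot regions.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights=None)        # FPN backbone supplies the multiscale feature maps
model.eval()                                       # inference mode; fine-tuning on spot labels omitted

image = torch.rand(3, 480, 640)                    # placeholder tongue photo, values in [0, 1]
with torch.no_grad():
    pred = model([image])[0]                       # dict with boxes, labels, scores, masks

keep = pred["scores"] > 0.5                        # keep high-quality candidate regions
masks = pred["masks"][keep] > 0.5                  # (N, 1, H, W) boolean spot masks
spot_pixels = int(masks.sum().item())              # pixel count of the segmented spot area
print(len(masks), "regions,", spot_pixels, "spot pixels")
```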
Funding: Supported by the National Natural Science Foundation of China (61601176) and the Science and Technology Foundation of Hubei Provincial Department of Education (Q20161405).
Abstract: A novel convolutional neural network based on a spatial pyramid is proposed for image classification. The network exploits image features with a spatial pyramid representation. First, it extracts global features from the original image, and then grids at different levels are used to extract feature maps from different convolutional layers. Inspired by the spatial pyramid, the new network contains two parts. One part is just like a standard convolutional neural network, composed of alternating convolution and subsampling layers; however, the outputs of those convolution layers are average-pooled over the grids to obtain feature maps, each of which is then concatenated into a feature vector. Finally, those vectors are sequentially concatenated into a total feature vector that is fed to the fully connected layer. The resulting feature vector benefits from both the final and the earlier convolution layers, while the grid size adjusts the weight of the feature maps and improves the recognition efficiency of the network. Experimental results demonstrate that this model improves accuracy and applicability compared with the traditional model.
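A minimal sketch of this grid-pooling idea is given below; the layer widths and grid levels (1, 2, 4) are assumptions, not the paper's configuration. Each convolutional block's output is average-pooled over its grid, flattened into a vector, and all vectors are concatenated before the fully connected classifier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidCNN(nn.Module):
    def __init__(self, n_classes=10, grids=(1, 2, 4)):
        super().__init__()
        self.grids = grids
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block3 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        feat_dim = sum(c * g * g for c, g in zip((16, 32, 64), grids))
        self.fc = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        vectors = []
        for block, grid in zip((self.block1, self.block2, self.block3), self.grids):
            x = block(x)
            pooled = F.adaptive_avg_pool2d(x, grid)        # grid-wise average pooling of the feature map
            vectors.append(pooled.flatten(1))              # one feature vector per convolutional layer
        return self.fc(torch.cat(vectors, dim=1))          # total feature vector -> fully connected layer

print(SpatialPyramidCNN()(torch.randn(2, 3, 64, 64)).shape)   # torch.Size([2, 10])
```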
Funding: Supported by the Shaanxi Province Natural Science Basic Research Plan Project (2023-JC-YB-244).
Abstract: The classification of infrasound events is of considerable importance in improving the capability to identify types of natural disasters. Traditional infrasound classification mainly relies on machine learning algorithms applied after artificial feature extraction; however, guaranteeing the effectiveness of the extracted features is difficult. The current trend is to use a convolutional neural network to extract features automatically for classification. This method can extract the spatial features of the signal automatically through convolution kernels; however, infrasound signals, as time series, contain not only spatial information but also temporal information, and these temporal features are also crucial. If only a convolutional neural network is used, the time dependence of the infrasound sequence will be missed. Using long short-term memory networks can compensate for the missing time-series features but induces a loss of the spatial feature information of the infrasound signal. To address these problems, a multiscale squeeze-excitation–convolutional neural network–bidirectional long short-term memory network fusion model for infrasound event classification is proposed in this study. The model automatically extracts temporal and spatial features, adaptively selects features, and realizes the fusion of the two types of features. Experimental results showed that the classification accuracy of the model was more than 98%, verifying the effectiveness and superiority of the proposed model.
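A hedged sketch of this fusion idea follows (layer sizes, kernel set, and class count are assumptions, not the paper's configuration): multiscale 1-D convolutions extract spatial features, a squeeze-excitation block performs adaptive feature selection, and a bidirectional LSTM captures the temporal dependence before classification.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(),
                                nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                              # x: (batch, channels, time)
        weights = self.fc(x.mean(dim=2))               # squeeze over time, excite per channel
        return x * weights.unsqueeze(-1)

class MultiscaleSECNNBiLSTM(nn.Module):
    def __init__(self, n_classes=4, hidden=32):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(1, 16, k, padding=k // 2) for k in (3, 7, 15))   # multiscale convolution kernels
        self.se = SEBlock(16 * 3)
        self.lstm = nn.LSTM(16 * 3, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                              # x: (batch, 1, time)
        feats = torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)
        feats = self.se(feats).transpose(1, 2)         # (batch, time, channels) for the LSTM
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])                   # classify from the last time step

print(MultiscaleSECNNBiLSTM()(torch.randn(2, 1, 1000)).shape)   # torch.Size([2, 4])
```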
Funding: Fundamental Research Funds for the Central Universities of Ministry of Education of China (No. 19D111201).
Abstract: Faced with the massive number of online shopping clothing images, classifying them quickly and accurately is a challenging image classification task. In this paper, we propose a novel method, named Multi_XMNet, to solve the clothing image classification problem. The proposed method mainly consists of two convolutional neural network (CNN) branches. One branch extracts multiscale features from the whole input image using Multi_X, which is designed by improving the Xception network, while the other extracts attention-mechanism features from the whole input image using the MobileNetV3-small network. Both the multiscale and attention-mechanism features are aggregated before classification. Additionally, in the training stage, global average pooling (GAP), convolutional layers, and a softmax classifier are used instead of the fully connected layer to classify the final features, which speeds up model training and alleviates the overfitting caused by too many parameters. Experimental comparisons are made on the public DeepFashion dataset. The results show that the classification accuracy of this method is 95.38%, which is better than InceptionV3, Xception, and InceptionV3_Xception by 5.58%, 3.32%, and 2.22%, respectively. The proposed Multi_XMNet image classification model can help enterprises and researchers in the field of clothing e-commerce to classify massive clothing images automatically, efficiently, and accurately.
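A hedged sketch of the two-branch fusion head is shown below. The Multi_X branch is stood in for by a small multiscale convolution stack (the real branch is a modified Xception, not reproduced here), the second branch reuses torchvision's MobileNetV3-small feature extractor without pretrained weights, and the final features are classified with GAP, a 1×1 convolution, and softmax rather than a fully connected layer, as the abstract describes. The class count is an assumed DeepFashion category number.

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class MultiXStandIn(nn.Module):
    """Placeholder multiscale branch: parallel 3x3/5x5/7x7 convolutions, concatenated."""
    def __init__(self, out_ch=96):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv2d(3, out_ch // 3, k, stride=4, padding=k // 2), nn.ReLU())
            for k in (3, 5, 7))

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

class MultiXMNetSketch(nn.Module):
    def __init__(self, n_classes=46):                      # assumed number of DeepFashion categories
        super().__init__()
        self.multi_x = MultiXStandIn()
        self.mobile = mobilenet_v3_small(weights=None).features   # attention-mechanism branch
        self.head = nn.Conv2d(96 + 576, n_classes, kernel_size=1)  # conv classifier instead of FC layer

    def forward(self, x):
        f1 = nn.functional.adaptive_avg_pool2d(self.multi_x(x), 1)   # GAP of each branch
        f2 = nn.functional.adaptive_avg_pool2d(self.mobile(x), 1)
        logits = self.head(torch.cat([f1, f2], dim=1)).flatten(1)    # aggregate, then classify
        return torch.softmax(logits, dim=1)

print(MultiXMNetSketch()(torch.randn(2, 3, 224, 224)).shape)   # torch.Size([2, 46])
```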
Funding: Supported by State Grid Corporation Limited Science and Technology Project Funding (Contract No. SGCQSQ00YJJS2200380).
Abstract: There is instability in the distributed energy storage cloud-group-end region on the power grid side. In order to avoid large-scale fluctuating charging and discharging in the power grid environment and make the capacitor components show a continuous and stable charging and discharging state, a hierarchical time-sharing configuration algorithm for the grid-side distributed energy storage cloud-group-end region, based on a multi-scale and multi-feature convolutional neural network, is proposed. Firstly, a voltage stability analysis model based on a multi-scale and multi-feature convolutional neural network is constructed, and the network is optimized with the Self-Organizing Maps (SOM) algorithm to analyze the voltage stability of the grid-side distributed energy storage cloud-group-end region under the framework of credibility. According to the optimal scheduling objectives and network size, the distributed robust optimal configuration control model is solved under the framework of coordinated optimal scheduling at multiple time scales. Finally, the time-series characteristics of the regional power grid load and distributed generation are analyzed. According to the regional hierarchical time-sharing configuration model of the "cloud", "group", and "end" layers, the hierarchical time-sharing configuration algorithm for the grid-side distributed energy storage cloud-group-end region is realized. The experimental results show that after applying this algorithm, the best grid-side distributed energy storage configuration scheme can be determined, and the stability of the hierarchical time-sharing configuration of the grid-side distributed energy storage cloud-group-end region can be improved.
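A rough sketch of the voltage-stability analysis model is given below; the architecture, feature set, and window length are assumptions. Each input channel is one measured feature (for example voltage, load, and distributed generation) over a time window, parallel convolutions cover several time scales, and the output is a stability score. The SOM-based optimization and the multi-time-scale scheduling model described above are not shown.

```python
import torch
import torch.nn as nn

class MultiScaleMultiFeatureCNN(nn.Module):
    def __init__(self, n_features=3):
        super().__init__()
        self.scales = nn.ModuleList(
            nn.Sequential(nn.Conv1d(n_features, 8, k, padding=k // 2), nn.ReLU(),
                          nn.AdaptiveAvgPool1d(1))
            for k in (5, 15, 45))                     # short-, mid-, and long-range time scales
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(8 * 3, 1), nn.Sigmoid())

    def forward(self, x):                             # x: (batch, features, time)
        pooled = torch.cat([s(x) for s in self.scales], dim=1)
        return self.head(pooled)                      # voltage-stability score in (0, 1)

window = torch.randn(4, 3, 288)                       # e.g. 4 regions x 3 features x 288 time samples
print(MultiScaleMultiFeatureCNN()(window).shape)      # torch.Size([4, 1])
```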
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 22204077, 22374076); the Natural Science Foundation of Jiangsu Province (BK20231455); the Open Research Program of National Major Scientific and Technological Infrastructure for Translational Medicine (TMSK-2024-115); the Fundamental Research Funds for the Central Universities (30922010501, 2023303002); and the State Key Laboratory for Analytical Chemistry for Life Science (SKLACLS2402).
Abstract: Accurate and automated segmentation of 3D biomedical images is a sophisticated imperative in clinical diagnosis, imaging-guided surgery, and prognosis judgment. Although the burgeoning of deep learning technologies has fostered smart segmentation models, capturing global and local features successively and simultaneously remains challenging, yet it is essential for an exact and efficient image-based assessment. To this end, a segmentation solution dubbed the mixed parallel shunted transformer (MPSTrans) is developed here, highlighting 3D MPST blocks in a U-form framework. It enables not only comprehensive characteristic capture and multiscale slice synchronization but also deep supervision in the decoder to facilitate the fetching of hierarchical representations. On an unpublished colon cancer data set, this model achieved an impressive increase in Dice similarity coefficient (DSC) and a 1.718 mm decrease in the 95% Hausdorff distance (HD95), alongside a substantial 56.7% reduction in computational load in giga floating-point operations per second (GFLOPs). Meanwhile, MPSTrans outperforms other mainstream methods (Swin UNETR, UNETR, nnU-Net, PHTrans, and 3D U-Net) on three public multiorgan (aorta, gallbladder, kidney, liver, pancreas, spleen, stomach, etc.) and multimodal (CT, PET-CT, and MRI) data sets: the medical segmentation decathlon (MSD) brain tumor, multi-atlas labeling beyond the cranial vault (BCV), and automated cardiac diagnosis challenge (ACDC) benchmarks, accentuating its adaptability. These results reflect the potential of MPSTrans to advance the state of the art in biomedical image analysis and would offer a robust tool for enhanced diagnostic capacity.
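The abstract reports results in terms of the Dice similarity coefficient (DSC) and HD95. A minimal reference implementation of the DSC for binary volumes is sketched below (not the authors' code; HD95 additionally requires surface-distance computation, for example via SciPy, and is omitted here).

```python
import torch

def dice_similarity_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """DSC = 2|P ∩ G| / (|P| + |G|) for boolean prediction and ground-truth masks."""
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().float()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 3D volumes standing in for a predicted and a ground-truth organ mask.
pred = torch.zeros(32, 64, 64, dtype=torch.bool); pred[8:24, 16:48, 16:48] = True
gt   = torch.zeros(32, 64, 64, dtype=torch.bool); gt[10:26, 16:48, 16:48] = True
print(float(dice_similarity_coefficient(pred, gt)))   # ≈ 0.875
```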
Funding: Fundamental Research Fund in Heilongjiang Provincial Universities (Nos. 135409602, 135409102).
Abstract: Semantic segmentation is a pixel-level classification task, and contextual information has an important impact on segmentation performance. In order to capture richer contextual information, we adopt ResNet as the backbone network and design an encoder-decoder architecture based on a multidimensional attention (MDA) module and a multiscale upsampling (MSU) module. The MDA module calculates the attention matrices of the three dimensions to capture the dependency of each position and adaptively captures the image features. The MSU module adopts parallel branches to capture the multiscale features of the images, and multiscale feature aggregation enhances the contextual information. A series of experiments demonstrate the validity of the model on the Cityscapes and CamVid datasets.
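A hedged sketch of the multiscale upsampling (MSU) idea only: parallel branches with different dilation rates aggregate multiscale context, and the decoder upsamples the prediction back to input resolution. The channel counts and dilation rates are assumptions; the MDA module and the ResNet encoder are not shown. The class count of 19 matches Cityscapes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSUSketch(nn.Module):
    def __init__(self, in_ch=256, out_ch=64, n_classes=19, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r),
                          nn.BatchNorm2d(out_ch), nn.ReLU())
            for r in rates)                                  # parallel multiscale branches
        self.classifier = nn.Conv2d(out_ch * len(rates), n_classes, kernel_size=1)

    def forward(self, x, out_size):                          # x: low-resolution encoder features
        fused = torch.cat([b(x) for b in self.branches], dim=1)   # multiscale feature aggregation
        logits = self.classifier(fused)
        return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)

feats = torch.randn(1, 256, 32, 64)                          # e.g. 1/16-resolution ResNet features
print(MSUSketch()(feats, out_size=(512, 1024)).shape)        # torch.Size([1, 19, 512, 1024])
```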
Abstract: Existing attention mechanisms only enhance the channel or spatial dimension of feature maps and fail to fully capture subtle visual elements and multiscale feature variations. To address this problem, an attention mechanism based on the fusion of local patch and global multiscale features (patch and global multiscale attention, PGMA) is proposed. The feature map is split into multiple small patches, and attention scores are computed for each patch separately, strengthening the perception of local information. A set of dilated convolutions then computes scores over the whole feature map, providing a trade-off of global multiscale information. In the experiments, PGMA is integrated into semantic segmentation networks such as U-Net, DeepLab, and SegNet, effectively improving their segmentation performance. This indicates that PGMA outperforms current mainstream methods in enhancing CNN performance.
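A speculative sketch of the PGMA idea as described above (the published design may differ): per-patch attention scores re-weight local regions, a set of dilated convolutions produces a global multiscale score map, and the two are combined to re-weight the input feature map. Patch size, channel count, and dilation rates are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PGMASketch(nn.Module):
    def __init__(self, channels=64, patch=8, rates=(1, 2, 4)):
        super().__init__()
        self.patch = patch
        self.local_score = nn.Conv2d(channels, channels, kernel_size=1)    # per-position local scoring
        self.global_score = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r) for r in rates)

    def forward(self, x):                                    # x: (B, C, H, W), H and W divisible by patch
        b, c, h, w = x.shape
        # Local branch: one attention score per patch, broadcast back over that patch.
        patch_scores = F.adaptive_avg_pool2d(self.local_score(x), (h // self.patch, w // self.patch))
        local = torch.sigmoid(F.interpolate(patch_scores, size=(h, w), mode="nearest"))
        # Global branch: multiscale score map averaged over the dilated convolutions.
        global_w = torch.sigmoid(sum(conv(x) for conv in self.global_score) / len(self.global_score))
        return x * local * global_w                          # fuse local and global re-weighting

print(PGMASketch()(torch.randn(1, 64, 32, 32)).shape)        # torch.Size([1, 64, 32, 32])
```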