Funding: The National Natural Science Foundation of China under contract Nos 42266006 and 41806114; the Jiangxi Provincial Natural Science Foundation under contract Nos 20232BAB204089 and 20202ACBL214019.
Abstract: The complexity of river-tide interaction poses a significant challenge in predicting discharge in tidal rivers. Long short-term memory (LSTM) networks excel at processing and predicting events separated by long intervals and time delays in time series data. Additionally, the sequence-to-sequence (Seq2Seq) model, known for handling temporal relationships, adapting to variable-length sequences, effectively capturing historical information, and accommodating various influencing factors, emerges as a robust and flexible tool for discharge forecasting. In this study, we apply LSTM-based Seq2Seq models for the first time to forecasting the discharge of a tidal reach of the Changjiang River (Yangtze River) Estuary. The study uses three key input characteristics: flow velocity, water level, and discharge, so a multiple-input, single-output structure is adopted. The experiment used discharge data for the whole of 2020, with the first 80% as the training set and the last 20% as the test set. The data therefore cover different tidal cycles, which helps to test the forecasting performance of different models under different tidal cycles and runoff conditions. The experimental results indicate that the proposed models have advantages in long-term, mid-term, and short-term discharge forecasting. In discharge prediction, the Seq2Seq models improved the relative standard deviation by 6%-60% compared with the harmonic analysis models and by 5%-20% compared with the improved back propagation neural network models. In addition, the relative accuracy of the Seq2Seq model is 1% to 3% higher than that of the LSTM model. Analysis of the prediction errors shows that the Seq2Seq models are insensitive to the forecast lead time and capture characteristic values such as the maximum flood tide flow and maximum ebb tide flow in the tidal cycle well. This indicates the significance of the Seq2Seq models.
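The chronological 80/20 split and the multiple-input, single-output structure described in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the window lengths `n_in` and `n_out`, the helper names, and the synthetic series are all assumptions made for the example.

```python
from typing import List, Tuple

# One time step: (flow velocity, water level, discharge)
Row = Tuple[float, float, float]


def chronological_split(rows: List[Row], train_fraction: float = 0.8):
    """Split a time-ordered series without shuffling: the first part trains, the rest tests."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def make_windows(rows: List[Row], n_in: int, n_out: int):
    """Build (input window, target window) pairs.

    Each input holds n_in consecutive rows with all three features;
    each target holds only the discharge of the following n_out rows.
    """
    pairs = []
    for start in range(len(rows) - n_in - n_out + 1):
        x = rows[start:start + n_in]
        y = [r[2] for r in rows[start + n_in:start + n_in + n_out]]
        pairs.append((x, y))
    return pairs


# Synthetic stand-in data (not from the study): 10 time steps.
series = [(0.1 * t, 1.0 + 0.01 * t, 100.0 + t) for t in range(10)]
train, test = chronological_split(series)
windows = make_windows(train, n_in=3, n_out=2)  # n_in, n_out are hypothetical
```

Splitting by position rather than at random keeps the test period strictly after the training period, which is what a forecasting evaluation needs.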
Abstract: Online sensing can provide useful information in monitoring applications such as machine health monitoring, structural condition monitoring, and environmental monitoring. Missing data is a common and significant issue in sensory data collected online by sensing systems, and it may compromise the goals of monitoring programs. In this paper, a sequence-to-sequence learning model based on a recurrent neural network (RNN) architecture is presented. In the proposed method, multivariate time series of the monitored parameters are embedded into the neural network through layer-by-layer encoders, where the hidden features of the inputs are adaptively extracted. Predictions of the missing data are then generated by the network decoders as one-step-ahead predictive data sequences of the monitored parameters. The prediction performance of the proposed model is validated on a real-world sensory dataset. The experimental results demonstrate the capability of the proposed RNN encoder-decoder model in sequence-to-sequence learning for online imputation of sensory data.
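The idea of online imputation with one-step-ahead predictions can be illustrated with a small loop. This is only a sketch under strong assumptions: it is univariate, whereas the abstract describes multivariate series, and the `moving_average` predictor is a hypothetical placeholder standing in for the RNN encoder-decoder, which the abstract does not specify in detail.

```python
from typing import Callable, List, Optional


def moving_average(history: List[float]) -> float:
    """Placeholder one-step-ahead predictor (NOT the paper's RNN model)."""
    return sum(history) / len(history)


def impute_online(
    stream: List[Optional[float]],
    predict_next: Callable[[List[float]], float],
    window: int = 3,
) -> List[float]:
    """Fill each missing reading (None) as it arrives, using only past values."""
    filled: List[float] = []
    for value in stream:
        if value is None:
            if not filled:
                raise ValueError("cannot impute before any reading has been observed")
            # Predict the missing reading from the most recent (possibly imputed) values.
            value = predict_next(filled[-window:])
        filled.append(value)
    return filled


readings = [1.0, 2.0, 3.0, None, 5.0, None]
completed = impute_online(readings, moving_average)
```

Because each gap is filled before the next reading is processed, imputed values feed into later predictions, which is the property that makes the scheme usable online.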
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-RG23148).
Abstract: In video surveillance, anomaly detection requires training machine learning models on spatio-temporal video sequences. However, video-only data is sometimes not sufficient to accurately detect all abnormal activities. Therefore, we propose a novel audio-visual spatiotemporal autoencoder specifically designed to detect anomalies in video surveillance by utilizing audio data along with video data. This paper presents a competitive multi-modal recurrent neural network approach to anomaly detection that combines separate spatial and temporal autoencoders to leverage both spatial and temporal features in audio-visual data. The proposed model is trained to produce low reconstruction error for normal data and high error for abnormal data, which distinguishes between the two and yields an anomaly score. Training is conducted on normal datasets, while testing is performed on both normal and anomalous datasets. The anomaly scores from the models are combined using a late fusion technique, and a deep dense layer model is trained to produce decisive scores indicating whether a sequence is normal or anomalous. The model’s performance is evaluated on the University of California, San Diego Pedestrian 2 (UCSD PED 2), University of Minnesota (UMN), and Tampere University of Technology (TUT) Rare Sound Events datasets using six evaluation metrics. Compared with state-of-the-art methods, it shows a high Area Under Curve (AUC) and a low Equal Error Rate (EER), achieving an AUC of 93.1 and an EER of 8.1 for the UCSD dataset, and an AUC of 94.9 and an EER of 5.9 for the UMN dataset. The evaluations demonstrate that the joint results from the combined audio-visual model outperform those from the separate models, highlighting the competitive advantage of the proposed multi-modal approach.
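The two headline metrics in this abstract, AUC and EER, can be computed from anomaly scores as sketched below. This is a plain-Python illustration of the standard definitions, not the paper's evaluation code; the scores and labels are made-up toy values, and results here are fractions rather than the percentages the abstract reports.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """Probability that a random anomaly (label 1) outscores a random normal sample (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def equal_error_rate(scores: List[float], labels: List[int]) -> float:
    """Sweep thresholds and return the error rate where FPR and FNR are closest."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, best_rate = None, None
    for thr in sorted(set(scores)):
        fnr = sum(p < thr for p in pos) / len(pos)   # anomalies missed
        fpr = sum(n >= thr for n in neg) / len(neg)  # normal samples flagged
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (fpr + fnr) / 2
    return best_rate


# Toy anomaly scores (e.g. reconstruction errors); higher means more anomalous.
toy_scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9, 0.25, 0.6]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

A higher AUC and a lower EER both indicate better separation between normal and anomalous sequences, which is why the abstract reports them together.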
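Abstract: To address the problem that traditional word vectors cannot effectively represent polysemous words during automatic text summarization, which reduces the accuracy and readability of the summaries, a method for constructing an automatic text summarization model based on BERT (Bidirectional Encoder Representations from Transformers) is proposed. The method introduces the BERT pre-trained language model to enhance the semantic representation of word vectors, and feeds the resulting word vectors into a Seq2Seq model for training to form an automatic text summarization model that generates summaries quickly. Experimental results show that the model effectively improves the accuracy and readability of generated summaries on the Gigaword dataset and can be used for automatic text summarization tasks.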
Abstract: Image captioning is an emerging field in machine learning. It refers to the ability to automatically generate a syntactically and semantically meaningful sentence that describes the content of an image. Image captioning requires a complex machine learning process because it involves two sub-models: a vision sub-model for extracting object features and a language sub-model that uses the extracted features to generate meaningful captions. Attention-based vision transformer models have recently had a great impact in the vision field. In this paper, we study the effect of using vision transformers in the image captioning process by evaluating four different vision transformer models as the vision sub-model. The first is DINO (self-distillation with no labels). The second is PVT (Pyramid Vision Transformer), a vision transformer that does not use convolutional layers. The third is XCIT (Cross-Covariance Image Transformer), which changes the self-attention operation by focusing on the feature dimension instead of the token dimension. The last is SWIN (Shifted Windows), a vision transformer that, unlike the other transformers, uses shifted windows in splitting the image. For a deeper evaluation, the four vision transformers were tested with their different versions and configurations: the DINO model with five different backbones, PVT in two versions (PVT_v1 and PVT_v2), one XCIT model, and the SWIN transformer. The results show that the SWIN transformer is highly effective within the proposed image captioning model compared with the other models.
Funding: Support from the University of Iowa Jumpstarting Tomorrow Community Feasibility Grants and OVPR Interdisciplinary Scholars Program for this study. Z. Wang and S. Xiao received support from the U.S. Department of Education (E.D. #P116S210005). Q. Wang and J. Wang acknowledge the support from the NASA Atmospheric Composition Modeling and Analysis Program (ACMAP, Grant #: 80NSSC19K0950).
Abstract: This paper presents the design of sequence-to-sequence recurrent neural network (RNN) architectures for a novel study to predict soil NOx emissions, driven by the imperative of understanding and mitigating environmental impact. The study uses data collected by the Environmental Protection Agency (EPA) to develop two distinct RNN predictive models: one built upon long short-term memory (LSTM) and the other on the gated recurrent unit (GRU). The models take a combination of historical and anticipated air temperature, air moisture, and NOx emissions as inputs to forecast future NOx emissions. Both the LSTM and GRU models can capture the intricate pulse patterns inherent in soil NOx emissions. Notably, the GRU model performs better, surpassing the LSTM model in predictive accuracy while requiring less training time. The investigation into varying input features reveals that relying solely on past NOx emissions as input yields satisfactory performance, highlighting the dominant influence of this factor. The study also examines the impact of altering input series lengths and training data sizes, yielding insights into optimal configurations for enhanced model performance. The findings promise to advance our understanding of soil NOx emission dynamics, with implications for environmental management strategies. Looking ahead, the anticipated availability of additional measurements is expected to improve machine-learning model efficacy. Furthermore, a future study will explore physical-based RNNs, a promising avenue for deeper insights into soil NOx emission prediction.
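One plausible contributor to the shorter GRU training time reported here is parameter count: in the textbook formulations, an LSTM layer has four gate/candidate weight blocks and a GRU layer has three. The sketch below counts parameters under that assumption, with one bias vector per block (some frameworks use two, which changes the totals but not the 3:4 ratio). The hidden size of 64 is hypothetical, and the input size of 3 simply assumes the three variables named in the abstract.

```python
def recurrent_layer_params(n_blocks: int, input_size: int, hidden_size: int) -> int:
    """Parameters of one recurrent layer with n_blocks gate/candidate blocks.

    Each block has an input weight matrix (hidden x input), a recurrent
    weight matrix (hidden x hidden), and one bias vector (hidden).
    """
    per_block = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return n_blocks * per_block


INPUT_SIZE = 3    # air temperature, air moisture, NOx emissions
HIDDEN_SIZE = 64  # hypothetical; not stated in the abstract

lstm_params = recurrent_layer_params(4, INPUT_SIZE, HIDDEN_SIZE)  # input, forget, output gates + candidate
gru_params = recurrent_layer_params(3, INPUT_SIZE, HIDDEN_SIZE)   # update, reset gates + candidate

print(lstm_params, gru_params, gru_params / lstm_params)  # 17408 13056 0.75
```

The count only shows that a GRU layer does less work per step at equal width; it does not by itself explain the accuracy difference the study reports.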
Abstract: Background: Human-machine dialog generation is an essential research topic in natural language processing. Generating high-quality, diverse, fluent, and emotional conversation is a challenging task. With continuing advancements in artificial intelligence and deep learning, new methods have come to the forefront in recent times. In particular, the end-to-end neural network model provides an extensible conversation generation framework that has the potential to enable machines to understand semantics and automatically generate responses. However, neural network models come with their own set of questions and challenges. The basic conversational model framework tends to produce universal, meaningless, and relatively "safe" answers. Methods: Based on generative adversarial networks (GANs), a new emotional dialog generation framework called EMC-GAN is proposed in this study. The proposed model comprises one generative model and three discriminative models. The generator is based on the basic sequence-to-sequence (Seq2Seq) dialog generation model, and the aggregate discriminative model for the overall framework consists of a basic discriminative model, an emotion discriminative model, and a fluency discriminative model. The basic discriminative model distinguishes generated fake sentences from real sentences in the training corpus. The emotion discriminative model evaluates whether the emotion conveyed by the generated dialog agrees with a pre-specified emotion, and directs the generative model to generate dialogs that correspond to the category of the pre-specified emotion. Finally, the fluency discriminative model assigns a score to the fluency of the generated dialog and guides the generator to produce more fluent sentences. Results: The experimental results confirm the superiority of the proposed model over similar existing models with respect to emotional accuracy, fluency, and consistency. Conclusions: The proposed EMC-GAN model is capable of generating consistent, smooth, and fluent dialog that conveys pre-specified emotions, and it outperforms its competitors in emotional accuracy, consistency, and fluency.
Funding: Supported in part by the National Natural Science Foundation of China (No. 62273193), Tsinghua University-Meituan Joint Institute for Digital Life, and the Research and Development Project of CRSC Research & Design Institute Group Co., Ltd.
Abstract: The on-demand food delivery (OFD) service has developed rapidly in the past decades but faces challenges in further improving operation quality. The order dispatching problem, which refers to dynamically and reasonably dispatching a large number of orders to riders within a very limited decision time, is one of the issues of greatest concern for OFD platforms. To solve this challenging combinatorial optimization problem, an effective matching algorithm is proposed that fuses a reinforcement learning technique with an optimization method. First, to deal with the large-scale complexity, a decoupling method is designed that reduces the matching space between new orders and riders. Second, to cope with the high dynamism and satisfy the stringent requirements on decision time, a reinforcement-learning-based dispatching heuristic is presented. Specifically, a sequence-to-sequence neural network is constructed based on the problem characteristics to generate an order priority sequence. In addition, a training approach is specially designed to improve learning performance. Furthermore, a greedy heuristic is employed to dispatch new orders according to the order priority sequence. Numerical experiments on real-world datasets validate the effectiveness of the proposed algorithm. Statistical results show that the proposed algorithm can effectively solve the problem by improving delivery efficiency and maintaining customer satisfaction.
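The final step described above, greedily dispatching orders in the sequence given by the network, can be sketched as follows. This is a minimal illustration under assumptions: the priority sequence is taken as given (in the paper it comes from the sequence-to-sequence network), and the cost table, rider capacities, and function name are hypothetical.

```python
from typing import Dict, List


def greedy_dispatch(
    priority_sequence: List[str],
    cost: Dict[str, Dict[str, float]],
    capacity: Dict[str, int],
) -> Dict[str, str]:
    """Assign orders one by one, in priority order, to the cheapest rider with spare capacity."""
    load = {rider: 0 for rider in capacity}
    assignment: Dict[str, str] = {}
    for order in priority_sequence:
        candidates = [r for r in capacity if load[r] < capacity[r]]
        if not candidates:
            break  # every rider is full; remaining orders stay unassigned
        best = min(candidates, key=lambda r: cost[order][r])
        assignment[order] = best
        load[best] += 1
    return assignment


# Hypothetical incremental delivery cost of giving each order to each rider.
cost = {
    "o1": {"A": 5.0, "B": 3.0},
    "o2": {"A": 4.0, "B": 2.0},
    "o3": {"A": 6.0, "B": 1.0},
}
capacity = {"A": 2, "B": 1}

plan_1 = greedy_dispatch(["o2", "o1", "o3"], cost, capacity)
plan_2 = greedy_dispatch(["o3", "o1", "o2"], cost, capacity)
```

The two plans differ only because the priority sequence differs, which shows why learning a good sequence matters even when the assignment rule itself is a simple greedy one.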
Funding: Supported by the German Federal Ministry of Education (grant number 01LY2111A) and the German Federal Ministry of Economics and Climate Action (grant number 03EI3092A).
Abstract: The intermittency of renewable energy is a key limiting factor for the successful decarbonization of both energy producing and consuming sectors. Green hydrogen has the potential to act as the central energy vector connecting hard-to-abate sectors to renewable power. However, combining energy storage and conversion in a holistic electrolyzer system remains challenging. Here, we present the innovative Zink-Zwischenschritt Elektrolyseur (ZZE), or Zinc Intermediate step Electrolyzer in English, which temporarily decouples the water splitting reaction and uses zinc to store electrical energy in chemical form. To operate a ZZE system optimally, machine learning models were applied to predict the state of charge of a lab-scale ZZE system. Using various models, we determined the effectiveness of the prediction and contrasted it with state of charge predictions for other energy storage systems. We show that a bi-directional long short-term memory neural network approach has the lowest error within the testing environment. This work supports further ZZE development as well as state of charge prediction for other novel energy storage technologies.
Funding: Supported by the National Natural Science Foundation of China (No. 62022027).
Abstract: In the era of deep learning, modeling for most natural language processing (NLP) tasks has converged into several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, named entity recognition (NER), and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have witnessed a rising trend of paradigm shift, that is, solving an NLP task in a new paradigm by reformulating the task. The paradigm shift has achieved great success on many tasks and is becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
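As a concrete picture of what reformulating a task means, the sketch below turns a sequence-labeling view of NER (one BIO tag per token) into a text target that a sequence-to-sequence model could be trained to generate. The linearized target format and the function names are hypothetical choices for illustration, not a format taken from the paper.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, label) pairs from BIO tags."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    label = None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((" ".join(current), label))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(tok)
        else:
            # "O" tag or an inconsistent "I-" tag closes any open entity.
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans


def to_seq2seq_target(tokens: List[str], tags: List[str]) -> str:
    """Linearize the entities as plain text (hypothetical target format)."""
    return " ; ".join(f"{text} is {lab}" for text, lab in bio_to_spans(tokens, tags))


tokens = ["Barack", "Obama", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
target = to_seq2seq_target(tokens, tags)
```

The same sentence and annotations serve both paradigms; only the output representation changes, from a tag per token to a generated string.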