Funding: Supported by the National Natural Science Foundation of China (Nos. 62106283 and 72001214).
Abstract: The battlefield environment is changing rapidly, and fast, accurate identification of the tactical intention of enemy targets is an important condition for gaining a decision-making advantage. Current intention recognition (IR) methods for air targets have shortcomings in temporality, interpretability, and the back-and-forth dependency of intentions. To address these problems, this paper designs a novel air target intention recognition method named STABC-IR, which is based on a Bidirectional Gated Recurrent Unit (BiGRU) and a Conditional Random Field (CRF) with a Space-Time Attention (STA) mechanism. First, the problem of intention recognition of air targets is described and analyzed in detail. Then, a temporal network based on BiGRU is constructed to meet the temporal requirement. Subsequently, STA is proposed to focus on the key parts of the features and timing information, meeting certain interpretability requirements while strengthening the handling of timing. Finally, an intention transformation network based on CRF is proposed to solve the back-and-forth dependency and transformation problem by jointly modeling the tactical intention of the target at each moment. The experimental results show that the recognition accuracy of the jointly trained STABC-IR model can reach 95.7%, which is higher than that of other recent intention recognition methods. STABC-IR solves the problem of intention transformation for the first time and considers both temporality and interpretability, which is important for improving tactical intention recognition capability and has reference value for the construction of command-and-control auxiliary decision-making systems.
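The joint modeling of intentions across time steps that a CRF layer provides can be illustrated with Viterbi decoding. The following is a minimal sketch, not the STABC-IR implementation: the function name `viterbi_decode`, the two-label toy scores, and the plain-list data layout are assumptions made for illustration, and the emission scores merely stand in for what a BiGRU layer might output.

```python
def viterbi_decode(emissions, transitions):
    """Return the best label path and its score for a linear-chain CRF.

    emissions[t][k]   : score of label k at time step t (hypothetical values,
                        standing in for the output of a recurrent encoder).
    transitions[j][k] : score of moving from label j to label k.
    """
    num_labels = len(transitions)
    # best[k] = score of the best path that ends in label k at the current step
    best = list(emissions[0])
    backpointers = []
    for step_scores in emissions[1:]:
        new_best, pointers = [], []
        for k in range(num_labels):
            # Previous label that gives the highest score when moving into k
            prev = max(range(num_labels),
                       key=lambda j: best[j] + transitions[j][k])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + step_scores[k])
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label
    last = max(range(num_labels), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy example: transitions reward staying in the same intention label.
emissions = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
sticky = [[1.0, -5.0], [-5.0, 1.0]]
no_transitions = [[0.0, 0.0], [0.0, 0.0]]
joint_path, joint_score = viterbi_decode(emissions, sticky)
independent_path, _ = viterbi_decode(emissions, no_transitions)
```

With zero transition scores the decoder reduces to a per-step argmax, while the "sticky" transitions make it prefer a consistent label sequence; this is the sense in which a CRF models the dependency between intentions at adjacent moments.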
Funding: Supported by the National Key R&D Program of China (2018YFA0702501), the National Natural Science Foundation of China (41974140), the Science and Technology Management Department, China National Petroleum Corporation (2022DQ0604-01), and the China National Petroleum Corporation-China University of Petroleum (Beijing) Strategy.
Abstract: Traditional deep learning methods pursue complex, single-network architectures without considering the petrophysical relationship between different elastic parameters. The mathematical and statistical significance of the inversion results may lead to model overfitting, especially when there are a limited number of well logs in a working area. Multitask learning provides an effective approach to addressing this issue. Learning multiple related tasks simultaneously can improve a model's generalization ability to a certain extent, thereby enhancing the performance of related tasks with an equal amount of labeled data. In this study, we propose an end-to-end multitask deep learning model that integrates a fully convolutional network and a bidirectional gated recurrent unit for intelligent prestack inversion of "seismic data to elastic parameters." The use of a Bayesian homoscedastic uncertainty-based loss function enables adaptive learning of the weight coefficients for different elastic parameter inversion tasks, thereby reducing uncertainty during the inversion process. The proposed method combines the local feature perception of convolutional neural networks with the long-term memory of bidirectional gated recurrent networks. It maintains the rock physics constraint relationships among different elastic parameters during the inversion process, demonstrating a high level of prediction accuracy. Numerical simulations and processing results of real seismic data validate the effectiveness and practicality of the proposed method.
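The adaptive task weighting described above can be sketched as follows. The abstract does not give the exact loss, so this assumes one commonly used homoscedastic-uncertainty form for regression tasks, in which each task loss L_i is scaled by exp(-s_i)/2 and a penalty s_i/2 is added, with s_i the log-variance of task i. The function name and the sample numbers are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using homoscedastic-uncertainty weights.

    task_losses   : per-task loss values, e.g. one per elastic parameter.
    log_variances : s_i = log(sigma_i^2), one per task. In a real model these
                    would be trainable parameters; here they are plain floats.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_variances):
        # exp(-s) down-weights noisy tasks; the + s term keeps s from growing
        # without bound.
        total += 0.5 * math.exp(-s) * loss + 0.5 * s
    return total


# Equal log-variances of zero reduce to half the plain sum of the losses.
equal = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0])
# A larger variance on the second task reduces that task's contribution.
reweighted = uncertainty_weighted_loss([2.0, 4.0], [0.0, math.log(4.0)])
```

Under this form, the value of s_i that minimizes the total for a fixed task loss is log(L_i), so tasks with persistently larger losses are automatically assigned smaller weights.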
Funding: Supported by the German National BMBF IKT2020-Grant (16SV7213) (EmotAsS), the European Union's Horizon 2020 Research and Innovation Programme (688835) (DE-ENIGMA), and the China Scholarship Council (CSC).
Abstract: Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach first transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; second, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; third, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
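The late-fusion step can be sketched as below. This assumes the usual definition of the margin sampling value as the gap between the highest and second-highest class probabilities of a model, with the most confident model deciding the label; the abstract does not spell out the rule, and the function names and probability values are hypothetical.

```python
def margin_sampling_value(probs):
    """Gap between the two largest class probabilities of one model."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def fuse_by_margin(model_probs):
    """Return the class index predicted by the model with the largest margin.

    model_probs: one probability vector per model (e.g. one per spectrogram
    or scalogram system), all over the same set of classes.
    """
    most_confident = max(model_probs, key=margin_sampling_value)
    return max(range(len(most_confident)), key=lambda i: most_confident[i])


# Hypothetical softmax outputs for one audio segment from two systems.
spectrogram_probs = [0.5, 0.4, 0.1]   # margin 0.1, weakly prefers class 0
scalogram_probs = [0.1, 0.2, 0.7]     # margin 0.5, clearly prefers class 2
fused_label = fuse_by_margin([spectrogram_probs, scalogram_probs])
```

Here the second system has the larger margin, so its prediction is taken as the fused result.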
Abstract: Purpose: The abnormal behaviors of staff at petroleum stations pose significant safety hazards. Addressing the challenges of high parameter counts, lengthy training periods, and low recognition rates in existing 3D ResNet behavior recognition models, this paper proposes GTB-ResNet, a network designed to detect abnormal behaviors in petroleum station staff. Design/methodology/approach: First, to mitigate the issues of excessive parameters and computational complexity in 3D ResNet, a lightweight residual convolution module called the Ghost residual module (GhostNet) is introduced in the feature extraction network. Ghost convolution replaces standard convolution, reducing model parameters while preserving multi-scale feature extraction capabilities. Second, to enhance the model's focus on salient features amid wide surveillance ranges and small target objects, the triplet attention mechanism module is integrated to facilitate spatial and channel information interaction. Last, to address the challenge of short time-series features leading to misjudgments of similar actions, a bidirectional gated recurrent network is added to the feature extraction backbone network. This ensures the extraction of key long time-series features, thereby improving feature extraction accuracy. Findings: The experimental setup encompasses four behavior types: illegal phone answering, smoking, falling (abnormal), and touching the face (normal), comprising a total of 892 videos. Experimental results show GTB-ResNet achieving a recognition accuracy of 96.7% with a model parameter count of 4.46 M and a computational complexity of 3.898 G. This represents a 4.4% improvement over 3D ResNet, with reductions of 90.4% in parameters and 61.5% in computational complexity. Originality/value: Specifically designed for edge devices in oil stations, the 3D ResNet network is tailored for real-time action prediction. To address the challenges posed by the large number of parameters in 3D ResNet networks and the difficulties in deployment on edge devices, a lightweight residual module based on ghost convolution is developed. Additionally, to tackle the issue of low detection accuracy of behaviors amid the noisy environment of petroleum stations, a triplet attention mechanism is introduced during feature extraction to enhance focus on salient features. Moreover, to overcome the potential for misjudgments arising from the similarity of actions, a Bi-GRU model is introduced to enhance the extraction of key long-term features.
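To see why replacing standard convolution with ghost convolution reduces parameters, a rough count can be sketched. This is an illustration under stated assumptions, not the GTB-ResNet configuration: it uses 2D kernels for brevity (the paper works with 3D ResNet), ignores biases, and assumes the usual ghost-module layout in which a primary convolution produces `c_out / ratio` intrinsic feature maps and cheap depthwise operations with `d x d` kernels generate the rest. All function names and channel sizes are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, k, ratio=2, d=3):
    """Weights in a ghost module producing c_out channels (bias ignored).

    Assumes c_out is divisible by ratio.
    """
    intrinsic = c_out // ratio
    # Primary convolution: ordinary k x k conv, but with fewer output channels
    primary = c_in * intrinsic * k * k
    # Cheap operations: (ratio - 1) depthwise d x d filters per intrinsic map
    cheap = intrinsic * (ratio - 1) * d * d
    return primary + cheap


standard = standard_conv_params(c_in=64, c_out=128, k=3)
ghost = ghost_conv_params(c_in=64, c_out=128, k=3, ratio=2, d=3)
saving = 1 - ghost / standard
```

For these example sizes the ghost module needs roughly half the weights of the standard layer; the much larger overall reduction reported in the abstract refers to the whole network and cannot be derived from this single-layer count.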