Flame detection is a research hotspot in industrial production and has been widely used in various fields. Based on ignition and combustion video sequences, this paper aims to address the low accuracy and unintuitive detection results of current flame detection methods for gasifiers and industrial boilers. A furnace flame detection model based on a support vector machine convolutional neural network (SCNN) is proposed. The algorithm uses the strengths of neural networks in image classification to process flame-burning video sequences that need detailed analysis. First, a support vector machine (SVM), which classifies small samples well, replaces the Softmax classification layer of the convolutional neural network (CNN). Second, a Dropout layer is introduced to improve the generalization ability of the network. Subsequently, the area, frequency, and other important parameters of the flame image are analyzed and processed. The experimental results show that the flame detection model designed in this paper is more accurate than the CNN model, and its accuracy on the flame data set collected in the gasifier furnace reaches 99.53%. After several ignition tests, the furnace flame of the gasifier can be detected in real time.
Since its inception, the Internet has been evolving rapidly. With the advancement of science and technology and the explosive growth of the population, demand for the Internet has been rising. Applications in education, healthcare, entertainment, science, and other areas are increasingly deployed on the Internet. Concurrently, malicious threats on the Internet are rising as well. Distributed Denial of Service (DDoS) attacks are among the most common and dangerous threats today, and their scale and complexity are constantly growing. Intrusion Detection Systems (IDS) have been deployed and have demonstrated their effectiveness in defending against those threats. In addition, research on Machine Learning (ML) and Deep Learning (DL) in IDS has produced effective results and gained significant attention. However, one of the challenges when applying ML and DL techniques to intrusion detection is the identification of unknown attacks. These attacks, which are not encountered during the system's training, can lead to misclassification with significant errors. In this research, we address Unknown Attack Detection by combining two methods: Spatial Location Constraint Prototype Loss (SLCPL) and Fuzzy C-Means (FCM). The proposed method achieved promising results compared with traditional methods. It reaches an accuracy of up to 99.8% with a low false positive rate for known attacks on the Intrusion Detection Evaluation Dataset (CICIDS2017). For unknown DDoS attacks on the DDoS Evaluation Dataset (CICDDoS2019), the accuracy reaches 99.7% and the precision goes up to 99.9%. The success of the proposed method is due to the combination of SLCPL, an advanced Open-Set Recognition (OSR) technique, and FCM, a traditional yet highly applicable clustering technique. This has yielded a novel method in the field of unknown attack detection and further extends the trend of applying DL and ML techniques to intrusion detection systems and cybersecurity. Finally, implementing the proposed method in real-world systems can enhance security capabilities against increasingly complex threats on computer networks.
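The abstract above does not say how SLCPL and FCM are combined, so the following is only a minimal sketch of the standard Fuzzy C-Means membership formula, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), where d_i is the distance from a sample to cluster center i and m is the fuzzifier. The function name, the example centers, and the sample point are hypothetical and are not taken from the paper.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point coincides with one or more centers: share full membership among them.
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Hypothetical 2-D example: two cluster centers and one sample.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centers)
print(u)  # memberships sum to 1; the nearer center gets the larger share
```

A sample whose memberships are spread almost evenly over all clusters is one that no cluster claims strongly, which is one plausible signal for flagging unknown traffic; whether the paper uses it this way is an assumption.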
Funding: the National Key Research and Development Program of China (No. 2019YFB1704500); the State Ministry of Science and Technology Innovation Fund of China (No. 2018IM030200); the National Natural Science Foundation of China (No. U1708255); the China Scholarship Council (No. 201906080059).
Abstract: Bearing pitting, one of the common faults in mechanical systems, is a research hotspot in both academia and industry. Traditional fault diagnosis methods for bearings rely on manual experience and have low diagnostic efficiency. This study proposes a novel bearing fault diagnosis method based on deep separable convolution and spatial dropout regularization. Deep separable convolution extracts features from the raw bearing vibration signals, during which a 3×1 convolutional kernel with a stride of one selects effective features by adjusting its weights. The similarity pruning process of the channel convolution and point convolution reduces the number of parameters and computations by evaluating the magnitude of the weights and removing the feature maps with smaller weights. The spatial dropout regularization method focuses on bearing signal fault features, improving the independence between the bearing signal features and enhancing the robustness of the model. A batch normalization algorithm is added to the convolutional layer to control gradient explosion and improve network stability. To validate the effectiveness of the proposed method, we collect raw vibration signals from bearings in eight different health states. The experimental results show that the proposed method can effectively distinguish different pitting faults in the bearings with better accuracy than other typical deep learning methods.
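To make the parameter saving concrete, here is a minimal sketch that counts weights for a standard 1-D convolution and for its separable counterpart (a per-channel convolution followed by a 1×1 point convolution, commonly called depthwise separable convolution). The channel counts are hypothetical; only the kernel length of 3 comes from the abstract, and the pruning step is not modeled.

```python
def standard_conv1d_params(c_in, c_out, k):
    """Weights in a standard 1-D convolution (bias omitted)."""
    return c_in * c_out * k

def separable_conv1d_params(c_in, c_out, k):
    """Channel convolution (one k-tap filter per input channel) plus 1x1 point convolution."""
    depthwise = c_in * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer: 32 input channels, 64 output channels; k=3 matches the 3x1 kernel.
c_in, c_out, k = 32, 64, 3
std = standard_conv1d_params(c_in, c_out, k)
sep = separable_conv1d_params(c_in, c_out, k)
print(std, sep, round(sep / std, 3))  # 6144 2144 0.349
```

The ratio equals 1/c_out + 1/k, so with a kernel of length 3 the separable layer needs roughly a third of the weights once the output channel count is large.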
Funding: the National Natural Science Foundation of China (No. 52274048); Beijing Natural Science Foundation (No. 3222037); the CNPC 14th Five-Year Perspective Fundamental Research Project (No. 2021DJ2104); the Science Foundation of China University of Petroleum, Beijing (No. 2462021YXZZ010).
Abstract: Recent advances in deep neural networks have shed new light on physics, engineering, and scientific computing. Reconciling the data-centered viewpoint with physical simulation is one of the research hotspots. The physics-informed neural network (PINN) is currently the most general framework and is popular because NNs are convenient to construct and generalize well. The automatic differentiation (AD)-based PINN model is suitable for homogeneous scientific problems; however, it is unclear how AD can enforce flux continuity across boundaries between cells when spatial heterogeneity is represented by grid cells with different physical properties. In this work, we propose a criss-cross physics-informed convolutional neural network (CC-PINN) learning architecture that aims to learn the solution of parametric PDEs with spatially heterogeneous physical properties. To enforce flux continuity seamlessly and integrate physical meaning into the CNN, a predefined 2D convolutional layer is proposed to accurately express the transmissibility between adjacent cells. The efficacy of the proposed method was evaluated through predictions of several petroleum reservoir problems with spatial heterogeneity and compared against the state-of-the-art PINN, using numerical analysis as a benchmark, which demonstrated the superiority of the proposed method over the PINN.
Funding: conducted by the Basic Research Project of the Korea Institute of Geoscience and Mineral Resources (KIGAM), funded by the Ministry of Science and ICT.
Abstract: Flood probability maps are essential for a range of applications, including land use planning and the development of mitigation strategies and early warning systems. This study describes the potential application of two deep learning neural network architectures, namely convolutional neural networks (CNN) and recurrent neural networks (RNN), for spatially explicit prediction and mapping of flash flood probability. To develop and validate the predictive models, a geospatial database was constructed that contained records of historical flood events and geo-environmental characteristics of the Golestan Province in northern Iran. The step-wise weight assessment ratio analysis (SWARA) was employed to investigate the spatial interplay between floods and different influencing factors. The CNN and RNN models were trained using the SWARA weights and validated using the receiver operating characteristic technique. The results showed that the CNN model (AUC = 0.832, RMSE = 0.144) performed slightly better than the RNN model (AUC = 0.814, RMSE = 0.181) in predicting future floods. Further, these models predicted floods better than previous studies that used different models in the same study area. This study showed that spatially explicit deep learning neural network models successfully capture the heterogeneity of spatial patterns of flood probability in the Golestan Province, and the resulting probability maps can be used to develop mitigation plans in response to future floods. The general policy implication of our study is that the design, implementation, and verification of flood early warning systems should be directed to the approximately 40% of the land area characterized by high and very high susceptibility to flooding.
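The two validation metrics quoted above can be illustrated with a small sketch. It computes AUC as the probability that a randomly chosen flooded location receives a higher predicted probability than a randomly chosen non-flooded one, and RMSE between the 0/1 labels and the predicted probabilities. The labels and scores are made-up toy values, not data from the study, and the assumption that RMSE was taken against binary labels is ours.

```python
import math

def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(labels, scores):
    """Root mean squared error between labels and predicted probabilities."""
    return math.sqrt(sum((y - s) ** 2 for y, s in zip(labels, scores)) / len(labels))

# Toy data: 1 = flooded location, 0 = non-flooded location.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3), round(rmse(labels, scores), 3))  # 0.889 0.356
```

A higher AUC and a lower RMSE both indicate a better model, which is the sense in which the CNN is reported to edge out the RNN.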
Abstract: The ever-growing volume of available visual data (i.e., videos and pictures uploaded by internet users) has attracted the attention of the computer vision research community. Finding efficient solutions to extract knowledge from these sources is therefore imperative. Recently, the BlazePose system was released for skeleton extraction from images, oriented to mobile devices. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network can be implemented to predict the action. We hypothesize that simply replacing the skeleton input data with a different set of joints that offers more information about the action of interest can increase the performance of the Spatial-Temporal Graph Convolutional Network on human action recognition (HAR) tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology on this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology, which achieves better results than its predecessor. Additionally, we propose different skeleton detection thresholds that improve accuracy even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. For the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy for the Cross-Subject and Cross-View evaluation criteria, respectively.
Abstract: Irreversible loss of vision is the predominant outcome of glaucoma in the retina. Recently, multiple approaches have addressed the automatic detection of glaucoma in fundus images. Because blood vessels interlace and glaucoma detection is a demanding task, locating the exact affected site of the optic disc, whether the cup is small or large, is challenging. A Spatially Based Ellipse Fitting Curve Model (SBEFCM) classification, built on an ensemble, is suggested for reliable diagnosis of glaucoma from the Optic Cup (OC) and Optic Disc (OD) boundaries. This research deploys Ensemble Convolutional Neural Network (CNN) classification to classify glaucoma or Diabetic Retinopathy (DR). The boundary between the OC and the OD is detected by the SBEFCM, a recent weighted ellipse fitting model that enhances and extends the multi-ellipse fitting technique. The input fundus image is preprocessed and its blood vessels are segmented to avoid interlacing of surrounding tissues and blood vessels. The OC and OD boundaries, which characterize many output factors for glaucoma detection, are determined by the Ensemble CNN classification, and sensitivity, specificity, precision, and Area Under the receiver operating characteristic Curve (AUC) values are obtained accurately with the SBEFCM. In comparison, the proposed Ensemble CNN significantly outperformed current methods.
Abstract: Action recognition has been recognized as an activity in which individuals' behaviour can be observed. Assembling profiles of regular activities, such as activities of daily living, can help identify trends in the data during critical events. A skeleton representation of the human body has proven effective for this task. The skeletons are represented as graphs. However, the topology of a graph is not structured like Euclidean-based data. Therefore, a new set of methods to perform the convolution operation on the skeleton graph is proposed. Our proposal is based on the Spatial Temporal-Graph Convolutional Network (ST-GCN) framework. In this study, we propose an improved set of label mapping methods for the ST-GCN framework. We introduce three split techniques (full distance split, connection split, and index split) as an alternative approach to the convolution operation. The models in this study were trained and evaluated on two benchmark datasets, NTU-RGB+D and Kinetics. Our results indicate that our split techniques outperform the previous partition strategies and are more stable during training, without using the additional edge importance weighting training parameter. Therefore, our proposal can provide a more realistic solution for real-time applications centred on recognition systems for activities of daily living in indoor environments.
Funding: supported by the Major Program of the National Natural Science Foundation of China [grant number 92038301]; the National Natural Science Foundation of China [grant number 41971295]; the Foundation for Innovative Research Groups of the Natural Science Foundation of Hubei Province [grant number 2020CFA003]; the Special Fund of Hubei Luojia Laboratory.
Abstract: High Spatial and Spectral Resolution (HSSR) remote-sensing images provide rich spectral bands and detailed ground information, but there is a relative lack of research on this new type of remote-sensing data. Although some HSSR datasets already exist for training and testing deep learning models, their data volume is small, resulting in low classification accuracy and weak generalization ability of the trained models. In this paper, an HSSR dataset, Luojia-HSSR, is constructed from aerial hyperspectral imagery of southern Shenyang City, Liaoning Province, China. To our knowledge, it is the largest HSSR dataset to date, with 6438 pairs of 256×256 samples (3480 pairs in the training set, 2209 pairs in the test set, and 749 pairs in the validation set), covering an area of 161 km² at a spatial resolution of 0.75 m, with 249 Visible and Near-Infrared (VNIR) spectral bands and 23 classes of field-validated ground coverage. It is an ideal experimental dataset for spatial-spectral feature extraction. Furthermore, a new deep learning model, 3D-HRNet, is proposed for interpreting HSSR images. The conv-neck in HRNet is modified to better mine the spatial information of the images. Then, a 3D convolution module with an attention mechanism is designed to capture global and local fine spectral information simultaneously. Subsequently, the 3D convolution is inserted into the HRNet to optimize performance. The experiments show that the 3D-HRNet model interprets the Luojia-HSSR dataset well, with the Frequency Weighted Intersection over Union (FWIoU) reaching 80.54%, indicating that the Luojia-HSSR dataset constructed in this paper and the proposed 3D-HRNet model have good prospects for processing HSSR remote-sensing images.
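As an illustration of the reported metric, the sketch below computes FWIoU from a confusion matrix using its usual definition: the per-class IoU weighted by each class's share of ground-truth pixels. The 2×2 matrix is a made-up example, and the paper's exact evaluation code is not known, so this is only one common formulation.

```python
def fwiou(cm):
    """Frequency Weighted IoU; cm[i][j] = pixels of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    score = 0.0
    for i in range(n):
        row = sum(cm[i])                       # ground-truth pixels of class i
        col = sum(cm[r][i] for r in range(n))  # pixels predicted as class i
        union = row + col - cm[i][i]
        if union == 0:
            continue  # class absent from both labels and predictions
        score += (row / total) * (cm[i][i] / union)
    return score

# Hypothetical two-class confusion matrix over 100 pixels.
cm = [[50, 10],
      [5, 35]]
print(round(fwiou(cm), 4))  # 0.7415
```

Because each class is weighted by its pixel frequency, large classes dominate the score, which matters for a 23-class land-cover dataset where class areas are likely to be unbalanced.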
Funding: supported by the Henan Provincial Science and Technology Research Project under Grants 232102211006, 232102210044, 232102211017, 232102210055 and 222102210214; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; the Undergraduate Universities Smart Teaching Special Research Project of Henan Province under Grant Jiao Gao [2021] No. 489-29; the Doctor Natural Science Foundation of Zhengzhou University of Light Industry under Grants 2021BSJJ025 and 2022BSJJZK13.
Abstract: Multispectral pedestrian detection technology leverages infrared images to provide reliable information for visible light images, demonstrating significant advantages in low-light conditions and background occlusion scenarios. However, maintaining the model's detection speed while continuously improving cross-modal feature extraction and fusion remains a challenge. We have devised a deep learning network model for cross-modal pedestrian detection based on ResNet50, aiming to focus on more reliable features and enhance the model's detection efficiency. This model employs a spatial attention mechanism to reweight the input visible light and infrared image data, enhancing the model's focus on different spatial positions and sharing the weighted feature data across modalities, thereby reducing the interference of multi-modal features. Subsequently, lightweight modules with depthwise separable convolution are incorporated to reduce the model's parameter count and computational load through channel-wise and point-wise convolutions. The proposed network model was experimentally validated on the publicly available KAIST dataset and compared with other existing methods. The experimental results demonstrate that our approach achieves favorable performance in various complex environments, affirming the effectiveness of the proposed multispectral pedestrian detection technology.
Funding: Supported by the Ministry of Education, Culture, Research and Technology, The Republic of Indonesia, and Institut Teknologi Sepuluh Nopember.
Abstract: Hiding secret data in digital multimedia has become essential for protecting that data. Nevertheless, attackers using steganalysis techniques may break the protection. Existing steganalysis methods obtain good results with conventional Machine Learning (ML) techniques; however, the introduction of the Convolutional Neural Network (CNN), a deep learning paradigm, achieved better performance than the previously proposed ML-based techniques. Although existing CNN-based approaches yield good results, they have shortcomings in classification accuracy and in stability during network training. This research proposes a new method with a CNN architecture to improve hidden data detection accuracy and training stability for spatial domain images. The proposed method comprises three phases: pre-processing, feature extraction, and classification. First, in the pre-processing phase, we use spatial rich model filters to enhance the noise within images altered by data hiding. Second, in the feature extraction phase, we use two-dimensional depthwise separable convolutions to improve the signal-to-noise ratio and regular convolutions to model local features. Finally, in the classification phase, we use multi-scale average pooling to aggregate local features and enhance representability regardless of input size variation, followed by three fully connected layers that form the final feature maps, which we transform into class probabilities using the softmax function. The results show an accuracy improvement over the recent schemes considered of between 4.6% and 10.2%, with training time reduced by up to 30.81%.
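One way to read the multi-scale average pooling step in the abstract above is as pyramid-style pooling that averages a feature map over grids of several resolutions, so that the output length does not depend on the input size. The sketch below works under that assumption. The scales (1, 2, 4), the bin-boundary rule, and the function names are hypothetical and are not taken from the paper.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average a 2D feature map (list of rows) over a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        # Bin edges use floor/ceil so every bin holds at least one cell.
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            out.append(sum(cells) / len(cells))
    return out


def multi_scale_avg_pool(fmap, scales=(1, 2, 4)):
    """Concatenate pooled grids; output length is sum(s * s for s in scales)."""
    feats = []
    for s in scales:
        feats.extend(adaptive_avg_pool(fmap, s))
    return feats


# Two inputs of different sizes yield feature vectors of the same length.
small = [[float(r * 5 + c) for c in range(5)] for r in range(7)]
large = [[float(r + c) for c in range(16)] for r in range(12)]
print(len(multi_scale_avg_pool(small)), len(multi_scale_avg_pool(large)))
```

A fixed-length vector is what allows fully connected layers to follow the pooling stage even when input image sizes vary.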
Funding: Supported by the Shaanxi Province Key Research and Development Project (No. 2021GY-280); the Shaanxi Province Natural Science Basic Research Program Project (No. 2021JM-459); and the National Natural Science Foundation of China (No. 61834005, 61772417, 61802304, 61602377, 61634004).
Abstract: Flame detection is a research hotspot in industrial production and has been widely used in various fields. Working from ignition and combustion video sequences, this paper aims to address the limited accuracy and unintuitive detection results of current flame detection methods for gasifiers and industrial boilers. A furnace flame detection model based on a support vector machine convolutional neural network (SCNN) is proposed. The algorithm exploits the strengths of neural networks in image classification to process flame burning video sequences, which require detailed analysis. First, a support vector machine (SVM), which classifies small samples more effectively, replaces the Softmax classification layer of the convolutional neural network (CNN). Second, a Dropout layer is introduced to improve the generalization ability of the network. Subsequently, the area, frequency and other important parameters of the flame image are analyzed and processed. The experimental results show that the proposed flame detection model is more accurate than the CNN model, with a judgment accuracy of 99.53% on the flame data set collected in the gasifier furnace. After several ignition tests, the furnace flame of the gasifier can be detected in real time.
Funding: This research was partly supported by the National Science and Technology Council, Taiwan, under Grant Numbers 112-2221-E-992-045, 112-2221-E-992-057-MY3 and 112-2622-8-992-009-TD1.
Abstract: Since its inception, the Internet has been evolving rapidly. With the advancement of science and technology and the explosive growth of the population, demand for the Internet has been on the rise. Applications in education, healthcare, entertainment, science, and other fields are increasingly deployed on the Internet. Concurrently, malicious threats on the Internet are also increasing. Distributed Denial of Service (DDoS) attacks are among the most common and dangerous threats on the Internet today, and their scale and complexity are constantly growing. Intrusion Detection Systems (IDS) have been deployed and have demonstrated their effectiveness in defending against these threats. In addition, research on Machine Learning (ML) and Deep Learning (DL) in IDS has produced effective results and attracted significant attention. However, one of the challenges in applying ML and DL techniques to intrusion detection is the identification of unknown attacks. These attacks, which are not encountered during the system's training, can lead to misclassification with significant errors. In this research, we focus on Unknown Attack Detection by combining two methods: Spatial Location Constraint Prototype Loss (SLCPL) and Fuzzy C-Means (FCM). The proposed method achieves promising results compared to traditional methods. It reaches an accuracy of up to 99.8% with a low false positive rate for known attacks on the Intrusion Detection Evaluation Dataset (CICIDS2017). For unknown DDoS attacks on the DDoS Evaluation Dataset (CICDDoS2019), the accuracy reaches 99.7% and the precision reaches 99.9%. The success of the proposed method is due to the combination of SLCPL, an advanced Open-Set Recognition (OSR) technique, and FCM, a traditional yet highly applicable clustering technique. This yields a novel method in the field of unknown attack detection and further extends the trend of applying DL and ML techniques in the development of intrusion detection systems and cybersecurity. Finally, implementing the proposed method in real-world systems can strengthen security against increasingly complex threats on computer networks.
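For readers unfamiliar with Fuzzy C-Means, the following minimal sketch shows the standard FCM membership rule, in which a sample belongs to every cluster to a degree determined by its relative distances to the cluster centers. It illustrates textbook FCM only. It is not the authors' SLCPL and FCM pipeline, and the function name, the fuzzifier value m = 2, and the toy centers are assumptions.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in each cluster (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs entirely to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)); memberships sum to 1.
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Toy example: a point three times closer to the first center than the second.
centers = [(1.0, 0.0), (3.0, 0.0)]
u = fcm_memberships((0.0, 0.0), centers)
print([round(x, 3) for x in u])
```

Unlike hard clustering, the memberships are graded, so a sample far from every center receives no dominant membership.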