The integration of image analysis through deep learning (DL) into rock classification represents a significant leap forward in geological research. While traditional methods remain invaluable for their expertise and historical context, DL offers a powerful complement by enhancing the speed, objectivity, and precision of the classification process. This research explores the significance of image data augmentation techniques in optimizing the performance of convolutional neural networks (CNNs) for geological image analysis, particularly in the classification of igneous, metamorphic, and sedimentary rock types from rock thin section (RTS) images. The study primarily focuses on classic image augmentation techniques and evaluates their impact on model accuracy and precision. Results demonstrate that augmentation techniques like Equalize significantly enhance the model's classification capabilities, achieving an F1-score of 0.9869 for igneous rocks, 0.9884 for metamorphic rocks, and 0.9929 for sedimentary rocks, an improvement over the baseline results. Moreover, the weighted average F1-score across all classes and techniques is 0.9886, indicating an overall enhancement. Conversely, methods like Distort lead to decreased accuracy and F1-score, with an F1-score of 0.949 for igneous rocks, 0.954 for metamorphic rocks, and 0.9416 for sedimentary rocks, degrading performance relative to the baseline. The study underscores the practicality of image data augmentation in geological image classification and advocates for the adoption of DL methods in this domain for automation and improved results. The findings can benefit various fields, including remote sensing, mineral exploration, and environmental monitoring, by enhancing the accuracy of geological image analysis for both scientific research and industrial applications.
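The Equalize augmentation named in this abstract is standard histogram equalization. A minimal sketch, assuming an 8-bit grayscale image given as a flat list of pixel values (the `equalize` helper is illustrative, not the paper's implementation):

```python
def equalize(pixels, levels=256):
    """Histogram-equalize a flat list of 8-bit grayscale pixel values."""
    # Build the intensity histogram
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    # Map each intensity level through the normalized CDF
    lut = [round((c - cdf_min) / (n - cdf_min) * (levels - 1)) if n > cdf_min else 0
           for c in cdf]
    return [lut[p] for p in pixels]

# A low-contrast patch gets stretched across the full intensity range
print(equalize([100, 100, 101, 101, 102, 102, 103, 103]))
# -> [0, 0, 85, 85, 170, 170, 255, 255]
```

In practice, library routines such as Pillow's `ImageOps.equalize` perform the same mapping per channel.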
The human ear has been substantiated as a viable nonintrusive biometric modality for identification or verification. Among many feasible techniques for ear biometric recognition, convolutional neural network (CNN) models have recently offered high-performance and reliable systems. However, their performance can still be further improved using the capabilities of soft biometrics, a research question yet to be investigated. This research aims to augment traditional CNN-based ear recognition performance by adding discriminatory ear soft biometric traits. It proposes a novel framework for augmented ear identification/verification using a group of discriminative categorical soft biometrics and deriving new, more perceptive, comparative soft biometrics for feature-level fusion with hard biometric deep features. It conducts several identification and verification experiments for performance evaluation, analysis, and comparison while varying the ear image datasets, hard biometric deep-feature extractors, soft biometric augmentation methods, and classifiers used. The experimental work yields promising results, reaching up to 99.94% accuracy and up to 14% improvement using the AMI and AMIC datasets, along with their corresponding soft biometric label data. The results confirm the proposed augmented approaches' superiority over their standard counterparts and emphasize the robustness of the new ear comparative soft biometrics over their categorical peers.
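Feature-level fusion of the kind described above typically concatenates a deep-feature embedding with an encoding of the categorical soft traits. A hedged sketch (the trait names, weighting scheme, and `fuse_features` helper are hypothetical, not the paper's exact method):

```python
import math

def fuse_features(deep_feats, soft_traits, trait_levels, soft_weight=1.0):
    """Feature-level fusion: append encoded soft-biometric traits to a
    deep-feature vector; both parts are L2-normalized so neither dominates."""
    def l2norm(v):
        n = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / n for x in v]
    # One-hot encode each categorical soft trait, e.g. {"lobe": "attached"}
    soft_vec = []
    for trait, levels in trait_levels.items():
        soft_vec += [1.0 if soft_traits[trait] == lv else 0.0 for lv in levels]
    return l2norm(deep_feats) + [soft_weight * x for x in l2norm(soft_vec)]

fused = fuse_features(
    deep_feats=[0.3, -1.2, 0.8],  # stand-in for a CNN embedding
    soft_traits={"lobe": "attached", "size": "large"},
    trait_levels={"lobe": ["attached", "free"], "size": ["small", "medium", "large"]},
)
print(len(fused))  # 3 deep dims + 5 soft dims = 8
```

The fused vector then feeds the downstream classifier in place of the deep features alone.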
Deep learning (DL) methods like multilayer perceptrons (MLPs) and convolutional neural networks (CNNs) have been applied to predict complex traits in animal and plant breeding. However, improving genomic prediction accuracy still presents significant challenges. In this study, we applied CNNs to predict swine traits using previously published data. Specifically, we extensively evaluated the CNN model's performance by employing various sets of single nucleotide polymorphisms (SNPs) and concluded that the CNN model achieved optimal performance when utilizing SNP sets comprising 1,000 SNPs. Furthermore, we adopted a novel approach using a one-hot encoding method that transforms the 16 different genotypes into sets of eight binary variables. This encoding method significantly enhanced the CNN's prediction accuracy for swine traits, outperforming traditional one-hot encoding techniques. Our findings suggest that the expanded one-hot encoding method can improve the accuracy of DL methods in the genomic prediction of swine agricultural economic traits. This discovery has significant implications for swine breeding programs, where genomic prediction is pivotal in improving breeding strategies. Furthermore, future research can explore additional enhancements to DL methods by incorporating advanced data pre-processing techniques.
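One plausible reading of the eight-binary-variable encoding is a 4-bit one-hot per allele, concatenated over the two alleles of a genotype; the paper's exact scheme may differ, so treat this as an illustrative sketch:

```python
BASES = "ACGT"

def encode_genotype(genotype):
    """Encode a two-allele genotype (e.g. 'AG') as eight binary variables:
    a 4-bit one-hot for each allele, concatenated."""
    bits = []
    for allele in genotype:
        bits += [1 if allele == b else 0 for b in BASES]
    return bits

# 4 x 4 ordered allele pairs -> 16 distinct genotypes, each an 8-bit code
codes = {a + b: encode_genotype(a + b) for a in BASES for b in BASES}
assert len(codes) == 16 and all(len(v) == 8 for v in codes.values())
print(codes["AG"])  # [1, 0, 0, 0, 0, 0, 1, 0]
```

Unlike a single 16-way one-hot, this representation keeps the two alleles as separable inputs for the CNN.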
Medical image analysis has become a cornerstone of modern healthcare, driven by the exponential growth of data from imaging modalities such as MRI, CT, PET, ultrasound, and X-ray. Traditional machine learning methods have made early contributions; however, recent advancements in deep learning (DL) have revolutionized the field, offering state-of-the-art performance in image classification, segmentation, detection, fusion, registration, and enhancement. This comprehensive review presents an in-depth analysis of deep learning methodologies applied across medical image analysis tasks, highlighting both foundational models and recent innovations. The article begins by introducing conventional techniques and their limitations, setting the stage for DL-based solutions. Core DL architectures, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), Vision Transformers (ViTs), and hybrid models, are discussed in detail, including their advantages and domain-specific adaptations. Advanced learning paradigms such as semi-supervised learning, self-supervised learning, and few-shot learning are explored for their potential to mitigate data annotation challenges in clinical datasets. The review further categorizes major tasks in medical image analysis, elaborating on how DL techniques have enabled precise tumor segmentation, lesion detection, modality fusion, super-resolution, and robust classification across diverse clinical settings. Emphasis is placed on applications in oncology, cardiology, neurology, and infectious diseases, including COVID-19. Challenges such as data scarcity, label imbalance, model generalizability, interpretability, and integration into clinical workflows are critically examined. Ethical considerations, explainable AI (XAI), federated learning, and regulatory compliance are discussed as essential components of real-world deployment. Benchmark datasets, evaluation metrics, and comparative performance analyses are presented to support future research. The article concludes with a forward-looking perspective on the role of foundation models, multimodal learning, edge AI, and bio-inspired computing in the future of medical imaging. Overall, this review serves as a valuable resource for researchers, clinicians, and developers aiming to harness deep learning for intelligent, efficient, and clinically viable medical image analysis.
Food waste presents a major global environmental challenge, contributing to resource depletion, greenhouse gas emissions, and climate change. Black Soldier Fly Larvae (BSFL) offer an eco-friendly solution due to their exceptional ability to decompose organic matter. However, accurately identifying larval instars is critical for optimizing feeding efficiency and downstream applications, as different stages exhibit only subtle visual differences. This study proposes a real-time mobile application for automatic classification of BSFL larval stages. The system distinguishes between early instars (Stages 1–4), suitable for food waste processing and animal feed, and late instars (Stages 5–6), optimal for pupation and industrial use. A baseline YOLO11 model was employed, achieving a mAP50-95 of 0.811. To further improve performance and efficiency, we introduce YOLO11-DSConv, a novel adaptation incorporating depthwise separable convolutions optimized for the unique challenges of BSFL classification. Unlike existing YOLO+DSConv implementations, our approach is tailored to the subtle visual differences between larval stages and integrated into a complete end-to-end system. The enhanced model achieved a mAP50-95 of 0.813 while reducing computational complexity by 15.5%. The proposed system demonstrates high accuracy and lightweight performance, making it suitable for deployment on resource-constrained agricultural devices, while directly supporting circular economy initiatives through precise larval stage identification. By integrating BSFL classification with real-time AI, this work contributes to sustainable food waste management and advances intelligent applications in precision agriculture. Additional supplementary materials and the implementation code are available at the following links: YOLO11-DSConv, Server Side, Mobile Application.
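The efficiency gain from depthwise separable convolution comes from splitting one k×k convolution into a per-channel k×k depthwise pass plus a 1×1 pointwise projection. A back-of-the-envelope parameter count (the layer sizes below are illustrative, not YOLO11-DSConv's actual configuration; the abstract's 15.5% figure is a whole-model number, since only some layers are replaced):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def dsconv_params(k, c_in, c_out):
    """Depthwise separable: one k x k depthwise filter per input channel,
    then a 1 x 1 pointwise projection to c_out channels."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
std, ds = conv_params(k, c_in, c_out), dsconv_params(k, c_in, c_out)
# Per-layer saving for a 3x3, 64->128 conv: ~88% fewer weights
print(std, ds, round(100 * (1 - ds / std), 1))
```

The same split reduces multiply-accumulate operations by roughly the same factor at each spatial position.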
It is important to understand the development of joints and fractures in rock masses to ensure drilling stability and blasting effectiveness. Traditional manual observation techniques for identifying and extracting fracture characteristics have proven inefficient and prone to subjective interpretation. Moreover, conventional image processing algorithms and classical deep learning models often encounter difficulties in accurately identifying fracture areas, resulting in unclear contours. This study proposes an intelligent method for detecting internal fractures in mine rock masses to address these challenges. The proposed approach captures a nodal fracture map within the targeted blast area and integrates channel and spatial attention mechanisms into the ResUnet (RU) model. The channel attention mechanism dynamically recalibrates the importance of each feature channel, and the spatial attention mechanism enhances feature representation in key areas while minimizing background noise, thus improving segmentation accuracy. A dynamic serpentine convolution module is also introduced that adaptively adjusts the shape and orientation of the convolution kernel based on the local structure of the input feature map. Furthermore, the method enables automatic extraction and quantification of borehole nodal fracture information by fitting sinusoidal curves to the boundaries of the fracture contours using the least squares method. In comparison to other advanced deep learning models, our enhanced RU demonstrates superior performance across evaluation metrics, including accuracy, pixel accuracy (PA), and intersection over union (IoU). Unlike traditional manual extraction methods, our intelligent detection approach provides considerable time and cost savings, with an average error rate of approximately 4%. This approach has the potential to greatly improve the efficiency of geological surveys of borehole fractures.
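The sinusoid fitting mentioned above exploits the fact that a planar fracture intersecting a cylindrical borehole traces a sinusoid on the unrolled wall image. Writing the curve as y = A·sin(x) + B·cos(x) + C makes the least-squares fit linear; a self-contained sketch (the sampling and solver details are assumptions, not the paper's implementation):

```python
import math

def fit_sinusoid(xs, ys):
    """Least-squares fit of y = A*sin(x) + B*cos(x) + C, the trace a planar
    fracture leaves on an unrolled borehole wall (x = azimuth in radians)."""
    # Design-matrix columns: sin(x), cos(x), 1
    cols = [[math.sin(x) for x in xs], [math.cos(x) for x in xs], [1.0] * len(xs)]
    # Normal equations: (M^T M) p = M^T y
    mtm = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(3)]
           for i in range(3)]
    mty = [sum(c * y for c, y in zip(cols[i], ys)) for i in range(3)]
    # Solve the 3x3 system by Gaussian elimination with partial pivoting
    aug = [row + [b] for row, b in zip(mtm, mty)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(aug[r][i]))
        aug[i], aug[piv] = aug[piv], aug[i]
        for r in range(i + 1, 3):
            f = aug[r][i] / aug[i][i]
            aug[r] = [a - f * b for a, b in zip(aug[r], aug[i])]
    p = [0.0] * 3
    for i in range(2, -1, -1):
        p[i] = (aug[i][3] - sum(aug[i][j] * p[j] for j in range(i + 1, 3))) / aug[i][i]
    return p  # A, B, C

# Recover known parameters from noise-free boundary points
xs = [i * 2 * math.pi / 36 for i in range(36)]
ys = [1.5 * math.sin(x) + 0.5 * math.cos(x) + 10.0 for x in xs]
A, B, C = fit_sinusoid(xs, ys)
print(round(A, 3), round(B, 3), round(C, 3))  # 1.5 0.5 10.0
```

The fitted amplitude and phase then yield the fracture plane's dip and dip direction relative to the borehole axis.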
In the last decade, financial market forecasting has attracted great interest amongst researchers in pattern recognition. Usually, the data used to analyse the market, and then bet on its future trend, are provided as time series; this aspect, along with the high fluctuation of this kind of data, rules out the use of very efficient and popular classification tools such as the well-known convolutional neural network (CNN) models Inception, ResNet, AlexNet, and so on. This forces researchers to train new tools from scratch, which can be very time-consuming. This paper exploits an ensemble of CNNs, trained over Gramian angular field (GAF) images generated from time series related to the Standard & Poor's 500 index future; the aim is the prediction of the future trend of the U.S. market. A multi-resolution imaging approach is used to feed each CNN, enabling the analysis of different time intervals for a single observation. A simple trading system based on the ensemble forecaster is used to evaluate the quality of the proposed approach. Our method outperforms the buy-and-hold (B&H) strategy in a time frame where the latter provides excellent returns. Both quantitative and qualitative results are provided.
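The GAF transform turns a 1-D series into an image a CNN can consume: rescale to [-1, 1], map each value to an angle, and take pairwise cosines of angle sums. A minimal sketch of the summation variant (the rescaling convention is the common one; the paper's multi-resolution pipeline adds windowing on top of this):

```python
import math

def gaf(series):
    """Gramian angular summation field of a 1-D series:
    rescale to [-1, 1], map to angles phi = arccos(x),
    then G[i][j] = cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    rng = (hi - lo) or 1.0  # guard against a constant series
    scaled = [2 * (v - lo) / rng - 1 for v in series]
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(pi + pj) for pj in phi] for pi in phi]

g = gaf([1.0, 2.0, 3.0, 4.0])
print(len(g), len(g[0]))       # 4 4: an n-point series gives an n x n image
print(round(g[0][0], 3))       # cos(2*acos(-1)) = cos(2*pi) = 1.0
```

Each pixel encodes the temporal correlation between two time steps, so convolutional filters can pick up trend patterns spatially.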
The new coronavirus (COVID-19), declared by the World Health Organization as a pandemic, has infected more than 1 million people and killed more than 50 thousand. An infection caused by COVID-19 can develop into pneumonia, which can be detected by a chest X-ray exam and should be treated appropriately. In this work, we propose an automatic detection method for COVID-19 infection based on chest X-ray images. The datasets constructed for this study are composed of 194 X-ray images of patients diagnosed with coronavirus and 194 X-ray images of healthy patients. Since few images of patients with COVID-19 are publicly available, we apply the concept of transfer learning for this task. We use different architectures of convolutional neural networks (CNNs) trained on ImageNet, and adapt them to behave as feature extractors for the X-ray images. Then, the CNNs are combined with consolidated machine learning methods, such as k-Nearest Neighbor, Bayes, Random Forest, multilayer perceptron (MLP), and support vector machine (SVM). The results show that, for one of the datasets, the extractor-classifier pair with the best performance is the MobileNet architecture with the SVM classifier using a linear kernel, which achieves an accuracy and an F1-score of 98.5%. For the other dataset, the best pair is DenseNet201 with MLP, achieving an accuracy and an F1-score of 95.6%. Thus, the proposed approach demonstrates efficiency in detecting COVID-19 in X-ray images.
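The extractor-classifier pairing described above amounts to running a frozen CNN to get feature vectors, then training a classical classifier on them. A toy stand-in using a k-nearest-neighbour vote over pre-extracted features (the 2-D vectors and labels are fabricated for illustration; real CNN features have hundreds of dimensions):

```python
import math

def knn_predict(train_feats, train_labels, query, k=3):
    """k-nearest-neighbour vote over pre-extracted feature vectors, standing in
    for the CNN-features -> classical-classifier pipeline described above."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(zip(train_feats, train_labels), key=lambda t: dist(t[0], query))
    votes = [lbl for _, lbl in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy 2-D "deep features" for two classes (illustrative, not real CNN output)
feats = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1], [0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]
labels = ["healthy", "healthy", "healthy", "covid", "covid", "covid"]
print(knn_predict(feats, labels, [0.85, 0.95]))  # covid
```

Swapping the voting rule for an SVM or MLP reproduces the other extractor-classifier pairs the abstract compares.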
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not capture a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The presented approach firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification dataset of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving classification accuracy when fused with a spectrogram-based system.
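The margin-sampling fusion mentioned above can be read as: for each sample, trust the system whose top-two posterior probabilities are furthest apart, i.e. the most confident one. A hedged sketch of that decision rule (the posterior values are fabricated for illustration):

```python
def margin_fuse(system_posteriors):
    """Margin-sampling fusion: among several systems' class posteriors for one
    sample, keep the prediction of the system whose top-two probability
    margin is largest (the most confident system)."""
    def margin(probs):
        top = sorted(probs, reverse=True)
        return top[0] - top[1]
    best = max(system_posteriors, key=margin)
    return best.index(max(best))

# Three systems scoring one sample over four acoustic-scene classes
posteriors = [
    [0.40, 0.35, 0.15, 0.10],  # narrow margin (0.05)
    [0.70, 0.10, 0.10, 0.10],  # wide margin (0.60) -> this system decides
    [0.30, 0.30, 0.25, 0.15],  # tie at the top (0.00)
]
print(margin_fuse(posteriors))  # 0: class index chosen by the confident system
```

Averaging posteriors is the usual alternative; margin fusion instead lets a single confident subsystem override uncertain ones.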
Recently, video object segmentation has received great attention in the computer vision community. Most existing methods rely heavily on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt to achieve video object segmentation with scribble-level supervision, which can save large amounts of human labor in collecting manual annotations. However, conventional network architectures and learning objective functions do not work well under this scenario, as the supervision information is highly sparse and incomplete. To address this issue, this paper introduces two novel elements for learning the video object segmentation model. The first is the scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The other is the scribble-supervised loss, which can optimize the unlabeled pixels and dynamically correct inaccurately segmented areas during the training stage. To evaluate the proposed method, we implement experiments on two video object segmentation benchmark datasets, YouTube video object segmentation (VOS) and densely annotated video segmentation (DAVIS)-2017. We first generate scribble annotations from the original per-pixel annotations, then train our model and compare its test performance with the baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the methods requiring dense per-pixel annotations.
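The core mechanic of scribble supervision is that the loss is computed only on annotated pixels while the rest carry no gradient. A minimal sketch of such a partial cross-entropy (the paper's actual scribble-supervised loss adds terms for the unlabeled pixels; this shows only the masking idea):

```python
import math

def partial_ce(probs, scribble):
    """Cross-entropy averaged over scribble-annotated pixels only; pixels
    labeled None contribute nothing, which is how sparse scribbles supervise
    a dense prediction."""
    losses = [-math.log(p[y]) for p, y in zip(probs, scribble) if y is not None]
    return sum(losses) / len(losses)

# Per-pixel (background, foreground) probabilities; one pixel is unlabeled
probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
scribble = [0, None, 1]  # 0 = background scribble, None = unlabeled, 1 = foreground
print(round(partial_ce(probs, scribble), 4))
```

Only the first and third pixels enter the average; the middle pixel is free to be resolved by the attention module and the unlabeled-pixel terms.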
Reducing the defocus blur that arises from the finite aperture size and short exposure time is an essential problem in computational photography. It is very challenging because the blur kernel is spatially varying and difficult to estimate by traditional methods. Due to their great breakthroughs in low-level tasks, convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress. However, previous methods apply the same learned kernel to different regions of the defocus-blurred image, making it difficult to handle nonuniformly blurred images. To this end, this study designs a novel blur-aware multi-branch network (BaMBNet), in which different regions are treated differentially. In particular, we estimate the blur amounts of different regions by the internal geometric constraint of the dual-pixel (DP) data, which measures the defocus disparity between the left and right views. Based on the assumption that image regions with different blur amounts pose different deblurring difficulties, we leverage networks with different capacities to treat different image regions. Moreover, we introduce a meta-learning defocus mask generation algorithm to assign each pixel to a proper branch. In this way, we can expect to maintain the information of the clear regions well while recovering the missing details of the blurred regions. Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art (SOTA) methods. For the dual-pixel defocus deblurring (DPD)-blur dataset, the proposed BaMBNet achieves a 1.20 dB gain over the previous SOTA method in terms of peak signal-to-noise ratio (PSNR) and reduces learnable parameters by 85%. The code and dataset are available at https://github.com/junjun-jiang/BaMBNet.
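The PSNR metric behind the 1.20 dB figure is a simple function of the mean squared error between the restored and reference images. A self-contained sketch (flat pixel lists stand in for image arrays):

```python
import math

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length images
    given as flat pixel lists: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak ** 2 / mse)

ref = [10, 20, 30, 40]
out = [11, 19, 31, 39]  # off by 1 everywhere -> MSE = 1
print(round(psnr(ref, out), 2))  # 48.13
```

Because the scale is logarithmic, a 1.20 dB gain corresponds to roughly a 24% reduction in mean squared error.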
Funding: funded by WAQF at King Abdulaziz University, Jeddah, Saudi Arabia.
Funding: supported by the National Natural Science Foundation of China (32102513), the National Key Scientific Research Project (2023YFF1001100), the Shenzhen Innovation and Entrepreneurship Plan Major Special Project of Science and Technology, China (KJZD20230923115003006), and the Innovation Project of the Chinese Academy of Agricultural Sciences (CAAS-ZDRW202006).
Funding: supported by the National Natural Science Foundation of China (No. 52474172).
Funding: Supported by the "Bando Aiuti per progetti di Ricerca e Sviluppo - POR FESR 2014-2020 - Asse 1, Azione 1.1.3. Project AlmostAnOracle - AI and Big Data Algorithms for Financial Time Series Forecasting".
Abstract: In the last decade, financial market forecasting has attracted considerable interest among pattern recognition researchers. The data used for analysing the market, and then betting on its future trend, are usually provided as time series; this aspect, along with the high fluctuation of such data, rules out the direct use of very efficient and popular classification tools such as the well-known convolutional neural network (CNN) models Inception, ResNet, AlexNet, and so on. This forces researchers to train new tools from scratch, which can be very time consuming. This paper exploits an ensemble of CNNs, trained over Gramian angular field (GAF) images generated from time series related to the Standard & Poor's 500 index future; the aim is to predict the future trend of the U.S. market. A multi-resolution imaging approach is used to feed each CNN, enabling the analysis of different time intervals for a single observation. A simple trading system based on the ensemble forecaster is used to evaluate the quality of the proposed approach. Our method outperforms the buy-and-hold (B&H) strategy in a time frame where the latter provides excellent returns. Both quantitative and qualitative results are provided.
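The GAF encoding that turns a time series into a CNN-ready image can be sketched in a few lines: rescale the series to [-1, 1], map each value to a polar angle, and form the pairwise cosine-sum matrix (this is the summation variant, GASF; the sample prices are illustrative):

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D time series as a Gramian angular summation field image:
    rescale to [-1, 1], map values to angles phi = arccos(x), then
    G[i, j] = cos(phi_i + phi_j), giving an n x n matrix."""
    x = np.asarray(series, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1, 1))                # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])        # pairwise cosine sums

# illustrative window of index prices
prices = np.array([3012.5, 3018.0, 3007.2, 3025.9, 3031.4, 3019.8])
gaf = gramian_angular_field(prices)
```

The resulting matrix is symmetric and preserves temporal ordering along its diagonal, which is what lets image-native CNNs pick up temporal correlations; the multi-resolution approach in the paper would build one such image per time interval.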
Funding: Supported in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) (001) and the Brazilian National Council for Research and Development (CNPq) (431709/2018-1, 311973/2018-3, 304315/2017-6, 430274/2018-1).
Abstract: The new coronavirus (COVID-19), declared a pandemic by the World Health Organization, has infected more than 1 million people and killed more than 50 thousand. An infection caused by COVID-19 can develop into pneumonia, which can be detected by a chest X-ray exam and should be treated appropriately. In this work, we propose an automatic detection method for COVID-19 infection based on chest X-ray images. The datasets constructed for this study are composed of 194 X-ray images of patients diagnosed with coronavirus and 194 X-ray images of healthy patients. Since few images of patients with COVID-19 are publicly available, we apply the concept of transfer learning for this task. We use different architectures of convolutional neural networks (CNNs) trained on ImageNet and adapt them to behave as feature extractors for the X-ray images. Then, the CNNs are combined with consolidated machine learning methods, such as k-nearest neighbor, Bayes, random forest, multilayer perceptron (MLP), and support vector machine (SVM). The results show that, for one of the datasets, the extractor-classifier pair with the best performance is the MobileNet architecture with the SVM classifier using a linear kernel, which achieves an accuracy and an F1-score of 98.5%. For the other dataset, the best pair is DenseNet201 with MLP, achieving an accuracy and an F1-score of 95.6%. Thus, the proposed approach demonstrates efficiency in detecting COVID-19 in X-ray images.
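The extractor-classifier pairing can be sketched as follows. In the paper, each X-ray image is passed through an ImageNet-pretrained backbone (e.g. MobileNet) and the penultimate-layer activations serve as the feature vector; here, synthetic 1024-D "features" stand in for those activations so the sketch stays self-contained (the class counts mirror the 194+194 dataset, everything else is illustrative):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Stand-in for CNN-extracted features: two synthetic, linearly separable
# clusters replace the MobileNet penultimate-layer activations.
rng = np.random.default_rng(42)
covid_feats = rng.normal(loc=0.5, scale=1.0, size=(194, 1024))
healthy_feats = rng.normal(loc=-0.5, scale=1.0, size=(194, 1024))
X = np.vstack([covid_feats, healthy_feats])
y = np.array([1] * 194 + [0] * 194)

# Linear-kernel SVM, as in the paper's best-performing MobileNet+SVM pair
clf = SVC(kernel="linear")
scores = cross_val_score(clf, X, y, cv=5)
mean_acc = scores.mean()
```

The design point is that the expensive part (the CNN) is frozen and reused, so only the lightweight classical classifier is trained on the small medical dataset.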
Funding: Supported by the German National BMBF IKT2020 Grant (16SV7213) (EmotAsS), the European Union's Horizon 2020 Research and Innovation Programme (688835) (DE-ENIGMA), and the China Scholarship Council (CSC).
Abstract: Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not capture a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations extracted in segments from an audio stream. The presented approach first transforms the segmented acoustic scenes into bump and Morse scalograms, as well as spectrograms; second, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; third, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
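One plausible reading of the margin sampling value fusion in the final step is: for each test sample, trust the subsystem whose softmax output is most decisive, i.e. whose top-1 vs top-2 probability margin is largest. A small sketch under that assumption (the function name and probability vectors are illustrative):

```python
import numpy as np

def margin_sampling_fusion(prob_list):
    """Fuse per-system class-probability vectors: for each sample, keep the
    prediction of the system with the largest top-1 vs top-2 margin
    (an assumed reading of the margin sampling value strategy)."""
    preds = []
    for probs_per_system in zip(*prob_list):
        margins = []
        for p in probs_per_system:
            top2 = np.sort(p)[-2:]            # two largest probabilities
            margins.append(top2[1] - top2[0])  # decisiveness of this system
        best = int(np.argmax(margins))
        preds.append(int(np.argmax(probs_per_system[best])))
    return preds

# two subsystems, two samples, three acoustic-scene classes
sys_a = np.array([[0.6, 0.3, 0.1], [0.4, 0.35, 0.25]])
sys_b = np.array([[0.5, 0.45, 0.05], [0.1, 0.8, 0.1]])
fused = margin_sampling_fusion([sys_a, sys_b])
```

Here the first sample follows system A (margin 0.30 vs 0.05) and the second follows system B (margin 0.70 vs 0.05), illustrating how a confident subsystem overrides an uncertain one.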
Funding: Supported in part by the National Key R&D Program of China (2017YFB0502904) and the National Science Foundation of China (61876140).
Abstract: Recently, video object segmentation has received great attention in the computer vision community. Most of the existing methods rely heavily on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt to achieve video object segmentation with scribble-level supervision, which can alleviate the large amount of human labor needed to collect manual annotations. However, conventional network architectures and learning objective functions do not work well in this scenario because the supervision information is highly sparse and incomplete. To address this issue, this paper introduces two novel elements to learn the video object segmentation model. The first is the scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The other is the scribble-supervised loss, which can optimize the unlabeled pixels and dynamically correct inaccurately segmented areas during the training stage. To evaluate the proposed method, we implement experiments on two video object segmentation benchmark datasets, YouTube-VOS (video object segmentation) and DAVIS-2017 (densely annotated video segmentation). We first generate the scribble annotations from the original per-pixel annotations. Then, we train our model and compare its test performance with the baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the performance of methods that require dense per-pixel annotations.
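The core difficulty with scribble supervision is that only a thin subset of pixels carries a label, so a standard dense loss cannot be applied directly. A common building block for such losses, sketched here as a partial cross-entropy over the scribbled pixels only (this illustrates the idea of masking out unlabeled pixels, not the paper's exact scribble-supervised loss, which also handles the unlabeled ones):

```python
import numpy as np

def partial_cross_entropy(pred_probs, scribble_labels):
    """Cross-entropy over scribble-annotated pixels only; pixels labeled -1
    (unlabeled) contribute nothing to the loss."""
    mask = scribble_labels >= 0
    if not mask.any():
        return 0.0
    # probability the model assigned to each labeled pixel's true class
    labeled_probs = pred_probs[mask, scribble_labels[mask]]
    return float(-np.mean(np.log(labeled_probs + 1e-8)))

# 4 pixels, 2 classes; pixels 1 and 3 carry no scribble (-1)
probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.6, 0.4]])
labels = np.array([0, -1, 1, -1])
loss = partial_cross_entropy(probs, labels)
```

Only the two scribbled pixels (correctly predicted at 0.9 and 0.8) enter the mean, so the uncertain unlabeled pixels neither help nor hurt this term.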
Funding: Supported by the National Natural Science Foundation of China (61971165, 61922027, 61773295), in part by the Fundamental Research Funds for the Central Universities (FRFCU5710050119), the Natural Science Foundation of Heilongjiang Province (YQ2020F004), and the Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Open Fund.
Abstract: Reducing the defocus blur that arises from the finite aperture size and short exposure time is an essential problem in computational photography. It is very challenging because the blur kernel is spatially varying and difficult to estimate by traditional methods. Owing to their great breakthroughs in low-level tasks, convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress. However, previous methods apply the same learned kernel to different regions of a defocus-blurred image, making it difficult to handle non-uniformly blurred images. To this end, this study designs a novel blur-aware multi-branch network (BaMBNet), in which different regions are treated differentially. In particular, we estimate the blur amounts of different regions by the internal geometric constraint of the dual-pixel (DP) data, which measures the defocus disparity between the left and right views. Based on the assumption that image regions with different blur amounts have different deblurring difficulties, we leverage networks with different capacities to treat different image regions. Moreover, we introduce a meta-learning defocus mask generation algorithm to assign each pixel to a proper branch. In this way, we can expect to maintain the information of the clear regions well while recovering the missing details of the blurred regions. Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art (SOTA) methods. For the dual-pixel defocus deblurring (DPD)-blur dataset, the proposed BaMBNet achieves a 1.20 dB gain over the previous SOTA method in terms of peak signal-to-noise ratio (PSNR) and reduces learnable parameters by 85%. The details of the code and dataset are available at https://github.com/junjun-jiang/BaMBNet.
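The routing idea (blur amount decides which branch deblurs a pixel) can be sketched with simple fixed thresholds; note that BaMBNet instead learns this assignment with its meta-learning mask generator, so the thresholds, function name, and toy blur map below are purely illustrative:

```python
import numpy as np

def assign_branches(blur_amount, thresholds=(0.5, 1.5)):
    """Assign each pixel to a deblurring branch by estimated blur amount:
    branch 0 (lightest capacity) for near-focus pixels, branch 2 (heaviest)
    for the most blurred ones. np.digitize bins values by the thresholds."""
    return np.digitize(blur_amount, thresholds)

# toy blur map: in-focus foreground (top row), mildly and strongly
# blurred background (bottom row)
blur = np.array([[0.1, 0.4],
                 [1.0, 2.3]])
mask = assign_branches(blur)
```

The point of the capacity split is efficiency: sharp regions need little processing, so the heavy branch runs only where the defocus disparity says it is needed.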