Funding: Supported by the Natural Science Foundation of Shandong Province of China under Grant No. ZR2023MF041, the National Natural Science Foundation of China under Grant No. 62072469, the Shandong Data Open Innovative Application Laboratory, the Spanish Ministry of Economy and Competitiveness (MINECO), and the European Regional Development Fund (ERDF) under Project No. PID2020-120611RBI00/AEI/10.13039/501100011033.
Abstract: Facial expression generation from pure textual descriptions is widely applied in human-computer interaction, computer-aided design, assisted education, etc. However, this task is challenging due to the intricate facial structure and the complex mapping between texts and images. Existing methods face limitations in generating high-resolution images or capturing diverse facial expressions. In this study, we propose a novel generation approach, named FaceCLIP, to tackle these problems. The proposed method utilizes a CLIP-based multi-stage generative adversarial model to produce vivid facial expressions at high resolutions. With strong semantic priors from multi-modal textual and visual cues, the proposed method effectively disentangles facial attributes, enabling attribute editing and semantic reasoning. To facilitate text-to-expression generation, we build a new dataset, called the FET dataset, which contains facial expression images and corresponding textual descriptions. Experiments on the dataset demonstrate improved image quality and semantic consistency compared with state-of-the-art methods.
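The abstract above describes a CLIP-conditioned, multi-stage adversarial generator. The sketch below is not the authors' FaceCLIP implementation; it only illustrates, under assumed stage counts, layer widths, and resolutions, how a CLIP-style text embedding could condition a coarse-to-fine GAN generator in PyTorch.

```python
# Minimal sketch (not the authors' FaceCLIP architecture) of conditioning a
# multi-stage GAN generator on a CLIP-style text embedding. Stage count,
# channel widths, and resolutions are illustrative assumptions.
import torch
import torch.nn as nn

class StageGenerator(nn.Module):
    """One refinement stage: upsamples the previous feature map while injecting the text cue."""
    def __init__(self, in_ch: int, out_ch: int, text_dim: int = 512):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, in_ch)   # fuse text embedding into features
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.to_rgb = nn.Conv2d(out_ch, 3, 3, padding=1)  # per-stage image head

    def forward(self, feat, text_emb):
        cond = self.text_proj(text_emb)[:, :, None, None]  # broadcast over H x W
        feat = self.block(feat + cond)
        return feat, torch.tanh(self.to_rgb(feat))

class MultiStageTextToFace(nn.Module):
    """Noise + text embedding -> coarse-to-fine images at increasing resolutions."""
    def __init__(self, z_dim: int = 100, text_dim: int = 512):
        super().__init__()
        self.fc = nn.Linear(z_dim + text_dim, 256 * 4 * 4)   # 4x4 seed feature map
        self.stages = nn.ModuleList([
            StageGenerator(256, 128, text_dim),   # 4 -> 8
            StageGenerator(128, 64, text_dim),    # 8 -> 16
            StageGenerator(64, 32, text_dim),     # 16 -> 32
        ])

    def forward(self, z, text_emb):
        feat = self.fc(torch.cat([z, text_emb], dim=1)).view(-1, 256, 4, 4)
        images = []
        for stage in self.stages:
            feat, img = stage(feat, text_emb)
            images.append(img)            # a discriminator would score each scale
        return images

# Usage with a placeholder embedding; in practice a frozen CLIP text encoder
# would supply `text_emb` for the expression description.
g = MultiStageTextToFace()
imgs = g(torch.randn(2, 100), torch.randn(2, 512))
print([i.shape for i in imgs])  # 8x8, 16x16, 32x32 outputs
```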
Abstract: In this study, we extensively evaluated the viability of the state-of-the-art YOLOv8 architecture for object detection tasks, specifically tailored to smoke and wildfire identification with a focus on agricultural and environmental safety. All available versions of YOLOv8 were initially fine-tuned on a domain-specific dataset that included a variety of scenarios crucial for comprehensive agricultural monitoring. The 'large' version (YOLOv8l) was selected for further hyperparameter tuning based on its performance metrics. This model underwent a detailed hyperparameter optimization using the One Factor At a Time (OFAT) methodology, concentrating on key parameters such as learning rate, batch size, weight decay, epochs, and optimizer. Insights from the OFAT study were used to define search spaces for a subsequent Random Search (RS). The final model derived from RS demonstrated significant improvements over the initial fine-tuned model, increasing overall precision by 1.39%, recall by 1.48%, F1-score by 1.44%, mAP@0.50 by 0.70%, and mAP@0.50:0.95 by 5.09%. We validated the enhanced model's efficacy on a diverse set of real-world images, reflecting various agricultural settings, to confirm its robustness in detecting smoke and fire. These results underscore the model's reliability and effectiveness in scenarios critical to agricultural safety and environmental monitoring. This work, representing a significant advancement in the field of fire and smoke detection through machine learning, lays a strong foundation for future research and solutions aimed at safeguarding agricultural areas and natural environments.
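To make the two-step tuning workflow concrete, the following is a hedged sketch of OFAT followed by Random Search around the Ultralytics YOLO training API. The dataset YAML name, baseline values, parameter ranges, and trial budget are illustrative assumptions, not the settings reported in the abstract.

```python
# Hedged sketch of the OFAT -> Random Search tuning loop for YOLOv8l.
# Every concrete value below (dataset path, baseline, ranges, budget) is assumed.
import random
from ultralytics import YOLO

DATA = "smoke_wildfire.yaml"          # hypothetical domain-specific dataset config
BASELINE = dict(epochs=50, lr0=0.01, batch=16, weight_decay=0.0005, optimizer="SGD")

def score(overrides):
    """Fine-tune YOLOv8l with the given hyperparameters and return mAP@0.50:0.95."""
    model = YOLO("yolov8l.pt")
    model.train(data=DATA, **{**BASELINE, **overrides}, verbose=False)
    return model.val(data=DATA).box.map   # mAP@0.50:0.95 on the validation split

# Stage 1: One Factor At a Time -- vary a single parameter, keep the rest fixed.
ofat_grid = {
    "lr0": [0.001, 0.005, 0.01, 0.02],
    "batch": [8, 16, 32],
    "weight_decay": [0.0001, 0.0005, 0.001],
    "optimizer": ["SGD", "Adam", "AdamW"],
}
ofat_results = {p: {v: score({p: v}) for v in vals} for p, vals in ofat_grid.items()}

# Stage 2: Random Search over ranges narrowed by the OFAT study.
best, best_map = None, -1.0
for _ in range(20):                    # trial budget is an assumption
    trial = {
        "lr0": random.uniform(0.005, 0.02),
        "batch": random.choice([16, 32]),
        "weight_decay": random.uniform(0.0001, 0.001),
        "optimizer": random.choice(["SGD", "AdamW"]),
    }
    m = score(trial)
    if m > best_map:
        best, best_map = trial, m
print("best hyperparameters:", best, "mAP@0.50:0.95:", best_map)
```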
Funding: Grant PID2021-128945NB-I00 funded by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe", the "CERCA Programme/Generalitat de Catalunya", and the ESPOL project CIDIS-20-2021.
Abstract: This article presents an efficient approach to classify a set of corn kernels in contact, which may contain good or defective kernels along with impurities. The proposed approach consists of two stages. The first is a next-generation segmentation network, trained on a set of synthesized images, that divides the given image into a set of individual instances. An ad-hoc lightweight CNN architecture is then proposed to classify each instance into one of three categories (i.e., good, defective, and impurities). The segmentation network is trained using a strategy that avoids the time-consuming and error-prone task of manual data annotation. Regarding the classification stage, the proposed ad-hoc network is designed with only a few layers, resulting in a lightweight architecture suitable for use in integrated solutions. Experimental results and comparisons with previous approaches are provided, showing both improved accuracy and reduced processing time. Finally, the proposed segmentation and classification approach can be easily adapted for use with other cereal types.
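As a rough illustration of the second stage only, the sketch below shows a lightweight CNN that labels each segmented kernel crop as good, defective, or impurity. The layer count, crop size, and channel widths are assumptions for illustration and do not reproduce the paper's ad-hoc architecture.

```python
# Minimal sketch of a lightweight per-instance classifier (assumed 64x64 crops,
# three classes); not the paper's exact ad-hoc CNN.
import torch
import torch.nn as nn

class KernelClassifier(nn.Module):
    def __init__(self, num_classes: int = 3):    # good, defective, impurity
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),   # 64 -> 32
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),  # 32 -> 16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),              # global pooling keeps the head tiny
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):                          # x: (B, 3, 64, 64) instance crops
        return self.head(self.features(x).flatten(1))

# Usage: crops produced by the segmentation stage are batched and classified.
clf = KernelClassifier()
crops = torch.randn(8, 3, 64, 64)                  # 8 hypothetical kernel instances
labels = clf(crops).argmax(dim=1)                  # 0=good, 1=defective, 2=impurity
print(labels, sum(p.numel() for p in clf.parameters()), "parameters")
```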