Panoramic images, offering a 360-degree view, are essential in virtual reality(VR) and augmented reality(AR), enhancing realism with high-quality textures. However, acquiring complete and high-quality panoramic textur...Panoramic images, offering a 360-degree view, are essential in virtual reality(VR) and augmented reality(AR), enhancing realism with high-quality textures. However, acquiring complete and high-quality panoramic textures is challenging. This paper introduces a method using generative adversarial networks(GANs) and the contrastive language-image pretraining(CLIP) model to restore and control texture in panoramic images. The GAN model captures complex structures and maintains consistency, while CLIP enables fine-grained texture control via semantic text-image associations. GAN inversion optimizes latent codes for precise texture details. The resulting low dynamic range(LDR) images are converted to high dynamic range(HDR) using the Blender engine for seamless texture blending. Experimental results demonstrate the effectiveness and flexibility of this method in panoramic texture restoration and generation.展开更多
当前,全球科技创新呈现高速发展和高度融合的态势。准确识别出颠覆性技术主题以推动全面创新已成为科学技术发展和经济增长的关键动力。然而,传统的颠覆性技术主题识别方法主要依赖于单一模态数据,存在一定的局限性。本文基于CLIP(contr...当前,全球科技创新呈现高速发展和高度融合的态势。准确识别出颠覆性技术主题以推动全面创新已成为科学技术发展和经济增长的关键动力。然而,传统的颠覆性技术主题识别方法主要依赖于单一模态数据,存在一定的局限性。本文基于CLIP(contrastive language-image pre-training)和LDAGV(linear discriminant analysis&global vectors for word representation)模型构建新闻文本与图像特征融合向量,通过k-means聚类迭代并结合3个颠覆性技术主题指标进行筛选,实现了多模态信息的融合以及主题的精准识别。以新能源领域为例,验证了该模型在颠覆性技术主题识别方面的可行性和有效性。与其他单一模态模型相比,多模态信息融合模型在颠覆性技术主题识别方面更具优势。展开更多
To improve aneurysm treatment,this study examined the influence of clip locations on hemodynamic factors in patient-specific anterior communicating artery(ACoA)aneurysms with different aneurysmal angle.We proposed a s...To improve aneurysm treatment,this study examined the influence of clip locations on hemodynamic factors in patient-specific anterior communicating artery(ACoA)aneurysms with different aneurysmal angle.We proposed a simplified classification of ACoA aneurysms using aneurysmal angle,defined by the angle of pivot of the aneurysmal dome and the virtual two-dimensional plane created by both proximal A2 segments of anterior cerebral artery(ACA).ACoA aneurysms with three different aneurysmal angles,which are 15°,80°and 120°,were analyzed in our study.In this work,we obtained hemodynamics before and after clipping surgery with three clip locations based on clinical clipping strategies in three ACoA aneurysms with different aneurysm angles.Results showed that local high pressure occurs at impingement region of the ACoA aneurysm before clipping and new impingement region close to the clipping location after clipping treatment.For clipping the aneurysm with aneurysmal angle 15°and a wide neck,wall shear stress(WSS)distribution is more uniform when the clipping angle of two clips close to 180°comparing with other two angles.In addition,for clipping the aneurysm with aneurysmal angle 80°and 120°,local high pressure appears on new impingement region and high WSS distributes around the clipping location when the clip plane is normal to the direction of inflow of aneurysm from the dominance of A1 segment of ACA.Hence,we should avoid the impingement of inflow from the A1 segment and choose a favorable clipping location for the fastness of clip.The results of our study could preoperatively give a useful information to the decision of surgical plan.展开更多
Named Entity Recognition(NER)stands as a fundamental task within the field of biomedical text mining,aiming to extract specific types of entities such as genes,proteins,and diseases from complex biomedical texts and c...Named Entity Recognition(NER)stands as a fundamental task within the field of biomedical text mining,aiming to extract specific types of entities such as genes,proteins,and diseases from complex biomedical texts and categorize them into predefined entity types.This process can provide basic support for the automatic construction of knowledge bases.In contrast to general texts,biomedical texts frequently contain numerous nested entities and local dependencies among these entities,presenting significant challenges to prevailing NER models.To address these issues,we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer(RoBGP).Our model initially utilizes the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors.It then incorporates a Bidirectional Long Short-Term Memory network for capturing bidirectional semantic information,effectively addressing the issue of long-distance dependencies.Furthermore,the Global Pointer model is employed to comprehensively recognize all nested entities in the text.We conduct extensive experiments on the Chinese medical dataset CMeEE and the results demonstrate the superior performance of RoBGP over several baseline models.This research confirms the effectiveness of RoBGP in Chinese biomedical NER,providing reliable technical support for biomedical information extraction and knowledge base construction.展开更多
We modified a three-dimensional cerebral aneurysm model for surgical simulation and educational demonstration. Novel models are made showing perforating arteries arising around the aneurysm. Information about perforat...We modified a three-dimensional cerebral aneurysm model for surgical simulation and educational demonstration. Novel models are made showing perforating arteries arising around the aneurysm. Information about perforating arteries is difficult to obtain from individual radiological data sets. Perforators are therefore reproduced based on previous anatomical knowledge instead of personal data. Due to their fragility, perforating arteries are attached to the model using hard materials. At the same time, hollow models are useful for practicing clip application. We made a model for practicing the application of fenestrated clips for paraclinoid internal carotid aneurysms. Situating aneurysm models in the fissure of a brain model simulates the real surgical field and is helpful for educational demonstrations.展开更多
In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and comput...In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and computing power advance,the issue of increasingly larger models and a growing number of parameters has surfaced.Consequently,model training has become more costly and less efficient.To enhance the efficiency and accuracy of the training process while reducing themodel volume,this paper proposes a first-order pruningmodel PAL-BERT based on the ALBERT model according to the characteristics of question-answering(QA)system and language model.Firstly,a first-order network pruning method based on the ALBERT model is designed,and the PAL-BERT model is formed.Then,the parameter optimization strategy of the PAL-BERT model is formulated,and the Mish function was used as an activation function instead of ReLU to improve the performance.Finally,after comparison experiments with traditional deep learning models TextCNN and BiLSTM,it is confirmed that PALBERT is a pruning model compression method that can significantly reduce training time and optimize training efficiency.Compared with traditional models,PAL-BERT significantly improves the NLP task’s performance.展开更多
Sentence classification is the process of categorizing a sentence based on the context of the sentence.Sentence categorization requires more semantic highlights than other tasks,such as dependence parsing,which requir...Sentence classification is the process of categorizing a sentence based on the context of the sentence.Sentence categorization requires more semantic highlights than other tasks,such as dependence parsing,which requires more syntactic elements.Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence,recognizing the progress and comparing impacts.An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus.The conversational sentences are classified into four categories:information,question,directive,and commission.These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation.Ensemble of Bidirectional Encoder for Representation of Transformer(BERT),Robustly Optimized BERT pretraining Approach(RoBERTa),Generative Pre-Trained Transformer(GPT),DistilBERT and Generalized Autoregressive Pretraining for Language Understanding(XLNet)models are trained on conversation corpus with hyperparameters.Hyperparameter tuning approach is carried out for better performance on sentence classification.This Ensemble of Pre-trained Language Models with a Hyperparameter Tuning(EPLM-HT)system is trained on an annotated conversation dataset.The proposed approach outperformed compared to the base BERT,GPT,DistilBERT and XLNet transformer models.The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.展开更多
Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval per...Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval performance, as it determines the quality of the visual content representation. Traditional sampling methods, such as uniform sampling and optical flow-based techniques, often fail to capture the full semantic range of videos, leading to redundancy and inefficiencies. In this work, we propose CLIP4Video-Sampling: Global Semantics-Guided Multi-Granularity Frame Sampling for Video-Text Retrieval, a global semantics-guided multi-granularity frame sampling strategy designed to optimize both computational efficiency and retrieval accuracy. By integrating multi-scale global and local temporal sampling and leveraging the CLIP (Contrastive Language-Image Pre-training) model’s powerful feature extraction capabilities, our method significantly outperforms existing approaches in both zero-shot and fine-tuned video-text retrieval tasks on popular datasets. CLIP4Video-Sampling reduces redundancy, ensures keyframe coverage, and serves as an adaptable pre-processing module for multimodal models.展开更多
文摘Panoramic images, offering a 360-degree view, are essential in virtual reality(VR) and augmented reality(AR), enhancing realism with high-quality textures. However, acquiring complete and high-quality panoramic textures is challenging. This paper introduces a method using generative adversarial networks(GANs) and the contrastive language-image pretraining(CLIP) model to restore and control texture in panoramic images. The GAN model captures complex structures and maintains consistency, while CLIP enables fine-grained texture control via semantic text-image associations. GAN inversion optimizes latent codes for precise texture details. The resulting low dynamic range(LDR) images are converted to high dynamic range(HDR) using the Blender engine for seamless texture blending. Experimental results demonstrate the effectiveness and flexibility of this method in panoramic texture restoration and generation.
文摘当前,全球科技创新呈现高速发展和高度融合的态势。准确识别出颠覆性技术主题以推动全面创新已成为科学技术发展和经济增长的关键动力。然而,传统的颠覆性技术主题识别方法主要依赖于单一模态数据,存在一定的局限性。本文基于CLIP(contrastive language-image pre-training)和LDAGV(linear discriminant analysis&global vectors for word representation)模型构建新闻文本与图像特征融合向量,通过k-means聚类迭代并结合3个颠覆性技术主题指标进行筛选,实现了多模态信息的融合以及主题的精准识别。以新能源领域为例,验证了该模型在颠覆性技术主题识别方面的可行性和有效性。与其他单一模态模型相比,多模态信息融合模型在颠覆性技术主题识别方面更具优势。
基金This work was kindly supported by National Natural Science Foundation of China(11602053,51576033)Education Department of Liaoning Province general project(L2015113).
文摘To improve aneurysm treatment,this study examined the influence of clip locations on hemodynamic factors in patient-specific anterior communicating artery(ACoA)aneurysms with different aneurysmal angle.We proposed a simplified classification of ACoA aneurysms using aneurysmal angle,defined by the angle of pivot of the aneurysmal dome and the virtual two-dimensional plane created by both proximal A2 segments of anterior cerebral artery(ACA).ACoA aneurysms with three different aneurysmal angles,which are 15°,80°and 120°,were analyzed in our study.In this work,we obtained hemodynamics before and after clipping surgery with three clip locations based on clinical clipping strategies in three ACoA aneurysms with different aneurysm angles.Results showed that local high pressure occurs at impingement region of the ACoA aneurysm before clipping and new impingement region close to the clipping location after clipping treatment.For clipping the aneurysm with aneurysmal angle 15°and a wide neck,wall shear stress(WSS)distribution is more uniform when the clipping angle of two clips close to 180°comparing with other two angles.In addition,for clipping the aneurysm with aneurysmal angle 80°and 120°,local high pressure appears on new impingement region and high WSS distributes around the clipping location when the clip plane is normal to the direction of inflow of aneurysm from the dominance of A1 segment of ACA.Hence,we should avoid the impingement of inflow from the A1 segment and choose a favorable clipping location for the fastness of clip.The results of our study could preoperatively give a useful information to the decision of surgical plan.
基金supported by the Outstanding Youth Team Project of Central Universities(QNTD202308)the Ant Group through CCF-Ant Research Fund(CCF-AFSG 769498 RF20220214).
文摘Named Entity Recognition(NER)stands as a fundamental task within the field of biomedical text mining,aiming to extract specific types of entities such as genes,proteins,and diseases from complex biomedical texts and categorize them into predefined entity types.This process can provide basic support for the automatic construction of knowledge bases.In contrast to general texts,biomedical texts frequently contain numerous nested entities and local dependencies among these entities,presenting significant challenges to prevailing NER models.To address these issues,we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer(RoBGP).Our model initially utilizes the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors.It then incorporates a Bidirectional Long Short-Term Memory network for capturing bidirectional semantic information,effectively addressing the issue of long-distance dependencies.Furthermore,the Global Pointer model is employed to comprehensively recognize all nested entities in the text.We conduct extensive experiments on the Chinese medical dataset CMeEE and the results demonstrate the superior performance of RoBGP over several baseline models.This research confirms the effectiveness of RoBGP in Chinese biomedical NER,providing reliable technical support for biomedical information extraction and knowledge base construction.
文摘We modified a three-dimensional cerebral aneurysm model for surgical simulation and educational demonstration. Novel models are made showing perforating arteries arising around the aneurysm. Information about perforating arteries is difficult to obtain from individual radiological data sets. Perforators are therefore reproduced based on previous anatomical knowledge instead of personal data. Due to their fragility, perforating arteries are attached to the model using hard materials. At the same time, hollow models are useful for practicing clip application. We made a model for practicing the application of fenestrated clips for paraclinoid internal carotid aneurysms. Situating aneurysm models in the fissure of a brain model simulates the real surgical field and is helpful for educational demonstrations.
基金Supported by Sichuan Science and Technology Program(2021YFQ0003,2023YFSY0026,2023YFH0004).
文摘In the field of natural language processing(NLP),there have been various pre-training language models in recent years,with question answering systems gaining significant attention.However,as algorithms,data,and computing power advance,the issue of increasingly larger models and a growing number of parameters has surfaced.Consequently,model training has become more costly and less efficient.To enhance the efficiency and accuracy of the training process while reducing themodel volume,this paper proposes a first-order pruningmodel PAL-BERT based on the ALBERT model according to the characteristics of question-answering(QA)system and language model.Firstly,a first-order network pruning method based on the ALBERT model is designed,and the PAL-BERT model is formed.Then,the parameter optimization strategy of the PAL-BERT model is formulated,and the Mish function was used as an activation function instead of ReLU to improve the performance.Finally,after comparison experiments with traditional deep learning models TextCNN and BiLSTM,it is confirmed that PALBERT is a pruning model compression method that can significantly reduce training time and optimize training efficiency.Compared with traditional models,PAL-BERT significantly improves the NLP task’s performance.
文摘Sentence classification is the process of categorizing a sentence based on the context of the sentence.Sentence categorization requires more semantic highlights than other tasks,such as dependence parsing,which requires more syntactic elements.Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence,recognizing the progress and comparing impacts.An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus.The conversational sentences are classified into four categories:information,question,directive,and commission.These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation.Ensemble of Bidirectional Encoder for Representation of Transformer(BERT),Robustly Optimized BERT pretraining Approach(RoBERTa),Generative Pre-Trained Transformer(GPT),DistilBERT and Generalized Autoregressive Pretraining for Language Understanding(XLNet)models are trained on conversation corpus with hyperparameters.Hyperparameter tuning approach is carried out for better performance on sentence classification.This Ensemble of Pre-trained Language Models with a Hyperparameter Tuning(EPLM-HT)system is trained on an annotated conversation dataset.The proposed approach outperformed compared to the base BERT,GPT,DistilBERT and XLNet transformer models.The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.
文摘Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval performance, as it determines the quality of the visual content representation. Traditional sampling methods, such as uniform sampling and optical flow-based techniques, often fail to capture the full semantic range of videos, leading to redundancy and inefficiencies. In this work, we propose CLIP4Video-Sampling: Global Semantics-Guided Multi-Granularity Frame Sampling for Video-Text Retrieval, a global semantics-guided multi-granularity frame sampling strategy designed to optimize both computational efficiency and retrieval accuracy. By integrating multi-scale global and local temporal sampling and leveraging the CLIP (Contrastive Language-Image Pre-training) model’s powerful feature extraction capabilities, our method significantly outperforms existing approaches in both zero-shot and fine-tuned video-text retrieval tasks on popular datasets. CLIP4Video-Sampling reduces redundancy, ensures keyframe coverage, and serves as an adaptable pre-processing module for multimodal models.