Funding: Supported by the Deanship of Scientific Research at Shaqra University.
Abstract: The Internet revolution has produced abundant data from various sources, including social media, traditional media, etc. Although the availability of data is no longer an issue, labelling data so that it can be used in supervised machine learning is still an expensive process that involves tedious human effort. The overall purpose of this study is to propose a strategy for automatically labelling unlabeled textual data with the support of active learning in combination with deep learning. More specifically, this study assesses the performance of different active learning strategies in the automatic labelling of textual datasets at the sentence and document levels. To achieve this objective, different experiments were performed on a publicly available dataset. In the first set of experiments, we randomly choose a subset of instances from the training dataset and train a deep neural network to assess its performance on the test set. In the second set of experiments, we replace the random selection with different active learning strategies to choose a subset of the training dataset, train the same model, and reassess its performance on the test set. The experimental results suggest that the active learning strategies yield a performance improvement of 7% on document-level datasets and 3% on sentence-level datasets for auto labelling.
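The contrast between the two sets of experiments can be sketched in a few lines. This is only an illustration: the abstract does not name the active learning strategies that were evaluated, so least-confidence sampling is assumed here as one common choice, and the function names and the probabilities in `pool_probs` are hypothetical.

```python
import random


def random_select(pool_size, k, seed=0):
    """Baseline (first set of experiments): pick k pool indices at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(pool_size), k))


def least_confidence_select(probs, k):
    """Assumed strategy: pick the k instances the model is least sure about.

    probs holds one list of class probabilities per unlabeled instance.
    """
    # Uncertainty score = 1 - probability of the most likely class.
    scores = [1.0 - max(p) for p in probs]
    ranked = sorted(range(len(probs)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical model outputs for five unlabeled texts (two classes).
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.51, 0.49],
    [0.80, 0.20],
]

baseline = random_select(len(pool_probs), k=2)
uncertain = least_confidence_select(pool_probs, k=2)
print("random:", baseline)
print("least confidence:", uncertain)
```

In both cases the selected indices would be sent for labelling and the same model retrained, so any difference in test performance comes from the selection step alone.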
Funding: Supported by the National Key R&D Program of China (No. 2023YFC3402800), the National Natural Science Foundation of China (Nos. 62371276, 62272288, and 82272084), and the Fundamental Research Funds for the Central Universities, Shaanxi Normal University (No. GK202302006).
Abstract: Immunohistochemistry (IHC) is a vital technique for detecting specific proteins and antigens in tissue sections using antibodies, aiding in the analysis of tumor growth and metastasis. However, IHC is costly and time-consuming, making it challenging to implement on a large scale. To address this issue, we introduce a method that enables virtual IHC staining directly on Hematoxylin and Eosin (H&E) images. First, we have developed a novel registration technique, called Bi-stage Registration based on density Clustering (BiReC), to enhance the registration efficiency between H&E and IHC images. This method automatically generates numerous Region Of Interest (ROI) labels on the H&E image for model training, with the labels determined by the intensity of IHC staining. Second, we propose a novel two-branch network architecture, called SeaConvNeXt, which integrates a lightweight Squeeze-Enhanced Axial (SEA) attention mechanism to efficiently extract and fuse multi-level local and global features from H&E images for direct prediction of specific proteins and antigens. SeaConvNeXt consists of a ConvNeXt branch and a global fusion branch. The ConvNeXt branch extracts multi-level local features at four stages, while the global fusion branch, which includes an SEA Transformer module and three global blocks, is designed for global feature extraction and multiple feature fusion. Our experiments demonstrate that SeaConvNeXt outperforms current state-of-the-art methods on two public datasets with corresponding IHC and H&E images, achieving an AUC of 90.7% on the HER2SC dataset and 82.5% on the CRC dataset. These results suggest that SeaConvNeXt has great potential for predicting virtual IHC staining on H&E images.
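For readers unfamiliar with the reported metric, the sketch below shows how an AUC value can be computed for a binary classifier, using the standard pairwise definition (the probability that a randomly chosen positive scores higher than a randomly chosen negative). The labels and scores are made up for illustration and are unrelated to the HER2SC or CRC results; the abstract does not describe the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical patch-level labels (1 = stain-positive) and model scores.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic loop is fine for a small illustration; on large datasets a rank-based implementation from an established library would normally be used instead.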