Funding: Supported by the National Natural Science Foundation of China, No. 81560278; the "Summit Plan (New Departure)" Project for the Development of Doctoral Degree Authorization Points and Professional Disciplines at the Affiliated Hospital of Youjiang Medical University for Nationalities, No. DF20244433; the Self-funded Research Project by the Guangxi Health and Wellness Committee, No. ZL20240824 and No. Z-L20240834; and the Project to Enhance the Research Foundations of Young and Mid-career Faculty in Guangxi Universities, No. 2024KY0562 and No. 2024KY0559.
Abstract:
BACKGROUND: Microvascular invasion (MVI) is an important prognostic factor in hepatocellular carcinoma (HCC), but its preoperative prediction remains challenging.
AIM: To develop and validate a 2.5-dimensional (2.5D) deep learning-based multi-instance learning (MIL) model (MIL signature) for predicting MVI in HCC, to compare its performance against a radiomics signature and a clinical signature, and to assess its prognostic value in both surgical resection and transcatheter arterial chemoembolization (TACE) cohorts.
METHODS: A retrospective cohort of 192 patients with pathologically confirmed HCC was included, of whom 68 were MVI-positive and 124 were MVI-negative. The patients were randomly assigned to a training set (134 patients) and a validation set (58 patients) in a 7:3 ratio. An additional 45 HCC patients undergoing TACE treatment formed the TACE validation cohort. The modeling strategy used computed tomography arterial phase images, combining 2.5D deep learning with a MIL framework to predict MVI in HCC. This method was compared with the radiomics signature and the clinical signature. Predictive performance was evaluated using receiver operating characteristic curves and decision curve analysis (DCA), with DeLong's test applied to compare the area under the curve (AUC) between models. Kaplan-Meier curves were used to analyze differences in recurrence-free survival (RFS) or progression-free survival (PFS) among HCC treatment cohorts stratified by MIL signature risk.
RESULTS: The MIL signature performed best in the validation set (AUC = 0.877), significantly surpassing the radiomics signature (AUC = 0.727, P = 0.047) and the clinical signature (AUC = 0.631, P = 0.004). DCA curves indicated that the MIL signature provided a greater clinical net benefit across the full spectrum of risk thresholds. In the prognostic analysis, high- and low-risk groups stratified by the MIL signature differed significantly in RFS within the surgical resection cohort (training set P = 0.0058, validation set P = 0.031) and in PFS within the TACE treatment cohort (P = 0.045).
CONCLUSION: The MIL signature predicts MVI in HCC more accurately than the radiomics signature and the clinical signature, and it offers precise prognostic stratification, thereby providing new technical support for personalized HCC treatment strategies.
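The AUC values reported above summarize how well each signature ranks MVI-positive patients above MVI-negative ones. As a minimal sketch of that quantity (not the study's code), the function below computes AUC as the fraction of positive/negative pairs that are ranked correctly, counting ties as half. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Return the AUC as the probability that a randomly chosen positive
    case scores higher than a randomly chosen negative case (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (hypothetical, not from the study).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
toy_auc = auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, which is fine for cohorts of this size; a rank-based formulation gives the same value more efficiently.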
Funding: This work is supported by the Academic Research Project of Henan Police College (Grant: HNJY-2021-QN-14 and HNJY202220) and the Key Technology R&D Program of Henan Province (Grant: 222102210041).
Abstract: Images have become an essential medium for expressing meaning and disseminating information. Many images are uploaded to the Internet, and some of them are pornographic, with adverse effects on public psychological health. To create a clean and positive Internet environment, network enforcement agencies need an automatic and efficient pornographic image recognition tool. Previous studies on pornographic images mainly rely on convolutional neural networks (CNNs). Because CNNs have many parameters, they depend on a large labeled training dataset, which is laborious to build. To reduce the effect of the dataset on recognition performance, many researchers treat pornographic image recognition as a binary classification task. In practice, when faced with pornographic images with varied features, the performance and recognition accuracy of the network model often decrease. In addition, the pornographic content in an image usually lies in several small local regions that make up only a small proportion of the image. As a strongly supervised learning method, a CNN usually cannot automatically focus on the pornographic area of the image, which reduces recognition accuracy. This paper establishes an image dataset with seven classes by crawling pornographic websites and the Baidu Image Library. A weakly supervised pornographic image recognition method based on multiple instance learning (MIL) is proposed. The Squeeze-and-Excitation (SE) module is introduced in feature extraction to strengthen critical information and weaken the influence of non-key and useless information on the recognition result. To meet the requirements of the pooling layer operation in MIL, an attention mechanism is introduced to weight and average instances. The experimental results show that the proposed method achieves better accuracy and F1 scores than other methods.
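The attention-based pooling step described above (weighting and averaging instances to form a bag representation) can be sketched as follows. This is a simplified illustration, not the paper's network: it assumes a single linear scoring vector `w` in place of a learned attention sub-network, and the names `attention_pool`, `instances`, and `w` are hypothetical.

```python
import math


def attention_pool(instances, w):
    """Pool instance feature vectors into one bag vector.

    Each instance gets a scalar score (here a plain dot product with w,
    which is an assumption), the scores are turned into weights with a
    softmax, and the bag vector is the weighted average of the instances.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in instances]
    top = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]     # non-negative, sums to 1
    dim = len(instances[0])
    bag = [sum(a * x[d] for a, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Two toy instances (e.g. features of two image regions); values are made up.
toy_instances = [[1.0, 0.0], [3.0, 2.0]]
uniform_bag, uniform_weights = attention_pool(toy_instances, [0.0, 0.0])
focused_bag, focused_weights = attention_pool(toy_instances, [1.0, 0.0])
```

With an all-zero `w` every instance receives the same weight, so the pooling reduces to mean pooling; a non-zero `w` shifts weight toward the instances it scores highest, which is how a small local region can dominate the bag-level prediction.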
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62376281 and 62036013) and the NSF for Huxiang Young Talents Program of Hunan Province (2021RC3070).
Abstract: With its powerful expressiveness for objects with multiple semantics and its great flexibility for complex object structures, multi-instance multi-label learning (MIML) has been widely applied. In practical MIML tasks, the naturally skewed label distribution and label interdependence give rise to a label imbalance problem that degrades model performance and has rarely been studied. To solve these problems, we propose an imbalanced multi-instance multi-label learning method via tensor product-based semantic fusion (IMIML-TPSF) that deals with label interdependence and label distribution imbalance simultaneously. Specifically, to reduce the effect of label interdependence, it models the similarity between the query object and the object sets of different label classes to obtain similarity-structural features. To alleviate the disturbance caused by the imbalanced label distribution, it builds an ensemble model to obtain imbalanced-distribution features. IMIML-TPSF then fuses the two types of features by tensor product and generates a new feature vector that preserves both the original and the interactive feature information for each bag. Based on these semantically rich features, it trains a robust generalized linear classification model and further captures label interdependence. Extensive experimental results on several datasets validate the effectiveness of IMIML-TPSF against state-of-the-art methods.
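To make the tensor-product fusion step concrete, here is a minimal sketch under stated assumptions. The abstract does not say how the original features are preserved next to the interaction terms; this example assumes plain concatenation of the two input vectors with their flattened outer product. The function name `tensor_fuse` and the toy vectors are hypothetical.

```python
def tensor_fuse(u, v):
    """Fuse two feature vectors for one bag.

    The outer (tensor) product u_i * v_j captures pairwise interactions;
    concatenating u and v in front keeps the original features as well
    (the concatenation is an assumption, not taken from the paper).
    """
    outer = [ui * vj for ui in u for vj in v]   # flattened row by row
    return list(u) + list(v) + outer


# Toy feature vectors; the values are made up for illustration.
similarity_features = [1.0, 2.0]        # stands in for similarity-structural features
imbalance_features = [3.0, 4.0, 5.0]    # stands in for imbalanced-distribution features
fused = tensor_fuse(similarity_features, imbalance_features)
```

The fused vector has length len(u) + len(v) + len(u) * len(v), so the interaction block grows quickly with the input dimensions; that is the usual cost of tensor-product fusion.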
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60105004 and 60325207.
Acknowledgements: The author thanks Min-Ling Zhang for running the experiments, Giancarlo Ruffo for providing the code of RELIC, and Nicolas Bredeche for providing the code of RIPPER-MI. A preliminary version of this paper was presented at ECML'03 (the 14th European Conference on Machine Learning).
Abstract: In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper studies multi-instance learning from the viewpoint of supervised learning. First, by analyzing some representative learning algorithms, it shows that multi-instance learners can be derived from supervised learners by shifting their focus from discrimination on instances to discrimination on bags. Second, considering that ensemble learning paradigms can effectively enhance supervised learners, it proposes building multi-instance ensembles to solve multi-instance problems. Experiments on a real-world benchmark test show that ensemble learning paradigms can significantly enhance multi-instance learners.
Abstract: We investigate a problem of object-oriented (OO) software quality estimation from a multi-instance (MI) perspective. In detail, each set of classes that have an inheritance relation, named a 'class hierarchy', is regarded as a bag, while each class in the set is regarded as an instance. The learning task in this study is to estimate the label of unseen bags, i.e., the fault-proneness of untested class hierarchies. A fault-prone class hierarchy contains at least one fault-prone (negative) class, while a non-fault-prone (positive) one has no negative class. Based on the modification records (MRs) of the previous project releases and OO software metrics, the fault-proneness of an untested class hierarchy can be predicted. Several selected MI learning algorithms were evaluated on five datasets collected from an industrial software project. Among the MI learning algorithms investigated in the experiments, the kernel method using a dedicated MI-kernel was better than the others in accurately and correctly predicting the fault-proneness of the class hierarchies. In addition, when compared to a supervised support vector machine (SVM) algorithm, the MI-kernel method still had a competitive performance with much less cost.
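The bag-labeling rule stated above (a class hierarchy is fault-prone if at least one of its classes is fault-prone) can be written down directly. The sketch below only illustrates how bag labels follow from instance labels; it is not one of the MI learning algorithms evaluated in the study, and the hierarchy names and flags are hypothetical.

```python
def hierarchy_is_fault_prone(class_flags):
    """Return True if any class (instance) in the hierarchy (bag) is fault-prone."""
    return any(class_flags)


# Hypothetical class hierarchies: each maps to per-class fault-prone flags.
toy_hierarchies = {
    "ShapeHierarchy": [False, False, True],   # one fault-prone class
    "LoggerHierarchy": [False, False],        # no fault-prone class
}
toy_bag_labels = {
    name: hierarchy_is_fault_prone(flags)
    for name, flags in toy_hierarchies.items()
}
```

In the learning task itself the instance flags are unknown for untested hierarchies, so the learner must predict the bag label from class-level metrics rather than apply this rule directly.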