Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62376281 and 62036013) and the NSF for Huxiang Young Talents Program of Hunan Province (2021RC3070).
Abstract: Multi-instance multi-label learning (MIML) is highly expressive for objects with multiple semantics and flexible for complex object structures, so it has been widely applied. In practical MIML tasks, the naturally skewed label distribution and label interdependence give rise to the label imbalance issue and degrade model performance, yet this problem is rarely studied. To solve these problems, we propose an imbalanced multi-instance multi-label learning method via tensor product-based semantic fusion (IMIML-TPSF) that deals with label interdependence and label distribution imbalance simultaneously. Specifically, to reduce the effect of label interdependence, it models the similarity between the query object and the object sets of different label classes to obtain similarity-structural features. To alleviate the disturbance caused by the imbalanced label distribution, it builds an ensemble model to obtain imbalanced-distribution features. Subsequently, IMIML-TPSF fuses the two types of features by tensor product and generates a new feature vector that preserves both the original and the interactive feature information for each bag. Based on these semantically rich features, it trains a robust generalized linear classification model and further captures label interdependence. Extensive experimental results on several datasets validate the effectiveness of IMIML-TPSF against state-of-the-art methods.
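The fusion step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `tensor_product_fusion`, the example values, and in particular the trick of prefixing each vector with a constant 1.0 are assumptions made here to show one common way a tensor product can keep the original features alongside their pairwise interactions.

```python
from typing import List


def tensor_product_fusion(sim_feats: List[float], imb_feats: List[float]) -> List[float]:
    """Fuse two feature vectors by a flattened outer (tensor) product.

    Assumption (not stated in the abstract): each vector is prefixed with a
    constant 1.0, so the result contains the original features of both
    inputs as well as every pairwise interaction term.
    """
    a = [1.0] + list(sim_feats)   # hypothetical similarity-structural features
    b = [1.0] + list(imb_feats)   # hypothetical imbalanced-distribution features
    # Row-major flattening of the outer product of a and b.
    return [x * y for x in a for y in b]


# Toy bag with 2 similarity features and 3 imbalance features.
fused = tensor_product_fusion([0.2, 0.5], [0.9, 0.1, 0.4])
print(len(fused))  # (2 + 1) * (3 + 1) entries
```

With this layout, the first row of the product reproduces the second input, the first column reproduces the first input, and the remaining entries are the interaction terms. The fused vector could then be passed to any generalized linear classifier.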
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 61701281, 61573219, and 61876098), the Shandong Provincial Natural Science Foundation (ZR2016FM34 and ZR2017QF009), the Shandong Science and Technology Development Plan (J18KA375), the Shandong Social Science Project (18BJYJ04), and the Fostering Project of Dominant Discipline and Talent Team of Shandong Province Higher Education Institutions.
Abstract: In higher education, the initial study period of each course plays a crucial role for students and strongly influences their subsequent learning activities. However, given the large number of students enrolled in a university course, it has become impossible for teachers to keep track of the performance of individual students. In this circumstance, an academic early warning system is desirable, one that automatically detects students with learning difficulties (i.e., at-risk students) before a course starts. Previous studies are not well suited to this purpose for two reasons: 1) they have mainly concentrated on e-learning platforms, e.g., massive open online courses (MOOCs), and relied on data about students' online activities, which are rarely available in traditional teaching scenarios; and 2) they have only made performance predictions when a course is in progress or even close to its end. In this paper, for traditional classroom-teaching scenarios, we investigate the task of pre-course student performance prediction, which refers to detecting at-risk students for each course before its commencement. To better represent a student sample and utilize the correlations among courses, we cast the problem as a multi-instance multi-label (MIML) problem. In addition, given the problem of data scarcity, we propose a novel multi-task learning method, MIML-Circle, to predict the performance of students from different specialties in a unified framework. Extensive experiments are conducted on five real-world datasets, and the results demonstrate the superiority of our approach over state-of-the-art methods.