Short video applications like TikTok have seen significant growth in recent years. One common behavior of users on these platforms is watching and swiping through videos, which can lead to a significant waste of bandwidth. As such, an important challenge in short video streaming is to design a preloading algorithm that can effectively decide which videos to download, at what bitrate, and when to pause the download in order to reduce bandwidth waste while improving the Quality of Experience (QoE). However, designing such an algorithm is non-trivial, especially when considering the conflicting objectives of minimizing bandwidth waste and maximizing QoE. In this paper, we propose an end-to-end Deep reinforcement learning framework with Action Masking called DAM that leverages domain knowledge to learn an optimal policy for short video preloading. To achieve this, we introduce a reward shaping technique to minimize bandwidth waste and use action masking to make actions more reasonable, reduce playback rebuffering, and accelerate the training process. We have conducted extensive experiments using real-world video datasets and network traces including 4G/WiFi/5G. Our results show that DAM improves the QoE score by 3.73%-11.28% compared to state-of-the-art algorithms, and achieves an average bandwidth waste of only 10.27%-12.07%, outperforming all baseline methods.
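The action-masking idea can be sketched in a few lines: invalid or wasteful actions are given zero probability before the policy samples. This is a minimal illustration only; the action-space layout, the masking rules (skip fully buffered videos, force a pause when the buffer is long), and all thresholds are assumptions, not DAM's published design.

```python
import torch

# Illustrative action masking for a preloading policy (not DAM's exact rules):
# the last action is "pause"; the others preload one of the candidate videos.
def masked_action(logits, buffer_s, fully_downloaded, max_buffer_s=10.0):
    mask = torch.ones_like(logits, dtype=torch.bool)
    mask[fully_downloaded] = False                 # never re-download a finished video
    if buffer_s >= max_buffer_s:
        mask[:-1] = False                          # only "pause" stays valid
    masked_logits = logits.masked_fill(~mask, float("-inf"))
    probs = torch.softmax(masked_logits, dim=-1)   # masked actions get zero probability
    return torch.multinomial(probs, 1).item()

logits = torch.randn(6)                            # 5 preload actions + 1 pause action
print(masked_action(logits, buffer_s=3.2, fully_downloaded=torch.tensor([1])))
```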
With the explosive growth of available data, there is an urgent need for continuous data mining that markedly reduces manual interaction. A novel model for data mining in an evolving environment is proposed. First, valid mining task schedules are generated; then autonomous, local mining is executed periodically; finally, previous results are merged and refined. The framework based on the model creates a communication mechanism to incorporate domain knowledge into the continuous process through an ontology service. The local and merge mining are made transparent to the end user and to heterogeneous data sources by the ontology. Experiments suggest that the framework should be useful in guiding the continuous mining process.
Extracting mining subsidence land from RS images is one of the important research topics for environmental monitoring in mining areas. The accuracy of traditional extraction models based on spectral features is low. To extract subsidence land from RS images with high accuracy, domain knowledge should be imported and new models should be proposed. Addressing the disadvantages of traditional extraction models, this paper imports domain knowledge from practice and experience, converts semantic knowledge into digital information, and proposes a new model for this specific task. Taking the Luan mining area as the study area, the new model is tested based on GIS and related knowledge. The results show that the proposed method is more precise than traditional methods and can satisfy the demands of land subsidence monitoring in mining areas.
Immune evolutionary algorithms with domain knowledge were presented to solve the problem of simultaneous localization and mapping for a mobile robot in unknown environments. Two operators with domain knowledge were designed: the feature of parallel line segments, which avoids the data association problem, was used to construct a vaccination operator, and the characteristics of convex vertices in polygonal obstacles were extended to develop a key-point-grid pulling operator. Experimental results on a real mobile robot show that the computational expense of the designed algorithms is lower than that of other evolutionary algorithms for simultaneous localization and mapping, and the maps obtained are very accurate. Owing to these advantages, the convergence rate of the designed algorithms is about 44% higher than those of other algorithms.
A mathematical formula with high physical interpretability, accurate prediction, and strong generalization power is highly desirable for science, technology, and engineering. In this study, we performed domain knowledge-guided machine learning to discover a highly interpretable formula describing the high-temperature oxidation behavior of FeCrAlCoNi-based high entropy alloys (HEAs). The domain knowledge suggests that the exposure-time-dependent and thermally activated oxidation behavior can be described by the synergy formula of a power law multiplied by the Arrhenius equation. The pre-factor, time exponent (m), and activation energy (Q) depend on the chemical compositions of eight elements in the FeCrAlCoNi-based HEAs. The Tree-Classifier for Linear Regression (TCLR) algorithm utilizes the two experimental features of exposure time (t) and temperature (T) to extract the spectra of activation energy (Q) and time exponent (m) from the complex and high-dimensional feature space, which automatically gives the spectrum of the pre-factor. The three spectra are assembled by using the element features, which leads to a general and interpretable formula with a high prediction accuracy of determination coefficient R^2 = 0.971. The role of each chemical element in the high-temperature oxidation behavior is analytically illustrated in the three spectra, and thus the discovered interpretable formula provides guidance for the inverse design of HEAs against high-temperature oxidation. The present work demonstrates the significance of domain knowledge in the development of materials informatics.
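The general shape of that synergy formula can be written down directly from the abstract; the oxidation measure Δw, the gas constant R, and the composition vector c below are our notation and are not confirmed by the paper.

```latex
\Delta w(t, T, \mathbf{c}) \;=\; A(\mathbf{c})\, t^{\,m(\mathbf{c})}\,
\exp\!\left(-\frac{Q(\mathbf{c})}{R\,T}\right)
```

Here the pre-factor A, time exponent m, and activation energy Q depend on the chemical composition c of the eight elements, as stated in the abstract; TCLR extracts the m and Q spectra from the (t, T) features, and the element features then assemble the three spectra into one formula.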
With the rise of open-source software, the social development paradigm occupies an indispensable position in the current software development process. This paper puts forward a variant of the PageRank algorithm to build an importance assessment model, which provides quantifiable importance assessment metrics for new Java projects based on Java open-source projects or their components. The critical point of the model is to use crawlers to obtain relevant information about Java open-source projects in the GitHub open-source community and build a domain knowledge graph. Each project is then measured according to three dimensions: project influence, project activity, and project popularity. A modified PageRank algorithm is proposed to construct the importance evaluation model, thereby providing quantifiable importance evaluation indicators for new Java projects based on Java open-source projects or their components. This article evaluates the importance of 4512 Java open-source projects obtained from GitHub, with good results.
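As a rough illustration of the idea, a personalized PageRank over the project-dependency graph can bias ranks toward projects that score well on the three dimensions. The graph construction, the weighting scheme, and the damping factor below are assumptions for the sketch, not the paper's exact model.

```python
import numpy as np

def weighted_pagerank(adj, project_scores, d=0.85, iters=100, tol=1e-8):
    """Hypothetical sketch: PageRank over a project-dependency graph,
    personalized by per-project scores (influence + activity + popularity)."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)           # edge i -> j means project i depends on j
    out[out == 0] = 1.0                            # dangling-node mass is simply dropped here
    transition = (adj / out).T
    p = project_scores / project_scores.sum()      # personalization vector
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        new_rank = (1 - d) * p + d * transition @ rank
        if np.abs(new_rank - rank).sum() < tol:
            break
        rank = new_rank
    return rank

# Toy example: 4 projects, project 0 is depended on by projects 1, 2 and 3.
adj = np.array([[0, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 0],
                [1, 1, 0, 0]], dtype=float)
scores = np.array([3.0, 2.0, 1.0, 1.0])            # illustrative three-dimension scores
print(weighted_pagerank(adj, scores))
```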
Understanding the abnormal electricity usage behavior of buildings is essential to enhance the resilience, efficiency, and security of urban/building energy systems while safeguarding occupant comfort. However, data reflecting such behavior are often treated as outliers and removed or smoothed during preprocessing, limiting insights into their potential impacts. This paper proposes an abnormal behavior analysis method that identifies outliers (considering the data distribution) and anomalies (considering the physical context) based on statistical principles and domain knowledge, and assesses their effects on energy supply security. A 4-quadrant graph is proposed to quantify and categorize the impacts of buildings on urban energy systems. The method is illustrated with data from 1,451 buildings in a city. Results show that the proposed method can identify abnormal data effectively. Buildings in the primary industry have more outliers, while those in the tertiary industry have more anomalies. Seven buildings affecting both the security and the economy of urban energy systems are identified. Outliers arise more frequently from 8:00 to 18:00, on weekdays, and in the summer and winter months, whereas the anomaly distribution has only a weak connection with time. Moreover, the abnormal electricity usage behavior correlates positively with outdoor air temperature. This method provides a new perspective for identifying potential risks, managing energy usage behavior, and enhancing the flexibility of urban energy systems.
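A minimal sketch of the two-track idea follows: a distribution-based rule flags outliers, a physical-context rule flags anomalies, and the two rates place a building in a 4-quadrant chart. The 3-sigma rule, the capacity bounds, the rate thresholds, and the quadrant labels are all illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def classify_building(load, rated_capacity, z_thresh=3.0):
    """Hypothetical sketch: statistical outliers vs. physical anomalies,
    then a 4-quadrant placement based on the two rates."""
    mu, sigma = load.mean(), load.std()
    outliers = np.abs(load - mu) > z_thresh * sigma        # data-distribution rule (assumed 3-sigma)
    anomalies = (load < 0) | (load > rated_capacity)       # physical-context rule (assumed bounds)
    outlier_rate, anomaly_rate = outliers.mean(), anomalies.mean()
    quadrant = {
        (True, True): "Q1: concerns both economy and supply security",
        (True, False): "Q2: mainly an economic concern",
        (False, True): "Q3: mainly a supply-security concern",
        (False, False): "Q4: low impact",
    }[(bool(outlier_rate > 0.01), bool(anomaly_rate > 0.01))]  # 1% thresholds are illustrative
    return outlier_rate, anomaly_rate, quadrant

rng = np.random.default_rng(0)
hourly_load = rng.normal(500, 50, size=24 * 365)            # synthetic hourly kW readings
hourly_load[::50] = 1200                                    # inject abnormal spikes
print(classify_building(hourly_load, rated_capacity=900.0))
```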
One key challenge in materials informatics is how to effectively use material data of small size to search for desired materials in a huge unexplored material space. We review the recent progress on the use of tools from data science and domain knowledge to mitigate the issues arising from limited materials data. The enhancement of data quality and quantity via data augmentation and feature engineering is first summarized and discussed. Then the strategies that use ensemble models and transfer learning to improve machine learning models are overviewed. Next, we move to active learning, with emphasis on uncertainty quantification and evaluation. Subsequently, the merits of combining domain knowledge and machine learning are stressed. Finally, we discuss some applications of large language models in the field of materials science. We conclude this review by posing the challenges and opportunities in the field of machine learning for small material data.
Meta-heuristic algorithms search the problem solution space to obtain a satisfactory solution within a reasonable timeframe. By incorporating domain knowledge of the specific optimization problem, the search efficiency and quality of meta-heuristic algorithms can be significantly improved, making it crucial to identify and summarize the domain knowledge within the problem. In this paper, we summarize and analyze domain knowledge that can be applied to meta-heuristic algorithms for the job-shop scheduling problem (JSP). First, the importance of domain knowledge in optimization algorithm design is discussed. After that, the development of different methods for the JSP is reviewed, and the domain knowledge in them for meta-heuristic algorithms is summarized and classified. Applications of this domain knowledge are analyzed, showing that it is indispensable in ensuring the optimization performance of meta-heuristic algorithms. Finally, this paper analyzes the relationship among domain knowledge, optimization problems, and optimization algorithms, points out the shortcomings of existing research, and puts forward research prospects. This comprehensive summary of domain knowledge in the JSP, together with the discussion of the relationship between optimization problems, optimization algorithms, and domain knowledge, provides a research direction for future meta-heuristic algorithm design for solving the JSP.
The precise prediction of molecular properties is essential for advancements in drug development, particularly in virtual screening and compound optimization. The recent introduction of numerous deep learning-based methods has shown remarkable potential in enhancing Molecular Property Prediction (MPP), especially in improving accuracy and insight into molecular structures. Yet, two critical questions arise: does the integration of domain knowledge augment the accuracy of molecular property prediction, and does employing multi-modal data fusion yield more precise results than single-data-source methods? To explore these matters, we comprehensively review and quantitatively analyze recent deep learning methods on various benchmarks. We find that integrating molecular information significantly improves MPP for both regression and classification tasks. Specifically, regression improvements, measured by reductions in Root Mean Square Error (RMSE), are up to 4.0%, while classification enhancements, measured by the area under the receiver operating characteristic curve (ROC-AUC), are up to 1.7%. Additionally, we find that, as measured by ROC-AUC, augmenting 2D graphs with 3D information improves performance on classification tasks by up to 13.2%, and enriching 2D graphs with 1D SMILES boosts multi-modal learning performance on regression tasks by up to 9.1%. These two consolidated insights offer crucial guidance for future advancements in drug discovery.
BACKGROUND: In the rapidly evolving landscape of psychiatric research, 2023 marked another year of significant progress globally, with the World Journal of Psychiatry (WJP) experiencing notable expansion and influence. AIM: To conduct a comprehensive visualization and analysis of the articles published in the WJP throughout 2023. By delving into these publications, the aim is to identify valuable insights that can illuminate pathways for future research endeavors in the field of psychiatry. METHODS: A selection process led to the inclusion of 107 papers from the WJP published in 2023, forming the dataset for the analysis. Employing advanced visualization techniques, this study mapped the knowledge domains represented in these papers. RESULTS: The findings revealed a prevalent focus on key topics such as depression, mental health, anxiety, schizophrenia, and the impact of coronavirus disease 2019. Additionally, keyword clustering showed that these papers predominantly explored mental health disorders, depression, anxiety, schizophrenia, and related factors. Noteworthy contributions hailed from authors in regions such as China, the United Kingdom, the United States, and Turkey. One particular paper garnered the highest number of citations, while the American Psychiatric Association was the most cited reference. CONCLUSION: It is recommended that the WJP continue its efforts to enhance the quality of papers published in the field of psychiatry. Additionally, there is a pressing need to explore the potential applications of digital interventions and artificial intelligence within the discipline.
Despite the huge accumulation of scientific literature, it is inefficient and laborious to manually search it for useful information to investigate structure-activity relationships. Here, we propose an efficient text-mining framework for the discovery of credible and valuable domain knowledge from abstracts of scientific literature, focusing on Nickel-based single crystal superalloys. First, the credibility of abstracts is quantified in terms of source timeliness, publication authority, and the author's academic standing. Next, eight entity types and domain dictionaries describing Nickel-based single crystal superalloys are predefined to realize named entity recognition from the abstracts, achieving an accuracy of 85.10%. Third, by formulating 12 naming rules for the alloy brands derived from the recognized entities, we extract the target entities and refine them into domain knowledge through the credibility analysis. Following this, we also map out the academic cooperation "Author-Literature-Institute" network, characterize the generations of Nickel-based single crystal superalloys, and obtain the fractions of the most important chemical elements in superalloys. The extracted rich and diverse knowledge of Nickel-based single crystal superalloys provides important insights toward understanding structure-activity relationships for these alloys and is expected to accelerate the design and discovery of novel superalloys.
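The dictionary-and-rule portion of such a pipeline can be sketched very simply; the paper's eight entity types, trained recognizer, and 12 brand-naming rules are not reproduced here, and the dictionary entries and brand pattern below are illustrative only.

```python
import re

# Hypothetical, simplified sketch of dictionary-based entity matching plus one
# naming rule for alloy brands (CMSX-*, Rene N*, PWA **** are known brand families).
DOMAIN_DICT = {
    "element": ["Ni", "Al", "Cr", "Co", "Mo", "Re", "Ta", "W"],
    "property": ["creep", "oxidation", "rupture life"],
}
BRAND_RULE = re.compile(r"\b(?:CMSX-\d+|Rene\s?N\d|PWA\s?\d{4})\b")  # illustrative rule

def extract_entities(abstract: str):
    entities = {"brand": BRAND_RULE.findall(abstract)}
    lowered = abstract.lower()
    for etype, terms in DOMAIN_DICT.items():
        entities[etype] = [t for t in terms
                           if re.search(rf"\b{re.escape(t.lower())}\b", lowered)]
    return entities

text = "The creep and oxidation resistance of CMSX-4 depends on Re and Ta additions."
print(extract_entities(text))
```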
Deep learning provides an effective way for automatic classification of cardiac arrhythmias, but in clinical decision-making, pure data-driven methods working as black boxes may lead to unsatisfactory results. A promising solution is to combine domain knowledge with deep learning. This paper develops a flexible and extensible framework for integrating domain knowledge with a deep neural network. The model consists of a deep neural network that captures the statistical pattern between input data and the ground-truth label, and a knowledge module that guarantees consistency with the domain knowledge. These two components are trained interactively to get the best of both worlds. The experiments show that the domain knowledge is valuable in refining the neural network prediction and thus improves accuracy.
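One common way to couple the two components is to add a differentiable penalty for violations of a domain rule to the network's loss. The sketch below assumes such a penalty with a toy arrhythmia rule and an arbitrary weighting; the paper's knowledge module and interactive training scheme may differ.

```python
import torch
import torch.nn as nn

# Minimal sketch: a classifier trained with cross-entropy plus a domain-rule penalty.
class ECGClassifier(nn.Module):
    def __init__(self, n_features=32, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, x):
        return self.net(x)

def knowledge_penalty(logits, rr_interval):
    """Toy domain rule: a very long RR interval should not be classified as
    'normal' (class 0); penalize probability mass placed on that class."""
    probs = torch.softmax(logits, dim=-1)
    long_rr = (rr_interval > 1.5).float()          # threshold is an assumption
    return (long_rr * probs[:, 0]).mean()

model = ECGClassifier()
x, y = torch.randn(16, 32), torch.randint(0, 5, (16,))
rr = torch.rand(16) * 2.0                          # synthetic RR intervals in seconds
logits = model(x)
loss = nn.functional.cross_entropy(logits, y) + 0.5 * knowledge_penalty(logits, rr)
loss.backward()                                    # both terms shape the gradients
```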
A domain knowledge driven user interface development approach is described. As a conceptual design of the user interface, the domain knowledge defines the user interface in terms of the objects, actions, and relationships that the user would use to interact with the application system. It also serves as input to a user interface management system (UIMS) and is the kernel of the target user interface. The principal ideas and implementation techniques of the approach are discussed. The user interface model, the designer-oriented high-level specification notation, and the transformation algorithms on domain knowledge are presented.
Research papers in the field of SLA published between 2009 and 2019 are analyzed in terms of the research status of domestic SLA researchers, research institutions, and the research frontiers and hotspots in the papers, and the knowledge domains of SLA research are mapped. The data are retrieved from 10 core journals of linguistics via the CNKI journal database. By means of CiteSpace 5.3, an analysis of the overall trend of studies on SLA in China is made.
Side-scan sonar (SSS) is now a prevalent instrument for large-scale seafloor topography measurements, deployable on an autonomous underwater vehicle (AUV) to execute fully automated underwater acoustic scanning imaging along a predetermined trajectory. However, SSS images often suffer from speckle noise caused by mutual interference between echoes, and limited AUV computational resources further hinder noise suppression. Existing approaches for SSS image processing and speckle noise reduction rely heavily on complex network structures and fail to combine the benefits of deep learning and domain knowledge. To address the problem, RepDNet, a novel and effective despeckling convolutional neural network, is proposed. RepDNet introduces two re-parameterized blocks, the Pixel Smoothing Block (PSB) and the Edge Enhancement Block (EEB), preserving edge information while attenuating speckle noise. During training, PSB and EEB manifest as double-layered multi-branch structures, integrating first-order and second-order derivatives and smoothing functions. During inference, the branches are re-parameterized into a 3×3 convolution, enabling efficient inference without sacrificing accuracy. RepDNet comprises three computational operations: 3×3 convolution, element-wise summation, and Rectified Linear Unit activation. Evaluations on benchmark datasets, a real SSS dataset, and data collected at Lake Mulan establish RepDNet as a well-balanced network, meeting the AUV computational constraints in terms of performance and latency.
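The re-parameterization trick itself can be shown in a few lines: parallel convolution branches used during training are algebraically merged into a single 3×3 convolution for inference. The sketch below uses a generic 3×3 + 1×1 pair rather than RepDNet's PSB/EEB kernels (smoothing, first- and second-order derivatives), so treat it as the general mechanism only.

```python
import torch
import torch.nn as nn

# Minimal sketch of structural re-parameterization: multi-branch at training
# time, a single fused 3x3 convolution at inference time.
class ReparamBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.branch3x3 = nn.Conv2d(channels, channels, 3, padding=1, bias=True)
        self.branch1x1 = nn.Conv2d(channels, channels, 1, padding=0, bias=True)

    def forward(self, x):                          # training-time multi-branch form
        return torch.relu(self.branch3x3(x) + self.branch1x1(x))

    def fuse(self):                                # inference-time single 3x3 conv
        fused = nn.Conv2d(self.branch3x3.in_channels, self.branch3x3.out_channels,
                          3, padding=1, bias=True)
        w1x1 = nn.functional.pad(self.branch1x1.weight.detach(), [1, 1, 1, 1])  # embed 1x1 kernel
        fused.weight.data = self.branch3x3.weight.detach() + w1x1
        fused.bias.data = self.branch3x3.bias.detach() + self.branch1x1.bias.detach()
        return fused

block = ReparamBlock(8)
x = torch.randn(1, 8, 32, 32)
fused = block.fuse()
assert torch.allclose(torch.relu(fused(x)), block(x), atol=1e-5)  # same output, one conv
```

By linearity of convolution, the fused kernel reproduces the training-time output exactly, which is why inference cost drops without an accuracy penalty.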
AIM: To track the knowledge structure, topics in focus, and trends in emerging research on pterygium in the past 20 years. METHODS: Based on the Web of Science Core Collection (WoSCC), studies related to pterygium in the past 20 years, from 2000 to 2019, were included. With the help of VOSviewer software, a knowledge map was constructed and the distribution of countries, institutions, journals, and authors in the field of pterygium was noted. Meanwhile, using co-citation analysis of references and co-occurrence analysis of keywords, we identified the knowledge base and hotspots, thereby obtaining an overview of this field. RESULTS: The search retrieved 1516 publications on pterygium from the WoSCC published between 2000 and 2019. In the past two decades, the annual number of publications has been on the rise with only minor fluctuation. The most productive institutions are from Singapore, but the most prolific and active country is the United States. The journal Cornea published the most articles, and Coroneo MT contributed the most publications on pterygium. From the co-occurrence analysis, the keywords formed three clusters: 1) surgical therapeutic techniques and adjuvants for pterygium, 2) the occurrence process and pathogenesis of pterygium, and 3) the epidemiology and etiology of pterygium formation. These three clusters were consistent with the clustering in the co-citation analysis, in which Cluster 1 contained the most references (74 publications, 47.74%), Cluster 2 contained 53 publications, accounting for 34.19%, and Cluster 3 focused on epidemiology with 18.06% of the 155 co-citation publications. CONCLUSION: This study demonstrates that research on pterygium is gradually attracting the attention of scholars and researchers. Interaction between authors, institutions, and countries is lacking. Even so, the research hotspots, distribution, and research status of pterygium identified in this study could provide valuable information for scholars and researchers.
The key activity in building the semantic web is building ontologies. But today, the theory and methodology of ontology construction are still far from ready. This paper proposes a theoretical framework for massive knowledge management, the knowledge domain framework (KDF), and introduces an integrated development environment (IDE) named the large-scale ontology development environment (LODE), which implements the proposed theoretical framework. We also compare LODE with other popular ontology development environments. The practice of using LODE for the management and development of agriculture ontologies shows that the knowledge domain framework can handle the development activities of large-scale ontologies. Application studies based on the principle of the knowledge domain framework and LODE are described briefly.
The development of the information age and globalization has challenged the training of technical talents in the 21st century, and information media and technical skills are becoming increasingly important. As a creative sharing form of multimedia, digital storytelling is attracting the attention of more and more educators because of its applicability across disciplines and its ability to enhance media technology skills. In this study, the information visualization software CiteSpace was applied to visualize and analyze research on digital storytelling in terms of key articles and citation hotspots, and to review the research status of digital storytelling in education, such as promoting language learning and helping students develop 21st-century skills.
The characteristics of the design process, design objects, and domain knowledge of complex products are analyzed. A knowledge representation schema based on integrated generalized rules is presented. An AND-OR tree-based concept model for domain knowledge is established, and a strategy for multilevel domain knowledge acquisition based on the model is presented. The intelligent multilevel knowledge acquisition system (IMKAS) for product design is developed and applied in an intelligent decision support system for the conceptual design of complex products.
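A toy AND-OR concept tree makes the representation concrete: AND nodes require all of their sub-concepts, while OR nodes accept any alternative realization. The node structure and the gearbox example below are illustrative assumptions, not the paper's schema or its generalized rules.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of an AND-OR tree for domain concepts in product design.
@dataclass
class ConceptNode:
    name: str
    kind: str = "LEAF"                      # "AND", "OR", or "LEAF"
    children: List["ConceptNode"] = field(default_factory=list)

    def satisfied(self, known: set) -> bool:
        if self.kind == "LEAF":
            return self.name in known
        if self.kind == "AND":
            return all(c.satisfied(known) for c in self.children)
        return any(c.satisfied(known) for c in self.children)   # OR node

gearbox = ConceptNode("gearbox", "AND", [
    ConceptNode("housing"),
    ConceptNode("transmission", "OR",
                [ConceptNode("gear pair"), ConceptNode("belt drive")]),
])
print(gearbox.satisfied({"housing", "belt drive"}))   # True: AND of a leaf plus one OR branch
```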