Funding: Supported by the National Natural Science Foundation of China (No. 61772270).
Abstract: Software developers often write code that is functionally similar to existing code segments. A code recommendation tool that helps developers reuse these code fragments can significantly improve their efficiency. Several methods have been proposed in recent years. Some use sequence matching algorithms to find related recommendations; most of these are time-consuming and can leverage only low-level textual information from code. Others extract features from code and compute similarity over numerical feature vectors. However, the similarity of feature vectors is often not equivalent to the similarity of the original code, because structural information is lost when abstract syntax trees are transformed into vectors. We propose a method based on approximate sub-tree matching to solve this problem. Unlike existing tree-based approaches that match feature vectors, it retains the tree structure of the query code during matching to find the code fragments that best match the current query. It uses a fast approximate sub-tree matching algorithm that transforms the sub-tree matching problem into a matching between a tree and a list. In this way, structural information can be used in code recommendation tasks with strict time requirements. We have constructed several real-world code databases covering different languages and granularities to evaluate the effectiveness of our method. The results show that our method outperforms the two compared methods, SENSORY and Aroma, in terms of recall on all the datasets, and that it can be applied to large datasets.
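The claim that vector similarity can diverge from code similarity can be illustrated with a small sketch. It is not the paper's method: it assumes a deliberately simple bag-of-node-types vector (the helper name `bag_of_node_types` is hypothetical) and uses Python's standard `ast` module to show two snippets whose feature vectors are identical even though their syntax trees differ.

```python
import ast
from collections import Counter


def bag_of_node_types(source: str) -> Counter:
    """Count AST node types in the source, ignoring how the nodes are nested.

    This is an assumed, simplified feature vector for illustration only.
    """
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


# Same tokens and node types, but different operator nesting.
snippet_a = "a = b + c * d"
snippet_b = "a = (b + c) * d"

vec_a = bag_of_node_types(snippet_a)
vec_b = bag_of_node_types(snippet_b)

# The flattened feature vectors cannot tell the two snippets apart...
same_vector = vec_a == vec_b

# ...while the tree structure still distinguishes them.
same_tree = ast.dump(ast.parse(snippet_a)) == ast.dump(ast.parse(snippet_b))

print("identical feature vectors:", same_vector)
print("identical tree structure:", same_tree)
```

Richer vectorizations lose less information than this one, but the example shows the kind of structural difference that matching directly on trees is meant to preserve.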
Funding: Supported in part by the Science and Technology Development Fund (FDCT), Macao SAR, China (Nos. 0047/2020/A1 and 0014/2022/A).
Abstract: Code review is a critical process in software development, contributing to the overall quality of the product by identifying errors early. A key aspect of this process is the selection of appropriate reviewers to scrutinize changes made to source code. However, in large-scale open-source projects, selecting the most suitable reviewers for a specific change can be challenging. To address this, we introduce Code Context Based Reviewer Recommendation (CCB-RR), a model that leverages information from changesets to recommend the most suitable reviewers. The model considers the paths of modified files and the context derived from the changesets, including their titles and descriptions. Additionally, CCB-RR employs KeyBERT to extract the most relevant keywords and compare semantic similarity across changesets. The model integrates the paths of modified files, keyword information, and the context of code changes to form a comprehensive picture of the changeset. We conducted extensive experiments on four open-source projects, demonstrating the effectiveness of CCB-RR. The model achieved a Top-1 accuracy of 60%, 55%, 51%, and 45% on the Android, OpenStack, QT, and LibreOffice projects, respectively. For Mean Reciprocal Rank (MRR), CCB-RR achieved 71%, 62%, 52%, and 68% on the same projects, respectively, highlighting its potential for practical application in code reviewer recommendation.
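For readers unfamiliar with the two reported metrics, the following sketch shows how Top-k accuracy and MRR are commonly computed for ranked reviewer lists. The abstract does not spell out its exact definitions, so these are the usual textbook forms; the function names and the reviewer data are hypothetical.

```python
from typing import Sequence, Set


def top_k_accuracy(ranked: Sequence[Sequence[str]],
                   actual: Sequence[Set[str]], k: int = 1) -> float:
    """Fraction of changes with at least one true reviewer in the top k."""
    hits = sum(1 for recs, truth in zip(ranked, actual)
               if set(recs[:k]) & truth)
    return hits / len(ranked)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]],
                         actual: Sequence[Set[str]]) -> float:
    """Average of 1/rank of the first true reviewer (0 if none is listed)."""
    total = 0.0
    for recs, truth in zip(ranked, actual):
        for rank, reviewer in enumerate(recs, start=1):
            if reviewer in truth:
                total += 1.0 / rank
                break
    return total / len(ranked)


# Hypothetical recommendations for three changes and their true reviewers.
ranked = [["alice", "bob", "carol"],
          ["bob", "alice", "carol"],
          ["carol", "bob", "alice"]]
actual = [{"alice"}, {"alice"}, {"dave"}]

top1 = top_k_accuracy(ranked, actual, k=1)   # only the first change hits at rank 1
mrr = mean_reciprocal_rank(ranked, actual)   # (1 + 1/2 + 0) / 3

print(f"Top-1 accuracy: {top1:.3f}")
print(f"MRR: {mrr:.3f}")
```

Because MRR gives partial credit to correct reviewers ranked below first place, it is never lower than Top-1 accuracy under these definitions, which matches the pattern in the reported numbers.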
Funding: Project (2016-YFB1000805) supported by the National Grand R&D Plan, China; Projects (61502512, 61432020, 61472430, 61532004) supported by the National Natural Science Foundation of China.
Abstract: Code review is an important process for reducing code defects and improving software quality. In social coding communities like GitHub, where anyone can submit Pull-Requests, code review plays a more important role than ever before, and the process is quite time-consuming. Therefore, finding and recommending proper reviewers for incoming Pull-Requests becomes a vital task. However, most current studies recommend reviewers only by predicting whether they will participate, without differentiating the types of participation. In this paper, we develop a two-layer reviewer recommendation model that recommends reviewers for Pull-Requests (PRs) in GitHub projects from the technical and managerial perspectives. In the first layer, we recommend suitable developers to review the target PRs using a hybrid recommendation method. In the second layer, given the recommendation results from the first layer, we specify whether the target developer will participate in the review technically or managerially. We conducted experiments on two popular GitHub projects and tested the approach using PRs created between February 2016 and February 2017. The results show that the first layer of our recommendation model performs better than previous work, and that the second layer can effectively differentiate the types of participation.