Code reviews in pull-based model are open to community users on GitHub. Various participants are taking part in the review discussions and the review topics are not only about the improvement of code contributions but...Code reviews in pull-based model are open to community users on GitHub. Various participants are taking part in the review discussions and the review topics are not only about the improvement of code contributions but also about project evolution and social interaction. A comprehensive understanding of the review topics in pull-based model would be useful to better organize the code review process and optimize review tasks such as reviewer recommendation and pull-request prioritization. In this paper, we first conduct a qualitative study on three popular open-source software projects hosted on GitHub and construct a fine-grained two-level taxonomy covering four level-1 categories (code correctness, pull- request decision-making, project management, and social interaction) and 11 level-2 subcategories (e.g., defect detecting, reviewer assigning, contribution encouraging). Second, we conduct preliminary quantitative analysis on a large set of review comments that were labeled by TSHC (a two-stage hybrid classification algorithm), which is able to automatically classify review comments by combining rule-based and machine-learning techniques. Through the quantitative study, we explore the typical review patterns. We find that the three projects present similar comments distribution on each subeategory. Pull-requests submitted by inexperienced contributors tend to contain potential issues even though they have passed the tests. Furthermore, external contributors are more likely to break project conventions in their early contributions.展开更多
In programming courses, the traditional assessment approach tends to evaluate student performance by scoring one or more project-level summative assignments. This approach no longer meets the requirements of a quality...In programming courses, the traditional assessment approach tends to evaluate student performance by scoring one or more project-level summative assignments. This approach no longer meets the requirements of a quality programming language education. Based on an upgraded peer code review model, we propose a formative assessment approach to assess the learning of computer programming languages, and develop an online assessment system(OOCourse) to implement this approach. Peer code review and inspection is an effective way to ensure the high quality of a program by systematically checking the source code. Though it is commonly applied in industrial and open-source software development, it is rarely taught and practiced in undergraduate-level programming courses. We conduct a case study using the formative assessment method in a sophomore level Object-Oriented Design and Construction course with more than 240 students. We use Moodle(an online learning system) and some relevant plugins to conduct peer code review. We also conduct data mining on the running data from the peer assessment activities. The case study shows that formative assessment based on peer code review gradually improved the programming ability of students in the undergraduate class.展开更多
Code review is a critical process in software development, contributing to the overall quality of the product by identifying errors early. A key aspect of this process is the selection of appropriate reviewers to scru...Code review is a critical process in software development, contributing to the overall quality of the product by identifying errors early. A key aspect of this process is the selection of appropriate reviewers to scrutinize changes made to source code. However, in large-scale open-source projects, selecting the most suitable reviewers for a specific change can be a challenging task. To address this, we introduce the Code Context Based Reviewer Recommendation (CCB-RR), a model that leverages information from changesets to recommend the most suitable reviewers. The model takes into consideration the paths of modified files and the context derived from the changesets, including their titles and descriptions. Additionally, CCB-RR employs KeyBERT to extract the most relevant keywords and compare the semantic similarity across changesets. The model integrates the paths of modified files, keyword information, and the context of code changes to form a comprehensive picture of the changeset. We conducted extensive experiments on four open-source projects, demonstrating the effectiveness of CCB-RR. The model achieved a Top-1 accuracy of 60%, 55%, 51%, and 45% on the Android, OpenStack, QT, and LibreOffice projects respectively. For Mean Reciprocal Rank (MRR), CCB achieved 71%, 62%, 52%, and 68% on the same projects respectively, thereby highlighting its potential for practical application in code reviewer recommendation.展开更多
Code review is an important process to reduce code defects and improve software quality. In social coding communities like GitHub, as everyone can submit Pull-Requests, code review plays a more important role than eve...Code review is an important process to reduce code defects and improve software quality. In social coding communities like GitHub, as everyone can submit Pull-Requests, code review plays a more important role than ever before, and the process is quite time-consuming. Therefore, finding and recommending proper reviewers for the emerging Pull-Requests becomes a vital task. However, most of the current studies mainly focus on recommending reviewers by checking whether they will participate or not without differentiating the participation types. In this paper, we develop a two-layer reviewer recommendation model to recommend reviewers for Pull-Requests (PRs) in GitHub projects from the technical and managerial perspectives. For the first layer, we recommend suitable developers to review the target PRs based on a hybrid recommendation method. For the second layer, after getting the recommendation results from the first layer, we specify whether the target developer will technically or managerially participate in the reviewing process. We conducted experiments on two popular projects in GitHub, and tested the approach using PRs created between February 2016 and February 2017. The results show that the first layer of our recommendation model performs better than the previous work, and the second layer can effectively differentiate the types of participation.展开更多
基金This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000805 and the National Natural Science Foundation of China under Grant Nos. 61432020, 61303064, 61472430 and 61502512.
文摘Code reviews in pull-based model are open to community users on GitHub. Various participants are taking part in the review discussions and the review topics are not only about the improvement of code contributions but also about project evolution and social interaction. A comprehensive understanding of the review topics in pull-based model would be useful to better organize the code review process and optimize review tasks such as reviewer recommendation and pull-request prioritization. In this paper, we first conduct a qualitative study on three popular open-source software projects hosted on GitHub and construct a fine-grained two-level taxonomy covering four level-1 categories (code correctness, pull- request decision-making, project management, and social interaction) and 11 level-2 subcategories (e.g., defect detecting, reviewer assigning, contribution encouraging). Second, we conduct preliminary quantitative analysis on a large set of review comments that were labeled by TSHC (a two-stage hybrid classification algorithm), which is able to automatically classify review comments by combining rule-based and machine-learning techniques. Through the quantitative study, we explore the typical review patterns. We find that the three projects present similar comments distribution on each subeategory. Pull-requests submitted by inexperienced contributors tend to contain potential issues even though they have passed the tests. Furthermore, external contributors are more likely to break project conventions in their early contributions.
文摘In programming courses, the traditional assessment approach tends to evaluate student performance by scoring one or more project-level summative assignments. This approach no longer meets the requirements of a quality programming language education. Based on an upgraded peer code review model, we propose a formative assessment approach to assess the learning of computer programming languages, and develop an online assessment system(OOCourse) to implement this approach. Peer code review and inspection is an effective way to ensure the high quality of a program by systematically checking the source code. Though it is commonly applied in industrial and open-source software development, it is rarely taught and practiced in undergraduate-level programming courses. We conduct a case study using the formative assessment method in a sophomore level Object-Oriented Design and Construction course with more than 240 students. We use Moodle(an online learning system) and some relevant plugins to conduct peer code review. We also conduct data mining on the running data from the peer assessment activities. The case study shows that formative assessment based on peer code review gradually improved the programming ability of students in the undergraduate class.
基金supported in part by the Science and Technology Development Fund(FDCT),Macao SAR,China(Nos.0047/2020/A1 and 0014/2022/A).
文摘Code review is a critical process in software development, contributing to the overall quality of the product by identifying errors early. A key aspect of this process is the selection of appropriate reviewers to scrutinize changes made to source code. However, in large-scale open-source projects, selecting the most suitable reviewers for a specific change can be a challenging task. To address this, we introduce the Code Context Based Reviewer Recommendation (CCB-RR), a model that leverages information from changesets to recommend the most suitable reviewers. The model takes into consideration the paths of modified files and the context derived from the changesets, including their titles and descriptions. Additionally, CCB-RR employs KeyBERT to extract the most relevant keywords and compare the semantic similarity across changesets. The model integrates the paths of modified files, keyword information, and the context of code changes to form a comprehensive picture of the changeset. We conducted extensive experiments on four open-source projects, demonstrating the effectiveness of CCB-RR. The model achieved a Top-1 accuracy of 60%, 55%, 51%, and 45% on the Android, OpenStack, QT, and LibreOffice projects respectively. For Mean Reciprocal Rank (MRR), CCB achieved 71%, 62%, 52%, and 68% on the same projects respectively, thereby highlighting its potential for practical application in code reviewer recommendation.
基金Project(2016-YFB1000805)supported by the National Grand R&D Plan,ChinaProjects(61502512,61432020,61472430,61532004)supported by the National Natural Science Foundation of China
文摘Code review is an important process to reduce code defects and improve software quality. In social coding communities like GitHub, as everyone can submit Pull-Requests, code review plays a more important role than ever before, and the process is quite time-consuming. Therefore, finding and recommending proper reviewers for the emerging Pull-Requests becomes a vital task. However, most of the current studies mainly focus on recommending reviewers by checking whether they will participate or not without differentiating the participation types. In this paper, we develop a two-layer reviewer recommendation model to recommend reviewers for Pull-Requests (PRs) in GitHub projects from the technical and managerial perspectives. For the first layer, we recommend suitable developers to review the target PRs based on a hybrid recommendation method. For the second layer, after getting the recommendation results from the first layer, we specify whether the target developer will technically or managerially participate in the reviewing process. We conducted experiments on two popular projects in GitHub, and tested the approach using PRs created between February 2016 and February 2017. The results show that the first layer of our recommendation model performs better than the previous work, and the second layer can effectively differentiate the types of participation.