Duplicate bug reporting is a critical problem in the mining of software repositories. Duplicate bug reports can lead to redundant effort, wasted resources, and delayed software releases. Thus, their accurate identification is essential for streamlining the bug triage process. Several researchers have explored classical information retrieval, natural language processing, text and data mining, and machine learning approaches. The emergence of large language models (LLMs) (ChatGPT and Huggingface) has presented a new line of models for semantic textual similarity (STS). Although LLMs have shown remarkable advancements, there remains a need for longitudinal studies to determine whether performance improvements are due to the scale of the models or to the unique embeddings they produce compared to classical encoding models. This study systematically investigates this issue by comparing classical word embedding techniques against LLM-based embeddings for duplicate bug detection. We propose an amalgamation of models to detect duplicate bug reports using textual and non-textual information about bug reports. The empirical evaluation was performed on open-source datasets using established metrics: mean reciprocal rank (MRR), mean average precision (MAP), and recall rate. The experimental results show that combined LLMs can outperform (recall-rate@k 68%–74%) other individual models for duplicate bug detection. These findings highlight the effectiveness of amalgamating multiple techniques in improving duplicate bug report detection accuracy.
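The ranking metrics named in this abstract (MRR, MAP, recall rate@k) can be illustrated with a minimal sketch. It assumes each query bug report yields a ranked list of candidate report IDs and a set of known duplicates; the function names and the toy IDs are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def first_hit_rank(ranked: Sequence[str], relevant: Set[str]) -> int:
    """Return the 1-based rank of the first true duplicate, or 0 if none is found."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            return rank
    return 0


def mean_reciprocal_rank(rankings: List[Sequence[str]], truths: List[Set[str]]) -> float:
    """Average of 1/rank of the first true duplicate (0 when a query has no hit)."""
    total = 0.0
    for ranked, relevant in zip(rankings, truths):
        rank = first_hit_rank(ranked, relevant)
        if rank:
            total += 1.0 / rank
    return total / len(rankings)


def recall_rate_at_k(rankings: List[Sequence[str]], truths: List[Set[str]], k: int) -> float:
    """Fraction of queries with at least one true duplicate in the top k."""
    hits = sum(1 for ranked, relevant in zip(rankings, truths)
               if first_hit_rank(ranked[:k], relevant) > 0)
    return hits / len(rankings)


def mean_average_precision(rankings: List[Sequence[str]], truths: List[Set[str]]) -> float:
    """Mean over queries of the average precision at each true-duplicate position."""
    total = 0.0
    for ranked, relevant in zip(rankings, truths):
        hits, precision_sum = 0, 0.0
        for rank, candidate in enumerate(ranked, start=1):
            if candidate in relevant:
                hits += 1
                precision_sum += hits / rank
        total += precision_sum / len(relevant) if relevant else 0.0
    return total / len(rankings)


# Toy data (hypothetical IDs): three query reports with ranked candidates.
rankings = [["b1", "b2", "b3"], ["b4", "b5", "b6"], ["b7", "b8", "b9"]]
truths = [{"b1"}, {"b5"}, {"b0"}]
print(mean_reciprocal_rank(rankings, truths))
print(recall_rate_at_k(rankings, truths, k=2))
print(mean_average_precision(rankings, truths))
```

Recall rate@k counts a query as successful if any true duplicate appears in the top k, which is why it is usually reported for several values of k.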
In software development projects, bugs are a common phenomenon, and developers report them in open-source repositories. There is a need for a high-quality developer prediction model that considers developer work satisfaction, stays within a limited development cost, and improves bug resolution time. When a new bug is reported, the triager's main focus is to address and resolve it as soon as possible, so developer work efficiency is an important factor in bug fixing. To address these issues, the proposed approach recommends a set of developers who could potentially share their knowledge with each other to fix new bug reports. The approach is called developer working efficiency and social network based developer recommendation (DweSn). It is a composite model that builds each developer's profile from the developer's average bug-fixing time, work efficiency in fixing a variety of bugs, and social interactions with other developers. A similarity measure is applied between the new bug and the bugs in the corpus to extract a list of capable developers. The approach selects only those developers who are active and less loaded with work, and the developer with the highest profile score is assigned the bug. We evaluated our approach on subsets of five large open-source projects (Mozilla, Netbeans, Eclipse, Firefox, and OpenOffice) and compared it with the state of the art. The results demonstrate that combining developers' efficiency with their average bug-fixing time and the interactions in their social network gives good accuracy and efficiently reduces bug tossing length. This approach shows an improvement in prediction accuracy, precision, recall, F-score, and reduced bug tossing length of up to 93.89%, 93.12%, 93.46%, 93.27%, and 93.25%, respectively. The proposed approach achieved a 93% hit ratio and a 93.34% mean reciprocal rank, indicating that the proposed triager is able to efficiently assign bugs to the correct developers.
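The selection step described in this abstract (filter to active, lightly loaded developers, then rank by profile score) might look roughly like the sketch below. The abstract does not give the scoring formula, so the equal-weight average, the `max_fix_days` normalisation, the `max_open_bugs` threshold, and all field names are assumptions made purely for illustration; this is not the DweSn model itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Developer:
    name: str
    avg_fix_days: float   # average bug-fixing time in days
    efficiency: float     # assumed to be normalised to [0, 1]
    social: float         # assumed social-interaction score in [0, 1]
    active: bool
    open_bugs: int        # current workload


def profile_score(dev: Developer, max_fix_days: float = 30.0) -> float:
    """Hypothetical score: equal-weight mean of speed, efficiency, and social score."""
    speed = 1.0 - min(dev.avg_fix_days, max_fix_days) / max_fix_days
    return (speed + dev.efficiency + dev.social) / 3.0


def recommend(candidates: List[Developer], max_open_bugs: int = 5) -> List[Developer]:
    """Keep active, lightly loaded developers and rank them by profile score."""
    eligible = [d for d in candidates if d.active and d.open_bugs <= max_open_bugs]
    return sorted(eligible, key=profile_score, reverse=True)


# Toy candidates (hypothetical), e.g. developers who fixed bugs similar to the new one.
candidates = [
    Developer("alice", 3.0, 0.90, 0.80, True, 2),
    Developer("bob", 15.0, 0.70, 0.90, True, 1),
    Developer("carol", 1.0, 0.95, 0.95, True, 9),   # overloaded, filtered out
    Developer("dave", 2.0, 0.90, 0.90, False, 0),   # inactive, filtered out
]
ranked = recommend(candidates)
print([d.name for d in ranked])
```

In this sketch the first entry of `ranked` would be the assignee and the remaining entries the developers who could share knowledge on the fix.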
Software project outcomes depend heavily on natural language requirements, which often lead to diverse interpretations and to issues such as ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, which has prompted investigation into deep and ensemble learning methods. However, these studies do not generalize well and are not efficient when extended to other datasets. Therefore, this paper proposes a hybrid approach that combines multiple techniques to explore their effectiveness on bug identification problems. The methods involve feature selection, which reduces the dimensionality and redundancy of the features and keeps only the relevant ones; transfer learning, which trains and tests the model on different datasets to analyze how much of the learning carries over to other datasets; and an ensemble method, which explores the performance gain from combining multiple classifiers in one model. Four National Aeronautics and Space Administration (NASA) datasets and four Promise datasets are used in the study, and the model's performance improved, with better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values, when different classifiers were combined. The study reveals that an amalgam of techniques such as feature selection, transfer learning, and ensemble methods proves helpful in optimizing software bug prediction models and in providing a high-performing, useful end model.
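To make the evaluation metric concrete, the sketch below computes AUC-ROC as the probability that a randomly chosen buggy module is scored above a randomly chosen clean one, and combines two classifiers by averaging their scores (soft voting). The averaging rule and the toy scores are assumptions for illustration only; the abstract does not state which ensemble rule or classifiers were used, and the numbers are not results from the paper.

```python
from typing import List, Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(score_lists: List[Sequence[float]]) -> List[float]:
    """Average the per-example scores of several classifiers."""
    return [sum(column) / len(column) for column in zip(*score_lists)]


# Toy data: 1 = buggy module, 0 = clean module; scores are made up.
labels = [1, 1, 0, 0]
clf_a = [0.90, 0.40, 0.50, 0.10]
clf_b = [0.45, 0.80, 0.20, 0.60]
combined = soft_vote([clf_a, clf_b])
print(auc_roc(labels, clf_a), auc_roc(labels, clf_b), auc_roc(labels, combined))
```

In this toy case each classifier misranks a different pair, so averaging their scores corrects both errors; real ensembles gain less when the base classifiers make correlated mistakes.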
Tobacco leaf shape is described by parameters such as length, width, area, perimeter, and roundness, and a large number of these leaf parameters can be calculated only once the exact boundary of the leaf has been obtained. This paper introduces the classical edge detection methods and uses the bug method to track the boundary of the tobacco leaf exactly. Tests show that the algorithm has good edge extraction capability.
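The abstract does not spell out its bug method, so the following is a sketch of one common bug-following variant (square tracing) on a binary image, assuming foreground pixels are 1 and the region is 4-connected. The "bug" turns left when it stands on a foreground pixel and right when it stands on background, and stops when it re-enters the start pixel in its initial heading. The function name and the toy image are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col), with rows growing downward


def trace_boundary(grid: List[List[int]]) -> List[Pixel]:
    """Square-tracing (bug-following) boundary extraction on a binary grid."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0

    def is_foreground(r: int, c: int) -> bool:
        # Pixels outside the image are treated as background.
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == 1

    # Start at the first foreground pixel in row-major scan order.
    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if grid[r][c] == 1), None)
    if start is None:
        return []

    # Headings in clockwise order as seen on screen: north, east, south, west.
    headings = [(-1, 0), (0, 1), (1, 0), (0, -1)]
    start_heading = 1  # the scan reaches the start pixel moving east

    boundary: List[Pixel] = []
    pos, heading = start, start_heading
    # The bug stays within one pixel of the image, so this bounds all states.
    max_steps = 4 * (rows + 2) * (cols + 2)
    for _ in range(max_steps):
        if is_foreground(*pos):
            if pos not in boundary:
                boundary.append(pos)
            heading = (heading - 1) % 4  # on the object: turn left
        else:
            heading = (heading + 1) % 4  # on background: turn right
        pos = (pos[0] + headings[heading][0], pos[1] + headings[heading][1])
        if pos == start and heading == start_heading:
            break  # re-entered the start pixel the same way: contour closed
    return boundary


# Toy image: a solid 3x3 "leaf" inside a 5x5 frame.
leaf = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
contour = trace_boundary(leaf)
print(contour)
```

Square tracing can miss pixels on shapes that are only 8-connected, which is one reason other tracing variants exist; for a filled, 4-connected leaf mask the ordered contour returned here is enough to derive quantities such as perimeter.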
Funding: This research is funded by Researchers Supporting Project Number (RSPD2024R947), King Saud University, Riyadh, Saudi Arabia.
Funding: Supported by the Key Technologies R&D Program of Henan Province (082102210065) and the Natural Science Research Project of Henan Educational Committee (2007210005).