Funding: Supported by the DH2025-TN07-07 project conducted at the Thai Nguyen University of Information and Communication Technology, Thai Nguyen, Vietnam, with additional support from the AI in Software Engineering Lab.
Abstract: Automating the creation and validation of Unified Modeling Language (UML) diagrams remains difficult because of unstructured requirements, limited automated pipelines, and the lack of reliable evaluation methods. This study introduces a cohesive architecture that integrates requirement development, UML synthesis, and multimodal validation. First, LLaMA-3.2-1B-Instruct generates user-focused requirements. Then, DeepSeek-R1-Distill-Qwen-32B applies its reasoning capabilities to transform these requirements into PlantUML code. Using this dual-LLM pipeline, we constructed a synthetic dataset of 11,997 UML diagrams spanning six major diagram families. Rendering analysis showed that 89.5% of the generated diagrams compile correctly, while invalid cases were detected automatically. To assess quality, we employed a multimodal scoring method that combines Qwen2.5-VL-3B, LLaMA-3.2-11B-Vision-Instruct, and Aya-Vision-8B, with weights based on MMMU performance. A study with 94 experts revealed strong alignment between automatic and manual evaluations, yielding a Pearson correlation of r = 0.82 and a Fleiss' Kappa of 0.78. Overall, the results demonstrate that the scoring system is effective and that the proposed generation pipeline produces UML diagrams that are both syntactically correct and semantically coherent. More broadly, the system provides a scalable and reproducible foundation for future work in AI-driven software modeling and multimodal verification.
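The weighted scoring and the agreement check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the judge names, the weights, the 0-10 scale, and all scores are hypothetical placeholders (the abstract does not report the MMMU-derived weights), and `weighted_score` and `pearson` are helper names introduced here.

```python
from math import sqrt

def weighted_score(scores, weights):
    """Combine per-judge scores into one score using normalized weights."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical weights: placeholders, NOT real MMMU results.
WEIGHTS = {"judge_a": 0.5, "judge_b": 0.3, "judge_c": 0.2}

# Hypothetical scores given by each vision-language judge to three diagrams.
diagram_scores = [
    {"judge_a": 8, "judge_b": 6, "judge_c": 4},
    {"judge_a": 5, "judge_b": 5, "judge_c": 5},
    {"judge_a": 9, "judge_b": 8, "judge_c": 10},
]

# One automatic score per diagram, then agreement with (hypothetical) expert ratings.
automatic = [weighted_score(s, WEIGHTS) for s in diagram_scores]
human = [6.0, 5.0, 9.0]
print(pearson(automatic, human))
```

Normalizing by the sum of the weights keeps the combined score on the same scale as the individual judges, even if the benchmark-derived weights do not sum to one.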
Funding: National Natural Science Foundation of China (No. 90718037)
Abstract: By reusing software test components, automated software testing generally costs less than manual software testing. There has been much research on how to develop reusable test components, but little on how to estimate the reusability of test components for automated testing. The purpose of this paper is to present a method of minimum reusability estimation for automated testing based on the return on investment (ROI) model. Minimum reusability is a benchmark for the whole automated testing process. If the reusability in one test execution is less than the minimum reusability, new strategies must be adopted in the next test execution to increase the reusability. Only in this way can we reduce unnecessary costs and finally obtain a return on the investment in automated testing.
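The abstract does not give the paper's formula, so the following is only an illustrative break-even sketch of what a minimum reusability threshold could look like. It assumes a deliberately simple cost model introduced here: each test component costs `cost_new` to build from scratch, `cost_reuse` to reuse, and `cost_manual` to exercise manually; `r` is the fraction of components reused; and ROI is (manual cost - automated cost) / automated cost. All names and numbers are hypothetical.

```python
def automated_cost(r, cost_new, cost_reuse):
    """Average automated cost per component when a fraction r is reused."""
    return r * cost_reuse + (1 - r) * cost_new

def roi(r, cost_manual, cost_new, cost_reuse):
    """Return on investment of automation relative to manual testing."""
    auto = automated_cost(r, cost_new, cost_reuse)
    return (cost_manual - auto) / auto

def minimum_reusability(cost_manual, cost_new, cost_reuse):
    """Smallest reuse fraction r with ROI >= 0 under the assumed model.

    Returns None when even full reuse cannot break even.
    Assumes cost_reuse < cost_new (reuse is cheaper than building anew).
    """
    if cost_new <= cost_manual:
        return 0.0  # automation already pays off without any reuse
    if cost_reuse > cost_manual:
        return None  # reuse alone is costlier than manual testing
    return (cost_new - cost_manual) / (cost_new - cost_reuse)

# Hypothetical costs in arbitrary units.
r_min = minimum_reusability(cost_manual=6, cost_new=10, cost_reuse=2)
print(r_min, roi(r_min, 6, 10, 2), roi(0.75, 6, 10, 2))
```

Under this model, a test execution whose measured reusability falls below `r_min` has negative ROI, which corresponds to the situation in which the abstract calls for new strategies in the next execution.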
Abstract: In recent years, automation has become a key focus in software development as organizations seek to improve efficiency and reduce time-to-market. The integration of artificial intelligence (AI) tools, particularly those using natural language processing (NLP) such as ChatGPT, has opened new possibilities for automating various stages of the development lifecycle. The primary objective of this study is to evaluate the effectiveness of ChatGPT in automating various phases of software development. An AI tool was developed using the OpenAI Application Programming Interface (API), incorporating two key functionalities: 1) generating user stories based on case or process inputs, and 2) estimating the effort required to execute each user story. Additionally, ChatGPT was employed to generate application code. The AI tool was tested in three case studies, each explored under two different development strategies: a semi-automated process using the AI tools and a traditional manual approach. The results demonstrated a significant reduction in total development time, ranging from 40% to 51%. However, the generated content could be inaccurate and incomplete, necessitating review and debugging before being applied to projects. In conclusion, given the increasing shift towards automation in software engineering, further research is critical to enhance the efficiency and reliability of AI tools, particularly those that leverage NLP technologies.
Funding: This research was partially supported by a National Science Foundation (NSF) Information Technology Research (ITR) award, CCR-0113611, and an NSF award, CCR-0203051.
Abstract: A method using quantifier elimination is proposed for automatically generating program invariants (inductive assertions). Given a program, inductive assertions, hypothesized as parameterized formulas in a theory, are associated with program locations. The parameters are discovered by generating constraints on them that ensure an inductive assertion is indeed preserved by all execution paths leading to the associated program location. The method can be used to discover loop invariants, that is, properties of variables that remain invariant at the entry of a loop. The parameterized formula can be successively refined by considering execution paths one by one; heuristics can be developed for determining the order in which the paths are considered. Initialization of program variables, as well as the precondition and postcondition, if available, can also be used to further refine the hypothesized invariant. The method does not depend on the availability of the precondition and postcondition of a program. Constraints on parameters generated in this way are solved for possible values of the parameters. If no solution is possible, an invariant of the hypothesized form is not likely to exist for the loop under the assumptions and approximations made to generate the associated verification condition. Otherwise, if the parametric constraints are solvable, then under certain conditions on the methods for generating these constraints, the strongest possible invariant of the hypothesized form can be generated from the most general solutions of the parametric constraints. The approach is illustrated using conjunctions of polynomial equations as well as Presburger arithmetic as the logical languages for expressing assertions.
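As a small worked illustration of the idea (an example constructed here, not one taken from the paper), consider the loop `x := 0; y := 0; while (...) { x := x + 1; y := y + 2 }` and hypothesize a linear invariant with unknown parameters a, b, c at the loop entry. The constraints on the parameters come from initialization and from preservation along the single path through the loop body:

```latex
\begin{align*}
\text{Template:} \quad & I(x, y) \;\equiv\; a x + b y + c = 0 \\
\text{Initialization } (x = 0,\ y = 0): \quad & c = 0 \\
\text{Preservation:} \quad & \forall x, y.\; \bigl(a x + b y + c = 0 \;\Rightarrow\; a (x + 1) + b (y + 2) + c = 0\bigr) \\
\text{Eliminating } x, y \text{ (with } c = 0\text{):} \quad & a + 2 b = 0 \\
\text{General solution:} \quad & a = -2 b,\quad c = 0 \\
\text{Invariant } (b \neq 0): \quad & y = 2 x
\end{align*}
```

With c = 0 the hypothesis of the implication is always satisfiable, so the implication holds for all x and y exactly when the difference of the two sides, a + 2b, vanishes. Every nonzero choice of b then gives the same invariant, y = 2x.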
Abstract: FGSPEC is a wide-spectrum specification language intended to facilitate software specification and the expression of the transformation process from a functional specification, which describes "what to do", to the corresponding design (operational) specification, which describes "how to do it". The design emphasizes the coherence of multi-level specification mechanisms, and a tree-structured model is provided that unifies the wide-spectrum specification styles from "what" to "how".