On-site programming big data refers to the massive data generated during software development, characterized by real-time generation, complexity, and high processing difficulty. Data cleaning is therefore essential for on-site programming big data. Duplicate data detection is an important step in data cleaning; it saves storage resources and improves data consistency. To address the shortcomings of the traditional Sorted Neighborhood Method (SNM) and the difficulty of detecting duplicates in high-dimensional data, an optimized algorithm based on random forests with a dynamic, adaptive window size is proposed. The efficiency of the algorithm is improved by refining the key-selection method, reducing the dimensionality of the data set, and using an adaptive variable-size sliding window. Experimental results show that the improved SNM algorithm performs better and achieves higher accuracy.
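For orientation, the classic fixed-window SNM that this abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the paper's optimized algorithm: the random-forest key selection, dimensionality reduction, and adaptive window are not reproduced, and the function name `sorted_neighborhood`, the string-similarity measure (`difflib.SequenceMatcher`), and the default `window` and `threshold` values are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.85):
    """Return index pairs of likely duplicates using a fixed-size window.

    records   : list of strings (one string per record; an assumption here)
    key       : function that builds the sorting key from a record
    window    : number of consecutive sorted records compared together
    threshold : minimum similarity ratio for two records to count as duplicates
    """
    # Step 1: sort record indices by the chosen key.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))

    pairs = set()
    # Step 2: slide the window over the sorted order; each record is
    # compared only with the (window - 1) records that follow it.
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            similarity = SequenceMatcher(None, records[i], records[j]).ratio()
            if similarity >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs
```

Because only neighbours within the window are compared, the result depends heavily on the sorting key and the window size, which are the two weaknesses the abstract says the proposed method targets.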
PL/SQL is the most common language for ORACLE database applications. It allows the developer to create stored program units (procedures, functions, and packages) to improve software reusability and to hide the complexity of executing a specific operation behind a name. It also acts as an interface between the SQL database and DEVELOPER. Therefore, it is important to test these modules, which consist of procedures and functions. In this paper, a new genetic algorithm (GA) is used as a search technique to find the test data required by the branch coverage criterion for testing stored PL/SQL program units. The experimental results show that full coverage was not achieved: the test target in some branches is not reached, and the coverage percentage is 98%. A problem arises when the target branch depends on data retrieved from tables; in this case, the GA is not able to generate test cases for the branch.
Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Python, by contrast, is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation across diverse industries and applications. However, a primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. Although Excel is a widely used tool for data storage and analysis, it may not integrate seamlessly with Python libraries, which leads to challenges in reading and writing data, especially for complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependence on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared with Python’s scripting capabilities. This paper presents an integration solution that empowers non-programmers to leverage Python’s capabilities within the familiar Excel environment, so that users can perform advanced data analysis and automation tasks without extensive programming knowledge. Based on feedback solicited from non-programmers who tested the integration solution, the case study evaluates the ease of implementation, performance, and compatibility of Python with Excel versions.
When the inter-satellite linkage antenna of a user satellite adopts an elevation-over-azimuth architecture, a zenith pass problem always occurs while the antenna is tracking the tracking and data relay satellite (TDRS). This paper addresses the problem by first introducing the movement laws of the inter-satellite linkage to predict the movement of the user satellite antenna, and then analyzing in detail the potential and actual moments of the zenith pass. A number of specific orbit altitudes for the user satellite that remove the blind zone are obtained. Finally, based on the results predicted from the movement laws of the inter-satellite linkage, zenith pass tracking strategies for the user satellite antenna are designed under program guidance using a trajectory preprocessor. Simulations confirm the reasonableness and feasibility of the strategies in dealing with the zenith pass problem.
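As background on why an elevation-over-azimuth mount struggles near zenith, the sketch below uses the standard geometric relation for such mounts: at the point of maximum elevation the target moves purely horizontally, so the required azimuth-axis rate is the line-of-sight angular rate divided by the cosine of the elevation. This is a generic illustration, not the paper's movement laws or tracking strategy; the function name `peak_azimuth_rate` and its parameters are hypothetical.

```python
import math


def peak_azimuth_rate(line_of_sight_rate, max_elevation_deg):
    """Azimuth-axis rate needed at the moment of maximum elevation.

    line_of_sight_rate : angular rate of the target across the sky
                         (any unit; the result uses the same unit)
    max_elevation_deg  : highest elevation reached during the pass, in degrees

    At maximum elevation the elevation rate is zero, so all target motion
    must be absorbed by the azimuth axis, scaled by 1 / cos(elevation).
    """
    return line_of_sight_rate / math.cos(math.radians(max_elevation_deg))
```

Evaluating the function for passes that peak at 60, 85, and 89.9 degrees shows the demanded azimuth rate growing without bound as the pass approaches zenith, which is the blind zone the abstract refers to.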
Funding: Supported by the National Key R&D Program of China (No. 2018YFB1003905), the National Natural Science Foundation of China (Grant No. 61971032), and the Fundamental Research Funds for the Central Universities (No. FRF-TP-18-008A3).