Chronic kidney disease (CKD) is an increasingly prevalent medical condition associated with high mortality and cardiovascular complications. The intricate interplay between kidney dysfunction and subsequent metabolic disturbances may provide insights into the underlying mechanisms driving CKD onset and progression. Herein, we propose a large-scale plasma metabolite identification and quantification system that combines the strengths of targeted and untargeted metabolomics technologies, i.e., a widely-targeted metabolomics (WT-Met) approach. The WT-Met method enables large-scale identification and accurate quantification of thousands of metabolites. We collected plasma samples from 21 healthy controls and 62 CKD patients, categorized into different stages (22 in stages 1-3, 20 in stage 4, and 20 in stage 5). Using the LC-MS-based WT-Met approach, we effectively annotated and quantified a total of 1,431 metabolites from the plasma samples. Focusing on the 539 endogenous metabolites, we identified 399 significantly altered metabolites and depicted their changing patterns from healthy controls to end-stage CKD. Furthermore, we employed machine learning to identify the optimal combination of metabolites for predicting different stages of CKD, generating a multiclass classifier consisting of 7 metabolites that exhibited an average AUC of 0.99 on the test set. In general, amino acids, nucleotides, organic acids, and their metabolites emerged as the most significantly altered metabolites, although their patterns of change varied across the different stages of CKD. The 7-metabolite panel demonstrates promising potential as a biomarker candidate for CKD, and further exploration of these metabolites can provide valuable insights into their roles in the etiology and progression of CKD.
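The abstract does not name the learning algorithm behind the 7-metabolite panel; as a minimal sketch of the kind of multiclass evaluation described (macro-averaged one-vs-rest AUC on a held-out test set), assuming a random-forest classifier and placeholder abundance data:

```python
# Hedged sketch: multiclass CKD staging from a small metabolite panel.
# The estimator, panel members, and data are assumptions, not the
# authors' actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_panel = 83, 7                    # 83 subjects, 7-metabolite panel
X = rng.normal(size=(n_samples, n_panel))     # placeholder abundances
y = rng.integers(0, 4, size=n_samples)        # 0=healthy, 1=stages 1-3, 2=stage 4, 3=stage 5

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# "Average AUC" is read here as the macro-averaged one-vs-rest ROC AUC.
proba = clf.predict_proba(X_te)
print(roc_auc_score(y_te, proba, multi_class="ovr", average="macro"))
```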
Edge devices, due to their limited computational and storage resources, often require the use of compilers for program optimization. Therefore, ensuring the security and reliability of these compilers is of paramount importance in the emerging field of edge AI. One widely used testing method for this purpose is fuzz testing, which detects bugs by feeding random test cases to the target program. However, this process consumes significant time and resources. To improve the efficiency of compiler fuzz testing, it is common practice to apply test case prioritization techniques. Some researchers use machine learning to predict the code coverage of test cases, aiming to maximize the test capability for the target compiler by increasing the overall predicted coverage of the test cases. Nevertheless, these methods can only forecast the code coverage of the compiler at a specific optimization level, potentially missing many optimization-related bugs. In this paper, we introduce C-CORE (short for Clustering by Code Representation), the first framework to prioritize test cases according to their code representations, which are derived directly from the source code. This approach avoids being limited to specific compiler states and extends to a broader range of compiler bugs. Specifically, we first train a scaled pre-trained programming language model to capture as many common features as possible from the test cases generated by a fuzzer. Using this pre-trained model, we then train two downstream models: one for predicting the likelihood of triggering a bug and another for identifying code representations associated with bugs. Subsequently, we cluster the test cases according to their code representations and select the highest-scoring test case from each cluster as a high-quality test case. This reduction in redundant test cases leads to time savings. Comprehensive evaluation results reveal that code representations are better at distinguishing test capabilities and that C-CORE significantly enhances testing efficiency. Across four datasets, C-CORE increases the average percentage of faults detected (APFD) by 0.16 to 0.31 and reduces test time by over 50% in 46% of cases. Compared to the best results from approaches using predicted code coverage, C-CORE improves the APFD value by 1.1% to 12.3% and achieves an overall time saving of 159.1%.
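For reference, APFD (the metric C-CORE improves) is conventionally computed from the positions at which each fault is first detected in the prioritized test order; a minimal sketch of the standard formula from the test-prioritization literature, not code from the C-CORE paper itself:

```python
# APFD = 1 - (sum of first-detection positions) / (n * m) + 1 / (2n),
# where n = number of tests and m = number of faults.
def apfd(order, fault_matrix):
    """order: test ids in prioritized execution order.
    fault_matrix: dict mapping test id -> set of fault ids it detects."""
    faults = set().union(*fault_matrix.values())
    n, m = len(order), len(faults)
    first = {}  # position (1-based) at which each fault is first detected
    for pos, t in enumerate(order, start=1):
        for f in fault_matrix.get(t, ()):
            first.setdefault(f, pos)
    return 1 - sum(first[f] for f in faults) / (n * m) + 1 / (2 * n)

# Toy usage: the same three tests in two different orders.
fm = {"t1": {"f1"}, "t2": set(), "t3": {"f2"}}
print(apfd(["t3", "t1", "t2"], fm))  # faults found early -> 0.667
print(apfd(["t2", "t1", "t3"], fm))  # faults found late  -> 0.333
```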
In order to adapt to different languages and platforms, the paper discusses how to process and validate the IDL symbol table and intermediate code through an XML API. It puts emphasis on extending the IDL API towards the DOM API, based on the idea of combining XML with IDL compilers. Finally, an IDL compiler design framework based on the XML API is given, in which the compiler front end can be managed and validated by XML techniques and tools, and the IDL API can be validated through testing, so the IDL intermediate code gains maintainability and portability and can be generated automatically. An IDL compiler can thus be developed and extended through an XML-based API, which realizes the versatility and portability of a modern compiler.
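To make the idea concrete, a symbol table serialized as XML can be walked and checked with the standard DOM API; the `<symbol>` schema below is a hypothetical illustration, not the paper's actual format:

```python
# Hedged sketch: validating an XML-serialized IDL symbol table via DOM.
from xml.dom.minidom import parseString

doc = parseString("""
<symtab>
  <symbol name="Account" kind="interface"/>
  <symbol name="balance" kind="attribute" type="long"/>
</symtab>
""")

for sym in doc.getElementsByTagName("symbol"):
    # A simple structural check: every symbol needs a name and a kind.
    if not (sym.getAttribute("name") and sym.getAttribute("kind")):
        raise ValueError("invalid symbol entry")
    print(sym.getAttribute("name"), sym.getAttribute("kind"))
```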
Objective: Postoperative complications adversely affect the prognosis of patients with gastric cancer. This study investigates the feasibility of using a machine-learning model to predict surgical outcomes in patients undergoing gastrectomy. Methods: In this study, cancer patients who underwent gastrectomy at Shanghai Rui Jin Hospital in 2017 were randomly assigned to a development or validation cohort in a 9:1 ratio. A support vector classification (SVC) model to predict surgical outcomes in patients undergoing gastrectomy was developed and further validated. Results: A total of 321 patients with 32 features were collected. Positive and negative outcomes of postoperative complications after gastrectomy appeared in 100 (31.2%) and 221 (68.8%) patients, respectively. The SVC model was constructed to predict surgical outcomes in patients undergoing gastrectomy. The accuracy of 10-fold cross-validation and external verification was 78.17% and 78.12%, respectively. Furthermore, an online web server has been developed to share the SVC model for machine-learning-assisted prediction of surgical outcomes in future gastrectomy procedures, accessible at http://47.100.47.97:5005/r_model_prediction. Conclusions: The SVC model was a useful predictor of the risk of postoperative complications after gastrectomy, which may help stratify patients with different overall status for the choice of surgical procedure or other treatments. It can be expected that machine-learning models in cancer informatics research can be shared and accessed via a web address all over the world.
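A minimal sketch of the evaluation scheme described (an SVC scored by 10-fold cross-validation); the features, preprocessing, and hyperparameters below are placeholders, not the study's actual 32 clinical features:

```python
# Hedged sketch: SVC with 10-fold cross-validation on placeholder data.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(321, 32))        # 321 patients, 32 features
y = rng.integers(0, 2, size=321)      # 1 = postoperative complication

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {scores.mean():.2%}")
```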
The composite exciter and the CaO to Na₂SO₄ dosing ratio are known to have a strong impact on the mechanical strength of fly-ash concrete. In the present study, a hybrid approach relying on experiments and a machine-learning technique has been used to tackle this problem. The tests have shown that the optimal admixture of CaO and Na₂SO₄ alone is 8%. The best 3-day mechanical strength of fly-ash concrete is achieved at 8% of the compound activator; if the 28-day mechanical strength is considered, the best performance is obtained at 4% of the compound activator. Moreover, the 3-day mechanical strength of fly-ash concrete is better when the dosing ratio of CaO to Na₂SO₄ in the compound activator is 1:1; the maximum 28-day strength of fly-ash concrete can be achieved with a 1:1 ratio of CaO to Na₂SO₄ at a 4% compound activator. In this case, the compressive and flexural strengths are 260 MPa and 53.6 MPa, respectively. The 28-day mechanical strength of fly-ash concrete can be improved by a 4:1 ratio of CaO to Na₂SO₄ at 8% and 12% compound excitant dosages. It is shown that the predictions based on the aforementioned machine-learning approach are accurate and reliable.
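The abstract does not name the learner; as an illustrative sketch of regressing strength on mix-design inputs (activator dosage and CaO:Na₂SO₄ ratio), a random-forest regressor over made-up training points:

```python
# Hedged sketch: strength regression over mix-design features. The
# estimator choice and all data points are placeholders, purely
# illustrative of the experiments + machine-learning hybrid approach.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# columns: activator dosage (%), CaO share of the CaO:Na2SO4 blend
X = np.array([[4, 0.5], [8, 0.5], [12, 0.5], [4, 0.8], [8, 0.8], [12, 0.8]])
y = np.array([21.0, 24.5, 23.0, 22.5, 26.0, 24.0])  # placeholder strengths (MPa)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict([[8, 0.5]]))  # predicted strength for 8% dosage, 1:1 blend
```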
The coronavirus 3C-like (3CL) protease, a cysteine protease, plays an important role in viral infection and immune escape. However, there is still a lack of effective tools for determining the cleavage sites of the 3CL protease. This study systematically investigated the diversity of the cleavage sites of the coronavirus 3CL protease on the viral polyprotein and found that the cleavage motifs were highly conserved for viruses in the genera Alphacoronavirus, Betacoronavirus and Gammacoronavirus. Strong residue preferences were observed at the neighboring positions of the cleavage sites. A random forest (RF) model was built to predict the cleavage sites of the coronavirus 3CL protease based on the representation of residues in cleavage motifs by amino acid indexes, and the model achieved an AUC of 0.96 in cross-validation. The RF model was further tested on an independent test dataset composed of cleavage sites on 99 proteins from multiple coronavirus hosts. It achieved an AUC of 0.95 and correctly predicted 80% of the cleavage sites. The RF model then predicted 1,352 human proteins to be cleaved by the 3CL protease. These proteins were enriched in several GO terms related to the cytoskeleton, such as the microtubule, actin and tubulin. Finally, a web server named 3CLP was built to predict the cleavage sites of the coronavirus 3CL protease based on the RF model. Overall, the study provides an effective tool for identifying cleavage sites of the 3CL protease and provides insights into the molecular mechanism underlying the pathogenicity of coronaviruses.
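A minimal sketch of the feature scheme described: residues flanking a candidate cleavage site are mapped to numeric amino acid index values and fed to a random forest. The single toy index, window size, and training windows below are illustrative stand-ins, not the paper's exact AAindex selection or data:

```python
# Hedged sketch: amino-acid-index encoding + random forest.
from sklearn.ensemble import RandomForestClassifier

HYDROPHOBICITY = {  # toy single-property index; real models use many indexes
    "A": 1.8, "L": 3.8, "Q": -3.5, "S": -0.8, "G": -0.4, "V": 4.2,
}

def featurize(window):
    """Encode a peptide window around a candidate site as index values."""
    return [HYDROPHOBICITY.get(res, 0.0) for res in window]

# Toy training set: windows labeled 1 if cleaved, 0 otherwise.
X = [featurize(w) for w in ["LQSG", "LQAG", "VQSA", "GASV", "SGLV", "AVGS"]]
y = [1, 1, 1, 0, 0, 0]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict_proba([featurize("LQSV")])[0, 1])  # cleavage probability
```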
Anomaly detection is becoming increasingly significant in industrial cyber security, and different machine-learning algorithms have been generally acknowledged as effective intrusion detection engines that successfully identify cyber attacks. However, different machine-learning algorithms may exhibit their own detection effects even when they analyze the same feature samples. As a consequence, after developing one feature generation approach, the most effective and applicable detection engines should be carefully selected by comparing the distinct properties of each machine-learning algorithm. Based on process control features generated by directed function transition diagrams, this paper introduces five different machine-learning algorithms as alternative detection engines and discusses their matching abilities. Furthermore, this paper not only describes some qualitative properties to compare their advantages and disadvantages, but also gives an in-depth and meticulous study of their detection accuracy and running time. In the experiments, two attack models and four different attack intensities are defined to facilitate all quantitative comparisons, and the impact on detection accuracy caused by the feature parameter is also comparatively analyzed. All experimental results clearly show that SVM (Support Vector Machine) and WNN (Wavelet Neural Network) are suggested as two applicable detection engines under different cases.
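A minimal sketch of this kind of engine comparison: several candidate detectors scored on accuracy and training time over the same feature samples. The candidate set and data are illustrative (scikit-learn has no wavelet neural network, so an MLP stands in for WNN here):

```python
# Hedged sketch: comparing detection engines on accuracy and fit time.
import time
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))            # placeholder process-control features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # synthetic attack/normal labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

engines = {
    "SVM": SVC(),
    "MLP (WNN stand-in)": MLPClassifier(max_iter=500, random_state=0),
    "Decision tree": DecisionTreeClassifier(random_state=0),
}
for name, clf in engines.items():
    t0 = time.perf_counter()
    acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: accuracy={acc:.3f}, fit time={time.perf_counter() - t0:.3f}s")
```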
A systolic array architecture computer (FXCQ) has been designed for signal processing. It can handle floating-point data at very high speed. It is composed of 16 processing cells and a cache that are connected linearly and form a ring structure. All processing cells are identical and programmable. Each processing cell has a peak performance of 20 million floating-point operations per second (20 MFLOPS), so the machine has a peak performance of 320 MFLOPS. It is integrated as an attached processor into a host system through a VME bus interface. Programs for FXCQ are written in a high-level language, B, which is supported by a parallel optimizing compiler. This paper describes the architecture of FXCQ, the B language, and its compiler.
Abstract Syntax Notation One (ASN.1) has been widely used in specifications of high-level communication protocols. It is also very important for the Intelligent Network Application Protocol (INAP). This paper presents the design and implementation of an ASN.1 C++ compiler. From the ASN.1 text, this compiler can generate C++ code of functions for encoding and decoding the data types defined in ASN.1. These functions are based on the Basic Encoding Rules (BER) of ASN.1. They have been used in the CIN 01 and CIN 02 systems.
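As a worked example of BER itself (the tag-length-value scheme the generated C++ functions implement), the universal INTEGER type encodes as tag 0x02, a length octet, and minimal big-endian two's-complement content octets. This is a standalone illustration of the standard, not the paper's generated code:

```python
# Worked BER example: encoding an INTEGER value (X.690 tag 0x02).
def ber_encode_integer(value: int) -> bytes:
    n = value if value >= 0 else ~value   # magnitude pattern for two's complement
    length = n.bit_length() // 8 + 1      # minimal content octets incl. sign bit
    content = value.to_bytes(length, "big", signed=True)
    return bytes([0x02, len(content)]) + content  # short-form length (< 128 octets)

print(ber_encode_integer(300).hex())  # 0202012c -> tag 02, length 02, value 012c
print(ber_encode_integer(-1).hex())   # 0201ff
```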
The paper addresses the challenge of transmitting a large number of files stored in a data center (DC), encrypting them by compilers, and sending them through a network in an acceptable time. Faced with a large number of files, a single compiler may not be sufficient to encrypt the data in an acceptable time. In this paper, we consider the problem of several compilers, and the objective is to find an algorithm that can give an efficient schedule for the given files to be compiled by the compilers. The main objective of the work is to minimize the gap in the total size of assigned files between compilers. This minimization ensures a fair distribution of files to the different compilers. This problem is considered to be very hard. This paper presents two research axes. The first axis is related to architecture: we propose a novel pre-compiler architecture in this context. The second axis is algorithmic development: we develop six algorithms to solve the problem, based on the dispatching rules method, a decomposition method, and an iterative approach. These algorithms give approximate solutions for the studied problem. An experimental study is implemented to show the performance of the algorithms. Several indicators are used to measure the performance of the proposed algorithms, and five classes are proposed to test the algorithms with a total of 2,350 instances. A comparison between the proposed algorithms is presented in different tables and discussed to show the performance of each algorithm. The results showed that the best algorithm is the Iterative-mixed Smallest-Longest Heuristic (ISL), with a percentage equal to 97.7% and an average running time equal to 0.148 s. All other algorithms did not exceed 22%. The best algorithm excluding ISL is the Iterative-mixed Longest-Smallest Heuristic (ILS), with a percentage equal to 21.4% and an average running time equal to 0.150 s.
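For intuition, a classic dispatching-rule baseline for this assignment problem is the longest-processing-time (LPT) rule: take files largest first and assign each to the compiler with the smallest assigned total. This is a generic greedy baseline, not one of the paper's six algorithms:

```python
# Hedged sketch: LPT-style assignment minimizing the size gap.
import heapq

def lpt_assign(file_sizes, n_compilers):
    loads = [(0, c) for c in range(n_compilers)]  # min-heap of (total, compiler)
    heapq.heapify(loads)
    assignment = {c: [] for c in range(n_compilers)}
    for size in sorted(file_sizes, reverse=True):
        total, c = heapq.heappop(loads)           # least-loaded compiler
        assignment[c].append(size)
        heapq.heappush(loads, (total + size, c))
    totals = [sum(v) for v in assignment.values()]
    return assignment, max(totals) - min(totals)  # the gap to minimize

files = [70, 55, 40, 38, 22, 10, 8]
assignment, gap = lpt_assign(files, 3)
print(assignment, "gap:", gap)
```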
An object-oriented C++ parallel compiler system, called OOCPCS, has been developed to allow programmers to write sequential programs in C++ or Annotated C++ for parallel computation. OOCPCS is based on an integrated object-oriented paradigm and large-grain data flow model, called OOLGDFM, and recognizes parallel objects automatically using parallel compiling techniques. The paper describes the object-oriented parallel model and the realization of the system on networks.
The paper's purpose is to design and program a four-operation calculator that receives voice instructions and runs them in either a voice or a text phase. The calculator simulates the work of a compiler. The paper is a practical example programmed to support the claim that it is possible to construct a verbal compiler.
With the continuous expansion of software applications, people's requirements for software quality are increasing. Software defect prediction is an important technology for improving software quality. It typically encodes software into several features and applies machine-learning methods to build defect prediction classifiers, which can estimate whether a software area is clean or buggy. However, current encoding methods are mainly based on traditional manual features or the AST of the source code. Traditional manual features struggle to reflect the deep semantics of programs, and there is a lot of noise information in the AST, which affects the expression of semantic features. To overcome these deficiencies, we combined Convolutional Neural Networks (CNN) with the compiler Intermediate Representation (IR) and proposed a novel IR-based program encoding method for software defect prediction (CIR-CNN). Specifically, our program encoding method is based on the compiler IR, which eliminates a large amount of noise information in the syntax structure of the source code and facilitates the acquisition of more accurate semantic information. Second, with the help of data flow analysis, a Data Dependency Graph (DDG) is constructed on the compiler IR, which helps capture deeper semantic information about the program. Finally, we use the widely used CNN model to build a software defect prediction model, which increases the adaptive ability of the method. To evaluate the performance of CIR-CNN, we used seven projects from the PROMISE datasets to set up comparative experiments. The experimental results show that, in within-project defect prediction (WPDP), our CIR-CNN method improved prediction accuracy by 12% over the AST-encoded CNN-based model and by 20.9% over the traditional features-based LR model. In cross-project defect prediction (CPDP), it improved on the AST-encoded DBN-based model by 9.1% and on the traditional features-based TCA+ model by 19.2%.
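A minimal sketch of the final stage: a 1D CNN classifier over embedded IR token sequences. The vocabulary size, sequence length, and layer sizes below are placeholders; the paper's exact CIR-CNN architecture may differ:

```python
# Hedged sketch: 1D CNN over encoded compiler-IR token sequences.
import torch
import torch.nn as nn

class DefectCNN(nn.Module):
    def __init__(self, vocab_size=500, embed_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=5, padding=2)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(64, 2)                  # clean vs. buggy

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)   # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))
        return self.fc(self.pool(x).squeeze(-1))

model = DefectCNN()
tokens = torch.randint(0, 500, (8, 200))  # a batch of encoded IR programs
print(model(tokens).shape)                # torch.Size([8, 2])
```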
Drug addiction causes abnormal changes in brain activation, which are the root cause of drug craving and brain function errors. This study enrolled drug abusers to determine the effects of different drugs on brain activation, using a functional near-infrared spectroscopy (fNIRS) device. The study was designed with an experimental paradigm that included the induction of resting and drug-craving states. We collected fNIRS data from 30 drug users: 10 who used heroin, 10 who used methamphetamine, and 10 who used mixed drugs. First, using statistical analysis, the study analyzed the activation of eight functional areas of the left and right hemispheres of the prefrontal cortex in the heroin, methamphetamine, and mixed-drug groups: the Left/Right Dorsolateral prefrontal cortex (L/R-DLPFC), Left/Right Ventrolateral prefrontal cortex (L/R-VLPFC), Left/Right Fronto-polar prefrontal cortex (L/R-FPC), and Left/Right Orbitofrontal cortex (L/R-OFC). Second, based on the degrees of activation of oxyhaemoglobin concentration (HbO2), the analysis yielded the specific activation patterns of each group of addicts. Finally, taking the data from the channels that showed high degrees of activation and were shared across the three groups as input to Convolutional Neural Networks (CNNs), the paper classified the different drug abusers. The average three-class accuracy is 67.13%. This is of great significance for the analysis of brain function errors and for personalized rehabilitation.
The diversity of software and hardware forces programmers to spend a great deal of time optimizing their source code, which often requires specific treatment for each platform. The problem becomes critical on embedded devices, where computational and memory resources are strictly constrained. Compilers play an essential role in deploying source code on a target device through the backend. In this work, a novel backend for the Open Neural Network Compiler (ONNC) is proposed, which exploits machine learning to optimize code for the ARM Cortex-M device. The backend requires minimal changes to Open Neural Network Exchange (ONNX) models. Several novel optimization techniques are also incorporated in the backend, such as quantizing the ONNX model's weights and automatically tuning the dimensions of operators in computations. The performance of the proposed framework is evaluated for two applications: handwritten digit recognition on the Modified National Institute of Standards and Technology (MNIST) dataset and model, and image classification on the Canadian Institute For Advanced Research 10 (CIFAR-10) dataset with the AlexNet-Light model. The system achieves 98.90% and 90.55% accuracy for handwritten digit recognition and image classification, respectively. Furthermore, the proposed architecture is significantly more lightweight than other state-of-the-art models in terms of both computation time and generated source code complexity. From the system perspective, this work provides a novel approach to deploying computations directly from available ONNX models to target devices by optimizing compilers while maintaining high accuracy.
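To illustrate the weight-quantization idea mentioned, a common scheme is symmetric per-tensor int8 quantization of a float32 weight tensor; ONNC's actual scheme for Cortex-M may differ in details:

```python
# Hedged sketch: symmetric int8 quantization of a weight tensor.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus a scale for dequantization."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.default_rng(0).normal(scale=0.1, size=(16, 16)).astype(np.float32)
q, scale = quantize_int8(w)
reconstructed = q.astype(np.float32) * scale
print("max abs error:", np.abs(w - reconstructed).max())  # bounded by scale/2
```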
Reliable sales forecasts are important to the garment industry. In recent years, the global climate has been warming, the weather changes frequently, and clothing sales are affected by weather fluctuations. The purpose of this study is to investigate whether weather data can improve the accuracy of product sales forecasts and to establish a corresponding clothing sales forecasting model. The model uses the basic attributes of clothing product data, historical sales data, and weather data, and is based on a random forest, XGB, and GBDT adopting a stacking strategy. We found that weather information is not useful for basic clothing sales forecasts, but it did improve the accuracy of seasonal clothing sales forecasts: the MSEs for dresses, down jackets, and shirts are reduced by 86.03%, 80.14%, and 41.49% on average, respectively. In addition, we found that the stacking-strategy model outperformed the voting-strategy model, with an average MSE reduction of 49.28%. Clothing managers can use this model to forecast sales when making sales plans based on weather information.
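A minimal sketch of the ensemble described: tree-based base learners combined by a stacking meta-learner. The features are placeholders, and scikit-learn's GradientBoostingRegressor stands in for XGBoost to keep the example dependency-free:

```python
# Hedged sketch: stacking strategy over tree-based regressors.
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))  # e.g., product attributes + temperature, rainfall
y = X[:, 0] * 3 + X[:, 5] * 2 + rng.normal(size=500)  # synthetic weekly sales

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(random_state=0)),
                ("gbdt", GradientBoostingRegressor(random_state=0))],
    final_estimator=Ridge(),   # meta-learner over base predictions
)
print(stack.fit(X[:400], y[:400]).score(X[400:], y[400:]))  # held-out R^2
```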