Traditional methods for selecting models in experimental data analysis are susceptible to researcher bias, hindering exploration of alternative explanations and potentially leading to overfitting. The Finite Information Quantity (FIQ) approach offers a novel solution by acknowledging the inherent limitations in the information processing capacity of physical systems. This framework facilitates the development of objective criteria for model selection (comparative uncertainty) and paves the way for a more comprehensive understanding of phenomena through exploring diverse explanations. This work presents a detailed comparison of the FIQ approach with ten established model selection methods, highlighting the advantages and limitations of each. We demonstrate the potential of FIQ to enhance the objectivity and robustness of scientific inquiry through three practical examples: selecting appropriate models for measuring fundamental constants, sound velocity, and underwater electrical discharges. Further research is warranted to explore the full applicability of FIQ across various scientific disciplines.
This paper outlines the configuration and performance of the large gas turbines, and the combined cycle power plants built around them, designed and produced by four large, renowned gas turbine manufacturing firms in the world, providing a reference for the relevant sectors and enterprises when importing advanced gas turbines and technologies.
It is quite common in statistical modeling to select a model and make inference as if the model had been known in advance, i.e. ignoring model selection uncertainty. The resulting estimator is called the post-model selection estimator (PMSE), whose properties are hard to derive. Conditioning on the data at hand (as is usually the case), Bayesian model selection is free of this phenomenon. This paper is concerned with the properties of the Bayesian estimator obtained after model selection when the frequentist (long-run) performance of that estimator is of interest. The proposed method, using Bayesian decision theory, is based on the machinery of the well-known Bayesian model averaging (BMA) and outperforms both PMSE and BMA. It is shown that if the unconditional model selection probability is equal to the model prior, then the proposed approach reduces to BMA. The method is illustrated using Bernoulli trials.
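A minimal numerical sketch of the BMA machinery the abstract builds on, applied to Bernoulli trials; the data, priors, and candidate models below are made up for illustration and are not the paper's construction.

```python
import numpy as np
from scipy.special import betaln

# Two hypothetical models for a Bernoulli sequence:
#   M1: p fixed at 0.5;  M2: uniform Beta(1, 1) prior on p.
x = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])  # made-up data
n, s = len(x), int(x.sum())

# Log marginal likelihoods m(x | M) of the observed sequence.
log_m1 = n * np.log(0.5)                          # p fixed at 1/2
log_m2 = betaln(1 + s, 1 + n - s) - betaln(1, 1)  # Beta-Binomial marginal

# Posterior model probabilities under equal model priors.
logm = np.array([log_m1, log_m2])
w = np.exp(logm - logm.max())
w /= w.sum()

# BMA point estimate of p: model-probability-weighted posterior means.
post_means = np.array([0.5, (1 + s) / (2 + n)])
print("posterior model probs:", w, " BMA estimate of p:", w @ post_means)
```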
Traditional model selection criteria try to balance fitting error against model complexity, and they require assumptions on the distribution of the response or the noise that may be misspecified. In this article, we give a new model selection criterion that needs fewer assumptions: under the assumption that the noise term in the model is independent of the explanatory variables, it minimizes the association strength between the regression residuals and the response. The Maximal Information Coefficient (MIC), a recently proposed dependence measure, captures a wide range of associations and gives almost the same score to different types of relationships with equal noise, so MIC is used to measure the association strength. Furthermore, the partial maximal information coefficient (PMIC) is introduced to capture the association between two variables after removing a third, controlling random variable. In addition, a definition of the general partial relationship is given.
Funding: partly supported by the National Basic Research Program of China (973 Program, 2011CB707802, 2013CB910200) and the National Science Foundation of China (11201466).
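As an illustration of residual-response association as a selection criterion, the sketch below scores linear and quadratic fits by the MIC between their residuals and the response; it assumes the third-party minepy package for MIC and is not the authors' implementation.

```python
import numpy as np
from minepy import MINE  # third-party MIC implementation (assumed available)

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 300)
y = x**2 + rng.normal(0, 0.3, 300)  # true relation is quadratic

def mic(u, v):
    m = MINE(alpha=0.6, c=15)  # default MINE parameters
    m.compute_score(u, v)
    return m.mic()

# Candidate models: linear vs. quadratic least-squares fits.
for deg in (1, 2):
    resid = y - np.polyval(np.polyfit(x, y, deg), x)
    # Criterion from the abstract: prefer the model whose residuals
    # carry the least association with the response.
    print(f"degree {deg}: MIC(residuals, y) = {mic(resid, y):.3f}")
```

The under-specified linear fit leaves the quadratic structure in its residuals, so its MIC stays high, while the correct model drives the residual-response association toward independence.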
Covariance functions have been proposed as an alternative for modeling longitudinal data in animal breeding because of their various merits in comparison to classical analytical methods. In practical estimation, the models and polynomial orders fitted can influence the estimates of covariance functions and thus genetic parameters. The objective of this study was to select a model for estimating covariance functions for body weights of Angora goats at 7 time points. Covariance functions were estimated by fitting 6 random regression models with birth year, birth month, sex, age of dam, birth type, and relative birth date as fixed effects. The random effects involved were direct and maternal additive genetic effects, and animal and maternal permanent environmental effects, with different orders of fit. Selection of the model and orders of fit was carried out by likelihood ratio test and 4 types of information criteria. The results showed that the model with polynomial fits of order 6 for the direct additive genetic and animal permanent environmental effects, and orders 4 and 5 for the maternal genetic and maternal permanent environmental effects, respectively, was preferable for estimating covariance functions. Models with and without maternal effects influenced the estimates of covariance functions greatly. The maternal permanent environmental effect does not explain all of the permanent environmental variation well, suggesting that other sources of permanent environmental effects also have a large influence on covariance function estimates.
Funding: funded by the Young Academic Leaders Supporting Project in Institutions of Higher Education of Shanxi Province, China.
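For the likelihood ratio test used to compare nested orders of fit, a small sketch with made-up log-likelihoods and parameter counts:

```python
from scipy.stats import chi2

# Hypothetical log-likelihoods from two nested random regression models,
# e.g. order-4 vs. order-5 polynomial fits for a maternal effect.
ll_reduced, k_reduced = -10234.7, 18   # made-up values
ll_full,    k_full    = -10226.1, 23

lrt = 2.0 * (ll_full - ll_reduced)   # likelihood ratio statistic
df = k_full - k_reduced              # extra parameters in the full model
p = chi2.sf(lrt, df)                 # asymptotic chi-square p-value
print(f"LRT = {lrt:.2f}, df = {df}, p = {p:.4f}")
```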
Ongoing research on model choice and selection has generated a plethora of approaches. With such a wealth of methods, it can be difficult for a researcher to know which model selection approach is the proper way to select the appropriate model for prediction. The authors present an evaluation of various model selection criteria from a decision-theoretic perspective using experimental data, in order to define and recommend a criterion for selecting the best model. In this analysis, six of the most common selection criteria, nineteen friction factor correlations, and eight sets of experimental data are employed. The results show that while use of the traditional correlation coefficient R2 is inappropriate, the root mean square error (RMSE) can be used to rank models but does not give much insight into their accuracy. Other criteria such as the correlation ratio, mean absolute error, and standard deviation are also evaluated. The Akaike Information Criterion (AIC) has shown its superiority to the other selection criteria, and the authors propose AIC as an alternative to use when fitting experimental data or evaluating existing correlations. Indeed, the AIC method is information-theory based, theoretically sound, and stable. The paper presents a detailed discussion of the model selection criteria, their pros and cons, and how they can be utilized to allow proper comparison of different models so that the best model can be inferred on sound mathematical grounds. In conclusion, model selection is an interesting problem, and an innovative strategy is introduced to help alleviate similar challenges faced by professionals in the oil and gas industry.
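A small sketch of the AIC comparison advocated above for fitted correlations, assuming Gaussian least-squares errors so that AIC reduces to n ln(RSS/n) + 2k; the data and predictions are fabricated placeholders.

```python
import numpy as np

def aic_from_residuals(y_obs, y_pred, k):
    """AIC for a least-squares fit with k parameters, assuming Gaussian
    errors: AIC = n * ln(RSS / n) + 2k (additive constants dropped)."""
    n = len(y_obs)
    rss = np.sum((np.asarray(y_obs) - np.asarray(y_pred)) ** 2)
    return n * np.log(rss / n) + 2 * k

# Hypothetical comparison of two friction-factor correlations against
# the same experimental data; the lower AIC wins.
y = np.array([0.021, 0.024, 0.028, 0.031, 0.035])        # made-up data
pred_a = np.array([0.020, 0.025, 0.027, 0.032, 0.034])   # 2-parameter model
pred_b = np.array([0.022, 0.023, 0.029, 0.030, 0.036])   # 3-parameter model
print(aic_from_residuals(y, pred_a, 2), aic_from_residuals(y, pred_b, 3))
```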
This paper proposes a new search strategy using a mutative scale chaos optimization algorithm (MSCO) for model selection of support vector machines (SVM). It searches the parameter space of the SVM with very high efficiency and finds the optimum parameter setting for a practical classification problem at very low time cost. To demonstrate the performance of the proposed method, it is applied to SVM model selection in ultrasonic flaw classification and compared with grid search. Experimental results show that MSCO is a very powerful tool for SVM model selection and outperforms grid search in both search speed and precision in ultrasonic flaw classification.
Funding: supported by the National High-Technology Research and Development Program of China (Grant No. 863-2001AA602021).
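The paper's exact MSCO algorithm is not reproduced here; the following rough sketch shows the general idea of a chaos-driven, range-contracting search over SVM hyperparameters (C, gamma), using scikit-learn for cross-validated scoring.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
lo = np.array([-2.0, -4.0])     # log10 lower bounds for (C, gamma)
hi = np.array([3.0, 1.0])       # log10 upper bounds
z = np.array([0.345, 0.678])    # chaotic state (avoid the map's fixed points)
best_score, best_params = -np.inf, None
best_log = (lo + hi) / 2.0

for it in range(60):
    z = 4.0 * z * (1.0 - z)                      # logistic map, mu = 4
    log_params = lo + z * (hi - lo)              # map chaos onto search box
    C, gamma = 10.0 ** log_params
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=5).mean()
    if score > best_score:
        best_score, best_params, best_log = score, (C, gamma), log_params
    if it % 20 == 19:            # "mutative scale": contract around the best
        half = (hi - lo) / 4.0
        lo, hi = best_log - half, best_log + half

print("best (C, gamma):", best_params, " CV accuracy:", round(best_score, 3))
```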
To solve the medium- and long-term power load forecasting problem, the combination forecasting method is further expanded and a weighted combination forecasting model for power load is put forward. This model is divided into two stages: forecasting model selection and weighted combination forecasting. Based on Markov chain conversion and the cloud model, forecasting model selection is implemented and several outstanding models are selected for the combination forecast. For the weighted combination forecast, a fuzzy scale joint evaluation method is proposed to determine the weight of each selected forecasting model. The percentage error and mean absolute percentage error of the weighted combination forecast of power consumption in a certain area of China are 0.7439% and 0.3198%, respectively, while the maximum values of these two indexes for the single forecasting models are 5.2278% and 1.9497%. This shows that the forecasting indexes of the proposed model are improved significantly compared with the single forecasting models.
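A minimal sketch of the second stage, weighted combination forecasting, with made-up forecasts and weights; the MAPE computed at the end is one of the accuracy indexes quoted above.

```python
import numpy as np

# Combine several selected forecasting models with fixed weights and
# score the result. Forecasts and weights are invented for illustration.
actual = np.array([812.0, 845.0, 880.0, 921.0])   # observed load
forecasts = np.array([
    [805.0, 850.0, 874.0, 930.0],   # model 1
    [820.0, 838.0, 889.0, 915.0],   # model 2
    [798.0, 852.0, 871.0, 925.0],   # model 3
])
w = np.array([0.5, 0.3, 0.2])       # weights from some evaluation scheme
combo = w @ forecasts               # weighted combination forecast

mape = np.mean(np.abs((actual - combo) / actual)) * 100
print("combined forecast:", combo, f" MAPE = {mape:.4f}%")
```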
Regional climate change impact assessments are becoming increasingly important for developing adaptation strategies in an uncertain future with respect to hydro-climatic extremes. There are a number of Global Climate Models (GCMs) and emission scenarios providing predictions of future changes in climate. As a result, there is a level of uncertainty associated with the decision of which climate models to use for the assessment of climate change impacts. The IPCC has recommended using as many global climate model scenarios as possible; however, this approach may be impractical for regional assessments that are computationally demanding. Methods have been developed to select climate model scenarios, generally consisting of selecting a model with the highest skill (validation), creating an ensemble, or selecting one or more extremes. Validation methods limit analyses to models with higher skill in simulating the historical climate, ensemble methods typically take multi-model means, medians, or percentiles, and extremes methods tend to use scenarios which bound the projected changes in precipitation and temperature. In this paper a quantile-regression-based validation method is developed and applied to generate a reduced set of GCM scenarios for analyzing daily maximum streamflow uncertainty in the Upper Thames River Basin, Canada, while extremes and percentile ensemble approaches are also used for comparison. Results indicate that the validation method was able to effectively rank and reduce the set of scenarios, while the extremes and percentile ensemble methods were found not to necessarily correlate well with the range of extreme flows for all calendar months and return periods.
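A loose sketch of quantile-regression-based skill scoring in the spirit of the validation method (not the paper's exact procedure): each GCM's historical simulation is regressed against pseudo-observations at several quantiles via statsmodels, and models are ranked by average pseudo R-squared.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up "observed" flows and three synthetic GCM historical simulations
# with increasing noise, standing in for models of decreasing skill.
rng = np.random.default_rng(1)
obs = rng.gamma(2.0, 10.0, 400)
sims = {f"GCM-{i}": obs + rng.normal(0, 4 + 3 * i, 400) for i in range(3)}

for name, sim in sims.items():
    df = pd.DataFrame({"obs": obs, "sim": sim})
    # Average pseudo R-squared across low, median, and high quantiles.
    skill = np.mean([smf.quantreg("obs ~ sim", df).fit(q=q).prsquared
                     for q in (0.1, 0.5, 0.9)])
    print(name, round(skill, 3))
```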
We focus on the development of model selection criteria in linear mixed models. In particular, we propose model selection criteria following Mallows' Conceptual Predictive Statistic (Cp) [1] [2] in linear mixed models. When correlation exists between the observations in the data, the normal Gauss discrepancy of the univariate case is not appropriate for measuring the distance between the true model and a candidate model. Instead, we define a marginal Gauss discrepancy which takes the correlation into account in the mixed models. The model selection criterion, marginal Cp, called MCp, serves as an asymptotically unbiased estimator of the expected marginal Gauss discrepancy. An improvement of MCp, called IMCp, is then derived and proved to be a more accurate estimator of the expected marginal Gauss discrepancy than MCp. The performance of the proposed criteria is investigated in a simulation study. The simulation results show that in small samples, the proposed criteria outperform the Akaike Information Criterion (AIC) [3] [4] and Bayesian Information Criterion (BIC) [5] in selecting the correct model; in large samples, their performance is competitive. Further, the proposed criteria perform significantly better for highly correlated response data than for weakly correlated data.
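For reference, the classical univariate Mallows' Cp that MCp generalizes can be computed as below; the marginal versions for correlated mixed-model data are not reproduced here, and the numbers are made up.

```python
def mallows_cp(rss_p, p, sigma2_full, n):
    """Classical Mallows' Cp for a candidate model with p parameters:
    Cp = RSS_p / sigma2_hat - n + 2p, with sigma2_hat estimated from the
    full model. (The paper's MCp extends this idea to correlated data.)"""
    return rss_p / sigma2_full - n + 2 * p

# Made-up numbers: n = 50 observations, full-model variance estimate 1.8.
n, sigma2_full = 50, 1.8
for p, rss in [(2, 130.0), (3, 95.0), (4, 88.0)]:
    print(p, round(mallows_cp(rss, p, sigma2_full, n), 2))
```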
We consider the model selection problem for the dependency between the terminal event and the non-terminal event under semi-competing risks data. When the relationship between the two events is unspecified, inference on the non-terminal event is not identifiable, so we cannot make inference on the non-terminal event without extra assumptions. Thus, an association model for semi-competing risks data is necessary, and it is important to select an appropriate dependence model for a data set. We construct the likelihood function for semi-competing risks data to select an appropriate dependence model. Simulation studies show that the proposed approach performs well. Finally, we apply our method to a bone marrow transplant data set.
Computations involved in the Bayesian approach to practical model selection problems are usually very difficult. Computational simplifications are sometimes possible, but are not generally applicable. There is a large literature available on a methodology based on information theory called Minimum Description Length (MDL). It is described here how many of these techniques are either directly Bayesian in nature, or are very good objective approximations to Bayesian solutions. First, connections between the Bayesian approach and MDL are theoretically explored; thereafter a few illustrations are provided to describe how MDL can give useful computational simplifications.
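A crude two-part MDL sketch illustrating the Bayesian/MDL connection discussed above: the description length of a Gaussian regression model is a data cost plus a (k/2) log2 n parameter cost, which up to constants parallels BIC. This is a simplification, not the paper's treatment.

```python
import numpy as np

def two_part_mdl(y, y_pred, k):
    """Crude two-part MDL score for a Gaussian regression model:
    data cost (n/2) * log2(RSS / n) plus parameter cost (k/2) * log2(n)."""
    n = len(y)
    rss = np.sum((np.asarray(y) - np.asarray(y_pred)) ** 2)
    return 0.5 * n * np.log2(rss / n) + 0.5 * k * np.log2(n)

# Fit polynomials of increasing complexity to a noisy line and compare
# description lengths; the middle model should win.
x = np.linspace(0, 1, 40)
y = 2.0 * x + 0.1 * np.random.default_rng(2).normal(size=40)
for k in (1, 2, 5):   # k coefficients = polynomial degree k - 1
    y_hat = np.polyval(np.polyfit(x, y, k - 1), x)
    print(f"k = {k}: description length = {two_part_mdl(y, y_hat, k):.1f} bits")
```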
Multiple response surface methodology (MRSM) most often involves the analysis of small-sample-size datasets, which have inherent statistical modeling problems. Firstly, the classical model selection criteria in use are very inefficient with small-sample-size datasets. Secondly, classical model selection criteria have an acknowledged selection uncertainty problem. Finally, there is a credibility problem associated with modeling sample sizes of the order of most MRSM datasets. This work focuses on determining a solution to these identified problems. The small-sample model selection uncertainty problem is analysed using sixteen model selection criteria and a typical two-input MRSM dataset. Selection of candidate models for the responses in consideration is done based on response surface conformity to expectation, to deliberately avoid selecting models using the problematic classical model selection criteria. A set of permutations of combinations of response models with conforming response surfaces is determined. Each combination is optimised, and results are obtained by overlaying data matrices. The permutation of results is then averaged to obtain credible results. Thus, a transparent multiple-model approach is used to obtain the solution, which lends some credibility to the small-sample-size results of the typical MRSM dataset. The conclusion is that, for a two-input-process MRSM problem, conformity of response surfaces can be effectively used to select candidate models, and thus use of the problematic model selection criteria is avoidable.
From the perspective of new institutional economics, we regard farmers' cooperatives as a "contractual set" integrating a series of long-term contractual relations, and we transform the problem of selecting organizational forms into one of selecting a contractual model within the organization. Using the theoretical framework of Transaction Cost Economics, we analyze the formation mechanism and determinant factors of the contractual models of different farmers' cooperatives and conduct a case study of the Production and Marketing Cooperative of Sweet Pomegranate in Mengzi, Yunnan. The results show that selecting the contractual form of a cooperative is the result of weighing many factors; a new organizational model or contractual arrangement is complementary to the former institutional arrangement; and the choice of cooperative model is an important factor affecting cooperation efficiency and the stability of the organization. The efficiency of an organizational model hinges not only on the transaction characteristics of the organization but also on its compatibility with the external transaction environment. In the process of selecting a contractual model, we should follow objective laws of evolution rather than being bound to any given form.
Kriging models are widely employed in a variety of fields due to their simplicity and flexibility. To gain more distributional information about the unknown parameters, we focus on constructing the fiducial distribution of Kriging model parameters. To address the challenge of constructing the fiducial marginal distribution of the spatially related parameter, we substitute the Bayesian posterior distribution for the fiducial distribution of this parameter and present a quasi-fiducial distribution for the Kriging model parameters. A Gibbs sampling algorithm is given to draw samples from the quasi-fiducial distribution. A model selection criterion based on the quasi-fiducial distribution is then proposed. Numerical studies demonstrate that the proposed method is superior to the Lasso and Elastic Net.
Funding: supported by the National Social Science Fund of China [Grant number 23BTJ064].
Deep neural network (DNN) models have achieved remarkable performance across diverse tasks, leading to widespread commercial adoption. However, training high-accuracy models demands extensive data, substantial computational resources, and significant time investment, making them valuable assets vulnerable to unauthorized exploitation. To address this issue, this paper proposes an intellectual property (IP) protection framework for DNN models based on feature layer selection and hyper-chaotic mapping. Firstly, a sensitivity-based importance evaluation algorithm is used to identify the key feature layers for encryption, effectively protecting the core components of the model. Next, the L1 regularization criterion is applied to further select high-weight features that significantly impact the model's performance, ensuring that the encryption process minimizes performance loss. Finally, a dual-layer encryption mechanism is designed, introducing perturbations into the weight values and utilizing hyper-chaotic mapping to disrupt channel information, further enhancing the model's security. Experimental results demonstrate that encrypting only a small subset of parameters effectively reduces model accuracy to random-guessing levels while ensuring full recoverability. The scheme exhibits strong robustness against model pruning and fine-tuning attacks and maintains consistent performance across multiple datasets, providing an efficient and practical solution for authorization-based DNN IP protection.
Funding: supported in part by the National Natural Science Foundation of China under Grant No. 62172280, in part by the Key Scientific Research Projects of Colleges and Universities in Henan Province, China under Grant No. 23A520006, and in part by the Henan Provincial Science and Technology Research Project under Grant No. 222102210199.
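A toy sketch in the spirit of the chaos-based weight encryption described above; the paper uses hyper-chaotic maps and channel shuffling on selected feature layers, whereas this illustration perturbs a single weight tensor with a simple 1-D logistic-map key stream, reversibly.

```python
import numpy as np

def logistic_stream(key, n, mu=3.99):
    """Deterministic chaotic sequence in (0, 1) seeded by a secret key."""
    z, out = key, np.empty(n)
    for i in range(n):
        z = mu * z * (1.0 - z)
        out[i] = z
    return out

def encrypt(weights, key=0.4231, strength=1.0):
    mask = strength * (logistic_stream(key, weights.size) - 0.5)
    return weights + mask.reshape(weights.shape)

def decrypt(enc, key=0.4231, strength=1.0):
    mask = strength * (logistic_stream(key, enc.size) - 0.5)
    return enc - mask.reshape(enc.shape)

w = np.random.default_rng(3).normal(0, 0.05, (4, 4))  # stand-in layer weights
assert np.allclose(decrypt(encrypt(w)), w)            # fully recoverable
print(encrypt(w) - w)  # perturbation is large relative to the weight scale
```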
As the density of wireless networks increases globally, the vulnerability of overlapped dense wireless communications to interference by hidden nodes and denial-of-service (DoS) attacks is becoming more apparent. There is a gap in research on detecting and responding to attacks on Medium Access Control (MAC) mechanisms themselves, which can lead to service outages between nodes. Classifying exploitation and deceptive jamming attacks on control mechanisms is particularly challenging due to their resemblance to normal heavy communication patterns. Accordingly, this paper proposes a machine-learning-based selective attack mitigation model that detects DoS attacks on wireless networks by monitoring packet log data. Based on the type of attack detected, it applies effective corresponding mitigation techniques to restore performance to nodes whose availability has been compromised. Experimental results reveal that the accuracy of the proposed model is 14% higher than that of a baseline anomaly detection model. Further, the appropriate mitigation techniques selected by the proposed system based on the attack type improve average throughput by more than 440% compared to the case without a response.
Funding: supported by the Ministry of Trade, Industry and Energy (MOTIE) under Training Industrial Security Specialist for High-Tech Industry (RS-2024-00415520), supervised by the Korea Institute for Advancement of Technology (KIAT), and by the Ministry of Science and ICT (MSIT) under the ICT Challenge and Advanced Network of HRD (ICAN) Program (No. IITP-2022-RS-2022-00156310), supervised by the Institute of Information & Communication Technology Planning & Evaluation (IITP).
KaKs_Calculator is a software package that calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates through model selection and model averaging. Since existing methods for this estimation adopt their own specific mutation (substitution) models that consider different evolutionary features, leading to diverse estimates, KaKs_Calculator implements a set of candidate models in a maximum likelihood framework and adopts the Akaike information criterion to measure the fit between models and data, aiming to include as many features as needed to accurately capture the evolutionary information in protein-coding sequences. In addition, several existing methods for calculating Ka and Ks are incorporated into the software. KaKs_Calculator, including source code, compiled executables, and documentation, is freely available for academic use at http://evolution.genomics.org.cn/software.htm.
Funding: supported by grants from the Ministry of Science and Technology of China (No. 2001AA231061) and the National Natural Science Foundation of China (No. 30270748).
The performance of six statistical approaches that can be used to select the best model for describing the growth of individual fish was analyzed using simulated and real length-at-age data. The six approaches are the coefficient of determination (R2), adjusted coefficient of determination (adj.-R2), root mean squared error (RMSE), Akaike's information criterion (AIC), the bias-corrected AIC (AICc), and the Bayesian information criterion (BIC). The simulation data were generated by five growth models with different numbers of parameters, and four sets of real data were taken from the literature. The parameters in each of the five growth models were estimated using the maximum likelihood method under the assumption of an additive error structure for the data. The model best supported by the data was identified using each of the six approaches. The results show that R2 and RMSE have the same properties and perform worst. Sample size affects the performance of adj.-R2, AIC, AICc, and BIC: adj.-R2 does better in small samples than in large samples; AIC is not suitable for small samples and tends to select more complex models as the sample size becomes large; and AICc and BIC perform best in the small- and large-sample cases, respectively. Use of AICc or BIC is therefore recommended for selecting a fish growth model, according to the size of the length-at-age data.
Funding: supported by the High Technology Research and Development Program of China (863 Program, No. 2006AA100301).
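The three information criteria compared above can be computed from a least-squares fit as follows, assuming Gaussian additive errors; the parameter counts and RSS values are invented placeholders.

```python
import numpy as np

def ic_scores(rss, n, k):
    """AIC, small-sample AICc, and BIC for a least-squares fit with k
    parameters (Gaussian additive errors, additive constants dropped)."""
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, aicc, bic

# Hypothetical length-at-age fits: a 3-parameter growth model vs. a
# 5-parameter alternative, n = 25 fish (made-up RSS values).
for k, rss in [(3, 410.0), (5, 395.0)]:
    a, ac, b = ic_scores(rss, 25, k)
    print(f"k = {k}: AIC = {a:.2f}, AICc = {ac:.2f}, BIC = {b:.2f}")
```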
A powerful investigative tool in biology is to consider not a single mathematical model but a collection of models designed to explore different working hypotheses, and to select the best model in that collection. In these lecture notes, the usual workflow for using mathematical models to investigate a biological problem is described, and the use of a collection of models is motivated. Models depend on parameters that must be estimated using observations, and when a collection of models is considered, the best model must then be identified based on the available observations. Hence model calibration and selection, which are intrinsically linked, are essential steps of the workflow. Here, procedures for model calibration and a model selection criterion, the Akaike Information Criterion, based on experimental data are described; a rough derivation, practical computation techniques, and the use of this criterion are detailed.
Funding: SP is supported by a Discovery Grant of the Natural Sciences and Engineering Research Council of Canada (RGOIN-2018-04967).
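A practical computation sketch for the criterion described in these notes: AIC from maximized log-likelihoods, with AIC differences and Akaike weights for comparing a model collection; the log-likelihood values below are made up.

```python
import numpy as np

# Three hypothetical calibrated models: (max log-likelihood, #parameters).
models = {"M1": (-152.3, 3), "M2": (-149.8, 5), "M3": (-149.5, 8)}

aic = {m: 2 * k - 2 * ll for m, (ll, k) in models.items()}   # AIC = 2k - 2lnL
delta = {m: a - min(aic.values()) for m, a in aic.items()}   # AIC differences
w = {m: np.exp(-0.5 * d) for m, d in delta.items()}          # relative support
total = sum(w.values())
for m in models:
    print(f"{m}: AIC = {aic[m]:.1f}, dAIC = {delta[m]:.1f}, "
          f"weight = {w[m] / total:.2f}")
```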