Forecasting travel demand requires a grasp of individual decision-making behavior. However, transport mode choice (TMC) is determined by personal and contextual factors that vary from person to person. Numerous characteristics have a substantial impact on travel behavior (TB), so they must be taken into account when studying transport options. Traditional statistical techniques frequently presume linear relationships, but real-world data rarely satisfy these assumptions, which can obscure the complex interactions involved. To address this, a systematic review was conducted to examine how machine learning (ML) approaches can capture nonlinear relationships that conventional methods may miss. The study analyzes in depth the discrete choice models (DCM), ML algorithms, datasets, model validation strategies, and tuning techniques employed in previous research. The review also summarizes the DCM and ML models used to predict TMC and to identify the determinants of TB in urban areas for different transport modes. The two primary goals of the study are to establish the present conceptual frameworks for the factors influencing TMC for daily activities and to pinpoint methodological issues and limitations in previous research. Based on a total of 39 studies, the findings highlight the importance of considering the factors that influence TMC. Adjusted-kernel algorithms and hyperparameter-optimized ML algorithms outperform the typical ML algorithms. RF (random forest), SVM (support vector machine), ANN (artificial neural network), and interpretable ML algorithms are the most widely used ML algorithms for predicting TMC; RF achieved an R² of 0.95 and SVM achieved an accuracy of 93.18%, while the adjusted kernel raised the accuracy of SVM to 99.81%, which shows that the interpretable algorithms outperformed the typical algorithms. The sensitivity analysis indicates that the most significant parameters influencing TMC are age, total trip time, and the number of drivers.
To find the main factors that influence the urban traffic structure, a relational model between travelers' characteristics and trip mode choice is built. The data on urban residents' characteristics are obtained from statistical data, while the trip mode split data are collected through a trip survey in Bengbu. A discrete choice model is adopted to build the functional relationship between mode choice and travelers' personal characteristics, family characteristics, and trip characteristics. The model shows that the relationship between the mode split and the personal, family, and trip characteristics is stable and changes little over time. The mode split derived from the discrete choice model is relatively accurate and can feasibly be used for forecasting the trip mode structure. Furthermore, the proposed model can help to identify the key factors influencing trip mode choice and to restructure or optimize the urban trip mode structure.
The discrete choice model is one of the most important tools for studies of mode split in transport demand forecasting. Because the different types of discrete choice models each have their own merits and restrictions, selecting the appropriate type for a realistic application remains a difficult problem. This article discusses five typical discrete choice models for transport mode split: the multinomial logit model, the nested logit model (NL), the heteroscedastic extreme value model, the multinomial probit model, and the mixed multinomial logit model (MMNL). The theoretical basis and application attributes of these five models are analysed, and the models are applied to a realistic intercity case of mode split forecasting. The results indicate that the NL model does well in accommodating similarity and heterogeneity across alternatives, while the MMNL model is the most effective method for mode choice prediction: it shows the highest reliability with the smallest prediction errors and outperforms the other four models in handling the heterogeneity and similarity problems. The study indicates that conclusions derived from a single discrete choice model are not reliable, and that it is better to choose the model according to its characteristics.
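As an illustration of the simplest of the five models, the following is a minimal sketch of multinomial logit choice probabilities, assuming the standard form in which the probability of an alternative is proportional to the exponential of its systematic utility. The function name `mnl_probabilities`, the mode names, and the utility values are hypothetical and are not taken from the study.

```python
import math


def mnl_probabilities(utilities):
    """Return multinomial logit choice probabilities.

    `utilities` maps each alternative to its systematic utility V_i;
    the probability of alternative i is exp(V_i) / sum_j exp(V_j).
    """
    # Subtracting the maximum utility avoids overflow and leaves the
    # probabilities unchanged.
    v_max = max(utilities.values())
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


# Hypothetical utilities for three intercity modes (illustrative values only).
utilities = {"rail": -1.0, "bus": -1.5, "air": -0.8}
shares = mnl_probabilities(utilities)
```

Nested, probit, and mixed logit models relax the independence assumptions behind this formula and generally require numerical integration or simulation, which this sketch does not attempt.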
The vibrational performance of wood materials critically affects the acoustic quality of a lute. The purpose of this research was to apply a multiple choice model to predict the quality of musical instruments from data on the vibrational properties of Paulownia wood lute soundboards. In lute production, material selection depends mainly on the subjective evaluation of technicians, which is both inefficient and inaccurate. In this study, nine lutes were fabricated. Using the multiple choice model, the tone quality of each lute was predicted from the soundboard wood vibration data. Compared with the actual values, the values predicted by assigning each observation to the category with the maximum probability contained 22 erroneous judgments. The model precision is 87.78%. The results confirm that the prediction model can be used as a guideline for selecting soundboard wood in musical instrument plants.
This paper investigates the effectiveness of online reviews in addressing the price endogeneity issue, in an application to consumer demand for smartphones. We consider review variables as substitutes for unobserved product quality, which previous methods treat as a scalar variable. An aspect-based sentiment classification technique is designed to construct feature-related review variables from millions of reviews. We discuss the performance of the review variables both in a hedonic pricing model and in a conditional logit discrete choice model. Our results demonstrate that review variables perform well either as instruments for price or as explicit control variables in demand models. In detail, the pricing prediction accuracy increases by 3.4%, which is considered a significant improvement in forecasting practice. In the discrete choice model, the estimated price coefficient is biased in the positive direction without endogeneity correction; it is adjusted in the expected way after the review variables are included. The findings indicate that online reviews provide an alternative source of information for dealing with endogeneity in discrete choice models. We also analyze the differences in the preferences and needs of individual consumers to provide some practical implications for marketing.
Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the user emphasizes sensitivity over specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc, or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
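The abstract does not give the form of the loss function or define its two kinds of sensitivity, so the following is only a hedged sketch of how such a comparison might be set up. It assumes one plain sensitivity (share of true covariates selected), one specificity (share of noise covariates excluded), and a hypothetical weighted loss; the names `selection_metrics`, `weighted_loss`, and `weight` are illustrative assumptions, not the paper's definitions.

```python
def selection_metrics(selected, true_covariates, all_covariates):
    """Return (sensitivity, specificity) of a covariate selection.

    Sensitivity: fraction of truly relevant covariates that were selected.
    Specificity: fraction of irrelevant covariates that were left out.
    """
    selected = set(selected)
    truth = set(true_covariates)
    noise = set(all_covariates) - truth
    sensitivity = len(selected & truth) / len(truth)
    specificity = len(noise - selected) / len(noise)
    return sensitivity, specificity


def weighted_loss(sensitivity, specificity, weight=0.5):
    """Hypothetical loss: `weight` is the emphasis placed on sensitivity."""
    return weight * (1.0 - sensitivity) + (1.0 - weight) * (1.0 - specificity)


# Illustrative simulation outcome: three true covariates, three noise covariates.
all_covariates = ["x1", "x2", "x3", "x4", "x5", "x6"]
true_covariates = ["x1", "x2", "x3"]
selected = ["x1", "x2", "x5"]

sens, spec = selection_metrics(selected, true_covariates, all_covariates)
loss = weighted_loss(sens, spec, weight=0.5)
```

Varying `weight` mirrors the finding that the preferred method depends on how strongly the user favors sensitivity over specificity.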
The electrification of vehicles is considered one of the most important strategies for addressing energy dependence and climate change. To meet user needs, management of electric vehicle (EV) charging operations is essential. This study uses modelling and simulation of EV user behaviour to forecast possible scenarios for electric charging in cities and to identify potential management problems and opportunities for improving EVs and EV charging infrastructure. The conurbation of Turin was selected as a case study to reproduce realistic scenarios by applying discrete choice modelling based on socio-economic and transport system data. One objective of the study was to describe user charging behaviour from a geographic perspective, modelling where users in the study area prefer to charge according to the variables that may affect their decisions. Another objective was to estimate the number of electric vehicles in Turin and the characteristics of their users, both of which help in understanding electric mobility within a city. Analysing these behavioural issues in a modelling framework can provide a set of tools to compare and evaluate a variety of possible modifications, indicating an adequate network of charging infrastructure to facilitate the diffusion of electric vehicles.
We study the pricing game between competing retailers under various random coefficient attraction choice models. We characterize the existence conditions and structural properties of the equilibrium. Moreover, we explore how the randomness and cost parameters affect the equilibrium prices and profits under multinomial logit (MNL), multiplicative competitive interaction (MCI), and linear attraction choice models. Specifically, with bounded randomness, the randomness always reduces the retailer's profit under the MCI and linear attraction models. Under the MNL model, however, the effect of randomness depends on the product's value gap: for high-end products (i.e., those whose value gap is higher than a threshold), the randomness reduces the equilibrium profit, and vice versa. The results suggest that high-end retailers in MNL markets should exert more effort in disclosing their exact product performance to consumers. We also reveal the effects of randomness on retailers' pricing decisions. These results help retailers make product performance disclosure and pricing decisions.
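For readers unfamiliar with attraction choice models, the sketch below shows the deterministic market-share calculation they share, under commonly used textbook forms of the MNL, MCI, and linear attraction functions. These functional forms, the outside-option attraction of 1, and all parameter values are assumptions for illustration; the paper's random coefficient specification is not reproduced here.

```python
import math


def mnl_attraction(alpha, beta, price):
    """MNL attraction, assumed form exp(alpha - beta * price)."""
    return math.exp(alpha - beta * price)


def mci_attraction(alpha, beta, price):
    """MCI attraction, assumed form alpha * price**(-beta), for price > 0."""
    return alpha * price ** (-beta)


def linear_attraction(alpha, beta, price):
    """Linear attraction, assumed form alpha - beta * price, floored at zero."""
    return max(alpha - beta * price, 0.0)


def attraction_shares(attractions, outside=1.0):
    """Market share of each retailer: a_i / (outside + sum_j a_j)."""
    total = outside + sum(attractions)
    return [a / total for a in attractions]


# Two hypothetical retailers under the MNL form (illustrative parameters).
base = attraction_shares([mnl_attraction(2.0, 1.0, 1.0),
                          mnl_attraction(1.5, 1.0, 1.0)])
# The same market after retailer 0 raises its price from 1.0 to 1.5.
raised = attraction_shares([mnl_attraction(2.0, 1.0, 1.5),
                            mnl_attraction(1.5, 1.0, 1.0)])
```

In this setup a retailer's own price increase lowers its own share and raises its rival's, which is the basic mechanism behind the pricing game.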
In this paper, we propose a gradient descent method to estimate the parameters of a Markov chain choice model. In particular, we derive a closed-form formula for the gradient of the log-likelihood function and show the convergence of the algorithm. Numerical experiments verify the efficiency of our approach by comparison with the expectation-maximization algorithm. We show that a similar result can be extended to the more general case in which no-purchase data are not observed.
This paper proposes a framework to analyse the impact of an online travel agency (OTA) when it enters a market originally served by a traditional travel agency (TTA). Based on the multinomial logit choice model, a demand model and a profit model are presented. Then the demand squeeze, the total demand increase, and the cooperation range of the wholesale price are analysed. The results indicate that: (1) the OTA can increase the demand of the whole market while it squeezes the demand of the TTA; (2) the demand squeeze, the total demand increase, and the cooperation range of the wholesale price all increase with the perceived value of the OTA and decrease with the perceived value of the TTA; (3) the more immature the market is, the more necessary it is for the TTA to cooperate with the OTA. In addition, a numerical example and a sensitivity analysis of perceived value and price are presented to illustrate the demand squeeze, the demand increase, and the cooperation range of the wholesale price.
Social networks like Facebook, X (Twitter), and LinkedIn provide an interaction and communication environment for users to generate and share content, allowing for the observation of social behaviours in the digital world. These networks can be viewed as a collection of nodes and edges, where users and their interactions are represented as nodes and the connections between them as edges. Understanding the factors that contribute to the formation of these edges is important for studying network structure and processes. This knowledge can be applied to various areas such as identifying communities, recommending friends, and targeting online advertisements. Several factors, including node popularity and friends-of-friends relationships, influence edge formation and network growth. This research focuses on the temporal activity of nodes and its impact on edge formation. Specifically, the study examines how the minimum age of friends-of-friends edges and the average age of all edges connected to potential target nodes influence the formation of network edges. Discrete choice analysis is used to analyse the combined effect of these temporal factors and other well-known attributes like node degree (i.e., the number of connections a node has) and network distance between nodes. The findings reveal that temporal properties have a similar impact as network proximity in predicting the creation of links. By incorporating temporal features into the models, the accuracy of link prediction can be further improved.
This paper analyzes the characteristics of the destination distribution of trips and proposes a stratified sampling strategy for travel mode choice. The stratified sampling strategy reduces the size of the alternative set and thus decreases the computational burden of simulation. Using the stratified sampling strategy, a combined choice model of trip mode and destination is developed based on Bayesian theory. Simulations are carried out to verify the proposed model. The results show that the combined choice model of trip mode and destination can efficiently simulate travelers' choice behaviors. Furthermore, the forecasting accuracy of the combined choice model is higher than that of the gravity model. Therefore, the proposed model is a powerful tool for analyzing travelers' behavior in selecting the trip mode.
Payment for Ecosystem Services (PES) has been widely acknowledged as an effective tool for mitigating grassland degradation and enhancing the provision of ecosystem services. However, critical factors, such as herders' willingness to accept (WTA) preferences and compensation expectations, are often overlooked, leading to insufficient effectiveness of PES initiatives. This study focused on the grassland ecological compensation policy (GECP), quantifying herders' WTA compensation for grassland grazing bans. Through face-to-face surveys and the contingent valuation method, we estimated households' WTA for participating in a grassland conservation program to bolster ecosystem service provision. Our findings indicated that herders required an average compensation of 237 CNY mu⁻¹ yr⁻¹ to engage in the grazing ban program. Notably, herders' environmental awareness positively influenced their willingness to participate, whereas larger family sizes were negatively correlated with WTA. Additionally, herders in better health, with higher livestock incomes, or categorized as semi-herders tended to accept lower compensation levels. These insights are crucial for improving the effectiveness of the GECP and provide valuable reference points for similar analyses in economically disadvantaged and ecologically fragile regions.
Success or failure of an e-commerce platform is often reduced to its ability to maximize the conversion rate of its visitors, commonly regarded as the capacity to induce a purchase from a visitor. Visitors possess individual characteristics, histories, and objectives, which complicate the choice of the platform features that maximize the conversion rate. Modern web technology has made clickstream data accessible, allowing a complete record of a visitor's actions on a website to be analyzed. What remains poorly constrained is which parts of the clickstream data are meaningful information and which parts are incidental for the problem of platform design. In this research, clickstream data from an online retailer were examined to demonstrate how statistical modeling can improve the use of clickstream information. A conceptual model was developed that conjectured relationships between visitor and platform variables, visitors' platform exit rate, bounce rate, and decision to purchase. Several hypotheses on the nature of the clickstream relationships were posited and tested with the models. A discrete choice logit model showed that the content of a website, the history of website use, and the exit rate of pages visited had marginal effects on the derived utility for the visitor. Exit rate and bounce rate were modeled as beta-distributed random variables. Exit rate and its variability for pages visited were found to be associated with site content, site quality, prior visitor history on the site, and the technological preferences of the visitor. Bounce rate was also found to be influenced by the same factors, but in a direction opposite to the registered hypotheses. Most findings supported the view that clickstream data are amenable to statistical modeling with interpretable and comprehensible models.
The aim of this work is to explore the impact of regional transit service on tour-based commuter travel behavior by using the Bayesian hierarchical multinomial logit model, accounting for the spatial heterogeneity of people living in the same area. The regional transit service is captured by two indicators measured at the zone level, accessibility and connectivity, and is then related to travel mode choice behavior. The sample data are selected from the 2007 Washington-Baltimore Household Travel Survey and include all trips from home to workplace in the morning hours in Baltimore city. A traditional multinomial logit model is also estimated using the Bayesian approach. A comparison of the two models shows that ignoring the spatial context can lead to a misspecification of the effects of the regional transit service on travel behavior. The results reveal that improving transit service at the regional level can be effective in reducing auto use by commuters after controlling for socio-demographic and travel-related factors. This work provides insights for interpreting tour-based commuter travel behavior by using recently developed methodological approaches. The results will help engineers, urban planners, and transit operators decide how to improve regional transit service and spatial location efficiently.
The aim of this paper is to analyse the methodology for selecting the best model for forecasting fuelwood demand in Greece for the years 2020, 2025, and 2030, with the final objective of supporting decision-making in the forest bioenergy sector. A complete time series of historical data exists for (a) the consumption of fuelwood and (b) the six most important independent variables that could influence the consumption of fuelwood; the data cover the period 1989-2010. The evaluation and choice of the best model was carried out with the help of the following six statistical criteria: (a) the size of the standard error (S.E.) of the theoretical values of the dependent variable; (b) the value of the adjusted R square (R²); (c) the absence of autocorrelation among the residuals (eᵢ), assessed with the Durbin-Watson criterion; (d) the statistical significance of the model coefficients, assessed with the t criterion; (e) the statistical significance of the models, assessed with the F criterion; and (f) the absence of multicollinearity, assessed with the values of the Variance Inflation Factor.
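Three of the six criteria have simple closed forms. The sketch below computes them in plain Python using their standard textbook definitions; the function names and the numeric inputs are illustrative assumptions, with only the 22 annual observations (1989-2010) and the six independent variables taken from the abstract.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of
    the residuals divided by their sum of squares. Values near 2 suggest
    no first-order autocorrelation."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def variance_inflation_factor(r2_j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    predictor j on the remaining predictors."""
    return 1.0 / (1.0 - r2_j)


# 22 annual observations and 6 predictors as in the abstract; the R^2 of
# 0.9 and the residual series are hypothetical.
adj = adjusted_r2(0.9, n_obs=22, n_predictors=6)
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
vif = variance_inflation_factor(0.8)
```

With only 22 observations and six predictors, the adjustment to R² is noticeable, which is one reason to report the adjusted value when comparing candidate models.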
Purpose: This paper aims to analyze the factors that influence information inequality in the suburban areas of Shanghai in an effort to better understand information inequality and find ways to reduce it. Design/methodology/approach: A survey was conducted to gather data from rural residents who attended the Shanghai information and communication technology (ICT) training courses, and the data analysis was based on the 1,200 valid questionnaires retrieved. Using the discrete choice model, we studied the impacts on information inequality in suburban Shanghai of individual attributes such as gender, age, educational level, and occupation, and of factors of information inequality such as information skill and the purpose of using information technology (IT). Findings: The most critical factors affecting information inequality among Shanghai suburban residents are educational level and information skill, followed by age and the purpose of using IT. The results show that the purpose of using IT and information skill are the two main aspects of information inequality among Shanghai suburban residents. Differences between individuals, especially in educational level and age, are identified as the underlying causes of the information inequality. Research limitations: Subjects in the sample were limited to those who received training in the Shanghai rural ICT training project. Such a sample limits the generality of the study findings. Practical implications: The study will help enhance our understanding of information inequality and find ways to reduce it. Originality/value: Most previous studies on information inequality focused on theoretical discussions. This study adds to the limited empirical research on information inequality and also provides some insights into ways to reduce it.
With the advancement of autonomous driving techniques, autonomous demand-responsive transit (ADRT) is a newly emerging sustainable transport mode that will provide more flexible services to public users. ADRT offers benefits such as flexible stops and routes and comfortable seats, but it also involves risks because the vehicles are driverless. This paper investigates users' preferences and attitudes towards ADRT, and mode choice behavior between ADRT buses and traditional buses. A survey with Likert scale statements and stated preference (SP) choice scenarios is designed and conducted to explore users' attitudes towards the safety risks of autonomous vehicles (AVs), social concerns, service flexibility concerns when using AVs, interest in new things, and shuttle mode choices. An integrated choice and latent variable (ICLV) model is adopted to explore users' psychological factors through latent variables and to integrate them into mode choice behavioral modeling. The estimated results indicate that users' attitudes towards AV safety risks, their social concerns, and their flexibility concerns with ADRT strongly influence their mode choices and are strongly related to sociodemographic and travel-related factors such as age, gender, income, education, and number of family members. In general, a young age, a high education level, a higher income, private car ownership, and better knowledge of AVs are positively related to attitudes towards ADRT. Females, users from large families, and users with driving licenses or long commuting times are less willing to adopt ADRT. The study's outcomes highlight significant heterogeneities among users and can be highly valuable for policymakers, such as government authorities, in providing social support and designing policies targeting specific population groups. This will be beneficial in attracting more users to this emerging mobility service and in contributing to sustainable urban development.
Funding: The National Natural Science Foundation of China (Nos. 50738001, 51078086)
Funding: Supported by the Science & Technology Pillar Project (No. 0556) of Guangzhou
Funding: Financially supported by the Natural Science Foundation of China (NSFC) through Grant Number 30972300.
Abstract: The vibrational performance of wood materials critically affects the acoustic quality of a lute. The purpose of this research was to apply a multiple choice model to predict the quality of musical instruments based on data on the vibrational properties of Paulownia wood lute soundboards. In lute production, material selection depends mainly on the subjective evaluation of technicians, which is not only inefficient but also inaccurate. In this study, nine lutes were fabricated. Using the multiple choice model, lute tone quality was predicted from the soundboard wood vibration data. Compared with the actual values, the predicted values, taken as the category with the maximum probability for each observation, contained 22 erroneous judgments. The model precision is 87.78%. The results confirmed that the prediction model can be used as a guideline for selecting soundboard wood in musical instrument plants.
Abstract: This paper investigates the effectiveness of online reviews in addressing the price endogeneity issue, in an application to consumer demand for smartphones. We consider review variables as substitutes for the unobserved product quality that previous methods represent as a scalar variable. An aspect-based sentiment classification technique is designed to construct feature-related review variables from millions of reviews. We discuss the performance of the review variables in both a hedonic pricing model and a conditional logit discrete choice model. Our results demonstrate that review variables perform well either as instruments for price or as explicit control variables in demand models. In detail, the pricing prediction accuracy increases by 3.4%, which is considered a significant improvement in forecasting practice. In the discrete choice model, the estimated price coefficient is biased in the positive direction without endogeneity correction, and it is adjusted in the expected way after the review variables are included. The findings indicate that online reviews provide an alternative source of information for dealing with endogeneity in discrete choice models. We also analyze differences in the preferences and needs of individual consumers to provide some practical implications for marketing.
Abstract: Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc, or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data. Thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
Funding: This work was partially supported by the EU Horizon 2020 project "INCIT-EV", Grant agreement ID: 875683.
Abstract: The electrification of vehicles is considered one of the most important strategies for addressing the issues related to energy dependence and climate change. To meet user needs, electric vehicle (EV) management for charging operations is essential. This study uses modelling and simulation of EV user behaviour to forecast possible scenarios for electric charging in cities and to identify potential management problems and opportunities for improving EVs and EV charging infrastructure. The conurbation of Turin was selected as a case study to reproduce realistic scenarios by applying discrete choice modelling based on socio-economic and transport system data. One objective of the study was to describe user charging behaviour from a geographic perspective, modelling where users prefer to charge in the study area according to the variables that may affect their decisions. Another objective was to estimate the number of electric vehicles in Turin and the characteristics of their users, both of which are helpful in understanding electric mobility within a city. Analysing these behavioural issues in a modelling framework can provide a set of tools to compare and evaluate a variety of possible modifications, indicating an adequate network of charging infrastructure to facilitate the diffusion of electric vehicles.
Funding: Partially supported by the National Natural Science Foundation of China (No. 72001198 and Nos. 71991464/71991460), the Fundamental Research Funds for the Central Universities (No. WK2040000027), the National Key R&D Program of China (Nos. 2020AAA0103804/2020AAA0103800), the USTC (University of Science and Technology of China) Research Funds of the Double First-Class Initiative (No. YD2040002004), the Collaborative Research Fund (No. C1143-20G), and the General Research Fund (No. 115080/17).
Abstract: We study the pricing game between competing retailers under various random coefficient attraction choice models. We characterize the existence conditions and structural properties of the equilibrium. Moreover, we explore how the randomness and cost parameters affect the equilibrium prices and profits under the multinomial logit (MNL), multiplicative competitive interaction (MCI), and linear attraction choice models. Specifically, with bounded randomness, for the MCI and linear attraction models, the randomness always reduces the retailer's profit. However, for the MNL model, the effect of randomness depends on the product's value gap. For high-end products (i.e., those whose value gap is higher than a threshold), the randomness reduces the equilibrium profit, and vice versa. The results suggest that high-end retailers in MNL markets exert more effort in disclosing their exact product performance to consumers. We also reveal the effects of randomness on retailers' pricing decisions. These results help retailers make product performance disclosure and pricing decisions.
Abstract: In this paper, we propose a gradient descent method to estimate the parameters of a Markov chain choice model. In particular, we derive a closed-form formula for the gradient of the log-likelihood function and show the convergence of the algorithm. Numerical experiments verify the efficiency of our approach by comparison with the expectation-maximization algorithm. We show that a similar result extends to the more general case in which no-purchase data are not observed.
Funding: High-Level Talents Project of Hainan Natural Science Foundation (No. 2019RC037), the National Natural Science Foundation of China (Nos. 71761009, 71461006, and 71461007), the Hainan Province Planning Program of Philosophy and Social Science (Nos. HNSK(YB)19-06 and HNSK(YB)19-11), and the Scientific Research Project of the Hainan Department of Education (Nos. HNKY2020ZD-6 and HNKY2019ZD-10).
Abstract: This paper proposes a framework to analyse the impact of an online travel agency (OTA) when it enters the original market of a traditional travel agency (TTA). Based on the multinomial logit choice model, the demand model and the profit model are presented. Then, the demand squeeze, the total demand increase, and the cooperation range of the wholesale price are analysed. The results indicate that: (1) the OTA can increase the demand of the whole market while it squeezes the demand of the TTA; (2) the demand squeeze, the total demand increase, and the cooperation range of the wholesale price all increase with the perceived value from the OTA and decrease with the perceived value from the TTA; (3) the more immature the market is, the more necessary it is for the TTA to cooperate with the OTA. In addition, a numerical example and a sensitivity analysis of perceived value and price are presented to illustrate the demand squeeze, the demand increase, and the cooperation range of the wholesale price.
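The demand squeeze and total demand increase in result (1) can be sketched with a generic multinomial logit market that includes a no-purchase option. This is only a minimal illustration under assumed numbers, not the paper's demand or profit model: the net values (perceived value minus price) and the helper name `mnl_demand` are hypothetical.

```python
import math

def mnl_demand(net_values):
    """MNL purchase probabilities with a no-purchase option whose utility is 0."""
    weights = {name: math.exp(v) for name, v in net_values.items()}
    denom = 1.0 + sum(weights.values())  # the 1.0 is exp(0) for no purchase
    return {name: w / denom for name, w in weights.items()}

# Hypothetical net values (perceived value minus price), for illustration only.
before = mnl_demand({"TTA": 1.0})             # TTA alone in the market
after = mnl_demand({"TTA": 1.0, "OTA": 0.5})  # OTA enters

tta_squeeze = before["TTA"] - after["TTA"]                 # demand lost by the TTA
total_increase = sum(after.values()) - sum(before.values())  # growth of the whole market

# A higher perceived value from the OTA enlarges both effects in this sketch.
after_high = mnl_demand({"TTA": 1.0, "OTA": 1.0})
tta_squeeze_high = before["TTA"] - after_high["TTA"]
total_increase_high = sum(after_high.values()) - sum(before.values())
```

In this toy setting both `tta_squeeze` and `total_increase` are positive, and both grow when the OTA's net value rises.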
Abstract: Social networks like Facebook, X (Twitter), and LinkedIn provide an interaction and communication environment for users to generate and share content, allowing for the observation of social behaviours in the digital world. These networks can be viewed as a collection of nodes and edges, where users are represented as nodes and the connections between them as edges. Understanding the factors that contribute to the formation of these edges is important for studying network structure and processes. This knowledge can be applied to various areas such as identifying communities, recommending friends, and targeting online advertisements. Several factors, including node popularity and friends-of-friends relationships, influence edge formation and network growth. This research focuses on the temporal activity of nodes and its impact on edge formation. Specifically, the study examines how the minimum age of friends-of-friends edges and the average age of all edges connected to potential target nodes influence the formation of network edges. Discrete choice analysis is used to analyse the combined effect of these temporal factors and other well-known attributes like node degree (i.e., the number of connections a node has) and network distance between nodes. The findings reveal that temporal properties have a similar impact to network proximity in predicting the creation of links. By incorporating temporal features into the models, the accuracy of link prediction can be further improved.
Abstract: This paper analyzes the characteristics of the destination distribution of trips and proposes a stratified sampling strategy for travel mode choice. The stratified sampling strategy reduces the size of the alternative set and thus decreases the computational burden of simulation. Using the stratified sampling strategy, a combined choice model of trip mode and destination is developed based on Bayesian theory. Simulations are carried out to verify the proposed model. The results show that the combined choice model of trip mode and destination can efficiently simulate travelers' choice behaviors. Furthermore, the forecasting accuracy of the combined choice model is higher than that of the gravity model. Therefore, the proposed model is a powerful tool for analyzing travelers' behaviors in selecting the trip mode.
Funding: Supported by the National Natural Science Foundation of China (71934003, 72322008, and 72348003).
Abstract: Payment for Ecosystem Services (PES) has been widely acknowledged as an effective tool for mitigating grassland degradation and enhancing ecosystem services provision. However, critical factors, such as herders' willingness to accept (WTA) preferences and compensation expectations, are often overlooked, leading to insufficient effectiveness of PES initiatives. This study focused on the grassland ecological compensation policy (GECP), quantifying herders' WTA compensation for grassland grazing bans. Through face-to-face surveys and the contingent valuation method, we estimated households' WTA for participating in a grassland conservation program to bolster ecosystem service provision. Our findings indicated that herders required an average compensation of 237 CNY per mu per year to engage in the grazing ban program. Notably, herders' environmental awareness positively influenced their willingness to participate, whereas larger family sizes were negatively correlated with WTA. Additionally, herders in better health, with higher livestock incomes, or categorized as semi-herders tended to accept lower compensation levels. These insights are crucial for improving the effectiveness of GECP and provide valuable reference points for similar analyses in economically disadvantaged and ecologically fragile regions.
Abstract: The success or failure of an e-commerce platform is often reduced to its ability to maximize the conversion rate of its visitors, commonly regarded as the capacity to induce a purchase from a visitor. Visitors possess individual characteristics, histories, and objectives, which complicate the choice of which platform features maximize the conversion rate. Modern web technology has made clickstream data accessible, allowing a complete record of a visitor's actions on a website to be analyzed. What remains poorly constrained is which parts of the clickstream data are meaningful information and which parts are incidental for the problem of platform design. In this research, clickstream data from an online retailer were examined to demonstrate how statistical modeling can improve the use of clickstream information. A conceptual model was developed that conjectured relationships between visitor and platform variables, visitors' platform exit rate, bounce rate, and decision to purchase. Several hypotheses on the nature of the clickstream relationships were posited and tested with the models. A discrete choice logit model showed that the content of a website, the history of website use, and the exit rate of pages visited had marginal effects on derived utility for the visitor. Exit rate and bounce rate were modeled as beta-distributed random variables. It was found that exit rate and its variability for pages visited were associated with site content, site quality, prior visitor history on the site, and technological preferences of the visitor. Bounce rate was also found to be influenced by the same factors, but in a direction opposite to the registered hypotheses. Most findings supported the view that clickstream data are amenable to statistical modeling with interpretable and comprehensible models.
Funding: Project (71173061) supported by the National Natural Science Foundation of China; Project (2013U-6) supported by the Key Laboratory of Eco Planning & Green Building, Ministry of Education (Tsinghua University), China.
Abstract: The aim of this work is to explore the impact of regional transit service on tour-based commuter travel behavior using a Bayesian hierarchical multinomial logit model that accounts for the spatial heterogeneity of people living in the same area. The regional transit service is captured with two indicators, accessibility and connectivity, measured at the zone level, and is then related to travel mode choice behavior. The sample data are selected from the 2007 Washington-Baltimore Household Travel Survey and include all trips from home to workplace in the morning hours in Baltimore city. A traditional multinomial logit model using the Bayesian approach is also estimated. A comparison of the two models shows that ignoring the spatial context can lead to a misspecification of the effects of regional transit service on travel behavior. The results reveal that improving transit service at the regional level can be effective in reducing auto use for commuters after controlling for socio-demographic and travel-related factors. This work provides insights for interpreting tour-based commuter travel behavior using recently developed methodological approaches. The results will help engineers, urban planners, and transit operators decide where regional transit service and spatial location need to be improved efficiently.
Abstract: The aim of this paper is to analyse the methodology for selecting the best model for forecasting fuelwood demand in Greece for the years 2020, 2025, and 2030, with the final objective of supporting decision-making in the forest bioenergy sector. A complete time series of historical data is available for (a) the consumption of fuelwood and (b) the six most important independent variables that could influence the consumption of fuelwood, covering the period 1989-2010. The evaluation and choice of the best model were carried out with the following six statistical criteria: (a) the size of the standard error (S.E.) of the theoretical values of the dependent variable; (b) the value of the adjusted R-squared (R²); (c) the absence of autocorrelation among the residuals (e_i), assessed with the Durbin-Watson criterion; (d) the statistical significance of the model coefficients, assessed with the t criterion; (e) the statistical significance of the models, assessed with the F criterion; and (f) the absence of multicollinearity, assessed with the values of the Variance Inflation Factor.
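Three of the criteria listed above have simple closed forms, sketched below using the standard textbook definitions. The function names and the numeric inputs (an R² of 0.9, the residual sequence, an auxiliary R² of 0.8) are hypothetical and not taken from the paper; the sample size of 22 assumes one annual observation for each year from 1989 to 2010, with six predictors as stated in the abstract.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared, penalising R-squared for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

def vif(r2_j):
    """Variance Inflation Factor from the R-squared of predictor j regressed on the others."""
    return 1.0 / (1.0 - r2_j)

# Hypothetical inputs, for illustration only.
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])           # alternating residuals
adj = adjusted_r2(r2=0.9, n_obs=22, n_predictors=6)  # 22 assumed annual observations
v = vif(0.8)
```

The alternating residuals give a Durbin-Watson value well above 2, which points to negative autocorrelation; a model passing criterion (c) would sit close to 2.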
Abstract: Purpose: This paper aims to analyze the factors that influence information inequality in the suburban areas of Shanghai in an effort to better understand information inequality and find ways to reduce it. Design/methodology/approach: A survey was conducted to gather data from rural people who received the Shanghai information and communication technology (ICT) training courses, and the data analysis was based on the 1,200 valid questionnaires retrieved. Using a discrete choice model, we studied the impacts on information inequality in suburban Shanghai of individual attributes such as gender, age, educational level, and occupation, and of factors of information inequality such as information skill and the purpose of using information technology (IT). Findings: The most critical factors affecting the information inequality of Shanghai suburban residents are educational level and information skill, followed by age and the purpose of using IT. The results show that the purpose of using IT and information skill are the two main aspects of information inequality among Shanghai suburban residents. Differences between individuals, especially in educational level and age, are identified as the underlying causes of the information inequality. Research limitations: Subjects in the sample were limited to those who received training in the Shanghai rural ICT training project. Such a sample limits the generality of the study findings. Practical implications: The study will help enhance our understanding of information inequality and find ways to reduce it. Originality/value: Most previous studies on information inequality focused on theoretical discussions. This study adds to the limited empirical research on information inequality and also provides some insights into ways to reduce it.
Funding: Supported by the National Key R&D Program of China (No. 2019YFE0108300) and the National Natural Science Foundation of China (No. 71971162).
Abstract: With the advancement of autonomous driving techniques, autonomous demand-responsive transit (ADRT) is a newly emerging sustainable transport mode for the future, which will provide more flexible services to public users. ADRT offers benefits such as flexible stops and routes and comfortable seats, but it also involves risks because the vehicles are driverless. This paper investigates users' preferences and attitudes towards ADRT, and mode choice behavior between ADRT buses and traditional buses. A survey with Likert scale statements and stated preference (SP) choice scenarios is designed and conducted to explore users' attitudes towards the safety risks of autonomous vehicles (AVs), social concerns, service flexibility concerns when using AVs, interest in new things, and shuttle mode choices. An integrated choice and latent variable (ICLV) model is adopted to explore users' psychological factors through latent variables and to integrate them into mode choice behavioral modeling. The estimated results indicate that users' attitudes towards AV safety risks, their social concerns, and their flexibility concerns with ADRT strongly influence their mode choices and are strongly related to sociodemographic and travel-related factors such as age, gender, income, education, and number of family members. In general, a young age, a high education level, a higher income, private car ownership, and better knowledge of AVs are positively related to attitudes towards ADRT. Females, users from large families, and users with driving licenses or long commuting times are less willing to adopt ADRT. The study's outcomes highlight significant heterogeneities among users and can be highly valuable for policymakers, such as government authorities, in providing social support and designing policies targeting specific population groups. This will be beneficial in attracting more users to this emerging mobility service and in contributing to sustainable urban development.