Abstract: This paper presents a novel variable selection method for additive nonparametric regression models. The work is motivated by the need to select both the number of nonparametric components and the number of variables within each component. The proposed method combines hard and soft shrinkage to separately control the number of additive components and the variables within each component. An efficient algorithm is developed to determine the importance of variables and estimate the interaction network. The method performs well on both simulated and real data examples.
Funding: B.K.M., A.B., and D.P. acknowledge support from the NSF through Grant No. NSF CCF-1934904 (TRIPODS). T.Q.K. acknowledges the NSF through Grant No. NSF-DGE-1545403. X.Q. and R.A. acknowledge the NSF through Grant Nos. 1835690 and 2119103 (DMREF). The authors also acknowledge Texas A&M's Vice President for Research for partial support through the X-Grants program. Dr. Prashant Singh (Ames Laboratory) is acknowledged for his DFT calculations of SFE in FCC HEAs. Dr. Anjana Talapatra and Dr. Shahin Boluki are acknowledged for facilitating the BMA Code. DFT calculations of the SFEs were conducted with the computing resources provided by Texas A&M High Performance Research Computing.
Abstract: Bayesian optimization (BO) is an indispensable tool for optimizing objective functions that either have no known functional form or are expensive to evaluate. Currently, optimal experimental design is always conducted within the workflow of BO, leading to more efficient exploration of the design space than traditional strategies. This can have a significant impact on modern scientific discovery, in particular autonomous materials discovery, which can be viewed as an optimization problem aimed at finding the maximum (or minimum) point of the desired materials properties. The performance of BO-based experimental design depends not only on the adopted acquisition function but also on the surrogate models that approximate the underlying objective functions. In this paper, we propose a fully autonomous experimental design framework that uses more adaptive and flexible Bayesian surrogate models in a BO procedure, namely Bayesian multivariate adaptive regression splines and Bayesian additive regression trees. These models can overcome the weaknesses of widely used Gaussian process-based methods when faced with a relatively high-dimensional design space or non-smooth objective functions. Both simulation studies and real-world materials science case studies demonstrate their enhanced search efficiency and robustness.
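The surrogate-plus-acquisition loop described in this abstract can be sketched in a few lines. The following is only an illustration under loud assumptions: a bootstrap ensemble of one-split regression stumps stands in for a tree-based surrogate (it is not BART or MARS, and the ensemble spread is only a rough proxy for posterior uncertainty), the acquisition function is an upper confidence bound, and every name (`fit_stump`, `fit_ensemble`, `ucb`, `sequential_search`, `step_objective`) is hypothetical rather than taken from the paper.

```python
import random
import statistics


def fit_stump(xs, ys):
    """Fit a one-split regression stump to 1-D data and return a predictor."""
    best = None
    # Split at every distinct x except the largest, so both sides are non-empty.
    for split in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        lm, rm = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    if best is None:  # all inputs identical: fall back to a constant predictor
        const = statistics.fmean(ys)
        return lambda x: const
    _, split, lm, rm = best
    return lambda x: lm if x <= split else rm


def fit_ensemble(xs, ys, n_models, rng):
    """Bootstrap ensemble of stumps (assumed stand-in for a tree surrogate)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ucb(models, x, kappa=2.0):
    """Upper confidence bound: ensemble mean plus kappa times ensemble spread."""
    preds = [m(x) for m in models]
    return statistics.fmean(preds) + kappa * statistics.pstdev(preds)


def sequential_search(objective, n_init=5, n_iter=15, n_models=30, seed=0):
    """Surrogate-guided maximization over a fixed grid on [0, 1]."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(101)]
    xs = [rng.choice(grid) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        models = fit_ensemble(xs, ys, n_models, rng)
        unseen = [x for x in grid if x not in xs]
        x_next = max(unseen, key=lambda x: ucb(models, x))
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best], xs, ys


def step_objective(x):
    """Hypothetical non-smooth objective with a narrow plateau."""
    return 1.0 if 0.6 <= x < 0.7 else 0.2 * x


x_best, y_best, xs, ys = sequential_search(step_objective)
print(x_best, y_best)
```

The piecewise-constant surrogate is chosen here only because a step-like objective is the kind of non-smooth pattern the abstract mentions; a real study would substitute a proper Bayesian surrogate and its posterior.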
Abstract: In this paper we present a large-scale, passive positioning system that can be used for approximate localization in Global Positioning System (GPS) denied/spoofed environments. This system can be used for detecting GPS spoofing as well as for initial position estimation for input to other GPS-free positioning and navigation systems like Terrain Contour Matching (TERCOM). Our Location inference through Frequency Modulation (FM) Signal Integration and estimation (LoSI) system is based on broadcast FM radio signals and uses the Received Signal Strength Indicator (RSSI) obtained using a Software Defined Radio (SDR). The RSSI thus obtained is used for indexing into an estimated model of the expected FM spectrum for the entire United States. We show that, with the data acquisition hardware, a single-point resolution of around 3 miles, and the associated algorithms, we are capable of positioning with errors as low as a single pixel (more precisely, around 0.12 mile). The algorithm uses a large-scale model estimation phase that computes the expected FM spectrum in small rectangular cells (realized using geohashes) across the Contiguous United States (CONUS). We define and use Dominant Channel Descriptor (DCD) features, which can be used for positioning using time-varying models. Finally, we use an algorithm based on Euclidean nearest neighbors in the DCD feature space for position estimation. The system first runs a DCD feature detector on the observed spectrum and then solves a subset query formulation to find Inference Candidates (IC). It then uses a simple Euclidean nearest neighbor search on the ICs to localize the observation. We report results on 1500 points across Florida using data and model estimates from 2015 and 2017. We also provide a Bayesian decision-theoretic justification for the nearest neighbor search.
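The three-stage pipeline in this abstract (feature detection, subset query, nearest neighbor search) might look roughly like the sketch below. The abstract does not define the DCD features or the subset query, so this assumes the simplest reading: the DCD is the set of the k strongest channels, and a cell is an inference candidate when its modeled spectrum contains all of those channels. The cell labels, channel frequencies, RSSI values, and function names are all hypothetical.

```python
import math

# Hypothetical model: cell label -> {FM channel in MHz: expected RSSI in dB}.
MODEL = {
    "cell_a": {88.1: -40.0, 93.3: -55.0, 101.5: -70.0},
    "cell_b": {88.1: -60.0, 93.3: -45.0, 101.5: -50.0},
    "cell_c": {90.7: -35.0, 97.9: -50.0},
}

# Hypothetical observation from a receiver at an unknown position.
OBSERVED = {88.1: -58.0, 93.3: -47.0, 101.5: -52.0, 105.1: -95.0}


def dominant_channels(spectrum, k=3):
    """Assumed DCD stand-in: the k channels with the strongest RSSI."""
    ranked = sorted(spectrum, key=spectrum.get, reverse=True)
    return frozenset(ranked[:k])


def inference_candidates(dcd, model):
    """Assumed subset query: cells whose modeled channels include the DCD."""
    return [cell for cell, spec in model.items() if dcd.issubset(spec)]


def localize(observed, model, k=3):
    """Return the candidate cell nearest to the observation, or None."""
    dcd = dominant_channels(observed, k)
    candidates = inference_candidates(dcd, model)
    if not candidates:
        return None
    channels = sorted(dcd)

    def distance(cell):
        # Euclidean distance over the dominant channels only.
        return math.sqrt(
            sum((observed[c] - model[cell][c]) ** 2 for c in channels)
        )

    return min(candidates, key=distance)


print(localize(OBSERVED, MODEL))  # prints "cell_b"
```

The subset query prunes cells that cannot match before any distance is computed, which is the reason for separating it from the nearest neighbor step in this sketch.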