Robust Bayesian nonparametric variable selection for linear regression

Text available via DOI:

https://doi.org/10.1002/sta4.696
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

Dirichlet process, heteroskedasticity, horseshoe, spike-and-slab, variable selection

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Robust Bayesian nonparametric variable selection for linear regression. / Cabezas, Alberto; Battiston, Marco; Nemeth, Christopher.
In: Stat, Vol. 13, No. 2, e696, 27.05.2024.

Bibtex

@article{714adde00d974aa895099b636b1d402e,

title = "Robust Bayesian nonparametric variable selection for linear regression",

abstract = "Spike-and-slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real-world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed-form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we present how to extend the model to account for heavy-tailed response variables. The model's performance is tested against competing algorithms on synthetic and real-world datasets.",

keywords = "Dirichlet process, heteroskedasticity, horseshoe, spike-and-slab, variable selection",

author = "Alberto Cabezas and Marco Battiston and Christopher Nemeth",

note = "Publisher Copyright: {\textcopyright} 2024 The Author(s). Stat published by John Wiley \& Sons Ltd.",

year = "2024",

month = may,

day = "27",

doi = "10.1002/sta4.696",

language = "English",

volume = "13",

journal = "Stat",

issn = "2049-1573",

publisher = "Wiley-Blackwell Publishing Ltd",

number = "2",

}

RIS

TY - JOUR

T1 - Robust Bayesian nonparametric variable selection for linear regression

AU - Cabezas, Alberto

AU - Battiston, Marco

AU - Nemeth, Christopher

PY - 2024/5/27

Y1 - 2024/5/27

N2 - Spike-and-slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real-world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed-form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we present how to extend the model to account for heavy-tailed response variables. The model's performance is tested against competing algorithms on synthetic and real-world datasets.

AB - Spike-and-slab and horseshoe regressions are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real-world statistics and machine learning applications. This work proposes a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed-form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we present how to extend the model to account for heavy-tailed response variables. The model's performance is tested against competing algorithms on synthetic and real-world datasets.

KW - Dirichlet process

KW - heteroskedasticity

KW - horseshoe

KW - spike-and-slab

KW - variable selection

U2 - 10.1002/sta4.696

DO - 10.1002/sta4.696

M3 - Journal article

AN - SCOPUS:85194542688

VL - 13

JO - Stat

JF - Stat

SN - 2049-1573

IS - 2

M1 - e696

ER -
