The joint lasso: high-dimensional regression for group structured data

Research output: Contribution to journal › Journal article

Published

Standard

The joint lasso: high-dimensional regression for group structured data. / Dondelinger, Frank; Mukherjee, Sach; The Alzheimer's Disease Neuroimaging Initiative.

In: Biostatistics, Vol. 21, No. 2, 01.04.2020, p. 219–235.

Harvard

Dondelinger, F, Mukherjee, S & The Alzheimer's Disease Neuroimaging Initiative 2020, 'The joint lasso: high-dimensional regression for group structured data', Biostatistics, vol. 21, no. 2, pp. 219–235. https://doi.org/10.1093/biostatistics/kxy035

APA

Dondelinger, F., Mukherjee, S., & The Alzheimer's Disease Neuroimaging Initiative (2020). The joint lasso: high-dimensional regression for group structured data. Biostatistics, 21(2), 219–235. https://doi.org/10.1093/biostatistics/kxy035

Vancouver

Dondelinger F, Mukherjee S, The Alzheimer's Disease Neuroimaging Initiative. The joint lasso: high-dimensional regression for group structured data. Biostatistics. 2020 Apr 1;21(2):219–235. https://doi.org/10.1093/biostatistics/kxy035

Author

Dondelinger, Frank; Mukherjee, Sach; The Alzheimer's Disease Neuroimaging Initiative. / The joint lasso: high-dimensional regression for group structured data. In: Biostatistics. 2020; Vol. 21, No. 2. pp. 219–235.

Bibtex

@article{1e8714441b534ee9bfae315325175820,
title = "The joint lasso: high-dimensional regression for group structured data",
abstract = "We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an l1 term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer{\textquoteright}s disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.",
keywords = "Group-structured data, Heterogeneous data, High-dimensional regression, Penalized regression, Information sharing",
author = "Frank Dondelinger and Sach Mukherjee and {The Alzheimer's Disease Neuroimaging Initiative}",
year = "2020",
month = apr,
day = "1",
doi = "10.1093/biostatistics/kxy035",
language = "English",
volume = "21",
pages = "219--235",
journal = "Biostatistics",
issn = "1465-4644",
publisher = "Oxford University Press",
number = "2",

}

RIS

TY - JOUR

T1 - The joint lasso

T2 - high-dimensional regression for group structured data

AU - Dondelinger, Frank

AU - Mukherjee, Sach

AU - The Alzheimer's Disease Neuroimaging Initiative

PY - 2020/4/1

Y1 - 2020/4/1

AB - We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where subsets of samples, representing for example disease subtypes, may differ with respect to underlying regression models. In the high-dimensional setting, estimating a different model for each subgroup is challenging due to limited sample sizes. Focusing on the case in which subgroup-specific models may be expected to be similar but not necessarily identical, we treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an l1 term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer’s disease, amyotrophic lateral sclerosis, and cancer datasets. These examples demonstrate the gains joint estimation can offer in prediction as well as in providing subgroup-specific sparsity patterns.

KW - Group-structured data

KW - Heterogeneous data

KW - High-dimensional regression

KW - Penalized regression

KW - Information sharing

U2 - 10.1093/biostatistics/kxy035

DO - 10.1093/biostatistics/kxy035

M3 - Journal article

VL - 21

SP - 219

EP - 235

JO - Biostatistics

JF - Biostatistics

SN - 1465-4644

IS - 2

ER -