- PhD Dissertation
Final published version, 859 KB, PDF document

Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Research output: Thesis › Doctoral Thesis

Published

**Essays on nonparametric inference and instrument selection.** / Kang, Byunghoon.

Kang, B 2016, 'Essays on nonparametric inference and instrument selection', PhD, University of Wisconsin-Madison, Ann Arbor.

Kang, B. (2016). *Essays on nonparametric inference and instrument selection*. University of Wisconsin-Madison.

Kang B. Essays on nonparametric inference and instrument selection. Ann Arbor: University of Wisconsin-Madison, 2016. 149 p.

@phdthesis{1438931d80314d5e8873a0e7795a8e6a,

title = "Essays on nonparametric inference and instrument selection",

abstract = "My dissertation consists of two chapters on nonparametric inference and model selection in econometric models. Researchers in economics and social science need reliable models and statistical tools to quantify economic relationships and the uncertainty associated with data. In practice, researchers often evaluate their objects of interest under various specifications in the first stage of analysis, or select a model by some criterion. Unfortunately, commonly used statistical methods may fail to account for the uncertainty inherent in the first-step specification search. Moreover, some existing model selection criteria may be fragile to model misspecification. All these methods can lead to misleading conclusions without valid, robust corrections. To quantify and test economic theories more accurately in such cases, researchers and policy makers need more reliable and robust methods. My research investigates these issues and provides practical methods for empirical research with rigorous theoretical justification. The first chapter provides new inference methods for nonparametric series regression with a data-dependent number of series terms. Nonparametric series estimation has grown in popularity because it offers a flexible way to address potential misspecification of a parametric model. However, implementation requires choosing the number of series terms, and estimation and inference may depend heavily on that choice. Existing asymptotic theory for inference in nonparametric series estimation typically imposes an undersmoothing condition: the number of series terms must be sufficiently large to make the bias asymptotically negligible. However, there is no formally justified data-dependent method for this in practice. This chapter constructs inference methods for nonparametric series regression models and introduces tests based on the infimum of t-statistics over different numbers of series terms. 
First, I provide a uniform asymptotic theory for the t-statistic process indexed by the number of series terms. Using this result, I show that the test based on the infimum of the t-statistics and its asymptotic critical value controls the asymptotic size under the undersmoothing condition. A valid confidence interval (CI) with correct asymptotic coverage probability can then be constructed by inverting the test statistic. Even when asymptotic bias terms are present, without the undersmoothing condition, I show that the CI based on the infimum of the t-statistics bounds the coverage distortions. In an illustrative example, nonparametric estimation of the wage elasticity of expected labor supply from Blomquist and Newey (2002), the proposed CI is close to or tighter than those based on existing methods with possibly ad hoc choices of series terms. The second chapter provides instrument selection criteria for the instrumental variable (IV) regression model when there is a large set of instruments with potential invalidity. Economic applications of IV models sometimes involve large sets of potential instruments and debates about their validity. Existing methods for instrument selection are largely based on an a priori assumption of an instrument{\textquoteright}s validity and/or on first-order asymptotics, which may lead to large finite-sample bias with many and invalid instruments. First, I derive higher-order mean square error (MSE) approximations for the two-stage least squares (2SLS), limited information maximum likelihood (LIML), modified Fuller (FULL), and bias-adjusted 2SLS (B2SLS) estimators, allowing for locally invalid instruments. Based on the approximation to the higher-order MSE, I propose invalidity-robust instrument selection criteria (IRC) that capture two sources of finite-sample bias at the same time: bias from using many instruments and bias from invalid instruments. 
I also show an optimality result for the choice of instruments based on the criteria of Donald and Newey (2001) under a certain specification of locally invalid instruments.",

keywords = "IV estimator, Instrument selection, Invalid instruments, Nonparametric series regression, Pointwise inference, Specification search",

author = "Byunghoon Kang",

year = "2016",

language = "English",

publisher = "University of Wisconsin-Madison",

school = "University of Wisconsin-Madison",

}

TY - THES

T1 - Essays on nonparametric inference and instrument selection

AU - Kang, Byunghoon

PY - 2016

Y1 - 2016

N2 - My dissertation consists of two chapters on nonparametric inference and model selection in econometric models. Researchers in economics and social science need reliable models and statistical tools to quantify economic relationships and the uncertainty associated with data. In practice, researchers often evaluate their objects of interest under various specifications in the first stage of analysis, or select a model by some criterion. Unfortunately, commonly used statistical methods may fail to account for the uncertainty inherent in the first-step specification search. Moreover, some existing model selection criteria may be fragile to model misspecification. All these methods can lead to misleading conclusions without valid, robust corrections. To quantify and test economic theories more accurately in such cases, researchers and policy makers need more reliable and robust methods. My research investigates these issues and provides practical methods for empirical research with rigorous theoretical justification. The first chapter provides new inference methods for nonparametric series regression with a data-dependent number of series terms. Nonparametric series estimation has grown in popularity because it offers a flexible way to address potential misspecification of a parametric model. However, implementation requires choosing the number of series terms, and estimation and inference may depend heavily on that choice. Existing asymptotic theory for inference in nonparametric series estimation typically imposes an undersmoothing condition: the number of series terms must be sufficiently large to make the bias asymptotically negligible. However, there is no formally justified data-dependent method for this in practice. This chapter constructs inference methods for nonparametric series regression models and introduces tests based on the infimum of t-statistics over different numbers of series terms. 
First, I provide a uniform asymptotic theory for the t-statistic process indexed by the number of series terms. Using this result, I show that the test based on the infimum of the t-statistics and its asymptotic critical value controls the asymptotic size under the undersmoothing condition. A valid confidence interval (CI) with correct asymptotic coverage probability can then be constructed by inverting the test statistic. Even when asymptotic bias terms are present, without the undersmoothing condition, I show that the CI based on the infimum of the t-statistics bounds the coverage distortions. In an illustrative example, nonparametric estimation of the wage elasticity of expected labor supply from Blomquist and Newey (2002), the proposed CI is close to or tighter than those based on existing methods with possibly ad hoc choices of series terms. The second chapter provides instrument selection criteria for the instrumental variable (IV) regression model when there is a large set of instruments with potential invalidity. Economic applications of IV models sometimes involve large sets of potential instruments and debates about their validity. Existing methods for instrument selection are largely based on an a priori assumption of an instrument’s validity and/or on first-order asymptotics, which may lead to large finite-sample bias with many and invalid instruments. First, I derive higher-order mean square error (MSE) approximations for the two-stage least squares (2SLS), limited information maximum likelihood (LIML), modified Fuller (FULL), and bias-adjusted 2SLS (B2SLS) estimators, allowing for locally invalid instruments. Based on the approximation to the higher-order MSE, I propose invalidity-robust instrument selection criteria (IRC) that capture two sources of finite-sample bias at the same time: bias from using many instruments and bias from invalid instruments. 
I also show an optimality result for the choice of instruments based on the criteria of Donald and Newey (2001) under a certain specification of locally invalid instruments.

KW - IV estimator

KW - Instrument selection

KW - Invalid instruments

KW - Nonparametric series regression

KW - Pointwise inference

KW - Specification search

M3 - Doctoral Thesis

PB - University of Wisconsin-Madison

CY - Ann Arbor

ER -