**Statistics Technical Reports**

**Term(s):** 2015 **Results:** 2

**Title:** On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators
**Author(s):** El Karoui, Noureddine
**Date issued:** July 2015

http://nma.berkeley.edu/ark:/28722/bk0016x898z (PDF)

**Abstract:** We study ridge-regularized generalized robust regression estimators, i.e.,
$$
\hat{\beta} = \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \frac{1}{n}\sum_{i=1}^n \rho_i(Y_i - X_i^\top \beta) + \frac{\tau}{2}\|\beta\|^2\;, \quad \text{where } Y_i = \epsilon_i + X_i^\top \beta_0\;,
$$
in the situation where $p/n$ tends to a finite non-zero limit. Our study here focuses on the situation where the errors $\epsilon_i$ are heavy-tailed and the $X_i$'s have an "elliptical-like" distribution. Our assumptions are quite general; for instance, we do not require homoskedasticity of the $\epsilon_i$'s. We obtain a characterization of the limit of $\|\hat{\beta} - \beta_0\|$, as well as several other results, including central limit theorems for the entries of $\hat{\beta}$.

**Keyword note:** El__Karoui__Noureddine
**Report ID:** 826
**Relevance:** 100
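As a rough illustration (not the authors' code), the estimator in the abstract above can be computed numerically. The sketch below assumes a Huber choice for $\rho$, a Gaussian design, and plain gradient descent; all of these are illustrative assumptions not taken from the report, and the ridge term $\tau > 0$ is what makes the objective strongly convex so that gradient descent converges.

```python
import numpy as np

def huber(r, c=1.345):
    """Huber rho: quadratic near zero, linear in the tails (robust to heavy tails)."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * (a - 0.5 * c))

def huber_psi(r, c=1.345):
    """Derivative of the Huber rho: identity near zero, clipped in the tails."""
    return np.clip(r, -c, c)

def objective(beta, X, Y, tau, c=1.345):
    """(1/n) sum_i rho(Y_i - X_i' beta) + (tau/2) ||beta||^2, as in the abstract."""
    return huber(Y - X @ beta, c).mean() + 0.5 * tau * (beta @ beta)

def ridge_robust_fit(X, Y, tau, c=1.345, lr=0.2, n_iter=2000):
    """Plain gradient descent on the ridge-regularized robust objective."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        # gradient: -(1/n) X' psi(Y - X beta) + tau * beta
        grad = -(X.T @ huber_psi(Y - X @ beta, c)) / n + tau * beta
        beta = beta - lr * grad
    return beta

rng = np.random.default_rng(0)
n, p = 200, 50                                   # p/n = 0.25, a non-vanishing ratio
X = rng.standard_normal((n, p))
beta0 = rng.standard_normal(p) / np.sqrt(p)
Y = X @ beta0 + rng.standard_t(df=2, size=n)     # heavy-tailed errors, as in the paper
beta_hat = ridge_robust_fit(X, Y, tau=0.5)
```

The heavy-tailed Student-$t$ errors are the regime where a robust $\rho$ (rather than least squares) pays off; the paper's analysis concerns the behavior of $\|\hat{\beta} - \beta_0\|$ exactly in this $p/n \to \kappa \in (0, 1)$ regime.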

**Title:** Can we trust the bootstrap in high-dimension?
**Author(s):** El Karoui, Noureddine; Purdom, Elizabeth
**Date issued:** February 2015

http://nma.berkeley.edu/ark:/28722/bk0016t858j (PDF)

**Abstract:** We consider the performance of the bootstrap in high dimensions for the setting of linear regression, where $p < n$ but $p/n$ is not close to zero. We consider ordinary least squares as well as robust regression methods and adopt a minimalist performance requirement: can the bootstrap give us good confidence intervals for a single coordinate of $\beta$, where $\beta$ is the true regression vector? We show through a mix of numerical and theoretical work that the bootstrap is fraught with problems. Both of the most commonly used methods of bootstrapping for regression, the residual bootstrap and the pairs bootstrap, give very poor inference on $\beta$ as the ratio $p/n$ grows. We find that the residual bootstrap tends to give anti-conservative estimates (inflated Type I error), while the pairs bootstrap gives very conservative estimates (severe loss of power) as the ratio $p/n$ grows. We also show that the jackknife resampling technique for estimating the variance of $\hat{\beta}$ severely overestimates the variance in high dimensions. We contribute alternative bootstrap procedures based on our theoretical results that mitigate these problems. However, the corrections depend on assumptions regarding the underlying data-generation model, suggesting that in high dimensions it may be difficult to have universal, robust bootstrapping techniques.

**Keyword note:** Purdom__Elizabeth El__Karoui__Noureddine
**Report ID:** 824
**Relevance:** 100
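For readers unfamiliar with the two resampling schemes the abstract compares, here is a minimal sketch of each, applied to a percentile confidence interval for a single OLS coordinate. The design, sample sizes, and percentile-interval construction are illustrative assumptions, not the paper's exact simulation setup or its corrected procedures.

```python
import numpy as np

def ols(X, Y):
    """Ordinary least-squares fit."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]

def pairs_bootstrap_ci(X, Y, j=0, B=500, alpha=0.05, seed=0):
    """Pairs bootstrap: resample rows (X_i, Y_i) with replacement, refit,
    and take percentile quantiles of the j-th coefficient."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    stats = [ols(X[idx], Y[idx])[j]
             for idx in (rng.integers(0, n, n) for _ in range(B))]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

def residual_bootstrap_ci(X, Y, j=0, B=500, alpha=0.05, seed=0):
    """Residual bootstrap: fit once, resample centered residuals, and refit
    on the reconstructed responses Y* = X beta_hat + e*."""
    rng = np.random.default_rng(seed)
    beta_hat = ols(X, Y)
    resid = Y - X @ beta_hat
    resid = resid - resid.mean()
    n = len(Y)
    stats = [ols(X, X @ beta_hat + rng.choice(resid, n, replace=True))[j]
             for _ in range(B)]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

rng = np.random.default_rng(1)
n, p = 100, 30                          # p/n = 0.3: a non-vanishing ratio
X = rng.standard_normal((n, p))
beta0 = np.zeros(p)                     # true coordinate of interest is 0
Y = X @ beta0 + rng.standard_normal(n)
ci_pairs = pairs_bootstrap_ci(X, Y)
ci_resid = residual_bootstrap_ci(X, Y)
```

The paper's point is that as $p/n$ grows, intervals from the residual scheme become too short (anti-conservative) while those from the pairs scheme become too wide (conservative), so comparing the two interval lengths at a fixed $p/n$ is a quick way to see the phenomenon empirically.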