Statistics Technical Reports:Search | Browse by year

Term(s):2001
Results:25

Title:Elementary divisors and determinants of random matrices over a local field
Author(s):Evans, Steven N.;
Date issued:Dec 2001
http://nma.berkeley.edu/ark:/28722/bk0000n208z (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n209h (PostScript)
Abstract:We consider the elementary divisors and determinant of a uniformly distributed $n \times n$ random matrix $M_n$ with entries in the ring of integers of an arbitrary local field. We show that the sequence of elementary divisors is in a simple bijective correspondence with a Markov chain on the nonnegative integers. The transition dynamics of this chain do not depend on the size of the matrix. As $n \rightarrow \infty$, all but finitely many of the elementary divisors are $1$, and the remainder arise from a Markov chain with these same transition dynamics. We also obtain the distribution of the determinant of $M_n$ and find the limit of this distribution as $n \rightarrow \infty$. Our formulae have connections with classical identities for $q$-series, and the $q$-binomial theorem in particular.
Keyword note:Evans__Steven_N
Report ID:614
Relevance:100

Title:Markov processes on vermiculated spaces
Author(s):Barlow, Martin T.; Evans, Steven N.;
Date issued:Nov 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2f3x (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2f4g (PostScript)
Abstract:A general technique is given for constructing new Markov processes from existing ones. The new process and its state space are both projective limits of sequences built by an iterative scheme. The space at each stage in the scheme is obtained by taking disjoint copies of the space at the previous stage and quotienting to identify certain distinguished points. Away from the distinguished points, the process at each stage evolves like the one constructed at the previous stage on some copy of the previous state space, but when the process hits a distinguished point it enters at random another of the copies "pinned" at that point. Special cases of this construction produce diffusions on fractal-like objects that have been studied recently.
Keyword note:Barlow__Martin_T Evans__Steven_N
Report ID:613
Relevance:100

Title:Improving the Accuracy of the Census Through Adjustment
Author(s):Freedman, David A.; Wachter, Kenneth W.;
Date issued:Nov 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2d85 (PDF)
Keyword note:Freedman__David Wachter__Kenneth
Report ID:612
Relevance:100

Title:What is the Chance of an Earthquake?
Author(s):Freedman, D. A.; Stark, P. B.;
Date issued:Sep 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2d5h (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2d62 (PostScript)
Abstract:What is the chance that at least one earthquake of magnitude 6.7 or greater will occur before the year 2030 in the San Francisco Bay Area? The U.S. Geological Survey estimated the chance to be 0.7 +/- 0.1. In this paper, we try to interpret such probabilities. Making sense of earthquake forecasts is surprisingly difficult. In part, this is because the forecasts are based on a complicated mixture of geological maps, empirical rules of thumb, expert opinion, physical models, stochastic models, numerical simulations, as well as geodetic, seismic, and paleoseismic data. Even the concept of probability is hard to define in this context. We examine the problems in applying standard definitions of probability to earthquakes, taking the USGS forecast---the product of a particularly careful and ambitious study---as our lead example. The issues are general, and concern the interpretation more than the numerical values. Despite the work involved in the USGS forecast, the probability estimate is shaky, as is the estimate of its uncertainty.
Keyword note:Freedman__David Stark__Philip_B
Report ID:611
Relevance:100

Title:Modelling Movements of Free-Ranging Animals
Author(s):Brillinger, David R.; Preisler, Haiganoush K.; Ager, Alan A.; Kie, John G.; Stewart, Brent S.;
Date issued:Sep 2001
http://nma.berkeley.edu/ark:/28722/bk0000n3z35 (PDF)
Abstract:This work derives and fits stochastic models to the trajectories of mammals moving about in a heterogeneous landscape. The basic data are locations of 53 Rocky Mountain elk ({\it Cervus elaphus}) estimated approximately every two hours for nine months. The elk roam about the Starkey Experimental Forest and Range in eastern Oregon. Elk movements may be affected by explanatory variables such as the locations of fences, of roads, of cover, of water, of forage and other habitat characteristics. Wildlife biologists are interested in questions like how an elk's movement relates to such explanatories. In this work, a model was developed in successive stages. First, equations of motion were set down, motivated by the idea of a potential function. Then the functional parameters appearing in the equations were estimated nonparametrically. The statistical questions that arose involved how to include explanatory variables in the equations and how to decide which variables are significant. Residual plots proved useful. Time of day was found to play a fundamental role and distance to nearest road enters as well. Future work will include other explanatories.
Keyword note:Brillinger__David_R Preisler__Haiganoush_Krikorian Ager__Alan_A Kie__John_G Stewart__Brent_S
Report ID:610
Relevance:100

Title:Inverse Problems as Statistics
Author(s):Evans, S. N.; Stark, P. B.;
Date issued:Aug 2001
http://nma.berkeley.edu/ark:/28722/bk0000n3n42 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n3n5m (PostScript)
Abstract:What mathematicians, scientists, engineers, and statisticians mean by "inverse problem" differs. For a statistician, an inverse problem is an inference or estimation problem. The data are finite in number and contain errors, as they do in classical estimation or inference problems, and the unknown typically is infinite-dimensional, as it is in nonparametric regression. The additional complication in an inverse problem is that the data are only indirectly related to the unknown. Canonical abstract formulations of statistical estimation problems subsume this complication by allowing probability distributions to be indexed in more-or-less arbitrary ways by parameters, which can be infinite-dimensional. Standard statistical concepts, questions, and considerations such as bias, variance, mean-squared error, identifiability, consistency, efficiency, and various forms of optimality, apply to inverse problems. This article discusses inverse problems as statistical estimation and inference problems, and points to the literature for a variety of techniques and results.
Keyword note:Evans__Steven_N Stark__Philip_B
Report ID:609
Relevance:100

Title:Hitting, occupation, and inverse local times of one-dimensional diffusions: martingale and excursion approaches
Author(s):Pitman, Jim; Yor, Marc;
Date issued:Nov 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2f08 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2f1t (PostScript)
Abstract:Basic relations between the distributions of hitting, occupation, and inverse local times of a one-dimensional diffusion process $X$, first discussed by Itô-McKean, are reviewed from the perspectives of martingale calculus and excursion theory. These relations, and the technique of conditioning on $L_T^y$, the local time of $X$ at level $y$ before a suitable random time $T$, yield formulae for the joint Laplace transform of $L_T^y$ and the times spent by $X$ above and below level $y$ up to time $T$.
Keyword note:Pitman__Jim Yor__Marc
Report ID:607
Relevance:100

Title:Boosting with the $L_2$-Loss: Regression and Classification
Author(s):Bühlmann, Peter; Yu, Bin;
Date issued:Aug 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2c96 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2d0r (PostScript)
Abstract:This paper investigates a variant of boosting, $L_2$Boost, which is constructed from a functional gradient descent algorithm with the $L_2$-loss function. Based on an explicit stagewise refitting expression of $L_2$Boost, the case of (symmetric) linear weak learners is studied in detail in both regression and two-class classification. In particular, with the boosting iteration $m$ working as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found with the variance (complexity) term bounded as $m$ tends to infinity. When the weak learner is a smoothing spline, an optimal rate of convergence result holds for both regression and two-class classification. Moreover, this boosted smoothing spline adapts to higher order, unknown smoothness. In addition, a simple expansion of the 0-1 loss function is derived to reveal the importance of the decision boundary, bias reduction, and impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real data set results are obtained to demonstrate the attractiveness of $L_2$Boost, particularly with a novel component-wise cubic smoothing spline as an effective and practical weak learner.
Keyword note:Buhlmann__Peter Yu__Bin
Report ID:605
Relevance:100
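
Illustration for Report 605: the abstract refers to boosting by repeatedly refitting a linear weak learner to the current residuals. The sketch below is not taken from the report. It assumes a hypothetical ridge-penalized polynomial smoother `S` (a symmetric linear learner chosen only for illustration) and a boosting run that starts from the zero fit. Under that convention, $m$ refitting steps give the fit $(I - (I - S)^m) y$, which the code checks numerically.

```python
import numpy as np

def ridge_smoother(x, degree=5, lam=1.0):
    """Hypothetical symmetric linear learner: the hat matrix of a
    ridge-penalized polynomial fit. Its eigenvalues lie in [0, 1)."""
    X = np.vander(x, degree + 1, increasing=True)
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(degree + 1), X.T)

def l2boost(S, y, m):
    """m steps of residual refitting, starting from the zero fit."""
    fit = np.zeros_like(y)
    for _ in range(m):
        fit = fit + S @ (y - fit)  # fit the learner to current residuals
    return fit

def l2boost_closed_form(S, y, m):
    """Same fit via the linear operator I - (I - S)^m."""
    I = np.eye(len(y))
    return (I - np.linalg.matrix_power(I - S, m)) @ y

# Toy data (an assumption made for the example, not from the report).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(x.size)
S = ridge_smoother(x)

iterative = l2boost(S, y, 10)
closed = l2boost_closed_form(S, y, 10)
```

Here the iteration count `m` plays the role of the regularization parameter: the training residual equals $(I - S)^m y$, so it cannot grow as `m` increases when the eigenvalues of `S` lie in $[0, 1)$.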

Title:Large scale inference and tomography for network monitoring and diagnosis
Author(s):Coates, Mark; Hero, Alfred; Nowak, Robert; Yu, Bin;
Date issued:Aug 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2c6j (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2c73 (PostScript)
Abstract:Today's Internet is a massive, distributed network which continues to explode in size as e-commerce and related activities grow. The heterogeneous and largely unregulated structure of the Internet renders tasks such as dynamic routing, optimized service provision, service level verification, and detection of anomalous/malicious behavior increasingly challenging. The problem is compounded by the fact that one cannot rely on the cooperation of individual servers and routers to aid in the collection of network traffic measurements vital for these tasks. In many ways, network monitoring and inference problems bear a strong resemblance to other "inverse problems" in which key aspects of a system are not directly observable. Familiar signal processing problems such as tomographic image reconstruction, pattern recognition, system identification, and array processing all have interesting interpretations in the networking context. This article introduces the new field of large-scale network inference, a field which we believe will benefit greatly from the wealth of signal processing research and algorithms.
Keyword note:Coates__Mark Hero__Alfred Nowak__Robert Yu__Bin
Report ID:604
Relevance:100

Title:Eigenvalues of random wreath products
Author(s):Evans, Steven N.;
Date issued:Jul 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2c3w (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2c4f (PostScript)
Abstract:Consider a uniformly chosen element $X_n$ of the $n$-fold wreath product $\Gamma_n = G \wr G \wr \dots \wr G$, where $G$ is a finite permutation group acting transitively on some set of size $s$. The eigenvalues of $X_n$ in the natural $s^n$-dimensional permutation representation are investigated by considering the random measure $\Xi_n$ on the unit circle that assigns mass $1$ to each eigenvalue. It is shown that if $f$ is a trigonometric polynomial, then $\lim_{n \rightarrow \infty} P\{\int f \, d\Xi_n \ne s^n \int f \, d\lambda\} = 0$, where $\lambda$ is normalised Lebesgue measure on the unit circle. In particular, $s^{-n} \Xi_n$ converges weakly in probability to $\lambda$ as $n \rightarrow \infty$. For a large class of test functions $f$ with non-terminating Fourier expansions, it is shown that there exists a constant $c$ and a non-zero random variable $W$ (both depending on $f$) such that $c^{-n} \int f \, d\Xi_n$ converges in distribution as $n \rightarrow \infty$ to $W$. These results have applications to Sylow $p$-groups of symmetric groups and automorphism groups of regular rooted trees.
Keyword note:Evans__Steven_N
Report ID:603
Relevance:100

Title:Non-Parametric Estimators Which Can Be "Plugged-In"
Author(s):Bickel, Peter J.; Ritov, Ya'acov;
Date issued:Jul 2001
http://nma.berkeley.edu/ark:/28722/bk0000n295z (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n296h (PostScript)
Abstract:We consider nonparametric estimation of an object such as a probability density or a regression function. Can such an estimator achieve the minimax rate of convergence on suitable function spaces, while, at the same time, when "plugged-in", estimate efficiently (at a rate of $n^{-1/2}$ with the best constant) many functionals of the object? For example, can we have a density estimator whose definite integrals are efficient estimators of the cumulative distribution function? We show that this is impossible for very large sets, e.g., expectations of all functions bounded by $M<\infty$. However we also show that it is possible for sets as large as indicators of all quadrants, i.e., distribution functions. We give appropriate constructions of such estimates.
Keyword note:Bickel__Peter_John Ritov__Yaacov
Report ID:602
Relevance:100

Title:On Specifying Graphical Models for Causation
Author(s):Freedman, David A.;
Date issued:Jun 2001
http://nma.berkeley.edu/ark:/28722/bk0000n293v (PDF)
Abstract:This paper (which is mainly expository) sets up graphical models for causation, having a bit less than the usual complement of hypothetical counterfactuals. Assuming the invariance of error distributions may be essential for causal inference, but the errors themselves need not be invariant. Graphs can be interpreted using conditional distributions, so that we can better address connections between the mathematical framework and causality in the world. The identification problem is posed in terms of conditionals. As will be seen, causal relationships cannot be inferred from a data set by running regressions unless there is substantial prior knowledge about the mechanisms that generated the data. The idea can be made more precise in several ways. There are few successful applications of graphical models, mainly because few causal pathways can be excluded on a priori grounds. The invariance conditions themselves remain to be assessed.
Keyword note:Freedman__David
Report ID:601
Relevance:100

Title:Applications of resampling methods to estimate the number of clusters and to improve the accuracy of a clustering method
Author(s):Fridlyand, Jane; Dudoit, Sandrine;
Date issued:Sep 2001
http://nma.berkeley.edu/ark:/28722/bk0000n3z0h (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n3z12 (PostScript)
Abstract:The burgeoning field of genomics, and in particular microarray experiments, has revived interest in both discriminant and cluster analysis, by raising new methodological and computational challenges. The present paper discusses applications of resampling methods to problems in cluster analysis. A resampling method, known as bagging in discriminant analysis, is applied to increase clustering accuracy and to assess the confidence of cluster assignments for individual observations. A novel prediction-based resampling method is also proposed to estimate the number of clusters, if any, in a dataset. The performance of the proposed and existing methods is compared using simulated data and gene expression data from four recently published cancer microarray studies.
Keyword note:Fridlyand__Jane Dudoit__Sandrine
Report ID:600
Relevance:100

Title:Random logistic maps and Lyapunov exponents
Author(s):Steinsaltz, D.;
Date issued:Jun 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2c07 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2c1s (PostScript)
Abstract:We prove that under certain basic regularity conditions, a random iteration of logistic maps converges to a random point attractor when the Lyapunov exponent is negative, and does not converge to a point when the Lyapunov exponent is positive.
Keyword note:Steinsaltz__David
Report ID:599
Relevance:100
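
Illustration for Report 599: the abstract contrasts random logistic iterations with negative and positive Lyapunov exponents. The sketch below is a numerical experiment only, not the report's argument, and it does not check the report's regularity conditions. It assumes the iteration $x_{n+1} = a_n x_n (1 - x_n)$ with $a_n$ drawn independently and uniformly from an interval (the function name and the intervals are choices made for this example), and it estimates the exponent as the orbit average of $\log |a_n (1 - 2 x_n)|$.

```python
import numpy as np

def simulate(a_low, a_high, x0, n_steps, seed=0):
    """Iterate x -> a_n * x * (1 - x) with a_n ~ Uniform[a_low, a_high].

    Returns the final state and the orbit average of log|f'_{a_n}(x_n)|,
    a numerical estimate of the Lyapunov exponent. The same seed gives
    the same sequence a_n, so different x0 share the same randomness.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(a_low, a_high, size=n_steps)
    x = x0
    log_derivs = np.empty(n_steps)
    for n in range(n_steps):
        deriv = abs(a[n] * (1.0 - 2.0 * x))
        # Small floor avoids log(0) if the orbit lands exactly on 1/2.
        log_derivs[n] = np.log(max(deriv, 1e-300))
        x = a[n] * x * (1.0 - x)
    return x, log_derivs.mean()

# Contracting regime: two starting points driven by the same a_n.
x_a, lam_neg = simulate(2.5, 2.9, 0.2, 5000)
x_b, _ = simulate(2.5, 2.9, 0.9, 5000)

# Expanding regime.
_, lam_pos = simulate(3.9, 4.0, 0.3, 5000)
```

In the first regime the two orbits end up at numerically the same point even though that point keeps moving with the random parameters, which is the behaviour one would expect of a random point attractor. In the second regime the estimated exponent is positive.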

Title:Convergence of Moments in a Markov-chain central limit theorem
Author(s):Steinsaltz, D.;
Date issued:Jul 2001
http://nma.berkeley.edu/ark:/28722/bk0000n2b7k (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2b84 (PostScript)
Abstract:We show that all moments of the partial sum process of a test function g along the paths of a V-uniformly ergodic Markov chain converge to the corresponding moments of a normal variable. For the n-th moment to converge, g^n must be bounded by a constant times V. We also derive starting-point dependent bounds on the rate of convergence.
Keyword note:Steinsaltz__David
Report ID:598
Relevance:100

Title:Poisson-Dirichlet and GEM invariant distributions for split-and-merge transformations of an interval partition
Author(s):Pitman, Jim;
Date issued:Jun 2001
http://nma.berkeley.edu/ark:/28722/bk0000n1q44 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n1q5p (PostScript)
Abstract:This paper introduces a split-and-merge transformation of interval partitions which combines some features of one model studied by Gnedin and Kerov and another studied by Tsilevich and by Mayer-Wolf, Zeitouni and Zerner. The invariance under this split-and-merge transformation of the interval partition generated by a suitable Poisson process yields a simple proof of the recent result of Mayer-Wolf, Zeitouni and Zerner that a Poisson-Dirichlet distribution is invariant for a closely related fragmentation-coagulation process. Uniqueness and convergence to the invariant measure are established for the split-and-merge transformation of interval partitions, but the corresponding problems for the fragmentation-coagulation process remain open.
Keyword note:Pitman__Jim
Report ID:597
Relevance:100

Title:Census 2000
Author(s):Freedman, D. A.; Wachter, K. W.;
Date issued:Apr 2001
http://nma.berkeley.edu/ark:/28722/bk0000n3m4j (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n3m53 (PostScript)
Abstract:The census has been taken every ten years since 1790. Counts are used to apportion Congress and to redistrict states. Furthermore, census data are the basis for allocations of federal tax money to cities and other local governments. For such purposes, the geographical distribution of the population matters, rather than counts for the nation as a whole. The census turns out to be remarkably good, despite the generally bad press reviews. Statistical adjustment is unlikely to improve the accuracy: adjustment may well put in more error than it takes out. In this article, we sketch procedures for taking the census, evaluating it, and making adjustments. Pointers to the literature will be found at the end of the article, including citations to the main arguments for and against adjustment.
Keyword note:Freedman__David Wachter__Kenneth
Report ID:596
Relevance:100

Title:Invariance Principles for Non-uniform Random Mappings and Trees
Author(s):Aldous, David; Pitman, Jim;
Date issued:Dec 2001
http://nma.berkeley.edu/ark:/28722/bk0000n287j (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n2883 (PostScript)
Abstract:In the context of uniform random mappings of an n-element set to itself, Aldous and Pitman (1994) established a functional invariance principle, showing that many limit distributions as n tends to infinity can be described as distributions of suitable functions of reflecting Brownian bridge. To study non-uniform cases, in this paper we formulate a "sampling invariance principle" in terms of iterates of a fixed number of random elements. We show that the sampling invariance principle implies many, but not all, of the distributional limits implied by the functional invariance principle. We give direct verifications of the sampling invariance principle in two successive generalizations of the uniform case, to p-mappings (where elements are mapped to i.i.d. non-uniform elements) and P-mappings (where elements are mapped according to a Markov matrix). We compare with parallel results in the simpler setting of random trees.
Keyword note:Aldous__David_J Pitman__Jim
Report ID:594
Relevance:100

Title:Random mappings, forests, and subsets associated with Abel-Cayley-Hurwitz multinomial expansions
Author(s):Pitman, Jim;
Date issued:Jun 2001
http://nma.berkeley.edu/ark:/28722/bk0000n4213 (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n422n (PostScript)
Abstract:Various random combinatorial objects, such as mappings, trees, forests, and subsets of a finite set, are constructed with probability distributions related to the binomial and multinomial expansions due to Abel, Cayley and Hurwitz. Relations between these combinatorial objects, such as Joyal's bijection between mappings and marked rooted trees, have interesting probabilistic interpretations, and applications to the asymptotic structure of large random trees and mappings. An extension of Hurwitz's binomial formula is associated with the probability distribution of the random set of vertices of a fringe subtree in a random forest whose distribution is defined by terms of a multinomial expansion over rooted labeled forests.
Pub info:Séminaire Lotharingien de Combinatoire, Issue 46, (45 pp.) 2001
Keyword note:Pitman__Jim
Report ID:593
Relevance:100

Title:A different construction of Gaussian fields from Markov chains: Dirichlet covariances
Author(s):Diaconis, Persi; Evans, Steven N.;
Date issued:Apr 2001
http://nma.berkeley.edu/ark:/28722/bk0000n1s0z (PDF)
http://nma.berkeley.edu/ark:/28722/bk0000n1s1h (PostScript)
Abstract:We study a class of Gaussian random fields with negative correlations. These fields are easy to simulate. They are defined in a natural way from a Markov chain that has the index space of the Gaussian field as its state space. In parallel with Dynkin's investigation of Gaussian fields having covariance given by the Green's function of a Markov process, we develop connections between the occupation times of the Markov chain and the prediction properties of the Gaussian field. Our interest in such fields was initiated by their appearance in random matrix theory.
Keyword note:Diaconis__Persi Evans__Steven_N
Report ID:592
Relevance:100
