William S. Cleveland

Statistics Research
Bell Labs, Murray Hill, NJ

Lucent Technologies (50,000 people) is the company that "makes the things that make communications work". Bell Labs (15,000 people) is the R&D organization of Lucent. The Research Area of Bell Labs (550 people) does basic and applied research. Statistics Research (11 people) is part of the Research Area of Bell Labs. Statistics Research was started by Walter Shewhart and John Tukey (2 people).

* Online Papers, organized by research area.
Books and Papers, pdf bibliography organized from latest to earliest.
Curriculum Vitae, pdf.

* Internet Traffic Research: Monitoring, Analysis, Modeling, and Generation

Goals: We study and model the characteristics of Internet traffic to build tools for network performance monitoring, characterizing quality of service, network provisioning and design, and the development of new protocols and algorithms.

Three Fundamental Areas of Work: Our tool development emanates from three fundamental areas of work: (1) packet header data collection projects; (2) a hardware and software environment for analyzing very large packet header databases; and (3) fundamental research in the characterization and modeling of traffic.

* Two Books on Visualization and the Analysis of Data

[Figure: Coplot of Ethanol Data]

The Elements of Graphing Data: Principles of graph construction and the visual communication of data. Visualization methods. The principles and methods are supported by a rigorous, scientific discussion of graphical perception, the visual decoding of information from data displays. Prerequisites: None.

Visualizing Data: Visualization methods. A strategy for data analysis that stresses the use of visualization to thoroughly study the structure of data and to check the validity of mathematical and statistical models fitted to data. Prerequisites: Basic statistics and least-squares fitting.

For both books: Quotes from Reviewers, Prefaces, How to Order
For Visualizing Data: Data Sets as Tables, Data Sets as S/S-PLUS Objects, and Scripts to Produce the Book Figures in S/S-PLUS

* Trellis Display: Visualizing Multivariable Databases

Trellis Display is a framework for the visualization of multivariable databases. Its most prominent aspect is an overall visual design, reminiscent of a garden trelliswork, in which panels are laid out into columns, rows, and pages. The panels contain graphs of panel variables conditional on values of conditioning variables. Any graphical method can be used to display the panel variables. Software for Trellis is available in S-PLUS (version 3.3 for Windows and version 3.4 for UNIX) and in Axum Graphics 5.0. An overview, papers, and S-PLUS documentation are available on the Trellis home page.
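
The panel layout described above can be illustrated with a minimal Python sketch. This is not the S-PLUS Trellis software or its API: the function name `trellis_layout`, its parameters, and the left-to-right, top-to-bottom fill order are all assumptions made for illustration. It only shows how data can be split by one conditioning variable and how each subset can be assigned to a page, row, and column.

```python
from collections import defaultdict


def trellis_layout(records, cond_key, n_cols, n_rows):
    """Group records by a conditioning variable and assign each group a panel.

    Hypothetical helper: each distinct level of records[i][cond_key] gets one
    panel, and panels fill a grid of n_rows x n_cols per page.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[cond_key]].append(rec)

    per_page = n_cols * n_rows
    layout = {}
    for i, level in enumerate(sorted(groups)):
        page, slot = divmod(i, per_page)   # which page, and position within it
        row, col = divmod(slot, n_cols)    # fill order is an assumption
        layout[level] = {"page": page, "row": row, "col": col,
                         "data": groups[level]}
    return layout


# Five levels of a made-up conditioning variable "site", two points per level.
records = [{"site": s, "x": x, "y": 2 * x} for s in "ABCDE" for x in (1, 2)]
layout = trellis_layout(records, "site", n_cols=2, n_rows=2)
for level, panel in layout.items():
    print(level, panel["page"], panel["row"], panel["col"], len(panel["data"]))
```

Each panel's `data` would then be drawn with whatever graphical method suits the panel variables; the sketch stops at the layout step.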

* Smoothing: lowess, loess, locfit, and stl

Local regression is an approach to fitting curves and surfaces to noisy data by a multivariate smoothing procedure: fitting a linear or quadratic function of the predictor variables in a moving fashion that is analogous to how a moving average is computed for a time series. stl is a procedure, based on loess, for decomposing a seasonal time series into seasonal and other components.
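
The moving local fit can be sketched in a few lines of Python for the one-predictor, locally linear case. This is an illustrative sketch, not the lowess, loess, or locfit code: the function names, the `span` parameter, the tricube weight function, and the slight inflation of the bandwidth are assumptions chosen for the example, and the robustness iterations of lowess are omitted.

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, else 0."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(x, y, x0, span=0.5):
    """Estimate the smooth curve at x0 by weighted least squares on nearby points."""
    n = len(x)
    q = max(2, int(round(span * n)))       # number of neighbours in the local fit
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[q - 1]                       # distance to the q-th nearest point
    if h == 0.0:
        # All q nearest points coincide with x0: average their responses.
        ys = [yi for xi, yi in zip(x, y) if xi == x0]
        return sum(ys) / len(ys)
    h *= 1.0 + 1e-6                        # assumption: keep the q-th neighbour's weight nonzero

    w = [tricube((xi - x0) / h) for xi in x]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        return ybar                        # no spread in x: fall back to the weighted mean
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    return ybar + (sxy / sxx) * (x0 - xbar)


# Evaluating the fit at each x value in turn gives the smoothed curve.
x = [float(i) for i in range(10)]
y = [2.0 * xi + 1.0 for xi in x]           # exactly linear data, for checking
smooth = [local_linear_fit(x, y, xi) for xi in x]
```

Because each local fit is a weighted straight line, exactly linear data are reproduced up to rounding error, which makes a convenient sanity check.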

Software packages for three vintages of local regression (lowess, loess, and locfit), together with a software package for stl, are available on the Web, and they all have implementations in S/S-PLUS.