Taken literally, the title "All of Statistics" is an exaggeration. But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like non-parametric curve estimation, bootstrapping, and classification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analysing data.
Larry A. Wasserman is a Canadian statistician and a professor in the Department of Statistics and the Machine Learning Department at Carnegie Mellon University.
Notes:
- The core problem in probability is "given a generating process, what does the output look like?"
- The core problem in statistics is "given some output, what does the generating process look like?"
- If A and B are disjoint events with non-zero probability, then they cannot be independent (because P(AB) = 0, but P(A), P(B) > 0). "Except in this special case, there is no way to judge independence by looking at the sets in a Venn diagram."
- Mistaking P(A|B) for P(B|A) is called the prosecutor's fallacy.
- The rule that P(B) = sum_i P(B|A_i)P(A_i) if {A_i} is a partition is called the Law of Total Probability.
- I don't understand this: you can't generally assign probabilities to all subsets of a sample space, so attention is restricted to sigma-fields.
- A geometric distribution is of the form P(X=k) = p(1-p)^{k-1}. Why is this called geometric? Because the probabilities form a geometric sequence in k.
- The rate of the sum of two independent Poisson distributions is the sum of the rates. That is, if X1 ~ Poisson(lambda1) and X2 ~ Poisson(lambda2) are independent, then X1+X2 ~ Poisson(lambda1+lambda2).
- The mathematical construction of a random variable is a mapping from the sample space Omega to R. Just like in computers.
- The standard normal distribution is denoted Z, with pdf and cdf denoted phi(z) and Phi(z).
- There is no closed form for Phi(z).
- If the Xi ~ N(mi, si^2) are independent, then sum Xi ~ N(sum mi, sum si^2).
- The logic of the Gamma, Cauchy, and chi-squared distributions continues to elude me.
- The Cauchy distribution is like the Gaussian, but with thicker tails. It's a special case of the t-distribution.
- The multinomial distribution has binomial distributions as its marginals.
- How to find the pdf of a transformation X -> Y of a random variable (a worked check follows these notes): 1) find the pre-image for each y, 2) evaluate the CDF on the pre-image, 3) differentiate the CDF to get the pdf.
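A quick numerical sanity check of two of these facts (my own sketch, not from the book; assumes numpy and scipy are installed). For the transformation recipe, take Y = X^2 with X ~ N(0,1): step 1, the pre-image of {Y <= y} is [-sqrt(y), sqrt(y)]; step 2, F_Y(y) = Phi(sqrt(y)) - Phi(-sqrt(y)) = 2 Phi(sqrt(y)) - 1; step 3, differentiating gives f_Y(y) = phi(sqrt(y)) / sqrt(y), which is exactly the chi-squared pdf with 1 degree of freedom.

```python
# Sketch: numerically check two facts from the notes (assumes numpy + scipy).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_000_000

# 1) Sum of independent Poissons: Poisson(2) + Poisson(3) ~ Poisson(5).
s = rng.poisson(2.0, n) + rng.poisson(3.0, n)
print(s.mean(), s.var())  # both should be ~5.0, as Poisson(5) requires

# 2) Three-step transformation method on Y = X^2, X ~ N(0,1):
#    the recipe gives f_Y(y) = phi(sqrt(y)) / sqrt(y), the chi-squared(1) pdf.
y = np.linspace(0.1, 4.0, 50)
derived = stats.norm.pdf(np.sqrt(y)) / np.sqrt(y)
print(np.allclose(derived, stats.chi2.pdf(y, df=1)))  # True
```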
Very good reference on notions of probability, statistics, and machine learning. Not ideal for learning the material from scratch, but ideal for refreshing and supplementing your knowledge when you do a PhD.
From the title, one expects this book to be comprehensive and encyclopedic, but I found the opposite to be the case. This is a very mathematical rapid survey of statistics that does not explain how to actually do any of the things that a working engineer or scientist would need to do.
I think the audience of this book is "mathematicians who find books with more equations than text to be comfortable and easy to learn from, who also know nothing about statistics and want a quick survey of the field, and who will use statistics to prove theorems and write papers instead of actually calculating anything." This book is completely unsuitable for engineers; for those I would recommend Baclawski and then Diez. Even Casella & Berger is much more accessible than this book.
This could possibly be useful as a reference book. Otherwise, it's math without any explanations, unless you find symbol manipulation explanatory. I'm not afraid of math (I minored in it and have a degree in CS), but I don't understand a formula without first understanding the concepts behind it. I expect this is true of most people. It is pretty funny to me that this book is billed as 'for people who want to learn probability and statistics quickly... No previous knowledge of probability and statistics is required.'
I studied statistics two or three times on campus, but I still find this book too hard and not suitable for beginners: some of the symbols in the theorems come from nowhere, and some of the definitions need further explanation. I could follow it up to chapter 7, but beyond that the symbols were more than I could remember or understand.
Great for a high-level overview of a very broad space (i.e. all of modern statistics). I definitely skimmed most of the equations, and some chapters, but still got a lot out of it as a refresher with good intuition; it connects statistics and computer science well.
Quotes - "The basic problem that we study in probability is: Given a data generating process, what are the properties of the outcomes?... the basic problem of statistical inference is the inverse of probability: given the outcomes, what can we say about the process that generated the data?" - "many inferential problems can be identified as being one of three types: estimation, confidence sets, or hypothesis testing." - "Confidence intervals are often more informative than tests." Because they also give magnitude and area easier to interpret. - "The p-value is not the probability that the null hypothesis is true." - "To combine prior beliefs with data in a principled way, use Bayesian inference. To construct procedures with guaranteed long run performance, such as confident intervals, use frequentist methods." - Choosing among ways to generate estimators is decision theory. "Decision theory which is the formal theory for comparing statistical procedures.... and estimator is sometimes called a decision rule." Meausured using a loss function such as squared error loss, absolute loss, etc. Usually we use squared error lose (e.g. mean squared error). - "[AIC: Akaike Information Criterion] can be thought of 'goodness of fit' minus 'complexity.'" - The BIC (Bayesian Information Criterion) is similar but puts a more severe penalty for complexity. - "In forward stepwise regression, we start with no covariates in the model. We then add the one variable that leads to the best score we continue adding variables one at a time until the score does not improve. Backwards stepwise regression is the same except we start with the biggest model and drop one variable at a time. Both are greedy searches: neither is guaranteed to find the model with the best score." [Note: page 221 typo: nether vs neither]. "Another popular method is to do random searching through the set of all models. However, there is no reason to expect this to be superior to a deterministic search." For example we might use stepwise regression using AIC as our score. - "roughly speaking, the statement 'X causes Y' means that changing the value of X will change the distribution of Y. When X causes Y, X and Y will be associated but the reverse is not, in general, true. Association does not necessarily imply causation." - "Even after adjusting for confounders, we cannot be sure that there are not other confounding variables that we missed. This is why observational studies must be treated with healthy skepticism. Results from observational studies start to become believable when: (i) the results are replicated in many studies, (ii) each of these studies controlled for plausible confounding variables, (iii) there is a plausible scientific explanation for the existence of a causal relationship... a good example is smoking and cancer." - X (parent) -> Y(child). Where X causes Y X-> Y <- Z, is called a collider at Y. - The curse of dimensionality. "To get a sense of how serious this problem is, consider the following table from Silverman (1986) which shows the sample size required to ensure a relative mean squared error less than 0.1 at 0 when the density is multicariate normal and the optimal bandwidth is selected.... that is bad news indeed. It says that having 824,000 observations in a ten-dimensional problem is really like having 4 observations in a one-dimensional problem." - Statistics -> computater science terms: Classification -> supervised learning. Predicting a discrete Y from X. 
Data -> training sample Covariates -> features Classifier -> hypothesis. Map X -> Y Estimation -> learning. Finding a good classifier
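Since the notes call out forward stepwise regression scored by AIC, here is a minimal sketch of that greedy search, assuming numpy and statsmodels are installed (the function name and the toy data are mine for illustration, not the book's):

```python
# Sketch: forward stepwise regression scored by AIC (assumes numpy + statsmodels).
# A greedy search -- not guaranteed to find the model with the best score.
import numpy as np
import statsmodels.api as sm

def forward_stepwise(X, y):
    """Greedily add the covariate that most lowers AIC; stop when none helps."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    best_aic = sm.OLS(y, np.ones((n, 1))).fit().aic  # intercept-only model
    improved = True
    while improved and remaining:
        improved = False
        # Score every one-variable extension of the current model.
        scores = [(sm.OLS(y, sm.add_constant(X[:, selected + [j]])).fit().aic, j)
                  for j in remaining]
        aic, j = min(scores)
        if aic < best_aic:  # keep the addition only if AIC improves
            best_aic, improved = aic, True
            selected.append(j)
            remaining.remove(j)
    return selected, best_aic

# Toy data: only the first two of five covariates actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(size=200)
print(forward_stepwise(X, y))  # typically selects [0, 1]
```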
It was my first statistics book and I disliked it, since the author does a poor job of explaining the details. If you are new to statistics without a lot of training in mathematics, ANY other book would be better than this one.
So far the best statistics book I have ever read. Broad range of subjects, enough examples to give some idea about the concepts. It requires a bit of calculus knowledge though.
A great book, nice to keep as a reference and occasional refresher. Covers an impressive range of topics. Well written and holds up remarkably well despite its age.
10/15/2015: So far, this is a really good book with comprehensive material, simple examples, and rich problems; most importantly, it is easy to understand.
12/8/2015: I like everything about this book except the title. It may draw some complaints for not discussing some topics in depth, but one can always look up and read more on topics of interest. Nonetheless, this is a very well written book!
The author states that he wrote the book to help get engineering students up to speed. The topics and depth are in line with what one would expect from a mathematical statistics book. It's a good book for finding out what is out there, but most discussions are too brief for most people to learn the material from this book.
The material in this book is not covered in sufficient depth to understand it unless you have covered it once already. That said, this book is a great reference: a collection of useful theorems and properties.
Great exposition of probability and statistics from an abstract, theorem-oriented point of view, similar to Linear Algebra Done Right. Most haters are probably not comfortable with proof-style math.