
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference

Master Bayesian Inference through Practical Examples and Computation, Without Advanced Mathematical Analysis

Bayesian methods of inference are deeply natural and extremely powerful. However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice and freeing you to get results using computing power.

Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention.

Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You’ll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you’ve mastered these techniques, you’ll constantly turn to this guide for the working PyMC code you need to jumpstart future projects.
Coverage includes

• Learning the Bayesian “state of mind” and its practical implications
• Understanding how computers perform Bayesian inference
• Using the PyMC Python library to program Bayesian analyses
• Building and debugging models with PyMC
• Testing your model’s “goodness of fit”
• Opening the “black box” of the Markov Chain Monte Carlo algorithm to see how and why it works
• Leveraging the power of the “Law of Large Numbers”
• Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning
• Using loss functions to measure an estimate’s weaknesses based on your goals and desired outcomes
• Selecting appropriate priors and understanding how their influence changes with dataset size
• Overcoming the “exploration versus exploitation” dilemma: deciding when “pretty good” is good enough
• Using Bayesian inference to improve A/B testing
• Solving data science problems when only small amounts of data are available

Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.

250 pages, Paperback

First published December 19, 2014

123 people are currently reading
602 people want to read

About the author

Cameron Davidson-Pilon

2 books, 3 followers

Ratings & Reviews


Community Reviews

5 stars: 59 (28%)
4 stars: 97 (46%)
3 stars: 42 (20%)
2 stars: 10 (4%)
1 star: 0 (0%)
Displaying 1 - 21 of 21 reviews
briz
Author of 6 books, 75 followers
August 19, 2017
A fun and informative book on applied Bayesian modeling in Python.

Assumes knowledge of Python and, honestly, I wouldn't recommend this - alone - as an intro to Bayesian stuff. But if you combine this with Allen Downey's Think Bayes or Khan Academy's Bayes Theorem video or a course (!), you would probably be able to get off the ground with a couple initial models quite quickly.

This book is pitched towards people a bit beyond noob but not total hackers yet on both the PyMC spectrum (Python's most popular (?) Bayesian modeling library... after PyStan?) and on the Bayesian stats spectrum. For example, the author intentionally hand-waves away Markov Chain Monte Carlo (MCMC) sampling as being much too mathy for this book's purposes, which is... meh, fine. There's a bunch of intuition on what the MCMC sampling is doing and why we use it. (Though I would have introduced conjugate priors earlier, juxtaposing "easy" Bayesian models that can be solved analytically versus hard/annoying models that need sampling.) If you want a bit *more* mathiness, I thought mathematicalmonk's video was helpful.
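
To make the "easy versus hard" contrast concrete, here is a minimal sketch, not taken from the book, that estimates a coin's bias two ways: with the closed-form Beta-Binomial conjugate update, and with a hand-rolled random-walk Metropolis sampler. The function names, the Beta(1, 1) prior, and the step size are all illustrative assumptions; it uses only the Python standard library rather than PyMC.

```python
import math
import random


def conjugate_posterior(alpha, beta, heads, tails):
    """Beta(alpha, beta) prior + coin-flip data -> Beta posterior, in closed form."""
    return alpha + heads, beta + tails


def log_unnormalized_posterior(p, alpha, beta, heads, tails):
    """Log of prior * likelihood up to a constant; -inf outside (0, 1)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return (alpha - 1 + heads) * math.log(p) + (beta - 1 + tails) * math.log(1.0 - p)


def metropolis_mean(alpha, beta, heads, tails,
                    n_samples=20000, burn_in=2000, step=0.1, seed=0):
    """Estimate the posterior mean with a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    p = 0.5  # arbitrary starting point inside (0, 1)
    current = log_unnormalized_posterior(p, alpha, beta, heads, tails)
    total = 0.0
    for i in range(n_samples + burn_in):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_unnormalized_posterior(proposal, alpha, beta, heads, tails)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            p, current = proposal, candidate
        if i >= burn_in:
            total += p
    return total / n_samples


if __name__ == "__main__":
    a, b = conjugate_posterior(1, 1, heads=7, tails=3)
    print("analytic posterior mean:", a / (a + b))
    print("Metropolis estimate:    ", metropolis_mean(1, 1, heads=7, tails=3))
```

The conjugate route is a single addition, while the sampler needs thousands of iterations to approximate the same number. Sampling earns its keep only when no closed form exists.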

There's a good section on multi-armed bandit models (i.e. A/B testing when your sample sizes are super small), which was informative and kinda mind-blowing.
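
For a feel of how a Bayesian bandit works, the sketch below implements Thompson sampling for arms with yes/no payouts. It is an illustration under stated assumptions, not the book's code: the function name, the uniform Beta(1, 1) priors, and the hidden `true_rates` used to simulate payouts are all made up for the example.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate a Bernoulli bandit; return (wins, pulls) per arm.

    true_rates are hypothetical payout probabilities, used only to simulate
    rewards. The strategy itself never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    wins = [0] * n_arms
    losses = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible payout rate per arm from its Beta posterior
        # (Beta(1, 1) prior updated with the wins and losses seen so far).
        draws = [rng.betavariate(1 + wins[i], 1 + losses[i]) for i in range(n_arms)]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=draws.__getitem__)
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    pulls = [w + l for w, l in zip(wins, losses)]
    return wins, pulls


if __name__ == "__main__":
    wins, pulls = thompson_sampling([0.05, 0.10, 0.30])
    print("pulls per arm:", pulls)
```

Because uncertain arms occasionally produce high draws, they still get tried early on. As evidence accumulates, most pulls shift to the arm that actually pays best.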

The book is open source and you can read it for free (and make contributions) on their GitHub repo. This is very nice and forward-thinking/Star Trek TNG of them, thank God, but it also means that the later chapters are a bit sloppy. I cloned the repo a couple weeks ago and was gifted with a previously-unseen Chapter 7, which turned out to be a bunch of TODOs. Oh well. There's also some typos, maybe a bit too much Personality (heh), and the nagging feeling that, gosh, I guess I *could* edit their book for some open source software street cred - but then I thought, WHAT ME BE A GLORIFIED SECRETARY AGAIN NO THANK YOU, and lo, thus dideth the number of women contributors to open source code remaineth at the 5%, the Devil's percent.
Arthur Kipel
74 reviews, 3 followers
June 7, 2021
The book provides a lot of examples of Bayesian inference using Markov Chain Monte Carlo. Some examples are too complicated, but most of them are clear and easy to understand.
While reading, I missed a mathematical description of the MCMC method. So it is a rather sketchy book of examples for developers, showing how to use this method from the PyMC library for statistical analysis, without the mathematical details.
Jake
211 reviews, 43 followers
June 3, 2016
"Sometimes the questions are complicated and the answers are simple." ~ Dr. Seuss

I picked this book up after @DataSkeptic talked with Pilon about Bayesian A/B testing. I put off reading it and doing the exercises until I started applying for internships that wanted me to have more Python experience for data science. As an undergrad, I haven't liked what I've seen of R. The terse nature of that language makes for a steep learning curve, but I'm sure I'll get to learning it too eventually. We're also here to talk about this book, so let's go:

I'm under the impression, much like Jake VanderPlas, that if you can write a for-loop you can do statistics. Statistics as a discipline is so mired in its own bullshit formalism that it gets in the way of doing what is important, which is asking difficult questions and attempting to gain insight from them with math. This book attempts to unravel a lot of that bullshit.

From a hacking perspective, it's a pragmatic one. You solve one problem to the best of your faculties, then you solve the next one, and then the next. Eventually you solve enough problems that you have a functioning algorithm that is correct. I particularly liked the latter half of chapter 2, the entirety of chapter 3 and chapter 5. I actually used the methods in chapter 3 for my Physics post lab which was very cool.

The book gives you merely an outline of each topic discussed, a road-map to instruct the reader where they might look next for tools to add to their arsenal as a programmer. It's a very good overview in my experience for this reason.

"A complex system that works is invariably found to have evolved from a simple system that worked." ~ John Gall
rohola zandie
24 reviews, 12 followers
December 18, 2017
This book goes beyond just probabilistic programming with PyMC. It gives you deep insight into what Bayesian analysis is and how you can view different problems in a Bayesian framework. Try to read the book in ipynb format, which is interactive and easy to understand.
2 reviews, 1 follower
January 27, 2024
The text is great; however, the printed version is badly outdated, to the point that reading it is like doing archaeology. My advice: go to the book's (open source) GitHub repository and look at the updated versions for PyMC 4. Do not buy the book.

PyMC was heavily refactored from version 2 to 3 and then to 4. The API, the backend, and even the philosophy behind the package changed completely, so the book is almost impossible to follow because it is so entangled with PyMC 2. Sadly, the practical part didn't age well. The theoretical part and the book's design, however, live on via the pull requests on the repo; read those instead.




Xinyu
187 reviews, 31 followers
July 23, 2017
Without intimidating math, this is a really nice introduction to Bayesian analysis and PyMC3. I learned a lot from this book. I am ready to delve a little bit deeper into Bayesian methods, but I will probably come back to better understand some examples.
Yasser Mohammad
93 reviews, 23 followers
June 19, 2015
Very accessible and pragmatic. Nicely complements PGM books like Koller's and BRML.
William Schram
2,310 reviews, 95 followers
December 8, 2023
Bayesian Methods For Hackers is a book by Cameron Davidson-Pilon. It covers PyMC, which uses a Markov Chain Monte Carlo (MCMC) method to arrive at solutions. Davidson-Pilon intends the text to act as a bridge between theory and practice, and he does a good job.

The book has seven chapters, and all of them have a project. For example, all statistics books have a coin-flipping simulation, and Chapter 1 has ours. Chapter 2 contains a simulation of the Challenger Space Shuttle disaster.

When Davidson-Pilon mentions hackers, he means a skilled programmer or computer user. Davidson-Pilon includes some example code and links to web pages. Davidson-Pilon wrote Bayesian Methods For Hackers in 2016, so some web pages may have changed.

I enjoyed the book. Thanks for reading my review, and see you next time.
Pritesh Shrivastava
80 reviews, 6 followers
August 8, 2020
Before starting the book, I thought it was trying to make statistics, and especially Bayesian statistics, easier to pick up for developers with little stats background. On the other hand, I found the discussion on Bayesian methods fairly difficult to follow, especially in the later chapters.

Also, the PyMC3 library depends on Theano, which is now deprecated. A TensorFlow Probability version of these chapters is available on GitHub, and learning about that was interesting.
I did learn about some powerful new ideas and it would be interesting to see more research in this space, but for now, I don't see much use of these techniques in my day-to-day ML projects.
José
233 reviews
September 11, 2018
Very cool (and surprisingly fun) book on Bayesian inference using MCMC, probably more suited for Python programmers (some knowledge of Bayesian statistics helps). It has the ideal amount of mathematical detail for someone with little experience in the field: enough to make most deductions easy to understand, and not so much that it looks intimidating.

If you are looking for something on the theory of Bayesian inference I would probably skip this, but if you are planning on getting your hands dirty with MCMC as soon as possible this can be a good option.
Danny D. Leybzon
156 reviews, 2 followers
May 10, 2020
A really cool project which leverages Jupyter notebooks to create a fully interactive and dynamic textbook to teach the basics of Bayesian thinking and methodologies. It falls short in its mathematical rigor (hence the proud identification of being "for Hackers"), but should still be adequate for people looking to get some practical exposure to using Bayesian methods to solve inference questions and the like. One point that stood out to me was that Bayesian methods excel in low-data scenarios, which is an interesting problem space to tackle.
Oleg Dats
39 reviews, 17 followers
February 15, 2021
I really understood Bayes' theorem only after completing this book. I implemented most of the exercises in Pyro. Some day I will post my code :)

Probabilistic programming is the practical use of one very, very deep idea. After completing the book you will be able to model different real-life phenomena in a probabilistic way. You will understand what information gain is and how to encode available knowledge about the problem.

I would highly recommend reading Pyro documentation after this book.
Rogério Chaves
2 reviews, 1 follower
December 21, 2019
The book promises to focus on the hacker side and leave the math aside, but for me it was still too advanced. Maybe I’m just too much of a noob for it and need to learn more about Bayesian methods before going back to it.

I was hoping this book would allow me to jump right into code and then backtrack to the Bayesian theory from there, but that doesn’t happen: prior theoretical knowledge is required.
Mykhas Kobernyk
12 reviews, 4 followers
March 11, 2020
Amazing book. More than half of the content consists of code and execution results; nevertheless, ideas such as the advantages of distributions over scalar predictions, or customized loss functions, are described very nicely. Still, I wouldn't recommend it if you're brand new to data science.


Walter Tay
22 reviews
August 13, 2018
Not an easy read. Will come back again after I brush up on basics.
Baran Toppare
25 reviews, 2 followers
February 13, 2022
Some of the code is a bit outdated, but still, this is an amazing book and a must-read for every data analyst/scientist.
Ben Kester
71 reviews, 5 followers
December 1, 2016
This textbook is accessible for beginners. It takes you through several applications of Bayesian stats, coded up in PyMC. I love how he gives you the code (in the book) and the data (online). These libraries can be difficult to figure out but are slick once you have an example.

I only wish that there was more. It's at a basic level, without much consideration for multiple features. MCMC is difficult to understand without finding a video with simulation.
4 reviews, 1 follower
July 14, 2016
The book is a very good hands-on attempt at Bayesian inference through Markov Chain Monte Carlo. It adds value by including a good number of real-life or software-industry examples.