It's hard to be a developer today without using a database… but they're often surrounded by an air of reverent mystery.
One of those mysteries is why it's so slow to count all the rows in a table using COUNT(*)
. After all, it's just a matter of walking a b-tree and counting leaves… and that should be trivial to optimize!
In this talk I'll answer the question of "why COUNT(*) is slow" by taking a deep dive into the the internals of Postgres' MVCC implementation, looking at:
This talk is accessible to anyone who's used an SQL database, with enough depth that experienced developers will find some interesting tidbits.
David is a Pythonista from Toronto, Canada. He's co-founder of both PyCon Canada – Canada's regional Python conference – and Akindi.com – a small company that's making multiple-choice bubble sheet tests a little bit less terrible. He's also the author of nose-parameterized, a parameterized testing for every Python testrunner, and pprint++, a Python pretty-printer that's actually pretty. Say hi on Twitter: @wolever!