From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Craig James <cjames(at)emolecules(dot)com> |
Cc: | Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Mike Sofen <msofen(at)runbox(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Slow query with big tables |
Date: | 2016-08-27 16:36:56 |
Message-ID: | 6955.1472315816@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Craig James <cjames(at)emolecules(dot)com> writes:
> Straight hash-table indexes (which Postgres doesn't use) have O(1) access
> time. The amount of data has no effect on the access time.
This is wishful thinking --- once you have enough data, O(1) goes out the
window. For example, a hash index is certainly not going to continue to
scale linearly once you reach its maximum possible number of buckets
(2^N for N-bit hashes, and remember you can't get very many useful hash
bits out of small objects like integers). But even before that, large
numbers of buckets put enough stress on your storage system that you will
see some not very O(1)-ish behavior, just because too little of the index
fits in whatever cache and RAM you have. Any storage hierarchy is
ultimately going to impose O(log N) access costs, that's the way they're
built.
I think it's fairly pointless to discuss such matters in the abstract.
If you want to make useful engineering tradeoffs you have to talk about
specific data sets and available hardware.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2016-08-27 18:12:17 | Re: Slow query with big tables |
Previous Message | Craig James | 2016-08-27 14:13:00 | Re: Slow query with big tables |