Re: Slow query with big tables

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Craig James <cjames(at)emolecules(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Mike Sofen <msofen(at)runbox(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow query with big tables
Date: 2016-08-27 18:12:17
Message-ID: CAMkU=1ya7NWz0T+i=nWPi5+5JY0DLQAns0CFEUh2fk1EkPY9Nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Aug 27, 2016 at 7:13 AM, Craig James <cjames(at)emolecules(dot)com> wrote:

> On Fri, Aug 26, 2016 at 9:11 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
> wrote:
>
>> On 8/26/16 3:26 PM, Mike Sofen wrote:
>>
>>> Is there way to keep query time constant as the database size grows.
>>>
>>
>> No. More data == more time. Unless you find a way to break the laws of
>> physics.
>>
>
> Straight hash-table indexes (which Postgres doesn't use) have O(1) access
> time.
>

But he isn't doing single-row lookups, he is doing large aggregations. If
you have to aggregate N rows, doing a O(1) operation on different N
occasions is still O(N).

Not that big-O is useful here anyway. It assumes that either everything
fits in RAM (and is already there), or that nothing fits in RAM and it all
has to be fetched from disk, even the index root pages, every time it is
needed. Tommi is not operating under an environment where the first
assumption holds, and no one operates in an environment where the second
assumption holds.

As N increases beyond available RAM, your actual time for a single look-up
is going to be a weighted average of two different constant-time
operations, one with a small constant and one with a large constant. Big-O
notation ignores this nicety and assumes all operations are at the slower
speed, because that is what the limit of the weighted average will be as N
gets very large. But real world systems do not operate at the infinite
limit.

So his run time could easily be proportional to N^2, if he aggregates more
rows and each one of them is less likely to be a cache hit.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2016-08-27 18:33:50 Re: Slow query with big tables
Previous Message Tom Lane 2016-08-27 16:36:56 Re: Slow query with big tables