Re: performance expectations for table(s) with 2B recs

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: performance expectations for table(s) with 2B recs
Date: 2021-12-08 22:14:13
Message-ID: 20211208221413.xfijsujafpvdp5c2@hjp.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2021-12-08 14:44:47 -0500, David Gauthier wrote:
> So far, the tables I have in my DB have relatively low numbers of records (most
> are < 10K, all are < 10M).  Things have been running great in terms of
> performance.  But a project is being brainstormed which may require some tables
> to contain a couple billion records.
[...]
> What else should I be worried about ?
>
> I suspect that part of why things are running really well so far is that the
> relatively small amounts of data in these tables ends up in the DB cache and
> disk I/O is kept at a minimum.  Will that no longer be the case once queries
> start running on these big tables ?

Depends a lot on how good the locality of your queries is. If most read
only the same parts of the same indexes, those will still be in the
cache. If they are all over the place or if you have queries which need
to read large parts of your tables, cache misses will make your
performance a lot less predictable, yes. That stuff is also hard to
test, because when you are testing a query twice in a row, the second
time it will likely hit the cache and be quite fast.

But in my experience the biggest problem with large tables are unstable
execution plans - for most of the parameters the optimizer will choose
to use an index, but for some it will erroneously think that a full
table scan is faster. That can lead to a situation where a query
normally takes less than a second, but sometimes (seemingly at random)
it takes several minutes - users will understandably be upset about such
behaviour. It is in any case a good idea to monitor execution times to
find such problems (ideally before users complain), but each needs to be
treated on an individual basis, and sometimes there seems to be no good
solution.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp(at)hjp(dot)at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Дмитрий Иванов 2021-12-09 03:26:38 Re: CTE Materialization
Previous Message David Gauthier 2021-12-08 19:44:47 performance expectations for table(s) with 2B recs