From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: "Patrick O'Toole" <patrick(dot)otoole(at)sturdy(dot)ai>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Helping planner to chose sequential scan when it improves performance
Date: 2023-06-25 19:34:38
Message-ID: CAMkU=1ySObPhq4eNjscigw5jwV2iZK+D2rr2UA5VODRqnDBWHg@mail.gmail.com
Lists: pgsql-general pgsql-performance
On Tue, Jun 13, 2023 at 3:28 PM Patrick O'Toole <patrick(dot)otoole(at)sturdy(dot)ai>
wrote:
> run the query twice first, then...
Is that a realistic way to run the test? Often forcing all the data needed
for the query into memory is going to make things less realistic, not more
realistic. Assuming the system has more stuff to do than just perform this
one query, it might be unusual for the query to find everything it needs in
memory. Also, if you really do want to do it this way, then you should do
this for every plan. Different plans need to access different collections
of buffers, so prewarming just one plan will privilege that one over the
others.
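If you do want to test the fully cached case deliberately, a more
even-handed approach than running the query twice is to warm everything
each candidate plan touches, then confirm with EXPLAIN (ANALYZE, BUFFERS)
that the hit counts are comparable. A minimal sketch, using the object
names quoted in this thread and assuming pg_prewarm is available:

```sql
-- Assumes the pg_prewarm extension is installed/available.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Warm the objects used by both plan shapes, not just one of them.
SELECT pg_prewarm('item_text');
SELECT pg_prewarm('conversation_item');
SELECT pg_prewarm('conversation_item_item_hash_index');

-- Then, for each plan variant:
-- EXPLAIN (ANALYZE, BUFFERS) <the query>;
-- and compare "shared hit" vs "read" across the variants.
```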
>
> PLAN A (default config, effective cache size just shy of 15GB): 3.829
> seconds. A nested loop is used to probe the hash index
> `conversation_item_item_hash_index` for each row of item_text. Although the
> cost of probing once is low, a fair amount of time passes because the
> operation is repeated ~1.3 million times.
>
> PLAN B (enable_indexscan off, effective cache same as before): 3.254
> seconds (~15% speedup, sometimes 30%). Both tables are scanned sequentially
> and conversation_item is hashed before results are combined with a hash
> join.
>
> PLAN C: (random_page_cost = 8.0, instead of default 4, effective cache
> same as before): 2.959 (~23% speedup, sometimes 38%). Same overall plan as
> PLAN B, some differences in buffers and I/O. I'll note we had to get to 8.0
> before we saw a change to planner behavior; 5.0, 6.0, and 7.0 were too low
> to make a difference.
>
The difference between B and C looks like it is entirely noise, having to
do with how many buffers the query found already in the cache, how many of
them needed cleaning (which leaves the buffer dirty, since the cleaned
version now needs to be written to disk), and how many dirty buffers it
found that had to be written out to make way for other buffers it needed
to read. (That last number mostly reflects dirty buffers left around by
other activity which this query happened to encounter, not buffers the
query itself dirtied.) None of this is likely to be reproducible, and so
not worth investigating.
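For reference, those counters are reported per plan node by EXPLAIN
(ANALYZE, BUFFERS); a minimal sketch of where to look, with a stand-in
query:

```sql
-- The relevant line in each plan node looks like:
--   Buffers: shared hit=... read=... dirtied=... written=...
-- hit     = buffers already found in shared_buffers
-- read    = buffers fetched from the OS / storage
-- dirtied = buffers this query itself dirtied (e.g. by setting hint bits)
-- written = dirty buffers this backend had to write out to make room
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM conversation_item;  -- stand-in; substitute the real query
```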
And the difference between A and B/C is small enough that it is unlikely
to be worth pursuing either, even if it is reproducible. If your app runs
this one exact query often enough that a 30% difference is worth worrying
about, you would probably be better served by questioning the business
case. What are you doing with 1.4 million rows once you do fetch them,
such that it needs to be repeated so often?
If you think that taking a deep dive into this one query is going to
deliver knowledge which will pay off for other (so far unexamined) queries,
I suspect you are wrong. Look for queries where the misestimation is more
stark than 30% to serve as your case studies.
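One way to collect such case studies is auto_explain, which logs the real
plan, including estimated and actual row counts, for anything slower than
a threshold. A sketch, privileges permitting; on RDS the preload setting
goes through the DB parameter group rather than postgresql.conf:

```sql
-- Session-level sketch; for all sessions, add auto_explain to
-- shared_preload_libraries (on RDS: via the parameter group) instead.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';  -- only log queries slower than this
SET auto_explain.log_analyze = on;         -- include actual row counts and timings
SET auto_explain.log_buffers = on;         -- include buffer usage
-- Then look in the logs for nodes where the estimated and actual row
-- counts differ by orders of magnitude, not tens of percent.
```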
>
> Environment:
>
> Postgres 15.2
> Amazon RDS — db.m6g.2xlarge
>
>
> Questions:
>
> In Plan A, what factors are causing the planner to select a substantially
> slower plan despite having recent stats about number of rows?
>
Even if it were worth trying to answer this (which I think it is not),
there isn't much we can do with dummy tables containing no data. You would
need to include a script to generate data of a size and distribution which
reproduces the given behavior.
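Roughly, something along these lines is what such a script would look like
(with entirely made-up column names, sizes, and distributions, since we
don't know the real ones):

```sql
-- Hypothetical schema; a real script would need to reproduce your actual
-- row counts, row widths, data distribution, and join selectivity.
CREATE TABLE conversation_item (id bigint PRIMARY KEY, item_id bigint, payload text);
CREATE TABLE item_text (item_id bigint PRIMARY KEY, body text);

INSERT INTO conversation_item
SELECT g, (g % 1300000) + 1, repeat('x', 100)
FROM generate_series(1, 5000000) AS g;

INSERT INTO item_text
SELECT g, repeat('y', 500)
FROM generate_series(1, 1300000) AS g;

CREATE INDEX conversation_item_item_hash_index
    ON conversation_item USING hash (item_id);
ANALYZE conversation_item, item_text;
```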
> Is there a substantial difference between the on-the-fly hash done in
> Plan B and Plan C compared to the hash-index used in Plan A? Can I assume
> they are essentially the same? Perhaps there are differences in how
> they're applied?
They are pretty much entirely different. One jumps all over the index on
disk; the other reads the table sequentially and (due to work_mem) parcels
it out into chunks which it expects can each be read back in sequentially
as well. About the only thing not different is that they both involve
computing a hash function.
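If you want to see that batching, the Hash node in the B/C plans reports
it directly; a small sketch, assuming you re-run your own query:

```sql
-- In the hash-join plans, the Hash node reports something like:
--   Buckets: ...  Batches: ...  Memory Usage: ...kB
-- Batches > 1 means the hashed input was parcelled into work_mem-sized
-- chunks that are written out and read back sequentially, as described above.
SHOW work_mem;
SET work_mem = '256MB';  -- session-only experiment; pick a value you can afford
-- EXPLAIN (ANALYZE, BUFFERS) <the query>;  -- re-check the Batches count
RESET work_mem;
```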
> Is it common to see values for random_page_cost set as high as 8.0? We
> would of course need to investigate whether we see a net positive or net
> negative impact on other queries, to adopt this as a general setting, but
> is it a proposal we should actually consider?
I've never needed to set it that high, but there is no a priori reason it
couldn't make sense to do so. A setting that high would probably only be
suitable for HDD (rather than SSD) storage when caching is not very
effective, which seems to be the opposite of your situation. So I
certainly wouldn't do it just based on the evidence at hand.
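If you do experiment with it, note that it doesn't have to be a
cluster-wide change; a sketch of the narrower scopes, with hypothetical
role, database, and tablespace names:

```sql
-- Session-only, for experiments:
SET random_page_cost = 8.0;
RESET random_page_cost;

-- Per-role or per-database, if only one workload benefits:
-- ALTER ROLE reporting_user SET random_page_cost = 8.0;     -- hypothetical role
-- ALTER DATABASE analytics SET random_page_cost = 8.0;      -- hypothetical database

-- Per-tablespace, if only some storage is slow at random I/O:
-- ALTER TABLESPACE slow_hdd SET (random_page_cost = 8.0);   -- hypothetical tablespace
```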
Cheers,
Jeff