Helping planner to chose sequential scan when it improves performance

From: "Patrick O'Toole" <patrick(dot)otoole(at)sturdy(dot)ai>
To: pgsql-general(at)postgresql(dot)org
Subject: Helping planner to chose sequential scan when it improves performance
Date: 2023-06-13 19:24:51
Message-ID: CAP_eXGKJrr_Bo4MaHtvhxzTi9aneJAfEA8z5KW_LwdBFWk684Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Hi all,

I recently started at a new firm and have been trying to help to grok
certain planner behavior. A strip-down example of the sort of join we do in
the database looks like this, wherein we join two tables that have about 1
million rows:

-- VACUUM (FULL, VERBOSE, ANALYZE), run the query twice first, then...
EXPLAIN(ANALYZE, VERBOSE, COSTS, SETTINGS, BUFFERS, WAL, TIMING, SUMMARY)
SELECT
ci.conversation_uuid,
ci.item_uuid,
ci.tenant_id,
it.item_subject,
it.item_body_start
FROM
conversation_item AS ci
INNER JOIN item_text AS it ON it.item_uuid = ci.item_uuid;

-- The necessary DDL that creates these tables and indexes is attached.
I've commented out some extra stuff that isn't directly related to the
above query.

Depending on config, we get different results in terms of performance
(EXPLAIN output attached):

PLAN A (default config, effective cache size just shy of 15GB): 3.829
seconds. A nested loop is used to probe the hash index
`conversation_item_item_hash_index` for each row of item_text. Although the
cost of probing once is low, a fair amount of time passes because the
operation is repeated ~1.3 million times.

PLAN B (enable_indexscan off, effective cache same as before): 3.254
seconds (~15% speedup, sometimes 30%). Both tables are scanned sequentially
and conversation_item is hashed before results are combined with a hash
join.

PLAN C: (random_page_cost = 8.0, instead of default 4, effective cache same
as before): 2.959 (~23% speedup, sometimes 38%). Same overall plan as PLAN
B, some differences in buffers and I/O. I'll note we had to get to 8.0
before we saw a change to planner behavior; 5.0, 6.0, and 7.0 were too low
to make a difference.

Environment:

Postgres 15.2
Amazon RDS — db.m6g.2xlarge

Questions:

1. In Plan A, what factors are causing the planner to select a
substantially slower plan despite having recent stats about number of rows?
2. Is there a substantial difference between the on-the-fly hash done in
Plan B and Plan C compared to the hash-index used in Plan A? Can I assume
they are essentially the same? Perhaps there are there differences in how
they're applied?
3. Is it common to see values for random_page_cost set as high as 8.0?
We would of course need to investigate whether we see a net positive or net
negative impact on other queries, to adopt this as a general setting, but
is it a proposal we should actually consider?
4. Maybe we are barking up the wrong tree with the previous questions.
Are there other configuration parameters we should consider first to
improve performance in situations like the one illustrated?
5. Are there other problems with our schema, query, or plans shown here?
Other approaches (or tools/analyses) we should consider?

Attachment Content-Type Size
PLANC.txt text/plain 1.9 KB
PLANB.txt text/plain 2.0 KB
PLANA.txt text/plain 1.6 KB
create_statements.postgre.sql application/octet-stream 1.7 KB

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2023-06-13 20:15:41 Re: Exclusion constraint with negated operator?
Previous Message Torsten Förtsch 2023-06-13 18:11:01 Exclusion constraint with negated operator?

Browse pgsql-performance by date

  From Date Subject
Next Message Ruslan Zakirov 2023-06-14 23:00:12 Re: Helping planner to chose sequential scan when it improves performance
Previous Message Pavel Stehule 2023-06-13 15:00:57 Re: Postgresql equal join on function with columns not use index