From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should we have a fast-path planning for OLTP starjoins? |
Date: | 2025-02-04 20:06:53 |
Message-ID: | 9edd7fa2-843d-4381-9836-5771c2c72b86@vondra.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2/4/25 20:43, Jeff Davis wrote:
> On Tue, 2025-02-04 at 15:00 +0100, Tomas Vondra wrote:
>> This is a surprisingly common query pattern in OLTP applications,
>> thanks
>> to normalization.
>
> +1. Creating a small lookup table should be encouraged rather than
> penalized.
>
> Your test data includes a fact table with 10k rows and no index on the
> filter condition. In OLTP applications the fact table might often fit
> in memory, but I'd still expect it to have an index on the filter
> condition. That might not change your overall point, but I'm curious
> why you constructed the test that way?
>
No particular reason. I think I intended to make it a lookup by PK
(which would match the use case examples), and I forgot about that. But
yeah, I would expect an index too.
>
>> There's a lot of stuff that could / should be improved on the current
>> patch. For (1) we might add support for more complex cases with
>> snowflake schemas [3] or with multiple fact tables. At the same time
>> (1)
>> needs to be very cheap, so that it does not regress every non-
>> starjoin
>> query.
>
> The patch only considers the largest table as the fact table, which is
> a good heuristic of course. However, I'm curious if other approaches
> might work. For instance, could we consider the table involved in the
> most join conditions to be the fact table?
>
> If you base it on the join conditions rather than the size of the
> table, then detection of the star join would be based purely on the
> query structure (not stats), which would be nice for predictability.
>
Right, there may be other (possibly better) ways to detect the star join
shape. I was thinking about also requiring for foreign keys on the join
clauses - in DWH systems FKeys are sometimes omitted, which would break
the heuristics, but in OLTP it's common to still have them.
I think the cost of the heuristic will be an important metric - I don't
know if the number of join conditions is more expensive to determine
than what the patch does now, though.
>> But the bigger question is whether it makes sense to have such fast-
>> path
>> modes for certain query shapes.
>
> We should explore what kinds of surprising cases it might create, or
> what maintenance headaches might come up with future planner changes.
> But the performance numbers you posted suggest that we should do
> something here.
>
Yes, it seems like an interesting opportunity for starjoin queries. It's
a pretty common query pattern, but it also happens to be very expensive
to plan because the dimensions can be reordered almost arbitrarily.
regards
--
Tomas Vondra
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2025-02-04 20:11:29 | Re: new commitfest transition guidance |
Previous Message | Robert Treat | 2025-02-04 19:57:40 | Re: Eagerly scan all-visible pages to amortize aggressive vacuum |