Re: should we have a fast-path planning for OLTP starjoins?

From: Jim Nasby <jnasby(at)upgrade(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: should we have a fast-path planning for OLTP starjoins?
Date: 2025-02-04 22:28:47
Message-ID: CAMFBP2pP3Lizcbisp6Q_7Y1_vo4MkywxUkZ=MXt7LcBxGn_GhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 4, 2025 at 3:42 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:

> On 2/4/25 21:23, Tom Lane wrote:
> > Tomas Vondra <tomas(at)vondra(dot)me> writes:
> >> On 2/4/25 20:43, Jeff Davis wrote:
> >>> If you base it on the join conditions rather than the size of the
> >>> table, then detection of the star join would be based purely on the
> >>> query structure (not stats), which would be nice for predictability.
> >
> >> Right, there may be other (possibly better) ways to detect the star join
> >> shape. I was thinking about also requiring for foreign keys on the join
> >> clauses - in DWH systems FKeys are sometimes omitted, which would break
> >> the heuristics, but in OLTP it's common to still have them.
> >
> > I think you need to insist on foreign keys. Otherwise you don't know
> > whether the joins will eliminate fact-table rows. If that's a
> > possibility then it's no longer sensible to ignore different join
> > orders.
>
> Hmmm, yeah. But that's only for the INNER JOIN case. But I've seen many
> of these star join queries with LEFT JOIN too, and then the FKs are not
> needed. All you need is a PK / unique index on the other side.
>
> Perhaps requiring (INNER JOIN + FK) or (LEFT JOIN + PK) would be enough
> to make this work for most cases, and then the rest would simply use the
> regular join order algorithm.
>
> I was thinking that if we allow the dimensions to eliminate rows in the
> fact table, we'd simply join them starting from the most selective ones.
> But that doesn't work if the joins might have different per-row costs
> (e.g. because some dimensions are much larger etc). Doing something
> smarter would likely end up fairly close to the regular join order
> algorithm ...
>

As long as the join is still happening there doesn't appear to be a
correctness issue here, so I'm not sure mandating FKs makes sense.

The reason this matters is that highly concurrent FK checks can get VERY
expensive (due to the cost of creating multiXacts). While it'd be great to
fix that issue the reality today is it's not uncommon for people to remove
FKs because of the high performance penalty.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2025-02-04 22:34:46 Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Previous Message Masahiko Sawada 2025-02-04 22:06:22 Re: UUID v7