Re: More efficient RI checks - take 2

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More efficient RI checks - take 2
Date: 2020-04-23 06:36:42
Message-ID: CAFj8pRDqVT3_4YRK=DW8GzFihdvcw8hOjpkgOTy4PO_gzkeMmA@mail.gmail.com
Lists: pgsql-hackers

On Thu, 23 Apr 2020 at 8:28, Antonin Houska <ah(at)cybertec(dot)at> wrote:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
> > On Thu, 23 Apr 2020 at 7:06, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >
> > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > > But it's not entirely clear to me that we know the best plan for a
> > > statement-level RI action with sufficient certainty to go that way.
> > > Is it really the case that the plan would not vary based on how
> > > many tuples there are to check, for example?
> >
> > I'm concerned about that too. With my patch the checks become a bit slower
> > if only a single row is processed. The problem seems to be that the planner
> > is not entirely convinced about the number of input rows, so it can still
> > build a plan that expects many rows. For example (as I mentioned elsewhere
> > in the thread), a hash join where the hash table only contains one tuple.
> > Or similarly a sort node for a single input tuple.
> >
> > Without statistics, the planner expects a table of about 2000 rows, no?
>
> I think that at some point it estimates the number of rows from the number
> of table pages, but I don't remember details.
>
> I wanted to say that if we constructed the plan "manually", we'd need at
> least two substantially different variants: one to check many rows and the
> other to check a single row.
>

There can be more variants: a hash join may not be good enough for
bigger data.

The overhead of RI is too big, so I think any solution that is faster
than the current one and can get into Postgres 14 would be great.

But when you know that the input is only one row, you can build a query
without a join.
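
To make the two query shapes concrete, here is a minimal sketch. It is not
the patch's code: it uses Python's sqlite3 module as a stand-in for
PostgreSQL, and the names pk_table, new_rows, find_violations_batch and
violates_single are hypothetical. The batch variant joins all new
referencing rows against the referenced table; the single-row variant is a
plain key lookup with no join.

```python
import sqlite3

# Hypothetical schema: pk_table is the referenced table, new_rows stands in
# for the set of newly inserted referencing rows (a transition table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pk_table (id INTEGER PRIMARY KEY);
CREATE TABLE new_rows (pk_id INTEGER);
INSERT INTO pk_table VALUES (1), (2), (3);
INSERT INTO new_rows VALUES (1), (4), (3), (5);
""")

def find_violations_batch(conn):
    """Many-row variant: anti-join the new rows against the referenced table."""
    return conn.execute(
        """
        SELECT n.pk_id
        FROM new_rows n
        LEFT JOIN pk_table p ON p.id = n.pk_id
        WHERE p.id IS NULL
        ORDER BY n.pk_id
        """
    ).fetchall()

def violates_single(conn, pk_id):
    """Single-row variant: one key lookup, no join needed."""
    row = conn.execute(
        "SELECT 1 FROM pk_table WHERE id = ?", (pk_id,)
    ).fetchone()
    return row is None

batch_violations = find_violations_batch(conn)
single_ok = violates_single(conn, 2)
single_bad = violates_single(conn, 4)
```

With one input row the lookup form leaves the planner nothing to choose
wrongly, whereas the join form depends on the row estimate for the input
set.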

> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
>
