Re: Wrong results from Parallel Hash Full Join

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Wrong results from Parallel Hash Full Join
Date: 2023-04-20 00:47:07
Message-ID: CAAKRu_YFUC+Sh9969RJ7GCSX=nPOQLVJp9UHK_iHtVT=wkBEKQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 19, 2023 at 8:41 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Apr 19, 2023 at 12:20:51PM -0700, Andres Freund wrote:
> > On 2023-04-19 12:16:24 -0500, Justin Pryzby wrote:
> > > On Wed, Apr 19, 2023 at 11:17:04AM -0400, Melanie Plageman wrote:
> > > > Ultimately this is probably fine. If we wanted to modify one of the
> > > > existing tests to cover the multi-batch case, changing the select
> > > > count(*) to a select * would do the trick. I imagine we wouldn't want to
> > > > do this because of the excessive output this would produce. I wondered
> > > > if there was a pattern in the tests for getting around this.
> > >
> > > You could use explain (ANALYZE). But the output is machine-dependent in
> > > various ways (which is why the tests use "explain analyze" so rarely).
> >
> > I think with sufficient options it's not machine specific.
>
> It *can* be machine specific depending on the node type.
>
> In particular, for parallel workers, it shows "Workers Launched: ..",
> which can vary even across executions on the same machine. And don't
> forget about "loops=".
>
> Plus:
> src/backend/commands/explain.c: "Buckets: %d Batches: %d Memory Usage: %ldkB\n",
>
> > We have a bunch of
> > EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) ..
> > in our tests.
>
> There's 81 uses of "timing off", out of a total of ~1600 explains. Most
> of them are in partition_prune.sql. explain analyze is barely used.
>
> I sent a patch to elide the machine-specific parts, which would make it
> easier to use. But there was no interest.

While I don't know about other use cases, I would have used that here.
Do you still have that patch lying around? I'd be interested to at
least review it.
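
To make "eliding the machine-specific parts" concrete, here is a rough
sketch of the idea as a post-processing step over EXPLAIN ANALYZE text.
This is not the patch mentioned above (which is not shown in this
thread); the function name mask_explain, the list of patterns, and the
sample plan text are all assumptions made up for illustration. The
fields it masks are the ones named upthread: "Workers Launched",
"loops=", and the Buckets/Batches/Memory Usage line.

```python
import re

# Hypothetical masking rules, one per field called out as unstable in the
# thread. Each rule keeps the label and replaces only the number with "N".
_MASKS = [
    (re.compile(r"(Workers Launched: )\d+"), r"\1N"),
    (re.compile(r"(loops=)\d+"), r"\1N"),
    (re.compile(r"(Buckets: )\d+"), r"\1N"),
    (re.compile(r"(Batches: )\d+"), r"\1N"),
    (re.compile(r"(Memory Usage: )\d+kB"), r"\1NkB"),
]


def mask_explain(text: str) -> str:
    """Return the plan text with the unstable numbers replaced by N."""
    for pattern, replacement in _MASKS:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    # Made-up plan fragment, not output captured from a real server.
    sample = (
        "Gather (actual rows=10 loops=1)\n"
        "  Workers Launched: 2\n"
        "  Buckets: 1024  Batches: 4  Memory Usage: 40kB\n"
    )
    print(mask_explain(sample))
```

Masking "Batches" is a choice in this sketch; a test that specifically
wants to confirm a multi-batch join might prefer to keep that number
visible and mask only the rest.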

- Melanie
