Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Geier <geidav(dot)pg(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date: 2024-07-08 23:51:23
Message-ID: CAApHDvoom6ABLgyNW3FtqtQxFB4Hz-6Y=zMy1EobPEOOMBDfUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 8 Jul 2024 at 15:43, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Sun, 18 Feb 2024 at 11:31, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> > 2) Leader vs. worker counters
> >
> > It seems to me this does nothing to add the per-worker values from "Heap
> > Blocks" into the leader, which means we get stuff like this:
> >
> > Heap Blocks: exact=102 lossy=10995
> > Worker 0: actual time=50.559..209.773 rows=215253 loops=1
> > Heap Blocks: exact=207 lossy=19354
> > Worker 1: actual time=50.543..211.387 rows=162934 loops=1
> > Heap Blocks: exact=161 lossy=14636
> >
> > I think this is wrong / confusing, and inconsistent with what we do for
> > other nodes.
>
> Are you able to share which other nodes that you mean here?

I did the analysis on this and out of the node types that have
parallel instrumentation (per ExecParallelRetrieveInstrumentation()),
Parallel Hash is the only node that does anything different from the
others. Looking at the loop inside show_hash_info(), you can see it
takes the Max() of each property. There's some discussion in [1] about
why this came about. In particular [2].

I see no reason to copy the odd one out here, so I'm planning on going
ahead with the patch that has Bitmap Heap Scan copy what the majority
of other nodes do. I think we should consider aligning Parallel Hash
with the other Parallel node behaviour.

I've attached the (roughly done) schema and queries I used to obtain
the plans to do this analysis.

David

[1] https://www.postgresql.org/message-id/flat/20200323165059.GA24950%40alvherre.pgsql
[2] https://www.postgresql.org/message-id/31321.1586549487%40sss.pgh.pa.us

Attachment Content-Type Size
explain_analyze_parallel_stats_analysis.txt text/plain 3.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-07-08 23:52:25 Re: pgsql: Add pg_get_acl() to get the ACL for a database object
Previous Message Jeff Davis 2024-07-08 23:39:21 Re: Faster "SET search_path"