Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Geier <geidav(dot)pg(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date: 2024-07-08 03:43:01
Message-ID: CAApHDvqFtd-9DYH70sbjD7iB-Eq-xSip1LPr=nayfpPd1pkZVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 18 Feb 2024 at 11:31, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> 2) Leader vs. worker counters
>
> It seems to me this does nothing to add the per-worker values from "Heap
> Blocks" into the leader, which means we get stuff like this:
>
> Heap Blocks: exact=102 lossy=10995
> Worker 0: actual time=50.559..209.773 rows=215253 loops=1
> Heap Blocks: exact=207 lossy=19354
> Worker 1: actual time=50.543..211.387 rows=162934 loops=1
> Heap Blocks: exact=161 lossy=14636
>
> I think this is wrong / confusing, and inconsistent with what we do for
> other nodes.

Are you able to share which other nodes that you mean here?

I used the following to compare to Sort and Memoize, and as far as I
see, the behaviour matches with the attached v8 patch.

Is there some inconsistency here that I'm not seeing?

create table mill (a int);
create index on mill(a);
insert into mill select x%1000 from generate_Series(1,10000000)x;
vacuum analyze mill;
create table big (a int primary key);
insert into big select x from generate_series(1,10000000)x;
create table probe (a int);
insert into probe select 1 from generate_Series(1,1000000);
analyze big
analyze probe;

set parallel_tuple_cost=0;
set parallel_setup_cost=0;
set enable_indexscan=0;

-- compare Parallel Bitmap Heap Scan with Memoize and Sort.

-- each includes "Worker N:" with stats for the operation.
explain (analyze) select * from mill where a < 100;
explain (analyze) select * from big b inner join probe p on b.a=p.a;
explain (analyze) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- also includes actual time and rows for each worker.
explain (analyze, verbose) select * from mill where a < 100;
explain (analyze, verbose) select * from big b inner join probe p on b.a=p.a;
explain (analyze, verbose) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- shows a single total buffers which includes leader and worker buffers.
explain (analyze, buffers) select * from mill where a < 100;
explain (analyze, buffers) select * from big b inner join probe p on b.a=p.a;
explain (analyze, buffers) select * from probe order by a;

-- each includes "Worker N:" with stats for the operation
-- also includes actual time and rows for each worker.
-- shows a single total buffers which includes leader and worker buffers.
-- shows buffer counts for each worker process
explain (analyze, buffers, verbose) select * from mill where a < 100;
explain (analyze, buffers, verbose) select * from big b inner join
probe p on b.a=p.a;
explain (analyze, buffers, verbose) select * from probe order by a;

If we did want to adjust things to show the totals for each worker
rather than the stats for the leader, what would Sort Method show if
one worker spilled to disk and another did not?

David

Attachment Content-Type Size
v8-0001-Show-Parallel-Bitmap-Heap-Scan-worker-stats-in-EX.patch application/octet-stream 14.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-07-08 04:02:09 Re: SupportRequestRows support function for generate_series_timestamptz
Previous Message jian he 2024-07-08 03:18:23 Re: Doc Rework: Section 9.16.13 SQL/JSON Query Functions