From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Melanie Plageman <melanieplageman(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel leader process info in EXPLAIN |
Date: | 2020-01-26 22:49:10 |
Message-ID: | 18323.1580078950@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> I think I'm going to abandon 0002 for now, because that stuff is being
> refactored independently over here, so rebasing would be futile:
> https://www.postgresql.org/message-id/flat/CAOtHd0AvAA8CLB9Xz0wnxu1U%3DzJCKrr1r4QwwXi_kcQsHDVU%3DQ%40mail.gmail.com
Yeah, your 0002 needs some rethinking. I kind of like the proposed
change in the text-format output:
   Workers Launched: 4
   ->  Sort (actual rows=2000 loops=15)
         Sort Key: tenk1.ten
-        Sort Method: quicksort  Memory: xxx
+        Leader:  Sort Method: quicksort  Memory: xxx
         Worker 0:  Sort Method: quicksort  Memory: xxx
         Worker 1:  Sort Method: quicksort  Memory: xxx
         Worker 2:  Sort Method: quicksort  Memory: xxx
but it's quite unclear to me how that translates into non-text
formats, especially if we're not to break invariants about which
fields are present in a non-text output structure (cf [1]).
I've occasionally wondered whether we'd be better off presenting
this info as if the leader were "worker 0" and then the N workers
are workers 1 to N. I've not worked out the implications of that
in any detail though. It's fairly easy to see what to do for
fields that can be aggregated (the numbers printed for the node
as a whole are totals), but it doesn't help us any with something
like Sort Method.
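To make the "leader as worker 0" idea concrete, here is a rough sketch of what I have in mind. Everything in it (ParticipantStats, NodeTotals, sum_participants) is a hypothetical name invented for illustration, not an existing PostgreSQL structure or function. Slot 0 holds the leader and slots 1 to N hold the workers; the numeric counters sum into node-level totals, while sort_method has no meaningful total and has to stay per-participant.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-participant counters (NOT PostgreSQL's Instrumentation
 * struct).  Slot 0 is the leader; slots 1..N are the parallel workers.
 */
typedef struct ParticipantStats
{
    double      ntuples;        /* rows produced by this participant */
    long        nloops;         /* times this participant ran the node */
    const char *sort_method;    /* e.g. "quicksort"; cannot be summed */
} ParticipantStats;

/* Node-level totals: only the fields that can be aggregated appear here. */
typedef struct NodeTotals
{
    double      ntuples;
    long        nloops;
} NodeTotals;

/* Sum the aggregatable counters over the leader and all workers. */
static NodeTotals
sum_participants(const ParticipantStats *p, size_t n)
{
    NodeTotals  totals = {0.0, 0};

    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        totals.ntuples += p[i].ntuples;
        totals.nloops += p[i].nloops;
        /* p[i].sort_method is deliberately left out: there is no total. */
    }
    return totals;
}
```

The totals work the same way whether or not the leader participated, but the sketch also shows the gap: a field like sort_method still needs some per-participant presentation.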
On a narrower note, I'm not at all happy with the fact that 0001
adds yet another field to *every* PlanState. I think this is
doubling down on a fundamentally wrong decision to have
ExecParallelRetrieveInstrumentation do some aggregation immediately.
I think we should abandon that and just say that it returns the raw
leader and per-worker data, and then explain.c can aggregate as it
wishes.
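As a minimal sketch of that split, assuming made-up names throughout (RawCounters, RawNodeInstr and aggregate_for_display are hypothetical, and this is not the current signature or behavior of ExecParallelRetrieveInstrumentation): the retrieval side only hands back the unmodified leader and per-worker counters, and the consumer computes a total at the point where the output needs one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters for one process's execution of one plan node. */
typedef struct RawCounters
{
    double      ntuples;        /* rows produced */
    long        nloops;         /* number of executions */
} RawCounters;

/*
 * Hypothetical result of the retrieval step: leader and worker data are
 * kept separate and unaggregated.
 */
typedef struct RawNodeInstr
{
    RawCounters leader;
    int         num_workers;
    const RawCounters *workers; /* num_workers entries */
} RawNodeInstr;

/*
 * Consumer-side aggregation, standing in for what the EXPLAIN code could
 * do.  The raw data is left untouched, so per-process output remains
 * available afterwards.
 */
static RawCounters
aggregate_for_display(const RawNodeInstr *raw)
{
    RawCounters total = raw->leader;

    assert(raw->num_workers == 0 || raw->workers != NULL);
    for (int i = 0; i < raw->num_workers; i++)
    {
        total.ntuples += raw->workers[i].ntuples;
        total.nloops += raw->workers[i].nloops;
    }
    return total;
}
```

Because the leader's own numbers are never overwritten by the sum, nothing extra has to be stashed in each PlanState just to recover them later.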
regards, tom lane
[1] https://www.postgresql.org/message-id/19416.1580069629%40sss.pgh.pa.us