| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Jim Finnerty <jfinnert(at)amazon(dot)com> |
| Cc: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Parallel append plan instability/randomness |
| Date: | 2018-01-08 16:42:02 |
| Message-ID: | 2171.1515429722@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
Jim Finnerty <jfinnert(at)amazon(dot)com> writes:
> Ordering the plan output elements by estimated cost will cause spurious plan
> changes to be reported after table cardinalities change. Can we choose an
> explain output order that is stable to changes in cardinality, please?

I found the code that's doing this, in create_append_path, and it says:

     * For parallel append, non-partial paths are sorted by descending total
     * costs. That way, the total time to finish all non-partial paths is
     * minimized. Also, the partial paths are sorted by descending startup
     * costs. There may be some paths that require to do startup work by a
     * single worker. In such case, it's better for workers to choose the
     * expensive ones first, whereas the leader should choose the cheapest
     * startup plan.

There's some merit in that argument, although I'm not sure how much.
It's certainly pointless to sort that way if the expected number of
workers is >= the number of subpaths. More generally, I wonder if
it wouldn't be better to implement this behavior at runtime rather
than plan time. Something along the lines of "workers choose the
highest-cost remaining subplan, but leader chooses the lowest-cost one".
regards, tom lane
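
The runtime rule suggested above ("workers choose the highest-cost remaining subplan, but leader chooses the lowest-cost one") can be illustrated with a small simulation. This is only a sketch and not PostgreSQL executor code: `Subplan`, `pick_next`, and `simulate` are hypothetical names, every subplan is treated as non-partial (run by exactly one participant), and the planner's estimated total cost is assumed to stand in for run time.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Subplan:
    name: str
    total_cost: float  # planner estimate, used here as a stand-in for run time


def pick_next(remaining, is_leader):
    """Return the subplan a participant should claim next, or None if none remain."""
    if not remaining:
        return None
    # Leader takes the cheapest remaining subplan; workers take the most expensive.
    choose = min if is_leader else max
    return choose(remaining, key=lambda sp: sp.total_cost)


def simulate(subplans, n_workers, leader_participates=True):
    """Greedy simulation: whenever a participant becomes free, it claims a subplan.

    Participant 0 is the leader; 1..n_workers are workers.
    Returns (finish_time, {subplan name: participant id}).
    """
    remaining = list(subplans)
    first_id = 0 if leader_participates else 1
    # Heap of (time at which the participant is free, participant id).
    free = [(0.0, pid) for pid in range(first_id, n_workers + 1)]
    heapq.heapify(free)

    assignment = {}
    finish = 0.0
    while remaining:
        free_at, pid = heapq.heappop(free)
        sp = pick_next(remaining, is_leader=(pid == 0))
        remaining.remove(sp)
        assignment[sp.name] = pid
        done = free_at + sp.total_cost
        finish = max(finish, done)
        heapq.heappush(free, (done, pid))
    return finish, assignment


if __name__ == "__main__":
    plans = [Subplan("a", 10.0), Subplan("b", 40.0),
             Subplan("c", 20.0), Subplan("d", 30.0)]
    print(simulate(plans, n_workers=2))
```

Because the choice is made when a participant becomes free, nothing in this sketch depends on the order in which the subplans are listed, so the displayed order could stay fixed as costs change. When there are at least as many workers as subplans, each subplan is claimed immediately and the finish time is just the cost of the most expensive one, which matches the observation that sorting is pointless in that case.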