From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Subject: | Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation) |
Date: | 2018-01-03 03:41:35 |
Message-ID: | CAH2-WzkW3WtMvgz=obqL_4vrYT3LE9j0eybPJiXrpvZnkkOauw@mail.gmail.com |
Lists: | pgsql-hackers |

On Tue, Jan 2, 2018 at 1:38 AM, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> Need to do after the indexRelation build. So I added after update of
> pg_index,
> as indexRelation needed for plan_create_index_worders().
>
> Attaching the separate patch the same.

This made it so that REINDEX and CREATE INDEX CONCURRENTLY no longer
used parallelism. I think we need to do this very late, just before
nbtree's ambuild() routine is called from index.c.

> So you suggesting that need to do adjustment with the output of
> compute_parallel_worker() by considering parallel_leader_participation?

We know for sure that there is no reason to not use the leader process
as a worker process in the case of parallel CREATE INDEX. So we must
not have the number of participants (i.e. worker Tuplesortstates) vary
based on the current parallel_leader_participation setting. While
parallel_leader_participation can affect the number of worker
processes requested, that's a different thing. There is no question
about parallel_leader_participation ever being relevant to performance
-- it's strictly a testing option for us.

Even after parallel_leader_participation was added,
compute_parallel_worker() still assumes that the sequential scan
leader is always too busy to help. compute_parallel_worker() seems to
think that that's something that the leader does in "rare" cases not
worth considering -- cases where it has no worker tuples to consume
(maybe I'm reading too much into it not caring about
parallel_leader_participation, but I don't think so). If
compute_parallel_worker()'s assumption was questionable before, it's
completely wrong for parallel CREATE INDEX. I think
plan_create_index_workers() needs to count the leader-as-worker as an
ordinary worker, not special in any way, by deducting one worker from
what compute_parallel_worker() returns. (This only happens when it's
necessary to compensate -- when leader-as-worker participation is
going to go ahead.)
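
A minimal sketch of that adjustment, for illustration only. The helper names workers_to_request() and total_participants() are hypothetical and do not come from the patch; the sketch just assumes that the estimate returned by compute_parallel_worker() is treated as the desired number of sort participants, and that one worker process is given up when the leader also sorts:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): given the worker count that a
 * compute_parallel_worker()-style estimate produced, return how many
 * worker processes to request.  When the leader participates as a worker,
 * it takes one of the slots, so one fewer process is requested.
 */
int
workers_to_request(int estimated_workers, bool leader_participates)
{
    assert(estimated_workers >= 0);

    /* Nothing to compensate for when no parallelism was planned. */
    if (estimated_workers == 0)
        return 0;

    /* Leader-as-worker counts as an ordinary worker: deduct one process. */
    if (leader_participates)
        return estimated_workers - 1;

    return estimated_workers;
}

/*
 * Hypothetical helper: number of participants (worker Tuplesortstates)
 * that end up sorting, counting the leader when it participates.
 */
int
total_participants(int worker_processes, bool leader_participates)
{
    return worker_processes + (leader_participates ? 1 : 0);
}
```

With an estimate of 4, this requests 3 worker processes when the leader participates and 4 when it does not; in both cases 4 participants do the sorting, so the participant count does not vary with the setting.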
I'm working on fixing up what you posted. I'm probably not more than a
week away from posting a patch that I'm going to mark "ready for
committer". I've already made the change above, and once I spend time
on trying to break the few small changes needed within buffile.c I'll
have taken it as far as I can, most likely.
--
Peter Geoghegan