Re: Parallel bitmap index scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap index scan
Date: 2020-07-28 08:36:02
Message-ID: CAFiTN-uc2c8R2Sy2c-0fCpKRwQchdUXDys14_6oPdqiPHrhyiQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 27, 2020 at 3:48 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Mon, Jul 27, 2020 at 1:58 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > On Sun, Jul 26, 2020 at 6:42 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > I would like to propose a patch for enabling the parallelism for the
> > > bitmap index scan path.
>
> Workers Planned: 4
> -> Parallel Bitmap Heap Scan on tenk1
> Recheck Cond: (hundred > 1)
> - -> Bitmap Index Scan on tenk1_hundred
> + -> Parallel Bitmap Index Scan on tenk1_hundred
> Index Cond: (hundred > 1)
>
> +1, this is a very good feature to have.
>
> +        /* Merge bitmap to a common shared bitmap */
> +        SpinLockAcquire(&pstate->mutex);
> +        tbm_merge(tbm, &pstate->tbm_shared, &pstate->pt_shared);
> +        SpinLockRelease(&pstate->mutex);
>
> This doesn't look like a job for a spinlock.
>
> You have a barrier so that you can wait until all workers have
> finished merging their partial bitmap into the complete bitmap, which
> makes perfect sense. You also use that spinlock (probably should be
> LWLock) to serialise the bitmap merging work... Hmm, I suppose there
> would be an alternative design which also uses the barrier to
> serialise the merge, and has the merging done entirely by one process,
> like this:
>
> bool chosen_to_merge = false;
>
> /* Attach to the barrier, and see what phase we're up to. */
> switch (BarrierAttach())
> {
> case PBH_BUILDING:
> ... build my partial bitmap in shmem ...
> chosen_to_merge = BarrierArriveAndWait();
> /* Fall through */
>
> case PBH_MERGING:
> if (chosen_to_merge)
> ... perform merge of all partial results into final shmem bitmap ...
> BarrierArriveAndWait();
> /* Fall through */
>
> case PBH_SCANNING:
> /* We attached too late to help build the bitmap. */
> BarrierDetach();
> break;
> }
>
> Just an idea, not sure if it's a good one. I find it a little easier
> to reason about the behaviour of late-attaching workers when the
> phases are explicitly named and handled with code like that (it's not
> immediately clear to me whether your coding handles late attachers
> correctly, which seems to be one of the main problems to think about
> with "dynamic party" parallelism...).

Actually, for merging, I could not use the strategy you suggested, because
in this case all the workers prepare their own TBMs and merge them into the
shared TBM. Basically, we don't need to choose a leader for the merge:
every worker merges its TBM into the shared location, just one at a time,
and we also don't need to wait for all the workers to finish preparing
their TBMs before they start merging. However, once the merging is done we
do need to wait for all the workers to finish it, and only after that is a
single worker allowed to prepare the shared iterator. For that part, I
have used your idea of the barrier phases.
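
To make that concrete, the per-worker flow is roughly the sketch below.
This is only an outline: pstate stands for the patch's shared state,
merge_lock is the hypothetical LWLock from the previous sketch, the barrier
field and the wait event are placeholder names (the wait event is borrowed
from the existing parallel bitmap heap scan code), and
prepare_shared_iterator() is a stand-in for whatever the patch actually
calls to build the shared iterator.

    static void
    worker_merge_and_wait(ParallelBitmapState *pstate, TIDBitmap *tbm)
    {
        /*
         * At this point the worker has already built its own partial TBM;
         * it never had to wait for the other workers to reach this point.
         */

        /* Merge into the shared TBM, one worker at a time. */
        LWLockAcquire(&pstate->merge_lock, LW_EXCLUSIVE);
        tbm_merge(tbm, &pstate->tbm_shared, &pstate->pt_shared);
        LWLockRelease(&pstate->merge_lock);

        /*
         * Wait until every attached worker has merged.  BarrierArriveAndWait()
         * returns true in exactly one participant, so we use it to elect the
         * single worker that prepares the shared iterator.
         */
        if (BarrierArriveAndWait(&pstate->barrier, WAIT_EVENT_PARALLEL_BITMAP_SCAN))
        {
            /*
             * Only the elected worker prepares the shared iterator over the
             * merged bitmap; the others skip straight past this.
             */
            prepare_shared_iterator(pstate);
        }

        /* All workers can now attach to the shared iterator and scan the heap. */
    }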

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v3-0001-POC-Parallel-Bitmap-Index-Scan.patch application/octet-stream 39.2 KB
