Re: Parallel bitmap heap scan

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap heap scan
Date: 2017-03-08 14:58:42
Message-ID: CA+TgmoZQTp7AwEP+sR6QyH9tagsUymgry+t5Ob7w-gBa08-M_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Reviewing 0003:

How about adding a regression test?

bitmap_subplan_mark_shared could use castNode(), which seems like it
would be better style. Maybe some other places, too.

+ <entry><literal>ParallelBitmapPopulate</></entry>
+ <entry>Waiting for the leader to populate the TidBitmap.</entry>
+ </row>
+ <row>

If you build the documentation, you'll find that this doesn't come out
right; you need to add 1 to the value of the nearest preceding
"morerows". (I fixed a similar issue with 0001 while committing.)

+ /*---------------
+ * Check the current state
+ * If state is
+ * BM_INITIAL : We become the leader and set it to BM_INPROGRESS
+ * BM_INPROGRESS : We need to wait till leader creates bitmap
+ * BM_FINISHED : bitmap is ready so no need to wait
+ *---------------

The formatting of this comment is slightly off - the comment for
BM_INITIAL isn't aligned the same as the others. But I would just
delete the whole comment, since more or less it recapitulates the
function header comment anyway.

I wonder if BitmapShouldInitializeSharedState couldn't be written a
little more compactly overall, like this:

{
SharedBitmapState state;

while (1)
{
SpinLockAcquire(&pstate->mutex);
state = pstate->state;
if (pstate->state == BM_INITIAL)
pstate->state = BM_INPROGRESS;
SpinLockRelease(&pstate->mutex);

/* If we are leader or leader has already created a TIDBITMAP */
if (state != BM_INPROGRESS)
break;

/* Sleep until leader finishes creating the bitmap */
ConditionVariableSleep(&pstate->cv, WAIT_EVENT_PARALLEL_BITMAP_SCAN);
}

ConditionVariableCancelSleep();

return (state == BM_INITIAL);
}

+ /*
+ * By this time we have already populated the TBM and
+ * initialized the shared iterators so set the state to
+ * BM_FINISHED and wake up others.
+ */
+ SpinLockAcquire(&pstate->mutex);
+ pstate->state = BM_FINISHED;
+ SpinLockRelease(&pstate->mutex);
+ ConditionVariableBroadcast(&pstate->cv);

I think it would be good to have a function for this, like
BitmapDoneInitializingSharedState(), and just call that function here.

+ SpinLockAcquire(&pstate->mutex);
+
+ /*
+ * Recheck under the mutex, If some other process has already done
+ * the enough prefetch then we need not to do anything.
+ */
+ if (pstate->prefetch_pages >= pstate->prefetch_target)
+ SpinLockRelease(&pstate->mutex);
+ return;
+ SpinLockRelease(&pstate->mutex);

I think it would be clearer to write this as:

SpinLockAcquire(&pstate->mutex);
do_prefetch = (pstate->prefetch_pages >= pstate->prefetch_target);
SpinLockRelease(&pstate->mutex);
if (!do_prefetch)
return;

Then it's more obvious what's going on with the spinlock. But
actually what I would do is also roll in the increment to prefetch
pages in there, so that you don't have to reacquire the spinlock after
calling PrefetchBuffer:

bool do_prefetch = false;
SpinLockAcquire(&pstate->mutex);
if (pstate->prefetch_pages < pstate->prefetch_target)
{
pstate->prefetch_pages++;
do_prefetch = true;
}
SpinLockRelease(&pstate->mutex);

That seems like it will reduce the amount of excess prefetching
considerably, and also simplify the code and cut the spinlock
acquisitions by 50%.

Overall I think this is in pretty good shape.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-03-08 15:04:26 Re: foreign partition DDL regression tests
Previous Message Tom Lane 2017-03-08 14:53:50 Re: Removing #include "postgres.h" from a couple of headers