Re: Support tid range scan in parallel?

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support tid range scan in parallel?
Date: 2024-08-11 07:03:28
Message-ID: CAEG8a3JYRPWuZTSY8YHUi9TfnSSoXV8Fwvi65cN7HL04w=HnTw@mail.gmail.com
Lists: pgsql-hackers

Hi Cary,

On Wed, May 15, 2024 at 5:33 AM Cary Huang <cary(dot)huang(at)highgo(dot)ca> wrote:
>
> > You should add a test that creates a table with a very low fillfactor,
> > low enough so only 1 tuple can fit on each page and insert a few dozen
> > tuples. The test would do SELECT COUNT(*) to ensure you find the
> > correct subset of tuples. You'd maybe want MIN(ctid) and MAX(ctid) in
> > there too for extra coverage to ensure that the correct tuples are
> > being counted. Just make sure and EXPLAIN (COSTS OFF) the query first
> > in the test to ensure that it's always testing the plan you're
> > expecting to test.
>
> thank you for the test suggestion. I moved the regress tests from select_parallel
> to tidrangescan instead. I follow the existing test table creation in tidrangescan
> with the lowest fillfactor of 10, I am able to get consistent 5 tuples per page
> instead of 1. It should be okay as long as it is consistently 5 tuples per page so
> the tuple count results from parallel tests would be in multiples of 5.
>
> The attached v4 patch includes the improved regression tests.
>
> Thank you very much!
>
> Cary Huang
> -------------
> HighGo Software Inc. (Canada)
> cary(dot)huang(at)highgo(dot)ca
> www.highgo.ca
>

+++ b/src/backend/access/heap/heapam.c
@@ -307,6 +307,7 @@ initscan(HeapScanDesc scan, ScanKey key, bool keep_startblock)
* results for a non-MVCC snapshot, the caller must hold some higher-level
* lock that ensures the interesting tuple(s) won't change.)
*/
+

I see no reason to add a blank line here. Is it a typo?

+/* ----------------------------------------------------------------
+ * ExecSeqScanInitializeWorker
+ *
+ * Copy relevant information from TOC into planstate.
+ * ----------------------------------------------------------------
+ */
+void
+ExecTidRangeScanInitializeWorker(TidRangeScanState *node,
+ ParallelWorkerContext *pwcxt)
+{
+ ParallelTableScanDesc pscan;

The function name in the comment (ExecSeqScanInitializeWorker) does not
match the function being defined (ExecTidRangeScanInitializeWorker).

@@ -81,6 +81,7 @@ typedef struct ParallelBlockTableScanDescData
BlockNumber phs_startblock; /* starting block number */
pg_atomic_uint64 phs_nallocated; /* number of blocks allocated to
* workers so far. */
+ BlockNumber phs_numblock; /* max number of blocks to scan */
} ParallelBlockTableScanDescData;

Can this be reorganized by putting phs_numblock after phs_startblock?
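
To illustrate the ordering I have in mind, here is a minimal standalone
sketch. It only models the three members visible in the hunk above; the
typedefs are stand-ins I made up so it compiles outside the PostgreSQL tree,
and the struct names LayoutAsPatched and LayoutReordered are hypothetical.
The real struct has other members before phs_startblock, so the actual
padding may differ. The point is just that keeping the two BlockNumber
fields adjacent groups the related members and never makes the struct
larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL types; assumptions for this sketch only. */
typedef uint32_t BlockNumber;
typedef struct
{
	volatile uint64_t value;
} pg_atomic_uint64;

/* Member order as in the quoted hunk (other members omitted). */
typedef struct LayoutAsPatched
{
	BlockNumber phs_startblock;		/* starting block number */
	pg_atomic_uint64 phs_nallocated;	/* blocks allocated to workers so far */
	BlockNumber phs_numblock;		/* max number of blocks to scan */
} LayoutAsPatched;

/* Suggested order: phs_numblock directly after phs_startblock. */
typedef struct LayoutReordered
{
	BlockNumber phs_startblock;		/* starting block number */
	BlockNumber phs_numblock;		/* max number of blocks to scan */
	pg_atomic_uint64 phs_nallocated;	/* blocks allocated to workers so far */
} LayoutReordered;

/* Compile-time check: the reordered layout is never larger. */
_Static_assert(sizeof(LayoutReordered) <= sizeof(LayoutAsPatched),
			   "reordering must not grow the struct");

/* Bytes of padding saved by the reordering on the current platform. */
size_t
bytes_saved_by_reordering(void)
{
	return sizeof(LayoutAsPatched) - sizeof(LayoutReordered);
}
```

On platforms where 64-bit integers are 8-byte aligned, the reordered sketch
avoids the padding that would otherwise follow each 4-byte member; elsewhere
the two layouts may come out the same size.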


--
Regards
Junwang Zhao
