Quick Links

Re: Parallel copy

From:	Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To:	vignesh C <vignesh21(at)gmail(dot)com>
Cc:	Greg Nancarrow <gregn4422(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject:	Re: Parallel copy
Date:	2020-09-16 13:05:56
Message-ID:	CAE9k0Pnw0ZvzBUXMp+8hhAa4j+=p2F=Z3Kymq6_B-tGnYiZsgQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Vignesh,

I've spent some time today looking at your new set of patches and I've
some thoughts and queries which I would like to put here:

Why are these not part of the shared cstate structure?

SerializeString(pcxt, PARALLEL_COPY_KEY_NULL_PRINT, cstate->null_print);
SerializeString(pcxt, PARALLEL_COPY_KEY_DELIM, cstate->delim);
SerializeString(pcxt, PARALLEL_COPY_KEY_QUOTE, cstate->quote);
SerializeString(pcxt, PARALLEL_COPY_KEY_ESCAPE, cstate->escape);

I think in the refactoring patch we could replace all the cstate
variables that would be shared between the leader and workers with a
common structure which would be used even for a serial copy. Thoughts?

Have you tested your patch when encoding conversion is needed? If so,
could you please point out the email that has the test results.

Apart from above, I've noticed some cosmetic errors which I am sharing here:

+#define IsParallelCopy() (cstate->is_parallel)
+#define IsLeader() (cstate->pcdata->is_leader)

This doesn't look to be properly aligned.

+ shared_info_ptr = (ParallelCopyShmInfo *)
shm_toc_allocate(pcxt->toc, sizeof(ParallelCopyShmInfo));
+ PopulateParallelCopyShmInfo(shared_info_ptr, full_transaction_id);

+ /* Store shared build state, for which we reserved space. */
+ shared_cstate = (SerializedParallelCopyState
*)shm_toc_allocate(pcxt->toc, est_cstateshared);

In the first case, while typecasting you've added a space between the
typename and the function but that is missing in the second case. I
think it would be good if you could make it consistent.

Same comment applies here as well:

+ pg_atomic_uint32 line_state; /* line state */
+ uint64 cur_lineno; /* line number for error messages */
+}ParallelCopyLineBoundary;

...

+ CommandId mycid; /* command id */
+ ParallelCopyLineBoundaries line_boundaries; /* line array */
+} ParallelCopyShmInfo;

There is no space between the closing brace and the structure name in
the first case but it is in the second one. So, again this doesn't
look consistent.

I could also find this type of inconsistency in comments. See below:

+/* It can hold upto 10000 record information for worker to process. RINGSIZE
+ * should be a multiple of WORKER_CHUNK_COUNT, as wrap around cases
is currently
+ * not handled while selecting the WORKER_CHUNK_COUNT by the worker. */
+#define RINGSIZE (10 * 1000)

...

+/*
+ * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
+ * block to process to avoid lock contention. Read RINGSIZE comments before
+ * changing this value.
+ */
+#define WORKER_CHUNK_COUNT 50

You may see these kinds of errors at other places as well if you scan
through your patch.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Wed, Aug 19, 2020 at 11:51 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Thanks Greg for reviewing the patch. Please find my thoughts for your comments.
>
> On Mon, Aug 17, 2020 at 9:44 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > Some further comments:
> >
> > (1) v3-0002-Framework-for-leader-worker-in-parallel-copy.patch
> >
> > +/*
> > + * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
> > + * block to process to avoid lock contention. This value should be divisible by
> > + * RINGSIZE, as wrap around cases is currently not handled while selecting the
> > + * WORKER_CHUNK_COUNT by the worker.
> > + */
> > +#define WORKER_CHUNK_COUNT 50
> >
> >
> > "This value should be divisible by RINGSIZE" is not a correct
> > statement (since obviously 50 is not divisible by 10000).
> > It should say something like "This value should evenly divide into
> > RINGSIZE", or "RINGSIZE should be a multiple of WORKER_CHUNK_COUNT".
> >
>
> Fixed. Changed it to RINGSIZE should be a multiple of WORKER_CHUNK_COUNT.
>
> > (2) v3-0003-Allow-copy-from-command-to-process-data-from-file.patch
> >
> > (i)
> >
> > + /*
> > + * If the data is present in current block
> > lineInfo. line_size
> > + * will be updated. If the data is spread
> > across the blocks either
> >
> > Somehow a space has been put between "lineinfo." and "line_size".
> > It should be: "If the data is present in current block
> > lineInfo.line_size will be updated"
>
> Fixed, changed it to lineinfo->line_size.
>
> >
> > (ii)
> >
> > >This is not possible because of pg_atomic_compare_exchange_u32, this
> > >will succeed only for one of the workers whose line_state is
> > >LINE_LEADER_POPULATED, for other workers it will fail. This is
> > >explained in detail above ParallelCopyLineBoundary.
> >
> > Yes, but prior to that call to pg_atomic_compare_exchange_u32(),
> > aren't you separately reading line_state and line_state, so that
> > between those reads, it may have transitioned from leader to another
> > worker, such that the read line state ("cur_line_state", being checked
> > in the if block) may not actually match what is now in the line_state
> > and/or the read line_size ("dataSize") doesn't actually correspond to
> > the read line state?
> >
> > (sorry, still not 100% convinced that the synchronization and checks
> > are safe in all cases)
> >
>
> I think that you are describing about the problem could happen in the
> following case:
> when we read curr_line_state, the value was LINE_WORKER_PROCESSED or
> LINE_WORKER_PROCESSING. Then in some cases if the leader is very fast
> compared to the workers then the leader quickly populates one line and
> sets the state to LINE_LEADER_POPULATED. State is changed to
> LINE_LEADER_POPULATED when we are checking the currr_line_state.
> I feel this will not be a problem because, Leader will populate & wait
> till some RING element is available to populate. In the meantime
> worker has seen that state is LINE_WORKER_PROCESSED or
> LINE_WORKER_PROCESSING(previous state that it read), worker has
> identified that this chunk was processed by some other worker, worker
> will move and try to get the next available chunk & insert those
> records. It will keep continuing till it gets the next chunk to
> process. Eventually one of the workers will get this chunk and process
> it.
>
> > (3) v3-0006-Parallel-Copy-For-Binary-Format-Files.patch
> >
> > >raw_buf is not used in parallel copy, instead raw_buf will be pointing
> > >to shared memory data blocks. This memory was allocated as part of
> > >BeginCopyFrom, uptil this point we cannot be 100% sure as copy can be
> > >performed sequentially like in case max_worker_processes is not
> > >available, if it switches to sequential mode raw_buf will be used
> > >while performing copy operation. At this place we can safely free this
> > >memory that was allocated
> >
> > So the following code (which checks raw_buf, which still points to
> > memory that has been pfreed) is still valid?
> >
> > In the SetRawBufForLoad() function, which is called by CopyReadLineText():
> >
> > cur_data_blk_ptr = (cstate->raw_buf) ?
> > &pcshared_info->data_blocks[cur_block_pos] : NULL;
> >
> > The above code looks a bit dicey to me. I stepped over that line in
> > the debugger when I debugged an instance of Parallel Copy, so it
> > definitely gets executed.
> > It makes me wonder what other code could possibly be checking raw_buf
> > and using it in some way, when in fact what it points to has been
> > pfreed.
> >
> > Are you able to add the following line of code, or will it (somehow)
> > break logic that you are relying on?
> >
> > pfree(cstate->raw_buf);
> > cstate->raw_buf = NULL; <=== I suggest that this line is added
> >
>
> You are right, I have debugged & verified it sets it to an invalid
> block which is not expected. There are chances this would have caused
> some corruption in some machines. The suggested fix is required, I
> have fixed it. I have moved this change to
> 0003-Allow-copy-from-command-to-process-data-from-file.patch as
> 0006-Parallel-Copy-For-Binary-Format-Files is only for Binary format
> parallel copy & that change is common change for parallel copy.
>
> I have attached new set of patches with the fixes.
> Thoughts?
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com

In response to

Re: Parallel copy at 2020-08-19 06:21:29 from vignesh C

Responses

Re: Parallel copy at 2020-09-22 09:14:21 from vignesh C

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2020-09-16 13:06:01	Re: Avoid useless retrieval of defaults and check constraints in pg_dump -a
Previous Message	Michael Paquier	2020-09-16 13:01:29	Re: Proposal of new PostgreSQL Extension - PGSpiderExt