Re: Parallel copy

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alastair Turner <minion(at)decodable(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-10-09 05:31:23
Message-ID: CAA4eK1+ZRyGEgwRzvq+CVyva8rWFVCEC2Q+8J-+Be-rSmPjpuA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 8, 2020 at 12:14 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Mon, Sep 28, 2020 at 12:19 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > + */
> > > > +typedef struct ParallelCopyLineBoundary
> > > >
> > > > Are we doing all this state management to avoid using locks while
> > > > processing lines? If so, I think we can use either spinlock or LWLock
> > > > to keep the main patch simple and then provide a later patch to make
> > > > it lock-less. This will allow us to first focus on the main design of
> > > > the patch rather than trying to make this datastructure processing
> > > > lock-less in the best possible way.
> > > >
> > >
> > > The steps will be more or less the same if we use a spinlock too. Steps 1, 3 & 4 will be common; we would have to use lock & unlock instead of steps 2 & 5. I feel we can retain the current implementation.
> > >
> >
> > I'll study this in detail and let you know my opinion, but in the
> > meantime I don't follow one part of this comment: "If they
> > don't follow this order the worker might process wrong line_size and
> > leader might populate the information which worker has not yet
> > processed or in the process of processing."
> >
> > Do you want to say that the leader might overwrite some information
> > which the worker hasn't read yet? If so, that is not clear from the
> > comment. Another minor point about this comment:
> >
>
> Here the leader and worker must follow these steps to avoid any corruption
> or hang issue. I have changed it to:
> * The leader & worker process access the shared line information by following
> * the below steps to avoid any data corruption or hang:
>

Actually, I wanted more along the lines of why such corruption or a hang
can happen. It might help reviewers to understand why you have followed
such a sequence.
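
To be concrete, the sort of justification I am looking for is the
publish/consume ordering below (an untested sketch; the struct layout,
the field names, and the LINE_NOT_READY sentinel are illustrative rather
than what your patch actually uses). The point to document is that
line_size doubles as the "published" flag, so the stores and loads
around it must be ordered:

#include "postgres.h"
#include "port/atomics.h"

#define LINE_NOT_READY PG_UINT32_MAX    /* assumed "not published" sentinel */

typedef struct ParallelCopyLineInfo
{
    uint32           first_block;   /* illustrative fields only */
    uint32           start_offset;
    pg_atomic_uint32 line_size;     /* written last by the leader */
} ParallelCopyLineInfo;

/*
 * Leader: fill in the line information first, then publish line_size.
 * If line_size were written before the other fields, a worker could see
 * a valid size but stale block/offset values -- the corruption case.
 */
static void
leader_publish_line(ParallelCopyLineInfo *line,
                    uint32 block, uint32 off, uint32 size)
{
    line->first_block = block;
    line->start_offset = off;
    pg_write_barrier();     /* order the stores above before the publish */
    pg_atomic_write_u32(&line->line_size, size);
}

/*
 * Worker: check line_size first and only then trust the other fields.
 * Without this ordering the worker might process a wrong line_size, or
 * a line the leader is still filling in.  Similarly, if the leader
 * reused a slot before the worker marked it consumed, both sides could
 * end up waiting on each other -- the hang case.
 */
static bool
worker_try_consume_line(ParallelCopyLineInfo *line,
                        uint32 *block, uint32 *off, uint32 *size)
{
    *size = pg_atomic_read_u32(&line->line_size);
    if (*size == LINE_NOT_READY)
        return false;       /* leader has not published this line yet */
    pg_read_barrier();      /* order the load above before the loads below */
    *block = line->first_block;
    *off = line->start_offset;
    return true;
}

With a spinlock, all of this would collapse to acquire/fill/release on
the leader side and acquire/read/release on the worker side, which is
why I suggested it might keep the first version simpler.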

> >
> > How did you ensure that this is fixed? Have you tested it? If so,
> > please share the test. I see a basic problem with your fix.
> >
> > + /* Report WAL/buffer usage during parallel execution */
> > + bufferusage = shm_toc_lookup(toc, PARALLEL_COPY_BUFFER_USAGE, false);
> > + walusage = shm_toc_lookup(toc, PARALLEL_COPY_WAL_USAGE, false);
> > + InstrEndParallelQuery(&bufferusage[ParallelWorkerNumber],
> > + &walusage[ParallelWorkerNumber]);
> >
> > You need to call InstrStartParallelQuery() before the actual operation
> > starts; without that, the stats won't be accurate. Also, after calling
> > WaitForParallelWorkersToFinish(), you need to accumulate the stats
> > collected from the workers, which you have neither done nor is it
> > possible with the current code in your patch, because you haven't made
> > any provision to capture them in BeginParallelCopy.
> >
> > I suggest you look into lazy_parallel_vacuum_indexes() and
> > begin_parallel_vacuum() to understand how the buffer/wal usage stats
> > are accumulated. Also, please test this functionality using
> > pg_stat_statements.
> >
>
> Made changes accordingly.
> I have verified it using:
> postgres=# select * from pg_stat_statements where query like '%copy%';
> -[ RECORD 1 ]-------+--------------------------------------------------------
> userid              | 10
> dbid                | 13743
> queryid             | -6947756673093447609
> query               | copy hw from '/home/vignesh/postgres/postgres/inst/bin/hw_175000.csv' with(format csv, delimiter ',')
> plans               | 0
> total_plan_time     | 0
> min_plan_time       | 0
> max_plan_time       | 0
> mean_plan_time      | 0
> stddev_plan_time    | 0
> calls               | 1
> total_exec_time     | 265.195105
> min_exec_time       | 265.195105
> max_exec_time       | 265.195105
> mean_exec_time      | 265.195105
> stddev_exec_time    | 0
> rows                | 175000
> shared_blks_hit     | 1916
> shared_blks_read    | 0
> shared_blks_dirtied | 946
> shared_blks_written | 946
> local_blks_hit      | 0
> local_blks_read     | 0
> local_blks_dirtied  | 0
> local_blks_written  | 0
> temp_blks_read      | 0
> temp_blks_written   | 0
> blk_read_time       | 0
> blk_write_time      | 0
> wal_records         | 1116
> wal_fpi             | 0
> wal_bytes           | 3587203
> -[ RECORD 2 ]-------+--------------------------------------------------------
> userid              | 10
> dbid                | 13743
> queryid             | 8570215596364326047
> query               | copy hw from '/home/vignesh/postgres/postgres/inst/bin/hw_175000.csv' with(format csv, delimiter ',', parallel '2')
> plans               | 0
> total_plan_time     | 0
> min_plan_time       | 0
> max_plan_time       | 0
> mean_plan_time      | 0
> stddev_plan_time    | 0
> calls               | 1
> total_exec_time     | 35668.402482
> min_exec_time       | 35668.402482
> max_exec_time       | 35668.402482
> mean_exec_time      | 35668.402482
> stddev_exec_time    | 0
> rows                | 175000
> shared_blks_hit     | 3101
> shared_blks_read    | 36
> shared_blks_dirtied | 952
> shared_blks_written | 919
> local_blks_hit      | 0
> local_blks_read     | 0
> local_blks_dirtied  | 0
> local_blks_written  | 0
> temp_blks_read      | 0
> temp_blks_written   | 0
> blk_read_time       | 0
> blk_write_time      | 0
> wal_records         | 1119
> wal_fpi              | 6
> wal_bytes           | 3624405
>

I am not able to properly parse the data, but if I understand correctly,
the WAL usage (wal_records | wal_fpi | wal_bytes) for the non-parallel
(1116 | 0 | 3587203) and parallel (1119 | 6 | 3624405) cases doesn't
seem to be the same. Is that right? If so, why? Please ensure that no
checkpoint happens in either case.
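
Also, for the buffer/WAL usage accumulation discussed above, the shape I
was pointing at in parallel vacuum is roughly the following sketch (the
PARALLEL_COPY_* keys are from your patch; pcxt, toc, and the two usage
arrays are assumed to have been set up in BeginParallelCopy with
nworkers-sized entries):

/* Worker side: bracket the actual copy work. */
InstrStartParallelQuery();

/* ... this worker's share of the copy happens here ... */

/* Report this worker's buffer/WAL usage into its DSM slot. */
bufferusage = shm_toc_lookup(toc, PARALLEL_COPY_BUFFER_USAGE, false);
walusage = shm_toc_lookup(toc, PARALLEL_COPY_WAL_USAGE, false);
InstrEndParallelQuery(&bufferusage[ParallelWorkerNumber],
                      &walusage[ParallelWorkerNumber]);

/* Leader side: fold the workers' stats into the leader's totals. */
WaitForParallelWorkersToFinish(pcxt);
for (i = 0; i < pcxt->nworkers_launched; i++)
    InstrAccumParallelQuery(&bufferusage[i], &walusage[i]);

Without the InstrStartParallelQuery() call, the end-of-query delta is
computed against whatever was in the counters before the worker started,
and without the accumulation loop the workers' numbers never reach
pg_stat_statements.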

--
With Regards,
Amit Kapila.
