From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alastair Turner <minion(at)decodable(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Parallel copy |
Date: | 2020-06-24 08:10:29 |
Message-ID: | CALj2ACW94icER3WrWapon7JkcX8j0TGRue5ycWMTEvgA3X7fOg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Thanks Vignesh for reviewing parallel copy for binary format files
patch. I tried to address the comments in the attached patch
(0006-Parallel-Copy-For-Binary-Format-Files.patch).
On Thu, Jun 18, 2020 at 6:42 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Mon, Jun 15, 2020 at 4:39 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > The above tests were run with the configuration attached config.txt, which is the same used for performance tests of csv/text files posted earlier in this mail chain.
> >
> > Request the community to take this patch up for review along with the parallel copy for csv/text file patches and provide feedback.
> >
>
> I had reviewed the patch, few comments:
>
> The new members added should be present in ParallelCopyData
Added to ParallelCopyData.
>
> line_size can be set as and when we process the tuple from
> CopyReadBinaryTupleLeader and this can be set at the end. That way the
> above code can be removed.
>
curr_tuple_start_info and curr_tuple_end_info variables are now local
variables to CopyReadBinaryTupleLeader and the line size calculation
code is moved to CopyReadBinaryAttributeLeader.
>
> curr_block_pos variable is present in ParallelCopyShmInfo, we could
> use it and remove from here.
> curr_data_offset, similar variable raw_buf_index is present in
> CopyStateData, we could use it and remove from here.
>
Yes, making use of them now.
>
> This code is duplicate in CopyReadBinaryTupleLeader &
> CopyReadBinaryAttributeLeader. We could make a function and re-use.
>
Added a new function AdjustFieldInfo.
>
> column_no is not used, it can be removed
>
Removed.
>
> The above code is present in NextCopyFrom & CopyReadBinaryTupleLeader,
> check if we can make a common function or we could use NextCopyFrom as
> it is.
>
Added a macro CHECK_FIELD_COUNT.
> + if (fld_count == -1)
> + {
> + return true;
> + }
>
> Should this be an assert in CopyReadBinaryTupleWorker function as this
> check is already done in the leader.
>
This check in leader signifies the end of the file. For the workers,
the eof is when GetLinePosition() returns -1.
line_pos = GetLinePosition(cstate);
if (line_pos == -1)
return true;
In case the if (fld_count == -1) is encountered in the worker, workers
should just return true from CopyReadBinaryTupleWorker marking eof.
Having this as an assert doesn't serve the purpose I feel.
Along with the review comments addressed
patch(0006-Parallel-Copy-For-Binary-Format-Files.patch) also attaching
all other latest series of patches(0001 to 0005) from [1], the order
of applying patches is from 0001 to 0006.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0006-Parallel-Copy-For-Binary-Format-Files.patch | application/octet-stream | 24.7 KB |
0001-Copy-code-readjustment-to-support-parallel-copy.patch | application/octet-stream | 16.8 KB |
0002-Framework-for-leader-worker-in-parallel-copy.patch | application/octet-stream | 33.1 KB |
0003-Allow-copy-from-command-to-process-data-from-file-ST.patch | application/octet-stream | 40.4 KB |
0004-Documentation-for-parallel-copy.patch | application/octet-stream | 2.0 KB |
0005-Tests-for-parallel-copy.patch | application/octet-stream | 20.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Vik Fearing | 2020-06-24 08:12:38 | Re: Allow CURRENT_ROLE in GRANTED BY |
Previous Message | Peter Eisentraut | 2020-06-24 08:00:00 | Re: Improve handling of parameter differences in physical replication |