From: | Alexey Kondratov <kondratov(dot)aleksey(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Anastasia Lubennikova <lubennikovaAV(at)gmail(dot)com> |
Subject: | Re: GSOC'17 project introduction: Parallel COPY execution with errors handling |
Date: | 2017-06-16 17:53:52 |
Message-ID: | 2F15DA8D-4FFF-4C2E-8110-F6FDB7DB9C09@gmail.com |
Lists: | pgsql-hackers |
> On 13 Jun 2017, at 01:44, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> I am not going to start with "speculative insertion" right now, but it would
>> be very useful, if you give me a point, where to start. Maybe I will at least
>> try to evaluate the complexity of the problem.
>
> Speculative insertion has the following special entry points to
> heapam.c and execIndexing.c, currently only called within
> nodeModifyTable.c
>
> Offhand, it doesn't seem like it would be that hard to teach another
> heap_insert() caller the same tricks.
I went through the nodeModifyTable.c code, and doing the same inside COPY
does not look too difficult.
> My sense is that it's going to be hard to sell a committer on any
> design that consumes subtransactions in a way that's not fairly
> obvious to the user, and doesn't have a pretty easily understood worse
> case.
Yes, and the worst case will probably be quite frequent, since heap_multi_insert cannot be used when BEFORE/INSTEAD OF triggers or partitioning are present (according to the current copy.c code). COPY will then frequently fall back to single heap_insert calls, and wrapping each of them in a subtransaction would consume XIDs too greedily and seriously hurt performance. I like my previous idea less and less.
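To put a rough number on that worst case, here is a back-of-the-envelope sketch. It assumes that every subtransaction which inserts at least one row is assigned its own XID, and that a batch of rows inserted together shares a single subtransaction. The function name and the `batch_size` parameter are made up for illustration and do not exist in copy.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of PostgreSQL.
 * Assumption: one XID per subtransaction that writes at least one row.
 * Returns how many subtransactions (and thus XIDs) a COPY of nrows rows
 * would need under that assumption.
 */
uint64_t
subxacts_needed(uint64_t nrows, uint64_t batch_size, bool can_multi_insert)
{
    assert(batch_size > 0);

    /* No batching possible: every row gets its own subtransaction. */
    if (!can_multi_insert)
        return nrows;

    /* Batching: one subtransaction per (possibly partial) batch. */
    return nrows / batch_size + (nrows % batch_size != 0 ? 1 : 0);
}
```

Under these assumptions, loading one million rows in batches of 1000 would need 1000 subtransactions, while the single-row fallback would need one million.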
> I haven't thought about this very carefully, but I guess you could do
> something like passing a flag to ExecConstraints() that indicates
> "don't throw an error; instead, just return false so I know not to
> proceed"
Currently ExecConstraints always throws an error, and I do not think it would be wise of me to modify its behaviour.
I have updated my patch (rebased onto the latest master commit 94da2a6a9a05776953524424a3d8079e54bc5d94). Please find the patch file attached, or the always up-to-date version on GitHub: https://github.com/ololobus/postgres/pull/1/files
Currently, it catches all major errors in the input data:
1) Rows with too few or too many columns cause WARNINGs and are skipped
2) I found that input type format errors are thrown from InputFunctionCall, so I wrapped it in PG_TRY/PG_CATCH. I am not 100%
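The two cases above can be illustrated with a standalone sketch. This is not the patch and does not use any PostgreSQL code: `raise_error`, `parse_int_or_raise`, `count_fields` and `load_row` are hypothetical names, and plain `setjmp`/`longjmp` stands in for PG_TRY/PG_CATCH (which, as far as I know, is itself built on sigsetjmp/siglongjmp). It also ignores everything the real mechanism has to deal with, such as error-state cleanup and memory contexts.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the handler stack behind PG_TRY/PG_CATCH. */
static jmp_buf *error_context = NULL;

/* Hypothetical stand-in for ereport(ERROR, ...): jump to the active handler. */
static void
raise_error(void)
{
    if (error_context == NULL)
        abort();                /* no handler installed */
    longjmp(*error_context, 1);
}

/* Hypothetical stand-in for a type input function: raises on malformed input. */
static long
parse_int_or_raise(const char *text)
{
    char *end;
    long  value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE)
        raise_error();
    return value;
}

/* Count the delimiter-separated fields in one input line. */
static int
count_fields(const char *line, char delim)
{
    int n = 1;

    for (; *line != '\0'; line++)
        if (*line == delim)
            n++;
    return n;
}

/*
 * Load one comma-separated line of ncols integers into out[].
 * Returns true if the row was accepted, false if it was skipped
 * with a warning.
 */
bool
load_row(const char *line, int ncols, long *out)
{
    char     buf[256];
    jmp_buf  local_context;
    jmp_buf *saved_context = error_context;

    assert(line != NULL && out != NULL && ncols > 0);

    /* Case 1: wrong number of columns, warn and skip. */
    if (count_fields(line, ',') != ncols || strlen(line) >= sizeof(buf))
    {
        fprintf(stderr, "WARNING: malformed row skipped: %s\n", line);
        return false;
    }
    strcpy(buf, line);

    /* Case 2: wrap per-field parsing in a try/catch equivalent. */
    if (setjmp(local_context) == 0)
    {
        char *field = buf;

        error_context = &local_context;
        for (int i = 0; i < ncols; i++)
        {
            char *comma = strchr(field, ',');

            if (comma != NULL)
                *comma = '\0';
            out[i] = parse_int_or_raise(field);
            if (comma != NULL)
                field = comma + 1;
        }
        error_context = saved_context;
        return true;
    }

    /* "Catch" branch: restore the outer handler, warn, skip the row. */
    error_context = saved_context;
    fprintf(stderr, "WARNING: invalid input syntax, row skipped: %s\n", line);
    return false;
}
```

A caller would invoke `load_row` once per input line and simply continue with the next line when it returns false.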
Alexey