From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | pgsql-committers(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [COMMITTERS] pgsql: Fix decoding of MULTI_INSERTs when rows other than the last are |
Date: | 2014-07-06 16:50:52 |
Message-ID: | 20140706165052.GA28604@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On 2014-07-06 14:01:11 +0000, Andres Freund wrote:
> Fix decoding of MULTI_INSERTs when rows other than the last are toasted.
>
> When decoding the results of a HEAP2_MULTI_INSERT (currently only
> generated by COPY FROM) toast columns for all but the last tuple
> weren't replaced by their actual contents before being handed to the
> output plugin. The reassembled toast datums where disregarded after
> every REORDER_BUFFER_CHANGE_(INSERT|UPDATE|DELETE) which is correct
> for plain inserts, updates, deletes, but not multi inserts - there we
> generate several REORDER_BUFFER_CHANGE_INSERTs for a single
> xl_heap_multi_insert record.
>
> To solve the problem add a clear_toast_afterwards boolean to
> ReorderBufferChange's union member that's used by modifications. All
> row changes but multi_inserts always set that to true, but
> multi_insert sets it only for the last change generated.
>
> Add a regression test covering decoding of multi_inserts - there was
> none at all before.
>
> Backpatch to 9.4 where logical decoding was introduced.
>
> Bug found by Petr Jelinek.
Further testing unfortuantely shows that this isn't sufficient:
Commit 1b86c81d2d fixed the decoding of toasted columns for the rows
contained in one xl_heap_multi_insert record. But that's not actually
enough because heap_multi_insert() will actually first toast all
passed in rows and then emit several *_multi_insert records; one for
each page it fills with tuples.
I've attached a preliminary patch that:
Add a XLOG_HEAP_LAST_MULTI_INSERT flag which is set in
xl_heap_multi_insert->flag denoting that this multi_insert record is
the last emitted by one heap_multi_insert() call. Then use that flag
in decode.c to only set clear_toast_afterwards in the right situation.
But since I am not exactly happy about that solution I'm wondering if
somebody can think of a better approach. Alternatives I though of
included:
* Add a boolean to xl_heap_multi_insert instead of adding the
XLOG_HEAP_LAST_MULTI_INSERT flag. I don't have an opinion either way
about that one.
* Only toast rows in heap_multi_insert for every page. That doesn't
really work without major reworks though as the loop over all the
tuples is done in a critical section.
* Don't clean up toast chunks for multi insert at all. Instead do so after
the next !multi insert record. I think that's out because of the
amount of multi_insert calls COPY can produce.
The patch also adds 200 lines of COPY FROM STDIN in the regression tests
to create a long COPY that does a multi insert with toast covering two
pages. Does anybody object to that?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-decoding-of-consecutive-MULTI_INSERTs-emitted-by.patch | text/x-patch | 29.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2014-07-06 18:57:09 | pgsql: Remove swpb-based spinlock implementation for ARMv5 and earlier. |
Previous Message | Andres Freund | 2014-07-06 14:01:13 | pgsql: Fix decoding of MULTI_INSERTs when rows other than the last are |
From | Date | Subject | |
---|---|---|---|
Next Message | Sawada Masahiko | 2014-07-06 17:18:38 | Re: add line number as prompt option to psql |
Previous Message | Tomas Vondra | 2014-07-06 16:44:35 | Re: tweaking NTUP_PER_BUCKET |