Re: Performance degradation on concurrent COPY into a single relation in PG16.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date: 2023-10-13 18:30:35
Message-ID: 20231013183035.wukjzu5bpnmshynm@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-10-13 10:39:10 -0700, Andres Freund wrote:
> On 2023-10-12 09:24:19 -0700, Andres Freund wrote:
> > I kind of had hoped somebody would comment on the approach. Given that nobody
> > has, I'll push the minimal fix of resetting the state in
> > ReleaseBulkInsertStatePin(), even though I think architecturally that's not
> > great.
>
> I spent some time working on a test that shows the problem more cheaply than
> the case upthread. I think it'd be desirable to have a test that's likely to
> catch an issue like this fairly quickly. We've had other problems in this
> realm before - there's only a single test that fails if I remove the
> ReleaseBulkInsertStatePin() call, and I don't think that's guaranteed at all.
>
> I'm a bit on the fence on how large to make the relation. For me the bug
> triggers when filling both relations up to the 3rd block, but how many rows
> that takes is somewhat dependent on space utilization on the page and stuff.
>
> Right now the test uses data/desc.data and ends up with 328kB and 312kB in two
> partitions. Alternatively I could make the test create a new file to load with
> copy that has fewer rows than data/desc.data - I didn't see another data file
> that works conveniently and has fewer rows. The copy is reasonably fast, even
> under something as expensive as rr (~60ms). So I'm inclined to just go with
> that?

Patch with fix and test attached (0001).

Given how easily a missing ReleaseBulkInsertStatePin() call can cause corruption
(not due to this bug, but the issue fixed in b1ecb9b3fcf), I think we should
consider adding an assertion along the lines of 0002 to HEAD. Perhaps we could
add a new bufmgr.c function to avoid having to fetch the fields of the buffer
tag we don't care about. Or perhaps we should just promote the check to an
elog; we already call BufferGetBlockNumber(), and using BufferGetTag() instead
doesn't cost much more.

Greetings,

Andres Freund

Attachment Content-Type Size
v1-0001-Fix-bulk-table-extension-when-copying-into-multip.patch text/x-diff 7.0 KB
v1-0002-Assert-buffer-in-ReadBufferBI-is-for-correct-rela.patch text/x-diff 1.1 KB
