From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance degradation on concurrent COPY into a single relation in PG16.
Date: 2023-09-25 21:37:46
Message-ID: 20230925213746.fwqauhhifjgefyzk@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-09-25 12:48:30 -0700, Andres Freund wrote:
> On 2023-09-25 15:42:26 -0400, Tom Lane wrote:
> > I just did a git bisect run to discover when the failure documented
> > in bug #18130 [1] started. And the answer is commit 82a4edabd.
> > Now, it's pretty obvious that that commit didn't in itself cause
> > problems like this:
> >
> > postgres=# \copy test from 'bug18130.csv' csv
> > ERROR: could not read block 5 in file "base/5/17005": read only 0 of 8192 bytes
> > CONTEXT: COPY test, line 472: "0,185647715,222655,489637,2,2023-07-31,9100.0000000,302110385,2023-07-30 14:16:36.750981+00,14026347..."
>
> Ugh.
>
>
> > IMO there must be some very nasty bug lurking in the new
> > multiple-block extension logic, that happens to be exposed by this
> > test case with 82a4edabd's adjustments to the when-to-extend choices
> > but wasn't before that.
>
> > To save other people the trouble of extracting the in-line data
> > in the bug submission, I've attached the test files I was using.
>
> Thanks, looking at this now.

(had to switch locations in between)

Uh, huh. The problem is that COPY uses a single BulkInsertState for multiple
partitions. Which to me seems to run counter to the following comment:
* The caller can also provide a BulkInsertState object to optimize many
* insertions into the same relation. This keeps a pin on the current
* insertion target page (to save pin/unpin cycles) and also passes a
* BULKWRITE buffer selection strategy object to the buffer manager.
* Passing NULL for bistate selects the default behavior.

The reason this doesn't cause straight up corruption due to reusing a pin from
another relation is that b1ecb9b3fcfb added ReleaseBulkInsertStatePin() and a
call to it. But I didn't make ReleaseBulkInsertStatePin() reset the bulk
insertion state, which is what leads to the errors from the bug report.
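To make the failure mode concrete, here is a minimal, self-contained sketch of it. The names (BulkInsertState, next_free, last_free, ReleaseBulkInsertStatePin) follow PG16's BulkInsertStateData, but the struct is stripped down and Buffer is simplified to an int; this is illustrative, not the server code:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Stripped-down model of PG16's BulkInsertStateData */
typedef struct BulkInsertStateData
{
    int         current_buf;    /* pinned buffer, -1 if none */
    BlockNumber next_free;      /* cached range of bulk-extended, still-empty
                                 * blocks in the *current* relation */
    BlockNumber last_free;
} BulkInsertStateData;

/* Buggy variant: drops the pin but keeps the cached free-block range. */
static void
ReleaseBulkInsertStatePin_buggy(BulkInsertStateData *bistate)
{
    bistate->current_buf = -1;
    /* next_free/last_free still point into the previous relation: the next
     * insertion into a different partition will try to read e.g. block 5 of
     * a relation that is only a few blocks long -> "could not read block 5" */
}

/* Fixed variant: also forget state tied to the previous relation. */
static void
ReleaseBulkInsertStatePin_fixed(BulkInsertStateData *bistate)
{
    bistate->current_buf = -1;
    bistate->next_free = InvalidBlockNumber;
    bistate->last_free = InvalidBlockNumber;
}
```

With the buggy variant, switching partitions leaves next_free/last_free referring to blocks of the previously targeted partition, which matches the "read only 0 of 8192 bytes" symptom from the bug report.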

Resetting the relevant BulkInsertState fields fixes the problem. But I'm not
sure that's the right fix. ISTM that independent of whether we fix this via
ReleaseBulkInsertStatePin() resetting the fields or via not reusing
BulkInsertState, we should add assertions defending against future issues like
this (e.g. by adding a relation field to BulkInsertState in cassert builds,
and asserting that the relation is the same as in prior calls unless
ReleaseBulkInsertStatePin() has been called).
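The assertion idea could look roughly like the sketch below. The relid field and the BulkInsertStateCheckRelation() helper are hypothetical names for illustration (the email only proposes the idea); USE_ASSERT_CHECKING is defined here to stand in for a --enable-cassert build:

```c
#define USE_ASSERT_CHECKING 1   /* pretend this is a cassert build */

#include <assert.h>
#include <stdint.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

typedef struct BulkInsertStateData
{
    int  current_buf;           /* pinned buffer, -1 if none */
#ifdef USE_ASSERT_CHECKING
    Oid  relid;                 /* relation this state was last used for */
#endif
} BulkInsertStateData;

/* Hypothetical check, to be called wherever a bistate is used for an
 * insertion: the relation must match the prior one unless the pin (and,
 * with the fix, the rest of the state) was released in between. */
static void
BulkInsertStateCheckRelation(BulkInsertStateData *bistate, Oid relid)
{
#ifdef USE_ASSERT_CHECKING
    assert(bistate->relid == InvalidOid || bistate->relid == relid);
    bistate->relid = relid;
#endif
}

static void
ReleaseBulkInsertStatePin(BulkInsertStateData *bistate)
{
    bistate->current_buf = -1;
#ifdef USE_ASSERT_CHECKING
    bistate->relid = InvalidOid;    /* a new relation is now legitimate */
#endif
}
```

Under this scheme, reusing one bistate across partitions without an intervening ReleaseBulkInsertStatePin() trips the assertion in cassert builds instead of silently misbehaving.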

Greetings,

Andres Freund
