From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical decoding / rewrite map vs. maxAllocatedDescs
Date: 2018-08-10 21:59:44
Message-ID: 470adb65-5101-4659-d213-41bde1eef8f2@2ndquadrant.com
Lists: pgsql-hackers
On 08/10/2018 11:13 PM, Andres Freund wrote:
> On 2018-08-10 22:57:57 +0200, Tomas Vondra wrote:
>>
>>
>> On 08/09/2018 07:47 PM, Alvaro Herrera wrote:
>>> On 2018-Aug-09, Tomas Vondra wrote:
>>>
>>>> I suppose there are reasons why it's done this way, and admittedly the test
>>>> that happens to trigger this is a bit extreme (essentially running pgbench
>>>> concurrently with 'vacuum full pg_class' in a loop). I'm not sure it's
>>>> extreme enough to deem it not an issue, because people using many temporary
>>>> tables often deal with bloat by doing frequent vacuum full on catalogs.
>>>
>>> Actually, it seems to me that ApplyLogicalMappingFile is just leaking
>>> the file descriptor for no good reason. There's a different
>>> OpenTransientFile call in ReorderBufferRestoreChanges that is not
>>> intended to be closed immediately, but the other one seems a plain bug,
>>> easy enough to fix.
>>>
>>
>> Indeed. Adding a CloseTransientFile to ApplyLogicalMappingFile solves
>> the issue with hitting maxAllocatedDescs. Barring objections I'll commit
>> this shortly.
>
> Yea, that's clearly a bug. I've not seen a patch, so I can't quite
> formally sign off, but it seems fairly obvious.
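For the record, the fix is essentially a one-liner - closing the fd once
ApplyLogicalMappingFile has consumed the whole file. A rough sketch of
the code path (simplified; the patch just adds the last line):

    fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
    if (fd < 0)
        ereport(ERROR, ...);

    while (true)
    {
        /* read one rewrite mapping record, until EOF */
        readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData));
        if (readBytes == 0)
            break;
        /* ... apply the mapping ... */
    }

    CloseTransientFile(fd);     /* <- this call was missing */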
>
>
>> But while running the tests on this machine, I repeatedly got pgbench
>> failures like this:
>>
>> client 2 aborted in command 0 of script 0; ERROR: could not read block
>> 3 in file "base/16384/24573": read only 0 of 8192 bytes
>>
>> That kinda reminds me of the issues we're observing on some buildfarm
>> machines; I wonder if it's the same thing.
>
> Oooh, that's interesting! What's the precise recipe that gets you there?
>
I don't have an exact reproducer - it's kinda rare and unpredictable,
and I'm not sure how much it depends on the environment etc. But I'm
doing this:
1) one cluster with publication (wal_level=logical)

2) one cluster with subscription to (1)

3) simple table, replicated from (1) to (2)

   -- publisher
   create table t (a serial primary key, b int, c int);
   create publication p for table t;

   -- subscriber
   create table t (a serial primary key, b int, c int);
   create subscription s CONNECTION '...' publication p;

4) pgbench inserting rows into the replicated table (scripts sketched
   below)

   pgbench -n -c 4 -T 300 -p 5433 -f insert.sql test

5) pgbench doing vacuum full on pg_class

   pgbench -n -f vacuum.sql -T 300 -p 5433 test
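The attached scripts are trivial, along these lines (the insert values
are just illustrative):

   -- vacuum.sql
   vacuum full pg_class;

   -- insert.sql
   insert into t (b, c) values (1, 2);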
And once in a while I see failures like this:
client 0 aborted in command 0 of script 0; ERROR: could not read
block 3 in file "base/16384/86242": read only 0 of 8192 bytes
client 3 aborted in command 0 of script 0; ERROR: could not read
block 3 in file "base/16384/86242": read only 0 of 8192 bytes
client 2 aborted in command 0 of script 0; ERROR: could not read
block 3 in file "base/16384/86242": read only 0 of 8192 bytes
or this:
client 2 aborted in command 0 of script 0; ERROR: could not read
block 3 in file "base/16384/89369": read only 0 of 8192 bytes
client 1 aborted in command 0 of script 0; ERROR: could not read
block 3 in file "base/16384/89369": read only 0 of 8192 bytes
I suspect there's some other ingredient, e.g. some manipulation of the
subscription. Or maybe it's not needed at all and I'm just imagining things.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size
---|---|---
vacuum.sql | application/sql | 21 bytes
insert.sql | application/sql | 34 bytes