From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Peter Geoghegan <pg(at)bowt(dot)ie>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum |
Date: | 2021-11-07 18:00:00 |
Message-ID: | 08c2445c-c3c9-ba45-18d3-6399707d8306@gmail.com |
Lists: | pgsql-bugs |
31.10.2021 22:20, Dmitry Dolgov wrote:
>>
>> I suspect this is the same bug as #17245. Could you check if it's fixed by
>> https://www.postgresql.org/message-id/CAH2-WzkN5aESSLfK7-yrYgsXxYUi__VzG4XpZFwXm98LUtoWuQ%40mail.gmail.com
>>
>> The crash is somewhere in pg_class, which is also manually VACUUMed by the
>> test, which could trigger the issue we found in the other thread. The likely
>> reason the loop in the repro is needed is that that'll push one of the indexes
>> on pg_class over the 512kb/min_parallel_index_scan_size boundary to start
>> using parallel vacuum.
> I've applied both patches from Peter, the fix itself and
> index-points-to-LP_UNUSED-item assertions. Now it doesn't crash on
> pg_unreachable, but hits those extra assertions in the second patch:
Yes, the committed fix for bug #17245 doesn't help here.
I've also noticed that a server crash is not the only possible
outcome. You can also get unexpected errors like:
ERROR: relation "errtst_parent" already exists
ERROR: relation "tmp_idx1" already exists
ERROR: relation "errtst_child_plaindef" already exists
or
ERROR: could not open relation with OID 1033921
STATEMENT: DROP TABLE errtst_parent;
in the server.log (and no crash).
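Because these runs end without a crash, they are easy to miss. A minimal sketch of how such outcomes might be picked out of server.log is shown below. The patterns are taken from the messages quoted above; the helper name `find_suspicious_errors` is hypothetical and not part of the reproducing script, and a match is only a hint, since "already exists" errors can also be legitimate in other workloads.

```python
import re

# Patterns derived from the error messages quoted in this message.
# The helper name is hypothetical; it is not part of the repro script.
SUSPICIOUS = [
    re.compile(r'ERROR:\s+relation "[^"]+" already exists'),
    re.compile(r'ERROR:\s+could not open relation with OID \d+'),
]


def find_suspicious_errors(log_lines):
    """Return the log lines that match one of the unexpected-error patterns."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if any(p.search(line) for p in SUSPICIOUS)
    ]


if __name__ == "__main__":
    # Sample lines modelled on the messages above, not a real log excerpt.
    sample = [
        'ERROR:  relation "tmp_idx1" already exists\n',
        'STATEMENT:  DROP TABLE errtst_parent;\n',
        'ERROR:  could not open relation with OID 1033921\n',
        'LOG:  checkpoint starting: time\n',
    ]
    for hit in find_suspicious_errors(sample):
        print(hit)
```

To check a real run, the same function could be fed the lines of the log file, for example `find_suspicious_errors(open("server.log"))`, assuming the log is in the current directory.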
These strange errors and the crash inside index_delete_sort_cmp() can be
seen starting from commit dc7420c2.
On the previous commit (b8443eae) the reproducing script completes
without a crash or errors (triple-checked).
Bug #17257 probably has the same root cause, but the patch [1]
applied to REL_14_STABLE (b0f6bd48) doesn't prevent the crash.
Initially I thought that the infinite loop in vacuum was a problem in
its own right, so I decided to treat it separately, but maybe the two
bugs are too closely related to be discussed apart.
Best regards,
Alexander