From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Костя Кузнецов <chapaev28(at)ya(dot)ru> |
Subject: | Re: GiST VACUUM |
Date: | 2018-07-13 14:10:40 |
Message-ID: | 50d0105a-bd99-788e-37ee-51f59d81fbec@iki.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 13/07/18 16:41, Andrey Borodin wrote:
>> 12 июля 2018 г., в 21:07, Andrey Borodin <x4mmm(at)yandex-team(dot)ru
>> <mailto:x4mmm(at)yandex-team(dot)ru>> написал(а):
>> 12 июля 2018 г., в 20:40, Heikki Linnakangas <hlinnaka(at)iki(dot)fi
>> <mailto:hlinnaka(at)iki(dot)fi>> написал(а):
>>> Actually, now that I think about it more, I'm not happy with leaving orphaned
>>> pages like that behind. Let's WAL-log the removal of the downlink, and
>>> marking the leaf pages as deleted, in one WAL record, to avoid that.
>>
>> OK, will do this. But this will complicate WAL replay seriously, and I do not
>> know a proper way to test that (BTW there is GiST amcheck in progress, but I
>> decided to leave it for a while).
> Done. Now WAL record for deleted page also removes downlink from internal page.
> I had to use IndexPageTupleDelete() instead of IndexPageTupleMultiDelete(), but
> I do not think it will have any impact on performance.
Yeah, I think that's fine, this isn't that performance critical
>>> But the situation in gistdoinsert(), where you encounter a deleted leaf page,
>>> could happen during normal operation, if vacuum runs concurrently with an
>>> insert. Insertion locks only one page at a time, as it descends the tree, so
>>> after it has released the lock on the parent, but before it has locked the
>>> child, vacuum might have deleted the page. In the latest patch, you're
>>> checking for that just before swapping the shared lock for an exclusive one,
>>> but I think that's wrong; you need to check for that after swapping the lock,
>>> because otherwise vacuum might delete the page while you're not holding the lock.
>> Looks like a valid concern, I'll move that code again.
> Done.
Ok, the comment now says:
> + /*
> + * Leaf pages can be left deleted but still referenced in case of
> + * crash during VACUUM's gistbulkdelete()
> + */
But that's not accurate, right? You should never see deleted pages after
a crash, because the parent is updated in the same WAL record as the
child page, right?
I'm still a bit scared about using pd_prune_xid to store the XID that
prevents recycling the page too early. Can we use some field in
GISTPageOpaqueData for that, similar to how the B-tree stores it in
BTPageOpaqueData?
- Heikki
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2018-07-13 14:25:47 | Re: GiST VACUUM |
Previous Message | Ashutosh Bapat | 2018-07-13 14:05:05 | Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled. |