Quick Links

Re: POC: Cleaning up orphaned files using undo logs

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: POC: Cleaning up orphaned files using undo logs
Date:	2019-07-12 13:37:46
Message-ID:	CA+TgmoZfPzo7-Wf0Gi+ZxjTan2Mvy5ip4sPJK6rueH4axW5o0g@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Jul 12, 2019 at 5:40 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I am not sure but don't we need to retain the color of z as well?

I believe that would be very wrong. If you recolor an internal node,
you'll break the constant-black-height invariant.

> Apart from this, the duplicate key (ex. for size queues the size of
> two requests can be same) handling might need some work. Basically,
> either special combine function needs to be written (not sure yet what
> we should do there) or we always need to ensure that the key is unique
> like (size + start_urec_ptr). If the size is the same, then we can
> decide based on start_urec_ptr.

I think that this problem is somewhat independent of whether we use an
rbtree or a binaryheap or some other data structure. I would be
inclined to use XID as a tiebreak for the size queue, so that it's
more likely to match the ordering of the XID queue, but if that's
inconvenient, then some other arbitrary value like start_urec_ptr
should be fine.

> I think we can go by changing the implementation to rbtree by doing
> some enhancements instead of the binary heap or alternatively, we can
> use one of the two ideas suggested by you in the email above [1] to
> simplify the code and keep using the binary heap for now. Especially,
> I like the below one.
> "2. However, I don't think we should have a separate request object
> for each queue anyway. We should insert pointers to the same objects
> in all the relevant queue (either size + XID, or else error). So
> instead of having three sets of objects, one for each queue, we'd just
> have one set of objects and point to them with as many as two
> pointers.
> We'd therefore need LESS memory than we're using today, because we
> wouldn't have separate arrays for XID, size, and error queue
> elements."
>
> I think even if we currently go with a binary heap, it will be
> possible to change it to rbtree later, but I am fine either way.

Well, I don't see much point in revising all of this logic twice. We
should pick the way we want it to work and make it work that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: POC: Cleaning up orphaned files using undo logs at 2019-07-12 09:39:51 from Amit Kapila

Responses

Re: POC: Cleaning up orphaned files using undo logs at 2019-07-13 10:25:51 from Amit Kapila

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tomas Vondra	2019-07-12 13:47:02	Re: rebased background worker reimplementation prototype
Previous Message	Fabien COELHO	2019-07-12 13:10:26	Re: proposal - patch: psql - sort_by_size