From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: Cleaning up orphaned files using undo logs |
Date: | 2019-07-12 13:37:46 |
Message-ID: | CA+TgmoZfPzo7-Wf0Gi+ZxjTan2Mvy5ip4sPJK6rueH4axW5o0g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jul 12, 2019 at 5:40 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> I am not sure but don't we need to retain the color of z as well?
I believe that would be very wrong. If you recolor an internal node,
you'll break the constant-black-height invariant.
> Apart from this, the duplicate key (ex. for size queues the size of
> two requests can be same) handling might need some work. Basically,
> either special combine function needs to be written (not sure yet what
> we should do there) or we always need to ensure that the key is unique
> like (size + start_urec_ptr). If the size is the same, then we can
> decide based on start_urec_ptr.
I think that this problem is somewhat independent of whether we use an
rbtree or a binaryheap or some other data structure. I would be
inclined to use XID as a tiebreak for the size queue, so that it's
more likely to match the ordering of the XID queue, but if that's
inconvenient, then some other arbitrary value like start_urec_ptr
should be fine.
> I think we can go by changing the implementation to rbtree by doing
> some enhancements instead of the binary heap or alternatively, we can
> use one of the two ideas suggested by you in the email above [1] to
> simplify the code and keep using the binary heap for now. Especially,
> I like the below one.
> "2. However, I don't think we should have a separate request object
> for each queue anyway. We should insert pointers to the same objects
> in all the relevant queue (either size + XID, or else error). So
> instead of having three sets of objects, one for each queue, we'd just
> have one set of objects and point to them with as many as two
> pointers.
> We'd therefore need LESS memory than we're using today, because we
> wouldn't have separate arrays for XID, size, and error queue
> elements."
>
> I think even if we currently go with a binary heap, it will be
> possible to change it to rbtree later, but I am fine either way.
Well, I don't see much point in revising all of this logic twice. We
should pick the way we want it to work and make it work that way.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-07-12 13:47:02 | Re: rebased background worker reimplementation prototype |
Previous Message | Fabien COELHO | 2019-07-12 13:10:26 | Re: proposal - patch: psql - sort_by_size |