Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-30 08:24:42
Message-ID: CAFiTN-sCsaWtiAHzXAbQS0x6YkVLTEGj15jHEJSc6qu4S7jEyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 30, 2019 at 12:21 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Tue, Jul 30, 2019 at 5:03 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > I think this idea is good for the DO time but during REDO time it will
> > not work as we will not have the transaction state. Having said that
> > the current idea of keeping in the global variable will also not work
> > during REDO time because the WAL from the different transaction can be
> > interleaved. There are few ideas to handle this issue
> >
> > 1. At DO time keep in TopTransactionState as you suggested and during
> > recovery time read from the first complete record on the page.
> > 2. Just to keep the code uniform always read from the first complete
> > record of the page.
> >
It contains another
> suggestion for that problem you just mentioned (and also me pointing
> out what you just pointed out, since I wrote it earlier) though I'm
> not sure if it's better than your options above.

Thanks, Thomas for your review, Currently, I am replying to the
problem which both of us has identified and found a different set of
solutions. I will go through other comments soon and work on those.

> +UndoRecPtr
> +PrepareUndoInsert(UndoRecordInsertContext *context,
> + UnpackedUndoRecord *urec,
> + Oid dbid)
> +{
> ...
> + /* Fetch compression info for the transaction. */
> + compression_info = GetTopTransactionUndoCompressionInfo(category);
>
> How can this work correctly in recovery? [Edit: it doesn't, as you
> just pointed out]
>
>
> One data structure that could perhaps hold this would be
> UndoLogTableEntry (the per-backend cache, indexed by undo log number,
> with pretty fast lookups; used for things like
> UndoLogNumberGetCategory()). As long as you never want to have
> inter-transaction compression, that should have the right scope to
> give recovery per-undo log tracking. If you ever wanted to do
> compression between transactions too, maybe UndoLogSlot could work,
> but that'd have more complications.

I think this could be a good idea. I had thought of keeping in the
slot as my 3rd option but later I removed it thinking that we need to
expose the compression field to the undo log layer. I think keeping
in the UndoLogTableEntry is a better idea than keeping in the slot.
But, I still have the same problem that we need to expose undo
record-level fields to undo log layer to compute the cache entry size.
OTOH, If we decide to get from the first record of the page (as I
mentioned up thread) then I don't think there is any performance issue
because we are inserting on the same page. But, for doing that we
need to unpack the complete undo record (guaranteed to be on one
page). And, UnpackUndoData will internally unpack the payload data
as well which is not required in our case unless we change
UnpackUndoData such that it unpacks only what the caller wants (one
input parameter will do).

I am not sure out of these two which idea is better?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeevan Ladhe 2019-07-30 08:52:30 Re: concerns around pg_lsn
Previous Message Michael Paquier 2019-07-30 08:04:34 Re: Possible race condition in pg_mkdir_p()?