Re: generalized conveyor belt storage

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: generalized conveyor belt storage
Date: 2021-12-15 15:03:05
Message-ID: CAEze2WjLqm=J1qmumefrbGqAL_nPg2XQTUa_Q8R1e4PM8XPRWw@mail.gmail.com
Lists: pgsql-hackers

On Wed, 15 Dec 2021 at 00:01, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Hi!
[...]
> So here's a patch. Basically, it lets you initialize a relation fork
> as a "conveyor belt," and then you can add pages of basically
> arbitrary data to the conveyor belt and then throw away old ones and,
> modulo bugs, it will take care of recycling space for you. There's a
> fairly detailed README in the patch if you want a more detailed
> description of how the whole thing works.

I was reading through this README when I hit the following:

> +Conceptually, a relation fork organized as a conveyor belt has three parts:
> +
> +- Payload. The payload is whatever data the user of this module wishes
> + to store. The conveyor belt doesn't care what you store in a payload page,
> + but it does require that you store something: each time a payload page is
> + initialized, it must end up with either pd_lower > SizeOfPageHeaderData,
> + or pd_lower < BLCKSZ.

Since SizeOfPageHeaderData < BLCKSZ, isn't this condition always true?
Read as an inclusive "or", it accepts any value of pd_lower; read as
an exclusive "either ... or", it reduces to the (much clearer)
'pd_lower <= SizeOfPageHeaderData or pd_lower >= BLCKSZ'.
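
To make the point concrete, here is a small standalone sketch that
checks both readings for every possible pd_lower. It is not code from
the patch: the two constants are assumed values for a default build
(24-byte page header, 8 kB blocks), pd_lower is treated as a 16-bit
offset, and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for a default build; hypothetical stand-ins, not patch code. */
#define SIZE_OF_PAGE_HEADER_DATA 24
#define BLCKSZ_ASSUMED 8192

/* The premise of the argument: the header is smaller than a block. */
static_assert(SIZE_OF_PAGE_HEADER_DATA < BLCKSZ_ASSUMED,
              "page header must be smaller than a block");

/* README condition read as an inclusive "or". */
bool
readme_inclusive(uint16_t pd_lower)
{
    return pd_lower > SIZE_OF_PAGE_HEADER_DATA || pd_lower < BLCKSZ_ASSUMED;
}

/* README condition read as an exclusive "either ... or". */
bool
readme_exclusive(uint16_t pd_lower)
{
    return (pd_lower > SIZE_OF_PAGE_HEADER_DATA) != (pd_lower < BLCKSZ_ASSUMED);
}

/* Count the pd_lower values that the inclusive reading rejects. */
unsigned
count_rejected_inclusive(void)
{
    unsigned rejected = 0;

    for (uint32_t v = 0; v <= UINT16_MAX; v++)
        if (!readme_inclusive((uint16_t) v))
            rejected++;
    return rejected;
}

/*
 * Count the pd_lower values where the exclusive reading disagrees with
 * 'pd_lower <= SizeOfPageHeaderData or pd_lower >= BLCKSZ'.
 */
unsigned
count_exclusive_mismatches(void)
{
    unsigned mismatches = 0;

    for (uint32_t v = 0; v <= UINT16_MAX; v++)
    {
        bool simplified = v <= SIZE_OF_PAGE_HEADER_DATA || v >= BLCKSZ_ASSUMED;

        if (readme_exclusive((uint16_t) v) != simplified)
            mismatches++;
    }
    return mismatches;
}
```

Under these assumptions both counters come out as zero: the inclusive
reading rejects nothing, and the exclusive reading matches the
simplified form for every value.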

> It's missing some features
> that I want it to have: for example, I'd like to have on-line
> compaction, where whatever logical page numbers of data currently
> exist can be relocated to lower physical page numbers thus allowing
> you to return space to the operating system, hopefully without
> requiring a strong heavyweight lock. But that's not implemented yet,
> and it's also missing a few other things, like test cases, performance
> results, more thorough debugging, better write-ahead logging
> integration, and some code to use it to do something useful. But
> there's enough here, I think, for you to form an opinion about whether
> you think this is a reasonable direction, and give any design-level
> feedback that you'd like to give. My colleagues Dilip Kumar and Mark
> Dilger have contributed to this effort with some testing help, but all
> the code in this patch is mine.

You mentioned that this is meant to be used as a "relation fork", but
I couldn't find new code in relpath.h (where ForkNumber etc. are
defined) that allows one more fork per relation. Is that also on the
missing-features list, or did I misunderstand what you meant by
"relation fork"?
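
For illustration, a change of roughly the following shape is what one
might expect to see if the conveyor belt were given its own fork. This
is a simplified standalone copy of the ForkNumber definitions, not the
real header; CONVEYOR_FORKNUM, the "cb" suffix, and fork_count() are
hypothetical names that do not appear in the patch or in PostgreSQL.

```c
#include <assert.h>

/* Simplified local copy of the fork enum, plus one hypothetical entry. */
typedef enum ForkNumber
{
    InvalidForkNumber = -1,
    MAIN_FORKNUM = 0,
    FSM_FORKNUM,
    VISIBILITYMAP_FORKNUM,
    INIT_FORKNUM,
    CONVEYOR_FORKNUM            /* hypothetical new fork, not in the patch */
} ForkNumber;

/* The highest fork number has to move along with the new entry. */
#define MAX_FORKNUM CONVEYOR_FORKNUM

/* File name suffix per fork; "cb" is an invented suffix for the sketch. */
const char *const forkNames[] = {
    "main",                     /* MAIN_FORKNUM */
    "fsm",                      /* FSM_FORKNUM */
    "vm",                       /* VISIBILITYMAP_FORKNUM */
    "init",                     /* INIT_FORKNUM */
    "cb"                        /* CONVEYOR_FORKNUM (hypothetical) */
};

/* The name table must have exactly one entry per fork. */
static_assert(sizeof(forkNames) / sizeof(forkNames[0]) == MAX_FORKNUM + 1,
              "forkNames length must match MAX_FORKNUM + 1");

/* Number of forks a relation can have in this sketch. */
int
fork_count(void)
{
    return MAX_FORKNUM + 1;
}
```

Nothing of this kind appears in the posted patch, which is what
prompts the question above.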

Kind regards,

Matthias van de Meent
