Re: generalized conveyor belt storage

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: generalized conveyor belt storage
Date: 2021-12-15 05:47:51
Message-ID: CAFiTN-sM-psjA5imbczqkVP2+uhmqwi1_POPFCbockqg1R62SA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 15, 2021 at 6:33 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>

> How did you test this? I ask because it would be nice if there was a
> convenient way to try this out, as somebody with a general interest.
> Even just a minimal test module, that you used for development work.
>

I have tested this using a simple extension over the conveyor belt
APIs, this extension provides wrapper apis over the conveyor belt
APIs. These are basic APIs which can be extended even further for
more detailed testing, e.g. as of now I have provided an api to read
complete page from the conveyor belt but that can be easily extended
to read from a particular offset and also the amount of data to read.

Parallelly I am also testing this by integrating it with the vacuum,
which is still a completely WIP patch and needs a lot of design level
improvement so not sharing it, so once it is in better shape I will
post that in the separate thread.

Basically, we have decoupled different vacuum phases (only for manual
vacuum ) something like below,
VACUUM (first_pass) t;
VACUUM idx;
VACUUM (second_pass) t;

So in the first pass we are just doing the first pass of vacuum and
wherever we are calling lazy_vacuum() we are storing those dead tids
in the conveyor belt. In the index pass, user can vacuum independent
index and therein it will just fetch the last conveyor belt point upto
which it has already vacuum, then from there load dead tids which can
fit in maintenance_work_mem and then call the index bulk delete (this
will be done in loop until we complete the index vacuum). In the
second pass, we check all the indexes and find the minimum conveyor
belt point upto which all indexes have vacuumed. We also fetch the
last point where we left the second pass of the heap. Now we fetch
the dead tids from the conveyor belt (which fits in
maintenance_work_mem) from the last vacuum point of heap upto the min
index vacuum point. And perform the second heap pass.

I have given the highlights of the decoupling work just to show what
sort of testing we are doing for the conveyor belt. But we can
discuss this on a separate thread when I am ready to post that patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0001-Conveyor-belt-testing-extention.patch text/x-patch 17.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-12-15 06:24:57 Re: row filtering for logical replication
Previous Message Masahiko Sawada 2021-12-15 05:09:16 Re: Failed transaction statistics to measure the logical replication progress