Using read_stream in index vacuum

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Using read_stream in index vacuum
Date: 2024-10-19 08:41:50
Message-ID: DBD427E0-7E57-41D3-AEE1-7DFFA3CAB4EE@yandex-team.ru
Lists: pgsql-hackers

Hi hackers!

At a recent hacking workshop [0], Thomas mentioned that patches using the new API would be welcome.
So I prototyped streamlining of B-tree vacuum for discussion.
When cleaning an index we must visit every index tuple, so we uphold a special invariant:
after checking the trailing block, it must still be the last block according to a subsequent RelationGetNumberOfBlocks(rel) call.

This invariant does not allow us to completely replace the block loop with streamlining. That's why streamlining is done only for the number of blocks returned by the first RelationGetNumberOfBlocks(rel) call. The tail is processed with regular ReadBufferExtended().
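To make the two-phase scan concrete, here is a minimal toy model in plain C. It is not the patch and uses no PostgreSQL APIs: ToyRel, toy_nblocks() and toy_vacuum_scan() are hypothetical names, and concurrent relation extension is simulated by letting each size check observe one new block until a budget runs out.

```c
#include <assert.h>

typedef unsigned int BlockNumber;

/* Hypothetical stand-in for an index that may grow while vacuum runs. */
typedef struct ToyRel
{
    BlockNumber nblocks;     /* current size of the relation */
    BlockNumber grow_budget; /* blocks that concurrent inserts will still add */
    BlockNumber streamed;    /* blocks visited in the streamed phase */
    BlockNumber tail;        /* blocks visited in the tail phase */
} ToyRel;

/* Toy analogue of RelationGetNumberOfBlocks(): each call may see one new block. */
static BlockNumber
toy_nblocks(ToyRel *rel)
{
    if (rel->grow_budget > 0)
    {
        rel->grow_budget--;
        rel->nblocks++;
    }
    return rel->nblocks;
}

static void
toy_vacuum_scan(ToyRel *rel)
{
    BlockNumber scanblkno = 0;
    BlockNumber num_pages = toy_nblocks(rel);

    /* Phase 1: the block range known up front; this is what would be streamed. */
    for (; scanblkno < num_pages; scanblkno++)
        rel->streamed++;

    /* Phase 2: the tail; re-check the size until the last visited block is last. */
    for (;;)
    {
        num_pages = toy_nblocks(rel);
        if (scanblkno >= num_pages)
            break;
        for (; scanblkno < num_pages; scanblkno++)
            rel->tail++;        /* would be a regular, non-streamed read */
    }

    /* The invariant: nothing was added after the last block we checked. */
    assert(scanblkno == rel->nblocks);
}
```

Starting from ToyRel rel = {5, 3, 0, 0}, toy_vacuum_scan(&rel) streams 6 blocks and reads 2 more in the tail loop, ending with 8 blocks in total.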

Also, it's worth mentioning that we have to jump back to blocks on the left from recently split pages. We also do this with regular ReadBufferExtended(). That's why btvacuumpage() now accepts a buffer, not a block number.
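A toy sketch of why a buffer-accepting function is convenient here: the caller decides how the page was read, and the backward jump uses a plain read. Everything below is hypothetical (ToyPage, toy_read_plain(), toy_vacuum_page(), toy_scan_all()). The jump condition, a page split during this vacuum whose right sibling has a lower block number than the scan position, is my assumption about the usual nbtree rule, not something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int BlockNumber;

#define TOY_NBLOCKS 6
#define TOY_NONE    0xFFFFFFFFu     /* "no right sibling" marker */

/* Hypothetical page: stands in for a pinned buffer. */
typedef struct ToyPage
{
    BlockNumber right_sibling;      /* TOY_NONE if rightmost */
    bool        split_during_vacuum;
    int         visits;             /* how many times vacuum processed it */
} ToyPage;

static ToyPage toy_pages[TOY_NBLOCKS];

/* Toy analogue of a regular, non-streamed read by block number. */
static ToyPage *
toy_read_plain(BlockNumber blkno)
{
    return &toy_pages[blkno];
}

/* Takes an already-read page, so the caller chooses streamed or plain reads. */
static void
toy_vacuum_page(ToyPage *page, BlockNumber scanblkno)
{
    for (;;)
    {
        page->visits++;

        /* Jump only if tuples may have moved to a block the scan already passed. */
        if (!page->split_during_vacuum ||
            page->right_sibling == TOY_NONE ||
            page->right_sibling >= scanblkno)
            break;

        page = toy_read_plain(page->right_sibling);
    }
}

static void
toy_scan_all(void)
{
    BlockNumber blkno;

    for (blkno = 0; blkno < TOY_NBLOCKS; blkno++)
    {
        toy_pages[blkno].right_sibling =
            (blkno + 1 < TOY_NBLOCKS) ? blkno + 1 : TOY_NONE;
        toy_pages[blkno].split_during_vacuum = false;
        toy_pages[blkno].visits = 0;
    }

    /* Pretend block 4 split during this vacuum and its new right half is block 1. */
    toy_pages[4].split_during_vacuum = true;
    toy_pages[4].right_sibling = 1;

    /* In the patch the main-loop pages would come from the stream instead. */
    for (blkno = 0; blkno < TOY_NBLOCKS; blkno++)
        toy_vacuum_page(toy_read_plain(blkno), blkno);
}
```

After toy_scan_all(), block 1 has been visited twice (once by the main loop, once by the jump from block 4) and every other block once.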

I've benchmarked the patch on my laptop (MacBook Air M3) with the following workload:
1. Initialization
create unlogged table x as select random() r from generate_series(1,1e7);
create index on x(r);
create index on x(r);
create index on x(r);
create index on x(r);
create index on x(r);
create index on x(r);
create index on x(r);
vacuum;
2. pgbench with 1 client
insert into x select random() from generate_series(0,10) x;
vacuum x;

On my laptop I see a ~3% increase in pgbench TPS (from ~101 to ~104), but the statistical noise is very significant, larger than the performance change. Perhaps a less noisy benchmark can be devised.

What do you think? If this approach seems worthwhile, I can adapt the same technique to other AMs.

Best regards, Andrey Borodin.

[0] https://rhaas.blogspot.com/2024/08/postgresql-hacking-workshop-september.html

Attachment Content-Type Size
0001-Prototype-B-tree-vacuum-streamlineing.patch application/octet-stream 4.6 KB
