From: | Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
---|---|
To: | Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Fix parallel vacuum buffer usage reporting |
Date: | 2024-05-10 15:55:27 |
Message-ID: | CAN55FZ24cArjPhm-53T-SAF=wiBM82oxp=fCjRJUynFKd74sGg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
On Fri, 10 May 2024 at 16:55, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Fri, 10 May 2024 at 16:21, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Fri, 10 May 2024 at 14:49, Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru> wrote:
> > >
> > > Hi! I could try to check it with the test, but I want to ask you about
> > > details, because I'm not sure that I completely understand the test case.
> > >
> > > You mean that we need to have two backends and on one of them we deleted
> > > the tuples before vacuum called the other, do you?
> > >
There should be one or more other backends that try to read the same
buffers as the ongoing VACUUM operation. I think it works now, but the
reproduction steps are a bit racy:
1- Build Postgres with the attached diff; it is the same
see_previous_output.diff that I shared two mails ago.
2- Run Postgres with all settings at their defaults.
3- Use two client backends; let's call them A and B client backends.
4- On A client backend, run:
CREATE TABLE vacuum_fix (aid int, bid int) WITH (autovacuum_enabled=false);
INSERT INTO vacuum_fix SELECT *, * FROM generate_series(1, 20000000);
VACUUM vacuum_fix;
UPDATE vacuum_fix SET aid = aid + 1, bid = bid + 1;
5- Now it gets a bit racy: the SQL commands below need to be run at the
same time. The aim is for VACUUM on A client backend and SELECT on B
client backend to read the same buffers at the same time, so that some
of the buffers are double counted.
First, run VACUUM on A client backend; immediately after starting
VACUUM, run SELECT on B client backend.
A client backend:
VACUUM VERBOSE vacuum_fix;
B client backend:
SELECT * FROM vacuum_fix WHERE aid = -1;
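If the timing in step 5 is hard to hit by hand, the two commands can be started from a small script. This is only a minimal sketch, not part of the attached diff: it assumes psql is on PATH, that the default connection settings work, and that the database is named test (the name that appears in my VACUUM VERBOSE output). The helper names and the 0.1 second delay are hypothetical and may need tuning.

```python
import subprocess
import time

VACUUM_SQL = "VACUUM VERBOSE vacuum_fix;"
SELECT_SQL = "SELECT * FROM vacuum_fix WHERE aid = -1;"


def psql_command(dbname, sql):
    """Build a psql invocation that runs one SQL command and exits."""
    # -X skips ~/.psqlrc so local settings do not affect the run.
    return ["psql", "-X", "-d", dbname, "-c", sql]


def run_race(dbname="test", delay_s=0.1):
    """Start VACUUM (A client backend), then SELECT (B client backend).

    Returns the messages psql wrote to stderr for the VACUUM session,
    which is where the INFO lines of VACUUM VERBOSE end up.
    """
    vacuum = subprocess.Popen(
        psql_command(dbname, VACUUM_SQL),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    # Assumed delay: give VACUUM a head start before the SELECT begins.
    time.sleep(delay_s)
    subprocess.run(
        psql_command(dbname, SELECT_SQL),
        check=True,
        stdout=subprocess.DEVNULL,
    )
    _, vacuum_messages = vacuum.communicate()
    return vacuum_messages
```

Calling `print(run_race())` after step 4 should print the VACUUM VERBOSE messages; the race is still not guaranteed, so several attempts may be needed.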
This is the VACUUM VERBOSE output on my end:
INFO: vacuuming "test.public.vacuum_fix"
INFO: finished vacuuming "test.public.vacuum_fix": index scans: 0
pages: 0 removed, 176992 remain, 176992 scanned (100.00% of total)
...
...
buffer usage: 254181 hits, 99030 misses in the previous version, 99865
misses in the patched version, 106830 dirtied
...
VACUUM
Time: 2578.217 ms (00:02.578)
VACUUM does not run in parallel here, so this test case does not trigger
what is fixed in this thread. As can be seen, the two miss counts differ
by 835 buffers (99030 vs. 99865).
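The gap between the two miss counts can be read straight off the "buffer usage" line. The following is a small sketch that assumes the line format printed by the debugging diff, exactly as it appears in the output above (joined onto one line); the function name is hypothetical.

```python
import re

# The "buffer usage" line from the output above, joined onto one line.
LINE = (
    "buffer usage: 254181 hits, 99030 misses in the previous version, "
    "99865 misses in the patched version, 106830 dirtied"
)

PATTERN = re.compile(
    r"(\d+) misses in the previous version, "
    r"(\d+) misses in the patched version"
)


def miss_counts(line):
    """Return (previous, patched) miss counts from a buffer usage line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a buffer usage line from the debugging build")
    return int(match.group(1)), int(match.group(2))


previous, patched = miss_counts(LINE)
print(patched - previous)  # 835
```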
I am not sure if there is an easier way to reproduce this but I hope this helps.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment | Content-Type | Size |
---|---|---|
see_previous_output.diff | text/x-patch | 1.5 KB |