Re: Vacuum statistics

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Andrei Zubkov <a(dot)zubkov(at)postgrespro(dot)ru>
Subject: Re: Vacuum statistics
Date: 2024-06-08 06:30:47
Message-ID: 9706bb4d-8966-49d8-a836-b11107160327@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi! Thank you for your interest in this topic!

On 07.06.2024 09:46, Dilip Kumar wrote:
> On Thu, May 30, 2024 at 11:57 PM Alena Rybakina
> <lena(dot)ribackina(at)yandex(dot)ru> wrote:
>> On 30.05.2024 10:33, Alena Rybakina wrote:
>>> I suggest gathering information about vacuum resource consumption for
>>> processing indexes and tables and storing it in the table and index
>>> relationships (for example, PgStat_StatTabEntry structure like it has
>>> realized for usual statistics). It will allow us to determine how well
>>> the vacuum is configured and evaluate the effect of overhead on the
>>> system at the strategic level, the vacuum has gathered this
>>> information already, but this valuable information doesn't store it.
>>>
>> My colleagues and I have prepared a patch that can help to solve this
>> problem.
>>
>> We are open to feedback.
> I was reading through the patch here are some initial comments.
>
> --
> +typedef struct LVExtStatCounters
> +{
> + TimestampTz time;
> + PGRUsage ru;
> + WalUsage walusage;
> + BufferUsage bufusage;
> + int64 VacuumPageMiss;
> + int64 VacuumPageHit;
> + int64 VacuumPageDirty;
> + double VacuumDelayTime;
> + PgStat_Counter blocks_fetched;
> + PgStat_Counter blocks_hit;
> +} LVExtStatCounters;
>
>
> I noticed that you are storing both pgBufferUsage and
> VacuumPage(Hit/Miss/Dirty) stats. Aren't these essentially the same?
> It seems they both exist in the system because some code, like
> heap_vacuum_rel(), uses pgBufferUsage, while do_analyze_rel() still
> relies on the old counters. And there is already a patch to remove
> those old counters.
I agree with you and I have fixed it.
>
> --
> +static Datum
> +pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int ncolumns)
> +{
>
> I don't think you need this last parameter (ncolumns) we can anyway
> fetch that from tupledesc, so adding an additional parameter
> just for checking doesn't look good to me.
To be honest,I'm notsureifncolumns shouldbe deletedat
all,becausethepg_stats_vacuum
functionisusedtodisplaythreedifferenttypesof
statistics:fortables,indexes,
anddatabases.Weusethisparametertopassinformationaboutthe numberof
parameters(orhowmany statisticsweexpect)dependingonthe typeof
statistics.For example,table
vacuumstatisticscontain27parameters,whileindexesanddatabasescontain19and15parameters,
respectively.Youcanseethatthe pg_stats_vacuum functioncontainsan
Assertthatchecksthatthe expectednumberof tupledesc
parametersmatchestheactualnumber.

Assert(tupdesc->natts == ncolumns);

PerhapsIcanconvertitto alocalparameteranddetermineitsvaluealreadyinthe
function,for example:

pg_stats_vacuum(FunctionCallInfo fcinfo, ExtVacReportType type, int
ncolumns)
{

int columns = 0;

switch (type)

{

case PGSTAT_EXTVAC_HEAP:

ncolumns = EXTVACHEAPSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_INDEX:

ncolumns = EXTVACINDEXSTAT_COLUMNS;

break;

case PGSTAT_EXTVAC_DB:

ncolumns = EXTVACDBSTAT_COLUMNS;

break;

}

...

}

What do you think?

> --
> + /* Tricky turn here: enforce pgstat to think that our database us dbid */
> +
> + MyDatabaseId = dbid;
>
> typo
> /think that our database us dbid/think that our database has dbid
>
> Also, remove the blank line between the comment and the next code
> block that is related to that comment.
>
>
> --
> VacuumPageDirty = 0;
> + VacuumDelayTime = 0.;
>
> There is an extra "." after 0
>
>
Thank you, I fixed it.

In additionto thesechanges,Ifixedthe
problemwithdisplayingvacuumstatisticsfordatabases:Ifoundan
errorindefiningthe pg_stats_vacuum_database systemview.In
addition,Irewrotethe testsandconvertedthemintoa regressiontest.In
addition,Ihave dividedthe testtotestthe functionalityof the outputof
vacuumstatisticsintotwotests:oneofthemchecksthe functionalityof
tablesanddatabases,andthe other-indexes.Thisis causedby aproblemwiththe
vacuumfunctionalitywhenthe tablecontainsan
index.Youcanfindmoreinformationaboutthishere:[0]and[1].

I attached the diff to this letter.

[0]
https://www.postgresql.org/message-id/d1ca3a1d-7ead-41a7-bfd0-5b66ad97b1cd%40yandex.ru

[1]
https://www.postgresql.org/message-id/CAH2-Wznv94Q_Td8OS8bAN7fYLpfU6CGgjn6Xau5eJ_sDxEGeBA%40mail.gmail.com

Iam currentlyworkingondividingthispatchintothreepartstosimplifythe
reviewprocess:oneofthemwillcontaincodeforcollectingvacuumstatisticsontables,the
secondonindexesandthe lastondatabases.I alsowritethe documentation.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v2-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 103.3 KB
vacuum_file.diff.no-cfbot text/plain 65.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-06-08 10:22:32 Re: Conflict Detection and Resolution
Previous Message Andy Fan 2024-06-08 06:05:51 New function normal_rand_array function to contrib/tablefunc.