Re: New GUC autovacuum_max_threshold ?

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-11-09 14:08:51
Message-ID: CAGjGUALGwH9LJUxe+ZbjnW_Cg4B4qyFrXFv+oWYCYnLcwggBJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Nathan Bossart
> I apologize for the curt response, but I don't understand how we could
> decide which of these three complicated formulas to use, let alone how
we
> could expect users to reason about the behavior.
Sorry ,I forgot to explain the reason in my last email,In fact, I
submitted the patch to the community,(frederic(dot)yhuel(at)dalibo(dot)com) told me
there has a same idea ,so ,
Let me explain those two formulas here,about ( vacthresh = (float4)
fmin(vac_base_thresh + (vac_scale_factor * reltuples), sqrt(1000.0 *
reltuples)); A few days ago, I was looking at the sql server
documentation and found that sql server has optimized the algorithm related
to updating statistics in the 2016 ,version,I think we can also learn from
the implementation method of sql server to optimize the problem of
automatic vacuum triggered by large tables,The Document link(
https://learn.microsoft.com/en-us/sql/relational-databases/statistics/statistics?view=sql-server-ver16
),about ( vacthresh = (float4) fmin(vac_base_thresh + vac_scale_factor *
reltuples,vac_base_thresh+ vac_scale_factor * log2(reltuples) * 10000);)I
came to the conclusion by trying to draw a function graph,I personally
think it is a smooth formula

####
SQL SERVER
[image: image.png]
log2
[image: image.png]
log

[image: image.png]

original

[image: image.png]

Thanks

Nathan Bossart <nathandbossart(at)gmail(dot)com> 于2024年11月9日周六 01:44写道:

> On Wed, Nov 06, 2024 at 08:51:07PM +0800, wenhui qiu wrote:
> >> Thank you. FWIW, I would prefer a sub-linear growth, so maybe something
> >> like this
> >
> >> vacthresh = Min(vac_base_thresh + vac_scale_factor * reltuples,
> >> vac_base_thresh + vac_scale_factor * pow(reltuples, 0.7) * 100);
> >
> >> This would give :
> >
> >> * 386M (instead of 5.1 billion currently) for a 25.6 billion tuples
> > table ;
> >> * 77M for a 2.56 billion tuples table (Robert's example) ;
> >> * 15M (instead of 51M currently) for a 256M tuples table ;
> >> * 3M (instead of 5M currently) for a 25.6M tuples table.
> >> The other advantage is that you don't need another GUC.
> > Argee ,We just need to change the calculation formula,But I prefer this
> > formula because it calculates a smoother value.
> >
> > vacthresh = (float4) fmin(vac_base_thresh + vac_scale_factor *
> > reltuples,vac_base_thresh
> > + vac_scale_factor * log2(reltuples) * 10000);
> > or
> > vacthresh = (float4) fmin(vac_base_thresh + (vac_scale_factor *
> reltuples)
> > , sqrt(1000.0 * reltuples));
>
> I apologize for the curt response, but I don't understand how we could
> decide which of these three complicated formulas to use, let alone how we
> could expect users to reason about the behavior.
>
> --
> nathan
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2024-11-09 14:13:47 Re: New "single" COPY format
Previous Message Kirill Reshke 2024-11-09 12:55:04 Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row