From: | wenhui qiu <qiuwenhuifx(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: New GUC autovacuum_max_threshold ? |
Date: | 2024-11-09 14:08:51 |
Message-ID: | CAGjGUALGwH9LJUxe+ZbjnW_Cg4B4qyFrXFv+oWYCYnLcwggBJQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Nathan Bossart
> I apologize for the curt response, but I don't understand how we could
> decide which of these three complicated formulas to use, let alone how
we
> could expect users to reason about the behavior.
Sorry ,I forgot to explain the reason in my last email,In fact, I
submitted the patch to the community,(frederic(dot)yhuel(at)dalibo(dot)com) told me
there has a same idea ,so ,
Let me explain those two formulas here,about ( vacthresh = (float4)
fmin(vac_base_thresh + (vac_scale_factor * reltuples), sqrt(1000.0 *
reltuples)); A few days ago, I was looking at the sql server
documentation and found that sql server has optimized the algorithm related
to updating statistics in the 2016 ,version,I think we can also learn from
the implementation method of sql server to optimize the problem of
automatic vacuum triggered by large tables,The Document link(
https://learn.microsoft.com/en-us/sql/relational-databases/statistics/statistics?view=sql-server-ver16
),about ( vacthresh = (float4) fmin(vac_base_thresh + vac_scale_factor *
reltuples,vac_base_thresh+ vac_scale_factor * log2(reltuples) * 10000);)I
came to the conclusion by trying to draw a function graph,I personally
think it is a smooth formula
####
SQL SERVER
[image: image.png]
log2
[image: image.png]
log
[image: image.png]
original
[image: image.png]
Thanks
Nathan Bossart <nathandbossart(at)gmail(dot)com> 于2024年11月9日周六 01:44写道:
> On Wed, Nov 06, 2024 at 08:51:07PM +0800, wenhui qiu wrote:
> >> Thank you. FWIW, I would prefer a sub-linear growth, so maybe something
> >> like this
> >
> >> vacthresh = Min(vac_base_thresh + vac_scale_factor * reltuples,
> >> vac_base_thresh + vac_scale_factor * pow(reltuples, 0.7) * 100);
> >
> >> This would give :
> >
> >> * 386M (instead of 5.1 billion currently) for a 25.6 billion tuples
> > table ;
> >> * 77M for a 2.56 billion tuples table (Robert's example) ;
> >> * 15M (instead of 51M currently) for a 256M tuples table ;
> >> * 3M (instead of 5M currently) for a 25.6M tuples table.
> >> The other advantage is that you don't need another GUC.
> > Argee ,We just need to change the calculation formula,But I prefer this
> > formula because it calculates a smoother value.
> >
> > vacthresh = (float4) fmin(vac_base_thresh + vac_scale_factor *
> > reltuples,vac_base_thresh
> > + vac_scale_factor * log2(reltuples) * 10000);
> > or
> > vacthresh = (float4) fmin(vac_base_thresh + (vac_scale_factor *
> reltuples)
> > , sqrt(1000.0 * reltuples));
>
> I apologize for the curt response, but I don't understand how we could
> decide which of these three complicated formulas to use, let alone how we
> could expect users to reason about the behavior.
>
> --
> nathan
>
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2024-11-09 14:13:47 | Re: New "single" COPY format |
Previous Message | Kirill Reshke | 2024-11-09 12:55:04 | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |