From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: VACUUM memory management
Date: 2020-04-05 20:26:25
Message-ID: 20200405202625.GC2228@telsasoft.com
Lists: pgsql-hackers
On Wed, Dec 11, 2019 at 09:29:17PM +0500, Ibrar Ahmed wrote:
> > Did you modify Claudio's patch or write a totally new one?
>
> I wrote a completely new patch. I tried multiple techniques, like using a
> list instead of a fixed-size array, which I thought was most suitable here,
> but dropped that because of a conflict with Parallel Vacuum.
Using a list would hardly work, or certainly not well, since the structure
needs to be searched by the ambulkdelete callback.
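To illustrate the point about the callback: PostgreSQL's lazy_tid_reaped() answers "is this TID dead?" for every index tuple visited during the bulk-delete scan, which is only cheap if the dead-TID store supports binary search. A minimal standalone sketch (TidStub and tid_reaped are hypothetical stand-ins, not the actual PostgreSQL types or code):

```c
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>

/* Simplified stand-in for PostgreSQL's ItemPointerData (block, offset). */
typedef struct { uint32_t block; uint16_t offset; } TidStub;

/* Compare TIDs by block number, then by offset within the block. */
static int tid_cmp(const void *a, const void *b)
{
    const TidStub *l = a, *r = b;
    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/* The callback must answer "is this TID dead?" in O(log n);
 * a sorted array supports that via bsearch, a linked list does not. */
static bool tid_reaped(const TidStub *dead, size_t ndead, TidStub tid)
{
    return bsearch(&tid, dead, ndead, sizeof(TidStub), tid_cmp) != NULL;
}
```

This lookup runs once per index tuple, potentially billions of times per vacuum, which is why the array representation matters.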
> >> If you wrote a totally new one, have you compared your work with
> >> Claudio's, to see if he covered
> >> anything you might need to cover?
> >
> I checked the patch, and it does not do anything special that my patch is
> not doing, except one thing: it also claims to raise the 1GB limit, which I
> have not touched. In my case we stay under the maintenance_work_mem limit
> but allocate the memory in chunks. That gives you the leverage to set a
> large maintenance_work_mem (even if you don't need it) because it will not
> allocate all the memory at the start.
After spending a bunch of time comparing them, I disagree. Claudio's patch
does these:
- avoids using multiple chunks when the table has no indexes, since in that
case there is no expensive index scan to avoid;
- rather than doing an index scan for each chunk (bad), the callback function
lazy_tid_reaped() does a custom binary search *over* chunks of different
sizes and then *within* each chunk. That's maybe slightly over-engineered,
and I'm not convinced it's needed (though I thought it was pretty clever),
but someone thought it was important;
- properly keep track of *total* number of dead tuples, eg for progress
reporting, and for prev_dead_count for pages with no dead tuples;
- lazy_record_dead_tuple() doubles allocation when running out of space for
dead tuples; some people disagree with that (myself included) but I'm
including it here since that's what it does. This still seems nontrivial
(to me) to adapt to work with parallel query.
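The chunked approach described above can be sketched in standalone C. This is my illustration of the technique, not Claudio's actual code: a chunk directory that doubles when full (the growth policy mentioned for lazy_record_dead_tuple), a running total of dead tuples for progress reporting, and a two-level lookup that binary-searches the directory by each chunk's first TID and then searches within the selected chunk. All names (DeadTidStore, TidStub, etc.) are hypothetical:

```c
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>

/* Simplified stand-in for PostgreSQL's ItemPointerData. */
typedef struct { uint32_t block; uint16_t offset; } TidStub;

typedef struct {
    TidStub *tids;      /* sorted within the chunk */
    size_t   ntids;
} TidChunk;

typedef struct {
    TidChunk *chunks;   /* chunks cover ascending, non-overlapping TID ranges */
    size_t    nchunks;
    size_t    maxchunks;
    size_t    total_dead;   /* running total, e.g. for progress reporting */
} DeadTidStore;

static int tid_cmp(const TidStub *l, const TidStub *r)
{
    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/* Append a chunk, doubling the directory when full, so memory grows
 * incrementally instead of being allocated up front. */
static void store_add_chunk(DeadTidStore *s, TidStub *tids, size_t n)
{
    if (s->nchunks == s->maxchunks)
    {
        s->maxchunks = s->maxchunks ? s->maxchunks * 2 : 4;
        s->chunks = realloc(s->chunks, s->maxchunks * sizeof(TidChunk));
    }
    s->chunks[s->nchunks++] = (TidChunk){ tids, n };
    s->total_dead += n;
}

/* Two-level lookup: binary-search the directory for the last chunk whose
 * first TID is <= the probe, then binary-search within that chunk. */
static bool store_lookup(const DeadTidStore *s, TidStub tid)
{
    size_t lo = 0, hi = s->nchunks;

    while (lo < hi)
    {
        size_t mid = (lo + hi) / 2;
        if (tid_cmp(&s->chunks[mid].tids[0], &tid) <= 0)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return false;       /* probe precedes the first chunk */

    const TidChunk *c = &s->chunks[lo - 1];
    size_t l = 0, h = c->ntids;

    while (l < h)
    {
        size_t m = (l + h) / 2;
        int cmp = tid_cmp(&c->tids[m], &tid);
        if (cmp == 0)
            return true;
        if (cmp < 0)
            l = m + 1;
        else
            h = m;
    }
    return false;
}
```

The key property is that each lookup costs O(log nchunks + log chunk_size) regardless of how unevenly the chunks are sized, which is what makes variable-size chunks workable for the per-index-tuple callback.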
--
Justin