From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com> |
Subject: | Re: New IndexAM API controlling index vacuum strategies |
Date: | 2021-04-01 00:58:19 |
Message-ID: | CAH2-WzmMk6-iDc1-OJQ-sj6X-Zw=L2yM2wpqDtUBcUFiyYpNxQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 31, 2021 at 4:45 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Both 0001 and 0002 patch refactors the whole lazy vacuum code. Can we
> merge them? I basically agree with the refactoring made by 0001 patch
> but I'm concerned a bit that having such a large refactoring at very
> close to feature freeze could be risky. We would need more eyes to
> review during stabilization.
I think that Robert makes some related points about how we might cut
scope here. So I'll definitely do some of that, maybe all of it.
> I think it's more clear to use this macro. The macro can be like this:
>
> ParallelVacuumIsActive(vacrel) (((LVRelState) vacrel)->lps != NULL)
Yes, that might be better. I'll consider it when I get back to the
patch tomorrow.
> + * LVDeadTuples stores TIDs that are gathered during pruning/the initial heap
> + * scan. These get deleted from indexes during index vacuuming. They're then
> + * removed from the heap during a second heap pass that performs heap
> + * vacuuming.
> */
>
> The second sentence of the removed lines still seems to be useful
> information for readers?
I don't think that the stuff about shared memory was useful, really.
If we say something like this then it should be about the LVRelState
pointer, not the struct.
> - * We do not process them because it's
> a very rare condition,
> - * and the next vacuum will process them anyway.
>
> Maybe the above comments should not be removed by 0001 patch.
Right.
> Looking at the comments, I thought that this function also frees
> palloc'd dead tuple space but it doesn't. It seems to more clear that
> doing pfree(vacrel->dead_tuples) here or not creating
> lazy_space_free().
I'll need to think about this some more.
> ---
> + if (shared_istat)
> + {
> + /* Get the space for IndexBulkDeleteResult */
> + bulkdelete_res = &(shared_istat->istat);
> +
> + /*
> + * Update the pointer to the corresponding
> bulk-deletion result if
> + * someone has already updated it.
> + */
> + if (shared_istat->updated && istat == NULL)
> + istat = bulkdelete_res;
> + }
>
> (snip)
>
> + if (shared_istat && !shared_istat->updated && istat != NULL)
> + {
> + memcpy(bulkdelete_res, istat, sizeof(IndexBulkDeleteResult));
> + shared_istat->updated = true;
> +
> + /*
> + * Now that top-level indstats[idx] points to the DSM
> segment, we
> + * don't need the locally allocated results.
> + */
> + pfree(istat);
> + istat = bulkdelete_res;
> + }
> +
> + return istat;
>
> If we have parallel_process_one_index() return the address of
> IndexBulkDeleteResult, we can simplify the first part above. Also, it
> seems better to use a separate variable from istat to store the
> result. How about the following structure?
I'll try it that way and see how it goes.
> + /* This won't have changed: */
> + Assert(savefreespace && freespace == PageGetHeapFreeSpace(page));
>
> This assertion can be false because freespace can be 0 if the page's
> PD_HAS_FREE_LINES hint can wrong. Since lazy_vacuum_heap_page() fixes
> it, PageGetHeapFreeSpace(page) in the assertion returns non-zero
> value.
Good catch, I'll fix it.
> The first vacrel->relname should be vacrel->relnamespace.
Will fix.
> I think we can use errmsg_plural() for "X index scans" part.
Yeah, I think that that would be more consistent.
> We should use vacrel->lpdead_item_pages instead of vacrel->rel_pages
Will fix. I was mostly focussed on the log_autovacuum version, which
is why it looks nice already.
> ---
> + /* Stop applying cost limits from this point on */
> + VacuumCostActive = false;
> + VacuumCostBalance = 0;
> + }
>
> I agree with the idea of disabling vacuum delay in emergency cases.
> But why do we do that only in the case of the table with indexes? I
> think this optimization is helpful even in the table with no indexes.
> We can check the XID wraparound emergency by calling
> vacuum_xid_limit_emergency() at some point to disable vacuum delay?
Hmm. I see your point, but at the same time I think that the risk is
lower on a table that has no indexes. It may be true that index
vacuuming doesn't necessarily take the majority of all of the work in
lots of cases. But I think that it is true that it does when things
get very bad -- one-pass/no indexes VACUUM does not care about
maintenance_work_mem, etc.
But let me think about it...I suppose we could do it when one-pass
VACUUM considers vacuuming a range of FSM pages every
VACUUM_FSM_EVERY_PAGES. That's kind of similar to index vacuuming, in
a way -- it wouldn't be too bad to check for emergencies in the same
way there.
> Both vacrel->do_index_vacuuming and vacrel->do_index_cleanup can be
> false also when INDEX_CLEANUP is off. So autovacuum could wrongly
> report emergency if the table's vacuum_index_vacuum reloption is
> false.
Good point. I will need to account for that so that log_autovacuum's
LOG message does the right thing. Perhaps for other reasons, too.
Thanks for the review!
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-04-01 01:02:52 | Re: Refactor SSL test framework to support multiple TLS libraries |
Previous Message | Michael Paquier | 2021-04-01 00:56:02 | Re: invalid data in file backup_label problem on windows |