Quick Links

Re: autovac issue with large number of tables

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc:	Kasahara Tatsuhito <kasahara(dot)tatsuhito(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <nasbyj(at)amazon(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: autovac issue with large number of tables
Date:	2020-12-01 08:31:00
Message-ID:	CAD21AoAZ8mjS+-E2uDTBpERd2VQg6SD+RgQ_CMjS+m_8QHFFVQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Dec 1, 2020 at 4:32 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2020/12/01 16:23, Masahiko Sawada wrote:
> > On Tue, Dec 1, 2020 at 1:48 PM Kasahara Tatsuhito
> > <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>
> >> Hi,
> >>
> >> On Mon, Nov 30, 2020 at 8:59 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>>
> >>>
> >>>
> >>> On 2020/11/30 10:43, Masahiko Sawada wrote:
> >>>> On Sun, Nov 29, 2020 at 10:34 PM Kasahara Tatsuhito
> >>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>
> >>>>> Hi, Thanks for you comments.
> >>>>>
> >>>>> On Fri, Nov 27, 2020 at 9:51 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2020/11/27 18:38, Kasahara Tatsuhito wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On Fri, Nov 27, 2020 at 1:43 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2020/11/26 10:41, Kasahara Tatsuhito wrote:
> >>>>>>>>> On Wed, Nov 25, 2020 at 8:46 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Nov 25, 2020 at 4:18 PM Kasahara Tatsuhito
> >>>>>>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Nov 25, 2020 at 2:17 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, Sep 4, 2020 at 7:50 PM Kasahara Tatsuhito
> >>>>>>>>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Sep 2, 2020 at 2:10 AM Kasahara Tatsuhito
> >>>>>>>>>>>>> <kasahara(dot)tatsuhito(at)gmail(dot)com> wrote:
> >>>>>>>>>>>>>>> I wonder if we could have table_recheck_autovac do two probes of the stats
> >>>>>>>>>>>>>>> data. First probe the existing stats data, and if it shows the table to
> >>>>>>>>>>>>>>> be already vacuumed, return immediately. If not, *then* force a stats
> >>>>>>>>>>>>>>> re-read, and check a second time.
> >>>>>>>>>>>>>> Does the above mean that the second and subsequent table_recheck_autovac()
> >>>>>>>>>>>>>> will be improved to first check using the previous refreshed statistics?
> >>>>>>>>>>>>>> I think that certainly works.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> If that's correct, I'll try to create a patch for the PoC
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I still don't know how to reproduce Jim's troubles, but I was able to reproduce
> >>>>>>>>>>>>> what was probably a very similar problem.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This problem seems to be more likely to occur in cases where you have
> >>>>>>>>>>>>> a large number of tables,
> >>>>>>>>>>>>> i.e., a large amount of stats, and many small tables need VACUUM at
> >>>>>>>>>>>>> the same time.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So I followed Tom's advice and created a patch for the PoC.
> >>>>>>>>>>>>> This patch will enable a flag in the table_recheck_autovac function to use
> >>>>>>>>>>>>> the existing stats next time if VACUUM (or ANALYZE) has already been done
> >>>>>>>>>>>>> by another worker on the check after the stats have been updated.
> >>>>>>>>>>>>> If the tables continue to require VACUUM after the refresh, then a refresh
> >>>>>>>>>>>>> will be required instead of using the existing statistics.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I did simple test with HEAD and HEAD + this PoC patch.
> >>>>>>>>>>>>> The tests were conducted in two cases.
> >>>>>>>>>>>>> (I changed few configurations. see attached scripts)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. Normal VACUUM case
> >>>>>>>>>>>>> - SET autovacuum = off
> >>>>>>>>>>>>> - CREATE tables with 100 rows
> >>>>>>>>>>>>> - DELETE 90 rows for each tables
> >>>>>>>>>>>>> - SET autovacuum = on and restart PostgreSQL
> >>>>>>>>>>>>> - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 2. Anti wrap round VACUUM case
> >>>>>>>>>>>>> - CREATE brank tables
> >>>>>>>>>>>>> - SELECT all of these tables (for generate stats)
> >>>>>>>>>>>>> - SET autovacuum_freeze_max_age to low values and restart PostgreSQL
> >>>>>>>>>>>>> - Consumes a lot of XIDs by using txid_curent()
> >>>>>>>>>>>>> - Measure the time it takes for all tables to be VACUUMed
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> For each test case, the following results were obtained by changing
> >>>>>>>>>>>>> autovacuum_max_workers parameters to 1, 2, 3(def) 5 and 10.
> >>>>>>>>>>>>> Also changing num of tables to 1000, 5000, 10000 and 20000.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Due to the poor VM environment (2 VCPU/4 GB), the results are a little unstable,
> >>>>>>>>>>>>> but I think it's enough to ask for a trend.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ===========================================================================
> >>>>>>>>>>>>> [1.Normal VACUUM case]
> >>>>>>>>>>>>> tables:1000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 20 sec VS (with patch) 20 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 18 sec VS (with patch) 16 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 18 sec VS (with patch) 16 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 19 sec VS (with patch) 17 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 19 sec VS (with patch) 17 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:5000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 77 sec VS (with patch) 78 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 61 sec VS (with patch) 43 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 38 sec VS (with patch) 38 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 45 sec VS (with patch) 37 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 43 sec VS (with patch) 35 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:10000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 152 sec VS (with patch) 153 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 119 sec VS (with patch) 98 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 87 sec VS (with patch) 78 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 100 sec VS (with patch) 66 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 97 sec VS (with patch) 56 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:20000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 338 sec VS (with patch) 339 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 231 sec VS (with patch) 229 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 220 sec VS (with patch) 191 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 234 sec VS (with patch) 147 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 320 sec VS (with patch) 113 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [2.Anti wrap round VACUUM case]
> >>>>>>>>>>>>> tables:1000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 19 sec VS (with patch) 18 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 14 sec VS (with patch) 15 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 14 sec VS (with patch) 14 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 14 sec VS (with patch) 16 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 16 sec VS (with patch) 14 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:5000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 69 sec VS (with patch) 69 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 66 sec VS (with patch) 47 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 59 sec VS (with patch) 37 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 39 sec VS (with patch) 28 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 39 sec VS (with patch) 29 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:10000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 139 sec VS (with patch) 138 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 130 sec VS (with patch) 86 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 120 sec VS (with patch) 68 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 96 sec VS (with patch) 41 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 90 sec VS (with patch) 39 sec
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> tables:20000
> >>>>>>>>>>>>> autovacuum_max_workers 1: (HEAD) 313 sec VS (with patch) 331 sec
> >>>>>>>>>>>>> autovacuum_max_workers 2: (HEAD) 209 sec VS (with patch) 201 sec
> >>>>>>>>>>>>> autovacuum_max_workers 3: (HEAD) 227 sec VS (with patch) 141 sec
> >>>>>>>>>>>>> autovacuum_max_workers 5: (HEAD) 236 sec VS (with patch) 88 sec
> >>>>>>>>>>>>> autovacuum_max_workers 10: (HEAD) 309 sec VS (with patch) 74 sec
> >>>>>>>>>>>>> ===========================================================================
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The cases without patch, the scalability of the worker has decreased
> >>>>>>>>>>>>> as the number of tables has increased.
> >>>>>>>>>>>>> In fact, the more workers there are, the longer it takes to complete
> >>>>>>>>>>>>> VACUUM to all tables.
> >>>>>>>>>>>>> The cases with patch, it shows good scalability with respect to the
> >>>>>>>>>>>>> number of workers.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It seems a good performance improvement even without the patch of
> >>>>>>>>>>>> shared memory based stats collector.
> >>>>>>>>
> >>>>>>>> Sounds great!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Note that perf top results showed that hash_search_with_hash_value,
> >>>>>>>>>>>>> hash_seq_search and
> >>>>>>>>>>>>> pgstat_read_statsfiles are dominant during VACUUM in all patterns,
> >>>>>>>>>>>>> with or without the patch.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Therefore, there is still a need to find ways to optimize the reading
> >>>>>>>>>>>>> of large amounts of stats.
> >>>>>>>>>>>>> However, this patch is effective in its own right, and since there are
> >>>>>>>>>>>>> only a few parts to modify,
> >>>>>>>>>>>>> I think it should be able to be applied to current (preferably
> >>>>>>>>>>>>> pre-v13) PostgreSQL.
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1
> >>>>>>>>>>>>
> >>>>>>>>>>>> +
> >>>>>>>>>>>> + /* We might be better to refresh stats */
> >>>>>>>>>>>> + use_existing_stats = false;
> >>>>>>>>>>>> }
> >>>>>>>>>>>> + else
> >>>>>>>>>>>> + {
> >>>>>>>>>>>>
> >>>>>>>>>>>> - heap_freetuple(classTup);
> >>>>>>>>>>>> + heap_freetuple(classTup);
> >>>>>>>>>>>> + /* The relid has already vacuumed, so we might be better to
> >>>>>>>>>>>> use exiting stats */
> >>>>>>>>>>>> + use_existing_stats = true;
> >>>>>>>>>>>> + }
> >>>>>>>>>>>>
> >>>>>>>>>>>> With that patch, the autovacuum process refreshes the stats in the
> >>>>>>>>>>>> next check if it finds out that the table still needs to be vacuumed.
> >>>>>>>>>>>> But I guess it's not necessarily true because the next table might be
> >>>>>>>>>>>> vacuumed already. So I think we might want to always use the existing
> >>>>>>>>>>>> for the first check. What do you think?
> >>>>>>>>>>> Thanks for your comment.
> >>>>>>>>>>>
> >>>>>>>>>>> If we assume the case where some workers vacuum on large tables
> >>>>>>>>>>> and a single worker vacuum on small tables, the processing
> >>>>>>>>>>> performance of the single worker will be slightly lower if the
> >>>>>>>>>>> existing statistics are checked every time.
> >>>>>>>>>>>
> >>>>>>>>>>> In fact, at first I tried to check the existing stats every time,
> >>>>>>>>>>> but the performance was slightly worse in cases with a small number of workers.
> >>>>>>>>
> >>>>>>>> Do you have this benchmark result?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>>> (Checking the existing stats is lightweight , but at high frequency,
> >>>>>>>>>>> it affects processing performance.)
> >>>>>>>>>>> Therefore, at after refresh statistics, determine whether autovac
> >>>>>>>>>>> should use the existing statistics.
> >>>>>>>>>>
> >>>>>>>>>> Yeah, since the test you used uses a lot of small tables, if there are
> >>>>>>>>>> a few workers, checking the existing stats is unlikely to return true
> >>>>>>>>>> (no need to vacuum). So the cost of existing stats check ends up being
> >>>>>>>>>> overhead. Not sure how slow always checking the existing stats was,
> >>>>>>>>>> but given that the shared memory based stats collector patch could
> >>>>>>>>>> improve the performance of refreshing stats, it might be better not to
> >>>>>>>>>> check the existing stats frequently like the patch does. Anyway, I
> >>>>>>>>>> think it’s better to evaluate the performance improvement with other
> >>>>>>>>>> cases too.
> >>>>>>>>> Yeah, I would like to see how much the performance changes in other cases.
> >>>>>>>>> In addition, if the shared-based-stats patch is applied, we won't need to reload
> >>>>>>>>> a huge stats file, so we will just have to check the stats on
> >>>>>>>>> shared-mem every time.
> >>>>>>>>> Perhaps the logic of table_recheck_autovac could be simpler.
> >>>>>>>>>
> >>>>>>>>>>> BTW, I found some typos in comments, so attache a fixed version.
> >>>>>>>>
> >>>>>>>> The patch adds some duplicated codes into table_recheck_autovac().
> >>>>>>>> It's better to make the common function performing them and make
> >>>>>>>> table_recheck_autovac() call that common function, to simplify the code.
> >>>>>>> Thanks for your comment.
> >>>>>>> Hmm.. I've cut out the duplicate part.
> >>>>>>> Attach the patch.
> >>>>>>> Could you confirm that it fits your expecting?
> >>>>>>
> >>>>>> Yes, thanks for updataing the patch! Here are another review comments.
> >>>>>>
> >>>>>> + shared = pgstat_fetch_stat_dbentry(InvalidOid);
> >>>>>> + dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
> >>>>>>
> >>>>>> When using the existing stats, ISTM that these are not necessary and
> >>>>>> we can reuse "shared" and "dbentry" obtained before. Right?
> >>>>> Yeah, but unless autovac_refresh_stats() is called, these functions
> >>>>> read the information from the
> >>>>> local hash table without re-read the stats file, so the process is very light.
> >>>>> Therefore, I think, it is better to keep the current logic to keep the
> >>>>> code simple.
> >>>>>
> >>>>>>
> >>>>>> + /* We might be better to refresh stats */
> >>>>>> + use_existing_stats = false;
> >>>>>>
> >>>>>> I think that we should add more comments about why it's better to
> >>>>>> refresh the stats in this case.
> >>>>>>
> >>>>>> + /* The relid has already vacuumed, so we might be better to use existing stats */
> >>>>>> + use_existing_stats = true;
> >>>>>>
> >>>>>> I think that we should add more comments about why it's better to
> >>>>>> reuse the stats in this case.
> >>>>> I added comments.
> >>>>>
> >>>>> Attache the patch.
> >>>>>
> >>>>
> >>>> Thank you for updating the patch. Here are some small comments on the
> >>>> latest (v4) patch.
> >>>>
> >>>> + * So if the last time we checked a table that was already vacuumed after
> >>>> + * refres stats, check the current statistics before refreshing it.
> >>>> + */
> >>>>
> >>>> s/refres/refresh/
> >> Thanks! fixed.
> >> Attached the patch.
> >>
> >>>>
> >>>> -----
> >>>> +/* Counter to determine if statistics should be refreshed */
> >>>> +static bool use_existing_stats = false;
> >>>> +
> >>>>
> >>>> I think 'use_existing_stats' can be declared within table_recheck_autovac().
> >>>>
> >>>> -----
> >>>> While testing the performance, I realized that the statistics are
> >>>> reset every time vacuumed one table, leading to re-reading the stats
> >>>> file even if 'use_existing_stats' is true. Please refer that vacuum()
> >>>> eventually calls AtEOXact_PgStat() which calls to
> >>>> pgstat_clear_snapshot().
> >>>
> >>> Good catch!
> >>>
> >>>
> >>>> I believe that's why the performance of the
> >>>> method of always checking the existing stats wasn’t good as expected.
> >>>> So if we save the statistics somewhere and use it for rechecking, the
> >>>> results of the performance benchmark will differ between these two
> >>>> methods.
> >> Thanks for you checks.
> >> But, if a worker did vacuum(), that means this worker had determined
> >> need vacuum in the
> >> table_recheck_autovac(). So, use_existing_stats set to false, and next
> >> time, refresh stats.
> >> Therefore I think the current patch is fine, as we want to avoid
> >> unnecessary refreshing of
> >> statistics before the actual vacuum(), right?
> >
> > Yes, you're right.
> >
> > When I benchmarked the performance of the method of always checking
> > existing stats I edited your patch so that it sets use_existing_stats
> > = true even if the second check is false (i.g., vacuum is needed).
> > And the result I got was worse than expected especially in the case of
> > a few autovacuum workers. But it doesn't evaluate the performance of
> > that method rightly as the stats snapshot is cleared every time
> > vacuum. Given you had similar results, I guess you used a similar way
> > when evaluating it, is it right? If so, it’s better to fix this issue
> > and see how the performance benchmark results will differ.
> >
> > For example, the results of the test case with 10000 tables and 1
> > autovacuum worker I reported before was:
> >
> > 10000 tables:
> > autovac_workers 1 : 158s,157s, 290s
> >
> > But after fixing that issue in the third method (always checking the
> > existing stats), the results are:
>
> Could you tell me how you fixed that issue? You copied the stats to
> somewhere as you suggested or skipped pgstat_clear_snapshot() as
> I suggested?

I used the way you suggested in this quick test; skipped
pgstat_clear_snapshot().

>
> Kasahara-san seems not to like the latter idea because it might
> cause bad side effect. So we should use the former idea?

Not sure. I'm also concerned about the side effect but I've not checked yet.

Since probably there is no big difference between the two ways in
terms of performance I'm going to see how the performance benchmark
result will change first. Maybe meanwhile we can discuss on these two
choices.

Regards,

--
Masahiko Sawada
EnterpriseDB: https://www.enterprisedb.com/

In response to

Re: autovac issue with large number of tables at 2020-12-01 07:32:20 from Fujii Masao

Responses

Re: autovac issue with large number of tables at 2020-12-02 03:53:51 from Masahiko Sawada

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jürgen Purtz	2020-12-01 08:38:20	Re: Change JOIN tutorial to focus more on explicit joins
Previous Message	Drouvot, Bertrand	2020-12-01 08:29:31	Re: Add Information during standby recovery conflicts