Re: Millions of tables

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Alex Ignatov (postgrespro)" <a(dot)ignatov(at)postgrespro(dot)ru>, Terry Schmitt <tschmitt(at)schmittworks(dot)com>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Millions of tables
Date: 2016-10-05 12:34:02
Message-ID: CAEtnbpXbKBR1Hp_prQWNMbCDLVWn0Upo6V3_c83f+XFJ-aKNQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Sep 30, 2016 at 4:49 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 9/29/16 6:11 AM, Alex Ignatov (postgrespro) wrote:
>
>> With millions of tables you have to set autovacuum_max_workers
>> sky-high =). We have some situation when at thousands of tables
>> autovacuum can’t vacuum all tables that need it. Simply it vacuums some
>> of most modified table and never reach others. Only manual vacuum can
>> help with this situation. With wraparound issue it can be a nightmare
>>
>
> Specifically, autovac isn't going to start worrying about anti-wrap
> vacuums until tables start hitting autovacuum_freeze_max_age (or
> autovacuum_multixact_freeze_max_age). Any tables that hit that threshold
> go to the front of the line for being vacuumed. (But keep in mind that
> there is no universal line, just what each worker computes on it's own when
> it's started).
>
> Where things will completely fall apart for you is if a lot of tables all
> have roughly the same relfrozenxid (or relminmxid), like they would
> immediately after a large load. In that scenario you'll suddenly have loads
> of work for autovac to do, all at the same time. That will make the
> database, DBAs and you Very Unhappy (tm).
>
> Somehow, some way, you *must* do a vacuum of the entire database. Luckily
> the freeze map in 9.6 means you'd only have to do that one time (assuming
> the data really is static). In any older version, (auto)vacuum will need to
> eventually *read everything in every table* at least once every ~2B
> transactions.
>

Data is not static. The 4M tables fall into one of two groups.

Group A contains 2M tables. INSERT will occur ~100 times/day and maximum
number of records anticipated will be 200k. Periodic DELETE's will occur
removing "old" records. Age is something the client sets and I have no way
of saying 1 or 10k records will be removed.

Group B contains the other 2M tables. Maximum records ~140k and UPSERT
will be the only mechanism used to populate and maintain. Periodic
DELETE's may run on these tables as well removing "old" records.

Will a set of tables require vacuum'ing at the same time? Quite possibly
but I have no way to say 2 or 200k tables will need it.

When you say "must do a vacuum of the entire database", are you saying the
entire database must be vacuum'd as a whole per 2B transactions or all
tables must be vacuum'd eventually at least once? I want to be absolutely
clear on what you're saying.

> There is one potentially significant difference between autovac and manual
> vacuums here; autovac treats toast tables as just another table, with their
> own stats and their own freeze needs. If you're generating a lot of toast
> records that might make a difference.
>

I do not anticipate TOAST entering the picture. No single column or record
> 8KB or even approaching it. We have a few databases that (ab)use pg_toast
and I want to avoid those complications.

-Greg

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim Nasby 2016-10-08 21:42:01 Re: Millions of tables
Previous Message Gavin Flower 2016-10-04 04:20:27 Re: MYSQL Stats