Re: When do vacuumed pages/tuples become available for reuse?

From: rihad <rihad(at)mail(dot)ru>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: When do vacuumed pages/tuples become available for reuse?
Date: 2019-04-11 19:04:40
Message-ID: 8aa21502-aed5-2114-dbb0-e2e03b7557eb@mail.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 04/11/2019 10:13 PM, Jeff Janes wrote:
> On Thu, Apr 11, 2019 at 12:18 PM rihad <rihad(at)mail(dot)ru
> <mailto:rihad(at)mail(dot)ru>> wrote:
>
> On 04/11/2019 08:09 PM, Jeff Janes wrote:
>> On Thu, Apr 11, 2019 at 11:44 AM rihad <rihad(at)mail(dot)ru
>> <mailto:rihad(at)mail(dot)ru>> wrote:
>>
>>
>> Since we dump&restore production DB daily into staging
>> environment, the difference in size (as reported by psql's
>> \l+) is 11GB in a freshly restored DB as opposed to 70GB in
>> production.
>>
>>
>> Yeah, that seems like a problem.  Do you have long lived
>> transactions/snapshots that are preventing vacuuming from
>> removing dead tuples? You can run a manual "vacuum verbose" and
>> see how many dead but nonremovable tuples there were, or
>> set log_autovacuum_min_duration to some non-negative value less
>> than the autovac takes, and do the same.
>>
> vacuum frees tuples just fine. It's just that by the time each run
> finishes many more accumulate due to table update activity, ad
> nauseum. So this unused space constantly grows. Here's a sample
> autovacuum run:
>
> 2019-04-11 19:39:44.450841500 [] LOG:  automatic vacuum of table
> "foo.public.bar": index scans: 1
> 2019-04-11 19:39:44.450843500   pages: 0 removed, 472095 remain, 4
> skipped due to pins, 39075 skipped frozen
> 2019-04-11 19:39:44.450844500   tuples: 19150 removed, 2725811
> remain, 465 are dead but not yet removable
> 2019-04-11 19:39:44.450845500   buffer usage: 62407557 hits,
> 6984769 misses, 116409 dirtied
> 2019-04-11 19:39:44.450846500   avg read rate: 16.263 MB/s, avg
> write rate: 0.271 MB/s
> 2019-04-11 19:39:44.450847500   system usage: CPU 59.05s/115.26u
> sec elapsed 3355.28 sec
>
>
> This data doesn't seem to support either one of our theories.  "Dead
> but not yet removable" is low.  But "removed" also seems pretty low. 
> Is 19,150 really the number of updates you think occur over the course
> of an hour which causes the problem you are seeing?  Updates that
> happened during one vacuum should be cleanly caught by the next one,
> so you should only see a steady state of bloat, not unbounded increase.
>
> But your buffer usage being 132 time the number of pages in the table
> suggests it is your indexes, not your table, which are bloated.
> How many indexes do you have, and of what type?  Index pages can only
> get reused when they become completely empty, or when a new indexed
> value fits into (or near) the key-space that that page already
> covers.  So if the key space for new tuples is constantly migrating
> around and your pages never become absolutely empty, you can get
> unbounded bloat in the indexes.

Way to many indexes. I'm going to have a hard time convincing our
programmers to get rid of any of them )

>
> Can you compare the sizes object by object between the live and the
> stage, taking care not to include index (or toast) size into the size
> of their parent table?

You're right, it's mostly indexes that are bloated.

Staging:
# select pg_size_pretty(pg_relation_size('foo'));
 pg_size_pretty
----------------
 2924 MB
(1 row)

# select pg_size_pretty(pg_indexes_size('foo'));
 pg_size_pretty
----------------
 1958 MB
(1 row)

Prod:

# select pg_size_pretty(pg_relation_size('foo'));
 pg_size_pretty
----------------
 3688 MB
(1 row)

# select pg_size_pretty(pg_indexes_size('foo'));
 pg_size_pretty
----------------
 60 GB
(1 row)

>
>>
>> Also, what does pg_freespace
>> (https://www.postgresql.org/docs/current/pgfreespacemap.html)
>> show about the available of space in the table? How
>> about pgstattuple
>> (https://www.postgresql.org/docs/current/pgstattuple.html)
> Thanks, I'll try those. But as I said freshly restored DB is only
> 11GB in size, not 70 (only public schema is used).
>
>
> Yeah, but we need to know **why** that extra 59GB  is not being
> reused, not simply the fact that it isn't being reused.  If it isn't
> listed as free in the freespace map, then PostgreSQL might not know
> how to find it in order to reuse it, for example. But now that I think
> it is the indexes, not the table, that is bloated I would chase that
> part down first.  No point checking the freespace of the table proper
> if the problem is with the indexes.
> Cheers,
>
> Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2019-04-11 19:39:15 Re: shared_buffers on Big RAM systems
Previous Message Jeff Janes 2019-04-11 19:02:16 Re: When do vacuumed pages/tuples become available for reuse?