Re: GIN Trigram Index Size

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Christian Ramseyer <rc(at)networkz(dot)ch>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: GIN Trigram Index Size
Date: 2015-09-10 04:40:43
Message-ID: CAMkU=1xn9LBp4ec88db6Xyy=NKyn+SHAEizzqgopeLWR=W5aNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Sep 9, 2015 at 2:54 PM, Christian Ramseyer <rc(at)networkz(dot)ch> wrote:

> I have read some discussions about pending list bloat issues, but there
> it was suggested that vacuuming the table should reclaim the space, and
> this does not seem to the case. The only way I found to reduce the size
> is by doing a REINDEX.
>

Vacuuming will allow the space to be reused internally. It will not
visibly shrink the index, but will mark that space as eligible for reuse.

If you have a 36GB index and a reindex would have reduced it to 15GB, then
a vacuum will leave it at 36GB but with 21GB of that as free space. The
index should then stop growing and remain at the same size for 4 days while
it fills up the internally freed space, at which point it would start
growing again at its usual rate (until you did another vacuum).

Your best bet for now might be to turn off fastupdate on that index. It
will eliminate the re-occurrence of the bloat, but might cause your
insertions to become too slow (on the other hand, it might make them faster
on average, it is hard to know without trying it). If you can't turn it
off, then you can set the table-specific autovacuum_analyze_scale_factor to
a very small value (even zero) to get autoanalyze to process the table more
often.

> The string in msg can be quite large, but average to around 200 characters:
>
> select avg(length(msg)) from (select msg from logs_09 where log_date
> between '2015-09-01' and '2015-09-08') x;
>
> avg
> ----------------------
> 199.3491688585874446
>
> Any insights on what might be happening here? Can I somehow check if the
> space is used up in sparsely filled pages or this list structures?

You can use pg_freespacemap once the vacuum (or autoanalyze) completes to
see how many pages have 0 space available, and how many of 8160 space
available:

select avail, count(*) from pg_freespace('pgbench_accounts_gin_idx') group
by avail;

This won't work while the bloat is actively happening, though. During
those times, freespacemap is not aware that the space is free, which is the
root of the problem. The newest version of pageinspect has
gin_page_opaque_info which will show those pages as being deleted, but that
version is not yet released.

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message hubert depesz lubaczewski 2015-09-10 09:56:55 Re: Bugs with like_option in CREATE TABLE
Previous Message Melvin Davidson 2015-09-10 01:21:00 Re: Bugs with like_option in CREATE TABLE