GiST index for pgtrgm bloats a lot

From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: GiST index for pgtrgm bloats a lot
Date: 2015-05-18 12:12:50
Message-ID: 20150518121250.GA19976@depesz.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,
We have this situation on many servers - used versions are 9.1.14 and/or
9.3.6, on Linux servers.

There is a table, with ~ 1 million rows.
There are no deletes there, inserts happen at the rate of ~ 100-200
daily (but not spread, it's usually a single moment in day where there
happen the inserts).
There is also HUGE number of updates - up to 200,000 rows updated per
hour.

Schema of the table:
Table "schemaXX.tableYY"
Column | Type | Modifiers
--------------------------------------+-----------------------------+---------------------------------------------------------------
id | bigint | not null default nextval('schemaXX.tableYY_id_seq'::regclass)
col_a | character varying(255) |
............................... | bigint | not null
............................... | character varying(255) |
............................... | bigint |
............................... | character varying(255) | not null
............................... | character varying(255) |
............................... | timestamp without time zone |
............................... | timestamp without time zone |
............................... | bigint |
............................... | boolean |
............................... | boolean |
............................... | timestamp without time zone |
............................... | timestamp without time zone |
............................... | boolean |
............................... | text |
............................... | boolean | default false
............................... | character varying(255) |
............................... | bigint |
............................... | boolean | default true
col_c | character varying(255) |
............................... | character varying(255) | default '----'::character varying
............................... | bigint |
............................... | bigint | not null
............................... | bigint | not null
col_b | character varying(255) |
............................... | bigint |
............................... | boolean |
............................... | boolean |
............................... | bigint |
............................... | text |
............................... | boolean |
............................... | text |
............................... | boolean |
............................... | character varying(255) |
............................... | boolean |
............................... | boolean |
............................... | bigint |
............................... | character varying(255) |
............................... | text |
............................... | bigint |
............................... | text |
............................... | text |
............................... | character varying(255) |
............................... | integer |
............................... | character varying(255) |
............................... | character varying(255) |
............................... | character varying(255) |
............................... | bigint |

indexes:
There are 17 indexes in total, and the problematic is:

"index_trgm_courses_composite_search" gist (((((COALESCE(lower(col_a::text), ''::text) || ' '::text) || COALESCE(lower(col_b::text), ''::text)) || ' '::text) || COALESCE(lower(col_c::text), ''::text)) gist_trgm_ops)

On 2015-04-27 this index was reindexed. It's size (relpages as shown in
pg_class) dropped from 152698 to 45865. Since then the index is consistently
increasing in size. Today it was 343370 pages, and after reindex it dropped to
29091.

On these databases we have some (not many) 1-2 minute transactions, but usually
every transaction within 30 seconds

This table is vacuumed (autovacuum) quite a lot, usually at least 10 times a
day.

Is there anything we could do to help diagnose the problem, and fix it?
I don't think I can install custom pg version, and downtime would be
complicated to get approval, but anything about the data/database I can
check that would allow diagnosing the bug, I'd be happy to do.

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Heikki Linnakangas 2015-05-18 13:19:41 Re: GiST index for pgtrgm bloats a lot
Previous Message guettli.postgres 2015-05-18 09:11:36 BUG #13310: Please update docs of regexp_replace()