Re: GiST index for pgtrgm bloats a lot

From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: GiST index for pgtrgm bloats a lot
Date: 2015-05-18 13:36:48
Message-ID: 20150518133648.GA22898@depesz.com
Lists: pgsql-bugs

On Mon, May 18, 2015 at 04:19:41PM +0300, Heikki Linnakangas wrote:
> Are the columns that are included in the bloated index also updated that
> often? If not, I'd suggest moving those columns to a separate table with a
> one-to-one relationship to the main table. Or perhaps just create a helper
> table that contains copies of those columns, and keep it up-to-date with
> triggers. That would reduce the churn in the index.

I'm not sure if this is possible, but will look into it.
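
For reference, the helper-table idea quoted above could look roughly like the sketch below. It is only an illustration, not something from this thread: it uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs anywhere, and every name in it (main_table, search_helper, the triggers, make_db, helper_rows) is hypothetical. The point it shows is that the frequently updated column stays in the main table, while the searched column is copied to a small side table that is only written when that column really changes.

```python
import sqlite3

# Hypothetical schema: "hits" is updated constantly, "title" rarely.
# The search index lives on the helper table, not on the busy main table.
SCHEMA = """
CREATE TABLE main_table (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    hits  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE search_helper (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE INDEX search_helper_title ON search_helper (title);

-- Copy new rows into the helper table.
CREATE TRIGGER helper_ins AFTER INSERT ON main_table BEGIN
    INSERT INTO search_helper (id, title) VALUES (NEW.id, NEW.title);
END;

-- Touch the helper table only when the searched column really changes.
CREATE TRIGGER helper_upd AFTER UPDATE OF title ON main_table
WHEN OLD.title IS NOT NEW.title BEGIN
    UPDATE search_helper SET title = NEW.title WHERE id = NEW.id;
END;

-- Remove the copy when the main row goes away.
CREATE TRIGGER helper_del AFTER DELETE ON main_table BEGIN
    DELETE FROM search_helper WHERE id = OLD.id;
END;
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def helper_rows(conn):
    """Return the current contents of the helper table."""
    return conn.execute(
        "SELECT id, title FROM search_helper ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    conn.execute("INSERT INTO main_table (id, title) VALUES (1, 'hello world')")
    for _ in range(1000):  # hot-column churn never reaches the helper table
        conn.execute("UPDATE main_table SET hits = hits + 1 WHERE id = 1")
    print(helper_rows(conn))
```

In PostgreSQL the triggers would be written as trigger functions and the index on the helper table would be the GiST trigram one; whether that actually helps depends on how often the indexed columns change.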

> 1. When GiST has multiple equally good choices where it could insert a
> tuple, it favours branches that are "earlier" in the index. Always
> descending down the same branch is good for cache efficiency when you insert
> multiple items with similar keys, but the downside is that the other
> branches can easily have a lot of free space that goes unused, while the
> "hot" branch just gets split repeatedly. This is explained in the comments
> in the gistchoose() function. That leads to bloat, because the free space
> isn't being used.

This looks related. Too bad that we don't have any other way to handle
it, but at the very least I know it's not some "bug", it's just
a missing feature.
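
To convince myself of the effect described above, here is a toy model. It is a sketch under heavy assumptions and not PostgreSQL's gistchoose(): pages are just entry counters, every page is treated as an equally good choice for every insert, and a page that overflows is split in half with the new half appended at the end. The names simulate, first_tied and emptiest_tied are made up for this example, and emptiest_tied is only a hypothetical alternative used for comparison.

```python
def simulate(choose, pages, inserts, capacity=100):
    """Insert `inserts` entries into `pages` (a list of per-page entry counts).

    `choose` picks the target page among pages that are all assumed to be
    equally good. A page that exceeds `capacity` is split in two halves.
    """
    pages = list(pages)
    for _ in range(inserts):
        i = choose(pages)
        pages[i] += 1
        if pages[i] > capacity:
            moved = pages[i] // 2   # half of the entries go to a new page
            pages[i] -= moved
            pages.append(moved)
    return pages


def first_tied(pages):
    """Always take the earliest of the equally good pages."""
    return 0


def emptiest_tied(pages):
    """Hypothetical alternative: take the tied page with the most free space."""
    return min(range(len(pages)), key=pages.__getitem__)


def utilization(pages, capacity=100):
    """Fraction of the allocated space that holds entries."""
    return sum(pages) / (len(pages) * capacity)


if __name__ == "__main__":
    start = [20] * 10   # ten pages, mostly empty after a lot of deletes
    for policy in (first_tied, emptiest_tied):
        result = simulate(policy, start, inserts=500)
        print(policy.__name__, len(result), "pages,",
              f"{utilization(result):.0%} full")
```

In this model the "always first" policy keeps splitting the first page and never touches the free space in the other nine, so the same 700 entries end up spread over noticeably more pages than with the alternative. A real index is a tree with per-key penalties, so the numbers mean nothing beyond illustrating the mechanism.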

Thanks a lot,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
http://depesz.com/
