Re: gin performance issue.

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "'pgsql-performance(at)postgresql(dot)org'" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: gin performance issue.
Date: 2016-02-08 10:21:53
Message-ID: B6F6FD62F2624C4C9916AC0175D56D88420E12DB@jenmbs01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Freitag, 5. Februar 2016 16:07

> > http://explain.depesz.com/s/wKv7
> > Postgres Version 9.3.10 (Linux)
> >
> > Hello,
> > this is a large daily table that only get bulk inserts (200-400 /days) with no update.
> > After rebuilding the whole table, the Bitmap Index Scan on
> > r_20160204_ix_toprid falls under 1 second (from 800)
> >
> > Fastupdate is using the default, but autovacuum is disabled on that
> > table which contains 30 Mio rows.

> Pre-9.5, it's a pretty bad idea to disable autovacuum on a GIN index,
> because then the "pending list" only gets flushed when it exceeds
> work_mem. (Obviously, using a large work_mem setting makes this
> worse.)
>
> regards, tom lane

Hello,
knowing what the problem is don't really help here:

- auto vacuum will not run as these are insert only tables
- according to this post, auto analyze would also do the job:
http://postgresql.nabble.com/Performance-problem-with-gin-index-td5867870.html
It seems that this information is missing in the doc

but it sadly neither triggers in our case as we have manual analyzes called during the dataprocesssing just following the imports.
Manual vacuum is just too expensive here.

Hence disabling fast update seems to be our only option.

I hope this problem will help push up the 9.5 upgrade on our todo list :)

Ideally, we would then like to flush the pending list inconditionally after the imports.
I guess we could achieve something approaching while modifying the analyze scale factor and gin_pending_list_limit
before/after the (bulk) imports, but having the possibility to flush it per SQL would be better.
Is this a reasonable feature wish?

And a last question: how does the index update work with bulk (COPY) inserts:
without pending list: is it like a per row trigger or will the index be cared of afterwards ?
with small pending lists : is there a concurrency problem, or can both tasks cleanly work in parallel ?

best regards,

Marc mamin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2016-02-08 14:35:58 Re: bad COPY performance with NOTIFY in a trigger
Previous Message Gustav Karlsson 2016-02-08 10:04:58 Re: Primary key index suddenly became very slow