Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

From: Richard Neill <rjn(at)richardneill(dot)org>
To:
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Date: 2012-12-27 10:49:58
Message-ID: 50DC27D6.10608@richardneill.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


> The partial index is highly leveraged. If every tuple in the
> table is updated once, that amounts to every tuple in the index
> being updated 25,000 times.
>
> How so? That sounds like O(n_2) behaviour.
>
> If the table has 5 million rows while the index has 200 (active) rows at
> any given time, then to update every row in the table to null and back
> again would be 100% turn over of the table. But each such change would
> lead to an addition and then a deletion from the index. So 100%
> turnover of the table would be a 5 million / 200 = 25,000 fold turn of
> the index.

Sorry, I was being dense. I misread that as:
"every time a single tuple in the table is updated, the entire index
(every row) is updated".
Yes, of course your explanation makes sense.

>
> There is some code that allows a btree index entry to get killed (and so
> the slot to be reused) without any vacuum, if a scan follows that entry
> and finds the corresponding tuple in the table no longer visible to
> anyone. I have not examined this code, and don't know whether it is
> doing its job but just isn't enough to prevent the bloat, or if for some
> reason it is not applicable to your situation.
>

It looks like my solution is going to be a REINDEX invoked from cron, or
maybe just every 100k inserts.

In terms of trying to improve this behaviour for other PG users in the
future, are there any more diagnostics I can do for you? Having found a
special case, I'd like to help permanently resolve it if I can.

Thanks very much again.

Best wishes,

Richard

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Neill 2012-12-27 10:52:01 Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Previous Message Pavel Stehule 2012-12-27 05:40:23 Re: Performance on Bulk Insert to Partitioned Table