Re: Setting Statistics on Functional Indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: sthomas(at)optionshouse(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Setting Statistics on Functional Indexes
Date: 2012-10-26 22:01:38
Message-ID: 3989.1351288898@sss.pgh.pa.us
Lists: pgsql-performance

Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
> Because once you've accessed that last index page, it would be rather
> trivial finding out how many duplicate tids are in that page and, with
> a small CPU cost (no disk access if you don't query other index pages)
> you could verify the assumption of near-uniqueness.

I thought about that too, but I'm not sure how promising the idea is.
In the first place, it's not clear when to stop counting duplicates, and
in the second, I'm not sure we could get away with not visiting the heap
to check for tuple liveness. There might be a lot of apparent
duplicates in the index that just represent unreaped old versions of a
frequently-updated endpoint tuple. (The existing code is capable of
returning a "wrong" answer if the endpoint tuple is dead, but I don't
think it matters much in most cases. I'm less sure such an argument
could be made for dup-counting.)
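
To make the liveness concern concrete, here is a toy Python model. It is only a sketch, not PostgreSQL code: `IndexEntry`, `count_endpoint_duplicates`, and the `heap_is_live` map are hypothetical names invented for illustration, and the "last index page" is just a list of (key, tid) pairs. It assumes each update of the endpoint row left behind its own index entry that has not been reaped yet.

```python
from collections import namedtuple

# Hypothetical stand-in for a btree leaf entry: an indexed key plus a heap TID.
IndexEntry = namedtuple("IndexEntry", ["key", "tid"])


def count_endpoint_duplicates(last_page, heap_is_live=None):
    """Count entries on the last index page that share the maximum key.

    Without heap_is_live, every index entry counts (index-only inspection).
    With heap_is_live (a dict mapping tid -> bool), only entries whose heap
    tuple is live count, which models paying for the heap visit.
    """
    if not last_page:
        return 0
    endpoint_key = max(entry.key for entry in last_page)
    count = 0
    for entry in last_page:
        if entry.key != endpoint_key:
            continue
        if heap_is_live is not None and not heap_is_live.get(entry.tid, False):
            continue  # dead or unknown version: not a real duplicate
        count += 1
    return count


# Assumed scenario: the endpoint row (key 100) was updated three times,
# leaving four index entries but only one live heap tuple.
last_page = [
    IndexEntry(98, (7, 1)),
    IndexEntry(99, (7, 2)),
    IndexEntry(100, (8, 1)),
    IndexEntry(100, (8, 2)),
    IndexEntry(100, (8, 3)),
    IndexEntry(100, (8, 4)),
]
heap_is_live = {
    (7, 1): True,
    (7, 2): True,
    (8, 1): False,
    (8, 2): False,
    (8, 3): False,
    (8, 4): True,
}

apparent = count_endpoint_duplicates(last_page)
live = count_endpoint_duplicates(last_page, heap_is_live)
print(apparent, live)
```

In this made-up scenario the index alone suggests four duplicates of the endpoint value, while only one version is actually live, so a dup count taken without heap visits would wrongly reject near-uniqueness.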

regards, tom lane
