Re: Why does the query planner use two full indexes, when a dedicated partial index exists?

From: Richard Neill <rn214(at)richardneill(dot)org>
To: Richard Neill <rn214(at)richardneill(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Date: 2012-12-21 03:16:20
Message-ID: 50D3D484.6080100@richardneill.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 21/12/12 02:34, Richard Neill wrote:
>
> Reindexing only takes a couple of seconds, and restores correctness.
>

Interestingly, the partial index (after reindexing) is only 16kB in
size; whereas the table is 1.1 GB, and the normal single-column indexes
are about 250MB in size.

In terms of what's physically happening in reality,

- tbl_tracker keeps a record of all books that move through the system
over a period of one month (at a rate of about 20/second, or 1
million/day), after which they are deleted.

- the partial index, tbl_tracker_performance_1_idx tracks only those
books which are currently "in flight" - books remain in flight for
about 200 seconds as they go round the machine.
(While in flight, these have exit_state = NULL)

- the partial index is used to overcome a design defect(*) in the
sorter machine, namely that it doesn't number each book uniquely,
but wraps the parcel_id_code every few hours. Worse, some books can
remain on the sorter for several days (if they jam), so the numbering
isn't a clean "wraparound", but more like a fragmented (and
occasionally lossy) filesystem.

- What I'm trying to do is trace the history of the books
through the system and assign each one a proper unique id.
So, if I see a book with "parcel_id_code = 37",
is it a new book (after pid wrap), or is it the same book I saw 1
minute ago, that hasn't exited the sorter?

So... is there some way to, for example, set a trigger that will reindex
every time the index exceeds 1000 rows?

Richard

(*)Readers of The Daily WTF might appreciate another curious anomaly:
this machine originally had an RS-232 port; it now uses ethernet, but
TxD and RxD use different TCP sockets on different network ports!

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2012-12-21 03:24:09 Re: Performance on Bulk Insert to Partitioned Table
Previous Message Richard Neill 2012-12-21 02:34:44 Re: Why does the query planner use two full indexes, when a dedicated partial index exists?