Re: optimizing a query

From: Jonathan Vanasco <postgres(at)2xlp(dot)com>
To: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: optimizing a query
Date: 2016-06-22 18:23:55
Message-ID: D6AF3833-B022-4DE6-B1C1-E9290E639A9F@2xlp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Jun 22, 2016, at 4:25 AM, Erik Gustafson wrote:

> don't you want an index on t_a2b.col_a, maybe partial where col_a=1 ?

that table has indexes on all columns. they're never referenced because the rows are so short. this was just an example query too, col_a has 200k variations

After a lot of testing, I think I found a not-bug but possible area-for-improvement in the planner when joining against a table for filtering (using my production 9.5.2 box)

I checked a query against multiple possible indexes using the related columns. only one of indexes was on the table for each series of tests, and I analyzed the table after the drop/create of indexes.

Note 1: The only time an index-only scan is used, is on this form:

CREATE INDEX idx_partial_fkey_id_partial ON table_a(fkey_1, id, col_partial) WHERE col_partial IS NOT FALSE;

Omitting the col_partial from being indexed will trigger a Bitmap Heap Scan on the full table with a recheck condition:

CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;

This shouldn't be necessary. the planner knew that `col_partial` fulfilled the WHERE clause when it used the index, but scanned the table to check it anyways.

On most tables the heap scan was negligible, but on a few larger tables it accounted a 20% increase in execution.

Note 2:

This is odd, but this index is used by the planner:
CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;

but this index is never used:
CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) WHERE col_partial IS NOT FALSE;

I honestly don't know why the second index would not be used. The query time doubled without it when run on a table with 6million rows and about 20 columns.

-------

The indexes I tested on:

CREATE INDEX idx_fkey_1 ON table_a(fkey_1);
CREATE INDEX idx_partial_fkey ON table_a(fkey_1) WHERE col_partial IS NOT FALSE;
CREATE INDEX idx_partial_fkey_id ON table_a(fkey_1, id) WHERE col_partial IS NOT FALSE;
CREATE INDEX idx_partial_id_fkey ON table_a(id, fkey_1) WHERE col_partial IS NOT FALSE;
CREATE INDEX idx_partial_fkey_partial ON table_a(fkey_1, col_partial) WHERE col_partial IS NOT FALSE;
CREATE INDEX idx_partial_fkey_id_partial ON table_a(fkey_1, id, col_partial) WHERE col_partial IS NOT FALSE;

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michelle Schwan 2016-06-22 18:36:15 Postgres 9.5.2 upgrade to 9.6
Previous Message John R Pierce 2016-06-22 17:51:50 Re: Protect a table against concurrent data changes while allowing to vacuum it