Re: Index usage with slow query

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Index usage with slow query
Date: 2014-07-25 19:14:06
Message-ID: B6F6FD62F2624C4C9916AC0175D56D8828AC2D79@jenmbs01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>The presence of absence of the length limiter on a varchar will not impact
>the query plan. And I'm pretty sure you cannot even store a too long
>varchar in an index. It will error on the attempt (as opposed to
>truncating).

The max size is almost one block. After that you get an error:
FEHLER: Indexzeile benötigt 9184 Bytes, Maximalgröße ist 8191

>Looking for some advice regarding a slow query I have and indexing.
>
>I'm using postgresql 9.1 and this is my table that has around 6800000 rows:

It's worth a try to compare the planner choice in Postgres 9.3 ...

>
>CREATE TABLE mytable
>(
> class character varying,
> floor character varying,
> source_id integer,
> the_geom geometry

You could slightly improve the table definition while placing the integer column in front of the varchar.

More interesting would be to move the varchar in separate reference column and only have integers except for the geometry type.
This will make the new index smaller and faster. I bet that the planner would take that change in account.

>
>INDEX idx_source_id
> USING btree
> (source_id);
>
>INDEX idx_the_geom_gist
> USING gist
> (the_geom);
>
>
>This table is constantly hit with the below query (not always the same values in the where). The only difference between queries are the values in the where clause:
>
>SELECT the_geom,oid from mytable
>WHERE
>the_geom && ST_GeomFromText('POLYGON((529342.334095833 180696.221733333,529342.334095833 181533.44595,530964.336820833 181533.44595,530964.336820833 180696.221733333,529342.334095833 180696.221733333))',find_srid('','mytable','the_geom'))
>AND
>(floor = 'gf' AND source_id = '689' AND class = 'General')
>
>
>As the table has increased in size, this query has become slower, so I made this index:
>
>
>INDEX idx_floor_sourceid_class
> USING btree
> (floor, source_id, class);

if your query always uses these 3 columns, you should put the one with the highest cardinality first.

>
>When I run an EXPLAIN and EXPLAIN ANALYZE the query isn't using this new index.
>
>Sometimes it uses just idx_the_geom_gist
>
>other times it uses idx_the_geom_gist and idx_source_id
>
>I don't understand why it's inconsistent in its use of indexes when the query is always the same structure, just different where clause values, and I don't understand why it's not using the new index either.

The planner uses statistics on the different columns content to weight the possible query plans.
And it is good at that :)
You can help him while raising the target statistics on these 3 columns.

More difficult for the planner is to compare the advantage of the GIN index to the other one.
If some columns get toasted then the cost of detoasting seems to often be underestimated.

For the case when one of your 3 first columns has a very low cardinality, you may consider adding some partial indexes.
e.g.:
create INDEX idx_the_geom_gist_general USING gist (the_geom) where class ='general';
create INDEX idx_the_geom_gist_special USING gist (the_geom) where class ='special';

They can of course only get used when your query contains exactly the same clause.

regards,

Marc Mamin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Marc Mamin 2014-07-25 19:53:13 Re: copy/dump database to text/csv files
Previous Message Alvaro Herrera 2014-07-25 18:46:34 Re: event triggers in 9.3.4