Re: Problem with slow query with WHERE conditions with OR clause on primary keys

From: David Johnston <polobo(at)yahoo(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Problem with slow query with WHERE conditions with OR clause on primary keys
Date: 2013-12-11 07:18:38
Message-ID: 1386746318622-5782822.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Krzysztof Olszewski wrote
> Hi,
>
> my sql is very simple,
> returns one row,
> where conditions are assigned to primary keys
>
>
> */select g.gd_index, gd.full_name/**/
> /**/from gd g join gd_data gd on (g.id_gd = gd.id_gd)/**/
> /**/where gd.id_gd_data = 1111 OR g.id_gd = 1111;/*
>
>
> but generates "crazy" plan with Merge Join on big amount of rows (both
> tables contains 500000 rows)
> because Index scans ignore conditions, conditions are processed after
> index sacans on Merge Join
>
> */Merge Join (cost=0.00..46399.80 rows=2 width=115) (actual
> time=3.881..644.409 rows=1 loops=1)/**/
> /**/ Merge Cond: (g.id_gd = gd.id_gd)/**/
> /**/ Join Filter: ((gd.id_gd_data = 1111) OR (g.id_gd = 1111))/**/
> /**/ -> Index Scan using pk_gd on gd g (cost=0.00..14117.79
> rows=500001 width=40) (actual time=0.019..146.521 rows=500001 loops=1)/**/
> /**/ -> Index Scan using fki_gd on gd_data gd (cost=0.00..22282.04
> rows=500001 width=99) (actual time=0.016..157.384 rows=500001 loops=1)/**/
> /**/Total runtime: 644.460 ms/*
>
>
> model is very simple
>
>
> /CREATE TABLE gd (//
> // id_gd bigint NOT NULL,//
> // gd_index character varying(60) NOT NULL,//
> // notes text,//
> // notes_exists integer NOT NULL DEFAULT 0,//
> // CONSTRAINT pk_gd PRIMARY KEY (id_gd )//
> //)//
> //
> //
> //CREATE TABLE gd_data (//
> // id_gd_data bigint NOT NULL,//
> // id_gd bigint NOT NULL,//
> // short_name character varying(120) NOT NULL,//
> // full_name character varying(512) NOT NULL,//
> // notes text,//
> // notes_exists integer NOT NULL DEFAULT 0,//
> // CONSTRAINT pk_gd_data PRIMARY KEY (id_gd_data ),//
> // CONSTRAINT fk_gd FOREIGN KEY (id_gd)//
> // REFERENCES gd (id_gd) MATCH SIMPLE//
> // ON UPDATE NO ACTION ON DELETE NO ACTION//
> //)//
> //
> //CREATE INDEX fki_gd//
> // ON gd_data//
> // USING btree//
> // (id_gd );//
> /
>
>
> my configuration from (select * from pg_settings):
>
> "server_version";"9.1.10"
>
> Thank you for your help.
>
>
> Kris Olszewski

It cannot do any better since it cannot pre-filter either table using the
where condition without risking removing rows that would meet the other
table's condition post-join.

The query you are executing makes no sense to me: I don't understand why you
would ever filter on gd.id_gd_data given the model you are showing.

I believe your understanding of your model - or the model itself - is flawed
but as you have only provided code it is impossible to pinpoint where
exactly the disconnect resides. You can either fix the model or the query -
the later by implementing sub-selects with where clauses manually - which
then encodes an assumption about your data that the current query cannot
make.

Your model implies that a single gd record can have multiple gd_data records
associated with it.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-slow-query-with-WHERE-conditions-with-OR-clause-on-primary-keys-tp5782803p5782822.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mack Talcott 2013-12-11 08:23:29 Re: Debugging shared memory issues on CentOS
Previous Message Tom Lane 2013-12-11 04:54:36 Re: Debugging shared memory issues on CentOS