Quick Links

Re: PostgreSQL strange query plan for my query

From:	Julien Cigar <jcigar(at)ulb(dot)ac(dot)be>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: PostgreSQL strange query plan for my query
Date:	2012-11-16 13:18:19
Message-ID:	50A63D1B.6040308@ulb.ac.be
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 11/16/2012 14:04, David Popiashvili wrote:
> All right, after some discussion on StackOverflow
> <http://stackoverflow.com/questions/13407555/postgresql-query-taking-too-long/13415984#13415984>,
> we found out that incorrect query plan is generated due to the fact
> that there is a LIMIT keyword in the query. I guess Postgresql expects
> to find appropriate rows faster and that's why it generates a seq scan
> on the table. If I remove LIMIT 1000 everything is executed in several
> milliseconds and query plan looks like this:
>
> Hash Join (cost=2662004.85..14948213.44 rows=22661658 width=138)
> (actual time=0.105..0.105 rows=0 loops=1)
> Hash Cond: (p."PaymentOrderId" = po."Id")
> -> Seq Scan on "Payments" p (cost=0.00..5724570.00 rows=350000000
> width=18) (actual time=0.018..0.018 rows=1 loops=1)
> -> Hash (cost=2583365.85..2583365.85 rows=2614480 width=120)
> (actual time=0.046..0.046 rows=0 loops=1)
> Buckets: 8192 Batches: 64 Memory Usage: 0kB
> -> Hash Join (cost=904687.05..2583365.85 rows=2614480
> width=120) (actual time=0.046..0.046 rows=0 loops=1)
> Hash Cond: (po."UserId" = u."Id")
> -> Seq Scan on "PaymentOrders" po
> (cost=0.00..654767.00 rows=40000000 width=24) (actual
> time=0.003..0.003 rows=1 loops=1)
> -> Hash (cost=850909.04..850909.04 rows=1980881
> width=96) (actual time=0.016..0.016 rows=0 loops=1)
> Buckets: 8192 Batches: 32 Memory Usage: 0kB
> -> Hash Join (cost=1.20..850909.04 rows=1980881
> width=96) (actual time=0.016..0.016 rows=0 loops=1)
> Hash Cond: (u."RoleId" = r."Id")
> -> Seq Scan on "Users" u
> (cost=0.00..718598.20 rows=30000220 width=80) (actual
> time=0.002..0.002 rows=1 loops=1)
> -> Hash (cost=1.19..1.19 rows=1 width=16)
> (actual time=0.009..0.009 rows=0 loops=1)
> Buckets: 1024 Batches: 1 Memory
> Usage: 0kB
> -> Seq Scan on "Roles" r
> (cost=0.00..1.19 rows=1 width=16) (actual time=0.009..0.009 rows=0
> loops=1)
> Filter: (("Name")::text =
> 'Moses2333'::text)
> Rows Removed by Filter: 15
> Total runtime: 0.209 ms
>
> According to Erwin Brandstetter I also tried pushing the query in a
> subquery and applying LIMIT there:
>
> SELECT *
> FROM (
> SELECT *
> FROM "Roles" AS r
> JOIN "Users" AS u ON u."RoleId" = r."Id"
> JOIN "PaymentOrders" AS po ON po."UserId" = u."Id"
> JOIN "Payments" AS p ON p."PaymentOrderId" = po."Id"
> WHERE r."Name" = 'Moses'
> ) x
> LIMIT 1000;
>
> but this solution also generates incorrect query plan. Any idea how to
> solve this query without omitting LIMIT keyword?
> Thanks
>

maybe with a CTE ?

> > Subject: RE: [PERFORM] PostgreSQL strange query plan for my query
> > Date: Fri, 16 Nov 2012 13:55:41 +0100
> > From: laurenz(dot)albe(at)wien(dot)gv(dot)at
> > To: dato0011(at)hotmail(dot)com; pgsql-performance(at)postgresql(dot)org
> >
> > David Popiashvili wrote:
> > > I have database with few hundred millions of rows. I'm running the
> > following query:
> > >
> > > select * from "Payments" as p
> > > inner join "PaymentOrders" as po
> > > on po."Id" = p."PaymentOrderId"
> > > inner join "Users" as u
> > > On u."Id" = po."UserId"
> > > INNER JOIN "Roles" as r
> > > on u."RoleId" = r."Id"
> > > Where r."Name" = 'Moses'
> > > LIMIT 1000
> > > When the where clause finds a match in database, I get the result in
> > several milliseconds, but if I
> > > modify the query and specify a non-existent r."Name" in where clause,
> > it takes too much time to
> > > complete. I guess that PostgreSQL is doing a sequential scan on the
> > Payments table (which contains the
> > > most rows), comparing each row one by one.
> > > Isn't postgresql smart enough to check first if Roles table contains
> > any row with Name 'Moses'?
> > >
> > > Roles table contains only 15 row, while Payments contains ~350
> > million.
> > >
> > > I'm running PostgreSQL 9.2.1.
> >
> > > Here'e explain analyse results: http://explain.depesz.com/s/7e7
> >
> > Can you also show the plan for the good case?
> >
> > Yours,
> > Laurenz Albe

--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

Attachment	Content-Type	Size
jcigar.vcf	text/x-vcard	292 bytes

In response to

Re: PostgreSQL strange query plan for my query at 2012-11-16 13:04:05 from David Popiashvili

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Russell Keane	2012-11-16 14:28:25	intercepting where clause on a view or other performance tweak
Previous Message	David Popiashvili	2012-11-16 13:04:05	Re: PostgreSQL strange query plan for my query