Re: incorrect row estimates for primary key join

From: Julien Cigar <jcigar(at)ulb(dot)ac(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: incorrect row estimates for primary key join
Date: 2013-06-25 14:43:56
Message-ID: 51C9ACAC.5080805@ulb.ac.be
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 06/25/2013 00:18, Ben wrote:
> hello postgresql experts --
>
> i have a strange row estimate problem, which unfortunately i have trouble reproducing for anything but very large tables which makes boiling down to a simple example hard. i'm using version 9.1.1, all tables have had analyze run on them.
>
> here's the example : i have a large table (billions of rows) with a five column primary key and a sixth value column. for confidentiality purposes i can't reproduce the exact schema here, but it is essentially
>
> create table bigtable (
> id1 integer not null,
> id2 date not null,
> id3 integer not null,
> id4 time not null,
> id5 integer not null,
> value real not null,
> primary key (id1, id2, id3, id4, id5)
> );
>
> for various reasons there is only one id1 in the table right now, though in the future there will be more; also the primary key was added after table creation with alter table, though that shouldn't matter.
>
> i need to select out a somewhat arbitrary collection of rows out of bigtable. to do so i generate a temporary table
>
> create table jointable (
> id1 integer not null,
> id2 date not null,
> id3 integer not null,
> id4 time not null,
> id5 integer not null
> );
>
> and then perform a join against this table.
>
> if jointable doesn't have many rows, the planner picks a nested loop over jointable and a primary key lookup on bigtable. in the following, for expository purposes, jointable has 10 rows. we can see the planner knows this.
>
> explain select * from bigtable join jointable using (id1, id2, id3, id4, id5);
>
> Nested Loop (cost=0.00..6321.03 rows=145 width=28)
> -> Seq Scan on jointable (cost=0.00..1.10 rows=10 width=24)
> -> Index Scan using bigtable_pkey on bigtable (cost=0.00..631.97 rows=1 width=28)
> Index Cond: ((id1 = jointable.id1) AND (id2 = jointable.id2) AND (id3 = jointable.id3) AND (id4 = jointable.id4) AND (vid = foo.vid))
> (4 rows)
>
> as you increase the number of rows in jointable, the planner switches to a sort + merge. in this case jointable has roughly 2 million rows.
>
> Merge Join (cost=727807979.29..765482193.16 rows=18212633 width=28)
> Merge Cond: ((bigtable.id1 = jointabe.id1) AND (bigtable.id2 = jointable.id2) AND (bigtable.id3 = jointable.id3) AND (bigtable.id4 = bigtable.id4) AND (bigtable.id5 = bigtable.id5))
> -> Sort (cost=727511795.16..735430711.00 rows=3167566336 width=28)
> Sort Key: bigtable.id3, bigtable.id1, bigtable.id2, bigtable.id4, bigtable.id5
> -> Seq Scan on bigtable (cost=0.00..76085300.36 rows=3167566336 width=28)
> -> Materialize (cost=295064.70..305399.26 rows=2066911 width=24)
> -> Sort (cost=295064.70..300231.98 rows=2066911 width=24)
> Sort Key: jointable.id3, jointable.id1, jointable.id2, jointable.id4, jointable.id5
> -> Seq Scan on jointable (cost=0.00..35867.11 rows=2066911 width=24)
> (9 rows)

can you show us the explain analyze version ?

> the choice of sort + merge is really bad here, given the size of bigtable (3 billion rows and counting.)
>
> some questions :
>
> 1 - first off, why isn't the sort happening on the primary key, so that bigtable does not have to be sorted?
>
> 2 - more importantly, since we are joining on the primary key, shouldn't the row estimate coming out of the join be limited by the number of rows in jointable?
>
> for example, it is strange to see that if i put in a non-limiting limit statement (something bigger than the number of rows in jointable) it switches back to a nested loop + index scan :
>
> explain select * from bigtable join jointable using (id1, id2, id3, id4, id5) limit 2500000;
>
> Limit (cost=0.00..178452647.11 rows=2500000 width=28)
> -> Nested Loop (cost=0.00..1306127545.35 rows=18297957 width=28)
> -> Seq Scan on jointable (cost=0.00..35867.11 rows=2066911 width=24)
> -> Index Scan using bigtable_pkey on bigtable (cost=0.00..631.88 rows=1 width=28)
> Index Cond: ((id1 = jointable.id1) AND (id2 = jointable.id2) AND (id3 = jointable.id3) AND (id4 = jointable.id4) AND (id5 = jointable.id5))
> (5 rows)
>
> am i not understanding the query planner, or is this a known issue in the query planner, or have i stumbled onto something amiss? unfortunately any synthetic examples i was able to make (up to 10 million rows) did not exhibit this behavior, which makes it hard to test.
>
> best regards, ben
>
>
>

--
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Ben 2013-06-25 18:27:38 Re: incorrect row estimates for primary key join
Previous Message Kevin Grittner 2013-06-25 13:20:29 Re: incorrect row estimates for primary key join