Quick Links

Re: Re: Join between 2 tables always executes a sequential scan on the larger table

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Igor Neyman <ineyman(at)perceptron(dot)com>
Cc:	Dieter Rehbein <dieter(dot)rehbein(at)skiline(dot)cc>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Re: Join between 2 tables always executes a sequential scan on the larger table
Date:	2013-04-02 15:45:46
Message-ID:	2515.1364917546@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Igor Neyman <ineyman(at)perceptron(dot)com> writes:
> The output of explain analyze (Postgres 9.2.3):

> Hash Left Join (cost=111357.64..126222.29 rows=41396 width=42) (actual time=1982.543..2737.331 rows=41333 loops=1)
> Hash Cond: ((uc.user_id)::text = (u.id)::text)
> -> Seq Scan on user_2_competition uc (cost=0.00..4705.21 rows=41396 width=33) (actual time=0.019..89.691 rows=41333 loops=1)
> Filter: ((competition_id)::text = '3cc1cb9b3ac132ad013ad01316040001'::text)
> Rows Removed by Filter: 80684
> -> Hash (cost=90074.73..90074.73 rows=999673 width=42) (actual time=1977.604..1977.604 rows=999673 loops=1)
> Buckets: 2048 Batches: 128 Memory Usage: 589kB
> -> Seq Scan on "user" u (cost=0.00..90074.73 rows=999673 width=42) (actual time=0.004..1178.827 rows=999673 loops=1)
> Total runtime: 2740.723 ms

> I expected to see an index-scan on user_2_competition with a hash join to user, not a sequential scan on user. I've tried this with Postgres 9.1 and 9.2.3).

According to the numbers, you're fetching about a third of the
user_2_competition table, which is well past the point where an
indexscan is of any use. It's picking the seqscan because it thinks
that's faster, and I'm sure it's right.

The aspect of this plan that actually seems a bit dubious is that it's
hashing the larger input table rather than the smaller one. There's
a thread going on about that in -hackers right now; we think it's
probably putting too much emphasis on the distribution of the join key
as opposed to the size of the table.

One thing that would help is increasing work_mem --- it looks like you
are using the default 1MB. Cranking that up to a few MB would reduce
the number of hash batches needed.

regards, tom lane

In response to

Re: Join between 2 tables always executes a sequential scan on the larger table at 2013-04-02 14:55:55 from Igor Neyman

Responses

Re: Join between 2 tables always executes a sequential scan on the larger table at 2013-04-03 08:24:54 from Dieter Rehbein

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Armand du Plessis	2013-04-02 18:16:03	Re: Problems with pg_locks explosion
Previous Message	Igor Neyman	2013-04-02 14:55:55	Re: Join between 2 tables always executes a sequential scan on the larger table