Re: slow joins?

From: Julien Cigar <jcigar(at)ulb(dot)ac(dot)be>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: slow joins?
Date: 2013-04-06 10:53:37
Message-ID: 515FFEB1.4040606@ulb.ac.be
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

try to increase cpu_tuple_cost to 0.1

On 04/06/2013 03:50, Joe Van Dyk wrote:
> If I disable sequential scans, hash joins, and merge joins, the query
> plans become the same and performance on the first slow one is much
> improved.
>
> Is there something else I can do to avoid this problem?
>
> below also at
> https://gist.github.com/joevandyk/34e31b3ad5cccb730a50/raw/8081a4298ba50ac93a86df97c1d0aae482ee7d2d/gistfile1.txt
>
> Aggregate (cost=869360.53..869360.54 rows=1 width=0) (actual
> time=103.102..103.102 rows=1 loops=1)
> -> Nested Loop (cost=0.00..869164.63 rows=78360 width=0) (actual
> time=0.253..101.708 rows=8413 loops=1)
> -> Nested Loop (cost=0.00..438422.95 rows=56499 width=4)
> (actual time=0.157..51.766 rows=8178 loops=1)
> -> Index Scan using index_products_on_drop_shipper_id
> on products (cost=0.00..2312.56 rows=618 width=4) (actual
> time=0.087..6.318 rows=618 loops=1)
> Index Cond: (drop_shipper_id = 221)
> -> Index Scan using index_line_items_on_product_id on
> line_items li (cost=0.00..702.89 rows=279 width=8) (actual
> time=0.010..0.069 rows=13 loops=618)
> Index Cond: (product_id = products.id
> <http://products.id>)
> -> Index Only Scan using purchased_items_line_item_id_idx on
> purchased_items pi (cost=0.00..7.60 rows=2 width=4) (actual
> time=0.005..0.005 rows=1 loops=8178)
> Index Cond: (line_item_id = li.id <http://li.id>)
> Heap Fetches: 144
> Total runtime: 103.442 ms
> (11 rows)
>
>
>
> On Fri, Apr 5, 2013 at 6:38 PM, Joe Van Dyk <joe(at)tanga(dot)com
> <mailto:joe(at)tanga(dot)com>> wrote:
>
> On 9.2.4, running two identical queries except for the value of a
> column in the WHERE clause. Postgres is picking very different
> query plans, the first is much slower than the second.
>
> Any ideas on how I can speed this up? I have btree indexes for
> all the columns used in the query.
>
> explain analyze
> SELECT COUNT(*)
> FROM purchased_items pi
> inner join line_items li on li.id <http://li.id> = pi.line_item_id
> inner join products on products.id <http://products.id> =
> li.product_id
> WHERE products.drop_shipper_id = 221;
>
> Aggregate (cost=193356.31..193356.32 rows=1 width=0) (actual
> time=2425.225..2425.225 rows=1 loops=1)
> -> Hash Join (cost=78864.43..193160.41 rows=78360 width=0)
> (actual time=726.612..2424.206 rows=8413 loops=1)
> Hash Cond: (pi.line_item_id = li.id <http://li.id>)
> -> Seq Scan on purchased_items pi (cost=0.00..60912.39
> rows=3724639 width=4) (actual time=0.008..616.812 rows=3724639
> loops=1)
> -> Hash (cost=77937.19..77937.19 rows=56499 width=4)
> (actual time=726.231..726.231 rows=8178 loops=1)
> Buckets: 4096 Batches: 4 Memory Usage: 73kB
> -> Hash Join (cost=1684.33..77937.19 rows=56499
> width=4) (actual time=1.270..723.222 rows=8178 loops=1)
> Hash Cond: (li.product_id = products.id
> <http://products.id>)
> -> Seq Scan on line_items li
> (cost=0.00..65617.18 rows=2685518 width=8) (actual
> time=0.081..392.926 rows=2685499 loops=1)
> -> Hash (cost=1676.60..1676.60 rows=618
> width=4) (actual time=0.835..0.835 rows=618 loops=1)
> Buckets: 1024 Batches: 1 Memory
> Usage: 22kB
> -> Bitmap Heap Scan on products
> (cost=13.07..1676.60 rows=618 width=4) (actual time=0.185..0.752
> rows=618 loops=1)
> Recheck Cond: (drop_shipper_id = 221)
> -> Bitmap Index Scan on
> index_products_on_drop_shipper_id (cost=0.00..12.92 rows=618
> width=0) (actual time=0.125..0.125 rows=618 loops=1)
> Index Cond:
> (drop_shipper_id = 221)
> Total runtime: 2425.302 ms
>
>
> explain analyze
> SELECT COUNT(*)
> FROM purchased_items pi
> inner join line_items li on li.id <http://li.id> = pi.line_item_id
> inner join products on products.id <http://products.id> =
> li.product_id
> WHERE products.drop_shipper_id = 2;
>
>
> Aggregate (cost=29260.40..29260.41 rows=1 width=0) (actual
> time=0.906..0.906 rows=1 loops=1)
> -> Nested Loop (cost=0.00..29254.38 rows=2409 width=0)
> (actual time=0.029..0.877 rows=172 loops=1)
> -> Nested Loop (cost=0.00..16011.70 rows=1737 width=4)
> (actual time=0.021..0.383 rows=167 loops=1)
> -> Index Scan using
> index_products_on_drop_shipper_id on products (cost=0.00..80.41
> rows=19 width=4) (actual time=0.010..0.074 rows=70 loops=1)
> Index Cond: (drop_shipper_id = 2)
> -> Index Scan using index_line_items_on_product_id
> on line_items li (cost=0.00..835.70 rows=279 width=8) (actual
> time=0.002..0.004 rows=2 loops=70)
> Index Cond: (product_id = products.id
> <http://products.id>)
> -> Index Only Scan using
> purchased_items_line_item_id_idx on purchased_items pi
> (cost=0.00..7.60 rows=2 width=4) (actual time=0.002..0.003 rows=1
> loops=167)
> Index Cond: (line_item_id = li.id <http://li.id>)
> Heap Fetches: 5
> Total runtime: 0.955 ms
> (11 rows)
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2013-04-06 14:22:29 Re: slow joins?
Previous Message Franck Routier 2013-04-06 09:05:11 Re: What happens between end of explain analyze and end of query execution ?