Re: Merge Join chooses very slow index scan

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jake Magner <jakemagner90(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Merge Join chooses very slow index scan
Date: 2015-03-19 06:42:11
Message-ID: CAFj8pRBqBTfP+bhatA103WzYv5Mcik7CLQhCtT4HGCCrkr_WTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi

what is your random_page_cost and seq_page_cost?

Regards

Pavel Stehule

2015-03-19 7:23 GMT+01:00 Jake Magner <jakemagner90(at)gmail(dot)com>:

> I am having problems with a join where the planner picks a merge join and
> an
> index scan on one of the tables. Manually disabling merge joins and running
> the query both ways shows the merge join takes over 10 seconds while a hash
> join takes less than 100ms. The planner total cost estimate favors the
> merge
> join, but the cost estimate for the index scan part is greater than the
> total cost estimate by a factor of 300x. My understanding of how this can
> occur is that it expects it won't actually have to scan all the rows,
> because using the histogram distribution stats it can know that all the
> relevant rows of the join column will be at the beginning of the scan. But
> in practice it appears to actually be index scanning all the rows, showing
> massive amounts of page hits. What is also confusing is that the planner
> estimate of the number of rows that match the second join condition is
> accurate and very low, so I would expect it to index scan on that column's
> index instead. Pasted at the bottom is the explain plan for the query and
> some other variations I think might be relevant. The table/index names are
> obfuscated. I ran ANALYZE on all the tables in the query first. All the
> pages are cached in the explain plans but we wouldn't expect that to be
> true
> in the production system. There are btree indexes on all the columns in
> both
> the join conditions and the filters.
>
> Searching, I found this thread
> http://postgresql.nabble.com/merge-join-killing-performance-td2076433.html
> which sounds kind of similar, but there are no Nulls in this table.
>
> Thanks for your help.
>
>
>
> Postgres version info: PostgreSQL 9.1.13 on x86_64-unknown-linux-gnu,
> compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
>
>
> -----------------------
> Original Query
>
> The estimated cost for Index Scan is 898k but the total cost estimate is
> 2.6k. The planner has a good estimate of the number of rows, 1335, for the
> index scan, but by the number of page hits (8M) it appears it actually
> scanned the entire table which has about 8M rows.
> -----------------------
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicles v LEFT JOIN usagestats ON
> v.id = tid AND type = 'vehicle';
>
> QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Merge Right Join (cost=593.28..2634.10 rows=4155 width=619) (actual
> time=9.150..11464.949 rows=4155 loops=1)
> Merge Cond: (usagestats.tid = s.id)
> Buffers: shared hit=8063988
> -> Index Scan using usagestats_tid_idx on usagestats
> (cost=0.00..898911.91 rows=1335 width=37) (actual time=0.027..11448.789
> rows=2979 loops=1)
> Filter: ((type)::text = 'vehicle'::text)
> Buffers: shared hit=8063686
> -> Sort (cost=593.28..603.67 rows=4155 width=582) (actual
> time=9.108..10.429 rows=4155 loops=1)
> Sort Key: s.id
> Sort Method: quicksort Memory: 1657kB
> Buffers: shared hit=302
> -> Seq Scan on vehicles v (cost=0.00..343.55 rows=4155
> width=582)
> (actual time=0.014..2.917 rows=4155 loops=1)
> Buffers: shared hit=302
> Total runtime: 11466.122 ms
> (13 rows)
>
> ------------------------
> Change the type='vehicle' condition to an always true condition
>
> If we change the filter from "type = 'vehicle'" (True for a small fraction
> of the rows) to "freq > -1" (True for all rows) then the plan is the same,
> but the actual time and page hits are much less and the query returns is
> fast.
> ------------------------
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicle v LEFT JOIN usagestats ON
> (v.id = tid AND freq > -1);
>
>
> QUERY PLAN
>
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Merge Right Join (cost=593.28..2434.79 rows=7733 width=619) (actual
> time=5.635..59.852 rows=17096 loops=1)
>
> Merge Cond: (usagestats.tid = v.id)
>
> Buffers: shared hit=17653
>
> -> Index Scan using usagestats_tid_idx on usagestats
> (cost=0.00..898914.00 rows=8006976 width=37) (actual time=0.010..34.075
> rows=17225 loops=1)
>
> Filter: (freq > (-1))
>
> Buffers: shared hit=17351
>
> -> Sort (cost=593.28..603.67 rows=4155 width=582) (actual
> time=5.617..9.351 rows=17094 loops=1)
>
> Sort Key: v.id
>
> Sort Method: quicksort Memory: 1657kB
>
> Buffers: shared hit=302
>
> -> Seq Scan on vehicle v (cost=0.00..343.55 rows=4155 width=582)
> (actual time=0.009..1.803 rows=4157 loops=1)
>
> Buffers: shared hit=302
>
> Total runtime: 62.868 ms
>
> (13 rows)
>
>
> ----------------------
> Original Query with merge joins disabled
>
> If we manually disable merge joins and run the original query we get a hash
> join with what seems like
> a more reasonable index scan on the more selective type column. The total
> cost estimate is higher than the merge join plan, but lower than the cost
> estimate for the index scan in the merge join query.
> ---------------------
> BEGIN;
> SET LOCAL enable_mergejoin = off;
>
> EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM vehicle v LEFT JOIN usagestats ON
> v.id = tid AND type = 'vehicle';
>
> QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Hash Right Join (cost=395.49..5158.10 rows=4155 width=619) (actual
> time=8.038..20.886 rows=4155 loops=1)
> Hash Cond: (usagestats.tid = v.id)
> Buffers: shared hit=3250
> -> Index Scan using usagestats_type_idx on usagestats
> (cost=0.00..4752.59 rows=1335 width=37) (actual time=0.100..6.770 rows=2979
> loops=1)
> Index Cond: ((type)::text = 'vehicle'::text)
> Buffers: shared hit=2948
> -> Hash (cost=343.55..343.55 rows=4155 width=582) (actual
> time=7.908..7.908 rows=4155 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 1088kB
> Buffers: shared hit=302
> -> Seq Scan on vehicle v (cost=0.00..343.55 rows=4155 width=582)
> (actual time=0.021..3.068 rows=4155 loops=1)
> Buffers: shared hit=302
> Total runtime: 21.936 ms
> (12 rows)
>
> -----------------------
> Miscellaneous stats
> -----------------------
>
> SELECT COUNT(*) FROM vehicle;
> count
> -------
> 4155
> (1 row)
>
> SELECT COUNT(*) FROM usagestats;
> count
> ---------
> 8007015
> (1 row)
>
> The usagestats table has 501 histogram buckets for the tid column. The max
> id in the vehicle table is 4155 and the first two buckets of the histogram
> cover all the values between 1 and 4500.
>
>
>
>
>
> --
> View this message in context:
> http://postgresql.nabble.com/Merge-Join-chooses-very-slow-index-scan-tp5842523.html
> Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jake Magner 2015-03-19 07:04:10 Re: Merge Join chooses very slow index scan
Previous Message Jake Magner 2015-03-19 06:23:34 Merge Join chooses very slow index scan