From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Yet another abort-early plan disaster on 9.3 |
Date: | 2014-09-19 17:15:34 |
Message-ID: | CAHyXU0zSCWLYZ34fsgAGn9x+nFEwOih4ve3kyeYonTY6xG1Xbg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Folks,
>
> Just encountered another case of critical fail for abort-early query
> plans. In this case, it will completely prevent a user from upgrading
> to 9.3; this is their most common query, and on 9.3 it takes 1000X longer.
>
> Maybe we should think about removing abort-early plans from 9.5?
> Clearly we don't understand them well enough for them to work for users.
>
> Query:
>
> SELECT "categories".* FROM "categories" WHERE "categories"."user_id" IN
> ( SELECT to_user_id FROM "tags" WHERE "tags"."from_user_id" = 53529975 )
> ORDER BY recorded_on DESC LIMIT 20;
>
> Here's the plan from 9.1:
>
> Limit (cost=1613.10..1613.15 rows=20 width=194) (actual
> time=0.503..0.509 rows=20 loops=1)
> -> Sort (cost=1613.10..1736.14 rows=49215 width=194) (actual
> time=0.502..0.505 rows=20 loops=1)
> Sort Key: categories.recorded_on
> Sort Method: top-N heapsort Memory: 30kB
> -> Nested Loop (cost=248.80..303.51 rows=49215 width=194)
> (actual time=0.069..0.347 rows=81 loops=1)
> -> HashAggregate (cost=248.80..248.81 rows=1 width=4)
> (actual time=0.050..0.054 rows=8 loops=1)
> -> Index Scan using unique_index_tags on tags
> (cost=0.00..248.54 rows=103 width=4) (actual time=0.020..0.033 rows=8
> loops=1)
> Index Cond: (from_user_id = 53529975)
> -> Index Scan using index_categories_on_user_id on
> categories (cost=0.00..54.34 rows=29 width=194) (actual
> time=0.010..0.028 rows=10 loops=8)
> Index Cond: (user_id = tags.to_user_id)
> Total runtime: 0.641 ms
>
> And from 9.3:
>
> Limit (cost=1.00..2641.10 rows=20 width=202) (actual
> time=9.933..711.372 rows=20 loops=1)
> -> Nested Loop Semi Join (cost=1.00..9641758.39 rows=73041
> width=202) (actual time=9.931..711.361 rows=20 loops=1)
> -> Index Scan Backward using index_categories_on_recorded_on
> on categories (cost=0.43..406943.98 rows=4199200 width=202) (actual
> time=0.018..275.020 rows=170995 loops=1)
> -> Index Scan using unique_index_tags on tags
> (cost=0.57..2.20 rows=1 width=4) (actual time=0.002..0.002 rows=0
> loops=170995)
> Index Cond: ((from_user_id = 53529975) AND (to_user_id =
> categories.user_id))
> Total runtime: 711.457 ms
>
> So, here's what's happening here:
>
> As usual, PostgreSQL is dramatically undercounting n_distinct: it shows
> chapters.user_id at 146,000 and the ratio of to_user_id:from_user_id as
> being 1:105 (as opposed to 1:6, which is about the real ratio). This
> means that PostgreSQL thinks it can find the 20 rows within the first 2%
> of the index ... whereas it actually needs to scan 50% of the index to
> find them.
>
> Removing LIMIT causes 9.3 to revert to the "good" plan, as expected.
>
> This is the core issue with abort-early plans; they depend on our
> statistics being extremely accurate, which we know they are not. And if
> they're wrong, the execution time climbs by 1000X or more. Abort-early
> plans are inherently riskier than other types of query plans.
>
> What I'm not clear on is why upgrading from 9.1 to 9.3 would bring about
> this change. The stats are no more than 10% different across the
> version change.
Amusingly on-topic rant I happened to read immediately after this by chance:
Is there a canonical case of where 'abort early' plans help? (I'm new
to that term -- is it a recent planner innovation...got any handy
links?)
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2014-09-19 17:18:22 | Re: Support for N synchronous standby servers |
Previous Message | Robert Haas | 2014-09-19 17:13:16 | Re: Minor improvement in lock.sgml |
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2014-09-19 18:40:17 | Re: Yet another abort-early plan disaster on 9.3 |
Previous Message | Pavel Stehule | 2014-09-19 13:32:00 | Re: query a table with lots of coulmns |