Re: Major performance problem after upgrade from 8.3 to 8.4

From: Gerhard Wiesinger <lists(at)wiesinger(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
Subject: Re: Major performance problem after upgrade from 8.3 to 8.4
Date: 2010-09-13 06:39:44
Message-ID: alpine.LFD.2.01.1009130838530.12288@bbs.intern
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

Any news or ideas regarding this issue?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/

On Sat, 4 Sep 2010, Gerhard Wiesinger wrote:

> On Fri, 3 Sep 2010, Tom Lane wrote:
>
>> Gerhard Wiesinger <lists(at)wiesinger(dot)com> writes:
>>> 8.3 query plans: http://www.wiesinger.com/tmp/pg_perf_83_new.txt
>>> 8.4 quey plans: http://www.wiesinger.com/tmp/pg_perf_84.txt
>>
>> Hmm. The 8.3 plan is indeed assuming that the number of rows will stay
>> constant as we bubble up through the join levels, but AFAICS this is
>> simply wrong:
>>
>> -> Nested Loop Left Join (cost=0.00..38028.89 rows=67 width=8)
>> -> Nested Loop Left Join (cost=0.00..25399.46 rows=67 width=8)
>> -> Nested Loop Left Join (cost=0.00..12770.04 rows=67
>> width=8)
>> -> Index Scan using i_log_unique on log l
>> (cost=0.00..140.61 rows=67 width=8)
>> Index Cond: (datetime >= (now() - '00:01:00'::interval))
>> -> Index Scan using unique_key_and_id on log_details d7
>> (cost=0.00..187.39 rows=89 width=8)
>> Index Cond: ((l.id = d7.fk_id) AND (d7.fk_keyid = $6))
>> -> Index Scan using unique_key_and_id on log_details d6
>> (cost=0.00..187.39 rows=89 width=8)
>> Index Cond: ((l.id = d6.fk_id) AND (d6.fk_keyid = $5))
>> -> Index Scan using unique_key_and_id on log_details d5
>> (cost=0.00..187.39 rows=89 width=8)
>> Index Cond: ((l.id = d5.fk_id) AND (d5.fk_keyid = $4))
>>
>> If the log_details indexscans are expected to produce 89 rows per
>> execution, then surely the join size should go up 89x at each level,
>> because the join steps themselves don't eliminate anything.
>>
>> In 8.4 the arithmetic is at least self-consistent:
>>
>> -> Nested Loop Left Join (cost=0.00..505256.95 rows=57630
>> width=8)
>> -> Nested Loop Left Join (cost=0.00..294671.96 rows=6059
>> width=8)
>> -> Nested Loop Left Join (cost=0.00..272532.55 rows=637
>> width=8)
>> -> Index Scan using log_pkey on log l
>> (cost=0.00..270203.92 rows=67 width=8)
>> Filter: (datetime >= (now() - '00:01:00'::interval))
>> -> Index Scan using unique_key_and_id on log_details d7
>> (cost=0.00..34.63 rows=10 width=8)
>> Index Cond: ((l.id = d7.fk_id) AND (d7.fk_keyid = $6))
>> -> Index Scan using unique_key_and_id on log_details d6
>> (cost=0.00..34.63 rows=10 width=8)
>> Index Cond: ((l.id = d6.fk_id) AND (d6.fk_keyid = $5))
>> -> Index Scan using unique_key_and_id on log_details d5
>> (cost=0.00..34.63 rows=10 width=8)
>> Index Cond: ((l.id = d5.fk_id) AND (d5.fk_keyid = $4))
>>
>> The rowcount estimates are apparently a shade less than 10, but they get
>> rounded off in the display.
>>
>> I believe the reason for this change is that 8.4's join estimation code
>> was rewritten so that it wasn't completely bogus for outer joins. 8.3
>> might have been getting the right answer, but it was for the wrong
>> reasons.
>>
>> So the real question to be answered here is why doesn't it think that
>> each of the unique_key_and_id indexscans produce just a single row, as
>> you indicated was the case. The 8.4 estimate is already a factor of
>> almost 10 closer to reality than 8.3's, but you need another factor of
>> 10. You might find that increasing the statistics target for the
>> log_details table helps.
>
>
> Ok, Tom, tried different things (more details are below):
> 1.) Setting statistic target to 1000 and 10000 (without success), still merge
> join
> 2.) Tried to added a Index on description to help the planner for uniqueness
> (without success)
> 3.) Forced the planner to use nested loop joins (SUCCESS):
> SET enable_hashjoin=false;SET enable_mergejoin=false;
> (BTW: How do use such settings in Java and PHP and Perl, is there a command
> available?)
>
> Open questions:
> Why does the planner not choose nested loop joins, that should be the optimal
> one for that situation?
> Does the planner value: a.) UNIQUENESS b.) UNIQUENESS and NOT NULLs?
> Any ideas for improvement of the planner?
>
> Details:
> -- CREATE UNIQUE INDEX unique_key_and_id ON log_details USING btree (fk_id,
> fk_keyid)
> -- 1000 and 10000 didn't help for better query plan for Nested Loop Left
> Join, still Merge Left Join
> -- Sample with:
> -- ALTER TABLE log_details ALTER COLUMN fk_id SET STATISTICS 10000;
> -- ALTER TABLE log_details ALTER COLUMN fk_keyid SET STATISTICS 10000;
> -- ANALYZE VERBOSE log_details;
> -- Still Merge Join:
> -- -> Merge Left Join (cost=9102353.88..83786934.25 rows=2726186787
> width=16)
> -- Merge Cond: (l.id = d2000902.fk_id)
> -- -> Merge Left Join (cost=8926835.18..40288402.09 rows=972687282
> width=24)
> -- Merge Cond: (l.id = d2000904.fk_id)
> -- Default values again
> ALTER TABLE log_details ALTER COLUMN fk_id SET STATISTICS 100;
> ALTER TABLE log_details ALTER COLUMN fk_keyid SET STATISTICS 100;
> ANALYZE VERBOSE log_details;
>
> -- Tried to add WITHOUT SUCCESS (that planner could know that description is
> NOT NULL and UNIQE)
> DROP INDEX IF EXISTS i_key_description_desc;
> CREATE UNIQUE INDEX i_key_description_desc ON key_description (description);
> -- Therefore planner should know: keyid is NOT NULL and UNIQUE and only one
> result: (SELECT keyid FROM key_description WHERE description =
> 'Raumsolltemperatur')
> -- Therefore from constraint planner should know that fk_id is NOT NULL and
> UNIQUE: CONSTRAINT unique_key_and_id UNIQUE(fk_id, fk_keyid):
> -- LEFT JOIN log_details d1 ON l.id = d1.fk_id AND
> -- d1.fk_keyid = (SELECT keyid FROM key_description WHERE description =
> 'Raumsolltemperatur')
> -- Does the planner value alls those UNIQUEnesses and NOT NULLs?
>
> -- Again back to 8.3 query plan which is fast (319ms):
> SET enable_hashjoin=false;
> SET enable_mergejoin=false;
> -- -> Nested Loop Left Join (cost=0.00..22820970510.45 rows=2727492136
> width=16)
> -- -> Nested Loop Left Join (cost=0.00..12810087616.29
> rows=973121653 width=24)
> -- -> Nested Loop Left Join (cost=0.00..9238379092.22
> rows=347192844 width=24)
>
> Thnx.
>
> Ciao,
> Gerhard
>
> --
> http://www.wiesinger.com/
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gaetano Mendola 2010-09-13 09:47:24 Useless sort by
Previous Message azajac 2010-09-13 05:49:45 Problem with mergejoin performance