Re: Optimizer wrongly picks Nested Loop Left Join

From: "Edin Kadribasic" <edink(at)emini(dot)dk>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "John A Meinel" <john(at)arbash-meinel(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Optimizer wrongly picks Nested Loop Left Join
Date: 2005-05-11 23:28:25
Message-ID: 003001c55681$216ba5c0$0b00000a@blaster
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> "Edin Kadribasic" <edink(at)emini(dot)dk> writes:
> > I have a query that is giving the optimizer (and me) great headache.
>
> The main problem seems to be that the rowcount estimates for
> axp_temp_order_match and axp_dayschedule are way off:
>
> > -> Index Scan using axp_temp_order_match_idx1 on
> > axp_temp_order_match a (cost=0.00..209.65 rows=426 width=4) (actual
> > time=0.277..0.512 rows=6 loops=1)
> > Index Cond: (sid = 16072)
>
> > -> Index Scan using axp_dayschedule_day_idx
on
> > axp_dayschedule ds (cost=0.00..3.02 rows=1 width=8) (actual
> > time=0.036..3.973 rows=610 loops=1)
> > Index Cond: (("day" >=
'2005-05-12'::date)
> > AND ("day" <= '2005-05-12'::date))
> > Filter: (used = B'1'::"bit")
>
> > -> Index Scan using axp_temp_order_match_idx1 on
> > axp_temp_order_match a (cost=0.00..2.45 rows=1 width=4) (actual
> > time=0.027..2.980 rows=471 loops=1)
> > Index Cond: (sid = 16092)
>
> > -> Index Scan using axp_dayschedule_day_idx on
> > axp_dayschedule ds (cost=0.00..3.02 rows=1 width=8) (actual
> > time=0.015..3.557 rows=606 loops=471)
> > Index Cond: (("day" >= '2005-05-13'::date)
AND
> > ("day" <= '2005-05-13'::date))
> > Filter: (used = B'1'::"bit")
>
> Do you ANALYZE these tables on a regular basis? If so, it may be
> necessary to increase the statistics target to the point where you
> get better estimates.

Increasing statistics didn't seem to help, but both of you gave me an idea
of what might be wrong. axp_temp_order match contains temporary matches for
a search. Just before execution of that query the new matches are inserted
into the table under a new search id (sid column). Since the ANALYZE was
that before it it grossly underestimates the number of matches for that sid.
As this table is relatively small inserting ANALYZE axp_temp_order_match
just before running the query does not introduce a great perforance penalty
(50ms) and it reduces the query execution time from up to 50s down to ~20ms.

> > Please note that sometimes when I get "bad plan" in the logfile, I just
> > re-run the query and the optimizer chooses the more efficient one.
>
> That's fairly hard to believe, unless you've got autovacuum running
> in the background.

The application had ANALYZE axp_temp_order_match placed in the "slightly"
wrong location, before the large insert was done (1000 rows with a new
sid). So when the app run the next search, previous search got correctly
analyzed and the query execution time dropped dramatically as I was trying
to EXPLAIN ANALYZE query recorded in the log file.

Thanks for your help,

Edin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2005-05-12 02:26:28 Re: Intel SRCS16 SATA raid?
Previous Message Enrico Weigelt 2005-05-11 23:26:23 Re: BLOB's bypassing the OS Filesystem for better Image loading speed?