Re: Planner tuning

From: "Dmitry Koterov" <dmitry(at)koterov(dot)ru>
To: "Alban Hertroys" <alban(at)magproductions(dot)nl>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Planner tuning
Date: 2007-03-20 14:47:55
Message-ID: d7df81620703200747y3dd287dbh6186ca50df168f78@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I don't know exactly is it your case, but sometimes

SET enable_sort = off;

speeds up some queries by the factor of hundred. But in some cases this
command slows down operations, so I TEMPORARILY switch enable_sort on and
off for some queries. It affects the query plan greatly.

On 3/20/07, Alban Hertroys <alban(at)magproductions(dot)nl> wrote:
>
> Tom Lane wrote:
> > Alban Hertroys <alban(at)magproductions(dot)nl> writes:
> >> It seems pretty obvious that the planner underestimates the cost of
> >> nestloops here, is there some way to tweak this?
> >
> > The real problem is the factor-of-a-thousand underestimate of the size
> > of this join:
>
> Good observation, I missed that one. Thanks.
>
> >> -> Nested Loop (cost=0.00..281.74 rows=2 width=14) (actual time=
> 0.068..14.000 rows=1683 loops=1)
> >> -> Index Scan using fewo_location_ancestry_full_idx on
> fewo_location_ancestry ancestor (cost=0.00..49.34 rows=9 width=4) (actual
> time=0.024..0.172 rows=41 loops=1)
> >> Index Cond: ((ancestor_id = 309) AND (ancestor_type_id =
> 12) AND (child_type_id = 10))
> >> -> Index Scan using
> fewo_property_location_country_location_idx on fewo_property_location
> property_location (cost=0.00..25.80 rows=2 width=18) (actual time=
> 0.009..0.169 rows=41 loops=41)
> >> Index Cond: ((property_location.country_id = 300) AND
> ("outer".child_id = property_location.location_id))
> >> Filter: (property_state_id = 3)
> >
> > Have you got up-to-date ANALYZE stats for both of these tables?
> > Maybe increasing the statistics targets for them would help.
>
> Yes. This is as of this moment a mostly static development database that
> has been vacuumed and analyzed quite recently.
>
> > You may be kind of stuck because of the lack of cross-column statistics
> > --- I suppose these columns are probably rather highly correlated ---
> > but you should at least try pulling the levers you've got.
> >
> > One thought is that country_id is probably entirely determined by
> > location_id, and possibly ancestor_type_id is determined by ancestor_id.
>
> Actually property.location_id refers to cities, which is the deepest
> level in the represented data. Country_id is the top level.
>
> Ancestry id, type and child id, type are indeed closely related. I
> changed their representation based on your suggestions.
>
> > If so you should be leaving them out of the queries and indexes;
> > they're not doing anything for you except fooling the planner about the
> > net selectivity of the conditions.
>
> I tried a few things, but it seems I am quite successful at fooling the
> planner...
>
> I changed the indices on our ancestry table to not combine id and type
> on the same half of the join; which is something we're in fact never
> interested in anyway. This seems to have helped some indeed.
>
> I tried removing country_id from the equation, but I haven't had the
> patience to wait for the explain analyzes to complete that way - they
> take long.
> I implemented it this way as an optimization; I decided to join
> property_location with both period_type_property and
> property_availability_month using (country_id, property_id) as FK.
> That quickly narrows down the number of matching records in those
> tables, which an index on property_id only somehow didn't accomplish.
>
> The good news is that I get results under 1s without having to
> explicitly sort my subquery results.
> The bad news is that the estimated row counts are still quite a bit off.
> I analyzed the DB just before generating the attached result.
>
> --
> Alban Hertroys
> alban(at)magproductions(dot)nl
>
> magproductions b.v.
>
> T: ++31(0)534346874
> F: ++31(0)534346876
> M:
> I: www.magproductions.nl
> A: Postbus 416
> 7500 AK Enschede
>
> // Integrate Your World //
>
> QUERY
> PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=6780.04..6780.42 rows=1 width=182) (actual time=
> 629.652..629.653 rows=1 loops=1)
> -> Nested Loop (cost=1053.78..6779.62 rows=1 width=182) (actual time=
> 160.662..595.059 rows=969 loops=1)
> -> Hash Join (cost=1053.78..6688.46 rows=21 width=186) (actual
> time=160.493..545.222 rows=3522 loops=1)
> Hash Cond: ("outer".property_id = "inner".property_id)
> -> GroupAggregate (cost=0.00..5581.97 rows=3500 width=12)
> (actual time=0.159..363.108 rows=3522 loops=1)
> -> Index Scan using
> fewo_property_availability_month_country_property_idx on
> fewo_property_availability_month property_availability_month (cost=
> 0.00..3893.52 rows=34992 width=12) (actual time=0.023..147.269 rows=37316
> loops=1)
> Index Cond: (300 = country_id)
> -> Hash (cost=1053.16..1053.16 rows=250 width=86) (actual
> time=160.277..160.277 rows=3522 loops=1)
> -> Hash Join (cost=825.03..1053.16 rows=250
> width=86) (actual time=115.767..147.429 rows=3522 loops=1)
> Hash Cond: ("outer".property_id =
> "inner".property_id)
> -> HashAggregate (cost=338.96..488.83rows=2141 width=12) (actual time=
> 64.207..77.280 rows=3522 loops=1)
> -> Bitmap Heap Scan on
> fewo_period_type_property period_type_property (cost=23.03..253.01rows=3438 width=12) (actual time=
> 0.625..27.199 rows=3522 loops=1)
> Recheck Cond: (300 = country_id)
> -> Bitmap Index Scan on
> fewo_period_type_property_country_property_idx (cost=0.00..23.03rows=3438 width=0) (actual time=
> 0.605..0.605 rows=3522 loops=1)
> Index Cond: (300 =
> country_id)
> -> Hash (cost=473.87..473.87 rows=4881
> width=18) (actual time=51.496..51.496 rows=4873 loops=1)
> -> Bitmap Heap Scan on
> fewo_property_location property_location (cost=50.19..473.87 rows=4881
> width=18) (actual time=0.974..24.530 rows=4873 loops=1)
> Recheck Cond: (country_id = 300)
> Filter: (property_state_id = 3)
> -> Bitmap Index Scan on
> fewo_property_location_country_property_idx (cost=0.00..50.19 rows=4912
> width=0) (actual time=0.939..0.939 rows=4873 loops=1)
> Index Cond: (country_id =
> 300)
> -> Index Scan using fewo_location_ancestry_ancestor_child_idx on
> fewo_location_ancestry ancestor (cost=0.00..4.33 rows=1 width=4) (actual
> time=0.007..0.008 rows=0 loops=3522)
> Index Cond: ((ancestor.ancestor_id = 309) AND (
> ancestor.child_id = "outer".location_id))
> Total runtime: 631.858 ms
> (24 rows)
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vivek Khera 2007-03-20 14:49:39 Re: cache - timing
Previous Message Merlin Moncure 2007-03-20 14:21:41 Re: Reference Type in PostgreSQL