Re: Inheritance planner CPU and memory usage change since 9.3.2

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inheritance planner CPU and memory usage change since 9.3.2
Date: 2015-06-20 05:04:36
Message-ID: CAEepm=1zNmSz3v9nquJmfSU5Ux6ZDTXBE=jJs0-T45HoQA2ehg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 19, 2015 at 9:20 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The extraordinarily planning time for query 4 is caused by a
> completely different problem: SearchCatCache eats up huge amounts of
> CPU; its callers are get_attavgwidth and get_typlen. It's not clear
> to me why doubling the number of relations causes such an enormous
> explosion in calls to those functions; I would expect the number of
> calls to double, but presumably the actual increase is much more.
> That's a separate problem, though, unconnected to
> c03ad5602f529787968fa3201b35c119bbc6d782 and not necessarily a
> regression.

I don't have a great high level understanding of the planner, and Q4
may be somehow asking for trouble or unrepresentative of anything
useful, but I did some profiling and instrumenting, and I noticed that
we spend tables^2 * columns time in get_attavgwidth. I wonder if
estimate_rel_size (or some other function in that stack, or some new
function wrapper) should remember the result for each relation for the
scope of this planner invocation. That should bring the calls to
get_attavgwidth down to the same order as Q3 (tables * columns).

Here is some profiler output from a 500 table, 500 column Q4 run:

160295.0ms 60.2% 95 inheritance_planner
120064.0ms 45.1% 0 grouping_planner
119826.0ms 45.0% 2 query_planner
114204.0ms 42.9% 0 add_base_rels_to_query
114204.0ms 42.9% 0 add_base_rels_to_query
114204.0ms 42.9% 151 build_simple_rel
113817.0ms 42.8% 57 build_simple_rel
113600.0ms 42.7% 19 get_relation_info
112123.0ms 42.1% 27 estimate_rel_size
111557.0ms 41.9% 14139 get_rel_data_width
80152.0ms 30.1% 362 get_attavgwidth
79788.0ms 30.0% 282 SearchSysCache
79368.0ms 29.8% 52373 SearchCatCache
13182.0ms 4.9% 2125
CatalogCacheComputeHashValue

Here are some tables showing function call counts. The columns are
ordered like this:

1: Query number
2: Number of child tables
3: Number of columns
4: Number of calls to add_base_rels_to_query
5: Number of calls to build_simple_rel
6: Number of calls to get_relation_info
7: Number of calls to estimate_rel_size
8: Number of calls to get_attavgwidth

Q3 10 10 22 11 11 11 131
Q3 10 20 22 11 11 11 241
Q3 10 30 22 11 11 11 351
Q3 20 10 42 21 21 21 251
Q3 20 20 42 21 21 21 461
Q3 20 30 42 21 21 21 671
Q3 30 10 62 31 31 31 371
Q3 30 20 62 31 31 31 681
Q3 30 30 62 31 31 31 991
Q3 500 500 1002 501 501 501 251501

Q4 10 10 33 143 143 132 1451
Q4 10 20 33 143 143 132 2661
Q4 10 30 33 143 143 132 3871
Q4 20 10 63 483 483 462 5291
Q4 20 20 63 483 483 462 9701
Q4 20 30 63 483 483 462 14111
Q4 30 10 93 1023 1023 992 11531
Q4 30 20 93 1023 1023 992 21141
Q4 30 30 93 1023 1023 992 30751
Q4 500 500 1503 252003 252003 251502 126002501

--
Thomas Munro
http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Feng Tian 2015-06-20 06:54:46 Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Previous Message Alvaro Herrera 2015-06-20 04:25:59 Re: anole: assorted stability problems