From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: A costing analysis tool
Date: 2005-10-14 18:37:37
Message-ID: 24296.1129315057@sss.pgh.pa.us
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> I propose capturing only three values from the output of explain
> analyze, and saving it with many columns of context information.

You really have to capture the rowcounts (est and actual) too.
Otherwise you can't tell if it's a costing problem or a statistics
problem.
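A minimal sketch of capturing those rowcounts, in Python; the regex, the `rowcounts` helper, and the sample plan line are illustrative assumptions, not part of any existing tool:

```python
import re

# Matches the "rows=EST ...) (actual time=A..B rows=ACT" pattern that
# EXPLAIN ANALYZE prints for each plan node.
NODE_RE = re.compile(
    r"rows=(\d+)[^)]*\)\s+\(actual time=[\d.]+\.\.[\d.]+ rows=(\d+)"
)

def rowcounts(plan_line):
    """Return (estimated_rows, actual_rows) for one plan node, or None."""
    m = NODE_RE.search(plan_line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

line = ("Seq Scan on foo  (cost=0.00..155.00 rows=10000 width=4) "
        "(actual time=0.012..1.352 rows=9500 loops=1)")
est, act = rowcounts(line)
```

A large gap between `est` and `act` points at a statistics problem; a good match with a bad cost points at the cost model itself.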
More generally, I think that depending entirely on EXPLAIN ANALYZE
numbers is a bad idea, because the overhead of EXPLAIN ANALYZE is both
significant and variable depending on the plan structure. The numbers
that I think we must capture are the top-level EXPLAIN cost and the
actual runtime of the query (*without* EXPLAIN). Those are the things
we would like to get to track closely. EXPLAIN ANALYZE is incredibly
valuable as context for such numbers, but it's not the thing we actually
wish to optimize.
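Capturing those two numbers could look like the following Python sketch; the `top_level_cost` helper and the stand-in workload are hypothetical (in practice `run` would execute the actual query through a database driver):

```python
import re
import time

# The total estimated cost is the second number in the top line's
# "cost=startup..total" pair.
TOP_COST_RE = re.compile(r"\(cost=[\d.]+\.\.([\d.]+) ")

def top_level_cost(explain_first_line):
    """Extract the total estimated cost from the first line of EXPLAIN."""
    m = TOP_COST_RE.search(explain_first_line)
    return float(m.group(1)) if m else None

def timed(run):
    """Run the query itself (without EXPLAIN) and return wall-clock seconds."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start

cost = top_level_cost("Sort  (cost=158.51..160.83 rows=929 width=244)")
# Stand-in for real query execution, e.g. lambda: cursor.execute(sql).
elapsed = timed(lambda: sum(range(1000)))
```

The (cost, elapsed) pair per query is what the tool would track over time; EXPLAIN ANALYZE output is kept alongside as diagnostic context.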

> Besides the additional context info, I expect to be storing the log
> of the ratio, since it seems to make more sense to average and
> look for outliers based on that than the raw ratio.

Why would you store anything but raw data? Easily-derivable numbers
should be computed while querying the database, not kept in it.
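For instance, the log-ratio is trivially derived at query time from raw stored values; the sample numbers below are made up for illustration:

```python
import math

# Raw captured numbers per test query: estimated top-level cost and
# measured runtime (hypothetical values).
samples = [
    {"est_cost": 100.0, "runtime_ms": 20.0},
    {"est_cost": 400.0, "runtime_ms": 20.0},
    {"est_cost": 100.0, "runtime_ms": 80.0},
]

# Derive log(cost/runtime) while querying rather than storing it:
# it is symmetric around over- and under-estimation, so averages and
# outlier checks behave sensibly.
log_ratios = [math.log(s["est_cost"] / s["runtime_ms"]) for s in samples]
mean = sum(log_ratios) / len(log_ratios)
```

Storing only `est_cost` and `runtime_ms` keeps the option open to derive other statistics later without re-running the tests.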
regards, tom lane