Re: A costing analysis tool

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <josh(at)agliodbs(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: A costing analysis tool
Date: 2005-10-19 17:08:15
Message-ID: 4356372F0200002500000125@gwmta.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

If we stored the actual queries and the EXPLAIN ANALYZE results (when
generated) in the database, what would be the purpose of the node_name,
db_object, and condition_detail columns? They don't seem like they
would be useful for statistical analysis, and it seems like the
information would be more useful in context. Are these column really
needed?

For a given node_type, are there mutiple valid condition_type values?
If so, I need to modify my python script to capture this. If not, I
don't see a need to store it.

-Kevin

>>> Josh Berkus <josh(at)agliodbs(dot)com> >>>

[snip]

so, for example, the query step:
" -> Seq Scan on detail0009 (cost=0.00..20500.11
rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)"
" Filter: ((txcontenttype ~~* '%html%'::text)
AND ((vchost)::text ~~* '%www.%'::text))"

Could be displayed as:

query_instance 12008
step_id 14701
parent_step 14698
node_name Seq Scan on detail0009
node_type Seq Scan
cost_start 0
cost_end 20500.11
est_rows 26
time_start 453.0
time_end 5983.0
actual_rows 53588
loops 1
db_object detail009
condition_type Filter
condition_detail ((txcontenttype ~~* '%html%'::text) AND
((vchost)::text ~~*
'%www.%'::text))

By collecting all of this data, you make it possible to perform other
sorts of
analysis on the cost estimates. For example, statistical analysis
might
tell us that 3-or-more-condition filters take significantly longer to
execute
than single-condition filters, which would be important to know for the
cost
model. Limiting it to collecting only 3 of the 13 bits of node data
produced
by EA would very much limit the usefulness of the tool and the
reliability of
its statistics.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2005-10-19 17:30:08 Re: A costing analysis tool
Previous Message Dann Corbit 2005-10-19 16:35:53 Re: Optimization system