Re: A costing analysis tool

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: A costing analysis tool
Date: 2005-10-19 17:30:08
Message-ID: 200510191030.09135.josh@agliodbs.com
Lists: pgsql-hackers

Kevin,

> If we stored the actual queries and the EXPLAIN ANALYZE results (when
> generated) in the database, what would be the purpose of the node_name,
> db_object, and condition_detail columns? They don't seem like they
> would be useful for statistical analysis, and it seems like the
> information would be more useful in context. Are these columns really
> needed?

Yes. For example, the only way you're going to analyze index costing by type
of index is if the index name is stored somewhere (db_object) so that it
can be matched to its characteristics. For condition_detail, again we
could determine that (for example) we have costing problems when filters
involve more than 2 columns or complex expressions.
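
As a rough sketch of the db_object case: the snippet below matches each
captured node to an index type by name and averages the row-estimate
error per type. Only db_object is a column from this thread; the
est_rows and actual_rows fields, the index names, and the index_type
lookup are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical plan-node rows captured from EXPLAIN ANALYZE output.
# Only db_object is a column named in this thread; the rest are illustrative.
nodes = [
    {"db_object": "orders_pkey", "est_rows": 100, "actual_rows": 150},
    {"db_object": "orders_pkey", "est_rows": 10, "actual_rows": 10},
    {"db_object": "orders_tags_gist", "est_rows": 50, "actual_rows": 500},
]

# Hypothetical lookup of index characteristics, keyed by index name.
index_type = {"orders_pkey": "btree", "orders_tags_gist": "gist"}


def error_by_index_type(nodes, index_type):
    """Average actual/estimated row ratio, grouped by index type."""
    ratios = defaultdict(list)
    for node in nodes:
        kind = index_type.get(node["db_object"])
        if kind is None:
            continue  # db_object is not a known index; skip it
        ratios[kind].append(node["actual_rows"] / node["est_rows"])
    return {kind: mean(values) for kind, values in ratios.items()}


result = error_by_index_type(nodes, index_type)
```

Without the index name stored per node, there is nothing to join against,
and the grouping by index type cannot be done.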

Node_name is actually duplicative of some of the other columns, so I
suppose it could be dropped.

> For a given node_type, are there multiple valid condition_type values?
> If so, I need to modify my python script to capture this. If not, I
> don't see a need to store it.

I'm not sure. Even if there aren't now, there could be in the future. I'm
more focused on supporting cross-node-type conditions. For example,
"Filter" conditions can apply to a variety of node types (Index Scan,
Merge Join, Subquery Scan, Seq Scan, aggregates). If we were costing
Filters, we'd want to be able to aggregate their stats regardless of the
node in which they occurred.
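
To illustrate, here is a minimal sketch of that kind of cross-node
aggregation. node_type and condition_type are the columns under
discussion; est_rows, actual_rows, and the sample values are assumptions
for illustration only, not real measurements.

```python
from collections import defaultdict

# Hypothetical captured rows. node_type and condition_type come from the
# thread; est_rows and actual_rows are illustrative stand-ins.
rows = [
    {"node_type": "Seq Scan", "condition_type": "Filter",
     "est_rows": 100, "actual_rows": 400},
    {"node_type": "Index Scan", "condition_type": "Filter",
     "est_rows": 20, "actual_rows": 10},
    {"node_type": "Index Scan", "condition_type": "Index Cond",
     "est_rows": 5, "actual_rows": 5},
    {"node_type": "Merge Join", "condition_type": "Filter",
     "est_rows": 50, "actual_rows": 50},
]


def summarize_by_condition(rows):
    """Total the stats per condition_type, ignoring which node they came from."""
    totals = defaultdict(
        lambda: {"count": 0, "est_rows": 0, "actual_rows": 0, "node_types": set()}
    )
    for row in rows:
        entry = totals[row["condition_type"]]
        entry["count"] += 1
        entry["est_rows"] += row["est_rows"]
        entry["actual_rows"] += row["actual_rows"]
        entry["node_types"].add(row["node_type"])
    return dict(totals)


summary = summarize_by_condition(rows)
```

If condition_type is dropped and only inferred from node_type, the three
Filter rows above cannot be pooled this way.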

I'm also really unclear on why you're so focused on storing less
information rather than more. In an "investigation" tool like this, it's
important to collect as much data as possible because we don't know what's
going to be valuable until we analyze it. You seem to be starting out
with the idea that you *already* know exactly where the problems are
located, in which case why develop a tool at all? Just fix the problem.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
