RE: Help interpreting the output of EXPLAIN

From: "Mayers, Philip J" <p(dot)mayers(at)ic(dot)ac(dot)uk>
To: "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Cc: "'Tom Lane '" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: RE: Help interpreting the output of EXPLAIN
Date: 2000-12-09 12:20:47
Message-ID: A0F836836670D41183A800508BAF190B35E234@icex1.cc.ic.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Ok, I'm slowly getting it.

The table declarations and query are shown at:

http://blacklotus.cc.ic.ac.uk/~pjm3/sql.txt
http://blacklotus.cc.ic.ac.uk/~pjm3/query

The output of the explain with just the SQL is at:

http://blacklotus.cc.ic.ac.uk/~pjm3/explain1

So, the major optimisation point seems to be the nested loop on line 5 - the
major portion of the cost there is coming from 9000 rows times 2.0 cost for
index scan on interface_pkey, which corresponds to the join on line 11 of
the query.

Hmm - interface.mac is the primary key and thus already indexed - if I
eliminate the sequential scan on host (line 6 of the explain) by indexing
the host.mac (which is macaddr, so no hash index), what happens:

http://blacklotus.cc.ic.ac.uk/~pjm3/explain2

Now the major portion of the cost is an index scan on host_mac at a cost of
2.0, as the inner scan of a nested join of 9000 rows, so a cost of
(surprise) 18000 again... Damn.

I'm stuck - restructure the query?

Cheers,
Phil

-----Original Message-----
From: Tom Lane
To: Mayers, Philip J
Cc: 'pgsql-general(at)postgresql(dot)org'
Sent: 09/12/00 00:07
Subject: Re: [GENERAL] Help interpreting the output of EXPLAIN

"Mayers, Philip J" <p(dot)mayers(at)ic(dot)ac(dot)uk> writes:
> 1) The costs seem to change radically without me taking any action -
do I
> have to run them on an unloaded machine?
> 2) VACUUM seemed to massively increase the cost
> 3) VACUUM ANALYZE decreased it - so, should I always do "VACUUM;
VACUUM
> ANALYZE;"

The cost estimates certainly don't depend on load ;-). They aren't
going to change when you "haven't taken any action", either. However,
they do depend on statistics gathered by VACUUM and VACUUM ANALYZE,
so the estimates can change after you run those.

> 4) I'm using LIMIT after the outer join - should I LIMIT the two inner
joins
> to the same amount, since or will Postgres do this itself?

Hard to say. You're not going to get much useful response about a query
as complex as this one evidently is when you haven't shown us the query
itself and the declarations of the relevant tables/indexes.

> a join of the form "host.mac = interface.mac", so I should hash index
the
> host.mac column? Unfortunately, mac is of type "macaddr" and this
happens:

> hdb=> create index host_mac on host using hash (mac);
> ERROR: fmgr_info: function 0: cache lookup failed

Hash indexes on macaddr don't work in 7.0 (the support's only partially
present). Personally I can see very little use for hash indexes anyway.
I don't know of any situation where I'd prefer hash to btree, at least
not given the implementations of the two in Postgres.

regards, tom lane

Browse pgsql-general by date

  From Date Subject
Next Message Hiroshi Inoue 2000-12-09 13:23:09 RE: [HACKERS] Japan pictures
Previous Message Jarmo Paavilainen 2000-12-09 10:28:35 SV: Bug in index scans with Locale support enabled