Re: join-performance problem

From: Wolfgang(dot)Fuertbauer(at)ebewe(dot)com
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org, Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Subject: Re: join-performance problem
Date: 2002-05-02 08:00:47
Message-ID: OF65A70128.C3997CD6-ONC1256BAD.002B15FA@ebewe.co.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


On 30.04.2002 18:44:19 pgsql-sql-owner wrote:
>Wolfgang(dot)Fuertbauer(at)ebewe(dot)com writes:
>>> explain SELECT b.Kdnr, date_part('year', b.Datum), d.Name, e.Name,
a.Menge,
>>> a.Rabproz, a.Netto - (a.netto * a.rabproz / 100),
>>> a.Netto * b.kurs - ((a.netto * b.kurs) * a.rabproz / 100), 'RG'
>>> FROM Faktzeilen a, Fakt b, Artikel c, Kollektion d, ArGruppen e
>>> where a.Fanr = b.nr
>>> and c.nr = a.Arnr
>>> and c.Kollektion = d.Nr
>>> and (c.Gruppe = e.nr or c.gruppe = 0)
>>> and b.kdnr = 49736;
>
>Anyplace that c.gruppe is 0, this is an unconstrained join to e --- ie,
>you'll get a row out for *every* row of e. Somehow I doubt that's the
>behavior you really want.

you're absolutly right! I fixed that (by having an record in c with nr 0 -
what
is ok for the application)

>I concur with Stephan's observation that you haven't analyzed. But
>even when you have, this query doesn't give much traction for the use
>of indexes on a --- the only constraint that might possibly be used to
>avoid a complete scan of a is the "b.kdnr = 49736", and that's not even
>on a. The only hope I can see is if you create an index on b.kdnr;
>then (if there aren't very many rows matching b.kdnr = 49736), it might
>be able to pick those up with an indexscan on b and then do an inner
>indexscan join to a using a.Fanr = b.nr.

OK;
created an index on b.kdnr (fakt_kunde_key) and here is the analyses:

Hash Join (cost=482.82..704.62 rows=80 width=93)
-> Hash Join (cost=481.78..702.17 rows=80 width=79)
-> Hash Join (cost=480.44..699.43 rows=80 width=59)
-> Seq Scan on artikel c (cost=0.00..155.77 rows=4977
width=18)
-> Hash (cost=480.24..480.24 rows=80 width=41)
-> Nested Loop (cost=0.00..480.24 rows=80 width=41)
-> Index Scan using fakt_kunde_key on fakt b
(cost=0.00..55.11 rows=13 width=16)
-> Index Scan using faktzeilen_pkey on
faktzeilen a (cost=0.00..31.82 rows=10 width=25)
-> Hash (cost=1.27..1.27 rows=27 width=20)
-> Seq Scan on argruppen e (cost=0.00..1.27 rows=27
width=20)
-> Hash (cost=1.03..1.03 rows=3 width=14)
-> Seq Scan on kollektion d (cost=0.00..1.03 rows=3 width=14)

EXPLAIN

can you pleas explain me why c is now scanned sequential?

>Your secondary indexes on a
>look like wastes of space (at least for this query).

they are relevant for other queries; this is only a part from a *large*
ordering / invoicing-system I want to port from Access.

Thanks and Best regards
Wolfgang

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Rajesh Kumar Mallah. 2002-05-02 09:16:20 Weird select output...
Previous Message Joel Burton 2002-05-02 03:17:08 Re: CREATE VIEW question...