Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: jeff(at)remapcorp(dot)com (Jeff Hoffmann)
Cc: hackers(at)postgreSQL(dot)org (PostgreSQL-development)
Subject: Re: [GENERAL] Cost: Big Tables vs. Organized Separation of Data
Date: 1999-02-03 21:27:27
Message-ID: 199902032127.QAA18226@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Bruce Momjian wrote:
> >
> > > > So, I guess my question is: how costly are joins? I've heard that
> > > > Postgres pretty much "pukes" (in terms of speed) when you're trying
> > > > to do anything more than 6 table joins in one query. This leads
> > > > me to believe that joins are fairly costly... ????
> > >
> > > i've noticed a pretty drastic slowdown going from a 4 table join
> > > (instantaneous) to a 5 table join (15-20 seconds) i don't think i've
> > > tried 6 tables yet with the same database.
>
> i'll attach the output of my query explains if you want to look at them
> in detail, but here's a quick overview: all of my tables that are being
> joined have 125000-200000 records, all have a primary key, and the join
> is done by matching the primary key in the respective tables. i'm doing
> selects on 4,5,6,7,8 tables testing both with and without GEQO. 7 and 8
> tables caused a crash (apparently out of memory), 4-6 were pretty
> reasonable. they all came up with sensible plans, namely:
> Nested Loop
> -> Nested Loop
> -> Nested Loop
> -> Index Scan
> -> Index Scan
> -> Index Scan
> -> Index Scan
> with the respective levels of index scans/nested loops for each number
> of tables in the join.
>
> although i did find some improvement by lowering the GEQO threshold to
> 6, it wasn't really the answer i was hoping for. using GEQO was the
> loser (time wise) each time on similar selects joining 4 and 5 tables.
> GEQO came up with a better plan (lower cost), but took longer to come up
> with the plan, making it the loser unless you can prepare the statement
> beforehand (which would be nice and i know has been discussed already).
> apparently 4 is the magic number for "instantaneous" joins in my case, 5
> gets you up to about 5 seconds (if you type the query right), and 6 is
> just nasty without GEQO (it's still churning whereas with GEQO it took
> ~20 seconds)

So 6 was your magic number for GEQO in 6.4, though I should comment that
in 6.5, we will use tables+indexes as the geqo start value, so with 5
tables and one index on each you had combined value of 10.

>
> so now the question is, what is the relationship between cost and how
> long a query takes? for example, one of my queries show up with
> cost=14387.49 size=165779 width=664(6 table join w/o GEQO); another
> shows up as cost=14383.39 size=165779 width=644 (5 table join w/o GEQO)
> yet the first one takes several minutes whereas the second only takes a
> few seconds. i don't really know what other questions i should be
> asking here. my guess is that there's some kind of memory limit that's
> being hit causing the jump from 4->5 to take more time than it
> apparently should causing it to take multiple (and therefore slower)
> steps to get the end result. is this a possibility? or am i way off
> base?

[CC to hackers.]

The cost really is just for comparison to other plans. Not sure you can
really make any meaning out of the number, and geqo probably uses a
different measurement for cost. The 4-5 setting is about what I
expected, and I realize GEQO is still too slow for large joins. I am
looking at what can be done with this, in trying to make non-geqo faster
for joins in the 4-10 range, so perhaps we can speed those up, and use
geqo only for really large joins. I hope to have something for 6.5, but
am still researching.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 1999-02-04 01:01:42 Re: [HACKERS] template/alpha_cc
Previous Message Bruce Momjian 1999-02-03 20:10:21 Re: [HACKERS] DEC OSF1 Compilation problems