Re: [INTERFACES] Back end memory consumption (was Front end memory consumption in SELECT)

From: Douglas Thomson <dougt(at)mugc(dot)cc(dot)monash(dot)edu(dot)au>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane)
Cc: dougt(at)mugc(dot)cc(dot)monash(dot)edu(dot)au, pgsql-interfaces(at)postgreSQL(dot)org
Subject: Re: [INTERFACES] Back end memory consumption (was Front end memory consumption in SELECT)
Date: 1999-12-03 05:51:16
Message-ID: 199912030551.QAA09331@mugca.cc.monash.edu.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-interfaces

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Douglas Thomson <dougt(at)mugc(dot)cc(dot)monash(dot)edu(dot)au> writes:
> > If I hadn't checked that I have indexes on
> > all the attributes (and attribute combinations) that make up the keys
> > needed for the join, I would think the back end was trying to read in
> > and sort a large part of the data just for the query!
>
> Well, of course it *is* ... the trick is to make sure that anything
> really large ends up on disk (in a temp file) and not in memory.

What I don't understand is, why would it bother trying to read in
lots of data and sort it when it already has an index sorted in the
required order? I was trying to set up the individual table indexes
so that the join could be done efficiently, and not need either lots
of memory or temporary disk files. Do indexes not get used for joins?

> Offhand the plan looks pretty reasonable for a 9-way join. I see that
> you have two sorts and five hashes going on here. Each of those will
> feel entitled to use as much memory as whatever your -S setting is,
> which IIRC is 512K by default. But 3.5meg altogether isn't enough to
> explain your memory usage ... unless you are using a larger-than-normal
> -S switch?

No, I use the default.

> Also, while the sorts are pretty certain to honor your -S limit (plus or
> minus some slop), the hashes are much more probabilistic affairs. If a
> table is much larger than the planner guesses, or if the distribution of
> tuple values is very uneven, a hash join might use a lot more memory
> than -S. It's hard to tell whether this might be happening without a
> lot more info than EXPLAIN provides, however. EXPLAIN shows that your
> tables being hashed are all pretty small, the largest being
> 't24 users' at 6955 rows. If that's accurate then I doubt hash overrun
> is the explanation...

6955 rows is accurate for that table. Values could well be highly
skewed though. The joining attribute for t24 is a student ID "number"
text field, and the numbers will be grouped.

Doug.

In response to

Browse pgsql-interfaces by date

  From Date Subject
Next Message Peter Wiley 1999-12-03 06:06:33 Re: pgsql-interfaces-digest V1 #570
Previous Message Ross J. Reedstrom 1999-12-03 04:34:27 pgsql, ODBC and ColdFusion