Re: [HACKERS] Hashjoin status report

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Hashjoin status report
Date: 1999-05-07 02:31:43
Message-ID: 199905070231.LAA18421@srapc451.sra.co.jp
Lists: pgsql-hackers

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> >> Opinions? Should I plow ahead, or leave this to fix after 6.5 release?
>
> > Estimate of time involved to fix this? vs likelihood of someone
> > triggering the bug in production?
>
> I could probably get the coding done this weekend, unless something else
> comes up to distract me. It's the question of how much testing it'd
> receive before release that worries me...
>
> As for the likelihood, that's hard to say. It's very easy to trigger
> the bug as a test case. (Arrange for a hashjoin where the inner table
> has a lot of identical rows, or at least many sets of more-than-10-
> rows-with-the-same-value-in-the-field-being-hashed-on.) In real life
> you'd like to think that that's pretty improbable.
>
> What started this go-round was Contzen's report of seeing the
> "hash table out of memory. Use -B parameter to increase buffers"
> message in what was evidently a real-life scenario. So it can happen.
> Do you recall having seen many complaints about that error before?
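
To illustrate the trigger condition described above (many inner rows with the same value in the hashed field), here is a minimal sketch. It is not PostgreSQL's hashjoin code: the function max_bucket_occupancy and the sample keys are hypothetical, and Python's built-in hash() stands in for the real hash function. It only shows that identical keys always share one bucket, however many buckets there are.

```python
from collections import Counter


def max_bucket_occupancy(keys, nbuckets):
    """Return the largest number of rows that land in any single bucket.

    Hypothetical helper for illustration only; hash() stands in for
    whatever hash function the real join would use.
    """
    counts = Counter(hash(k) % nbuckets for k in keys)
    return max(counts.values())


# Assumed sample data: 50 rows share the join key 7, 100 other keys are distinct.
inner_keys = [7] * 50 + list(range(100, 200))

for nbuckets in (16, 256, 4096):
    # Adding buckets spreads out the distinct keys, but the 50 duplicates
    # of key 7 still end up together in one bucket.
    print(nbuckets, max_bucket_occupancy(inner_keys, nbuckets))
```

Under these assumptions the fullest bucket never drops below the duplicate count, which is why adding buckets alone would not help with such data.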

We already have a good example of this "hash table out of memory. Use
-B parameter to increase buffers" syndrome in our source tree. Go to
src/test/bench, remove "-B 256" from the last line of runwisc.sh, then
run the test. The "-B 256" was not there originally; I added it while
fixing the test suite and elog() (see included posting). I don't see
the error message in 6.4.2, so I guess this is due to changes in the
optimizer.

IMHO, we should fix this before 6.5 is out, or else change the
default setting of -B to 256 or so. The latter may cause a shortage
of shmem, however.

P.S. At that time I mistakenly thought that I didn't have enough sort
memory :-<

>Message-Id: <199904160654(dot)PAA00221(at)srapc451(dot)sra(dot)co(dot)jp>
>From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
>To: hackers(at)postgreSQL(dot)org
>Subject: [HACKERS] elog() and wisconsin bench test fix
>Date: Fri, 16 Apr 1999 15:54:16 +0900
>
>I have modified elog() so that it uses its own pid (obtained with
>getpid()) as the first parameter for kill() in some cases. It used to
>get its own pid from the MyProcPid global variable. This was fine until
>I ran the wisconsin benchmark test suite (test/bench/). In the test,
>postgres is run as a command and MyProcPid is set to 0. As a result,
>elog() called kill() with the first parameter set to 0, and SIGQUIT was
>sent to the whole process group, not just the postgres process itself!
>This was why /bin/sh dumped core whenever I ran the bench test.
>
>Also, I fixed several bugs in the test queries.
>
>One thing that still remains is that some queries fail due to
>insufficient sort memory. I modified the test script, adding the -B
>option. But is this normal? I think not. I thought postgres should use
>disk files instead of memory if there's not enough sort buffer.
>
>Comments?
>--
>Tatsuo Ishii
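
The kill() problem described in the included posting can be sketched as follows. This is only an illustration, not the actual elog() change: signal_target is a hypothetical helper, and my_proc_pid stands in for the MyProcPid global. It relies on the POSIX rule that kill() with a pid of 0 signals every process in the caller's process group.

```python
import os


def signal_target(my_proc_pid):
    """Pick the pid to pass to kill().

    Hypothetical helper: my_proc_pid plays the role of MyProcPid, which
    the posting says is 0 when postgres is run as a command.
    """
    # kill(0, sig) would signal the whole process group (including the
    # invoking shell), so fall back to this process's real pid instead.
    return my_proc_pid if my_proc_pid > 0 else os.getpid()


# Signal number 0 sends nothing; it only checks that the target exists,
# which makes it safe to run here without disturbing any other process.
os.kill(signal_target(0), 0)
print(signal_target(0) == os.getpid())
```

With a nonzero my_proc_pid the helper returns it unchanged, so only the unset case is redirected to getpid().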
