NetBSD "Bad address" failure (was Re: Third call for platform testing)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tom Ivar Helbekkmo <tih(at)kpnQwest(dot)no>
Cc: lockhart(at)fourpalms(dot)org, matthew green <mrg(at)eterna(dot)com(dot)au>, ivan <ivan(at)420(dot)am>, Hackers List <pgsql-hackers(at)postgresql(dot)org>
Subject: NetBSD "Bad address" failure (was Re: Third call for platform testing)
Date: 2001-04-14 01:16:31
Message-ID: 9179.987210991@sss.pgh.pa.us
Lists: pgsql-hackers

Tom Ivar Helbekkmo <tih(at)kpnQwest(dot)no> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>> CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
>>> + ERROR: cannot read block 3 of hash_i4_index: Bad address
>>
>> "Bad address"? That seems pretty bizarre.

> This is obviously something that shows up on _some_ NetBSD platforms.
> The above was on sparc64, but that same problem is the only one I see
> in the regression testing on NetBSD/vax that isn't just different
> floating point (the VAX doesn't have IEEE), different ordering of
> (unordered) collections or different wording of strerror() output.

> NetBSD/i386 doesn't have the "Bad address" problem.

After looking into it, I find that the problem is this: Postgres, or at
least the hash-index part of it, expects to be able to lseek() to a
position past the end of a file and then get a non-failure return from
read(). (This happens indirectly because it uses ReadBuffer for blocks
that it has never yet written.) Given the attached test program, I get
this result on my own machine:

$ touch z -- create an empty file
$ ./a.out z 0 -- read at offset 0
Read 0 bytes
$ ./a.out z 1 -- read at offset 8K
Read 0 bytes

Presumably, the same result appears everywhere else that the regress
tests pass. But NetBSD 1.5T gives

$ touch z
$ ./a.out z 0
Read 0 bytes
$ ./a.out z 1
read: Bad address
$ uname -a
NetBSD varg.i.eunet.no 1.5T NetBSD 1.5T (VARG) #4: Thu Apr 5 23:38:04 CEST 2001 root(at)varg(dot)i(dot)eunet(dot)no:/usr/src/sys/arch/vax/compile/VARG vax

I think this is indisputably a bug in (some versions of) NetBSD. If I
can seek past the end of file, read() shouldn't consider it a hard error
to read there --- and in any case, EFAULT isn't a very reasonable error
code to return. Since it seems not to be a widespread problem, I'm not
eager to change the hash code to try to avoid it.

regards, tom lane

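#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main (int argc, char** argv)
{
    char *fname;
    int fd;
    ssize_t readres;
    off_t seekres;
    char buf[8192];

    if (argc != 3)
    {
        fprintf(stderr, "usage: %s file blocknum\n", argv[0]);
        exit(1);
    }
    fname = argv[1];

    fd = open(fname, O_RDONLY, 0);
    if (fd < 0)
    {
        perror(fname);
        exit(1);
    }
    /* seek to the start of the requested 8K block, possibly past EOF */
    seekres = lseek(fd, (off_t) atoi(argv[2]) * 8192, SEEK_SET);
    if (seekres < 0)
    {
        perror("seek");
        exit(1);
    }
    /* at or past EOF this is expected to return 0, not fail */
    readres = read(fd, buf, sizeof(buf));
    if (readres < 0)
    {
        perror("read");
        exit(1);
    }
    printf("Read %ld bytes\n", (long) readres);

    exit(0);
}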
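
The test program above only probes the platform's behavior. As a rough sketch of what "avoiding it" could look like in a caller, the helper below checks the file size with fstat() and skips the read() entirely when the requested block starts at or beyond end of file. The names read_block_checked and BLOCK_SIZE are invented for this illustration and are not taken from the Postgres source, where the actual read happens somewhere underneath ReadBuffer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define BLOCK_SIZE 8192     /* same block size as the test program */

/*
 * Hypothetical helper, not Postgres code: read block `blockno` of `fd`
 * into `buf`, which must have room for BLOCK_SIZE bytes.
 *
 * If the block starts at or beyond the current end of file, no read()
 * is issued; the buffer is zeroed and 0 is returned, matching what
 * read() reports there on platforms where the regression tests pass.
 *
 * Returns the number of bytes read, or -1 on error with errno set.
 */
ssize_t
read_block_checked(int fd, long blockno, char *buf)
{
    struct stat st;
    off_t       offset = (off_t) blockno * BLOCK_SIZE;

    if (fstat(fd, &st) < 0)
        return -1;

    if (offset >= st.st_size)
    {
        /* block lies wholly past EOF: report "0 bytes" ourselves */
        memset(buf, 0, BLOCK_SIZE);
        return 0;
    }

    if (lseek(fd, offset, SEEK_SET) < 0)
        return -1;

    return read(fd, buf, BLOCK_SIZE);
}
```

Called on an empty file, this returns 0 for both block 0 and block 1 without ever issuing the read() that NetBSD 1.5T rejects. The size check costs an extra system call per block and is not safe against another process extending the file between fstat() and read(), so it is only an illustration of the idea.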
