Re: [HACKERS] Problems with >2GB tables on Linux 2.0

From: Peter T Mount <peter(at)retep(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Problems with >2GB tables on Linux 2.0
Date: 1999-02-08 20:25:17
Message-ID: Pine.LNX.4.04.9902081937150.19320-100000@maidast.retep.org.uk
Lists: pgsql-hackers

On Mon, 8 Feb 1999, Tom Lane wrote:

> Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> writes:
> >> However, I'm using John's suggestion of reducing the file size a lot more,
> >> to ensure we don't hit any math errors, etc. So the max file size is about
> >> 1.6Gb.
>
> > I can imagine people finding that strange. Is it really needed? Is
> > there some math that could overflow with a larger value?
>
> Well, that's the question all right --- are you sure that there's not?
> I think "max - 1 blocks" is pushing it, since code that computes
> something like "the byte offset of the block after next" would fail.
> Even if there isn't any such code today, it seems possible that there
> might be someday.
>
> I'd be comfortable with 2 billion (2000000000) bytes as the filesize
> limit, or Andreas' proposal of 1Gb.

I'm starting to like Andreas' proposal as the new default.
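
A rough sketch of the arithmetic behind Tom's concern, assuming 8192-byte
blocks and byte offsets held in a signed 32-bit integer. Both are
assumptions made for illustration, and the names BLOCK_SIZE,
blocks_per_segment and offset_fits_int32 are hypothetical, not backend code.
The offset is computed in 64 bits so the check itself cannot overflow:

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 8192   /* assumed block size, for illustration only */

/* Number of whole blocks that fit in a segment capped at max_bytes. */
static int64_t blocks_per_segment(int64_t max_bytes)
{
    assert(max_bytes >= BLOCK_SIZE);
    return max_bytes / BLOCK_SIZE;
}

/*
 * Returns 1 if the byte offset of the block `lookahead` blocks past the
 * last block of a full segment still fits in a signed 32-bit integer,
 * 0 if computing it in 32 bits would overflow.
 */
static int offset_fits_int32(int64_t max_bytes, int lookahead)
{
    int64_t last_block = blocks_per_segment(max_bytes) - 1;
    int64_t offset = (last_block + lookahead) * BLOCK_SIZE;

    return offset <= INT32_MAX;
}
```

Under these assumptions, a limit of "max - 1 blocks" (2^31 - 8192 bytes)
fails for the block after next (lookahead 2), while 2000000000 bytes and
1Gb both leave room.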

> I also like the proposals to allow the filesize limit to be configured
> even lower to ease splitting huge tables across filesystems.
>
> To make that work easily, we really should adopt a layout where the data
> files don't all go in the same directory. Perhaps the simplest is:
>
> * First or only segment of a table goes in top-level data directory,
> same as now.
>
> * First extension segment is .../data/1/tablename.1, second is
> .../data/2/tablename.2, etc. (Using numbers for the subdirectory
> names prevents name conflict with ordinary tables.)

How about dropping the suffix, so you would have:

.../data/2/tablename

Doing that doesn't mean having to increase the filename buffer size, just
the format and arg order (from %s.%d to %d/%s).
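
A minimal sketch of that change, to show why the buffer size is unaffected.
The helper names and the use of snprintf are assumptions for illustration;
the real code builds its paths differently. Both forms produce the table
name, one separator character and the segment number, so the lengths match:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Suffix form: "tablename.N". Returns the length snprintf would write. */
static int segment_name_suffix(char *buf, size_t len,
                               const char *relname, int segno)
{
    assert(buf != NULL && relname != NULL);
    return snprintf(buf, len, "%s.%d", relname, segno);
}

/* Subdirectory form: "N/tablename". Same pieces, swapped argument order. */
static int segment_name_subdir(char *buf, size_t len,
                               const char *relname, int segno)
{
    assert(buf != NULL && relname != NULL);
    return snprintf(buf, len, "%d/%s", segno, relname);
}
```

For relname "tablename" and segno 2 these give "tablename.2" and
"2/tablename", both 11 characters long.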

I'd think we could add a test for the symlink/directory when the new
segment is created. If it doesn't exist, then create it. Otherwise a poor
unsuspecting user would have their database fall over, without realising
where the error is.

Peter

--
Peter T Mount peter(at)retep(dot)org(dot)uk
Main Homepage: http://www.retep.org.uk
PostgreSQL JDBC Faq: http://www.retep.org.uk/postgres
Java PDF Generator: http://www.retep.org.uk/pdf
