Page Files
A description of the database file default page format.
This section provides an overview of the page format used by Postgres
classes. User-defined access methods need not use this page format.
In the following explanation, a byte is assumed to contain 8 bits. In
addition, the term item refers to data which is stored in Postgres
classes.
Page Structure
The following table shows how pages in both normal Postgres classes
and Postgres index classes (e.g., a B-tree index) are structured.
Sample Page Layout

    Item                    Description
    --------------------    -----------
    itemPointerData
    filler
    itemData...
    Unallocated Space
    ItemContinuationData
    Special Space
    "ItemData 2"
    "ItemData 1"
    ItemIdData
    PageHeaderData
The first 8 bytes of each page consist of a page header
(PageHeaderData). Within the header, the first three 2-byte integer
fields (lower, upper, and special) represent byte offsets to the start
of unallocated space, to the end of unallocated space, and to the
start of special space, respectively. Special space is a region at the
end of the page that is allocated at page initialization time and that
contains information specific to an access method. The last 2 bytes of
the page header, opaque, encode the page size and information on the
internal fragmentation of the page. Page size is stored in each page
because frames in the buffer pool may be subdivided into equal-sized
pages on a frame-by-frame basis within a class. The internal
fragmentation information is used to aid in determining when page
reorganization should occur.
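The header layout described above can be sketched in a few lines of
Python. This is only an illustration: the byte order (little-endian),
the unsigned interpretation of the fields, and the 8192-byte page with
16 bytes of special space are assumptions, not facts stated in this
section. The opaque field is left undecoded because its encoding is
not specified here.

```python
import struct

# Assumption: four unsigned 16-bit fields in little-endian byte order.
HEADER_FORMAT = "<HHHH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def parse_page_header(page: bytes) -> dict:
    """Split the first 8 bytes of a page into its four 2-byte fields."""
    lower, upper, special, opaque = struct.unpack_from(HEADER_FORMAT, page, 0)
    return {
        "lower": lower,      # byte offset of the start of unallocated space
        "upper": upper,      # byte offset of the end of unallocated space
        "special": special,  # byte offset of the start of special space
        "opaque": opaque,    # page size and fragmentation info (not decoded)
    }


def unallocated_bytes(header: dict) -> int:
    """Bytes lying between the start and the end of unallocated space."""
    return header["upper"] - header["lower"]


# Hypothetical empty page: 8192 bytes, the last 16 reserved as special space.
PAGE_SIZE = 8192
page = bytearray(PAGE_SIZE)
struct.pack_into(HEADER_FORMAT, page, 0,
                 HEADER_SIZE, PAGE_SIZE - 16, PAGE_SIZE - 16, 0)
header = parse_page_header(bytes(page))
```

On this empty page, lower points just past the header and upper
coincides with special, so everything in between is unallocated.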
Following the page header are item identifiers (ItemIdData). New item
identifiers are allocated from the first four bytes of unallocated
space. Because an item identifier is never moved until it is freed,
its index may be used to indicate the location of an item on a page.
In fact, every pointer to an item (ItemPointer) created by Postgres
consists of a frame number and an index of an item identifier. An item
identifier contains a byte offset to the start of an item, its length
in bytes, and a set of attribute bits that affect its interpretation.
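To make the role of item identifiers concrete, here is a minimal
sketch that adds items to a page and looks them up again by identifier
index. Several details are assumptions made for illustration: the
little-endian byte order, the split of the 4-byte identifier into a
13-bit offset, 6 attribute bits, and a 13-bit length, and the helper
names init_page, add_item, and get_item, which are hypothetical. Item
data is placed at the end of unallocated space and grows backwards,
while identifiers are taken from its first four bytes.

```python
import struct

HEADER_FORMAT = "<HHHH"  # lower, upper, special, opaque (assumed little-endian)
HEADER_SIZE = 8
ITEM_ID_SIZE = 4


def pack_item_id(offset: int, flags: int, length: int) -> bytes:
    # Assumed bit layout: offset in bits 0-12, flags in bits 13-18,
    # length in bits 19-31. This limits the sketch to pages <= 8192 bytes.
    word = (offset & 0x1FFF) | ((flags & 0x3F) << 13) | ((length & 0x1FFF) << 19)
    return struct.pack("<I", word)


def unpack_item_id(raw: bytes) -> tuple:
    (word,) = struct.unpack("<I", raw)
    return word & 0x1FFF, (word >> 13) & 0x3F, (word >> 19) & 0x1FFF


def init_page(size: int, special_size: int = 0) -> bytearray:
    """Create an empty page whose special space occupies the last bytes."""
    page = bytearray(size)
    special = size - special_size
    struct.pack_into(HEADER_FORMAT, page, 0, HEADER_SIZE, special, special, 0)
    return page


def add_item(page: bytearray, item: bytes, flags: int = 0) -> int:
    """Store an item and return the index of its item identifier."""
    lower, upper, special, opaque = struct.unpack_from(HEADER_FORMAT, page, 0)
    if upper - lower < ITEM_ID_SIZE + len(item):
        raise ValueError("not enough unallocated space")
    index = (lower - HEADER_SIZE) // ITEM_ID_SIZE
    upper -= len(item)                      # item data grows backwards
    page[upper:upper + len(item)] = item
    page[lower:lower + ITEM_ID_SIZE] = pack_item_id(upper, flags, len(item))
    lower += ITEM_ID_SIZE                   # identifiers grow forwards
    struct.pack_into(HEADER_FORMAT, page, 0, lower, upper, special, opaque)
    return index


def get_item(page: bytearray, index: int) -> bytes:
    """Resolve an item identifier index to the bytes of the item."""
    start = HEADER_SIZE + index * ITEM_ID_SIZE
    offset, _flags, length = unpack_item_id(bytes(page[start:start + ITEM_ID_SIZE]))
    return bytes(page[offset:offset + length])


page = init_page(1024)
first = add_item(page, b"hello")
second = add_item(page, b"world!")
```

A pointer to an item would then be the pair (frame number, index); the
sketch covers only the second half of that pair and does not model the
reuse of freed identifiers.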
The items themselves are stored in space allocated backwards from
the end of unallocated space. Usually, the items are not interpreted.
However, when an item is too long to be placed on a single page, or
when fragmentation of the item is desired, the item is divided and
each piece is handled as a distinct item in the following manner. Each
piece from the first through the next-to-last is placed in an item
continuation structure (ItemContinuationData). This structure contains
itemPointerData, which points to the next piece, together with the
piece itself. The last piece is handled normally.
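The chaining of pieces can be illustrated with a small in-memory
model. It is a sketch rather than the actual on-disk format: the
StoredPiece class, the store_in_pieces and read_item helpers, and the
dictionary standing in for pages are all hypothetical, and the caller
is assumed to supply the item pointers at which the pieces will live.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

ItemPointer = Tuple[int, int]  # (frame number, item identifier index)


@dataclass
class StoredPiece:
    data: bytes
    # Set on every piece except the last, which is stored normally.
    next_piece: Optional[ItemPointer] = None


def store_in_pieces(storage: Dict[ItemPointer, StoredPiece], item: bytes,
                    piece_size: int, pointers: List[ItemPointer]) -> ItemPointer:
    """Divide an item and link each piece to the one that follows it."""
    pieces = [item[i:i + piece_size] for i in range(0, len(item), piece_size)]
    if not pieces or len(pieces) > len(pointers):
        raise ValueError("empty item or too few item pointers")
    for n, piece in enumerate(pieces):
        is_last = n == len(pieces) - 1
        next_ptr = None if is_last else pointers[n + 1]
        storage[pointers[n]] = StoredPiece(piece, next_ptr)
    return pointers[0]


def read_item(storage: Dict[ItemPointer, StoredPiece], head: ItemPointer) -> bytes:
    """Follow the chain of continuation pointers and rejoin the pieces."""
    parts = []
    pointer: Optional[ItemPointer] = head
    while pointer is not None:
        piece = storage[pointer]
        parts.append(piece.data)
        pointer = piece.next_piece
    return b"".join(parts)


storage: Dict[ItemPointer, StoredPiece] = {}
head = store_in_pieces(storage, b"abcdefghij", 4, [(0, 0), (1, 0), (1, 1)])
whole = read_item(storage, head)
```

Here the ten-byte item is split into three pieces; the first two carry
a pointer to their successor, and the third has none.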
Files
data/
Location of shared (global) database files.
data/base/
Location of local database files.
Bugs
The page format may change in the future to provide more efficient
access to large objects.
This section contains insufficient detail to be of any assistance in
writing a new access method.