Re: Performance of PostgreSQL B+-tree algorithm

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Lanclos <lanclos(at)ucolick(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance of PostgreSQL B+-tree algorithm
Date: 2012-05-14 18:49:16
Message-ID: 14776.1337021356@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Kyle Lanclos <lanclos(at)ucolick(dot)org> writes:
> I spent some time last week staring at the code for the PostgreSQL
> B+-tree implementation. What I hoped to find, and was not immediately
> able to determine, was the Knuth order for the PostgreSQL B+-tree
> implementation. It is entirely possible that I simply got lost in the
> wrong C file.

> My goal is to make an informed assertion about the performance of
> a PostgreSQL B+-tree index as the quantity of records in our database
> grows more or less unbounded.

> To use a common reference, wikipedia states the following:

> Bayer & McCreight (1972), Comer (1979), and
> others define the order of B-tree as the
> minimum number of keys in a non-root node.
> Folk & Zoellick (1992) points out that terminology
> is ambiguous because the maximum number of keys
> is not clear. An order 3 B-tree might hold a
> maximum of 6 keys or a maximum of 7 keys.
> (Knuth 1998, p. 483) avoids the problem by defining
> the order to be maximum number of children (which
> is one more than the maximum number of keys).

Well, that would depend on the data type being indexed, which you did
not specify; and if it's a variable-length type then it's really hard to
give a concrete answer. For integer or int8 keys the answer is
typically about 400, though, depending on whether you're talking about
a 32- or 64-bit platform. Basically it's 4 bytes for line pointer,
plus 8 bytes for index tuple header, plus maxalign'ed size of the index
key, divided into page size (less a couple dozen bytes for page header).
You could increase the result by building with a page size of more than
the default 8K, though I've seen no recent experiments suggesting that
doing so is likely to be a win.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kyle Lanclos 2012-05-14 20:08:23 Re: Performance of PostgreSQL B+-tree algorithm
Previous Message François Beausoleil 2012-05-14 18:36:13 Re: COPY from CSV, passing in default value?