Re: Does this matter?

From: "Curtis Faith" <curtis(at)galtair(dot)com>
To: "Andrew Sullivan" <andrew(at)libertyrms(dot)info>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Does this matter?
Date: 2002-11-01 22:00:19
Message-ID: DMEEJMCDOJAKPPFACMPMGEONCEAA.curtis@galtair.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance pgsql-sql

Andrew Sullivan wrote:
> Hmm. Maybe a clarification, but I don't think this is quite what the
> tip is talking about. The tip points out that part of the cost is
> "the increased storage" from the blank-padded type (char) as
> contrasted with non-padded types (like text). The tip isn't talking
> about whether a length of 20 is faster than a length of 36. Anyway,
> I can't really believe the length would be a big deal except on
> really huge tables.

It really depends on the access. I spend quite a bit of time optimizing
database internals and the size of an index matters much more than is
apparent in certain cases. This is especially true for medium sized tables.

The real issue is the number of reads required to find a particular entry in
the index.

Assume a btree that tries to be 70% full. Assume 40 bytes for a header, 8
bytes overhead per index entry and an 8K btree page.

The following represents the number of index entries that can be contained in
both a two level and a three level btree.

Type Bytes Items per page 2 3
---- ------ ----- ------ ----------
char(36) 40 129 16,641 2,146,689
char(20) 24 203 41,209 8,365,427

Depending on the size of the table, the number of pages in the btree affect
performance in two separate ways:

1) Cache hit ratio - This greatly depends on the way the tables are accessed
but more densely packed btree indices are used more often and more likely to
be present in a cache than less densely packed indices.

2) I/O time - If the number of items reaches a particular size then the btree
will add an additional level which could result in a very expensive I/O
operation per access. How this affects performance depends very specifically
on the way the index is used.

The problem is not necessarily the size of the table but the transitions in
numbers of levels in the btree. For a table size of 200 to 15,000 tuples,
there won't be a major difference.

For a table size of 25,000 to 40,000 tuples, and assuming the root page is
cached, an index lookup can be twice as fast with a char(20) as it is for a
char(36) because in the one case a two-level btree handles the table while a
three-level btree is needed for the other.

This won't typically affect multi-user throughput as much since other
backends will be working while the I/O's are waiting but it might affect the
performance as seen from a single client.

- Curtis

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2002-11-01 22:01:11 Re: Does this matter?
Previous Message Josh Berkus 2002-11-01 22:00:09 Re: Does this matter?

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2002-11-01 22:01:11 Re: Does this matter?
Previous Message Josh Berkus 2002-11-01 22:00:09 Re: Does this matter?