Re: Large (8M) cache vs. dual-core CPUs

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: mark(at)mark(dot)mielke(dot)cc
Cc: Ron Peacetree <rjpeace(at)earthlink(dot)net>, Scott Marlowe <smarlowe(at)g2switchworks(dot)com>, Bill Moran <wmoran(at)collaborativefusion(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large (8M) cache vs. dual-core CPUs
Date: 2006-04-26 22:24:57
Message-ID: 20060426222456.GB97354@pervasive.com
Lists: pgsql-performance

On Wed, Apr 26, 2006 at 02:48:53AM -0400, mark(at)mark(dot)mielke(dot)cc wrote:
> You said that DB accesses are random. I'm not so sure. In PostgreSQL,
> are not the individual pages often scanned sequentially, especially
> because all records are variable length? You don't think PostgreSQL
> will regularly read 32 bytes (8 bytes x 4) at a time, in sequence?
> Whether for table pages, or index pages - I'm not seeing why the
> accesses wouldn't be sequential. You believe PostgreSQL will access
> the table pages and index pages randomly on a per-byte basis? What
> is the minimum PostgreSQL record size again? Isn't it 32 bytes or
> over? :-)

Data within a page can absolutely be accessed randomly; it would be
horribly inefficient to slog through 8K of data every time you needed to
find a single row.
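
To make that concrete, here is a minimal sketch of a slotted page: a small
array of line pointers at the front of the block, with the rows packed from
the end backwards. The ToyPage class and its 4-byte (offset, length) slot
format are assumptions for illustration only; the real on-disk structures
are more involved, so treat this as a model and not as PostgreSQL's code.

```python
import struct

PAGE_SIZE = 8192                 # PostgreSQL's default block size
SLOT = struct.Struct("<HH")      # toy line pointer: (offset, length)


class ToyPage:
    """Simplified slotted page; hypothetical, not the real page format."""

    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.nslots = 0
        self.free_end = PAGE_SIZE    # rows are packed from the end backwards

    def add(self, data: bytes) -> int:
        """Store a row and return its slot number."""
        slot_pos = self.nslots * SLOT.size
        new_start = self.free_end - len(data)
        if new_start < slot_pos + SLOT.size:
            raise ValueError("page full")
        self.buf[new_start:self.free_end] = data
        SLOT.pack_into(self.buf, slot_pos, new_start, len(data))
        self.free_end = new_start
        self.nslots += 1
        return self.nslots - 1

    def get(self, slot: int) -> bytes:
        """Fetch one row with a single slot lookup; no other row is read."""
        if not 0 <= slot < self.nslots:
            raise IndexError(slot)
        off, length = SLOT.unpack_from(self.buf, slot * SLOT.size)
        return bytes(self.buf[off:off + length])
```

Fetching slot N touches only 4 bytes of slot array plus the row itself,
which is why an index lookup does not have to walk the whole 8K.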

The tuple header is ~23 bytes, depending on your version of PostgreSQL,
and data fields have to start on the proper alignment boundary
(generally 4 bytes). That pads the header out to 24 bytes, so with a
single 4-byte field the smallest row you can get is essentially 28
bytes.
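
One way to arrive at that figure, as a rough sketch: round the header up to
the alignment boundary and add one 4-byte column. The single 4-byte column
is my assumption here, and the header size is just the approximate number
quoted above, so check both against your version.

```python
def align_up(n: int, alignment: int) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


HEADER_BYTES = 23    # approximate figure from the text; varies by version
ALIGNMENT = 4        # "generally 4 bytes"
COLUMN_BYTES = 4     # assumption: the row holds one 4-byte column

data_start = align_up(HEADER_BYTES, ALIGNMENT)   # header padded to 24
min_row = data_start + COLUMN_BYTES              # 24 + 4 = 28
```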

I know that tuple headers are dealt with as a C structure, but I don't
know whether that means accessing any one header field costs the same as
accessing the whole thing. I also don't know whether PostgreSQL can
access fields within a tuple without having to scan through at least the
first part of the preceding fields, though I suspect that it can access
fixed-width fields that sit before any varlena fields directly (without
scanning through the other fields).
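
Here is a toy model of what I suspect, not a description of what the
backend actually does. The schema is a list of column widths, with None
standing for a varlena stored as a 4-byte length followed by its bytes.
Alignment padding and nulls are ignored, and pack_row, static_offset and
find_offset are made-up names for this sketch.

```python
import struct


def pack_row(schema, values):
    """Pack ints (fixed width) and bytes (varlena) into one row."""
    out = bytearray()
    for width, value in zip(schema, values):
        if width is None:
            out += struct.pack("<I", len(value)) + value
        else:
            out += value.to_bytes(width, "little")
    return bytes(out)


def static_offset(schema, col):
    """Offset known from the schema alone, or None if a varlena precedes."""
    if any(width is None for width in schema[:col]):
        return None
    return sum(schema[:col])


def find_offset(schema, row, col):
    """Return (offset of col, number of preceding fields walked)."""
    known = static_offset(schema, col)
    if known is not None:
        return known, 0          # direct access, nothing scanned
    off = 0
    walked = 0
    for width in schema[:col]:
        if width is None:
            (length,) = struct.unpack_from("<I", row, off)
            off += 4 + length    # must read the length to skip the payload
        else:
            off += width
        walked += 1
    return off, walked
```

With a schema of [4, 8, None, 4], the first three columns are reachable
with no scanning, while the last one can only be located by walking the
three fields in front of it.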

If we ever got to the point of divorcing the in-memory tuple layout from
the table layout, it'd be interesting to experiment with having all
varlena length info stored immediately after all fixed-width fields;
that could potentially make accessing varlenas randomly faster. Note
that null fields are indicated as such in the null bitmap, so I'm pretty
sure that their in-tuple position doesn't matter much. Of course, if you
want the definitive answer, Use The Source.
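
For what it's worth, the experiment could look something like the sketch
below: fixed-width bytes first, then one 4-byte length per varlena, then
the varlena payloads. This layout is purely hypothetical (pack_alt and
get_varlena are names I made up), and nulls and alignment are left out.

```python
import struct


def pack_alt(fixed: bytes, varlenas):
    """Fixed-width bytes, then all varlena lengths, then the payloads."""
    lengths = struct.pack(f"<{len(varlenas)}I", *(len(v) for v in varlenas))
    return fixed + lengths + b"".join(varlenas)


def get_varlena(row: bytes, fixed_size: int, nvar: int, k: int) -> bytes:
    """Locate varlena k from the length table without reading payloads."""
    lengths = struct.unpack_from(f"<{nvar}I", row, fixed_size)
    start = fixed_size + 4 * nvar + sum(lengths[:k])
    return row[start:start + lengths[k]]
```

You still add up the preceding lengths, but they sit in one small
contiguous array instead of being scattered between the payloads.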
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
