Quick Links

Re: Postgres Optimization: IO and Data Organization

From:	Doug McNaught <doug(at)mcnaught(dot)org>
To:	james(at)unifiedmind(dot)com (James Thornton)
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Postgres Optimization: IO and Data Organization
Date:	2004-05-05 22:18:25
Message-ID:	871xlyseku.fsf@asmodeus.mcnaught.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

james(at)unifiedmind(dot)com (James Thornton) writes:

> I want to understand how Postgres organizes data and handles IO
> operations so that I will better know how to optimize a Postgres
> database server. I am looking for answers to specific questions and
> pointers to where this stuff is documented.

If you haven't read the Postgres docs in detail that would be a good
place to start. :)

> How does Postgres organize its data? For example, is it grouped
> together on the disk, or is it prone to be spread out over the disk?
> Does vacuum reorganize the data? (Seeking to minimize disk head
> movement.)

Tables and indexes are stored in disk files in the filesystem, so PG
relies on the OS to lay out data on the disk.

> How does Postgres handle sequential IO? Does it treat is specially
> such as issuing large IO operations that span block boundaries?

The WAL (write-ahead log), a sort of journal, is written sequentially.
I"m not too familiar with whether WAL writes are ganged together if
possible, but I would think so.

> How does Postgres handle direct IOs (operations directly to disk,
> bypassing the buffer cache)? Will it issue multiple asynchronous IO
> operations?

No direct I/O, no async I/O. A background checkpoint process handles
a lot of the data writeback I/O.

> Is Postgres always one process per client, or can it spawn additional
> processes to parallelise some operations such as a nested loops join
> operation?

One process per client connection. Right now there is no spawning of
additional "worker" processes.

> Is there a recommended file system to use for Postgres data, such as
> ext2 or another non-journaling FS?

You definitely want a journaled FS or the equivalent, since losing
filesystem metadata on a crash can ruin your whole day, not to mention
the fsck times...

There doesn't seem to be a clear winner in the "which FS" debate. If
you use ext3, it's probably fastest to mount with 'data=writeback' for
your DB partition, since you can rely on PG to journal the data
writes. Most other FS's only journal metadata anyway.

Hope this helps!

-Doug

In response to

Postgres Optimization: IO and Data Organization at 2004-04-30 17:47:32 from James Thornton

Responses

Re: Postgres Optimization: IO and Data Organization at 2004-05-06 08:35:00 from Gaetano Mendola

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2004-05-05 22:22:47	Re: initdb failure in CVS
Previous Message	Rod Taylor	2004-05-05 22:12:04	Re: PostgreSQL pre-fork speedup