Re: Raw device I/O for large objects

From: Georgi Chulkov <godji(at)metapenguin(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Raw device I/O for large objects
Date: 2007-09-18 03:48:25
Message-ID: 200709180548.25277.godji@metapenguin.org
Lists: pgsql-hackers

Hi,

> We've heard this idea proposed before, and it's been shot down as a poor
> use of development effort every time. Check the archives for previous
> threads, but the basic argument goes like this: when Oracle et al did
> that twenty years ago, it was a good idea because (1) operating systems
> tended to have sucky filesystems, (2) performance and reliability
> properties of same were not very consistent across platforms, and (3)
> being large commercial software vendors they could afford to throw lots
> of warm bodies at anything that seemed like a bottleneck. None of those
> arguments holds up well for us today however. If you think you want to
> reimplement a filesystem you need to have some pretty concrete reasons
> why you can outsmart all the smart folks who have worked on
> your-favorite-OS's filesystems for lo these many years. There's also
> the fact that on any reasonably modern disk hardware, "raw I/O" is
> anything but.

Thanks, I agree with all your arguments.

Here's the reason why I'm looking at raw device storage for large objects only
(as opposed to all tables): with raw device I/O I can control, to an extent,
spatial locality. So, if I have an application that wants to store N large
objects (totaling several gigabytes) and read them back in some order that is
well-known in advance, I could store my large objects in that order on the
raw device.* Sequentially reading them back would then be very efficient.
With a file system underneath, I don't have that freedom. (Such a scenario
occurs with raster databases, for example.)

* assuming I have a way to communicate these requirements; that's a whole new
problem
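
A minimal sketch of the layout idea above, assuming an ordinary file stands in for the raw device (on a real file system the physical placement is still up to the file system, so this only illustrates the bookkeeping, not the locality guarantee). The names pack_in_read_order, read_in_order and demo are hypothetical and not part of PostgreSQL:

```python
import os
import tempfile


def pack_in_read_order(path, objects):
    """Write objects back to back in the given order.

    Returns one (offset, length) pair per object, in the same order.
    """
    extents = []
    with open(path, "wb") as dev:
        for blob in objects:
            extents.append((dev.tell(), len(blob)))
            dev.write(blob)
    return extents


def read_in_order(path, extents):
    """Read objects back by ascending offset, i.e. in one forward pass."""
    out = []
    with open(path, "rb") as dev:
        for offset, length in extents:
            dev.seek(offset)  # no-op when the extents are contiguous
            out.append(dev.read(length))
    return out


def demo():
    # Three stand-in "large objects", listed in the order they will be read.
    objects = [b"tile-0" * 4, b"tile-1" * 2, b"tile-2" * 3]
    fd, path = tempfile.mkstemp()  # stand-in for the raw device (assumption)
    os.close(fd)
    try:
        extents = pack_in_read_order(path, objects)
        return extents, read_in_order(path, extents)
    finally:
        os.remove(path)
```

Because each object starts where the previous one ends, reading them in the stored order never seeks backwards; that is the property I would want the storage layer to let me control.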

Please allow me to ask then:
1. In your opinion, would the above scenario indeed benefit from a raw-device
interface for large objects?
2. How feasible is it to decouple general table storage from large object
storage?

Thank you for your time,

Georgi
