Re: Raw device I/O for large objects

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Georgi Chulkov <godji(at)metapenguin(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Raw device I/O for large objects
Date: 2007-09-18 09:57:17
Message-ID: 46EFA0FD.4030104@bluegap.ch
Lists: pgsql-hackers

Hi,

Georgi Chulkov wrote:
> Please allow me to ask then:
> 1. In your opinion, would the above scenario indeed benefit from a raw-device
> interface for large objects?

No, because file systems also try to do what you outline above. They
certainly don't split sequential data up into blocks and distribute them
randomly over the device, at least not without having a pretty good
reason to do so (a reason you would have to contend with as well).

The achievable gain is pretty minimal, especially in conjunction with a
(hopefully battery-backed) write cache.

> 2. How feasible it is to decouple general table storage from large object
> storage?

I think that would be the easiest part. I would go for a pluggable
storage implementation, selectable per tablespace. But then again, I
wouldn't do it at all. After all, this is what MySQL is doing. And we
certainly don't want to repeat their mistakes! Or do you know anybody
who goes like: "Yippee, multiple storage engines to choose from for my
(un)valuable data, let's put some here and others there..."?

Let's optimize the *one* storage engine we have and try to make that
work well together with the various filesystems it uses. Because
filesystems are already very good at what they are used for. (And we are
glad we can use a filesystem and don't need to implement one ourselves).

Regards

Markus
