Re: Databases compared at zend.com

From: Doug McNaught <doug(at)wireboard(dot)com>
To: <pgsql-general(at)commandprompt(dot)com>
Cc: Michael <mwaples(at)waples(dot)net>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Databases compared at zend.com
Date: 2001-06-01 18:34:50
Message-ID: m31yp43v79.fsf@belphigor.mcnaught.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

<pgsql-general(at)commandprompt(dot)com> writes:

> > Good article. The only inaccuracy I saw was that he claims Postgres'
> > storage of large objects is "in the filesystem" and "inefficient".
> > >From my reading of the docs this is not true--large objects are stored
> > within the database just as with any other data. The programming
> > interface to LOs may or may not be clunkier than that of other
> > databases, but the storage is no less efficient.
>
>
> As the person who wrote it :) It was my understanding that PostgreSQL
> stores large objects on the filesystem outside of the database tables.
> They may be indexed but I thought there was only an identifier within the
> table that pointed to the large object.

> If this is not true, could someone please describe the actual process, I
> can update the article.

You're close, but not completely accurate.

To quote the 7.1 docs:

[http://postgresql.readysetnet.com/users-lounge/docs/7.1/programmer/largeobjects.html]

2.1. Historical Note

Originally, Postgres 4.2 supported three standard implementations of
large objects: as files external to Postgres, as external files
managed by Postgres, and as data stored within the Postgres
database. It causes considerable confusion among users. As a result,
we only support large objects as data stored within the Postgres
database in PostgreSQL. Even though it is slower to access, it
provides stricter data integrity. For historical reasons, this storage
scheme is referred to as Inversion large objects. (We will use
Inversion and large objects interchangeably to mean the same thing in
this section.) Since PostgreSQL 7.1 all large objects are placed in
one system table called pg_largeobject.

It's interesting that the docs claim this table-level storage is
slower to access than using external files. On one respect, though,
it's more efficient than storing LOs in files--if you have a lot of LO
files in a single directory, access to those files can get very slow
on many filesystems (Linux ext2, BSD ufs) due to linear searching of
the directory.

So, from my knowledge and reading of the docs (I've used LOs a bit
but not extensively), here's my understanding:

* LOs are stored within a single system table (as above) which is
indexed by OID.
* User applications store references to LOs in other tables as column
type "oid".
* Postgres provides file-descriptor-style access (open, read, write,
seek, tell) to LOs through their OIDs. This is something you want,
since you don't want to be forced to read an entire LO into memory
in order to work with it.

It might be nice also to have more "convenient" interfaces to
manipulate large objects in a way that approximates "normal" column
data, for LOs that are not too big. You could probably do a lot of
this by writing custom functions.

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time... --Dylan

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2001-06-01 18:48:58 Re: Databases compared at zend.com
Previous Message Tom Lane 2001-06-01 18:25:59 Re: Databases compared at zend.com