Re: Array, bytea and large objects

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: d(dot)wall(at)computer(dot)org
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Array, bytea and large objects
Date: 2009-02-05 15:18:31
Message-ID: 92869e660902050718p4e6f8c9au1664cf6f6c953435@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2009/2/4 David Wall <d(dot)wall(at)computer(dot)org>

>
> First, LOs seem to allow an OID column to be added to any number of tables,
> but is it true that the actual large object data is stored in a single table
> (pg_largeobject?).

yes.

http://www.postgresql.org/docs/8.3/static/lo-intro.html

> If so, wouldn't this become a bottleneck if LOs were used frequently?
> Even vacuuming and vacuumlo must create a lot of pressure on that one table
> if LOs are used extensively. And can you backup a table with an OID column
> and get only those LOs referenced in the dump?
>
> Does the JDBC library support LO streaming? Can I receive data, compress,
> encrypt and stream into the database as well as do the opposite when reading
> it back?

http://jdbc.postgresql.org/documentation/83/index.html
see Large Objects and "Storing Binary Data"

transparent encryption is not implemented in the driver but of course it is
possible.

> If I have an "unlimited" number of name-value pairs that I'd like to get
> easy access to for flexible reports, could I store these in two arrays (one
> for name, the other for value) in a table so that if I had 10 name-value
> pairs or 200 name-value pairs, I could store these into a single row using
> arrays so I could retrieve all name-value pairs in a single SELECT from the
> db?

yes you could, but what for?
what's wrong with many rows? create table kvstore(key text primary key,value
text);
what stops you from using single SELECT to get all these kv pairs?

> How are these arrays stored -- does it use an underlying type like LO or
> BYTEA?
>
> How big can an LO get? Is it 2GB?

no, much larger. I guess it's limited by max table size.

see http://www.postgresql.org/docs/faqs.FAQ.html#item4.4

>
> How many LO fields can I have in a database?

as many as the max number of different OIDs .. minus the number of system
objects
so I think 2^31 large objects is possible without problem.

> It seems that the LO may even be implemented as an OID with one or more
> BYTEA storage structure in the pg_largeobject table (loid,pageno,data). Is
> that true?

that's how it is actually implemented - you probably know it already :)

>
>
> How big is a "page"? Maybe an LO is more efficient than a BYTEA if it's
> bigger than one page?

default and recommended page size is 8 kB.

>
>
> How big can a BYTEA get? Is it 1GB?

Yes.

>
> At what size does it make more sense to store in LO instead of a BYTEA
> (because of all the escaping and such)?
> How many BYTEA fields can I have in a database?

no limit (other than limits mentioned in the FAQ)

> Are the BYTEA fields stored in the same table as the rest of the data?

yes - and the TOAST tables if it's larger than 1/3 of a page or so. search
for TOAST details if you're interested.

> I believe this is yes, so a backup of that table will include the binary
> data, too, correct?
>
yes

>
> How big can an ARRAY get? Is it 1GB?

yes

>
> How many ARRAY fields can I have in a table or database? Are there
> limits?

same as bytea or any other type

>
> Are the ARRAY fields stored in the same table as the rest of the data?

I guess ARRAYs are serialized and stored according to same rules as any
other data (TOAST mechanism).

>
>
> Sorry for all the questions, but I'm trying to research it but the info is
> not always clear (and perhaps some of the stuff I find is not even true).

most of your questions are answered in the documentation and FAQ.

>
>
> I am wondering if when my encrypted XML data is small, should I choose to
> store it in a table using BYTEA so that each "record" in my application
> (which uses the encrypted XML name-value storage) is not forced to be in a
> single pg_largeobject table, and use LO when my data reaches a threshold
> size? Thoughts?

I'd avoid LO unless you really need streaming (block-wise) access.

cheers,

--
Filip Rembiałkowski
JID,mailto:filip(dot)rembialkowski(at)gmail(dot)com
http://filip.rembialkowski.net/

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message A.M. 2009-02-05 15:45:12 Re: Pet Peeves?
Previous Message Sam Mason 2009-02-05 12:16:33 Re: running postgres