From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: INSERT times - same storage space but more fields -> much slower inserts |
Date: | 2009-04-14 15:56:14 |
Message-ID: | 20090414155614.GL8123@tamriel.snowman.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Craig,
* Craig Ringer (craig(at)postnewspapers(dot)com(dot)au) wrote:
> I've been doing some testing for the Bacula project, which uses
> PostgreSQL as one of the databases in which it stores backup catalogs.
We also use Bacula with a PostgreSQL backend.
> I've been evaluating a schema change for Bacula that takes a field
> that's currently stored as a gruesome-to-work-with base64-encoded
> representation of a binary blob, and expands it into a set of integer
> fields that can be searched, indexed, etc.
This would be extremely nice.
> The table size of the expanded form is marginally smaller than for the
> base64-encoded string version. However, INSERT times are *CONSIDERABLY*
> greater for the version with more fields. It takes 1011 seconds to
> insert the base64 version, vs 1290 seconds for the expanded-fields
> version. That's a difference of 279 seconds, or 27%.
>
> Despite that, the final table sizes are the same.
>
> If I use tab-separated input and COPY, the original-format file is
> 1300MB and the expanded-structure format is 1618MB. The performance hit
> on COPY-based insert is not as bad, at 161s vs 182s (13%), but still
> quite significant.
>
> Any ideas about what I might be able to do to improve the efficiency of
> inserting records with many integer fields?
Bacula should be using COPY for the batch data loads, so hopefully won't
suffer too much from having the fields split out. I think it would be
interesting to try doing PQexecPrepared with binary-format data instead
of using COPY though. I'd be happy to help you implement a test setup
for doing that, if you'd like.
Thanks,
Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Matthew Wakeling | 2009-04-14 16:01:08 | Re: INSERT times - same storage space but more fields -> much slower inserts |
Previous Message | Merlin Moncure | 2009-04-14 13:46:28 | Re: Nested query performance issue |