Bulk read

From: Nathaniel <naptrel(at)yahoo(dot)co(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Bulk read
Date: 2009-09-14 11:55:43
Message-ID: 872141.19868.qm@web25003.mail.ukl.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In the context of writing to and reading from a postgres DB from a client application, I've spent quite a while looking through the postgres manual, these mailing lists and the internet, and there is quite a bit of information available about bulk-loading. The favoured strategy seems to be to:

1. write records to a temporary file (binary is preferable),
2. read that file into a temporary table (through a COPY BINARY sql statement),
3. copy temporary table's records to destination table through an "INSERT INTO" call.

This strategy provides efficiency and (through the use of a mapping from the temporary table to the destination table) flexibility.

But I can't find any info/strategies on bulk reads!

I want to perform a query on a single table/view and extract multiple results (100,000+ records) into my client application. I'm using ECPG. I've only come up with 2 strategies, and have yet to implement either:

a. Use cursors and FETCH commands.
b. Do bulk-loading in reverse.

I've talked to someone with experience of (a), and things weren't quite
a simple as you'd expect. To maximise performance it proved best to
fetch chunks of results at a time (around 10,000) and the code was ugly
to write as a generic 'bulk read' function.

(b) would go something like this:

1. query the source table and
write the results into a temporary table (something like
"INSERT INTO tmp_table SELECT * FROM source_table WHERE ....")
2. write the temporary table to a binary file (using "COPY BINARY TO")
3. read by the temporary file into the client application.

I fancy the look of (2), but I'm a novice and can't find any reference to such a strategy anywhere, suggesting wise folks know to stay well clear.

Can anyone comment on either approach, or offer an alternative strategy?

Thanks, Nathaniel

Browse pgsql-general by date

  From Date Subject
Next Message Angus Miller 2009-09-14 13:09:53 Cartesian product not correct
Previous Message Magnus Hagander 2009-09-14 11:27:54 Re: Regarding initdb & pg_ctl