What's the fastest way to do this?

From: Orion <o2(at)trustcommerce(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: What's the fastest way to do this?
Date: 2001-11-08 19:58:49
Message-ID: 9sentg$1hkt$1@news.tht.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


I have several really big tables that have columns uniquely identified by
single or multiple rows. [ I have about 25 tables, 10k to 500k rows
per table ]

Each day I get a flat file of updates. I have no way of knowing which
lines in the file are new records and which are updates for existing
records.

I need a way to insert the new ones and update the old ones. I have
a couple of ideas but none of them seem fast enough ( I will soon
be getting updates faster than I can feed them into the database ).

I am running postgres 7.1.3-1PDGD.

Here are my ideas...

---------------------------------------------------------

1)

select count(*) from table where id = X;
(if the count == 1)
update table set a = 1, b = 2, c = 3 where id = X;

2)

insert into table (id, a, b, c) values (X, 1, 2, 3);
(if there is an exception b/c of duplicate keys)
update table set a = 1, b = 2, c = 3 where id = X;

3)

delete from table where id = X;
(if the record is new nothing happens)
insert into table (id, a, b, c) values (X, 1, 2, 3);

----------------------------------------------------------

right now I am doing (3) but it is very slow - I
am getting about 8 records/second through and
yes my id column is indexed. The other problem
with (3) is it requires more vacuuming.

I cant imagine that (1) would be any faster.

It would seem that (2) would be the fastest but some
of my tables key on multiple columns
(where id1 = X and id2 = Y) and thus cant use a primary
key to raise an exception.

I cant imagine I am the first person to try this...
What's the standard/best/fastest way to do this.

Thanks in advance for any advice!

Orion

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rich Ryan 2001-11-08 20:54:19 Re: pg_dump usage - problems with restore due to the use of tab delimiters
Previous Message Evelio Martínez 2001-11-08 19:52:45 How to optimize a column type change???