Quick Links

What's the fastest way to do this?

From:	Orion <o2(at)trustcommerce(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	What's the fastest way to do this?
Date:	2001-11-08 19:58:49
Message-ID:	9sentg$1hkt$1@news.tht.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

I have several really big tables that have columns uniquely identified by
single or multiple rows. [ I have about 25 tables, 10k to 500k rows
per table ]

Each day I get a flat file of updates. I have no way of knowing which
lines in the file are new records and which are updates for existing
records.

I need a way to insert the new ones and update the old ones. I have
a couple of ideas but none of them seem fast enough ( I will soon
be getting updates faster than I can feed them into the database ).

I am running postgres 7.1.3-1PDGD.

Here are my ideas...

---------------------------------------------------------

select count(*) from table where id = X;
(if the count == 1)
update table set a = 1, b = 2, c = 3 where id = X;

insert into table (id, a, b, c) values (X, 1, 2, 3);
(if there is an exception b/c of duplicate keys)
update table set a = 1, b = 2, c = 3 where id = X;

delete from table where id = X;
(if the record is new nothing happens)
insert into table (id, a, b, c) values (X, 1, 2, 3);

----------------------------------------------------------

right now I am doing (3) but it is very slow - I
am getting about 8 records/second through and
yes my id column is indexed. The other problem
with (3) is it requires more vacuuming.

I cant imagine that (1) would be any faster.

It would seem that (2) would be the fastest but some
of my tables key on multiple columns
(where id1 = X and id2 = Y) and thus cant use a primary
key to raise an exception.

I cant imagine I am the first person to try this...
What's the standard/best/fastest way to do this.

Thanks in advance for any advice!

Orion

Responses

Re: What's the fastest way to do this? at 2001-11-08 22:13:45 from Andrew Gould
Re: What's the fastest way to do this? at 2001-11-09 06:38:59 from Einar Karttunen

Browse pgsql-general by date

	From	Date	Subject
Next Message	Rich Ryan	2001-11-08 20:54:19	Re: pg_dump usage - problems with restore due to the use of tab delimiters
Previous Message	Evelio Martínez	2001-11-08 19:52:45	How to optimize a column type change???