Re: What's the fastest way to do this?

From: Andrew Gould <andrewgould(at)yahoo(dot)com>
To: o2(at)trustcommerce(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: What's the fastest way to do this?
Date: 2001-11-08 22:13:45
Message-ID: 20011108221345.92109.qmail@web13409.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Are you updating directly from the flat file? Or are
you copying the new data into an indexed table that is
emptied and vacuumed after daily updates, and
updating/inserting from there?

Andrew Gould

--- Orion <o2(at)trustcommerce(dot)com> wrote:
>
> I have several really big tables that have columns
> uniquely identified by
> single or multiple rows. [ I have about 25 tables,
> 10k to 500k rows
> per table ]
>
> Each day I get a flat file of updates. I have no
> way of knowing which
> lines in the file are new records and which are
> updates for existing
> records.
>
> I need a way to insert the new ones and update the
> old ones. I have
> a couple of ideas but none of them seem fast enough
> ( I will soon
> be getting updates faster than I can feed them into
> the database ).
>
> I am running postgres 7.1.3-1PDGD.
>
> Here are my ideas...
>
>
---------------------------------------------------------
>
> 1)
>
> select count(*) from table where id = X;
> (if the count == 1)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 2)
>
> insert into table (id, a, b, c) values (X, 1, 2, 3);
> (if there is an exception b/c of duplicate keys)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 3)
>
> delete from table where id = X;
> (if the record is new nothing happens)
> insert into table (id, a, b, c) values (X, 1, 2, 3);
>
>
----------------------------------------------------------
>
> right now I am doing (3) but it is very slow - I
> am getting about 8 records/second through and
> yes my id column is indexed. The other problem
> with (3) is it requires more vacuuming.
>
> I cant imagine that (1) would be any faster.
>
> It would seem that (2) would be the fastest but some
> of my tables key on multiple columns
> (where id1 = X and id2 = Y) and thus cant use a
> primary
> key to raise an exception.
>
> I cant imagine I am the first person to try this...
> What's the standard/best/fastest way to do this.
>
> Thanks in advance for any advice!
>
> Orion
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

__________________________________________________
Do You Yahoo!?
Find a job, post your resume.
http://careers.yahoo.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephan Szabo 2001-11-08 22:32:27 Re: Performance: Sum()
Previous Message David Link 2001-11-08 21:57:38 Performance: Sum()