From: | Andrew Gould <andrewgould(at)yahoo(dot)com> |
---|---|
To: | o2(at)trustcommerce(dot)com, pgsql-general(at)postgresql(dot)org |
Subject: | Re: What's the fastest way to do this? |
Date: | 2001-11-08 22:13:45 |
Message-ID: | 20011108221345.92109.qmail@web13409.mail.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Are you updating directly from the flat file? Or are
you copying the new data into an indexed table that is
emptied and vacuumed after daily updates, and
updating/inserting from there?
Andrew Gould
--- Orion <o2(at)trustcommerce(dot)com> wrote:
>
> I have several really big tables that have columns
> uniquely identified by
> single or multiple rows. [ I have about 25 tables,
> 10k to 500k rows
> per table ]
>
> Each day I get a flat file of updates. I have no
> way of knowing which
> lines in the file are new records and which are
> updates for existing
> records.
>
> I need a way to insert the new ones and update the
> old ones. I have
> a couple of ideas but none of them seem fast enough
> ( I will soon
> be getting updates faster than I can feed them into
> the database ).
>
> I am running postgres 7.1.3-1PDGD.
>
> Here are my ideas...
>
>
---------------------------------------------------------
>
> 1)
>
> select count(*) from table where id = X;
> (if the count == 1)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 2)
>
> insert into table (id, a, b, c) values (X, 1, 2, 3);
> (if there is an exception b/c of duplicate keys)
> update table set a = 1, b = 2, c = 3 where id = X;
>
> 3)
>
> delete from table where id = X;
> (if the record is new nothing happens)
> insert into table (id, a, b, c) values (X, 1, 2, 3);
>
>
----------------------------------------------------------
>
> right now I am doing (3) but it is very slow - I
> am getting about 8 records/second through and
> yes my id column is indexed. The other problem
> with (3) is it requires more vacuuming.
>
> I cant imagine that (1) would be any faster.
>
> It would seem that (2) would be the fastest but some
> of my tables key on multiple columns
> (where id1 = X and id2 = Y) and thus cant use a
> primary
> key to raise an exception.
>
> I cant imagine I am the first person to try this...
> What's the standard/best/fastest way to do this.
>
> Thanks in advance for any advice!
>
> Orion
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
__________________________________________________
Do You Yahoo!?
Find a job, post your resume.
http://careers.yahoo.com
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2001-11-08 22:32:27 | Re: Performance: Sum() |
Previous Message | David Link | 2001-11-08 21:57:38 | Performance: Sum() |