From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
Cc: Wei Weng <wweng(at)kencast(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance of insert/delete/update
Date: 2002-11-21 23:34:53
Message-ID: 200211211534.53358.josh@agliodbs.com
Lists: pgsql-hackers pgsql-performance
Scott,
> > This only works up to the limit of the memory you have available for
> > Postgres. If the updates in one transaction exceed your available
> > memory, you'll see a lot of swaps to disk log that will slow things
> > down by a factor of 10-50 times.
>
> Sorry, but that isn't true. MVCC means we don't have to hold all the data
> in memory, we can have multiple versions of the same tuples on disk, and
> use memory for what it's meant for, buffering.
Sorry, you're absolutely correct. I don't know what I was thinking of; that's
the problem with an off-the-cuff response.
Please disregard the previous quote. Instead:
Doing several large updates in a single transaction can lower performance if
the number of updates is sufficient to affect index usability and a VACUUM is
really needed between them. For example, a series of large data
transformation statements on a single table or set of related tables should
have VACUUM statements between them, which prevents you from putting them all
in a single transaction.
For example, the series:
1. INSERT 10,000 ROWS INTO table_a;
2. UPDATE 100,000 ROWS IN table_a WHERE table_b;
3. UPDATE 100,000 ROWS IN table_c WHERE table_a;
will almost certainly need a VACUUM, or even a VACUUM FULL table_a, after step
2), requiring you to split the series into two transactions. Otherwise, the
"where table_a" condition in step 3) will be extremely slow.
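To make the split concrete, here is a sketch of that series as two
transactions with a VACUUM between them. The table and column names, the
staging source, and the join conditions are all hypothetical, invented to
fill in the shorthand above; the point is only the shape of the split.
Note that VACUUM cannot run inside a transaction block, which is exactly why
the series cannot be one big transaction:

```sql
-- Transaction 1: the bulk insert and the first large update.
BEGIN;
INSERT INTO table_a SELECT * FROM staging_a;            -- ~10,000 rows (hypothetical source)
UPDATE table_a SET flag = true
  FROM table_b WHERE table_a.id = table_b.a_id;         -- ~100,000 rows
COMMIT;

-- Reclaim dead row versions and refresh statistics before the next step.
VACUUM ANALYZE table_a;

-- Transaction 2: the update whose WHERE clause depends on table_a.
BEGIN;
UPDATE table_c SET val = table_a.val
  FROM table_a WHERE table_c.a_id = table_a.id;         -- ~100,000 rows
COMMIT;
```

Without the intervening VACUUM, the dead tuples left by the first two
statements make the join against table_a in the final UPDATE far more
expensive.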
> Also note that many folks have replaced foreign keys with triggers and
> gained in performance, as fks in pgsql still have some deadlock issues to
> be worked out.
Yeah. I think Neil Conway is overhauling FKs, which everyone considers a bit
of a hack in the current implementation, including Jan who wrote it.
> > It can be dangerous though ... in the event of a power outage, for
> > example, your database could be corrupted and difficult to recover. So
> > ... "at your own risk".
>
> No, the database will not be corrupted, at least not in my experience.
> however, you MAY lose data from transactions that you thought were
> committed. I think Tom posted something about this a few days back.
Hmmm ... have you done this? I'd like the performance gain, but I don't want
to risk my data integrity. I've seen some awful things in databases (such as
duplicate primary keys) from yanking a power cord repeatedly.
> update table set field1=field1+1
>
> are killers in an MVCC database as well.
Yeah -- don't I know it.
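For anyone following along: under MVCC, an UPDATE writes a new version of
every affected row rather than overwriting in place, so a full-table
increment like that roughly doubles the table's heap footprint until VACUUM
reclaims the dead versions. A minimal sketch (hypothetical table name):

```sql
-- Each run rewrites every row; the old versions remain as dead tuples.
UPDATE counters SET field1 = field1 + 1;

-- Reclaim the dead row versions (and refresh planner stats) afterwards:
VACUUM ANALYZE counters;
```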
--
-Josh Berkus
Aglio Database Solutions
San Francisco