Re: One tuple per transaction

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: One tuple per transaction
Date: 2005-03-14 01:41:30
Message-ID: d12qcr$a1r$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

""Tambet Matiisen"" <t(dot)matiisen(at)aprote(dot)ee> writes
> Hi!
>
> In one of our applications we have a database function, which
> recalculates COGS (cost of good sold) for certain period. This involves
> deleting bunch of rows from one table, inserting them again in correct
> order and updating them one-by-one (sometimes one row twice) to reflect
> current state. The problem is, that this generates an enormous amount of
> tuples in that table.
>
> If I'm correct, the dead tuples must be scanned also during table and
> index scan, so a lot of dead tuples slows down queries considerably,
> especially when the table doesn't fit into shared buffers any more. And
> as I'm in transaction, I can't VACUUM to get rid of those tuples. In one
> occasion the page count for a table went from 400 to 22000 at the end.

Not exactly. The dead tuple in the index will be scanned the first time (and
its pointed heap tuple as well), then we will mark it dead, then next time
we came here, we will know that the index tuple actually points to a uesless
tuple, so we will not scan its pointed heap tuple.

>
> All this made me wonder, why is new tuple created after every update?
> One tuple per transaction should be enough, because you always commit or
> rollback transaction as whole. And my observations seem to indicate,
> that new index tuple is created after column update even if this column
> is not indexed.

This is one cost of MVCC. A good thing of MVCC is there is no conflict
between read and write - maybe some applications need this.

A reference could be found here:

http://www.postgresql.org/docs/8.0/static/storage-page-layout.html#HEAPTUPLEHEADERDATA-TABLE

>
> One tuple per transaction would save a loads of I/O bandwidth, so I
> believe there must be a reason why it isn't implemented as such. Or were
> my assumptions wrong, that dead tuples must be read from disk?
>
> Tambet
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2005-03-14 04:36:13 Re: Postgres on RAID5
Previous Message Tom Lane 2005-03-14 00:08:41 Re: [PERFORM] How to read query plan