Re: Speedier count(*)

From: PFC <lists(at)boutiquenumerique(dot)com>
To: "Tino Wildenhain" <tino(at)wildenhain(dot)de>, "Gavin Sherry" <swm(at)alcove(dot)com(dot)au>
Cc: "Mark Cotner" <mcotner(at)yahoo(dot)com>, "Dan Harris" <fbsd(at)drivefaster(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Speedier count(*)
Date: 2005-08-11 12:08:34
Message-ID: op.svceokgbth1vuj@localhost
Lists: pgsql-performance

> You could lock the count table to prevent the problem
> where 2 competing transactions do an insert, read the
> start value and add 1 to it and then write the result
> - which is n+1 rather than n+2 - so you are off by one.
> Think of the same when one transaction inserts 100
> and the other 120. Then you could even be off by 100.

No.

If your trigger does UPDATE counts_cache SET cached_count =
cached_count+N WHERE ...
then all locking is taken care of by Postgres: the read and the increment
happen atomically within the single statement.
Of course, if you use two queries (a SELECT followed by an UPDATE) then
you have locking issues.
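As a minimal sketch of such a trigger (the table `items` and the
`counts_cache` schema are illustrative assumptions, not from the original
post):

```sql
-- Hypothetical schema: one cache row per counted table.
CREATE TABLE counts_cache (
    table_name   text PRIMARY KEY,
    cached_count bigint NOT NULL
);

CREATE OR REPLACE FUNCTION maintain_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- read + increment happen atomically in one statement
        UPDATE counts_cache SET cached_count = cached_count + 1
         WHERE table_name = TG_TABLE_NAME;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE counts_cache SET cached_count = cached_count - 1
         WHERE table_name = TG_TABLE_NAME;
    END IF;
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_count
AFTER INSERT OR DELETE ON items
FOR EACH ROW EXECUTE PROCEDURE maintain_count();
```

Because the UPDATE is a single statement, Postgres serializes concurrent
increments on the row correctly; no explicit table lock is needed.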

However the UPDATE on counts_cache has a problem, i.e. it locks this row
FOR UPDATE for the whole transaction, and all transactions which want to
update the same row must wait to see whether the update commits or rolls
back. So if you have one count cache row for the whole table you get MySQL
style scalability...

To preserve scalability you could, instead of an UPDATE, INSERT the delta
of rows inserted/deleted into a deltas table (which has no concurrency
issues, since rows are only ever appended) and compute the current count
as the cached value plus the sum() of the deltas; then, with a cron job,
consolidate the deltas into the count_cache table so that the deltas table
stays very small.
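The delta scheme above could look roughly like this (again, the table and
column names are illustrative assumptions):

```sql
-- Append-only delta table: no row is ever updated, so concurrent
-- writers never block each other.
CREATE TABLE counts_delta (
    table_name text   NOT NULL,
    delta      bigint NOT NULL
);

-- The trigger then does an INSERT instead of an UPDATE, e.g.:
--   INSERT INTO counts_delta VALUES (TG_TABLE_NAME, +1);  -- or -1 on DELETE

-- Current count = consolidated value + outstanding deltas:
SELECT c.cached_count + COALESCE(sum(d.delta), 0) AS current_count
  FROM counts_cache c
  LEFT JOIN counts_delta d ON d.table_name = c.table_name
 WHERE c.table_name = 'items'
 GROUP BY c.cached_count;

-- Periodic consolidation, run from cron. SERIALIZABLE makes both
-- statements see the same snapshot, so deltas committed mid-transaction
-- are neither summed nor deleted.
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE counts_cache c
   SET cached_count = cached_count
       + (SELECT COALESCE(sum(delta), 0) FROM counts_delta d
           WHERE d.table_name = c.table_name);
DELETE FROM counts_delta;
COMMIT;
```

The trade-off is that reading the count now requires a sum() over the
outstanding deltas, which is why keeping the deltas table small via
frequent consolidation matters.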
