Re: Speedier count(*)

From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: PFC <lists(at)boutiquenumerique(dot)com>
Cc: Gavin Sherry <swm(at)alcove(dot)com(dot)au>, Mark Cotner <mcotner(at)yahoo(dot)com>, Dan Harris <fbsd(at)drivefaster(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Speedier count(*)
Date: 2005-08-11 12:34:00
Message-ID: 1123763640.27613.18.camel@Andrea.peacock.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Am Donnerstag, den 11.08.2005, 14:08 +0200 schrieb PFC:
>
> > You could lock the count table to prevent the problem
> > where 2 competing transactions do an insert, read the
> > start value and add 1 to it and then write the result
> > - which is n+1 rather then n+2 - so you are off by one.
> > Think of the same when one transaction inserts 100
> > and the other 120. Then you could even be off by 100.
>
> Niet.
>
> If your trigger does UPDATE counts_cache SET cached_count =
> cached_count+N WHERE ...
> Then all locking is taken care of by Postgres.
> Of course if you use 2 queries then you have locking issues.

Yes, in the case you use just the UPDATE statement you are right. This
does the locking I was talking about.

In either case I'd use an after trigger and not before to minimize
the impact.

> However the UPDATE counts_cache has a problem, ie. it locks this row FOR
> UPDATE for the whole transaction, and all transactions which want to
> update the same row must wait to see if the update commits or rollbacks,
> so if you have one count cache row for the whole table you get MySQL style
> scalability...
>
> To preserve scalability you could, instead of UPDATE, INSERT the delta of
> rows inserted/deleted in a table (which has no concurrencies issues) and
> compute the current count with the sum() of the deltas, then with a cron,
> consolidate the deltas and update the count_cache table so that the deltas
> table stays very small.

Yes, this is in fact a better approach to this problem.

(All this provided you want an unqualified count() - as the
original poster)

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2005-08-11 12:49:10 Re: Why is not using the index
Previous Message Paul Johnson 2005-08-11 12:23:21 PG8 Tuning