Quick Links

Re: COUNT and Performance ...

From:	Hans-Jürgen Schönig <hs(at)cybertec(dot)at>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, neilc(at)samurai(dot)com, eg(at)cybertec(dot)at
Subject:	Re: COUNT and Performance ...
Date:	2003-02-02 18:18:51
Message-ID:	3E3D610B.20703@cybertec.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

>
>
>But pgstattuple does do a sequential scan of the table. You avoid a lot
>of the executor's tuple-pushing and plan-node-traversing machinery that
>way, but the I/O requirement is going to be exactly the same.
>
>

I have tried it more often so that I can be sure that everything is in
the cache.
I thought it did some sort of "stat" on tables. Too bad :(.

>>If people want to count ALL rows of a table. The contrib stuff is pretty
>>useful. It seems to be transaction safe.
>>
>>
>
>Not entirely. pgstattuple uses HeapTupleSatisfiesNow(), which means you
>get a count of tuples that are committed good in terms of the effects of
>transactions committed up to the instant each tuple is examined. This
>is in general different from what count(*) would tell you, because it
>ignores snapshotting. It'd be quite unrepeatable too, in the face of
>active concurrent changes --- it's very possible for pgstattuple to
>count a single row twice or not at all, if it's being concurrently
>updated and the other transaction commits between the times pgstattuple
>sees the old and new versions of the row.
>
>
Interesting. I have tried it with concurrent sessions and transactions -
the results seemed to be right (I could not see the records inserted by
open transactions). Too bad :(. It would have been a nice work around.

>>The performance boost is great (PostgreSQL 7.3, RedHat, 166Mhz
>>
>
>I think your test case is small enough that the whole table is resident
>in memory, so this measurement only accounts for CPU time per tuple and
>not any I/O. Given the small size of pgstattuple's per-tuple loop, the
>speed differential is not too surprising --- but it won't scale up to
>larger tables.
>
>Sometime it would be interesting to profile count(*) on large tables
>and see exactly where the CPU time goes. It might be possible to shave
>off some of the executor overhead ...
>
> regards, tom lane
>
>

I have tried it with the largest table on my testing system.
Reducing the overhead is great :).

Thanks a lot,

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at <http://www.postgresql.at>, cluster.postgresql.at
<http://cluster.postgresql.at>, www.cybertec.at
<http://www.cybertec.at>, kernel.cybertec.at <http://kernel.cybertec.at>

In response to

Re: COUNT and Performance ... at 2003-02-02 18:04:14 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Curt Sampson	2003-02-02 18:51:36	Re: Linux.conf.au 2003 Report
Previous Message	Tom Lane	2003-02-02 18:04:14	Re: COUNT and Performance ...