Select count(*) on a 2B Rows Tables Takes ~20 Hours

From:	Fd Habash <fmhabash(at)gmail(dot)com>
To:	"pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject:	Select count(*) on a 2B Rows Tables Takes ~20 Hours
Date:	2018-09-13 17:33:54
Message-ID:	5b9a9f81.1c69fb81.43388.0e8b@mx.google.com
Lists:	pgsql-performance

Based on my research in the forums and on Google, multiple sources say that ‘select count(*)’ is expected to be slow in Postgres because the MVCC visibility checks imposed on the query lead to a full table scan. Also, the elapsed time increases linearly with table size.

However, I do not know whether the elapsed time I’m getting is to be expected.

Table reltuples in pg_class = 2,266,649,344 (pretty close)
Query = select count(*) from jim.sttyations ;
Elapsed time (ET) = 18.5 hrs
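
For reference, the row estimate quoted above can be read from the catalog without scanning the table. This is a minimal sketch; the value is the planner's estimate as of the last VACUUM or ANALYZE, not an exact count.

```sql
-- Planner's row estimate for the table, read from the system catalog.
-- The cast to regclass resolves the schema-qualified name to the table's OID.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE oid = 'jim.sttyations'::regclass;
```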

This is an Aurora cluster running on r4.2xlarge (8 vCPU, 61 GB). CPU usage during the count run hovers around 20%, with 20 GB of freeable memory.

Is this ET expected? If not, what could be slowing it down? I’m currently running explain analyze and I’ll share the final output when done.
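
As a back-of-the-envelope check, the figures above imply a throughput of roughly 34,000 rows per second. The sketch below just divides the reltuples estimate by the elapsed time in seconds; it treats the estimate as the true row count, which is an assumption.

```sql
-- 2,266,649,344 rows counted in 18.5 hours (18.5 * 3600 = 66,600 seconds).
SELECT round(2266649344 / (18.5 * 3600)) AS rows_per_second;
```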

I’m familiar with the ideas listed here: https://www.citusdata.com/blog/2016/10/12/count-performance/

----------------
Thank you

refpep-> select count(*) from jim.sttyations;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=73451291.77..73451291.78 rows=1 width=8)
   Output: count(*)
   ->  Index Only Scan using stty_indx_fk03 on jim.sttyations  (cost=0.58..67784668.41 rows=2266649344 width=0)
         Output: vsr_number
(4 rows)
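
The plan above uses an Index Only Scan, which can skip the heap only for pages marked all-visible in the visibility map. One way to see how much of the table qualifies is the sketch below. It assumes the pg_class statistics are reasonably current (they are refreshed by VACUUM and ANALYZE) and that Aurora exposes them the same way as community PostgreSQL.

```sql
-- Share of heap pages marked all-visible in the visibility map.
-- A low percentage means the index-only scan still has to visit
-- the heap for most rows to check their visibility.
SELECT relpages,
       relallvisible,
       round(100.0 * relallvisible / nullif(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE oid = 'jim.sttyations'::regclass;
```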
