From: | "Pierson Patricia L (Contractor)" <Patricia(dot)L(dot)Pierson(at)irs(dot)gov> |
---|---|
To: | "gogala(dot)mladen(at)gmail(dot)com" <gogala(dot)mladen(at)gmail(dot)com>, "michaeldba(at)sqlexec(dot)com" <michaeldba(at)sqlexec(dot)com> |
Cc: | "pgsql-admin(at)lists(dot)postgresql(dot)org" <pgsql-admin(at)lists(dot)postgresql(dot)org> |
Subject: | RE: [EXT] Re: Improve "select count(*)" query - takes more than 30 mins for some large tables |
Date: | 2022-07-12 18:13:08 |
Message-ID: | SJ0PR09MB6061D637A2A7988D19DC506FC7869@SJ0PR09MB6061.namprd09.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Hello,
Do a count on the primary key. Will force index access and you don’t access
the entire row which may be very long.
LIKE : select count(ID) from my_table;
From: Mladen Gogala <gogala(dot)mladen(at)gmail(dot)com>
Sent: Tuesday, July 12, 2022 11:58 AM
To: MichaelDBA Vitale <michaeldba(at)sqlexec(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: [EXT] Re: Improve "select count(*)" query - takes more than 30 mins
for some large tables
What's wrong with parallelism? That's why it was invented. If you really need
an accurate count at moment's notice, create a trigger to maintain it.
Regards
On Tue, Jul 12, 2022, 10:31 AM MichaelDBA Vitale <michaeldba(at)sqlexec(dot)com
<mailto:michaeldba(at)sqlexec(dot)com> > wrote:
Perhaps do an analyze on the table and then select reltuples from pg_class for
that table. Might be faster than the select count(*).
Regards,
Michael Vitale
On 07/12/2022 8:51 AM Mladen Gogala <gogala(dot)mladen(at)gmail(dot)com
<mailto:gogala(dot)mladen(at)gmail(dot)com> > wrote:
On 7/11/22 03:23, Florents Tselai wrote:
psql “select id from my_table" | sort -u | wc -l
That will be a lot slower than just "select count(*) from my_table". You are
delivering data to the user program (psql) and then shipping them to pipe and
then processing the output with "wc". Depending on the version, PostgreSQL has
very reliable parallelism and can do counting rather quickly. The speed of
"select count(*) from my_table" depends on the speed of I/O. Since the table
is big, it cannot be cached in the file system cache, so all that you have at
your disposal is the raw disk speed. For the smaller machines, NVME is the
king. For larger rigs, you should consider something like Pure, XTremIO or
NetApp SolidFire. People frequently expect database to do miracles with under
par hardware.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
From | Date | Subject | |
---|---|---|---|
Next Message | Holger Jakobs | 2022-07-12 18:25:32 | Re: [EXT] Re: Improve "select count(*)" query - takes more than 30 mins for some large tables |
Previous Message | Mladen Gogala | 2022-07-12 17:57:42 | Re: Improve "select count(*)" query - takes more than 30 mins for some large tables |