From: salah jubeh <s_jubeh(at)yahoo(dot)com>
To: "Carlos Sotto Maior (SIM)" <csotto(at)sistemassim(dot)com(dot)br>, David Johnston <polobo(at)yahoo(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: RES: SELECT COUNT(*) execution time on large tables (v9.0.4-1)
Date: 2011-05-31 07:58:32
Message-ID: 47244.39008.qm@web161504.mail.bf1.yahoo.com
Lists: pgsql-general
Please have a look at pg_stat_user_tables; there is a column called n_live_tup.
But keep in mind that in some cases this might not hold the actual row count.
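For example, a query along these lines (the table name is just a placeholder; n_live_tup is an estimate maintained by the statistics collector, so it can lag behind the true count):

    SELECT relname, n_live_tup
    FROM pg_stat_user_tables
    WHERE relname = 'your_table';   -- placeholder table name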
Regards
________________________________
From: Carlos Sotto Maior (SIM) <csotto(at)sistemassim(dot)com(dot)br>
To: David Johnston <polobo(at)yahoo(dot)com>; pgsql-general(at)postgresql(dot)org
Sent: Mon, May 30, 2011 8:35:39 PM
Subject: RES: [GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)
David, thanks for your reply.
I will probably use the strategy of a trigger-driven counter, combined with a
temporal approach: keep a cached total for everything up to the end of last
month and only count the current month, since the current month changes rapidly.
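A minimal sketch of that combination, with hypothetical names (monthly_totals holds the cached per-month counts, big_table is the large table, created_at is its timestamp column):

    SELECT (SELECT coalesce(sum(row_count), 0)
              FROM monthly_totals                  -- hypothetical cached per-month counts
             WHERE month < date_trunc('month', now()))
         + (SELECT count(*)
              FROM big_table                       -- hypothetical large table
             WHERE created_at >= date_trunc('month', now())) AS total_rows;

Only the current month is scanned, so the cost stays roughly constant as the older data grows.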
I also apologize for not being investigative enough. I did look at the wiki, but
maybe I did not pursue my question there in the proper manner. I will surely
get more acquainted with the wiki, to keep this channel free of repeated
questions.
Once again, thank you very much.
Carlos Sotto Maior
+55 11 8244-7899
csotto(at)sistemassim(dot)com(dot)br
Sistemas Sim Serviços e Tecnologia Ltda.
+55 11 5041-3086
Rua Tenente Gomes Ribeiro, 78
Vila Clementino (near the Santa Cruz metro station)
São Paulo - SP
04038-040
-----Original Message-----
From: David Johnston [mailto:polobo(at)yahoo(dot)com]
Sent: Friday, May 27, 2011 17:49
To: 'Carlos Sotto Maior (SIM)'; pgsql-general(at)postgresql(dot)org
Subject: RE: [GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)
Counting live data is inherently imprecise. There are supposedly some
system tables that can give you rough numbers.
You would be better off figuring out an alternative method to get the data
you desire and stop continually recounting all 5.7M records.
A trigger-driven counter, maintained on insert and delete, is probably the most
obvious method. Also, for temporal data, cache the prior monthly counts
and only perform an actual count over the current (changing) month(s).
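A minimal sketch of such a counter, assuming a hypothetical large table big_table and a counter table row_counts:

    -- one counter row per tracked table
    CREATE TABLE row_counts (table_name text PRIMARY KEY, n bigint NOT NULL);
    INSERT INTO row_counts VALUES ('big_table', (SELECT count(*) FROM big_table));

    CREATE OR REPLACE FUNCTION maintain_row_count() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE row_counts SET n = n + 1 WHERE table_name = TG_TABLE_NAME;
        ELSIF TG_OP = 'DELETE' THEN
            UPDATE row_counts SET n = n - 1 WHERE table_name = TG_TABLE_NAME;
        END IF;
        RETURN NULL;  -- AFTER trigger: return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER big_table_count
        AFTER INSERT OR DELETE ON big_table
        FOR EACH ROW EXECUTE PROCEDURE maintain_row_count();

Reading the count is then a single-row lookup (SELECT n FROM row_counts WHERE table_name = 'big_table'). Note that under heavy concurrent writes the single counter row becomes a contention point, so treat this as a starting point rather than a finished design.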
At your table size the brute-force approach is obviously not going to work
so an alternative method needs to be devised, one that eliminates
re-counting previously counted records. The specific design is going to be
highly dependent on your specific requirements - which is why no generalized
solution exists. If you provide the why behind the question, and not just
the question, people may be inclined to provide relevant suggestions.
Issuing a "count(*)" is not a need - it is an implementation. The need is
what you end up doing with that number.
Lastly, the time you spent combing the system catalogs would have been
better spent perusing the FAQ linked from the PostgreSQL homepage. Your
question, in almost the same words, is in the FAQ, with a link to the wiki page
that repeats all your observations, explains why the behavior is that way,
and suggests (links to) possible alternatives. You may wish to go
there now to get more background and ideas.
David J.
> Hi,
>
> My application has a frequent need to issue a select count(*) on tables.
> Some have a large row count. (The examples below are from a 5.7 M row
> table; some are larger.)
>
> Issuing either SELECT COUNT(*) or SELECT COUNT(<Primary_Key_Column>)
> yields a sequential scan on the table.
>
> I have browsed the catalog tables, digging for a real-time row count, but so
> far did not find any.
> QUESTION: Is there a better (faster) way to obtain the row count from
> a table?
>