From: | Christopher Browne <cbbrowne(at)acm(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: *sigh* |
Date: | 2003-12-03 05:55:16 |
Message-ID: | m31xrmwim3.fsf@wolfe.cbbrowne.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Martha Stewart called it a Good Thing when Randolf Richardson <rr(at)8x(dot)ca> wrote:
>>> The count(*) information can be revisioned too, am I wrong ? I'm able
>>> to create a trigger that store the count(*) information in a special
>>> table, why not implement the same in a way "builded in" ?
>>
>> Then every insert or delete would have to lock that count. Nobody else
>> would be able to insert or delete any records until you either commit or
>> roll back.
>>
>> That would lead to much lower concurrency, much more contention for
>> locks, and tons of deadlocks.
>
> What about queueing all these updates for a separate
> low-priority thread? The thread would be the only one with access
> to update this field.
If updates are "queued," then how do you get to use them if the
"update thread" isn't running because it's not high enough in
priority?
I am not being facetious.
The one way that is expected to be successful would be to have a
trigger that, upon seeing an insert of 5 rows to table "ABC", puts,
into table "count_detail", something like:
insert into count_detail (table, value) values ('ABC', 5);
You then replace
select count(*) from abc;
with
select sum(value) from count_detail where table = 'ABC';
The "low priority" thread would be a process that does something akin
to vacuuming, where it would replace the contents of the table every
so often...
for curr_table in (select table from count_detail) do
new_total = select sum(value) from count_detail
where table = curr_table;
delete from count_detail where table = curr_table;
insert into count_detail (table, value) values (curr_table,
new_total);
done
The point of this being to try to keep the number of rows to 1 per
table.
Note that this gets _real_ expensive for tables that see lots of
single row inserts and deletes. There isn't a cheaper way that will
actually account for the true numbers of records that have been
committed.
For a small table, it will be cheaper to walk through and calculate
count(*) directly from the tuples themselves.
The situation where it may be worthwhile to do this is a table which
is rather large (thus count(*) is expensive) where there is some
special reason to truly care how many rows there are in the table.
For _most_ tables, it seems unlikely that this will be true. For
_most_ tables, it is absolutely not worth the cost of tracking the
information.
--
(format nil "~S(at)~S" "cbbrowne" "acm.org")
http://cbbrowne.com/info/spreadsheets.html
Predestination was doomed from the start.
From | Date | Subject | |
---|---|---|---|
Next Message | John Sidney-Woollett | 2003-12-03 08:08:49 | Transaction Question |
Previous Message | Claudio Natoli | 2003-12-03 05:30:40 | Re: [HACKERS] fork/exec problem: DynaHashCxt |