From: | Chris Browne <cbbrowne(at)acm(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Exposing the Xact commit order to the user |
Date: | 2010-06-03 20:21:35 |
Message-ID: | 871vcnx3xs.fsf@cbbrowne-laptop.afilias-int.info |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
gsstark(at)mit(dot)edu (Greg Stark) writes:
> On Wed, Jun 2, 2010 at 6:45 PM, Chris Browne <cbbrowne(at)acm(dot)org> wrote:
>> It would make it easy to conclude:
>>
>> "This next transaction did 8328194 updates. Maybe we should do
>> some kind of checkpoint (e.g. - commit transaction or such) before
>> working on it."
>>
>> versus
>>
>> "This transaction we're thinking of working on had 7 updates. No
>> big deal..."
>
> I'm puzzled how you would define this value. How do you add 7 inserts,
> 7 deletes, and 7 updates? Is that 21 rows modified? Why are the 7
> inserts and 7 deletes worth twice as much as the 7 updates when
> they're basically the same thing? What if the inserts fired triggers
> which inserted 7 more rows, is that 14? What if the 7 updates modified
> 2 TB of TOAST data but the 8238194 updates were all to the same record
> and they were all HOT updates so all it did was change 8kB?
The presence of those questions (and their ambiguity) is the reason
why there's a little squirming as to whether this is super-useful and
super-necessary.
What this offers is *SOME* idea of how much updating work a particular
transaction did. It's a bit worse than you suggest:
- If replication triggers have captured tuples, those would get
counted.
- TOAST updates might lead to extra updates being counted.
But back to where you started, I'd anticipate 7 inserts, 7 deletes,
and 7 updates being counted as something around 21 updates.
And if that included 5 TOAST changes, it might bump up to 26.
If there were replication triggers in place, that might bump the count
up to 45 (which I chose arbitrarily).
> In any case you'll have all the actual data from your triggers or
> hooks or whatever so what value does having the system keep track of
> this add?
This means that when we'd pull the list of transactions to consider,
we'd get something like:
select * from next_transactions('4218:23', 50);
[list of 50 transactions returned, each with...
-> txid
-> START timestamp
-> COMMIT timestamp
-> Approximate # of updates
Then, for each of the 50, I'd pull replication log data for the
corresponding transaction.
If I have the approximate # of updates, that might lead me to stop
short, and say:
"That next update looks like a doozy! I'm going to stop and commit
what I've got before doing that one."
It's not strictly necessary, but would surely be useful for flow
control.
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
"MS apparently now has a team dedicated to tracking problems with
Linux and publicizing them. I guess eventually they'll figure out
this back fires... ;)" -- William Burrow <aa126(at)DELETE(dot)fan(dot)nb(dot)ca>
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2010-06-03 20:43:40 | Re: clarification on walsender protocol document |
Previous Message | Tom Lane | 2010-06-03 20:17:43 | clarification on walsender protocol document |