From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jan Wieck <JanWieck(at)Yahoo(dot)com> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Coping with huge deferred-trigger lists |
Date: | 2001-05-09 15:38:39 |
Message-ID: | 20906.989422719@sss.pgh.pa.us |
Lists: | pgsql-hackers |

I had a thought just now about how to deal with the TODO item about
coping with deferred trigger lists that are so long as to overrun
main memory. This might be a bit harebrained, but I offer it for
consideration:

What we need to do, at the end of a transaction in which deferred
triggers were fired, is to find each tuple that was inserted or
updated in the current transaction in each table that has such
triggers. Well, we know where those tuples are: to a first
approximation, they're all near the end of the table. Perhaps instead
of storing each and every trigger-related tuple in memory, we only need
to store one value per affected table: the lowest CTID of any tuple
that we need to revisit for deferred-trigger purposes. At the end of
the transaction, scan forward from that point to the end of the table,
looking for tuples that were inserted by the current xact. Process each
one using the table's list of deferred triggers.
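
A minimal sketch of the bookkeeping described above, written in Python rather than backend C so it can be run standalone. Every name here (HeapTuple, Table, DeferredTriggerState, note_tuple, fire_at_commit) is hypothetical and not part of the PostgreSQL source; a CTID is modeled as a (block, offset) pair, and "inserted by the current xact" is modeled as the tuple's xmin matching the transaction's xid.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: a CTID is modeled as (block number, offset), compared in that order.
Ctid = Tuple[int, int]


@dataclass
class HeapTuple:
    ctid: Ctid
    xmin: int                   # xid of the transaction that inserted this version
    xmax: Optional[int] = None  # xid of the transaction that deleted it, if any


@dataclass
class Table:
    name: str
    tuples: List[HeapTuple] = field(default_factory=list)  # kept in CTID order
    deferred_triggers: List[Callable[[HeapTuple], None]] = field(default_factory=list)


class DeferredTriggerState:
    """Per-transaction state: one CTID per affected table, not one entry per tuple."""

    def __init__(self, xid: int) -> None:
        self.xid = xid
        self.lowest_ctid: Dict[str, Ctid] = {}

    def note_tuple(self, table: Table, ctid: Ctid) -> None:
        # Remember only the lowest CTID that needs revisiting in this table.
        prev = self.lowest_ctid.get(table.name)
        if prev is None or ctid < prev:
            self.lowest_ctid[table.name] = ctid

    def fire_at_commit(self, tables: Dict[str, Table]) -> int:
        """Scan each affected table from its remembered CTID to the end."""
        fired = 0
        for name, start in self.lowest_ctid.items():
            table = tables[name]
            for tup in table.tuples:
                if tup.ctid < start:
                    continue  # a real scan would begin at `start` directly
                if tup.xmin != self.xid:
                    continue  # not inserted by the current transaction
                # The trigger list is read now, at end of transaction.
                for trigger in table.deferred_triggers:
                    trigger(tup)
                    fired += 1
        return fired
```

The per-transaction state grows with the number of affected tables only. Because the trigger list is consulted during the final scan, the sketch also applies whichever triggers exist at the end of the transaction.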

Instead of a list of all tuples subject to deferred triggers, we now
need only a list of all tables subject to deferred triggers, which
should pose no problems for memory consumption. It might be objected
that this means more disk activity --- but in an xact that hasn't
inserted very many tuples, most likely the disk blocks containing 'em
are still in memory and won't need a physical re-read. Once we get to
inserting so many tuples that that's not true, this approach should
require less disk activity overall than the previous idea of writing
(and re-reading) a separate disk file for the tuple list.

I am not sure exactly what the "triggered data change violation" test
does or is good for, but if we want to keep it, I *think* that in these
terms we'd just need to signal error if we come across a tuple that was
both inserted and deleted by the current xact. I'm a bit fuzzy on this
though.
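
As a rough illustration only, the condition floated above (a tuple both inserted and deleted by the current xact) could be tested as below. This is a Python sketch under the assumption that each tuple exposes its inserting and deleting transaction ids as xmin and xmax; the exception class and function name are hypothetical, and nothing here claims to reproduce what the existing test actually does.

```python
from typing import Optional


class TriggeredDataChangeViolation(Exception):
    """Hypothetical error type standing in for the backend's error report."""


def check_tuple(xmin: int, xmax: Optional[int], current_xid: int) -> None:
    # Signal an error only when the same transaction both created and
    # deleted this tuple version; xmax is None for a live tuple.
    if xmin == current_xid and xmax == current_xid:
        raise TriggeredDataChangeViolation(
            f"tuple inserted and deleted by transaction {current_xid}"
        )
```

A check like this would run during the same end-of-transaction scan, so it needs no extra per-tuple memory.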

An interesting property of this approach is that if the set of triggers
for the table changes during the xact (which could only happen if this
same xact created or deleted triggers; no other xact can, since changing
triggers requires an exclusive lock on the table), the set of triggers
applied to a tuple is the set that exists at the end of the xact, not
the set that existed when the tuple was modified. Offhand I think this
is a good change.

Comments?

regards, tom lane