From: Hannu Krosing <hannu(at)krosing(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BUG #4204: COPY to table with FK has memory leak
Date: 2008-05-29 06:44:34
Message-ID: 1212043474.7129.3.camel@huvostro
Lists: pgsql-bugs pgsql-hackers
On Wed, 2008-05-28 at 22:45 +0100, Simon Riggs wrote:
> On Wed, 2008-05-28 at 16:28 -0400, Tom Lane wrote:
> > Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> > > "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> > >> This is expected to take lots of memory because each row-requiring-check
> > >> generates an entry in the pending trigger event list.
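For context, a minimal sketch of the reported scenario (table names and
the data file path are illustrative):

    CREATE TABLE parent (id int PRIMARY KEY);
    CREATE TABLE child  (pid int REFERENCES parent (id));

    INSERT INTO parent SELECT g FROM generate_series(1, 100) g;

    -- Every row loaded queues one after-row RI trigger event, so the
    -- pending-event list grows linearly with the number of rows copied.
    COPY child (pid) FROM '/tmp/child.dat';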
> >
> > > Hm, it occurs to me that we could still do a join against the pending
> > > trigger event list... I wonder how feasible it would be to store the
> > > pending trigger event list in a temporary table instead of in RAM.
> >
> > We could make that list spill to disk, but the problem remains that
> > verifying the rows one at a time will take forever.
> >
> > The idea that's been kicked around occasionally is that once you get
> > past N pending events, throw them all away and instead queue a single
> > operation to do a bulk verify (just like initial establishment of the
> > FK constraint). I'm not sure how to do the queue management for this
> > though.
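For reference, establishing an FK over existing data boils down to an
outer-join check of roughly this shape (table and column names follow
the sketch above and are illustrative), which is what the queued bulk
verify would re-run:

    -- The constraint holds iff this returns no rows: it looks for
    -- child keys that have no matching parent key.
    SELECT c.pid
      FROM child c
      LEFT JOIN parent p ON p.id = c.pid
     WHERE c.pid IS NOT NULL
       AND p.id IS NULL
     LIMIT 1;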
>
> Neither of those approaches is really suitable. Just spilling to disk is
> O(N) in the number of rows loaded; the second one is at least O(N) in the
> number of rows (loaded + existing). The second one doesn't help either:
> if the table had been empty you'd have added the FK afterwards, so we
> must assume there are already rows in there, and in most cases the rows
> already loaded will exceed those being added by the bulk operation.
>
> AFAICS we must aggregate the trigger checks. We would need a special
> property of triggers that allowed them to be aggregated when two similar
> checks arrived. We can then use hash aggregation to accumulate them. We
> might conceivably need to spill to disk also, since the aggregation may
> not always be effective.
Can't we just run the checks for the accumulated FK keys at the point
where they no longer fit in memory, instead of spilling to disk?
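A rough sketch of what such a flush might look like, with pending_keys
standing in (hypothetically, pictured as a table for clarity) for the
in-memory accumulator of distinct keys:

    -- Verify every accumulated key in one pass; any row returned
    -- is an RI violation. Then reset the accumulator.
    SELECT k.pid
      FROM pending_keys k
     WHERE NOT EXISTS (SELECT 1 FROM parent p WHERE p.id = k.pid)
     LIMIT 1;

    TRUNCATE pending_keys;

That would keep the memory bound fixed while still checking each
distinct key only once per flush.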
> But in most cases the tables against which FK
> checks are made are significantly smaller than the tables being loaded.
> Once we have hash-aggregated the checks, that is then the first part of
> a hash join against the target table.
>
> We certainly need a TODO item for "improve RI checks during bulk
> operations".
Agreed.
----------------
Hannu