From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Surafel Temesgen <surafel3000(at)gmail(dot)com>, alvherre(at)2ndquadrant(dot)com, Adam Berlin <berlin(dot)ab(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: COPY FROM WHEN condition |
Date: | 2019-01-22 18:35:57 |
Message-ID: | 20190122183557.6246fitlswmt725p@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2019-01-22 18:35:21 +0100, Tomas Vondra wrote:
> On 1/21/19 11:15 PM, Tomas Vondra wrote:
> > On 1/21/19 7:51 PM, Andres Freund wrote:
> >> I'm *not* convinced by this. I think it's bad enough that we do this for
> >> normal COPY, but for WHEN, we could end up *never* resetting before the
> >> end. Consider a case where a single tuple is inserted, and then *all*
> >> rows are filtered. I think this needs a separate econtext that's reset
> >> every round. Or alternatively you could fix the code not to rely on
> >> per-tuple not being reset when tuples are buffered - that actually ought
> >> to be fairly simple.
> >>
> >
> > I think separating the per-tuple and per-batch contexts is the right
> > thing to do, here. It seems the batching was added somewhat later and
> > using the per-tuple context is rather confusing.
> >
>
> OK, here is a WIP patch doing that. It creates a new "batch" context,
> and allocates tuples in it (instead of the per-tuple context). The
> per-tuple context is now reset always, irrespectedly of nBufferedTuples.
> And the batch context is reset every time the batch is emptied.
>
> It turned out to be a tad more complex due to partitioning, because when
> we find the partitions do not match, the tuple is already allocated in
> the "current" context (be it per-tuple or batch). So we can't just free
> the whole context at that point. The old code worked around this by
> alternating two contexts, but that seems a bit too cumbersome to me, so
> the patch simply copies the tuple to the new context. That allows us to
> reset the batch context always, right after emptying the buffer. I need
> to do some benchmarking to see if the extra copy causes any regression.
>
> Overall, separating the contexts makes it quite a bit clearer. I'm not
> entirely happy about the per-tuple context being "implicit" (hidden in
> executor context) while the batch context being explicitly created, but
> there's not much I can do about that.
I think the extra copy is probably OK for now - as part of the pluggable
storage series I've converted COPY to use slots, which should make that
less of an issue - I've not done that, but I actually assume we could
remove the whole batch context afterwards.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Arthur Zakirov | 2019-01-22 18:36:17 | Re: [PROPOSAL] Shared Ispell dictionaries |
Previous Message | Andres Freund | 2019-01-22 18:17:02 | Re: Refactoring the checkpointer's fsync request queue |