Re: Reducing the chunk header sizes on all memory context types

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing the chunk header sizes on all memory context types
Date: 2022-09-26 22:28:46
Message-ID: CAApHDvoBxz7RiJiyGzi-wN4c-vLg9TGQ8yTmFW1j6eUJPsP4DA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 20 Sept 2022 at 13:23, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > Aside from that, I don't have any ideas on how to get rid of the
> > possible additional datumCopy() from non-Var arguments to these window
> > functions. Should we just suffer it? It's quite likely that most
> > arguments to these functions are plain Vars anyway.
>
> No, we shouldn't. I'm pretty sure that we have various window
> functions that are deliberately designed to take advantage of the
> no-copy behavior, and that they have taken a significant speed
> hit from your having disabled that optimization. I don't say
> that this is enough to justify reverting the chunk header changes
> altogether ... but I'm completely not satisfied with the current
> situation in HEAD.

Looking more closely at window_gettupleslot(), it always allocates the
tuple in ecxt_per_query_memory, so any column we fetch out of that
tuple will be in that memory context. window_gettupleslot() is used
in lead(), lag(), first_value(), last_value() and nth_value() to fetch
the Nth tuple out of the partition window. The other window functions
all return BIGINT, FLOAT8 or INT which are byval on 64-bit, and on
32-bit these functions return a freshly palloc'd Datum in the
CurrentMemoryContext.

Maybe we could remove the datumCopy() from eval_windowfunction() and
also document that a window function when returning a non-byval type,
must allocate the Datum in either ps_ExprContext's
ecxt_per_tuple_memory or ecxt_per_query_memory. We could ensure any
extension which has its own window functions get the memo about the
API change by adding an Assert to ensure that the return value (for
byref types) is in the current context by calling the
loop-over-the-blocks version of MemoryContextContains().

This would mean that wfuncs like lead(column_name) would no longer do
that extra datumCopy and the likes of lead(col || 'some OpExpr') would
save a little as we'd no longer call MemoryContextContains on
non-Assert builds.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ibrar Ahmed 2022-09-26 23:26:53 [Commitfest 2022-09] Last days
Previous Message Nathan Bossart 2022-09-26 22:13:39 Re: Enables to call Unregister*XactCallback() in Call*XactCallback()