Quick Links

Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	James Coleman <jtc331(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date:	2020-09-08 20:37:06
Message-ID:	557e8853-7cb8-9c01-6193-1d4109345401@iki.fi
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 08/09/2020 22:25, James Coleman wrote:
> On Wed, Aug 19, 2020 at 3:16 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> You could also apply the optimization for non-Const expressions, as long
>> as they're stable (i.e. !contain_volatile_functions(expr)).
>
> Is that true? Don't we also have to worry about whether or not the
> value is stable (i.e., know when a param has changed)? There have been
> discussions about being able to cache stable subexpressions, and my
> understanding was that we'd need to have that infrastructure (along
> with the necessarily invalidation when the param changes) to be able
> to use this for non-Const expressions.

Yeah, you're right, you'd have to also check for PARAM_EXEC Params. And
Vars. I think the conditions are the same as those checked in
match_clause_to_partition_key() in partprune.c (it's a long function,
search for "if (!IsA(rightop, Const))"). Not sure it's worth the
trouble, then. But it would be nice to handle queries like "WHERE column
= ANY ($1)"

>> Deconstructing the array Datum into a simple C array on first call would
>> be a win even for very small arrays and for AND semantics, even if you
>> don't use a hash table.
>
> Because you wouldn't have to repeatedly detoast it? Or some other
> reason I'm not thinking of? My intuition would have been that (aside
> from detoasting if necessary) there wouldn't be much real overhead in,
> for example, an array storing integers.

Dealing with NULLs and different element sizes in the array is pretty
complicated. Looping through the array currently looks like this:

/* Loop over the array elements */
s = (char *) ARR_DATA_PTR(arr);
bitmap = ARR_NULLBITMAP(arr);
bitmask = 1;

for (int i = 0; i < nitems; i++)
{
Datum elt;
Datum thisresult;

/* Get array element, checking for NULL */
if (bitmap && (*bitmap & bitmask) == 0)
{
fcinfo->args[1].value = (Datum) 0;
fcinfo->args[1].isnull = true;
}
else
{
elt = fetch_att(s, typbyval, typlen);
s = att_addlength_pointer(s, typlen, s);
s = (char *) att_align_nominal(s, typalign);
fcinfo->args[1].value = elt;
fcinfo->args[1].isnull = false;
}

[do stuff with Datum/isnull]

/* advance bitmap pointer if any */
if (bitmap)
{
bitmask <<= 1;
if (bitmask == 0x100)
{
bitmap++;
bitmask = 1;
}
}
}

Compared with just:

for (int i = 0; i < nitems; i++)
{
Datum elt = datums[i];

[do stuff with the Datum]
}

I'm not sure how much difference that makes, but I presume it's not
zero, and it seems like an easy win when you have the code to deal with
the Datum array representation anyway.

- Heikki

In response to

Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays at 2020-09-08 19:25:41 from James Coleman

Responses

Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays at 2020-09-11 21:11:34 from James Coleman

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tomas Vondra	2020-09-08 21:05:30	Re: Ideas about a better API for postgres_fdw remote estimates
Previous Message	Alvaro Herrera	2020-09-08 20:36:00	Re: VACUUM (INTERRUPTIBLE)?