Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2020-04-25 12:59:07
Message-ID: 20200425125907.yv26rvs5exjzy463@development
Lists: pgsql-hackers

On Fri, Apr 24, 2020 at 09:22:34PM -0400, James Coleman wrote:
>On Fri, Apr 24, 2020 at 5:55 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> On Fri, Apr 24, 2020 at 09:38:54AM -0400, James Coleman wrote:
>> >On Thu, Apr 23, 2020 at 10:55 AM Tomas Vondra
>> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >>
>> >> On Thu, Apr 23, 2020 at 09:02:26AM -0400, James Coleman wrote:
>> >> >On Thu, Apr 23, 2020 at 8:47 AM Tomas Vondra
>> >> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >> >>
>> >> >> On Mon, Apr 20, 2020 at 09:27:34PM -0400, James Coleman wrote:
>> >> >> >Over in "execExprInterp() questions / How to improve scalar array op
>> >> >> >expr eval?" [1] I'd mused about how we might be able to optimize
>> >> >> >scalar array ops with OR'd semantics.
>> >> >> >
>> >> >> >This patch implements a binary search for such expressions when the
>> >> >> >array argument is a constant so that we can avoid needing to teach
>> >> >> >expression execution to cache stable values or know when a param has
>> >> >> >changed.
>> >> >> >
>> >> >> >The speed-up for the target case can be pretty impressive: in my
>> >> >> >admittedly contrived and relatively unscientific test with a query of
>> >> >> >the form:
>> >> >> >
>> >> >> >select count(*) from generate_series(1,100000) n(i) where i in (<1000
>> >> >> >random integers in the series>)
>> >> >> >
>> >> >> >shows ~30ms for the patch versus ~640ms on master.
>> >> >> >
>> >> >>
>> >> >> Nice improvement, although 1000 items is probably a bit unusual. The
>> >> >> threshold used in the patch (9 elements) seems a bit too low - what
>> >> >> results have you seen with smaller arrays?
>> >> >
>> >> >At least in our systems we regularly work with batches of 1000 items,
>> >> >which means you get IN clauses of identifiers of that size. Admittedly
>> >> >the most common case sees those IN clauses as simple index scans
>> >> >(e.g., WHERE <primary key> IN (...)), but it's also common to have a
>> >> >broader query that merely filters additionally on something like "...
>> >> >AND <some foreign key> IN (...)" where it makes sense for the rest of
>> >> >the quals to take precedence in generating a reasonable plan. In that
>> >> >case, the IN becomes a regular filter, hence the idea behind the
>> >> >patch.
>> >> >
>> >> >Side note: I'd love for us to be able to treat "IN (VALUES)" the same
>> >> >way...but as noted in the other thread that's an extremely large
>> >> >amount of work, I think. Similarly, you could use a hash here
>> >> >instead of a binary search... but this seems quite good.
>> >> >
>> >> >As to the choice of 9 elements: I just picked that as a starting
>> >> >point; Andres had previously commented offhand that at 8 elements
>> >> >serial scanning was faster, so it seemed a reasonable default for
>> >> >discussion.
>> >> >
>> >> >Perhaps it would make sense to determine that minimum not as a pure
>> >> >constant but to scale it based on how many rows the planner expects us to
>> >> >see? Of course that'd be a more invasive patch...so may or may not be
>> >> >as feasible as a reasonable default.
>> >> >
>> >>
>> >> Not sure. That seems a bit overcomplicated and I don't think it depends
>> >> on the number of rows the planner expects to see very much. I think we
>> >> usually assume the linear search is cheaper for small arrays and then at
>> some point the binary search starts winning. The question is where this
>> >> "break even" point is.
>> >
>> >Well, since it has to do preprocessing work (expanding the array and
>> >then sorting it), the number of rows processed matters, right? For
>> >example, doing a linear search on 1000 items only once is going to be
>> >cheaper than preprocessing the array and then doing a binary search,
>> >but with a very large row count the binary search + preprocessing
>> >might very well win out even for a 10-element array.
>> >
>>
>> Hmmm, good point. Essentially the initialization (sorting of the array)
>> has some cost, and the question is how much extra per-tuple cost this
>> adds. It's probably not worth it for a single lookup, but for many
>> lookups it's probably OK. Let's see if I can do the math right:
>>
>> N - number of lookups
>> K - number of array elements
>>
>> Cost to sort the array is
>>
>> O(K * log(K)) = C1 * K * log(K)
>>
>> and the cost of a lookup is C2 * log(K), so with the extra cost amortized
>> over N lookups, the total "per lookup" cost is
>>
>> C1 * K * log(K) / N + C2 * log(K) = log(K) * (C1 * K / N + C2)
>>
>> We need to compare this to the O(K) cost of a simple linear search, and
>> the question is at which point the linear search gets more expensive:
>>
>> C3 * K = log(K) * (C1 * K / N + C2)
>>
>> I think we can assume that C3 is somewhere between 0.5 and 1, i.e. if
>> there's a matching item we find it halfway through on average, and if
>> there is not we have to walk the whole array. So let's say it's 1.
>>
>> C1 and C2 are probably fairly low, I think - C1 is typically ~1.4 for
>> random pivot choice IIRC, and C2 is probably similar. With both values
>> being ~1.5 we get this:
>>
>> K = log(K) * (1.5 * K/N + 1.5)
>>
>> For a fixed K, we get this formula for N:
>>
>> N = log(K) * 1.5 * K / (K - 1.5 * log(K))
>>
>> and for a bunch of K values the results look like this:
>>
>> K | N
>> -------|-------
>> 1 | 0
>> 10 | 5.27
>> 100 | 7.42
>> 1000 | 10.47
>> 10000 | 13.83
>> 100000 | 17.27
>>
>> i.e. the binary search with 10k values starts winning over linear search
>> with just ~13 lookups.
>>
>> (Assuming I haven't made some silly mistake in the math ...)
>>
>> Obviously, this only accounts for the cost of comparisons and neglects e.g.
>> the indirect costs for less predictable memory access patterns mentioned
>> by Andres in his response.
>>
>> But I think it still shows the number of lookups needed for the binary
>> search to be a win is pretty low - at least for a reasonable number of
>> values in the array. Maybe it's 20 and not 10, but I don't think that
>> changes much.
>>
>> The other question is whether we can get N at all and how reliable the
>> value is. We can probably get the number of rows, but that will ignore
>> other conditions that may eliminate rows before the binary search.
>>
>> >I'm not trying to argue for more work for myself here: I think the
>> >optimization is worth it on its own, and something like this could be
>> >a further improvement on top of it. But it is interesting to think
>> >about.
>> >
>>
>> I don't know. Clearly, if the user sends a query with 10k values and
>> only does a single lookup, that won't win. And if we can reasonably and
>> reliably protect against that, I wouldn't mind doing that, although it
>> means a risk of not using the bin search in case of underestimates etc.
>>
>> I don't have any hard data about this, but I think we can assume the
>> number of rows processed by the clause is (much) higher than the number
>> of keys in it. If you have a clause with 10k values, then you probably
>> expect it to be applied to many rows, far more than the "break even"
>> point of about 10-20 rows ...
>>
>> So I wouldn't worry about this too much.
>
>Yeah. I think it becomes a lot more interesting in the future if/when
>we end up with a way to use this with params and not just constant
>arrays. Then the "group" size would matter a whole lot more.
>

True. That probably changes the calculations quite a bit.
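
FWIW here's a trivial C program evaluating the break-even formula quoted
above, with the assumed constants (C1 = C2 = 1.5, C3 = 1) and natural
log - just a sanity check of the arithmetic, not PostgreSQL code:

    #include <math.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* N = 1.5 * K * log(K) / (K - 1.5 * log(K)) */
        double ks[] = {10, 100, 1000, 10000, 100000};

        for (int i = 0; i < 5; i++)
        {
            double k = ks[i];
            double n = 1.5 * k * log(k) / (k - 1.5 * log(k));

            printf("%6.0f | %5.2f\n", k, n);
        }

        return 0;
    }

It reproduces the table above (5.27 for K=10, 13.83 for K=10000, etc.).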

>For now, the constant amount of overhead is quite small, so even if we
>only execute it once we won't make the query that much worse (or, at
>least, the total query time will still be very small). Also, because
>it's only applied to constants, there's a natural limit to how much
>overhead we're likely to introduce into a query.
>

FWIW the results from the repeated tests with both int and text columns
that I shared in [1] also include data for smaller numbers of rows. I
haven't tried very hard to minimize noise (the original goal was to test
the speedup for large numbers of rows and large arrays, where noise is
not an issue). But I think they still show that the threshold of ~10
elements is in the right ballpark. We might use a higher value to be a
bit more defensive, but it's never going to be perfect for types with
both cheap and more expensive comparisons.
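
To make the threshold discussion a bit more concrete, here's a toy
standalone sketch of the overall strategy - sort the constant array once
up front, then either bsearch or scan linearly per probe depending on a
cutoff. The names and the cutoff value are made up for illustration;
this is obviously not the patch itself, which presumably has to deal
with Datums, type-specific comparators and NULLs:

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* made-up cutoff, mirroring the ~9 element threshold discussed above */
    #define BINSEARCH_THRESHOLD 9

    static int
    cmp_int(const void *a, const void *b)
    {
        int ia = *(const int *) a;
        int ib = *(const int *) b;

        return (ia > ib) - (ia < ib);
    }

    /* one-time setup: sort the constant array if it's large enough */
    static void
    setup_array(int *arr, int n)
    {
        if (n >= BINSEARCH_THRESHOLD)
            qsort(arr, n, sizeof(int), cmp_int);
    }

    /* per-tuple probe: binary search for large arrays, else linear scan */
    static bool
    array_contains(const int *arr, int n, int key)
    {
        if (n >= BINSEARCH_THRESHOLD)
            return bsearch(&key, arr, n, sizeof(int), cmp_int) != NULL;

        for (int i = 0; i < n; i++)
        {
            if (arr[i] == key)
                return true;
        }
        return false;
    }

    int
    main(void)
    {
        int arr[] = {42, 7, 19, 3, 88, 56, 23, 91, 14, 67};

        setup_array(arr, 10);
        printf("%d\n", array_contains(arr, 10, 23));    /* prints 1 */
        return 0;
    }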

One more note - shouldn't this also tweak cost_qual_eval_walker, which
computes the cost of a ScalarArrayOpExpr? At the moment it does this:

/*
* Estimate that the operator will be applied to about half of the
* array elements before the answer is determined.
*/

but that's only appropriate for a linear search.
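
Something like the following, perhaps (a hand-written sketch against the
existing ScalarArrayOpExpr branch - sacosts, context and arraynode as in
the current code, BINSEARCH_THRESHOLD made up, math.h assumed; not a
tested patch):

    int     nelems = estimate_array_length(arraynode);
    double  per_probe;

    if (IsA(arraynode, Const) && nelems >= BINSEARCH_THRESHOLD)
    {
        /* sorted array + binary search: ~log2(n) comparisons per probe */
        per_probe = ceil(log2(nelems));

        /* one-time charge for sorting: ~n * log2(n) comparisons */
        context->total.startup += sacosts.per_tuple *
            nelems * ceil(log2(nelems));
    }
    else
    {
        /* linear scan: about half the elements on average */
        per_probe = nelems * 0.5;
    }

    context->total.per_tuple += sacosts.per_tuple * per_probe;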

[1] https://www.postgresql.org/message-id/20200425124024.hsv7z6bia752uymz%40development

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
