Quick Links

Re: obtaining ARRAY position for a given match

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Sam Mason <sam(at)samason(dot)me(dot)uk>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: obtaining ARRAY position for a given match
Date:	2009-11-19 18:33:07
Message-ID:	162867790911191033w4516150i130bba5bdf0fcc08@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

2009/11/19 Sam Mason <sam(at)samason(dot)me(dot)uk>:
> On Thu, Nov 19, 2009 at 05:24:33PM +0100, Pavel Stehule wrote:
>> it should be little bit more effective:
>
> I'm not sure if it will be much more; when you put a set returning
> function into a FROM clause PG will always run the function to
> completion---as far as I know, but I've only got 8.3 for testing at the
> moment.

yes, but generate_series is very cheap, this is protection before not
necessary string equation.

I'm also not sure why you want to return zero when you don't
> find the element. The code also exploits an implementation artifact of
> PG that the zero (i.e. the RHS of your UNION ALL) will be "after" the
> real index.
>

this is only convention. Somebody like 0 or -1 as result. Somebody can
live with NULL.

> This raises a small and interesting optimization for PG, when it does
> the plan it could notice that a UNION ALL followed by a LIMIT won't need
> to return all rows and hence it may be better to run the "quicker" one
> first. Or would this end up breaking more code than it helps?
>
>> CREATE OR REPLACE FUNCTION idx(anyarray, anyelement)
>> RETURNS int AS $$
>> SELECT i
>> FROM generate_series(array_lover($1,1),array_upper($1,1)) g(i)
>
> Quality typo :) ^^^
>
>> WHERE $1[i] = $2
>> UNION ALL
>> SELECT 0 -- return 0 as not found
>> LIMIT 1; -- stop after first match
>> $$ LANGUAGE sql;
>
> I'd do something like:
>
> CREATE OR REPLACE FUNCTION firstidx(anyarray, anyelement)
> RETURNS int AS $$
> SELECT i FROM (
> SELECT generate_series(array_lower($1,1),array_upper($1,1))) g(i)
> WHERE $1[i] = $2
> LIMIT 1;
> $$ LANGUAGE sql IMMUTABLE;
>
> You can replace the call to array_upper with some large number to check
> either function's behavior with large arrays.
>

your code is very very exactly same as my code. First - there are
flattening stage. So if you don't use offset 0, then subselect is
transformed to select. I am not sure, if offset 0 should help here -
it have to do a materialisation (5ms for 10000 items) more. This
function is relative fast:

postgres=# select idx(array(select generate_series(1,10000)),10000);
idx
-------
10000
(1 row)

Time: 40.070 ms

maybe - I cannot test it - there could be code

CREATE OR REPLACE FUNCTION idx(anyarray, anyelement)
RETURNS int AS $$
SELECT i
FROM (SELECT generate_subscripts($1) as i, unnest($1) as v) s
WHERE v = $2
LIMIT 1;
$$ LANGUAGE sql;

but I am sure so C code should be faster

Regards
Pavel Stehule

> --
> Sam http://samason.me.uk/
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Re: obtaining ARRAY position for a given match at 2009-11-19 17:11:09 from Sam Mason

Responses

Re: obtaining ARRAY position for a given match at 2009-11-19 19:02:41 from Pavel Stehule

Browse pgsql-general by date

	From	Date	Subject
Next Message	Scott Bailey	2009-11-19 18:47:02	Re: obtaining ARRAY position for a given match
Previous Message	Sam Mason	2009-11-19 18:05:19	Re: obtaining ARRAY position for a given match