Re: Array of integer indexed nested-loop semi join

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Mickael van der Beek <mickael(dot)van(dot)der(dot)beek(at)gmail(dot)com>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Array of integer indexed nested-loop semi join
Date: 2022-05-23 02:45:14
Message-ID: CAMkU=1xRhsJD4MM4JO=fy-2ocnQDixu9BWwy1iE+08a9KawV6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, May 20, 2022 at 6:42 AM Mickael van der Beek <
mickael(dot)van(dot)der(dot)beek(at)gmail(dot)com> wrote:

>
> Query:
>
> EXPLAIN (
>> ANALYZE,
>> VERBOSE,
>> COSTS,
>> BUFFERS,
>> TIMING
>> )
>> SELECT
>> fu.w2_page_idxs
>> FROM
>> fact_users
>> AS fu
>> WHERE
>> EXISTS (
>> SELECT
>> FROM
>> (
>> SELECT
>> ARRAY[idx] AS page_idx
>> FROM
>> fact_pages
>> WHERE
>> attribute_idxs && ARRAY[300000160]
>> FETCH FIRST 1 ROWS ONLY
>> )
>> AS fp
>> WHERE
>> fu.w2_page_idxs && fp.page_idx
>> )
>> ;
>
>
> Without any surprises, the planner is using a sequential scan on the
> "fact_users" table which is very large instead of using the GIN index set
> on the "w2_page_idxs" column.
>

For me, using the subquery in and expression, instead of the EXISTS, does
get it to use the gin index. And I think it must give the same results.

SELECT
fu.w2_page_idxs
FROM fact_users AS fu
WHERE
fu.w2_page_idxs && ARRAY[(select idx from fact_pages where
attribute_idxs && ARRAY[3003] FETCH FIRST 1 ROWS ONLY)];

But why are you using intarray? That is unnecessary here, and by creating
ambiguity about the array operators it might be harmful.

Cheers,

Jeff

>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mickael van der Beek 2022-05-23 07:57:25 Re: Array of integer indexed nested-loop semi join
Previous Message Ganesh Korde 2022-05-20 17:59:40 Re: Selecting RAM and CPU based on max_connections