Re: How to eliminate extra "NOT EXISTS"-query here?

From: hector vass <hector(dot)vass(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andreas Joseph Krogh <andreas(at)visena(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How to eliminate extra "NOT EXISTS"-query here?
Date: 2023-11-25 16:52:51
Message-ID: CAJJx+iW6aqxtPSOCPxKrdohJoa3KVpdS65-W3uCEiThWpEb2ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Not sure you need to use array why not simple table joins, so a table with
your criteria x y z t joined to stuff to give you candidates that do match,
then left join with coalesce to add the 'd'

select

--a.id,b.test_id,

coalesce(a.id,b.test_id) as finalresult

from test a

left join (

select

test_id

from stuff a

inner join (values ('x'),('y'),('z'),('t')) b (v) using(v)

group by 1

)b on(a.id=b.test_id);

Regards
Hector Vass

On Sat, Nov 25, 2023 at 4:08 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Andreas Joseph Krogh <andreas(at)visena(dot)com> writes:
> > -- This works, but I'd rather not do the extra EXISTS
> > select * from test t
> > WHERE (NOT ARRAY ['x', 'y', 'z', 't']::varchar[] <@ (select
> array_agg(s.v) from
> > stuffs WHERE s.test_id = t.id)
> > OR NOT EXISTS (
> > select * from stuff s where s.test_id = t.id
> > )
> > )
> > ;
>
> > So, I want to return all entries in test not having any of ARRAY ['x',
> 'y',
> > 'z', 't'] referenced in the table stuff, and I'd like to have test.id="d"
>
> > returned as well, but in order to do that I need to execute the “or not
> > exists”-query. Is it possible to avoid that?
>
> Probably not directly, but perhaps you could improve the performance of
> this query by converting the sub-selects into a left join:
>
> select * from test t
> left join
> (select s.test_id, array_agg(s.v) as arr from stuffs group by
> s.test_id) ss
> on ss.test_id = t.id
> WHERE (NOT ARRAY ['x', 'y', 'z', 't']::varchar[] <@ ss.arr)
> OR ss.test_id IS NULL;
>
> Another possibility is
>
> ...
> WHERE (ARRAY ['x', 'y', 'z', 't']::varchar[] <@ ss.arr) IS NOT TRUE
>
> but I don't think that's more readable really, and it will save little.
>
> In either case, this would result in computing array_agg once for
> each group of test_id values in "stuffs", while your original computes
> a similar aggregate for each row in "test". So whether this is better
> depends on the relative sizes of the tables, although my proposal
> avoids random access to "stuffs" so it will have some advantage.
>
> regards, tom lane
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Joseph Krogh 2023-11-25 17:18:57 Re: How to eliminate extra "NOT EXISTS"-query here?
Previous Message Tom Lane 2023-11-25 16:08:28 Re: How to eliminate extra "NOT EXISTS"-query here?