RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT

From: <msalais(at)msym(dot)fr>
To: "'benoit'" <benoit(at)hopsandfork(dot)com>, "'Chris Hoover'" <chrish(at)aweber(dot)com>
Cc: <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: RE: Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
Date: 2023-06-18 21:23:42
Message-ID: 005301d9a22b$28841ae0$798c50a0$@msym.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

Do you really need to do “select *”?

In other words, is it necessary to have all columns in the result?

Michel SALAIS

De : benoit <benoit(at)hopsandfork(dot)com>
Envoyé : lundi 12 juin 2023 23:35
À : Chris Hoover <chrish(at)aweber(dot)com>
Cc : pgsql-performance(at)lists(dot)postgresql(dot)org
Objet : RE: Forced to use UNION ALL when having multiple ANY operators and
ORDER BY LIMIT

This new index is used but still the read is 230mb.

https://explain.dalibo.com/plan/b0f28a9e8a136afd

_____

De : Chris Hoover <chrish(at)aweber(dot)com <mailto:chrish(at)aweber(dot)com> >
Envoyé : lundi 12 juin 2023 22:55
À : benoit
Cc : pgsql-performance(at)lists(dot)postgresql(dot)org
<mailto:pgsql-performance(at)lists(dot)postgresql(dot)org>
Objet : Re: Forced to use UNION ALL when having multiple ANY operators and
ORDER BY LIMIT

I normally create my indexes to match the where clause of the query. While
technically, it should not matter, I find a lot of time, it does.

I would create an index on (status, sender_reference, sent_at) and see if
the improves your query performance.


SELECT * FROM docs WHERE status

IN ('draft',

'sent')

AND sender_reference

IN ('Custom/1175',

'Client/362',

'Custom/280')

ORDER BY sent_at DESC

Thanks,

Chris Hoover

Senior DBA

AWeber.com

Cell: (803) 528-2269

Email: chrish(at)aweber(dot)com <mailto:chrish(at)aweber(dot)com>

On Jun 12, 2023, at 4:17 PM, benoit <benoit(at)hopsandfork(dot)com
<mailto:benoit(at)hopsandfork(dot)com> > wrote:

Hello

I have a database with few 60gb tables. Tables rows are requested with
multiple ANY or IN operators. I am not able to find an easy way to make DB
able to use indexes. I often hit the index, but see a a spike of 200mb of IO
or disk read.

I am using version 13 but soon 14.

I wrote a reproduction script on version 14 with plans included.
https://gist.github.com/benoittgt/ab72dc4cfedea2a0c6a5ee809d16e04d

I also have plans on a snapshot of the DB with real data.

- The current query that I try to improve :
https://explain.dalibo.com/plan/8b8f6e0he9feb551

- I added the DB schema + index in query view. As you can see I have many
indexes for testing purpose and try what the planner can do.

- The optimized query when I have only one ANY and migrate to UNION ALL for
each parameter of the ANY operator
https://explain.dalibo.com/plan/427gg053d07328ga . Query is fast as I would
like but it means generate some merge to be able to get a fast result.

- The new issue I have when I have a new ANY operator on the previous
optimized query. Big IO/read
https://explain.dalibo.com/plan/e7ha9g637b4eh946

It seems to me quite undoable to generate for every parameters a query that
will then merge. I have sometimes 3-4 ANY operators with up to 15 elements
in an array.

Is there a misusage of my indexes?

Is there a limitation when using ANY or IN operators and ordered LIMIT
behind?

Thanks a lot

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2023-06-19 07:46:46 Re: Merge David and Goliath tables efficiently
Previous Message nicolas paris 2023-06-18 20:57:11 Re: Merge David and Goliath tables efficiently