RE: Postgresql equal join on function with columns not use index

From: "James Pang (chaolpan)" <chaolpan(at)cisco(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: RE: Postgresql equal join on function with columns not use index
Date: 2023-06-12 14:20:25
Message-ID: PH0PR11MB51914E757855701122588DA1D654A@PH0PR11MB5191.namprd11.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,
Looks like it's the function "regexp_replace" volatile and restrict=false make the difference, we have our application role with default search_path=oracle,$user,public,pg_catalog.
=# select oid,proname,pronamespace::regnamespace,prosecdef,proisstrict,provolatile from pg_proc where proname='regexp_replace' order by oid;
oid | proname | pronamespace | prosecdef | proisstrict | provolatile
-------+----------------+--------------+-----------+-------------+-------------
2284 | regexp_replace | pg_catalog | f | t | i
2285 | regexp_replace | pg_catalog | f | t | i
17095 | regexp_replace | oracle | f | f | v
17096 | regexp_replace | oracle | f | f | v
17097 | regexp_replace | oracle | f | f | v
17098 | regexp_replace | oracle | f | f | v

--with default it use orafce, oracle.regexp_replace function,
Select a.phonenumber,... from tableA a, tableB b where a.phonenumber=oracle. regexp_replace(b.PHONENUMBER,'[^0-9]','') ,
--index on a.phonenumber not used

Switch to pg_catalog.regexp_replace(b.PHONENUMBER,'[^0-9]',''),
Index on a.phonenumber got used.

Thanks,

James Pang

-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Monday, June 12, 2023 9:19 PM
To: James Pang (chaolpan) <chaolpan(at)cisco(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Postgresql equal join on function with columns not use index

"James Pang (chaolpan)" <chaolpan(at)cisco(dot)com> writes:
> We migrate from Oracle to Postgresql14.8, one SQL has regression in Postgres run in 5800 milliseconds in Postgresql v14.8, but the same SQL got done in several hundred milliseconds in Oracle database.
> With multiple table JOINs, if the join condition is
> tablea.column1=tableb.column1, optimizer will use the index to filter
> data in nest loops, but if
> tablea.column1=regexp_replace(tableb.column1....),
> Optimizer will not be able to use the index on tablea.column1, then it do a table scan and nestloop to produce a lot rows then use tablea.column1=regexp_replace(tableb.column1....) as a filter. As a workaround we create a view then use tablea.column1=view.column1 that works.
> Is it expected ? details as below.

It's impossible to comment on this usefully with such a fragmentary description of the problem. Please send a complete, self-contained test case if you want anybody to look at it carefully.

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message benoit 2023-06-12 20:17:34 Forced to use UNION ALL when having multiple ANY operators and ORDER BY LIMIT
Previous Message Tom Lane 2023-06-12 13:18:39 Re: Postgresql equal join on function with columns not use index