Re: BUG #18765: Inconsistent behaviour and errors with LIKE

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Anmol Mohanty <anmol(dot)mohanty(at)salesforce(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18765: Inconsistent behaviour and errors with LIKE
Date: 2025-01-08 21:12:32
Message-ID: 1693904.1736370752@sss.pgh.pa.us
Lists: pgsql-bugs

Peter Eisentraut <peter(at)eisentraut(dot)org> writes:
> I have also come across this issue recently, and I think we should
> actually fix it. It makes sense to verify that the pattern is
> syntactically correct before trying to use it, instead of just using it
> incrementally and then erroring when you happen to hit the problematic bits.
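
To make the proposal concrete, here is a minimal sketch of such an up-front check, written in Python for brevity rather than in the server's C. The function name validate_like_pattern and the default backslash escape are assumptions for illustration, not existing PostgreSQL code, and the only syntax error it looks for is a pattern that ends in a dangling escape character.

```python
def validate_like_pattern(pattern: str, escape: str = "\\") -> None:
    """Raise ValueError if the pattern ends with an unpaired escape character.

    Hypothetical helper: one pass over the pattern, done before any matching,
    so the outcome does not depend on how far a match attempt gets.
    """
    i = 0
    n = len(pattern)
    while i < n:
        if escape and pattern[i] == escape:
            if i + 1 >= n:
                # The escape has nothing to escape: reject the whole pattern.
                raise ValueError("LIKE pattern must not end with escape character")
            i += 2  # skip the escape and the character it protects
        else:
            i += 1
```

With a check like this, a pattern such as 'abc\' would be rejected regardless of the string it is compared against, instead of only when matching happens to reach the trailing escape.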

I'm concerned about the performance cost of adding an extra scan of
the pattern --- a scan that has exactly zero benefit in all normal
use cases. Admittedly typical patterns aren't very big, but still
the cost/benefit ratio is bad for most people.

I wonder if we could buy back whatever it'd cost us by treating
this initial scan as a "compilation" of the LIKE pattern into
some form that would make the actual string search cheaper.
My gut says that building a representation that contains tokens
like "any character", "zero or more any characters", or "literal
string of N bytes" could save some cycles while searching.
In ILIKE, we could perhaps also case-fold the pattern at this
stage.
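
As a rough illustration of that idea (a sketch in Python under stated assumptions, not a proposed patch), the pattern could be scanned once into a token list and the search run against the tokens. The names compile_like, match_tokens, like and ilike and the tuple token format are hypothetical, the matcher is a naive backtracking one, and case folding is approximated with str.lower(), which ignores locale and collation concerns.

```python
ANY_CHAR = ("ANY_CHAR",)   # "_": exactly one character
ANY_SEQ = ("ANY_SEQ",)     # "%": zero or more characters


def compile_like(pattern, escape="\\", case_fold=False):
    """Scan the pattern once, validating it and producing a token list."""
    tokens = []
    literal = []

    def flush():
        # Emit the pending run of literal characters as one token.
        if literal:
            text = "".join(literal)
            tokens.append(("LITERAL", text.lower() if case_fold else text))
            literal.clear()

    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        if escape and ch == escape:
            if i + 1 >= n:
                raise ValueError("LIKE pattern must not end with escape character")
            literal.append(pattern[i + 1])  # escaped character is a literal
            i += 2
            continue
        if ch == "%":
            flush()
            if not tokens or tokens[-1] != ANY_SEQ:
                tokens.append(ANY_SEQ)      # collapse runs of "%"
        elif ch == "_":
            flush()
            tokens.append(ANY_CHAR)
        else:
            literal.append(ch)
        i += 1
    flush()
    return tokens


def match_tokens(tokens, text, ti=0, si=0):
    """Match text[si:] against tokens[ti:] with simple backtracking."""
    while ti < len(tokens):
        tok = tokens[ti]
        if tok[0] == "LITERAL":
            if not text.startswith(tok[1], si):
                return False
            si += len(tok[1])
        elif tok == ANY_CHAR:
            if si >= len(text):
                return False
            si += 1
        else:  # ANY_SEQ
            if ti + 1 == len(tokens):
                return True  # trailing "%" matches whatever remains
            return any(match_tokens(tokens, text, ti + 1, k)
                       for k in range(si, len(text) + 1))
        ti += 1
    return si == len(text)


def like(text, pattern):
    return match_tokens(compile_like(pattern), text)


def ilike(text, pattern):
    # The pattern is folded once at compile time; the text is folded per call.
    return match_tokens(compile_like(pattern, case_fold=True), text.lower())
```

For example, compile_like("a%b_c") yields a literal "a", an any-sequence token, a literal "b", an any-character token and a literal "c". A malformed pattern is rejected during compilation, before any string is examined, and literal runs can be compared in one step rather than character by character.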

regards, tom lane
