Re: BUG #15069: group by after regexp_replace

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: koglep(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15069: group by after regexp_replace
Date: 2018-02-15 15:25:12
Message-ID: CAKFQuwb0_ZEyXi-RNGdKpr3fV1-VnXVWDAzYqZo=DQUf3eVGCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Feb 15, 2018 at 8:14 AM, PG Bug reporting form <
noreply(at)postgresql(dot)org> wrote:

> The following bug has been logged on the website:
>
> Bug reference: 15069
> Logged by: Ilhwan Ko
> Email address: koglep(at)gmail(dot)com
> PostgreSQL version: 9.6.7
> Operating system: macOS 10.12
> Description:
>
> select upper(regexp_replace(b, '\\s+', '')) as keyword, sum(c)
> from test_t
> group by upper(regexp_replace(b, '\\s+', ''));
>
> I expected to get the same results regarding to above four queries.
> However, they were different.
>
>
​Not a bug - you mis-understand string literal syntax and escaping. What
you are asking to replace is "a backslash followed by one or more "s"es.

see:
https://www.postgresql.org/docs/10/static/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS

If you want to leave the literal as-is you need to write:

E'\\s+'

The way I prefer is to keep the "E" ​omitted and write:

'\s+'

Without the "E" the backslash is not an escape character in a PostgreSQL
literal and so the backslash in the regex doesn't need to be protected. By
protecting it you are actually protecting the backslash in front of the "s"
thus causing it to become two separate "symbols", "\" and "s" - and the +
then applies to the literal "s".

David J.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrew Gierth 2018-02-15 15:34:59 Re: BUG #15069: group by after regexp_replace
Previous Message PG Bug reporting form 2018-02-15 15:14:49 BUG #15069: group by after regexp_replace