Re: change regexp_substr first argument make tests more easier to understand.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: change regexp_substr first argument make tests more easier to understand.
Date: 2024-09-05 17:45:29
Message-ID: 2005032.1725558329@sss.pgh.pa.us
Lists: pgsql-hackers

Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> writes:
> Current tests with regexp_instr() and regexp_substr() with string
> 'abcabcabc' are really unreadable, and you would spend time to understand
> what happens in these tests and whether they are really correct. I'd rather
> change them to "abcdefghi", just like in the query

>     SELECT regexp_substr('abcdefghi', 'd.q') IS NULL AS t;

On looking more closely at these test cases, I think the point of them
is exactly to show the behavior of the functions with multiple copies
of the target substring. Thus, what Jian is proposing breaks the
tests: it's no longer perfectly clear whether the result is because
the function did what we expect, or because the pattern failed to
match anywhere else. (Sure, "a.c" *should* match "aXc", but if it
didn't, you wouldn't discover that from this test.) What Ilia
proposes would break them worse.
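
To make that concrete, here is a rough analogue outside PostgreSQL: a Python sketch using the re module. The helper nth_match_pos is hypothetical and only mimics the "1-based position of the N-th match, 0 if none" convention. It is not how regexp_instr() is implemented, and the SQL functions themselves are not exercised here.

```python
import re

def nth_match_pos(text, pattern, n):
    """Hypothetical helper: 1-based start of the n-th non-overlapping
    match of pattern in text, or 0 if there are fewer than n matches."""
    for i, m in enumerate(re.finditer(pattern, text), start=1):
        if i == n:
            return m.start() + 1
    return 0

# Repeated target: each occurrence has a distinct, nonzero position, so a
# helper that ignored n (or miscounted) would visibly return the wrong value.
print(nth_match_pos("abcabcabc", "a.c", 1))
print(nth_match_pos("abcabcabc", "a.c", 2))
print(nth_match_pos("abcabcabc", "a.c", 3))

# Single target: asking for the 2nd match gives 0, which only shows that
# nothing else matched; it says little about whether counting works.
print(nth_match_pos("abcdefghi", "a.c", 2))
```

With the repeated string the three calls give three different positions, so the occurrence counting is actually being checked. With the non-repeating string, every request past the first match collapses to the same "no match" answer.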

I think we should just reject this patch, or at least reject the
parts of it that change existing test cases. I have no opinion
about whether the new test cases add anything useful.

regards, tom lane
