Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Isaac Morland" <isaac(dot)morland(at)gmail(dot)com>
Cc: "Mark Dilger" <mark(dot)dilger(at)enterprisedb(dot)com>, "Postgres hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andreas Karlsson" <andreas(at)proxel(dot)se>, "David Fetter" <david(at)fetter(dot)org>
Date: 2021-03-02 13:58:16
Message-ID: c835ccf4-adfa-433d-9f0e-3601039f3b0e@www.fastmail.com
Lists: pgsql-hackers

Hi Isaac,

Many thanks for the comments.

On Tue, Mar 2, 2021, at 14:34, Isaac Morland wrote:
> One question I would have is whether empty ranges are all equal to each other. If they are, you have an equality that isn’t really equality; if they aren’t then you would have ranges that are unequal even though they have exactly the same membership. Although I suppose this is already true for some types where ends can be specified as open or closed but end up with the same end element; many range types canonicalize to avoid this but I don’t think they all do.

I thought about this problem too. I don't think there is a perfect solution.
Leaving things as they are is problematic too, since it makes the range type useless for some use cases.
I've sent a patch in a separate thread with the least invasive idea I could come up with.
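
To make the trade-off concrete, here is a minimal Python sketch of the behaviour under discussion. It is only a model, not PostgreSQL code: the class IntRange and its make() helper are hypothetical names, and the assumption is that a discrete range is canonicalized to half-open form and that every empty range collapses to one positionless value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntRange:
    """Hypothetical model of a discrete range in half-open [lower, upper) form.

    Both bounds are None for the single canonical empty range.
    """
    lower: Optional[int]
    upper: Optional[int]

    @staticmethod
    def make(lower: int, upper: int, bounds: str = "[)") -> "IntRange":
        if lower > upper:
            raise ValueError("lower bound must not exceed upper bound")
        # Canonicalize to half-open form so equal memberships compare equal.
        if bounds[0] == "(":
            lower += 1
        if bounds[1] == "]":
            upper += 1
        # Assumption: any range without members collapses to EMPTY,
        # which discards the position it was built from.
        if lower >= upper:
            return EMPTY
        return IntRange(lower, upper)

    @property
    def is_empty(self) -> bool:
        return self.lower is None


EMPTY = IntRange(None, None)

# Two empty matches at different offsets become indistinguishable.
at_three = IntRange.make(3, 3)
at_seven = IntRange.make(7, 7)
print(at_three == at_seven)

# Canonicalization makes [1,3] and [1,4) the same value.
print(IntRange.make(1, 3, "[]") == IntRange.make(1, 4))
```

Under these assumptions equality stays consistent with membership, but the offset of an empty match cannot be recovered, which is the use case the range type does not cover.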

> Returning to the RE result issue, I wonder how much it actually matters where any empty matches are. Certainly the actual contents of the match don’t matter; you don’t need to be able to index into the string to extract the substring. The only scenario I can see where it could matter is if the RE is using lookahead or look back to find occurrences before or after something else.

Hmm, I think it would be ugly to handle corner cases differently from the rest.

> If we stipulate that the result array will be in order, then you still don’t have the exact location of empty matches but you do at least have where they are relative to non-empty matches.

I didn't fully understand this part. Could you please provide an example?
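
One possible reading of the quoted remark, sketched with Python's re module rather than PostgreSQL's regex engine, so offsets are 0-based and the matching rules may differ in detail. The helper name match_ranges is hypothetical, and None stands in for an empty range that carries no position.

```python
import re
from typing import List, Optional, Tuple


def match_ranges(pattern: str, string: str) -> List[Optional[Tuple[int, int]]]:
    """Return matches in scan order as half-open (start, end) pairs.

    An empty match is recorded as None, modelling an empty range
    whose position has been lost.
    """
    result: List[Optional[Tuple[int, int]]] = []
    for m in re.finditer(pattern, string):
        start, end = m.span()
        result.append((start, end) if start < end else None)
    return result


# "a" matches non-empty text; the lookahead (?=b) matches the empty
# string just before each "b".
ranges = match_ranges(r"a|(?=b)", "xaxbxa")
print(ranges)  # [(1, 2), None, (5, 6)]
```

Because the list is ordered, the None entry is known to lie somewhere between offset 2 and offset 5, even though its exact position (3 here) is not recorded.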

/Joel
