Re: split_part for the last element

From: Nikhil Benesch <nikhil(dot)benesch(at)gmail(dot)com>
To: PALAYRET Jacques <jacques(dot)palayret(at)meteo(dot)fr>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: split_part for the last element
Date: 2020-10-23 17:27:47
Message-ID: CAPWqQZRCW68sz15HvyRfYMc9AoHrH87B_vqu7xDHMt0x3Fig_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Right, that's option 2 in my original mail. There are several
deficiencies with that idiom:

* It is non-obvious. Sure, it might make sense to you and I, but to
someone just learning SQL, it takes a minute to reason through why it
works. They're also unlikely to invent the trick on their own.
* It is inefficient. When the strings are large reversing the
strings is a silly waste of compute.

On Fri, Oct 23, 2020 at 12:03 PM PALAYRET Jacques
<jacques(dot)palayret(at)meteo(dot)fr> wrote:
>
> Hello,
>
> reverse(split_part(reverse('foo bar baz'), ' ', 1)) -> 'baz'
>
> Regards
>
> ----- Mail original -----
> De: "Nikhil Benesch" <nikhil(dot)benesch(at)gmail(dot)com>
> À: pgsql-general(at)lists(dot)postgresql(dot)org
> Envoyé: Vendredi 23 Octobre 2020 17:47:16
> Objet: split_part for the last element
>
> Hi,
>
> Suppose I need to split a string on a delimiter and select one of the
> resulting components. If I want a specific component counting from the
> start, that's easy:
>
> split_part('foo bar baz', ' ', 1) -> 'foo'
>
> But if I want the last component, I have several less-than-ideal options:
>
> 1. (string_to_array('foo bar baz', '
> '))[cardinality(string_to_array('foo bar baz', ' ')) - 1]
> 2. reverse(split_part(reverse('foo bar baz'), ' ', 1))
> 3. (regexp_match('foo baz bar', '\S*$'))[1]
>
> Option 1 is probably the most understandable, especially if you are
> willing to introduce a temporary parts array:
>
> select parts[cardinality(parts) - 1] from string_to_array('foo bar
> baz', ' ') parts
>
> But if the strings are long, this needlessly builds an array just to
> throw it away. Option 2 has similar efficiency problems and is just
> kind of silly. Option 3 is probably the best, but it's still a good
> bit more complicated than a simple split_part invocation.
>
> Is there another option I'm missing? Would there be interest in
> extending split part so that negative indices counted from the end, as
> in:
>
> split_part('foo bar baz', ' ', -1) -> 'baz'
>
> Or adding a split_part_end function in which positive indices counted
> from the end:
>
> split_part_end('foo bar baz', ' ', 1) -> 'baz'
>
> I'd be happy to prepare a patch if so.
>
> Cheers,
> Nikhil
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2020-10-23 18:20:50 Re: split_part for the last element
Previous Message PALAYRET Jacques 2020-10-23 16:03:47 Re: split_part for the last element