From: | Isaac Morland <isaac(dot)morland(at)gmail(dot)com> |
---|---|
To: | Anders Åstrand <anders(at)449(dot)se> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Add uri percent-encoding for binary data |
Date: | 2019-10-07 21:38:15 |
Message-ID: | CAMsGm5dGOiQm8vG=D7vAgMDyFG9U+L+eJOugTN2WhT5PY84DPA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, 7 Oct 2019 at 03:15, Anders Åstrand <anders(at)449(dot)se> wrote:
> Hello
>
> Attached is a patch for adding uri as an encoding option for
> encode/decode. It uses what's called "percent-encoding" in rfc3986
> (https://tools.ietf.org/html/rfc3986#section-2.1)
>
> The background for this patch is that I could easily build urls in
> plpgsql, but doing the actual encoding of the url parts is painfully
> slow. The list of available encodings for encode/decode looks quite
> arbitrary to me, so I can't see any reason this one couldn't be in
> there.
>
> In modern web scenarios one would most likely want to encode
> the utf8 representation of a text string for inclusion in a url, in
> which case correct invocation would be ENCODE(CONVERT_TO('some text in
> database encoding goes here', 'UTF8'), 'uri'), but uri
> percent-encoding can of course also be used for other text encodings
> and arbitrary binary data.
>
This seems like a useful idea to me. I've used the equivalent in Python and
it provides more options:
https://docs.python.org/3/library/urllib.parse.html#url-quoting
I suggest reviewing the documentation there, because there are a few
details that need to be checked carefully. Whether or not space should be
encoded as plus and whether certain byte values should be exempt from
%-encoding is something that depends on the application. Unfortunately, as
far as I can tell there isn't a single version of URL encoding that
satisfies all situations (thus explaining the complexity of the Python
implementation). It might be feasible to suppress some of the Python
options (I'm wondering about the safe= parameter) but I'm pretty sure you
at least need the equivalent of quote and quote_plus.
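
To make the difference concrete, here is a minimal sketch using Python's urllib.parse, which is the behaviour I am comparing against. The sample string is made up for illustration, and none of this is a claim about what the patch currently does. It shows how quote and quote_plus treat the space character, and how the safe= parameter decides which bytes are exempt from %-encoding.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical sample input: a space, a slash, reserved characters,
# and one non-ASCII character (encoded as UTF-8 by default).
text = "a b/c&d=é"

# quote(): space becomes %20; "/" is left alone because safe="/" is the default.
path_style = quote(text)

# quote(..., safe=""): only the unreserved characters stay unencoded.
strict = quote(text, safe="")

# quote_plus(): space becomes "+", and "/" is encoded (safe="" is its default).
form_style = quote_plus(text)

# Arbitrary binary data can be passed as bytes.
binary = quote(b"\x00\xff")

print(path_style)  # a%20b/c%26d%3D%C3%A9
print(strict)      # a%20b%2Fc%26d%3D%C3%A9
print(form_style)  # a+b%2Fc%26d%3D%C3%A9
print(binary)      # %00%FF

# Decoding has the same split: only unquote_plus() turns "+" back into a space.
print(unquote("a+b"))       # a+b
print(unquote_plus("a+b"))  # a b
```

The decode side is why a single 'uri' encoding name may not be enough: a decoder has to know whether "+" means a literal plus or a space.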