Re: psycopg3 and adaptation choices

From: Vladimir Ryabtsev <greatvovan(at)gmail(dot)com>
To: Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, psycopg(at)postgresql(dot)org
Subject: Re: psycopg3 and adaptation choices
Date: 2020-11-09 02:48:54
Message-ID: CAMqTPqmNC8EWQ0T1qtF6HyxkXehb09QuGBNwB0bDFR-QDsqpew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: psycopg

A wild idea: support both client-side (like in psycopg2) and server-side
binding. Keep old '%s' syntax and provide a separate method for client-side
binding (not in 'cur.execute()'). This could alleviate cases like
parameters roundtripping and other cases of safe query composition. At
the same time use '$N' for true server-side binding.
Is it an overcomplication or there are valid use-cases of that?

On Sun, 8 Nov 2020 at 18:19, Vladimir Ryabtsev <greatvovan(at)gmail(dot)com> wrote:

> Hello,
>
> From what I understood from your messages, I like passing 'unknown' for
> both strings and numbers.
> Roundtripping parameters seems to be a less common case (with a possible
> fix if it's necessary).
> Is there anything else that does not work or works counterintuitively with
> Python 'int' -> 'unknown'?
>
> BTW, may I ask another question regarding parameters?
> Don't you want to step away from '%s' syntax and use '$1, $2, ...' which
> seems to be more traditional in the database world?
> '%s' feels like old-school string formatting, new server-side parameter
> binding may want to give some new impression.
> Moreover, it appears more convenient when you have parameters numbered and
> can reuse them a few times in a query.
>
> Vladimir
>
> On Sun, 8 Nov 2020 at 14:22, Daniele Varrazzo <daniele(dot)varrazzo(at)gmail(dot)com>
> wrote:
>
>> On Sun, 8 Nov 2020 at 20:35, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>> wrote:
>>
>> > Alright I understand now.
>> > More below.
>> >
>> > >
>> > > In psycopg3 the idea is to use a more advanced protocol, which
>> > > separates query and parameters. It brings several benefits: can use
>> > > prepared statements (send a query once, several parameters later),
>> > > passing large data doesn't bloat the parser (the params don't hit the
>> > > lexer/parser), can use binary format (useful to pass large binary
>> > > blobs without escaping them in a textual form), the format of the data
>> > > is more homogeneous (no need to quoting), so we can use Python objects
>> > > in COPY instead of limiting the interface for the copy functions to
>> > > file-like objects only.
>> > >
>> > > Both in psycopg2 and 3 there is an adaptation from Python types to
>> > > Postgres string representation. In pg2 there is additional quoting,
>> > > because apart from numbers and bools you need to quote a literal
>> > > string to merge it to the query and make it syntactically valid.
>> >
>> > So the issue in the psycopg3 protocol is making the parameters that are
>> > passed in separately match up correctly in type to what the server is
>> > expecting(or can cast implicitly)?
>>
>> Yes, correct. What we have to choose is which Postgres oid to map to
>> each Python type.
>>
>> Sometimes the mapping is trivial (e.g. `datetime.date` -> `date` in
>> Postgres, `uuid.UUID` -> `uuid`...)
>>
>> Sometimes it might be ambiguous: is a `datetime.datetime` a
>> `timestamp` or a `timestamptz`? In some cases we don't care (here we
>> can say `timestamptz` no problem: if the Python datetime doesn't have
>> tzinfo, Postgres will use the `TimeZone` setting).
>>
>> Sometimes it's messy: what Python type corresponds to a Postgres
>> `jsonb`? It might be a dict, or a list, or types that have other
>> representations too (numbers, strings, bools). In this case, as in
>> psycopg2, there can be a wrapper, e.g. `Json`, to tell psycopg that
>> this dict, or list, or whatever else, must be jsonified for the db.
>>
>> When there are mismatches, sometimes the database cast rules help
>> (e.gi in the timestamp[tz] case). Sometimes not: if we say `text` to a
>> jsonb field, it will raise an error. Sometimes a cast is automatic on
>> inserting in a table but not on passing a function parameter.
>>
>> Numbers are messy, as they usually are: Python has int, float,
>> Decimal, Postgres has int2, int4, int8, float4, float8, numeric. The
>> mappings float -> float8 and Decimal -> numeric are more or less
>> straightforward. `int` is not, as in Python it's unbounded. If you say
>> `select 10` in psql, the server understands "unknown type, but a
>> number", and can try if either int* or numeric fit the context. But we
>> don't have the help from the syntax that psql has: because 10 doesn't
>> have quotes, Postgres is sure that it is a number, and not a string,
>> but executing query/params separately we lose that expressivity: we
>> cannot quote the strings and not the number. So choices are:
>>
>> 1. If we specify `numeric` or `int8` as oid, inserting in an int field
>> in a table will work ok, but some functions/operators won't (e.g. "1
>> >> %s").
>> 2. If we specify `int4` it would work for those few functions defined
>> as `integer`, but if we try to write a number that doesn't fit in 32
>> bits into a Postgres bigint field I assume something will overflow
>> along the way, even if both python and postgres can handle it.
>> 3. If we specify `unknown` it might work more often, but
>> `cursor.execute("select %s", [10]) will return the string "10" instead
>> of a number.
>>
>> So I wonder what's the best compromise to do here: the less bad seems
>> 1. 3. might work in more contexts, but it's a very counterintuitive
>> behaviour, and roundtripping other objects (dates, uuid) works no
>> problem: they don't come back as strings.
>>
>> -- Daniele
>>
>>
>>

In response to

Responses

Browse psycopg by date

  From Date Subject
Next Message Federico Di Gregorio 2020-11-09 06:57:17 Re: psycopg3 and adaptation choices
Previous Message Vladimir Ryabtsev 2020-11-09 02:19:04 Re: psycopg3 and adaptation choices