Re: UTF-8 on Postgres wire protocol

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Rui Pacheco <rui(dot)pacheco(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: UTF-8 on Postgres wire protocol
Date: 2016-12-22 03:10:35
Message-ID: CAB7nPqTxbJFqVAYMLanhkTS5Am46R6GdG1vMWKG4zUTw_sT2wg@mail.gmail.com
Lists: pgsql-general

On Thu, Dec 22, 2016 at 8:25 AM, Rui Pacheco <rui(dot)pacheco(at)gmail(dot)com> wrote:
> I’m toying around with the wire protocol and came across something I don’t understand.
>
> I created a table with two columns, one called “id” and one called “señor”. When I select from that table I get the list of columns, and while it's fairly easy to identify the column with the name “id”, I’m not sure how to identify the other column:
>
> So this would be the ID column:
>
> […]
> [7] = 0x69
> [8] = 0x64

Yes, this one maps to "id".

> And this señor:
> [47] = 0x01
> [48] = 0x03
> [49] = 0x00
> [50] = 0x00

The string is from here...

> [51] = 0x73
> [52] = 0x65
> [53] = 0xc3
> [54] = 0xb1
> [55] = 0x6f
> [56] = 0x72

To here. And then señor ends.
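
As a quick check, here is a minimal Python sketch that decodes the six bytes
quoted above, assuming they are UTF-8 (the variable names are only for
illustration). The two bytes 0xc3 0xb1 together encode the single
character "ñ", so six bytes yield five characters:

```python
# Bytes [51]..[56] from the dump quoted above.
raw = bytes([0x73, 0x65, 0xC3, 0xB1, 0x6F, 0x72])

# Assumption: the column name is sent as UTF-8.
name = raw.decode("utf-8")

print(name, len(raw), len(name))  # señor 6 5
```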

> What are the 4 bytes that precede the word señor? In other words, if I were to parse this, how would I know where the column name begins and ends?

I am not sure which message you used to query them, but the answer you
are looking for is most likely here:
https://www.postgresql.org/docs/9.6/static/protocol-message-formats.html
https://www.postgresql.org/docs/9.6/static/protocol-message-types.html
If you are looking for a reliable way to re-implement the frontend-side
protocol, parsing the messages according to those docs is the way to
go.
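
For illustration only, here is a rough Python sketch of what that parsing
could look like, assuming the message in question is a RowDescription
('T') message from protocol version 3.0 as laid out in the
message-formats page. The function name and the dictionary keys are
made up for this example, and it assumes names are sent as UTF-8:

```python
import struct


def parse_row_description(msg: bytes):
    """Parse one RowDescription ('T') message into a list of field dicts.

    Hypothetical helper, not part of any library.
    """
    if msg[0:1] != b"T":
        raise ValueError("not a RowDescription message")
    # Int32 message length (includes itself, excludes the type byte),
    # then Int16 number of fields.
    _length, nfields = struct.unpack_from("!ih", msg, 1)
    pos = 7
    fields = []
    for _ in range(nfields):
        # The field name is a NUL-terminated string, so it ends at the
        # first zero byte.
        end = msg.index(b"\x00", pos)
        name = msg[pos:end].decode("utf-8")  # assumption: client_encoding is UTF8
        pos = end + 1
        # 18 bytes of fixed-size metadata follow each name:
        # table OID, column number, type OID, type size, type modifier,
        # format code.
        table_oid, attnum, type_oid, typlen, typmod, fmt = struct.unpack_from(
            "!IhIhih", msg, pos
        )
        pos += 18
        fields.append(
            {
                "name": name,
                "table_oid": table_oid,
                "attnum": attnum,
                "type_oid": type_oid,
                "typlen": typlen,
                "typmod": typmod,
                "format": fmt,
            }
        )
    return fields
```

If that assumption holds, a column name begins right after the previous
field's 18 bytes of metadata and ends at the first zero byte, so the bytes
just before a name would be the tail of the previous field's metadata
(type modifier and format code) rather than part of the string.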
--
Michael
