Re: Proposal: http2 wire format

From: Damir Simunic <damir(dot)simunic(at)wa-research(dot)ch>
To: David Fetter <david(at)fetter(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: http2 wire format
Date: 2018-03-25 22:00:35
Message-ID: 8025F003-8593-4BA1-B91A-A02871ABB025@wa-research.ch
Lists: pgsql-hackers

> On 25 Mar 2018, at 19:42, David Fetter <david(at)fetter(dot)org> wrote:
>
> On Sat, Mar 24, 2018 at 06:52:47PM +0100, Damir Simunic wrote:
>> Hello hackers,
>>
>> I’d like to propose the implementation of new wire protocol using http2 framing.
>
> Welcome to the PostgreSQL community! This is a very interesting idea.
> Please send a patch to this mailing list on this thread.
>

Thanks David, very excited to be part of pgsql-hackers!

> In order to get and keep it on the radar, you should know about how
> development works in PostgreSQL.
>
> http://wiki.postgresql.org/wiki/Development_information
>
> In particular, please look at: http://wiki.postgresql.org/wiki/Submitting_a_Patch
>

To put it out front: my forte is product design, not C coding. (Also, I made a grammar error in the opening sentence: I’m not proposing “the implementation”, but “implementing h2 as new wire proto”)

I did study all of the resources you mentioned, and I am voraciously reading up on Postgres internals, scouring its source, practicing C development, etc.

My email is the result of the first advice under “Brand new features” in “So you want to be a developer?”.

> I notice that you patched 10. New features, and this is definitely
> one, go against git master.
>

Let me figure out how to do that pronto. 10.2 tarball was easier to learn from as it was not a moving target. Whatever I did so far is not yet patch-worthy.

>> It appears to me that http2 solves many of the issues on the TODO
>> list under “Wire Protocol Changes / v4 Protocol,“ without any
>> obvious downsides.
>
> Here are a few things to consider, at least from my perspective:
>
> - Docs. Gotta have some: https://wiki.postgresql.org/wiki/Documentation_Tools

No worries about that—I love writing :)

>
> - Testing. Gotta have some in src/test/regress in the source tree.

Before even getting to the patch stage, there will be a period of discussion about latency and other tradeoffs. Mandatory part of any conversation mentioning a wire protocol.

So the plan is to come up with a working prototype that we can plug into protocol testing tools and measure the heck out of it in context. Yet one more thing to figure out. BTW, are there any formal tests of that kind for v3 protocol?

By that time I do hope to learn how to write code tests to put into src/test/regress.

>
> - Tight coupling to OpenSSL, if that's actually what's happening.
> We're actively trying to get away from this, so a TLS-neutral
> implementation or at least one that's not specific to OpenSSL would
> be good.

Didn’t know that. Will ifdef the openssl-dependent code. It’s not hard to implement ALPN nego to cover all viable libraries. Do you know what alternatives are being considered?
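To illustrate why the negotiation step need not depend on one TLS library: RFC 7301 defines the client's ALPN offer as a list of length-prefixed protocol names, and the server simply picks one it supports. Below is a minimal Python sketch of that selection logic only, with no TLS library involved. The function names are hypothetical and are not taken from any patch; the outer two-byte vector length of the extension is assumed to have been stripped already.

```python
from typing import List, Optional


def encode_protocol_list(names: List[bytes]) -> bytes:
    """Encode protocol names as length-prefixed strings (ALPN list body)."""
    out = bytearray()
    for name in names:
        if not 1 <= len(name) <= 255:
            raise ValueError("protocol name must be 1..255 bytes")
        out.append(len(name))
        out += name
    return bytes(out)


def decode_protocol_list(wire: bytes) -> List[bytes]:
    """Split an ALPN list body back into individual protocol names."""
    names = []
    i = 0
    while i < len(wire):
        n = wire[i]
        name = wire[i + 1 : i + 1 + n]
        if n == 0 or len(name) != n:
            raise ValueError("malformed ALPN protocol list")
        names.append(name)
        i += 1 + n
    return names


def select_protocol(client_wire: bytes, server_prefs: List[bytes]) -> Optional[bytes]:
    """Return the first server-preferred protocol the client offered, or None."""
    offered = decode_protocol_list(client_wire)
    for proto in server_prefs:
        if proto in offered:
            return proto
    return None
```

A client offering `http/1.1` and `h2` to a server that only prefers `h2` would end up with `h2`; with no overlap the function returns `None`, and the caller decides whether to fall back or reject.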

>
> - Overhead for all clients. It may be tiny, but it needs to be
> measured and that cost needs to be weighed against the benefits.
> Maybe a cache miss in the context of a network connection is
> negligible, but we do need to know.

Important point. If h2 is to be seriously considered, then it must be an improvement in absolutely every aspect.

The core part of this proposal is that h2 is parallel to v3. Something one can opt into by compiling `--with_http2`.

Even if h2 makes it into PG12, it's likely that the existing installed base would elect not to compile it in, as there are no immediate benefits to them. The first wave of users will be web-facing apps. They already pay the penalty of conversion to/from v3, so in those scenarios the switch will be a gain.

Then again, if h2 becomes the new v4, then libpq-fe will support it, so we might find that saving one or two network round trips amply offsets a one-byte socket peek, and everyone will eagerly upgrade. Who knows.

My PoC strategy is to touch existing code as little as possible. Yet if the ProcessStartupPacket can somehow return the consumed bytes back to the TLS lib for negotiation, then there’s zero cost to protocol detection for v2/v3 clients and only h2 clients pay the price of the extra check.
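As a rough illustration of the detection idea (a Python sketch, not the PoC code), the first bytes of a connection can be inspected with `MSG_PEEK` so that nothing is consumed before the real handler runs. The constants are the TLS handshake record type, the HTTP/2 client connection preface from RFC 7540, and the PostgreSQL SSLRequest and v3 startup codes; the function names and returned labels are made up for this example.

```python
import socket
import struct

H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"  # RFC 7540, section 3.5
TLS_HANDSHAKE = 0x16         # first byte of a TLS handshake record
SSL_REQUEST_CODE = 80877103  # PostgreSQL SSLRequest code
V3_MAJOR = 3                 # major version in a v3 StartupMessage


def classify(first: bytes) -> str:
    """Guess the protocol from the first bytes a client sent."""
    if not first:
        return "closed"
    if first[0] == TLS_HANDSHAKE:
        return "tls"            # direct TLS; ALPN would pick h2 later
    if first[:4] == H2_PREFACE[:4]:
        return "h2-cleartext"
    if len(first) >= 8:
        # Startup-style packets: Int32 length, then Int32 code/version.
        _length, code = struct.unpack("!II", first[:8])
        if code == SSL_REQUEST_CODE:
            return "pg-sslrequest"
        if code >> 16 == V3_MAJOR:
            return "pg-v3"
    return "unknown"


def peek_protocol(sock: socket.socket) -> str:
    """Classify without consuming: the bytes stay in the socket buffer.

    A real server would have to handle a short read here; this sketch
    assumes the first 8 bytes have already arrived.
    """
    return classify(sock.recv(8, socket.MSG_PEEK))
```

Because startup packets begin with a small big-endian length, their first byte is `0x00`, which cannot be confused with `0x16` or with the `P` of the h2 preface.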

>
> - Dependency on a new external library. Fortunately, it's MIT
> licensed, so it's PostgreSQL compatible, but what happens if it
> becomes unmaintained? This has happened a couple of times, and it
> causes overhead that needs to be taken into account.

I chose nghttp2 because it gave me a quick start, it’s well designed, a good fit for this kind of work, and fortunately indeed, the license is compatible. (Also, curl links to it as well, so I am pretty confident it’ll be around.) It is very possible that over time the h2 parsing code migrates into the pg codebase. There are so many similarities to the v3 architecture that we might find a way to generalize both into a single codebase. Then the h2 frame parser/state machine becomes only a handful of .c files.

h2 is a standard; however you decide to parse it, your code will eventually converge to a stable state in the same manner that febe v3 code did. Once we master the protocol, I don’t think there’ll be much need to touch the framing code. IOW even if we just import what we need, it won’t be a big issue.
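To give a feel for how small the framing layer is, here is a Python sketch that parses the fixed 9-byte HTTP/2 frame header (RFC 7540, section 4.1) next to the 5-byte header of a regular v3 message (type byte plus a length that counts itself). This is only an illustration of the two layouts; the class and function names are invented for the example and do not come from nghttp2 or PostgreSQL.

```python
import struct
from typing import NamedTuple


class H2FrameHeader(NamedTuple):
    length: int     # payload length, 24 bits
    type: int       # e.g. 0x0 DATA, 0x1 HEADERS, 0x4 SETTINGS
    flags: int
    stream_id: int  # 31 bits; the reserved top bit is masked off


class V3MessageHeader(NamedTuple):
    type: bytes          # one-byte message type, e.g. b"Q" for Query
    payload_length: int  # bytes following the length field


def parse_h2_frame_header(buf: bytes) -> H2FrameHeader:
    if len(buf) < 9:
        raise ValueError("an h2 frame header is 9 bytes")
    hi, lo, ftype, flags, sid = struct.unpack("!BHBBI", buf[:9])
    return H2FrameHeader((hi << 16) | lo, ftype, flags, sid & 0x7FFFFFFF)


def parse_v3_message_header(buf: bytes) -> V3MessageHeader:
    if len(buf) < 5:
        raise ValueError("a v3 message header is 5 bytes")
    mtype, length = struct.unpack("!cI", buf[:5])
    # The v3 length field includes its own 4 bytes but not the type byte.
    return V3MessageHeader(mtype, length - 4)
```

Both are "type plus length, then payload" designs; the main structural addition in h2 is the stream identifier that allows several exchanges to be multiplexed on one connection.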

>> My hope is that this post leads to a conversation and gets a few
>> people excited about the idea the way I am. Maybe even some of the
>> GSoC students would take the implementation further?
>
> The conversation has started.

Thanks so much for picking up the invitation!

There are a few points that I’d really like to discuss next:

* Is there merit in the idea of a completely new v4 protocol—one that freezes the v3 and takes a new path?

* What are the criteria for getting this into the core?

* Is it better to develop in an experimental fork until the architecture is stable and then patch onto master, or are we supposed to keep proposing patches for inclusion in master, even if not all details are fully fleshed out?

>
> Again, welcome, and thanks for jumping in!
>
> Best,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

Thanks,
Damir
