Re: RFC: Async query processing

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Florian Weimer <fweimer(at)redhat(dot)com>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: Async query processing
Date: 2014-01-03 13:36:42
Message-ID: CAGTBQpak7r6Woyc4FqK4C+Yh7=9w1OMBxETxivCP=sg_s35Ytg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer <fweimer(at)redhat(dot)com> wrote:
> On 01/02/2014 07:52 PM, Claudio Freire wrote:
>
>>> No, because this doesn't scale automatically with the bandwidth-delay
>>> product. It also requires that the client buffers queries and their
>>> parameters even though the network has to do that anyway.
>>
>>
>> Why not? I'm talking about transport-level packets, btw, not libpq
>> frames/whatever.
>>
>> Yes, the network stack will sometimes do that. But it doesn't have
>> to do it. It does it sometimes, which is not the same.
>
>
> The network inevitably buffers because the speed of light is not infinite.
>
> Here's a concrete example. Suppose the server is 100ms away, and you want
> to send data at a constant rate of 10 Mbps. The server needs to acknowledge
> the data you sent, but this acknowledgment arrives after 200 ms. As a
> result, you've sent 2 Mbits before the acknowledgment arrives, so the
> network appears to have buffered 250 KB. This effect can actually be used
> for data storage, called "delay line memory", but it is somewhat out of
> fashion now.
...
>> So, trusting the network stack to do the quick start won't work. For
>> steady streams of queries, it will work. But not for short bursts,
>> which will be the most heavily used case I believe (most apps create
>> short bursts of inserts and not continuous streams at full bandwidth).
>
>
> Loading data into the database isn't such an uncommon task. Not everything
> is OLTP.

True, but a sustained insert stream of 10 Mbps is certainly way
beyond common non-OLTP loads. This is far more specific than non-OLTP.
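
For reference, the figures in the quoted example follow from the
bandwidth-delay product. The sketch below only redoes that arithmetic;
the function name is made up for illustration and is not part of any
library.

```python
def bytes_in_flight(rate_bits_per_s: float, round_trip_s: float) -> float:
    """Bandwidth-delay product: bytes sent before the first acknowledgment returns."""
    return rate_bits_per_s * round_trip_s / 8

# Figures from the quoted example: 10 Mbps, server 100 ms away (200 ms round trip).
in_flight = bytes_in_flight(10e6, 0.200)
print(f"{in_flight / 1000:.0f} KB in flight")
```

The result matches the 250 KB quoted above. A path that is shorter, or a
stream that is slower, keeps proportionally less data in flight.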

Buffering will benefit every application that doesn't send a steady,
sustained query stream, which is the vast majority of applications.
An ORM doing a flush falls into this category, so it's an
overwhelmingly common case.
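
A minimal sketch of what explicit client-side buffering of such a burst
could look like. Everything here is hypothetical: `QueryBuffer`, its
`max_bytes` threshold, and the `;\n` separator are illustrative stand-ins,
not libpq API and not the PostgreSQL wire protocol. An in-memory
`io.BytesIO` stands in for the socket.

```python
import io


class QueryBuffer:
    """Hypothetical client-side buffer: collects statements, writes them in one call."""

    def __init__(self, sink, max_bytes=8192):
        self.sink = sink            # any object with write(bytes); stands in for a socket
        self.max_bytes = max_bytes  # assumed flush threshold, not a libpq setting
        self.pending = []
        self.pending_bytes = 0
        self.writes = 0             # number of write() calls issued so far

    def add(self, statement: str) -> None:
        data = statement.encode("utf-8") + b";\n"
        # Flush first if this statement would push the buffer past the threshold.
        if self.pending and self.pending_bytes + len(data) > self.max_bytes:
            self.flush()
        self.pending.append(data)
        self.pending_bytes += len(data)

    def flush(self) -> None:
        if not self.pending:
            return
        self.sink.write(b"".join(self.pending))
        self.writes += 1
        self.pending.clear()
        self.pending_bytes = 0


sink = io.BytesIO()
buf = QueryBuffer(sink)
for i in range(100):
    buf.add(f"INSERT INTO t (id) VALUES ({i})")
buf.flush()  # an ORM flush would end the burst here
print(buf.writes, "write call(s) for 100 statements")
```

The point of the sketch is only that a short burst reaches the transport
as a few large writes instead of one small write per statement.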

>> And buffering algorithms are quite platform-dependent anyway, so it's
>> not the best idea to make libpq highly reliant on them.
>
>
> That is why I think libpq needs to keep sending until the first response
> from the server arrives. Batching a fixed number of INSERTs together in a
> single conceptual query does not achieve auto-tuning to the buffering
> characteristics of the path.

Not on its own, but it does improve throughput during slow start, which
benefits OLTP, which is a hugely common use case. As you say, the
network will then auto-tune when the query stream is consistent
enough, so what's the problem with explicitly buffering a little then?
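
The slow-start argument can be illustrated with a toy model. It is a
rough sketch under loudly stated assumptions, none of which come from
this thread: the congestion window is counted in segments, starts at 10
segments, doubles every round trip, nothing is lost, and without
batching each query leaves in its own segment (for example with
TCP_NODELAY set). The burst size and query size are made-up numbers.

```python
import math


def flights(segments: int, initial_window: int = 10) -> int:
    """Window-limited flights needed to send `segments`, doubling the window each time."""
    sent, window, count = 0, initial_window, 0
    while sent < segments:
        sent += window
        window *= 2
        count += 1
    return count


MSS = 1460          # assumed payload bytes per segment
QUERIES = 1000      # assumed burst size
QUERY_BYTES = 100   # assumed size of one small INSERT

unbatched = flights(QUERIES)                               # one segment per query
batched = flights(math.ceil(QUERIES * QUERY_BYTES / MSS))  # queries packed into full segments
print(f"unbatched: {unbatched} flights, batched: {batched} flights")
```

Each flight after the first costs roughly one round trip, so under these
assumptions packing the burst into full segments finishes it in fewer
round trips. Real stacks differ, which is the platform-dependence
mentioned above.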
