From: | Jelte Fennema-Nio <postgres(at)jeltef(dot)nl> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Flushing large data immediately in pqcomm |
Date: | 2024-03-21 09:44:17 |
Message-ID: | CAGECzQSAPE8JCD7dZQ7WKjy6G4GmsjAXkYq8nYh_ASnuN97Obg@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, 21 Mar 2024 at 01:45, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> As I understand the code, there's no problem calling
> internal_flush_buffer() when the buffer is empty and I suspect that if
> we're sending a few buffers with "len > PqSendBufferSize" that it's
> just so unlikely that the buffer is empty that we should just do the
> function call and let internal_flush_buffer() handle doing nothing if
> the buffer really is empty. I think the chances of
> internal_flush_buffer() having to do exactly nothing here is less than
> 1 in 8192, so I just don't think the check is worthwhile.
I think you're missing the exact case that we're trying to improve
here: calls to internal_putbytes with a very large len, e.g. 1MB.
With the new code the buffer will be empty ~50% of the time (not less
than 1 in 8192) for such large inputs, because this is the flow that
will happen:
1. We check len > PqSendBufferSize. There are some bytes in the buffer,
e.g. the 5 bytes of the msgtype. So we fill up the buffer, but still
have many bytes left in len.
2. We loop again, because len is not 0.
3. We flush the buffer (at the top of the loop) because the buffer is full.
4. We check len > PqSendBufferSize. Now the buffer is empty, so we
call internal_flush_buffer directly.
As you can see, we check len > PqSendBufferSize twice (in step 1 and
step 4), and 1 out of 2 times the buffer is empty.
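To make that flow concrete, here is a rough standalone simulation of the loop. It is only a sketch and not the actual pqcomm.c code: SimStats and simulate_putbytes are made-up names, the buffer is modelled as a byte counter, and a flush is assumed to always drain everything in one go.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tally of what the size check saw; not part of PostgreSQL. */
typedef struct SimStats
{
    int     checks;         /* times "len > buf_size" was true */
    int     empty_at_check; /* of those, how often the buffer was empty */
    size_t  direct_bytes;   /* bytes sent straight from the caller's data */
    size_t  copied_bytes;   /* bytes copied through the send buffer */
} SimStats;

/*
 * Simulate one internal_putbytes-style call of "len" bytes, with "pending"
 * bytes (e.g. a 5-byte message header) already sitting in a send buffer of
 * "buf_size" bytes.  No I/O is done; only the bookkeeping is modelled.
 */
static SimStats
simulate_putbytes(size_t buf_size, size_t pending, size_t len)
{
    SimStats    st = {0, 0, 0, 0};
    size_t      used = pending;

    assert(buf_size > 0 && pending <= buf_size);

    while (len > 0)
    {
        /* Step 3: the buffer is full, so flush it at the top of the loop. */
        if (used >= buf_size)
            used = 0;

        /* Steps 1 and 4: is the remaining input larger than the buffer? */
        if (len > buf_size)
            st.checks++;

        if (len > buf_size && used == 0)
        {
            /* Empty buffer: send the rest directly, skipping the copy. */
            st.empty_at_check++;
            st.direct_bytes += len;
            len = 0;
        }
        else
        {
            /* Otherwise copy as much as fits into the buffer. */
            size_t      amount = buf_size - used;

            if (amount > len)
                amount = len;
            used += amount;
            st.copied_bytes += amount;
            len -= amount;
        }
    }

    return st;
}
```

Calling simulate_putbytes(8192, 5, 1024 * 1024) runs the size check twice and finds the buffer empty on the second check only: 8187 bytes go through the buffer and the remaining 1040389 bytes are sent directly.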
To be clear, the code is done this way so our behaviour would only
ever be better than the status quo, and cause no regressions. For
instance, flushing the 5-byte header separately and then flushing the
full input buffer might result in more IP packets being sent in total
in some cases, due to our use of TCP_NODELAY.