From: | Jerome Wagner <jerome(dot)wagner(at)laposte(dot)net> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | question regarding copyData containers |
Date: | 2020-06-03 17:28:12 |
Message-ID: | CA+=V_fNc60NLg9SmoL0mY1EkMsnqcJ0w0hQG2Ye6SWLVLdbrvg@mail.gmail.com |
Lists: | pgsql-hackers |
Hello,
I have been working on a node.js streaming client for different COPY
scenarios.
Usually, during CopyOut, clients tend to buffer network chunks until they
have gathered a full copyData message, and then pass that to the user.
In some cases, this can lead to very large copyData messages: when there
are very long text fields or bytea fields, a lot of memory is required to
handle a single message (up to 1GB, I think, in the worst case).
In COPY TO, I managed to relax that requirement by considering that copyData
is simply a transparent container. For each network chunk, the relevant
message content is forwarded, which makes for 64KB chunks at most.
If that makes things clearer, here is an example scenario, with 4 network
chunks received and the way they are forwarded to the client.
in: CopyData Int32Len Byten1
in: Byten2
in: Byten3
in: CopyData Int32Len Byten4
out: Byten1
out: Byten2
out: Byten3
out: Byten4
We lose the semantics of the "row" that copyData has according to the
documentation
https://www.postgresql.org/docs/10/protocol-flow.html#PROTOCOL-COPY
> The backend sends a CopyOutResponse message to the frontend, followed by
> zero or more CopyData messages (**always one per row**), followed by
> CopyDone
but it is not a problem because the raw bytes are still parsable (rows +
fields) both in text mode (tsv) and in binary mode.
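As a rough illustration of the forwarding scheme described above for COPY TO, here is a minimal Python sketch (not the node.js client itself). The class name `CopyDataForwarder` and its `feed` method are hypothetical; the only protocol facts assumed are that every backend message starts with a 1-byte type followed by an Int32 length that counts itself, and that CopyData has type `d`.

```python
import struct


class CopyDataForwarder:
    """Forward CopyData payload bytes as they arrive, without buffering a whole message."""

    HEADER_SIZE = 5  # 1-byte message type + Int32 length (the length counts itself)

    def __init__(self):
        self._header = bytearray()  # partial header carried over between chunks
        self._type = None           # type byte of the message currently being read
        self._remaining = 0         # payload bytes still expected for that message

    def feed(self, chunk):
        """Yield the CopyData payload pieces contained in one network chunk."""
        view = memoryview(chunk)
        pos = 0
        while pos < len(view):
            if self._type is None:
                # Still collecting the 5-byte header, which may itself be split.
                take = view[pos:pos + self.HEADER_SIZE - len(self._header)]
                self._header += take
                pos += len(take)
                if len(self._header) == self.HEADER_SIZE:
                    self._type = bytes(self._header[:1])
                    (length,) = struct.unpack("!i", self._header[1:])
                    self._remaining = length - 4
                    self._header.clear()
                    if self._remaining == 0:
                        self._type = None  # message without payload, e.g. CopyDone
                continue
            piece = view[pos:pos + self._remaining]
            pos += len(piece)
            self._remaining -= len(piece)
            if self._type == b"d":
                yield bytes(piece)  # forward immediately; other message types are skipped
            if self._remaining == 0:
                self._type = None
```

Each forwarded piece is at most as large as the network chunk it came from, which is the point of the approach. The cost is that the consumer no longer sees where one copyData ends and the next begins.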
Now I have started working on copyBoth and logical decoding scenarios. In
this case, the server sends a series of copyData messages, each copyData
containing one message. At the network chunk level, in the case of large
fields, we can observe
in: CopyData Int32 XLogData Int64 Int64 Int64 Byten1
in: Byten2
in: CopyData Int32 XLogData Int64 Int64 Int64 Byten3
in: CopyData Int32 XLogData Int64 Int64 Int64 Byten4
out: XLogData Int64 Int64 Int64 Byten1
out: Byten2
out: XLogData Int64 Int64 Int64 Byten3
out: XLogData Int64 Int64 Int64 Byten4
but at the XLogData level, the protocol does not describe its own length,
so there is no real way of knowing where the first XLogData ends apart from
- knowing the length of the first copyData (4 + 1 + 3*8 + n1 + n2)
- knowing the internals of the output plugin and benefiting from a plugin
that self-describes its span
When a network chunk contains several copyData messages
in: CopyData Int32 XLogData Int64 Int64 Int64 Byten1 CopyData Int32 XLogData Int64 Int64 Int64 Byten2
we have
out: XLogData Int64 Int64 Int64 Byten1 XLogData Int64 Int64 Int64 Byten2
and with test_decoding for example it is impossible to know where the
test_decoding output ends without remembering the original length of the
copyData.
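To make the length arithmetic concrete, here is a small Python sketch that delimits the plugin output by using the copyData length, as in the first option listed above. The function name `split_xlogdata` is hypothetical, and for simplicity it assumes the buffer holds whole copyData messages back to back (the single-network-chunk case), not a truly streamed input. The XLogData layout assumed is the one from the diagrams: a `w` byte followed by three Int64 values.

```python
import struct

# XLogData header: 'w' byte, WAL start, WAL end, server clock (1 + 3*8 = 25 bytes)
XLOG_HEADER = struct.Struct("!cQQq")


def split_xlogdata(buffer):
    """Return (wal_start, wal_end, clock, plugin_output) for each XLogData in `buffer`."""
    records = []
    pos = 0
    while pos < len(buffer):
        msg_type = buffer[pos:pos + 1]
        (length,) = struct.unpack_from("!i", buffer, pos + 1)
        body_start = pos + 5            # skip the type byte and the Int32 length
        body_end = pos + 1 + length     # the length counts itself but not the type byte
        if msg_type == b"d" and buffer[body_start:body_start + 1] == b"w":
            _, wal_start, wal_end, clock = XLOG_HEADER.unpack_from(buffer, body_start)
            # Everything left inside this copyData is the plugin output:
            # length - 4 - 1 - 3*8 bytes.
            plugin_output = buffer[body_start + XLOG_HEADER.size:body_end]
            records.append((wal_start, wal_end, clock, plugin_output))
        pos = body_end
    return records
```

Without the `length` field there is nothing in the XLogData header itself that says where `plugin_output` stops, which is the difficulty described above.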
Now my question is the following:
is it OK to consider that, over the long term, copyData is simply a transport
container that exists only to allow the multiplexing of events in the
protocol, and that the messages inside could be chunked over several copyData
events?
Setting test_decoding aside, do you consider that the XLogData produced by
output plugins should describe its own length? I suppose (but have not fully
verified yet) that this is the case for pgoutput. I suppose that wal2json
could also be parsed by balancing the brackets.
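For what it is worth, the bracket-balancing idea could look like the following Python sketch. It is only an assumption about how one might delimit a stream of concatenated top-level JSON objects, not something wal2json documents or guarantees; the class name `JsonObjectSplitter` is hypothetical. Braces inside string literals have to be ignored, so the scanner also tracks quotes and backslash escapes.

```python
class JsonObjectSplitter:
    """Split a byte stream of concatenated JSON objects by balancing braces."""

    def __init__(self):
        self._depth = 0            # current nesting depth of '{' ... '}'
        self._in_string = False    # inside a JSON string literal
        self._escaped = False      # previous byte was a backslash inside a string
        self._current = bytearray()

    def feed(self, chunk):
        """Return the complete top-level objects that end inside `chunk`."""
        done = []
        for byte in chunk:
            if self._depth == 0 and byte != ord("{"):
                continue  # skip whitespace or separators between objects
            self._current.append(byte)
            if self._in_string:
                if self._escaped:
                    self._escaped = False
                elif byte == ord("\\"):
                    self._escaped = True
                elif byte == ord('"'):
                    self._in_string = False
            elif byte == ord('"'):
                self._in_string = True
            elif byte == ord("{"):
                self._depth += 1
            elif byte == ord("}"):
                self._depth -= 1
                if self._depth == 0:
                    done.append(bytes(self._current))
                    self._current.clear()
        return done
```

Scanning byte by byte is safe with UTF-8 because multi-byte sequences never contain the ASCII bytes for quotes, backslashes or braces. Note that this sketch still buffers each object in full, so it recovers message boundaries but not the memory savings of chunk-wise forwarding.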
I am wondering because when a client sends copyData to the server, the
documentation says
> The message boundaries are not required to have anything to do with row
> boundaries, although that is often a reasonable choice.
I hope that my message will ring a bell on the list.
I tried the best I could to describe my very specific research.
Thank you for your help,
---
Jérôme