From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Attempt to consolidate reading of XLOG page |
Date: | 2019-09-26 12:08:33 |
Message-ID: | 75115.1569499713@antos |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> On 2019-Sep-24, Antonin Houska wrote:
>
> > Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> > > If you don't have any strong dislikes for these changes, I'll push this
> > > part and let you rebase the remains on top.
> >
> > No objections here.
>
> oK, pushed. Please rebase the other parts.
Thanks!
> I made one small adjustment: in read_local_xlog_page() there was one
> *readTLI output parameter that was being changed to a local variable
> plus later assigment to the output struct member; I changed the code to
> continue to assign directly to the output variable instead. There was
> an error case in which the TLI was not assigned to; I suppose this
> doesn't really change things (we don't examine the TLI in that case, do
> we?), but it seemed dangerous to leave like that.
I used the local variable to make some expressions simpler, but missed the
fact that this way I can leave the ws_tli field unassigned if the function
returns prematurely. Now that I look closer, I see that it can be a problem -
in the case of ERROR, XLogReadRecord() does reset the state, but it does not
reset the TLI:
err:
/*
* Invalidate the read state. We might read from a different source after
* failure.
*/
XLogReaderInvalReadState(state);
Thus the TLI appears to be important even on ERROR, and what you've done is
correct. Thanks for fixing that.
One comment on the remaining part of the series:
Before this refactoring, the walsender.c:XLogRead() function contained these
lines
/*
* After reading into the buffer, check that what we read was valid. We do
* this after reading, because even though the segment was present when we
* opened it, it might get recycled or removed while we read it. The
* read() succeeds in that case, but the data we tried to read might
* already have been overwritten with new WAL records.
*/
XLByteToSeg(startptr, segno, segcxt->ws_segsize);
CheckXLogRemoved(segno, ThisTimeLineID);
but they don't fit into the new, generic implementation, so I copied these
lines to the two places right after the call of the new XLogRead(). However I
was not sure if ThisTimeLineID was ever correct here. It seems the original
walsender.c:XLogRead() implementation did not update ThisTimeLineID (and
therefore neither the new callback WalSndSegmentOpen() does), so both
logical_read_xlog_page() and XLogSendPhysical() could read the data from
another (historic) timeline. I think we should check the segment we really
read data from:
CheckXLogRemoved(segno, sendSeg->ws_tli);
The rebased code is attached.
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Attachment | Content-Type | Size |
---|---|---|
v07-0005-Use-only-xlogreader.c-XLogRead.patch | text/x-diff | 18.7 KB |
v07-0006-Remove-the-old-implemenations-of-XLogRead.patch | text/x-diff | 13.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2019-09-26 13:06:27 | Re: Refactoring of connection with password prompt loop for frontends |
Previous Message | Alexey Kondratov | 2019-09-26 12:08:22 | Re: [Patch] pg_rewind: options to use restore_command from recovery.conf or command line |