From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | sawada(dot)mshk(at)gmail(dot)com |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com, andres(at)anarazel(dot)de, peter(dot)eisentraut(at)2ndquadrant(dot)com |
Subject: | Re: [HACKERS] Restricting maximum keep segments by repslots |
Date: | 2018-07-13 08:40:04 |
Message-ID: | 20180713.174004.249224160.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello.
At Wed, 11 Jul 2018 15:09:23 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoCFtW6+SN_eVTszDAjQeeU2sSea2VpCEx08ejNafk8H9w(at)mail(dot)gmail(dot)com>
> On Mon, Jul 9, 2018 at 2:47 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
..
> Here is review comments of v4 patches.
>
> + if (minKeepLSN)
> + {
> + XLogRecPtr slotPtr = XLogGetReplicationSlotMinimumLSN();
> + Assert(!XLogRecPtrIsInvalid(slotPtr));
> +
> + tailSeg = GetOldestKeepSegment(currpos, slotPtr);
> +
> + XLogSegNoOffsetToRecPtr(tailSeg, 0, *minKeepLSN,
> wal_segment_size);
> + }
>
> The usage of XLogSegNoOffsetToRecPtr is wrong. Since we specify the
> destination at 4th argument the wal_segment_size will be changed in
> the above expression. The regression tests by PostgreSQL Patch Tester
I'm not sure I get this correctly, the definition of the macro is
as follows.
| #define XLogSegNoOffsetToRecPtr(segno, offset, dest, wal_segsz_bytes) \
| (dest) = (segno) * (wal_segsz_bytes) + (offset)
The destination is the *3rd* parameter and the forth is segment
size which is not to be written.
> seem passed but I got the following assertion failure in
> recovery/t/010_logical_decoding_timelines.pl
>
> TRAP: FailedAssertion("!(XLogRecPtrToBytePos(*StartPos) ==
> startbytepos)", File: "xlog.c", Line: 1277)
Hmm. I don't see a relation with this patch, but how did you
cause the failure? The failure means inconsistency between
existing XLogBytePosToRecPtr and XLogRecPtrToBytePos, which
doesn't seem to happen without modifying the two functions.
> ----
> + XLByteToSeg(restartLSN, restartSeg, wal_segment_size);
> +
> +
> + if (minKeepLSN)
> There is an extra empty line.
>
> ----
> + /* but, keep larger than wal_segment_size if any*/
> + if (wal_keep_segments > 0 && keepSegs < wal_keep_segments)
> + keepSegs = wal_keep_segments;
>
> You meant wal_keep_segments in the above comment rather than
> wal_segment_size? Also, the above comment need a whitespace just after
> "any".
Ouch! You're right. Fixed.
> ----
> +bool
> +IsLsnStillAvaiable(XLogRecPtr restartLSN, XLogRecPtr *minKeepLSN)
> +{
>
> I think restartLSN is a word used for replication slots. Since this
> function is defined in xlog.c it would be better to change the
> argument name to more generic name, for example recptr.
Agreed. I used "target" instead.
> ----
> + /*
> + * Calcualte keep segments by slots first. The second term of the
> + * condition is just a sanity check.
> + */
>
> s/calcualte/calculate/
Fixed.
> ----
> + /* get minimum segment ignorig timeline ID */
>
> s/ignorig/ignoring/
Fixed.
# One of my fingers is literally fatter with bandaid than usual..
> ----
> min_keep_lsn in pg_replication_slots currently shows the same value in
> every slots but I felt that the value seems not easy to understand
> intuitively for users because users will have to confirm that value
> and to compare the current LSN in order to check if replication slots
> will become the "lost" status. So how about showing values that
> indicate how far away from the point where we become "lost" for
> individual slots?
Yeah, that is what I did in the first cut of this patch from the
same thought. pg_replication_slots have two additional columns
"live" and "distance".
https://www.postgresql.org/message-id/20171031.184310.182012625.horiguchi.kyotaro@lab.ntt.co.jp
Thre current design is changed following a comment.
https://www.postgresql.org/message-id/20171108.131431.170534842.horiguchi.kyotaro%40lab.ntt.co.jp
> > I don't think 'distance' is a good metric - that's going to continually
> > change. Why not store the LSN that's available and provide a function
> > that computes this? Or just rely on the lsn - lsn operator?
>
> It seems reasonable.,The 'secured minimum LSN' is common among
> all slots so showing it in the view may look a bit stupid but I
> don't find another suitable place for it. distance = 0 meant the
> state that the slot is living but insecured in the previous patch
> and that information is lost by changing 'distance' to
> 'min_secure_lsn'.
As I reconsidered this, I noticed that "lsn - lsn" doesn't make
sense here. The correct formula for the value is
"max_slot_wal_keep_size * 1024 * 1024 - ((oldest LSN to keep) -
restart_lsn). It is not a simple formula to write by hand but
doesn't seem general enough. I re-changed my mind to show the
"distance" there again.
pg_replication_slots now has the column "remain" instaed of
"min_keep_lsn", which shows an LSN when wal_status is "streaming"
and otherwise "0/0". In a special case, "remain" can be "0/0"
while "wal_status" is "streaming". It is the reason for the
tristate return value of IsLsnStillAvaialbe().
wal_status | remain
streaming | 0/19E3C0 -- WAL is reserved
streaming | 0/0 -- Still reserved but on the boundary
keeping | 0/0 -- About to be lost.
lost | 0/0 -- Lost.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Add-WAL-releaf-vent-for-replication-slots.patch | text/x-patch | 6.5 KB |
v5-0002-Add-monitoring-aid-for-max_slot_wal_keep_size.patch | text/x-patch | 11.9 KB |
v5-0003-TAP-test-for-the-slot-limit-feature.patch | text/x-patch | 5.3 KB |
v5-0004-Documentation-for-slot-limit-feature.patch | text/x-patch | 5.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fabien COELHO | 2018-07-13 08:46:23 | Re: pgbench's expression parsing & negative numbers |
Previous Message | Fabien COELHO | 2018-07-13 08:39:53 | Re: Constraint documentation |