From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | alvherre(at)2ndquadrant(dot)com |
Cc: | jgdr(at)dalibo(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, peter(dot)eisentraut(at)2ndquadrant(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, thomas(dot)munro(at)enterprisedb(dot)com, sk(at)zsrv(dot)org, michael(dot)paquier(at)gmail(dot)com |
Subject: | Re: [HACKERS] Restricting maximum keep segments by repslots |
Date: | 2020-03-31 05:10:14 |
Message-ID: | 20200331.141014.2028906248823085365.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you for looking this and trouble rebasing!
At Mon, 30 Mar 2020 20:03:27 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> I rebased this patch; it's failing to apply due to minor concurrent
> changes in PostgresNode.pm. I squashed the patches in a series that
> made the most sense to me.
>
> I have a question about static variable lastFoundOldestSeg in
> FindOldestXLogFileSegNo. It may be set the first time the function
> runs; if it is, the function never again does anything, it just returns
> that value. In other words, the static value is never reset; it never
> advances either. Isn't that strange? I think the coding is to assume
> that XLogCtl->lastRemovedSegNo will always be set, so its code will
> almost never run ... except when the very first wal file has not been
> removed yet. This seems weird and pointless. Maybe we should think
> about this differently -- example: if XLogGetLastRemovedSegno returns
> zero, then the oldest file is the zeroth one. In what cases this is
> wrong? Maybe we should fix those.
That's right, but without the static variable, every call to the
pg_replication_slots view before the fist checkpoint causes scanning
pg_xlog. XLogCtl->lastRemovedSegNo advances only at a checkpoint, so
it is actually right that the return value from
FindOldestXLogFileSegNo doesn't change until the first checkpoint.
Also we could set XLogCtl->lastRemovedSegNo at startup, but the
scanning on pg_xlog is useless in most cases.
I avoided to update the XLogCtl->lastRemovedSegNo directlry, but the
third way would be if XLogGetLastRemovedSegno() returned 0, then set
XLogCtl->lastRemovedSegNo by scanning the WAL directory. The attached
takes this way.
> Regarding the PostgresNode change in 0001, I think adding a special
> parameter for primary_slot_name is limited. I'd like to change the
> definition so that anything that you give as a parameter that's not one
> of the recognized keywords (has_streaming, etc) is tested to see if it's
> a GUC; and if it is, then put it in postgresql.conf. This would have to
> apply both to PostgresNode::init() as well as
> PostgresNode::init_from_backup(), obviously, since it would make no
> sense for the APIs to diverge on this point. So you'd be able to do
> $node->init_from_backup(allow_streaming => 1, work_mem => "4MB");
> without having to add code to init_from_backup to handle work_mem
> specifically. This could be done by having a Perl hash with all the GUC
> names, that we could read lazily from "postmaster --describe-config" the
> first time we see an unrecognized keyword as an option to init() /
> init_from_backup().
Done that way. We could exclude "known" parameters by explicitly
delete the key at reading it, but I choosed to enumerate the known
keywords. Although it can be used widely but actually I changed only
018_repslot_limit.pl to use the feature.
> I edited the doc changes a bit.
>
> I don't know what to think of 0003 yet. Has this been agreed to be a
> good idea?
So it was a separate patch. I think it has not been approved nor
rejected. The main objective of the patch is preventing
pg_replication_slots.wal_status from strange coming back from the
"lost" state to other states. However, in the first place I doubt that
it's right that logical replication sends the content of a WAL segment
already recycled.
> I also made a few small edits to the code; all cosmetic so far:
>
> * added long_desc to the new GUC; it now reads:
>
> {"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
> gettext_noop("Sets the maximum size of WAL space reserved by replication slots."),
> gettext_noop("Replication slots will be marked as failed, and segments released "
> "for deletion or recycling, if this much space is occupied by WAL "
> "on disk."),
>
> * updated the comment to ConvertToXSegs() which is now being used for
> this purpose
>
> * remove outdated comment to GetWalAvailability; it was talking about
> restBytes parameter that no longer exists
Thank you for the fixes. All of the looks fine.
I fixed several typos. (s/requred/required/, s/devinitly/definitely/,
s/errror/error/)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v20-0001-Allow-arbitrary-GUC-parameter-setting-init-and-i.patch | text/x-patch | 2.7 KB |
v20-0002-Add-WAL-relief-vent-for-replication-slots.patch | text/x-patch | 35.5 KB |
v20-0003-Allow-init-and-init_from_backup-to-set-arbitrary.patch | text/x-patch | 2.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2020-03-31 05:13:34 | Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch) |
Previous Message | Masahiko Sawada | 2020-03-31 04:30:19 | Re: Internal key management system |