From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-09-26 18:36:45
Message-ID: 20190926183645.6getukzf4dxxclp2@development
Lists: pgsql-hackers
On Thu, Sep 26, 2019 at 06:58:17PM +0530, Amit Kapila wrote:
>On Tue, Sep 3, 2019 at 4:16 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> On Mon, Sep 02, 2019 at 06:06:50PM -0400, Alvaro Herrera wrote:
>> >In the interest of moving things forward, how far are we from making
>> >0001 committable? If I understand correctly, the rest of this patchset
>> >depends on https://commitfest.postgresql.org/24/944/ which seems to be
>> >moving at a glacial pace (or, actually, slower, because glaciers do
>> >move, which cannot be said of that other patch.)
>> >
>>
>> I think 0001 is mostly there. I think there's one bug in this patch
>> version, but I need to check and I'll post an updated version shortly if
>> needed.
>>
>
>Did you get a chance to work on 0001? I have a few comments on that patch:
>1.
>+ * To limit the amount of memory used by decoded changes, we track memory
>+ * used at the reorder buffer level (i.e. total amount of memory), and for
>+ * each toplevel transaction. When the total amount of used memory exceeds
>+ * the limit, the toplevel transaction consuming the most memory is either
>+ * serialized or streamed.
>
>Do we need to mention 'streamed' as part of this patch? It seems to
>me that this is an independent patch which can be committed without
>patches that stream the changes. So, we can remove it from here and
>other places where it is used.
>
You're right - this patch should not mention streaming, because the
parts adding that capability come later in the series. On its own it
can only trigger serialization to disk.
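For context, the accounting itself is simple - roughly like this (an
illustrative sketch, not the exact patch code; the toptxn link and the
size fields are the patch's additions):

/*
 * Sketch of the accounting: adjust the memory counters for the
 * toplevel transaction and for the reorder buffer as a whole,
 * whenever a change is added or removed.
 */
static void
ReorderBufferChangeMemoryUpdate(ReorderBuffer *rb, ReorderBufferTXN *txn,
                                Size sz, bool addition)
{
    /* memory of subxact changes is accounted to the toplevel xact */
    ReorderBufferTXN *toptxn = (txn->toptxn != NULL) ? txn->toptxn : txn;

    if (addition)
    {
        toptxn->size += sz;
        rb->size += sz;
    }
    else
    {
        Assert(toptxn->size >= sz && rb->size >= sz);
        toptxn->size -= sz;
        rb->size -= sz;
    }
}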
>2.
>+ * deserializing and applying very few changes). We probably to give more
>+ * memory to the oldest subtransactions.
>
>/We probably to/
>It seems some word is missing after probably.
>
Yes.
>3.
>+ * Find the largest transaction (toplevel or subxact) to evict (spill to disk).
>+ *
>+ * XXX With many subtransactions this might be quite slow, because we'll have
>+ * to walk through all of them. There are some options how we could improve
>+ * that: (a) maintain some secondary structure with transactions sorted by
>+ * amount of changes, (b) not looking for the entirely largest transaction,
>+ * but e.g. for transaction using at least some fraction of the memory limit,
>+ * and (c) evicting multiple transactions at once, e.g. to free a given portion
>+ * of the memory limit (e.g. 50%).
>+ */
>+static ReorderBufferTXN *
>+ReorderBufferLargestTXN(ReorderBuffer *rb)
>
>What is the guarantee that after evicting the largest transaction, we
>won't immediately hit the memory limit again? Say all of the
>transactions are of roughly similar size, which I don't think is an
>uncommon case.
Not sure I understand - what do you mean 'immediately hit'?
We do check the limit after queueing a change, and we know that this
change is what got us over the limit. We pick the largest transaction
(which has to be at least as large as the change we just queued) and
evict it, getting below the memory limit again.
The next change can get us over the memory limit again, of course, but
there's not much we can do about that.
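In pseudo-code, the logic is roughly this (a simplified sketch, assuming
logical_work_mem is in kB and that serializing a transaction releases
all of its accounted memory):

/*
 * Simplified sketch of the limit check, called after queueing a change.
 * While we're over the limit, pick the largest transaction and spill
 * its changes to disk, which releases the accounted memory.
 */
static void
ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
{
    while (rb->size >= logical_work_mem * 1024L)
    {
        ReorderBufferTXN *txn = ReorderBufferLargestTXN(rb);

        ReorderBufferSerializeTXN(rb, txn);

        /* serializing the changes must free the memory they used */
        Assert(txn->size == 0);
    }
}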
> Instead, the strategy mentioned in point (c) or something like
>that seems more promising. That strategy carries some risk of many
>smaller disk writes, which we might want to control via some threshold
>(like not flushing more than N xacts). We would also need to ensure
>that the total memory freed is greater than the current change.
>
>I think we had some discussion around this point but didn't reach any
>conclusion, which means some more brainstorming is required.
>
I agree it's worth investigating, but I'm not sure it's necessary before
committing v1 of the feature. I don't think there's a clear winning
strategy, and I think the current approach works fairly well.
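If we did want to experiment with (c), I'd imagine something like this
(a hypothetical sketch - the function, the fraction parameter and the
xact cap are made up for illustration, not part of the patch):

/*
 * Hypothetical variant of eviction: spill transactions, largest first,
 * until a given fraction of the memory limit is freed, capped at
 * max_xacts to bound the number of separate disk writes.
 */
static void
ReorderBufferEvictFraction(ReorderBuffer *rb, double fraction, int max_xacts)
{
    Size    target = (Size) (logical_work_mem * 1024L * fraction);
    Size    freed = 0;
    int     nevicted = 0;

    while (freed < target && nevicted < max_xacts)
    {
        ReorderBufferTXN *txn = ReorderBufferLargestTXN(rb);

        freed += txn->size;
        ReorderBufferSerializeTXN(rb, txn);
        nevicted++;
    }
}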
The comment is concerned with the cost of ReorderBufferLargestTXN with
many transactions, but we can only have a certain number of top-level
transactions (max_connections, plus a certain number of not-yet-assigned
subtransactions). And the 0002 patch essentially gets rid of the subxacts
entirely, further reducing the maximum number of xacts to walk.
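For reference, the function is just a linear scan over the toplevel
transactions, roughly (sketch only):

/*
 * Sketch of the lookup: walk the toplevel transactions and return the
 * one using the most memory - O(number of toplevel transactions).
 */
static ReorderBufferTXN *
ReorderBufferLargestTXN(ReorderBuffer *rb)
{
    ReorderBufferTXN *largest = NULL;
    dlist_iter  iter;

    dlist_foreach(iter, &rb->toplevel_by_lsn)
    {
        ReorderBufferTXN *txn;

        txn = dlist_container(ReorderBufferTXN, node, iter.cur);

        if (largest == NULL || txn->size > largest->size)
            largest = txn;
    }

    Assert(largest != NULL);
    return largest;
}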
>4.
>+int logical_work_mem; /* 4MB */
>
>What does the 4MB in the comment indicate?
>
Sorry, that's a mistake.
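The default really belongs in the guc.c entry anyway, not in a comment
on the variable - something like this (the entry below is illustrative;
the description and the default/min values shown are not from the patch):

/* illustrative guc.c entry - description and limits are placeholders */
{
    {"logical_work_mem", PGC_USERSET, RESOURCES_MEM,
        gettext_noop("Sets the maximum memory used for logical decoding before spilling to disk."),
        NULL,
        GUC_UNIT_KB
    },
    &logical_work_mem,
    65536, 64, MAX_KILOBYTES,
    NULL, NULL, NULL
},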
>5.
>+/*
>+ * Check whether the logical_work_mem limit was reached, and if yes pick
>+ * the transaction tx should spill its data to disk.
>
>The second part of the sentence "pick the transaction tx should spill"
>seems to be incomplete.
>
Yeah, that's poor wording. Will fix.
>Apart from this, I see that Peter E. has raised some other points on
>this patch which are not yet addressed as those also need some
>discussion, so I will respond to those separately with my opinion.
>
OK, thanks.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services