From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Robins <robins(at)pobox(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Eliminating PD_ALL_VISIBLE, take 2 |
Date: | 2013-07-15 03:06:04 |
Message-ID: | CA+Tgmoa+-S6+6EdpNTUVpebY4mD+ECN3iaZ=BXd1h5wHuBzn6g@mail.gmail.com |
Lists: | pgsql-hackers |

On Fri, Jul 5, 2013 at 4:18 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2013-07-02 at 10:12 -0700, Jeff Davis wrote:
>> Regardless, this is at least a concrete issue that I can focus on, and I
>> appreciate that. Are scans of small tables the primary objection to this
>> patch, or are there others? If I solve it, will this patch make real
>> progress?
>
> I had an idea here:
>
> What if we keep PD_ALL_VISIBLE, but make it more like other hints, and
> make it optional? If a page is all visible, either or both of the VM bit
> or PD_ALL_VISIBLE could be set (please suspend disbelief for a moment).
>
> Then, we could use a heuristic, like setting PD_ALL_VISIBLE in the first
> 256 pages of a relation; but in later pages, only setting it if the page
> is already dirty for some other reason.
>
> That has the following benefits:
>
> 1. Eliminates the worry over contention related to scans, because we
> wouldn't need to use the VM for small tables. And I don't think anyone
> was worried about using the VM on a large table scan.
>
> 2. Still avoids dirtying lots of pages after a data load. I'm not
> worried if a few MB of data is rewritten on a large table.
>
> 3. Eliminates the complex way in which we (ab)use WAL and the recovery
> mechanism to keep PD_ALL_VISIBLE and the VM bit in sync.
>
> Of course, there's a reason that PD_ALL_VISIBLE is not like a normal
> hint: we need to make sure that inserts/updates/deletes clear the VM
> bit. But my patch already addresses that by keeping the VM page pinned.
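
For illustration only, the heuristic quoted above can be written as a small predicate. This is a sketch in Python rather than PostgreSQL code: the function name and its parameters are hypothetical, and only the 256-page threshold comes from the quoted proposal.

```python
# Threshold taken from the quoted proposal: pages below this block number
# always get the hint when they are all-visible.
SMALL_TABLE_PAGES = 256


def should_set_pd_all_visible(block_number: int,
                              page_is_all_visible: bool,
                              page_already_dirty: bool) -> bool:
    """Hypothetical predicate for the 'optional hint' heuristic.

    Returns True if PD_ALL_VISIBLE would be set on this heap page.
    """
    if not page_is_all_visible:
        # The hint may only ever be set on a page that really is all-visible.
        return False
    if block_number < SMALL_TABLE_PAGES:
        # Small tables (or the head of a large one): always set the hint,
        # so scans of small tables need not consult the VM.
        return True
    # Later pages: set the hint only if doing so does not dirty a clean page.
    return page_already_dirty
```

Under this sketch a freshly loaded large table would have at most its first 256 pages dirtied by the hint, which is the "few MB" mentioned in point 2 of the quoted list.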

I'm of the opinion that we ought to extract the parts of the patch
that hold the VM pin for longer, review those separately, and if
they're good and desirable, apply them. Although that optimization
would become more necessary under your proposal than it is now,
it's really separate from this patch. Given that VM pin caching
can be done with or without removing PD_ALL_VISIBLE, it seems to me
that the fair comparison is between master + VM pin caching and master
+ VM pin caching + remove PD_ALL_VISIBLE. Comparing the latter vs.
unpatched master seems to me to be confusing the issue.

> That has some weaknesses: as Heikki pointed out[1], you can defeat the
> cache of the pin by randomly seeking between 512MB regions during an
> update (would only be noticeable if it's all in shared buffers already,
> of course). But even in that case, it was a fairly modest degradation
> (20%), so overall this seems like a fairly minor drawback.

I am not convinced. I thought about the problem of repeatedly
switching pinned VM pages during the index-only scans work, and
decided that we could live with it because, if the table was large
enough that we were pinning VM pages frequently, we were also avoiding
I/O. Of course, this is a logical fallacy, since the table could
easily be large enough to have quite a few VM pages and yet small
enough to fit in RAM. And, indeed, at least in the early days, an
index scan could beat out an index-only scan by a significant margin
on a memory-resident table, precisely because of the added cost of the
VM lookups. I haven't benchmarked lately so I don't know for sure
whether that's still the case, but I bet it is.
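
To make the 512MB figure concrete, here is a rough sketch of how many heap pages one VM page covers and how often a cached VM pin would have to change for a given access pattern. It assumes the default 8kB block size, a 24-byte page header, and one VM bit per heap page; the function names are hypothetical and are not PostgreSQL APIs.

```python
BLCKSZ = 8192            # assumed default block size in bytes
PAGE_HEADER_BYTES = 24   # assumed page header size on a VM page
BITS_PER_HEAP_PAGE = 1   # assumed: one all-visible bit per heap page


def heap_pages_per_vm_page() -> int:
    """Number of heap pages whose bits fit on one VM page."""
    return (BLCKSZ - PAGE_HEADER_BYTES) * 8 // BITS_PER_HEAP_PAGE


def vm_coverage_mib() -> float:
    """Amount of heap, in MiB, covered by a single VM page."""
    return heap_pages_per_vm_page() * BLCKSZ / 2**20


def count_vm_pin_switches(heap_blocks) -> int:
    """Count how many times a single cached VM pin must be (re)acquired
    while touching heap blocks in the given order, including the first pin."""
    per_vm_page = heap_pages_per_vm_page()
    switches = 0
    pinned = None
    for block in heap_blocks:
        vm_page = block // per_vm_page
        if vm_page != pinned:
            pinned = vm_page
            switches += 1
    return switches
```

With these assumptions one VM page covers 65344 heap pages, or 510.5 MiB of heap. Accesses that stay inside one such region reuse the pin, while accesses that alternate between two regions force a switch on every access, which is the pattern that defeats the cached pin.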

From your other email:

> I have a gut feeling that the complexity we go through to maintain
> PD_ALL_VISIBLE is unnecessary and will cause us problems later. If we
> could combine freezing and marking all-visible, and use WAL for
> PD_ALL_VISIBLE in a normal fashion, then I'd be content with that.

I think this idea is worth exploring, although I fear the overhead is
likely to be rather large. We could find out, though. Suppose we
simply change XLOG_HEAP2_VISIBLE to emit FPIs for the heap pages; how
much does that slow down vacuuming a large table into which many pages
have been bulk loaded? Sadly, I bet it's rather a lot, but I'd like
to be wrong.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company