From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: rbtree code breaks GIN's adherence to maintenance_work_mem |
Date: | 2010-07-31 16:12:30 |
Message-ID: | AANLkTi=CgdycmAKPP13wbpXNbDBVA8nyEPJEEa7YLov_@mail.gmail.com |
Lists: | pgsql-hackers |

On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sat, Jul 31, 2010 at 12:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> So, I would like somebody to show cause why that whole module shouldn't
>>> be ripped out and the code reverted to where it was in 8.4. My
>>> recollection is that the argument for adding it was to speed things up
>>> in corner cases, but what I think it's actually going to do is slow
>>> things down in every case.
>
>> I've always been a bit suspicious of this code, too, even though I
>> didn't think about the memory consumption issue. But see here:
>> http://archives.postgresql.org/pgsql-hackers/2010-02/msg00307.php
>
> I did a bit of experimentation and confirmed my fears: HEAD is willing
> to eat about double the specified maintenance_work_mem. If you cut
> back the setting so that its actual memory use is no more than 8.4's,
> it's about 33% slower on non-pathological data (I'm testing the dataset
> from Artur Dabrowski here).

That seems like a pretty serious regression.

> I'm tempted to suggest that making RBNode be a hidden struct containing
> a pointer to somebody else's datum is fundamentally the wrong way to
> go about things, because the extra void pointer is pure overhead,
> and we aren't ever going to be using these things in a context where
> memory usage isn't of concern. If we refactored the API so that RBNode
> was intended to be the first field of some larger struct, as is done in
> dynahash tables for instance, we could eliminate the void pointer and
> the palloc inefficiency. The added storage compared to what 8.4 used
> would be a parent link and the iteratorState/color fields, which would
> end up costing us 16 more bytes per EntryAccumulator rather than 64.
> Still not great but at least it's not a 2X penalty, and the memory
> allocation would become the caller's problem not rbtree's, so the
> problem of tracking usage would be no different from before.

Even if we do that, is it still going to be too much of a performance
regression overall?
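
For readers following along, the layout Tom describes above (the tree's
node header as the first field of a larger caller-owned struct) can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the actual rbtree or GIN code: RBNodeSketch,
OpaqueNodeSketch, EntrySketch, entry_create, entry_from_node and
allocated_bytes are all hypothetical names, and malloc stands in for
palloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tree's per-node header (links + state). */
typedef struct RBNodeSketch
{
    struct RBNodeSketch *left;
    struct RBNodeSketch *right;
    struct RBNodeSketch *parent;
    char        color;
    char        iteratorState;
} RBNodeSketch;

/*
 * Layout A (what the quoted text objects to): the node is opaque and
 * carries a void pointer to a separately allocated caller datum, so each
 * entry costs two allocations plus the pointer itself.
 */
typedef struct OpaqueNodeSketch
{
    RBNodeSketch links;
    void       *data;
} OpaqueNodeSketch;

/*
 * Layout B (the proposed refactoring): the caller's struct embeds the
 * node header as its FIRST field, so one allocation holds both and no
 * void pointer is needed. The payload fields here are illustrative.
 */
typedef struct EntrySketch
{
    RBNodeSketch rbnode;        /* must be first */
    int         key;
    unsigned    count;
} EntrySketch;

/* The caller owns allocation, so it can account for memory itself. */
static size_t allocated_bytes = 0;

static EntrySketch *
entry_create(int key)
{
    EntrySketch *e = malloc(sizeof(EntrySketch));

    if (e == NULL)
        return NULL;
    e->rbnode.left = e->rbnode.right = e->rbnode.parent = NULL;
    e->rbnode.color = 0;
    e->rbnode.iteratorState = 0;
    e->key = key;
    e->count = 0;
    allocated_bytes += sizeof(EntrySketch);
    return e;
}

/*
 * Tree code would only ever see RBNodeSketch pointers; because the header
 * sits at offset 0, the caller can cast back to its own struct.
 */
static EntrySketch *
entry_from_node(RBNodeSketch *node)
{
    assert(node != NULL);
    return (EntrySketch *) node;
}
```

The point of the sketch is only that with layout B the allocation, and
therefore the bookkeeping against a memory limit, sits on the caller's
side. It says nothing about whether the remaining per-entry overhead is
acceptable, which is the question above.
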
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company