Re: POC: converting Lists into arrays

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: POC: converting Lists into arrays
Date: 2019-02-25 18:59:36
Message-ID: 2320.1551121176@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Feb 25, 2019 at 1:17 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'm not following your point here. If we change key data structures
>> (i.e. parsetrees, plan trees, execution trees) to use some other list-ish
>> API, that *in itself* breaks everything that accesses those data
>> structures. The approach I propose here isn't zero-breakage, but it
>> requires far fewer places to be touched than a complete API replacement
>> would do.

> Sure, but if you have third-party code that touches those things,
> it'll fail to compile. With your proposed approach, there seems to be
> a risk that it will compile but not work.

Failing to compile isn't really a benefit IMO. Now, if we could avoid
the *semantic* differences (like whether it's safe to hold onto a pointer
into a List while doing FOO on the list), then we'd have something.
The biggest problem with what I'm proposing is that it doesn't always
manage to do that --- but any other implementation is going to break
such assumptions too. I do not think that forcing cosmetic changes
on people is going to do much to help them revisit possibly-hidden
assumptions like those. What will help is to provide debugging aids to
flush out such assumptions, which I've endeavored to do in this patch.
And I would say that any competing proposal is going to be a failure
unless it provides at-least-as-effective support for flushing out bugs
in naive updates of existing List-using code.
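
To illustrate the kind of hidden assumption at stake, here is a minimal sketch of an array-backed list. It is not the code from the patch: IntVec, intvec_append, intvec_get, and the INTVEC_FORCE_MOVE switch are hypothetical names made up for this example. It shows why a pointer into the list can go stale when the list grows, and how a debug build that relocates the storage on every append could make such bugs show up right away under a memory checker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical array-backed list of ints; NOT PostgreSQL's actual List. */
typedef struct IntVec
{
    int        *elems;          /* contiguous cell storage */
    size_t      length;         /* cells in use */
    size_t      capacity;       /* cells allocated */
} IntVec;

/*
 * Append a value.  Growing the storage may move elems, so any pointer
 * previously taken into the array must be treated as invalid afterwards.
 */
static void
intvec_append(IntVec *v, int value)
{
#ifdef INTVEC_FORCE_MOVE
    /* Hypothetical debugging aid: relocate the storage on every append. */
    size_t      newcap = v->length + 1;
    int        *newelems = malloc(newcap * sizeof(int));

    assert(newelems != NULL);
    if (v->length > 0)
        memcpy(newelems, v->elems, v->length * sizeof(int));
    free(v->elems);
    v->elems = newelems;
    v->capacity = newcap;
#else
    if (v->length == v->capacity)
    {
        size_t      newcap = v->capacity ? v->capacity * 2 : 4;
        int        *newelems = realloc(v->elems, newcap * sizeof(int));

        assert(newelems != NULL);
        v->elems = newelems;
        v->capacity = newcap;
    }
#endif
    v->elems[v->length++] = value;
}

/* Fetch by index; the address is re-derived from the current storage. */
static int
intvec_get(const IntVec *v, size_t idx)
{
    assert(idx < v->length);
    return v->elems[idx];
}

/* Remember a position as an index, which stays valid across growth. */
static int
demo_index_survives_growth(void)
{
    IntVec      v = {NULL, 0, 0};
    size_t      first = 0;
    int         result;
    int         i;

    intvec_append(&v, 42);
    /* Holding "int *p = &v.elems[0]" across the appends below is unsafe. */
    for (i = 1; i < 100; i++)
        intvec_append(&v, i);
    result = intvec_get(&v, first);
    free(v.elems);
    return result;
}
```

Compiling with -DINTVEC_FORCE_MOVE changes no results for code that uses indexes, but code that kept a raw pointer across an append would then read freed memory on every run instead of only when the array happens to grow.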

>> I completely disagree. Your proposal is probably an order of magnitude
>> more painful than the approach I suggest here, while not really offering
>> any additional performance benefit (or if you think there would be some,
>> you haven't explained how). Strictly on cost/benefit grounds, it isn't
>> ever going to happen that way.

> Why would it be ten times more painful, exactly?

Because it involves touching ten times more code (and that's a very
conservative estimate). Excluding changes in pg_list.h + list.c,
what I posted touches approximately 600 lines of code (520 insertions,
644 deletions to be exact). For comparison's sake, there are about
1800 uses of foreach in the tree, each of which would require at least
3 changes to replace (the foreach itself, the ListCell variable
declaration, and at least one lfirst() reference in the loop body).
So we've already blown past 5000 lines worth of changes if we want to
do it another way ... and that's just *one* component of the List API.
Nor is there any reason to think the changes would be any more mechanical
than what I had to do here. (No fair saying that I already found the
trouble spots, either. A different implementation would likely break
assumptions in different ways.)
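
As a rough illustration of the three touch points per loop counted above, here is a sketch of the usual foreach idiom. The ListCell, List, lfirst, and foreach definitions are simplified stand-ins written for this example so that it compiles on its own; they are not the real pg_list.h definitions, and sum_list is a hypothetical function. At three changes per loop, roughly 1800 loops come to about 5400 changed lines, which is where the "past 5000" figure comes from.

```c
#include <stddef.h>

/* Simplified stand-ins for the pg_list.h names, for illustration only. */
typedef struct ListCell
{
    void       *ptr_value;
    struct ListCell *next;
} ListCell;

typedef struct List
{
    ListCell   *head;
} List;

#define lfirst(lc)  ((lc)->ptr_value)
#define foreach(cell, l) \
    for ((cell) = (l) ? (l)->head : NULL; (cell) != NULL; (cell) = (cell)->next)

/* Sum the ints that the list's cells point to. */
static int
sum_list(List *items)
{
    ListCell   *lc;             /* touch point 1: the ListCell declaration */
    int         total = 0;

    foreach(lc, items)          /* touch point 2: the foreach itself */
    {
        int        *value = (int *) lfirst(lc); /* touch point 3: lfirst() */

        total += *value;
    }
    return total;
}
```

Under a wholesale API replacement, each of the three marked lines would need to be rewritten in every such loop, whereas keeping the existing names leaves loops like this one untouched.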

If I said your proposal involved two orders of magnitude more work,
I might not be far off the mark.

regards, tom lane
