Re: The case for removing replacement selection sort

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The case for removing replacement selection sort
Date: 2017-09-29 14:19:54
Message-ID: CA+TgmoYcSPcnN5Fdi4rTwNFEPUMcwRBL_z09iN9aJYwAZY_NuA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 28, 2017 at 6:44 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> I'm glad to hear that. But, I should reiterate that if sorting
> actually gets faster when my patch is applied, then that's something
> that I consider to be a bonus. (This is primarily a refactoring patch,
> to remove a bunch of obsolete code.)

I understand that. The tests above weren't about your patch; they
were about whether replacement selection still has value.

>> Any idea what causes this regression?
>
> I don't know. My best guess is that the overall I/O scheduling is now
> suboptimal due to commit e94568e (Heikki's preloading thing), because
> this is CREATE INDEX, and there is new competitive pressure. You might
> find it hard to replicate the problem with a "SELECT COUNT(DISTINCT
> aid) FROM pgbench_accounts", which would confirm this explanation. Or,
> you could also see what happens with a separate temp tablespace.

I tried out that test case, again on 9.6 and master, again with scale
factor 300. On 9.6, with default settings, it took ~12.5 seconds.
With 4MB of work_mem, it took ~13.2 seconds with replacement selection
and ~12.5 seconds using quicksorted runs. On master, with default
settings, it took ~8.4 seconds. With 4MB of work_mem, it took ~12.3
seconds using replacement selection and ~8.4 seconds using quicksorted
runs.
So here, everything was faster on master, but replacement selection
was only slightly faster while the other technique was a lot faster.
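
To put rough numbers on "slightly" versus "a lot", here is a minimal
sketch that computes the relative runtime reduction from 9.6 to master
for each configuration. The timings are the approximate figures quoted
above; the dictionary keys and the percent_faster helper are names made
up for this illustration, not anything from PostgreSQL or pgbench.

```python
# Approximate timings in seconds, copied from the figures quoted above.
# The key names are illustrative labels only.
timings = {
    "9.6": {
        "default": 12.5,
        "replacement_selection_4MB": 13.2,
        "quicksorted_runs_4MB": 12.5,
    },
    "master": {
        "default": 8.4,
        "replacement_selection_4MB": 12.3,
        "quicksorted_runs_4MB": 8.4,
    },
}


def percent_faster(old, new):
    """Return the percentage reduction in runtime going from old to new."""
    return (old - new) / old * 100.0


# Improvement from 9.6 to master, per configuration, rounded to 0.1%.
improvement = {
    config: round(percent_faster(timings["9.6"][config], timings["master"][config]), 1)
    for config in timings["9.6"]
}

for config, pct in improvement.items():
    print(f"{config}: about {pct}% less time on master")
```

With these inputs, replacement selection improves by roughly 7% between
releases, while the default and quicksorted-runs cases improve by
roughly 33%. Since the inputs are single approximate timings, the
percentages are only indicative.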

That supports your theory that there's some confounding factor in the
CREATE INDEX case, such as I/O scheduling. Since this machine has an
SSD, I guess I don't have a mental model for how that works. We're
not waiting for the platter to rotate...

...but I guess that's all irrelevant as far as this patch goes. The
point of this patch is to simplify things by removing a technique
that is no longer effective, and the evidence we have supports the
contention that it is no longer effective. I'll go commit this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
