From: | Greg Stark <stark(at)mit(dot)edu> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
Subject: | Re: Using quicksort for every external sort run |
Date: | 2016-01-30 07:27:21 |
Message-ID: | CAM-w4HORqMqS1R3XSW_7gS6=9RXe=xovj5SWwXnPXJR2Mt7niQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 29 Jan 2016 11:58 pm, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
> It
> seems pretty easy to construct cases where this technique regresses,
> and a large percentage of those cases are precisely those where
> replacement selection would have produced a single run, avoiding the
> merge step altogether.
Now that avoiding the merge phase altogether didn't necessarily represent
any actual advantage.
We don't find out we've avoided the merge phase until the entire run has
been spiked to disk. Then we need to read it back in from disk to serve up
those tuples.
If we have tapes to merge but can do then in a single pass we do that
lazily and merge as needed when we serve up the tuples. I doubt there's any
speed difference in reading two sequential streams with our buffering over
one especially in the midst of a quiet doing other i/o. And N extra
comparisons is less than the quicksort advantage.
If we could somehow predict that it'll be a single output run that would be
a huge advantage. But having to spill all the tuples and then find out
isn't really helpful.
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2016-01-30 07:29:14 | Re: Using quicksort for every external sort run |
Previous Message | Peter Geoghegan | 2016-01-30 07:25:38 | Re: Using quicksort for every external sort run |