From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Parallel Sort |
Date: | 2013-05-15 18:32:29 |
Message-ID: | CAM3SWZTz3FtcNT=_gOOhX_5Qt_QRvG-5oma6x1GsuR6VC1AWzQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, May 13, 2013 at 7:28 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> We should decide whether to actually sort in parallel based on the comparator
> cost and the data size. The system currently has no information on comparator
> cost: bt*cmp (and indeed almost all built-in functions) all have procost=1,
> but bttextcmp is at least 1000x slower than btint4cmp.
I think that this effort could justify itself independently of any
attempt to introduce parallelism to in-memory sorting. I abandoned a
patch to introduce timsort to Postgres, because I knew that there was
no principled way to reap the benefits. Unless you introduce
parallelism, it's probably going to be virtually impossible to come up
with an alogorithm that does in-memory sorting faster (in terms of the
amount of system time taken) than a highly optimized quicksort when
sorting integers. But sorting types with really expensive comparators
(even considerably more expensive than bttextcmp) for
pass-by-reference Datums (where the memory locality advantage of
quicksort doesn't really help so much) makes timsort much more
compelling. That's why it's used for Python lists.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Cédric Villemain | 2013-05-15 18:32:37 | Re: PostgreSQL 9.3 beta breaks some extensions "make install" |
Previous Message | Kevin Grittner | 2013-05-15 18:18:35 | Re: counting algorithm for incremental matview maintenance |