Re: Sorting performance vs. MySQL

From:	Yang Zhang <yanghatespam(at)gmail(dot)com>
To:	Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: Sorting performance vs. MySQL
Date:	2010-02-22 19:50:46
Message-ID:	9066fa251002221150q15913begfb39c1373e6525ca@mail.gmail.com
Lists:	pgsql-general

On Mon, Feb 22, 2010 at 2:39 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Mon, Feb 22, 2010 at 12:30 PM, Yang Zhang <yanghatespam(at)gmail(dot)com> wrote:
>> This isn't some microbenchmark. This is part of our actual analytical
>> application. We're running large-scale graph partitioning algorithms.
>
> It's important to see how it runs if you can fit more / most of the
> data set into memory by cranking up work_mem to something really big
> (like a gigabyte or two) and if the query planner can switch to some
> sort of hash algorithm.

We're actually using a very small dataset right now. Being bounded by
memory capacity is not a scalable approach for our application.

>
> Also, can you cluster the table on transactionid ?
>

We can, but that doesn't address the core issue. The sort exists only
to perform a self merge join on transactionid, so clustering would at
best help that first sort. The *very next step* is a GROUP BY
a.tableid, a.tupleid, b.tableid, b.tupleid, which requires another
sort for the group-agg.
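
For reference, the query shape described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table name "accesses", the COUNT(*) aggregate, and the "weight" alias are hypothetical (the thread names only the columns transactionid, tableid, and tupleid), and it runs on an in-memory SQLite database rather than PostgreSQL so that it is self-contained.

```python
import sqlite3

# Hypothetical schema: only the column names come from the thread;
# the table name "accesses" is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accesses (transactionid INTEGER, tableid INTEGER, tupleid INTEGER)"
)
conn.executemany(
    "INSERT INTO accesses VALUES (?, ?, ?)",
    [
        (1, 10, 100),
        (1, 10, 101),
        (2, 10, 100),
        (2, 10, 101),
        (3, 20, 200),
    ],
)

# Step 1: self join on transactionid (pairs of tuples touched by the
# same transaction). Step 2: group on the four tuple-identifying columns.
# COUNT(*) is an assumed aggregate; the original does not say which one is used.
query = """
    SELECT a.tableid, a.tupleid, b.tableid, b.tupleid, COUNT(*) AS weight
    FROM accesses AS a
    JOIN accesses AS b ON a.transactionid = b.transactionid
    GROUP BY a.tableid, a.tupleid, b.tableid, b.tupleid
    ORDER BY a.tableid, a.tupleid, b.tableid, b.tupleid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The join key (transactionid) and the grouping key (the four tableid/tupleid columns) differ, so output ordered for a merge join is not in the order a sort-based group aggregate needs. That is why clustering on transactionid would not remove the second sort.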
--
Yang Zhang
http://www.mit.edu/~y_z/

In response to

Re: Sorting performance vs. MySQL at 2010-02-22 19:39:39 from Scott Marlowe

Responses

Re: Sorting performance vs. MySQL at 2010-02-22 19:54:20 from Scott Marlowe
