RE: GSOC 2018 Project - A New Sorting Routine

From: Kefan Yang <starordust(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Peter Geoghegan <pg(at)bowt(dot)ie>, "alvherre(at)2ndquadrant(dot)com" <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: GSOC 2018 Project - A New Sorting Routine
Date: 2018-07-23 22:21:01
Message-ID: 5b5654cc.1c69fb81.5ea44.d698@mx.google.com
Lists: pgsql-hackers

Hi Tomas!

I did a few tests on my own Linux machine, but the problem is that my resources on AWS (CPU, RAM, and even disk space) are very limited. I considered setting up a virtual machine on my own PC, but the performance there is even worse.

My original patch has two main optimizations: (1) switch to heap sort when the depth limit is exceeded, and (2) check whether the array is presorted only once, at the beginning. Now I want to test these optimizations separately. On an AWS EC2 instance, the regressions in the CREATE INDEX cases seem to be less significant if we use (1) only, but I can only test up to 100000 records and 512MB of memory using your scripts.
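
For readers unfamiliar with the two optimizations, the following is a minimal sketch of how they fit together in an introsort-style routine. It is not the code from the attached patches: the function names (intro_sort, intro_loop, and the helpers), the restriction to int arrays, the insertion-sort cutoff of 16, and the depth limit of 2 * floor(log2(n)) are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Restore the max-heap property below index root in a heap of size n. */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) break;
        if (child + 1 < n && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) break;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

/* Plain heap sort: O(n log n) in the worst case, used as the fallback. */
static void heap_sort(int *a, size_t n)
{
    if (n < 2) return;
    for (size_t i = n / 2; i-- > 0;)
        sift_down(a, i, n);
    for (size_t end = n - 1; end > 0; end--) {
        swap_int(&a[0], &a[end]);
        sift_down(a, 0, end);
    }
}

static void insertion_sort(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int v = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > v) { a[j] = a[j - 1]; j--; }
        a[j] = v;
    }
}

static bool is_presorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i]) return false;
    return true;
}

/* Quicksort loop that gives up and heap-sorts once depth_limit is used up. */
static void intro_loop(int *a, size_t n, int depth_limit)
{
    while (n > 16) {                      /* assumed small-array cutoff */
        if (depth_limit == 0) {           /* optimization (1) */
            heap_sort(a, n);
            return;
        }
        depth_limit--;

        /* Median of three: afterwards a[0] <= a[mid] <= a[n-1]. */
        size_t mid = n / 2;
        if (a[mid] < a[0]) swap_int(&a[mid], &a[0]);
        if (a[n - 1] < a[0]) swap_int(&a[n - 1], &a[0]);
        if (a[n - 1] < a[mid]) swap_int(&a[n - 1], &a[mid]);
        swap_int(&a[mid], &a[n - 1]);     /* park the pivot at the end */
        int pivot = a[n - 1];

        /* Lomuto partition: elements below the pivot move to the front. */
        size_t store = 0;
        for (size_t i = 0; i + 1 < n; i++)
            if (a[i] < pivot) swap_int(&a[i], &a[store++]);
        swap_int(&a[store], &a[n - 1]);

        /* Recurse into the smaller side, keep looping on the larger one. */
        size_t left_n = store, right_n = n - store - 1;
        if (left_n < right_n) {
            intro_loop(a, left_n, depth_limit);
            a += store + 1;
            n = right_n;
        } else {
            intro_loop(a + store + 1, right_n, depth_limit);
            n = left_n;
        }
    }
    insertion_sort(a, n);
}

void intro_sort(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    /* Optimization (2): one presorted check up front, none while recursing. */
    if (n < 2 || is_presorted(a, n)) return;

    int depth_limit = 0;                  /* assumed: 2 * floor(log2(n)) */
    for (size_t m = n; m > 1; m >>= 1) depth_limit += 2;
    intro_loop(a, n, depth_limit);
}
```

Calling intro_sort(values, count) sorts an int array in place. Testing (1) alone, as described above, would correspond to dropping the is_presorted call from intro_sort in this sketch while keeping the heap sort fallback.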

So would you mind re-running the tests using the two patches in the attachment? That would be very helpful.

Regards,
Kefan

From: Tomas Vondra
Sent: July 18, 2018 2:26 PM
To: Kefan Yang
Cc: Andrey Borodin; Peter Geoghegan; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

I don't have any script for that - load the files into a spreadsheet,
create pivot tables and you're done.

regards

On 07/18/2018 11:13 PM, Kefan Yang wrote:
> Hey Tomas!
>
> I am trying to reproduce the results on my machine. Could you please
> share the script to generate .ods files?
>
> Regards,
> Kefan
>
> From: Tomas Vondra <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>
> Sent: July 18, 2018 2:05 AM
> To: Andrey Borodin <mailto:x4mmm(at)yandex-team(dot)ru>
> Cc: Peter Geoghegan <mailto:pg(at)bowt(dot)ie>; Kefan Yang
> <mailto:starordust(at)gmail(dot)com>; PostgreSQL Hackers
> <mailto:pgsql-hackers(at)lists(dot)postgresql(dot)org>
> Subject: Re: GSOC 2018 Project - A New Sorting Routine
>
> On 07/18/2018 07:06 AM, Andrey Borodin wrote:
>> Hi, Tomas!
>>
>>> On 15 July 2018, at 1:20, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com
>>> <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>> wrote:
>>>
>>> So I doubt it's this, but I've tweaked the scripts to also set this GUC
>>> and restarted the tests on both machines. Let's see what that does.
>>
>> Do you observe any different results?
>>
>
> It did change the CREATE INDEX results, depending on the scale. The full
> data is available at [1] and [2], attached is a spreadsheet summary from
> the Xeon box.
>
> For the largest scale (1M rows) the regressions for CREATE INDEX queries
> mostly disappeared. For 10k rows it still affects CREATE INDEX with a
> text column, and the 100k case behaves just like before (so significant
> regressions for CREATE INDEX).
>
> I don't have time to investigate this further at the moment, but I'm
> still of the opinion that there's little to gain by replacing our
> current sort algorithm with this.
>
> [1] https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/
> [2] https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment        Content-Type              Size
check_once.diff   application/octet-stream  9.7 KB
use_heap.diff     application/octet-stream  13.2 KB
