From: | Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Sort optimizations: Making in-memory sort cache-aware |
Date: | 2023-03-02 18:32:21 |
Message-ID: | 15cd9c50-eb5a-2fad-1d01-1c93b836f2c0@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Andres,
I took a stab at naive version of this but been stuck for sometime now.
I have added logic to sort on first column at first pass,
realloc all tuples and do full sort at second pass, but I am not seeing
any benefit (it is actually regressing) at all.
Tried doing above both at bulk and at chunks of data.
> You effectively end up with a bounded number of pre-sorted blocks, so
the most
> obvious thing to try is to build a heap of those blocks and
effectively do a
> heapsort between the presorted blocks.
I am not very clear about implementation for this. How can we do
heapsort between
the presorted blocks? Do we keep changing state->bound=i, i+n, i+2n
something like
this and keep calling make_bounded_heap/sort_bounded_heap?
> A related, but separate, improvement is to reduce / remove the memory
> allocation overhead.
This is still pending from my side.
I have attached some benchmarking results with script and POC
patches (which includes GUC switch to enable optimization for ease of
testing) for the same.
Tested on WORK_MEM=3 GB for 1 and 10 Million rows data.
Please let me know things which I can fix and re-attempt.
Thanks,
Ankit
Attachment | Content-Type | Size |
---|---|---|
image/png | 14.1 KB | |
image/png | 13.7 KB | |
image/png | 15.0 KB | |
image/png | 15.6 KB | |
bench_sort_optimization.sh.txt | text/plain | 1.2 KB |
inmem-sort-opt-batch.patch | text/x-patch | 4.3 KB |
inmem-sort-opt.patch | text/x-patch | 4.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2023-03-02 18:35:27 | Re: Add SHELL_EXIT_CODE to psql |
Previous Message | Tomas Vondra | 2023-03-02 18:27:00 | Re: Add LZ4 compression in pg_dump |