| From: | Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com> |
| Subject: | Re: Sort optimizations: Making in-memory sort cache-aware |
| Date: | 2023-03-02 18:32:21 |
| Message-ID: | 15cd9c50-eb5a-2fad-1d01-1c93b836f2c0@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Andres,
I took a stab at naive version of this but been stuck for sometime now.
I have added logic to sort on first column at first pass,
realloc all tuples and do full sort at second pass, but I am not seeing
any benefit (it is actually regressing) at all.
Tried doing above both at bulk and at chunks of data.
> You effectively end up with a bounded number of pre-sorted blocks, so
the most
> obvious thing to try is to build a heap of those blocks and
effectively do a
> heapsort between the presorted blocks.
I am not very clear about implementation for this. How can we do
heapsort between
the presorted blocks? Do we keep changing state->bound=i, i+n, i+2n
something like
this and keep calling make_bounded_heap/sort_bounded_heap?
> A related, but separate, improvement is to reduce / remove the memory
> allocation overhead.
This is still pending from my side.
I have attached some benchmarking results with script and POC
patches (which includes GUC switch to enable optimization for ease of
testing) for the same.
Tested on WORK_MEM=3 GB for 1 and 10 Million rows data.
Please let me know things which I can fix and re-attempt.
Thanks,
Ankit
| Attachment | Content-Type | Size |
|---|---|---|
|
|
image/png | 14.1 KB |
|
image/png | 13.7 KB |
|
image/png | 15.0 KB |
|
image/png | 15.6 KB |
| bench_sort_optimization.sh.txt | text/plain | 1.2 KB |
| inmem-sort-opt-batch.patch | text/x-patch | 4.3 KB |
| inmem-sort-opt.patch | text/x-patch | 4.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2023-03-02 18:35:27 | Re: Add SHELL_EXIT_CODE to psql |
| Previous Message | Tomas Vondra | 2023-03-02 18:27:00 | Re: Add LZ4 compression in pg_dump |