| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Greg Stark <stark(at)mit(dot)edu>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Memory usage during sorting |
| Date: | 2012-03-20 16:33:33 |
| Message-ID: | 27028.1332261213@sss.pgh.pa.us |
| Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Mar 20, 2012 at 7:44 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> Offhand I wonder if this is all because we don't have the O(n) heapify
>> implemented.
> I'm pretty sure that's not the problem. Even though our heapify is
> not as efficient as it could be, it's plenty fast enough. I thought
> about writing a patch to implement the better algorithm, but it seems
> like a distraction at this point because the heapify step is such a
> small contributor to overall sort time. What's taking all the time is
> the repeated siftup operations as we pop things out of the heap.
Right, but wouldn't getting rid of the run-number comparisons provide
some marginal improvement in the speed of tuplesort_heap_siftup?
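As a rough illustration of the cost in question, here is a minimal sketch of a binary min-heap whose comparator checks a run number before the sort key. It is a simplified Python model, not PostgreSQL's tuplesort code: the names `heap_less`, `sift_down`, `build_heap`, `pop_min`, and the `use_run_numbers` flag are all hypothetical, and entries are assumed to be `(run, key)` pairs. Turning the flag off removes one integer test from every comparison made while sifting.

```python
def heap_less(a, b, use_run_numbers=True):
    """Return True if entry a sorts before entry b.

    Entries are (run, key) pairs. With use_run_numbers set, the run
    number is compared first and the key only breaks ties; without it,
    only the key is compared.
    """
    if use_run_numbers and a[0] != b[0]:
        return a[0] < b[0]
    return a[1] < b[1]


def sift_down(heap, i, use_run_numbers=True):
    """Restore the min-heap property below position i of a binary heap."""
    n = len(heap)
    item = heap[i]
    while True:
        child = 2 * i + 1
        if child >= n:
            break
        # Pick the smaller of the two children.
        if child + 1 < n and heap_less(heap[child + 1], heap[child],
                                       use_run_numbers):
            child += 1
        if not heap_less(heap[child], item, use_run_numbers):
            break
        heap[i] = heap[child]
        i = child
    heap[i] = item


def build_heap(items, use_run_numbers=True):
    """Build a heap bottom-up from an iterable of (run, key) pairs."""
    heap = list(items)
    for i in range(len(heap) // 2 - 1, -1, -1):
        sift_down(heap, i, use_run_numbers)
    return heap


def pop_min(heap, use_run_numbers=True):
    """Remove and return the smallest entry, then re-sift the heap."""
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down(heap, 0, use_run_numbers)
    return top
```

Each level of the sift costs up to two calls to `heap_less`, so any saving from dropping the run-number test scales with the number of pops times the heap depth. Whether that is measurable next to the cost of the key comparison itself is exactly the open question.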
BTW, there's a link at the bottom of the Wikipedia page to a very
interesting ACM Queue article, which argues that the binary-tree
data structure isn't terribly well suited to virtual memory because
it touches random locations in succession. I'm not sure I believe
his particular solution, but I'm wondering about B+ trees, i.e. more
than 2 children per node.
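To make the "more than 2 children per node" idea concrete, here is a minimal sketch of a d-ary heap stored in a flat array. Note that this is a d-ary heap, which is only one way to get a wider fan-out; it is not a B+ tree and not the layout proposed in the ACM Queue article. The function names and the default fan-out `d=4` are assumptions chosen for illustration.

```python
def sift_down_dary(heap, i, d=4):
    """Restore the min-heap property below position i of a d-ary heap.

    The children of node i live at positions d*i + 1 .. d*i + d, so one
    level of the sift touches a single contiguous group of d slots.
    """
    n = len(heap)
    item = heap[i]
    while True:
        first = d * i + 1
        if first >= n:
            break
        last = min(first + d, n)
        # Find the smallest child within the contiguous group.
        smallest = min(range(first, last), key=heap.__getitem__)
        if heap[smallest] >= item:
            break
        heap[i] = heap[smallest]
        i = smallest
    heap[i] = item


def heapify_dary(items, d=4):
    """Build a d-ary min-heap bottom-up from an iterable."""
    heap = list(items)
    # The parent of the last element (index n - 1) is (n - 2) // d.
    for i in range((len(heap) - 2) // d, -1, -1):
        sift_down_dary(heap, i, d)
    return heap


def pop_min_dary(heap, d=4):
    """Remove and return the smallest element of a d-ary heap."""
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down_dary(heap, 0, d)
    return top
```

The trade-off is that a larger `d` makes the tree shallower, so a pop visits fewer scattered locations, but each level needs up to `d - 1` comparisons to find the smallest child. Whether that wins in practice would have to be measured.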
regards, tom lane