From: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Merge algorithms for large numbers of "tapes" |
Date: | 2006-03-08 23:35:53 |
Message-ID: | D425483C2C5C9F49B5B7A41F8944154757D5EB@postal.corporate.connx.com |
Lists: pgsql-hackers
> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Wednesday, March 08, 2006 3:17 PM
> To: Dann Corbit
> Cc: Jim C. Nasby; Luke Lonergan; Simon Riggs; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Merge algorithms for large numbers of "tapes"
>
> "Dann Corbit" <DCorbit(at)connx(dot)com> writes:
> > Here are some suggestions of things that I know work really, really
> > well:
> > #1. Two pass merge (none of that silly poly-tape merge goo)
>
> This amounts to an assumption that you have infinite work_mem, in which
> case you hardly need an external sort at all. If your work_mem is in
> fact finite, then at some point you need more than two passes. I'm not
> really interested in ripping out support for sort operations that are
> much larger than work_mem.
No, it does not. I have explained this before. You can have one million
files and merge them all into the final output in a single pass; it does
not matter how big they are or how much memory you have.

The idea is very simple. Each subfile has its top record inserted into
a priority queue of file handles (or whatever else you want to use --
temp tables, you name it). When you extract the smallest record from the
queue, that file's next record becomes its new priority, and the file
handle is reinserted at the appropriate place in the queue. You keep
pulling records from the queue until the entire queue is empty.
The outline is like this:
1. Sort chunks
2. Write chunks
3. Insert top record of chunks into priority queue
4. Extract records from queue, writing them to final output
5. Repeat step 4 until the queue is empty (a short sketch in code follows below).
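
To make the mechanics concrete, here is a minimal sketch of steps 3
through 5 in C++. It is illustrative only, not PostgreSQL code: the
run-file format (one long integer key per line) and the names RunHead
and merge_runs are placeholders I made up for this example.

/* Single-pass k-way merge of sorted runs via a priority queue of
 * run handles. */
#include <cstdio>
#include <queue>
#include <string>
#include <vector>

struct RunHead {
    long  key;   /* current (smallest remaining) key of this run */
    FILE *fp;    /* handle of the sorted run file */
};

/* Min-heap ordering: the run whose current key is smallest is on top. */
struct HeadGreater {
    bool operator()(const RunHead &a, const RunHead &b) const {
        return a.key > b.key;
    }
};

void merge_runs(const std::vector<std::string> &run_files, FILE *out)
{
    std::priority_queue<RunHead, std::vector<RunHead>, HeadGreater> pq;

    /* Step 3: insert the top record of every run into the queue. */
    for (const std::string &name : run_files) {
        FILE *fp = std::fopen(name.c_str(), "r");
        long  key;
        if (fp && std::fscanf(fp, "%ld", &key) == 1)
            pq.push(RunHead{key, fp});
        else if (fp)
            std::fclose(fp);
    }

    /* Steps 4-5: repeatedly extract the smallest record, write it to the
     * final output, and requeue that run with its next record (if any). */
    while (!pq.empty()) {
        RunHead h = pq.top();
        pq.pop();
        std::fprintf(out, "%ld\n", h.key);
        if (std::fscanf(h.fp, "%ld", &h.key) == 1)
            pq.push(h);            /* run not exhausted: new priority */
        else
            std::fclose(h.fp);     /* run exhausted */
    }
}

The point is that only one record per run is in memory at any moment,
so the merge pass itself needs very little working memory beyond the
queue of handles.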
> > #2. Load ONLY the keys that are to be sorted into memory. Use a
> > pointer exchange sort, and do not move the physical rows of data at all.
>
> This suggestion isn't a whole lot better; in general the rows to be
> sorted don't exist until we compute them, and so proposing that we
> "don't load them until later" is pretty much irrelevant. Also, in
> a lot of common cases the keys to be sorted are the bulk of the data
> anyway.
This suggestion is in addition to suggestion 1; they are not really
related, except that both make the sort run a lot faster. Perhaps I did
not explain it clearly enough. Suppose that you have a set of rows you
need to sort. Instead of loading the whole rows into memory, load only
the columns (or parts of columns) being sorted, together with a pointer
back to each physical row, and sort those. I hope that it is clearer now.
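
For illustration, a minimal sketch of that idea in C++ (again not
PostgreSQL code; SortEntry and sort_keys_only are made-up names, and
row_id stands in for whatever locator points back at the physical row):

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <vector>

struct SortEntry {
    long          key;     /* the column value being sorted on */
    std::uint64_t row_id;  /* locator of the full row (e.g. a file offset) */
};

/* Sort only the lightweight (key, locator) entries; the physical rows
 * never move. Afterwards the rows can be fetched in entry order. */
void sort_keys_only(std::vector<SortEntry> &entries)
{
    std::sort(entries.begin(), entries.end(),
              [](const SortEntry &a, const SortEntry &b) {
                  return a.key < b.key;
              });
}

int main()
{
    /* Three "rows" identified only by their locators; we sort by key. */
    std::vector<SortEntry> entries = {{42, 0}, {7, 1}, {19, 2}};
    sort_keys_only(entries);
    for (const SortEntry &e : entries)
        std::printf("key=%ld row_id=%llu\n",
                    e.key, (unsigned long long) e.row_id);
    return 0;
}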
>
> regards, tom lane