From: | "Luke Lonergan" <llonergan(at)greenplum(dot)com> |
---|---|
To: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Merge algorithms for large numbers of "tapes" |
Date: | 2006-03-08 18:49:16 |
Message-ID: | C034672C.1EC73%llonergan@greenplum.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Jim,
On 3/8/06 9:49 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:
> On Wed, Mar 08, 2006 at 11:20:50AM -0500, Tom Lane wrote:
>> Not sure that follows. In particular, the entire point of the recent
>> changes has been to extend the range in which we can use a single merge
>> pass --- that is, write the data once as N sorted runs, then merge them
>> in a single read pass. As soon as you have to do an actual merge-back-
>> to-disk pass, your total I/O volume doubles, so there is definitely a
>> considerable gain if that can be avoided. And a larger work_mem
>> translates directly to fewer/longer sorted runs.
>
> But do fewer/longer sorted runs translate into not merging back to disk?
> I thought that was controlled by if we had to be able to rewind the
> result set.
In the *tape* algorithm, there is an intermediate abstraction in the merging
called tapes (!) that are used to store intermediate merge results. Simon's
work implemented more tapes, which asymptotically approaches a single merge
pass as the number of tapes approaches the number of runs.
The Replacement Selection algorithm generally will produce about 1/2 the
number of runs that a simpler partial sort algorithm would, and the more
memory it uses, the fewer runs there are, and with fewer runs, fewer tapes
are required to avoid more passes on the merge.
This whole tape abstraction is something that I believe is unique to
Postgres among modern databases, and we have found that by removing it
entirely along with logtape.c, we remove 2000 lines of useless code that
only complicates our optimization problem.
- Luke
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-03-08 20:32:09 | Re: Merge algorithms for large numbers of "tapes" |
Previous Message | Joachim Wieland | 2006-03-08 18:03:56 | Re: Status of TODO: postgresql.conf: reset to default when |