Re: [SQL] Questions about vacuum analyze

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: "Steven M(dot) Wheeler" <swheeler(at)sabre(dot)com>, pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] Questions about vacuum analyze
Date: 1999-10-11 14:32:42
Message-ID: 7659.939652362@sss.pgh.pa.us
Lists: pgsql-sql

Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> writes:
> I didn't know sorting algorithms for tape and disk had different
> optimizations. I figured the paging in of disk blocks had a similar
> penalty to tape rewinding. None of us really knows a lot about the best
> algorithm for that job. Nice you recognized it.

I didn't know much about it either until the comments in psort.c led me
to read the relevant parts of Knuth. Tape rewind and disk seek times
are not remotely comparable, especially when you're talking about N tape
units versus one disk unit --- using more files (tapes) to minimize
rewind time can be a big win on tapes, but put those same files on a
disk and you're just spreading the seek time around differently. More
to the point, though, tape-oriented algorithms tend to assume that tape
space is free. Polyphase merge doesn't care in the least that it has
several copies of any given record lying about on different tapes, only
one of which is of interest anymore.

I know how to get the sort space requirement down to 2x the actual data
volume (just use a balanced merge instead of the "smarter" polyphase)
but I am thinking about ways to get it down to data volume + a few
percent with an extra layer of bookkeeping. The trick is to release
and recycle space as soon as we have read in the tuples stored in it...

regards, tom lane
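
To illustrate the space argument above, here is a minimal sketch of the
bookkeeping, not the psort.c code and not necessarily the scheme the
message has in mind. It assumes a single k-way merge of already-sorted
runs stored in fixed-size blocks; merge_runs, block_size and recycle are
hypothetical names introduced only for this example. With recycle=False
every input block stays allocated until the merge finishes, so the peak
is about 2x the data volume. With recycle=True an input block is freed
as soon as its last tuple has been consumed, so the peak stays close to
the data volume plus a few blocks.

```python
import heapq


def merge_runs(runs, block_size, recycle):
    """Merge sorted runs; return (merged_list, peak_blocks_in_use).

    Toy accounting model: only block counts are tracked, no real I/O.
    """
    def blocks_for(n):
        return -(-n // block_size)  # ceiling division

    # All input runs occupy disk blocks before the merge starts.
    in_use = sum(blocks_for(len(r)) for r in runs)
    peak = in_use

    output = []
    consumed = [0] * len(runs)  # tuples taken so far from each run
    heap = [(r[0], i) for i, r in enumerate(runs) if r]
    heapq.heapify(heap)

    while heap:
        value, i = heapq.heappop(heap)

        # Starting a new output block allocates one more block.
        if len(output) % block_size == 0:
            in_use += 1
            peak = max(peak, in_use)
        output.append(value)

        consumed[i] += 1
        pos = consumed[i]
        block_done = pos % block_size == 0 or pos == len(runs[i])
        if recycle and block_done:
            in_use -= 1  # the input block is fully read: release it

        if pos < len(runs[i]):
            heapq.heappush(heap, (runs[i][pos], i))

    return output, peak


if __name__ == "__main__":
    # Four runs of 100 tuples each, 10 tuples per block: 40 input blocks.
    runs = [list(range(i, 400, 4)) for i in range(4)]
    print(merge_runs(runs, 10, recycle=False)[1])
    print(merge_runs(runs, 10, recycle=True)[1])
```

In this model the non-recycling merge peaks at input blocks plus output
blocks, while the recycling variant needs only a handful of blocks beyond
the input, roughly one partly read block per run plus the output block
being filled.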
