Re: [SQL] Questions about vacuum analyze

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Steven M(dot) Wheeler" <swheeler(at)sabre(dot)com>, pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] Questions about vacuum analyze
Date: 1999-10-11 03:12:40
Message-ID: 199910110312.XAA06290@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> A further thought: 40M rows could well be more than 4gig of data.
> Although Postgres supports tables that large (by segmenting them into
> 1gig-apiece files to avoid OS limitations on file size), we currently
> do not cope with temporary sort files that exceed 4 gig, which means
> a "select distinct" on that much data will fail regardless of whether
> you have enough free disk space :-(. We have a TODO item to fix this.
>
> After looking at psort.c, I see a bunch of other shortcomings, one being
> that it's using a polyphase merge algorithm --- which was hot stuff back
> when people did this sort of thing on magnetic tape drives, but it's
> less than appropriate for sorting on disk. The worst problem with it is
> that the space consumption is about 4x the actual data volume, which is
> not too cool when you're talking about a huge file to begin with...
>
> Meanwhile, there is a completely separate implementation of external
> sorting over in nbtsort.c (where it's only used for CREATE INDEX,
> AFAICS), with a somewhat different set of deficiencies.
>
> I think I'll add "rewrite sort code from the ground up" to my to-do
> list ;-)

I have on TODO:

Make index creation use psort code, because it is now faster(Vadim)

I didn't know sorting algorithms for tape and disk had different
optimizations. I figured the paging in of disk blocks had a similar
penalty to tape rewinding. None of us really knows a lot about the best
algorithm for that job. Nice you recognized it.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Martin Dolog 1999-10-11 07:34:11 unsubscribe
Previous Message Tom Lane 1999-10-11 01:50:34 Re: [SQL] Questions about vacuum analyze