From: | Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Steven M(dot) Wheeler" <swheeler(at)sabre(dot)com>, pgsql-sql(at)hub(dot)org |
Subject: | Re: [SQL] Questions about vacuum analyze |
Date: | 1999-10-11 03:12:40 |
Message-ID: | 199910110312.XAA06290@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
> A further thought: 40M rows could well be more than 4gig of data.
> Although Postgres supports tables that large (by segmenting them into
> 1gig-apiece files to avoid OS limitations on file size), we currently
> do not cope with temporary sort files that exceed 4 gig, which means
> a "select distinct" on that much data will fail regardless of whether
> you have enough free disk space :-(. We have a TODO item to fix this.
>
> After looking at psort.c, I see a bunch of other shortcomings, one being
> that it's using a polyphase merge algorithm --- which was hot stuff back
> when people did this sort of thing on magnetic tape drives, but it's
> less than appropriate for sorting on disk. The worst problem with it is
> that the space consumption is about 4x the actual data volume, which is
> not too cool when you're talking about a huge file to begin with...
>
> Meanwhile, there is a completely separate implementation of external
> sorting over in nbtsort.c (where it's only used for CREATE INDEX,
> AFAICS), with a somewhat different set of deficiencies.
>
> I think I'll add "rewrite sort code from the ground up" to my to-do
> list ;-)
I have on TODO:
Make index creation use psort code, because it is now faster(Vadim)
I didn't know sorting algorithms for tape and disk had different
optimizations. I figured the paging in of disk blocks had a similar
penalty to tape rewinding. None of us really knows a lot about the best
algorithm for that job. Nice you recognized it.
--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
From | Date | Subject | |
---|---|---|---|
Next Message | Martin Dolog | 1999-10-11 07:34:11 | unsubscribe |
Previous Message | Tom Lane | 1999-10-11 01:50:34 | Re: [SQL] Questions about vacuum analyze |