Re: [HACKERS] sort on huge table

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: t-ishii(at)sra(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] sort on huge table
Date: 1999-11-01 18:00:21
Message-ID: 199911011800.NAA20652@candle.pha.pa.us
Lists: pgsql-hackers

> Next question is what to do about it. I don't suppose we have any way
> of turning off the OS' read-ahead algorithm :-(. We could forget about
> this space-recycling improvement and go back to separate temp files.
> The objection to that, of course, is that while sorting might be faster,
> it doesn't matter how fast the algorithm is if you don't have the disk
> space to execute it.

Look what I found. I downloaded the Linux kernel source for 2.2.0 and
searched the file system code for the word 'ahead'. Read-ahead seems to
be controlled by f_reada, and look where I found it being turned off:
in the lseek code (below). It seems that any seek that changes the file
position turns off read-ahead on Linux.

When you do a read or write, it seems to be turned on again. Once you
read/write, the next read/write will do read-ahead, assuming you don't
do any lseek() before the second read/write().

It seems the algorithm in psort now rarely gets read-ahead on Linux,
presumably because it seeks before most of its reads. Other OS's instead
check whether the read-ahead data was eventually used, and control
read-ahead that way.

Read-ahead also seems to be off on the first read from a file.
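
To make the flag behaviour described above concrete, here is a minimal
toy model in C. It is only a sketch of my reading of the code, not kernel
code: struct toy_file, toy_lseek(), toy_read() and count_readahead() are
hypothetical names, the 8192-byte block size is just for illustration,
and it assumes every read re-enables the hint.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct file fields that matter here. */
struct toy_file {
    long long pos;    /* current offset, plays the role of f_pos */
    int       reada;  /* read-ahead hint, plays the role of f_reada */
};

/* Mirrors the tail of ext2_file_lseek: only a seek that actually
 * changes the position clears the read-ahead hint. */
static long long toy_lseek(struct toy_file *f, long long offset)
{
    assert(f != NULL);
    if (offset != f->pos) {
        f->pos = offset;
        f->reada = 0;
    }
    return offset;
}

/* Returns 1 if this read would have had the hint set, else 0.
 * Assumption: any read turns the hint back on for the next one. */
static int toy_read(struct toy_file *f, long long nbytes)
{
    int had_hint;

    assert(f != NULL);
    had_hint = f->reada;
    f->pos += nbytes;
    f->reada = 1;
    return had_hint;
}

/* Count how many of nreads block reads see the hint set.  If
 * seek_between is nonzero, jump to a different offset before each
 * read, the way a sort hopping between tape regions would. */
static int count_readahead(int nreads, int seek_between)
{
    struct toy_file f = {0, 0};
    int hits = 0;
    int i;

    for (i = 0; i < nreads; i++) {
        if (seek_between)
            toy_lseek(&f, f.pos + 8192 * 10);
        hits += toy_read(&f, 8192);
    }
    return hits;
}
```

In this model count_readahead(5, 0) gives 4, because only the first read
lacks the hint, while count_readahead(5, 1) gives 0, because every read
is preceded by a seek that clears it.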

---------------------------------------------------------------------------

/*
 *  linux/fs/ext2/file.c
...
/*
 * Make sure the offset never goes beyond the 32-bit mark..
 */
static long long ext2_file_lseek(
    struct file *file,
    long long offset,
    int origin)
{
    struct inode *inode = file->f_dentry->d_inode;

    switch (origin) {
        case 2:
            offset += inode->i_size;
            break;
        case 1:
            offset += file->f_pos;
    }
    if (((unsigned long long) offset >> 32) != 0) {
#if BITS_PER_LONG < 64
        return -EINVAL;
#else
        if (offset > ext2_max_sizes[EXT2_BLOCK_SIZE_BITS(inode->i_sb)])
            return -EINVAL;
#endif
    }
    if (offset != file->f_pos) {
        file->f_pos = offset;
        file->f_reada = 0;
        file->f_version = ++event;
    }
    return offset;
}

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
