Re: again on index usage

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Kalchev <daniel(at)digsys(dot)bg>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: again on index usage
Date: 2002-01-09 20:20:59
Message-ID: 14755.1010607659@sss.pgh.pa.us
Lists: pgsql-hackers

Daniel Kalchev <daniel(at)digsys(dot)bg> writes:
> In any case, if we need to hit 50k pages (assuming the indexed data is
> randomly scattered in the file), and having to read these three times each, it
> will be less I/O than having to read 1.7 million records.

How do you arrive at that? Assuming 100 records per page (probably the
right order of magnitude), the seqscan alternative is 17k page reads.
Yes, you examine more tuples, but CPUs are lots faster than disks.

That doesn't even address the fact that Unix systems reward sequential
reads and penalize random access. In a seqscan, we can expect that the
kernel will schedule the next page read before we ask for it, so that
our CPU time to examine a page is overlapped with I/O for the next page.
In an indexscan that advantage goes away, because neither we nor the
kernel know which page will be touched next. On top of the loss of
read-ahead, the filesystem is probably laid out in a way that rewards
sequential access with fewer and shorter seeks.

The tests I've made suggest that the penalty involved is about a factor
of four -- ie, a seqscan can scan four pages in the same amount of time
that it takes to bring in one randomly-accessed page.
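The cost comparison above can be sketched as a back-of-envelope calculation. The figures (1.7 million rows, ~100 rows per page, 50k index-driven page reads, and the ~4x random-access penalty) come from this thread; the variable names and the "sequential-page unit" bookkeeping are just illustrative, not anything from the PostgreSQL planner itself:

```python
# Rough cost sketch using the numbers discussed in this thread.
# Costs are expressed in units of "one sequential page read".

ROWS = 1_700_000
ROWS_PER_PAGE = 100      # "probably the right order of magnitude"
RANDOM_PAGES = 50_000    # pages the indexscan would touch
RANDOM_PENALTY = 4       # one random read ~ four sequential reads

seqscan_cost = ROWS // ROWS_PER_PAGE            # 17,000 sequential reads
indexscan_cost = RANDOM_PAGES * RANDOM_PENALTY  # 200,000 equivalent reads

print(f"seqscan:   {seqscan_cost:,} sequential-page units")
print(f"indexscan: {indexscan_cost:,} sequential-page units")
```

On these assumptions the seqscan wins by more than a factor of ten, even though it examines far more tuples, which is the point of the argument: page I/O pattern, not tuple count, dominates the cost.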

regards, tom lane
