Re: [SQL] selecting from indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tim Joyce" <tim(at)hoop(dot)co(dot)uk>
Cc: "bloomsbury development" <bloomsbury_development(at)messageboards(dot)paneris(dot)org>, pgsql-sql(at)postgreSQL(dot)org
Subject: Re: [SQL] selecting from indexes
Date: 1999-11-19 17:23:57
Message-ID: 2858.943032237@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

"Tim Joyce" <tim(at)hoop(dot)co(dot)uk> writes:
>>>> SELECT id FROM books WHERE category_key = 1471;
>>>> (this takes ages on a table not ordered by category_key even if I have
>>>> an index on category_key)

> the query above selects 294072 rows, which i obviously don't want to do, but
> I do want to use the clause above in a query that involves a join. eg

Ah, I begin to understand. With an index scan you're going to get
294072 probes into the table (maybe even more, if there are deleted rows
that match the category_key). If the rows are scattered all over the
disk then that may actually take about 300k disk reads. After you
cluster the table, the rows with the same category_key are all
contiguous in the table, so many fewer blocks have to be read to visit
them all. That's why clustering helps here.

Since you're selecting about 1/4th of the table, this particular query
would probably be better off *not* using the index, but just doing a
sequential scan of the whole table :-(. I assume most of your
categories are more selective than this one, though, so dropping the
category index entirely is probably not the answer.

> select id from books, book_words where book_words.word='happy' and
> book_words.id = books.id and books.category_key=1471;

If this is what you're really doing, I think what you actually want is
indexes on book_words.word and books.id. That would allow book_words
to be searched on the word (hopefully giving a more selective result
than the category does), and then books would be probed using the id index.
id has unique values, right?

> perhaps (in the above query) there is a way of directing postgres to only
> access the books that are selected by the 'words' part of the query?

You might want to look at contrib/fulltextindex in the distribution for
ideas about indexing words. fulltextindex might be overkill for your
needs, or maybe not, but you could probably adapt it for your purposes.

regards, tom lane

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tim Joyce 1999-11-19 20:06:34 Re: [SQL] selecting from indexes
Previous Message Tim Joyce 1999-11-19 16:25:40 Re: [SQL] selecting from indexes