From: Rick Jansen <rick(at)rockingstone(dot)nl>
To: pgsql-performance(at)postgresql(dot)org
Subject: Tsearch2 performance on big database
Date: 2005-03-22 12:28:07
Message-ID: 42400F57.3040604@rockingstone.nl
Lists: pgsql-performance
Hi,
I'm looking for a *fast* solution to search through ~4 million records of
book descriptions. I've installed PostgreSQL 8.0.1 on a dual Opteron
server with 8 GB of memory, running Linux 2.6. I haven't done a lot of
tuning on PostgreSQL itself, but here are the settings I have changed so far:
shared_buffers = 2000 (anything much bigger and the kernel says it isn't
allowed; I still have to look into that)
effective_cache_size = 32768
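For reference, those values can be double-checked from psql; assuming the
default 8 kB block size, the numbers work out as in the comments below (the
kernel limit on shared_buffers is presumably Linux's SHMMAX shared-memory cap):

SHOW shared_buffers;        -- 2000 pages x 8 kB = ~16 MB of shared buffers
SHOW effective_cache_size;  -- 32768 pages x 8 kB = 256 MB assumed OS cache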
Here's my table:
ilab=# \d books
                                     Table "public.books"
    Column     |          Type          |                         Modifiers
---------------+------------------------+-----------------------------------------------------------
 recordnumber  | integer                | not null default nextval('books_recordnumber_seq'::text)
 membernumber  | integer                | not null default 0
 booknumber    | character varying(20)  | not null default ''::character varying
 author        | character varying(60)  | not null default ''::character varying
 titel         | text                   | not null
 description   | character varying(100) | not null default ''::character varying
 descriprest   | text                   | not null
 price         | bigint                 | not null default 0::bigint
 keywords      | character varying(100) | not null default ''::character varying
 dollarprice   | bigint                 | not null default 0::bigint
 countrynumber | smallint               | not null default 0::smallint
 entrydate     | date                   | not null
 status        | smallint               | not null default 0::smallint
 recordtype    | smallint               | not null default 0::smallint
 bookflags     | smallint               | not null default 0::smallint
 year          | smallint               | not null default 0::smallint
 firstedition  | smallint               | not null default 0::smallint
 dustwrapper   | smallint               | not null default 0::smallint
 signed        | smallint               | not null default 0::smallint
 cover         | smallint               | not null default 0::smallint
 specialfield  | smallint               | not null default 0::smallint
 idxfti        | tsvector               |
Indexes:
    "recordnumber_idx" unique, btree (recordnumber)
    "idxfti_idx" gist (idxfti)
idxfti is a tsvector of concatenated description and descriprest.
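I haven't included the exact statements, but the column is filled and kept up
to date roughly like this with the tsearch2 functions (a sketch; the 'default'
configuration name is an assumption on my part):

UPDATE books
   SET idxfti = to_tsvector('default',
         coalesce(description, '') || ' ' || coalesce(descriprest, ''));

CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON books
   FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxfti, description, descriprest);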
ilab=# select avg(character_length(description)), avg(character_length(descriprest)) from books;
         avg         |         avg
---------------------+----------------------
 89.1596992873947218 | 133.0468689304200538
Queries take forever to run. Right now we run a MySQL server, on which
we maintain our own indices (we split the description fields by word and
have different tables for words and the book descriptions they appear in).
For example, a query for the word 'terminology' on our MySQL search
takes 5.8 seconds and returns 375 results. The same query on postgresql
using the tsearch2 index takes 30802.105 ms and returns 298 results.
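The tsearch2 query is essentially of this form (a sketch, not the exact
production query; the 'default' configuration and the column list are
assumptions):

SELECT recordnumber, author, titel
  FROM books
 WHERE idxfti @@ to_tsquery('default', 'terminology');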
How do I speed this up? Should I change settings, add or change indexes,
or... what?
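To give the list something concrete, this is how I'd capture the plan for the
slow query (same sketch query as above; I can post the real output if that
helps):

EXPLAIN ANALYZE
SELECT recordnumber, author, titel
  FROM books
 WHERE idxfti @@ to_tsquery('default', 'terminology');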
Rick Jansen
--
Systems Administrator for Rockingstone IT
http://www.rockingstone.com
http://www.megabooksearch.com - Search many book listing sites at once