Re: Performance Question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: tim(at)dmcity(dot)net
Cc: pgsql-sql(at)hub(dot)org
Subject: Re: Performance Question
Date: 1999-02-04 15:27:54
Message-ID: 7741.918142074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

"Tim Perdue" <tim(at)directricity(dot)com> writes:
> Am I going to run into huge performance problems on this? Should each
> mailing list be archived in its own table??
> When I do a
> SELECT * FROM tbl_mail_archive WHERE fld_mail_list=1 AND
> fld_mail_body~~'%keyword%';
> am I going to get killed with a performance hit??

Should be OK as long as you make an index on fld_mail_list (and don't
forget to vacuum regularly).

Of course, maintaining that index is not zero-cost. The appropriate
thing to ask is what your usage patterns will be. If you frequently
make searches across multiple mailing lists, then you undoubtedly
want to do it as you show above. If you never (or hardly ever) do that,
you might as well keep each mailing list in its own table and live with
having to do multiple SELECTs when you do want to look across multiple
lists.

> It's running OK now, with the table at 20MB, but I have 100MB more
> worth of letters to drop into the table..... 8-)

I suspect your real problem is going to be that searching 100MB with
"fld_mail_body~~'%keyword%'" is going to be dog-slow. I think you
are going to want a full-text index if you expect to do that a lot.

There is a simple all-Postgres FTI in the contrib part of the
distribution, but I think it'd probably run out of steam long before you
got to 100MB. What I'd probably do in your situation is to use
Glimpse (http://glimpse.cs.arizona.edu/) for the text index.
That'd likely mean storing the message bodies in separate files outside
the database proper, and keeping only the file names in the database
rows. (But that'd get rid of the message-over-8K problem, so it isn't
all bad...)

regards, tom lane

Browse pgsql-sql by date

  From Date Subject
Next Message Gregory Holston 1999-02-04 17:13:22 using matches on int4
Previous Message Tim Perdue 1999-02-04 15:24:38 Re: [SQL] Performance Question