Tom Polak wrote:
Hi neighbor! (We're just up I90/I39 a bit.)
> What kind of performance can I expect out of Postgres compare to
> MSSQL?
I can't speak directly to MS SQL Server 2000, but Sybase and ASE have
common roots; I think MS SQL Server 2000 is still using the engine
that they inherited from Sybase (check the server startup logging to
see if the Sybase copyright notice is still there), so comparisons to
Sybase might be roughly applicable.
Moving from Sybase on Windows to PostgreSQL on Linux we got a major
performance improvement. I hesitate to include a hard number in
this post because the license agreement from Sybase prohibited
publishing any benchmarks involving their product without advance
permission in writing. I don't think it constitutes a "benchmark" to
mention that we went from load balancing our largest web app against
two database servers to running comfortably on one database server
with the switch.
Over 95% of our queries ran faster on PostgreSQL without anything but
basic tuning of the server configuration. Most of the rest were
pretty easy to rework so they ran well. There was one which we had
to break up into multiple smaller queries. That was on PostgreSQL
8.1; in 8.4 the addition of semi-join and anti-join logic for EXISTS
tests solved many of the issues within the server; I'd bet 98% to 99%
of our queries would have run faster on PostgreSQL without any work
had that been present.
> RAID 5 (for data redundancy/security),
Contrary to some admonitions, RAID 5 performs well for some
workloads. It does, however, put you at risk of losing everything
should a second drive experience a failure before you rebuild a lost
drive, and if (as is usually the case) all your drives are from the
same batch, running in the same environment, with fairly evenly
balanced load, that second failure about the same time as the first
is not as rare as you might think. If you don't have good
replication, be sure you have good backups using hot or warm standby.
> 24 GB of RAM
> 10GB of data in a couple of tables
If the active portion of your database fits within RAM, you should
set your seq_page_cost and random_page_cost to equal values, probably
at 0.1 or less. That's in addition to all the other advice on
configuration.
> Any comparisons in terms of performance would be great.
Check your SQL Server license. Odds are good that it prevents you
from publishing benchmarks which they don't review and approve in
advance. There is probably a reason they put that in. PostgreSQL,
of course, has no such restriction. ;-)
We have a server not much bigger than what you're looking at, except
for a lot more drives, holding two databases over 1TB each, plus a
couple smaller ones. We've got over 20 web apps hitting this server,
the largest of which has over five million requests per day. While
the database involved is up to about 1.6 TB now, most of that is
documents; the active part of the database, holding case management
information, is about 200 GB, with some heavily hit tables holding
hundreds of millions of rows. Feel free to poke around to get a
sense of performance:
http://wcca.wicourts.gov/
When you do a name search, due to the inclusion of party aliases and
security and privacy rules, there are about 20 joins.
I hope this helps.
-Kevin