Re: Performance

From: Dustin Sallings <dustin(at)spy(dot)net>
To: Matthias Urlichs <smurf(at)noris(dot)net>
Cc: Alfred Perlstein <bright(at)wintelcom(dot)net>, Charles Tassell <ctassell(at)isn(dot)net>, Diego Schvartzman <dschvar(at)yahoo(dot)com>, Lista PGSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Performance
Date: 2000-05-20 08:16:16
Message-ID: Pine.NEB.4.10.10005200104420.304-100000@foo.west.spy.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, 19 May 2000, Matthias Urlichs wrote:

# If a table has an index (let's say it is a btree on fields a,b), and
# if a SELECT/INSERT/UPDATE/DELETE is issued with field a being either
# inserted or in the where clause, then the database needs to use that
# index. Period.

That's not the case. What if I only have two rows in it? It
would take more resources to use the index than it would to do a
sequential scan.

# Requiring the application to call VACUUM in order to get any kind of
# performance is not a solution. When exactly am I supposed to do that?
# Before inserting one million records into my temporary table it's of
# no use whatsoever, and afterwards it's next week already. Literally.

I had the same conversation with some of my Sybase DBAs, they
explained to me why I was wrong, and why they needed to manually update
statistics for smarter index usage instead of having the hot point during
the inserts. It can probably be designed in such a way that the
statistics can be updated constantly without slowing everything down too
much, but I'm not a postgres developer and don't have the time to find out
if that's true.

# Unfortunately, the observable behavior in this case is something like
# - create table
# - create index
# - call VACUUM or not, doesn't make a difference because the table is
# empty anyway
# - do a whole lot of INSERTs during which PostgreSQL is slow as molasses.
#
# Ouch.

You'll save a tremendous amount of time by loading the data before
you add an index. This is probably a big part of the reason you spend a
week loading one million entries into a table. I don't think it takes me
an hour to load my 15,627,696 row table from scratch, after which I create
the index in about half that time, and a vacuum takes me approximately
five minutes.

Now, it's true, I don't remember having to vacuum before, but the
vacuum isn't very painful.

--
dustin sallings The world is watching America,
http://2852210114/~dustin/ and America is watching TV.

Browse pgsql-general by date

  From Date Subject
Next Message Chris 2000-05-20 08:29:12 Re: OO Patch
Previous Message Matthias Urlichs 2000-05-20 08:09:00 Re: Performance (was: The New Slashdot Setup (includes MySql server))