Performance issues

From: Eric Lee Green <eric(at)badtux(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Performance issues
Date: 2002-03-18 23:01:27
Message-ID: 200203181741.28686@badtux.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

PostGreSQL 7.2, Red Hat Linux 6.2:

I am having a very strange performance issue. I am using postgresql-ruby, but
I don't think this is a postgresql-ruby issue. If I issue 1,000 inserts in a
row as part of a transaction, I can get 900 inserts per second. If I mix the
inserts with a select inbetween on the same row, I can insert at only 7
inserts per second.

I.e.:

begin;
insert into migfiles (source,mtime) values ('/foo/bar',12341523);
.... [999 more]
end;

runs quite rapidly, while

begin;
select from migfiles where source = #{filename};
...(and if the file doesn't exist) ...
insert into migfiles (source, mtime) values ('#{filename}',#{mtime});
[ 999 more select/insert pairs ]
end;

takes a quite long time to run.

(Obviously there's some ruby logic to do the loop and all that good stuff,
but this should give you the gist of things).

'source' and 'mtime' are a text field and an integer field, respectively. The
only index is on 'source', necessary in order to make the 'select' work
properly once we get up past 200,000 records.

I did an strace on the back end:

recv(7, "Qinsert into MigFiles (migrate_t"..., 8192, 0) = 290
lseek(29, 0, SEEK_END) = 8192
lseek(29, 0, SEEK_END) = 8192
send(7, "Pblank\0CINSERT 455040 1\0Z", 25, 0) = 25
recv(7, "Qselect source from migfiles whe"..., 8192, 0) = 94
lseek(40, 0, SEEK_END) = 73097216
send(7, "Pblank\0T\0\1source\0\0\0\0\31\377\377\377\377\377\377"..., 36, 0) =
36recv(7, "Qinsert into MigFiles (migrate_t"..., 8192, 0) = 290
lseek(29, 0, SEEK_END) = 8192
lseek(29, 0, SEEK_END) = 8192
send(7, "Pblank\0CINSERT 455041 1\0Z", 25, 0) = 25
recv(7, "Qselect source from migfiles whe"..., 8192, 0) = 94
lseek(40, 0, SEEK_END) = 73097216
send(7, "Pblank\0T\0\1source\0\0\0\0\31\377\377\377\377\377\377"..., 36, 0) =
36recv(7, "Qinsert into MigFiles (migrate_t"..., 8192, 0) = 290
lseek(29, 0, SEEK_END) = 8192
lseek(29, 0, SEEK_END) = 8192
send(7, "Pblank\0CINSERT 455042 1\0Z", 25, 0) = 25
recv(7, "Qselect source from migfiles whe"..., 8192, 0) = 94
lseek(40, 0, SEEK_END) = 73097216
send(7, "Pblank\0T\0\1source\0\0\0\0\31\377\377\377\377\377\377"..., 36, 0) =
36

So there's nothing there unusual.

I thought perhaps that I was not properly batching these between a begin/end
pair for the inserts, but I modified my test program and even using the
inherent per-insert transaction locking it manages 219 files per second.

I'm stumped. There's no reason in the world why I should only be inserting at
7 records per second! (And yes, I'm running this against data sets that have
not yet been migrated into the database). Anybody have any other suggestions
for what I should do to try to speed things up? I'm about at the point where
I'm just going to batch up the 1000 filenames in an array, do a bulk insert
all at one time (i.e., no more mixed in selects, do a begin, a bunch of
inserts, then an end), and see if that speeds things up... does that sound
like something that might work?

I've done a 'vacuum analyze;' and an 'explain' on the select (yes it's using
the index, the 'select' works quite rapidly, if I run the program on a
directory that has already been entered into the database the program runs
quite rapidly).

BTW: Why does 'select count(*) from migfiles;' run so slowly compared to
MySQL? Is there a better way to do this in PostGres?

We're trying to get away from MySQL because of its limitations, but we can't
live with 7 inserts per second :-(. This one has me stumped.

Eric Lee Green GnuPG public key at http://badtux.org/eric/eric.gpg
mailto:eric(at)badtux(dot)org Web: http://www.badtux.org/eric
Join the Conspiracy: http://evidence-eliminator-sucks.com

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joep deVocht 2002-03-19 00:28:37 Small question
Previous Message Jan Wieck 2002-03-18 20:43:06 Re: always forced restart after status 139?