From: | Jeff Davis <jdavis-pgsql(at)empires(dot)org> |
---|---|
To: | peter pilsl <pilsl(at)goldfisch(dot)at> |
Cc: | PgSQL General List <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: many similar indexbased selects are extremely slow |
Date: | 2005-01-02 02:11:49 |
Message-ID: | 1104631910.3003.427.camel@jeff |
Lists: | pgsql-general |
For what it's worth, I put 100k rows into a table in 8.0beta5 and
selected 10k at a time. Issuing each SELECT separately through the
index took about 2.5s for the 10k SELECTs. Using a single IN query
containing all the IDs I wanted took less than a second.
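As a rough sketch of that batching idea (not part of the benchmark above;
the function name and the batch size are illustrative, and actually running
the queries through DBI or any other client library is left out), the 10k
single-row SELECTs can be collapsed into a handful of parameterized IN
queries:

```python
def build_in_queries(table, column, ids, batch_size=1000):
    """Yield (sql, params) pairs, splitting ids into batches so one
    huge IN list becomes a few moderate-sized queries instead of
    thousands of single-row lookups."""
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # one placeholder per id in this batch
        placeholders = ",".join(["%s"] * len(batch))
        sql = (f"SELECT oid FROM {table} "
               f"WHERE {column} IN ({placeholders})")
        yield sql, batch

# 10000 ids become 10 queries of 1000 placeholders each,
# ready to hand to a DB-API execute() with their params.
queries = list(build_in_queries("artikelindex", "id_artikel",
                                list(range(10000))))
```

Whether 1000 values per IN list is the sweet spot depends on the server
version and planner; the point is only that the round-trip count drops
from 10000 to 10.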
Jeff
On Sat, 2005-01-01 at 17:34 -0800, Jeff Davis wrote:
> Well, first it would be a good idea to see what postgres is actually
> doing. Send the output of:
>
> => EXPLAIN ANALYZE SELECT OID FROM artikelindex WHERE id_artikel=?;
>
> (where ? is replaced by some id value)
>
> It will either say index lookup or sequential scan since it's just a
> select from one table. That will tell you whether the index is being
> used or not.
>
> Next, if it is doing something that seems unreasonable, try doing:
>
> => VACUUM ANALYZE artikelindex;
>
> And we should also look at the EXPLAIN output on the IN query version
> like:
>
> => EXPLAIN ANALYZE SELECT OID FROM artikelindex WHERE id_artikel IN
> (?,?,...,?);
>
> Although I'm not sure what kind of performance to expect if you have 10k
> values in that list.
>
> Also, what version are you using? And how many rows in the table? Is it
> regularly vacuumed? Are there a lot of updates/deletes going to that
> table?
>
> I suspect the fastest way postgres can get you those rows is the IN
> query on a recent version of postgresql. It may choose a sequential
> scan, which would likely be a good plan, since one seq scan should
> hopefully take less than 10 seconds.
>
> Regards,
> Jeff Davis
>
>
> On Sun, 2005-01-02 at 00:52 +0100, peter pilsl wrote:
> > PostgreSQL 8:
> >
> > I use a fairly large table to store information and keep the ID value of
> > each row in my application's memory for faster access.
> > My application can calculate a list of needed IDs in very little
> > time and then wants to retrieve all rows corresponding to these IDs.
> >
> > So in fact I perform a lot of operations like:
> >
> > select field1,field2,field3 from mytable where id=XX;
> >
> > There is an index on the id field, and the id field is of type OID, so
> > everything should be quite fast. Unfortunately it is not.
> >
> > On a 2GHz machine with 1GB RAM it takes approx. 10 seconds to fetch 10000
> > rows. In this test scenario I only fetch the OID and no other column.
> > I don't understand this. Am I expecting far too much? Is 10 seconds for
> > the retrieval of 10000 OIDs a reasonable value? I want it to be less than
> > one second, and from my experience with postgres this operation is
> > extremely slow compared to the impressive speed of most other operations.
> >
> > I also tried to use the IN operator, which is even slower. Is there
> > any other way to speed things up? I can order the list of IDs to
> > retrieve in my application if there is a way to tell postgres not to
> > search the whole index every time but somehow "do better".
> >
> > If it is of any interest, here is the table. The id is stored as id_artikel.
> >
> >
> > Table "public.artikelindex"
> > Column | Type |
> > Modifiers
> > -----------------------+-----------------------------+----------------------------------------------------
> > autor_artikel | text |
> > titel_artikel | text |
> > jahrgang_zeitschrift | integer |
> > jahr_zeitschrift | character varying(20) |
> > heftnummer | character varying(30) |
> > seitenzahl_artikel | character varying(30) |
> > bemerkungen_artikel | text |
> > deskriptoren_alt | text |
> > deskriptoren_neu | text |
> > personennamen_artikel | text |
> > orte_artikel | text |
> > id_artikel | oid |
> > id_titel | oid |
> > cdate | timestamp without time zone | default
> > ('now'::text)::timestamp(6) with time zone
> > udate | timestamp without time zone | default
> > ('now'::text)::timestamp(6) with time zone
> > uid | oid |
> > gid | oid |
> > mod | boolean |
> > Indexes:
> > "id_artikel_idx" btree (id_artikel)
> > "id_titel_idx" btree (id_titel)
> > "idx_artikelindeax_autor" btree (autor_artikel)
> > "idx_artikelindex_fingerprint" btree (id_artikel)
> > "idx_artikelindex_jahr" btree (jahrgang_zeitschrift)
> > "idx_artikelindex_jahrgang" btree (jahr_zeitschrift)
> > "idx_artikelindex_zeitschrift" btree (id_titel)
> > Rules:
> > delete_val AS
> > ON DELETE TO artikelindex DO UPDATE counter SET val = counter.val + 1
> > WHERE counter.tab::text = 'artikelindex'::character varying::text
> > insert_val AS
> > ON INSERT TO artikelindex DO UPDATE counter SET val = counter.val + 1
> > WHERE counter.tab::text = 'artikelindex'::character varying::text
> > update_val AS
> > ON UPDATE TO artikelindex DO UPDATE counter SET val = counter.val + 1
> > WHERE counter.tab::text = 'artikelindex'::character varying::text
> >
> >
> > And more: here is my retrieval program. I use Perl with the DBI module;
> > the relevant code snippet is:
> >
> > --------------------------
> > my $sth=$dbh->prepare(
> > 'select OID from artikelindex where id_artikel=?');
> >
> > foreach (@id) {
> > my $ret=$sth->execute($_);
> > my $x=$sth->fetchrow_arrayref;
> > }
> > -------------------------
> >
> >
> >
> > thanks a lot for any ideas,
> > peter
> >
> >
> >
> > --
> > mag. peter pilsl
> > goldfisch.at
> > IT-management
> > tel +43 699 1 3574035
> > fax +43 699 4 3574035
> > pilsl(at)goldfisch(dot)at
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
>
>