Re: trouble with (lack of) indexing

From: "Nigel J(dot) Andrews" <nandrews(at)investsystems(dot)co(dot)uk>
To: Søren Boll Overgaard <postgres(at)fork(dot)dk>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: trouble with (lack of) indexing
Date: 2002-05-10 00:15:11
Message-ID: Pine.LNX.4.21.0205100108140.2371-100000@ponder.fairway2k.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Fri, 10 May 2002, Søren Boll Overgaard wrote:
> [snip]
> I missed the outputs of the explain analyze queries in the previous mail:
> explain analyze SELECT ht.id,hq.ip,hq.id FROM hostsqueue as hq,hoststests as ht WHERE ht.hostsqueue_id=hq.id;
> NOTICE: QUERY PLAN:
>
> Hash Join (cost=75331.89..127823.94 rows=160293 width=24) (actual time=122572.70..146683.59 rows=142807 loops=1)
> -> Seq Scan on hoststests ht (cost=0.00..49686.93 rows=160293 width=8) (actual time=170.79..21255.68 rows=142807 loops=1)
> -> Hash (cost=75243.51..75243.51 rows=35351 width=16) (actual time=122401.10..122401.10 rows=0 loops=1)
> -> Seq Scan on hostsqueue hq (cost=0.00..75243.51 rows=35351 width=16) (actual time=518.88..122024.50 rows=28610 loops=1)
> Total runtime: 147598.79 msec
>
> EXPLAIN
>
> And after setting the enable_seqscan to off, I got this:
>
> => SET enable_seqscan = off;
> SET VARIABLE
> => explain analyze SELECT ht.id,hq.ip,hq.id FROM hostsqueue as hq,hoststests as ht WHERE ht.hostsqueue_id=hq.id;
> NOTICE: QUERY PLAN:
>
> Merge Join (cost=0.00..772324.63 rows=160293 width=24) (actual time=8319.78..116061.18 rows=141645 loops=1)
> -> Index Scan using hostsqueue_pkey on hostsqueue hq (cost=0.00..137603.50 rows=35351 width=16) (actual time=34.90..90831.97 rows=28623 loops=1)
> -> Index Scan using idx_htsts_hq on hoststests ht (cost=0.00..632228.35 rows=160293 width=8) (actual time=18.41..23343.75 rows=141645 loops=1)
> Total runtime: 116562.03 msec
>
> EXPLAIN
>
> Actually, I was rather hoping for a large gain in performance, but maybe the
> "total runtimes" are not actally representative of performance of the actual
> selects?


Yes they are. That is the time that is taken to produce the results since that
is what explain analyze is doing. It doesn't cover the time taken to get those
results somewhere where they can be used, which for such large result set could
be significant (network transmission plus buffering in the client all before
the application can use any of it).

BTW, are you aware that those two things seem to be returning different numbers
of rows from each table and consequently from the whole query?

---
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message McCaffity, Ray (Contractor) 2002-05-10 00:21:37 newbie - syntax question
Previous Message Nigel J. Andrews 2002-05-10 00:07:18 Re: trouble with (lack of) indexing