From: | Jesper Krogh <jesper(at)krogh(dot)cc> |
---|---|
To: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). |
Date: | 2009-10-27 06:42:00 |
Message-ID: | 4AE69638.8040501@krogh.cc |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Craig Ringer wrote:
> On 8.4 on a different system Pg uses the seq scan by preference, with a
> runtime of 1148ms. It doesn't seem to want to do a bitmap heap scan when
> searching for `commonterm' even when enable_seqscan is set to `off'. A
> search for `commonterm80' also uses a seq scan (1067ms), but if
> enable_seqscan is set to off it'll use a bitmap heap scan at 237ms.
Ok, thats excactly as my number.
> On my 8.3 Pg isn't using a seqscan even for `commonterm', which is ...
> odd. If I force it not to use a bitmap heap scan it'll use an index
> scan. Preventing that too results in a seq scan with a runtime of
> 1500ms vs the 161ms of the bitmap heap scan. I agree that it seems like
> a pretty strange result on face value.
PG 8.3 doesnt have statistics data available for gin-indexes so that may
be why the query-planner can do otherwise on 8.3. It also means that it
is a regression since in these cases 8.4 will perform worse than 8.3
did. (allthough the statistics makes a lot other cases way better).
> So, on both 8.3 and 8.4 the sequential scan is indeed taking a LOT
> longer than the bitmap heap scan, though similar numbers of tuples are
> being read by both.
>
> I see the same results when actually reading the results rather than
> just doing an `explain analyze'. With psql set to send output
> to /dev/null and with \timing enabled:
>
> test=> \o /dev/null
> test=> set enable_seqscan = on;
> Time: 0.282 ms
> test=> select id from ftstest where body_fts @@
> to_tsquery('commonterm80');
> Time: 988.880 ms
> test=> set enable_seqscan = off;
> Time: 0.286 ms
> test=> select id from ftstest where body_fts @@
> to_tsquery('commonterm80');
> Time: 159.167 ms
>
> so - nearly 1s vs 0.15s is a big difference between what I previously
> confirmed to be bitmap heap scan and seq scan respectively for the same
> query. The same number of records are being returned in both cases.
>
> If I "select *" rather than just reading the `id' field, the runtimes
> are much more similar - 4130ms seq scan, and 3285 bitmap heap scan (both
> reading data to /dev/null), a difference of ~800. `EXPLAIN ANALYZE'
> results are still quite different, though, at 1020ms seq scan vs 233ms
> bitmap heap, suggesting that the similarity is created only by the time
> taken to actually transfer the data to the client. The time difference
> between the two is much the same.
>
> So - for some reason the seq scan takes 800ms or so longer than the
> bitmap heap scan. I can see why you're puzzled. I can reproduce it on
> two different machines with two different Pg versions, and using two
> slightly different methods for loading the data as well. So, I can
> confirm your test results now that I'm actually testing properly.
Thanks a lot.
> test=> explain analyze select * from ftstest where body_fts @@
> to_tsquery('commonterm80');
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on ftstest (cost=25836.66..36432.95 rows=39753
> width=54) (actual time=27.452..175.481 rows=39852 loops=1)
> Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text))
> -> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..25826.72
> rows=39753 width=0) (actual time=25.186..25.186 rows=39852 loops=1)
> Index Cond: (body_fts @@ to_tsquery('commonterm80'::text))
> Total runtime: 233.473 ms
> (5 rows)
>
> test=> set enable_seqscan = on;
> SET
> test=> explain analyze select * from ftstest where body_fts @@
> to_tsquery('commonterm80');
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------
> Seq Scan on ftstest (cost=0.00..10750.00 rows=39753 width=54) (actual
> time=0.141..956.496 rows=39852 loops=1)
> Filter: (body_fts @@ to_tsquery('commonterm80'::text))
> Total runtime: 1020.936 ms
> (3 rows)
My systems seems more to prefer bitmap-scans a bit more, but given the
actual number it seems to be preferrablem. Thats about query-planning,
my main reason for posting was the actual run time.
> By the way, for the 8.4 test I modifed the loader script so it wouldn't
> take quite so painfully long to run second time 'round. I turned
> autocommit off, wrapped all the inserts up in a single transaction, and
> moved the fts index creation to after all the data has been inserted.
> It's a *LOT* faster, and the test results match yours.
I'll make that change if I have to work a bit more with it.
Thanks for speding time confirming my findings. (the I know its not just
me getting blind at some problem).
Jesper
--
Jesper
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2009-10-27 14:48:16 | Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). |
Previous Message | Craig Ringer | 2009-10-27 06:14:37 | Re: bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search). |