From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Jan Urbański <wulczer(at)wulczer(dot)org> |
Cc: | Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: gincostestimate |
Date: | 2010-07-26 10:58:31 |
Message-ID: | Pine.LNX.4.64.1007261452390.32129@sn.sai.msu.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Jan,
On Sun, 25 Jul 2010, Jan Urbaski wrote:
> On 02/07/10 14:33, Teodor Sigaev wrote:
>> Patch implements much more accuracy estimation of cost for GIN index
>> scan than generic cost estimation function.
>
> Hi,
>
> I'm reviewing this patch, and to begin with it I tried to reproduce the
> problem that originally came up on -performance in
> http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php
I attached scripts
>
> The links from that mail are now dead, so I set up my own test environment:
> * one table testfts(id serial, body text, body_fts tsvector)
> * 50000 rows, each with 1000 random words taken from
> /usr/share/dict/british-english-insane (the wbritish-insane Debian
> package) separated by a single space
> * each row also had the word "commonterm" at the end, 80% had
> commonterm80, 60% had commonterm60 etc (using the same methodology as
> Jesper, that commonterm60 can appear only if commonterm80 is in the row)
> * a GIN index on the tsvectors
>
> I was able to reproduce his issue, that is: select id from ftstest where
> body_fts @@ to_tsquery('commonterm80'); was choosing a sequential scan,
> which was resulting in much longer execution than the bitmap index plan
> that I got after disabling seqscans.
>
> I then applied the patch, recompiled PG and tried again... and nothing
> changed. I first tried running ANALYSE and then dropping and recreating
> the GIN index, but the planner still chooses the seq scan.
read thread
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01407.php
There is always a fuzz factor, as Tom said, about 1% in path cost comparisons.
You may compare plans for 'commonterm60', 'commonterm40'.
>
> Full explains below (the NOTICE is a debugging aid from the patch, which
> I temporarily enabled to see if it's picking up the code).
from this debug you can see that cost estimation now are much accurate
than before.
>
> I'll continue reading the code and trying to understand what it does,
> but in the meantime: am I doing something wrong that I don't see the
> planner switching to the bitmap index plan? I see that the difference in
> costs is small, so maybe I just need to tweak the planner knobs a bit?
> Is the output below expected?
I think Tom explained this
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01426.php
>
> Cheers,
> Jan
>
>
> wulczer=# explain analyse select id from ftstest where body_fts @@
> to_tsquery('commonterm80');
> NOTICE: GIN stats: nEntryPages: 49297.000000 nDataPages: 16951.000000
> nPendingPages :0.000000 nEntries: 277521.000000
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------
> Seq Scan on ftstest (cost=0.00..1567.00 rows=39890 width=4) (actual
> time=221.893..33179.794 rows=39923 loops=1)
> Filter: (body_fts @@ to_tsquery('commonterm80'::text))
> Total runtime: 33256.661 ms
> (3 rows)
>
> wulczer=# set enable_seqscan to false;
> SET
> Time: 0.257 ms
> wulczer=# explain analyse select id from ftstest where body_fts @@
> to_tsquery('commonterm80');
> NOTICE: GIN stats: nEntryPages: 49297.000000 nDataPages: 16951.000000
> nPendingPages :0.000000 nEntries: 277521.000000
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on ftstest (cost=449.15..1864.50 rows=39890 width=4)
> (actual time=107.421..181.284 rows=39923 loops=1)
> Recheck Cond: (body_fts @@ to_tsquery('commonterm80'::text))
> -> Bitmap Index Scan on ftstest_gin_idx (cost=0.00..439.18
> rows=39890 width=0) (actual time=97.057..97.057 rows=39923 loops=1)
> Index Cond: (body_fts @@ to_tsquery('commonterm80'::text))
> Total runtime: 237.218 ms
> (5 rows)
>
> Time: 237.999 ms
>
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Attachment | Content-Type | Size |
---|---|---|
words.txt | text/plain | 6.0 KB |
build-test.pl | text/plain | 1.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Urbański | 2010-07-26 11:08:54 | Re: gincostestimate |
Previous Message | Marko Tiikkaja | 2010-07-26 10:48:16 | Re: Synchronous replication |