Re: Forcing the right queryplan

From: Henk van Lingen <H(dot)G(dot)K(dot)vanLingen(at)uu(dot)nl>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Forcing the right queryplan
Date: 2010-09-02 08:23:33
Message-ID: 20100902082333.GA5174@uu.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-novice


No ideas on this one?

Regards,

On Tue, Aug 31, 2010 at 04:50:09PM +0200, Henk van Lingen wrote:
>
> Hi,
>
> I've the problem my database is not using the 'right' queryplan in all
> cases. Is there a way I can force that and/or how should I tuned the
> table statistics?
>
> I'm doing a rsyslog database in PostgreSQL with millions of records
> (firewall logging). The db scheme is the so called 'MonitorWare' scheme,
> to wich I added two extra indexes.
>
> syslog=# select version();
> version
>
> --------------------------------------------------------------------------------
> ----------------------------------
> PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20
> 080704 (Red Hat 4.1.2-48), 64-bit
>
>
> syslog=# \d systemevents
> Table "public.systemevents"
> Column | Type | Modi
> fiers
> --------------------+-----------------------------+-----------------------------
> ------------------------------
> id | integer | not null default nextval('sy
> stemevents_id_seq'::regclass)
> customerid | bigint |
> receivedat | timestamp without time zone |
> devicereportedtime | timestamp without time zone |
> facility | smallint |
> priority | smallint |
> fromhost | character varying(60) |
> message | text |
> ntseverity | integer |
> importance | integer |
> eventsource | character varying(60) |
> eventuser | character varying(60) |
> eventcategory | integer |
> eventid | integer |
> eventbinarydata | text |
> maxavailable | integer |
> currusage | integer |
> minusage | integer |
> maxusage | integer |
> infounitid | integer |
> syslogtag | character varying(60) |
> eventlogtype | character varying(60) |
> genericfilename | character varying(60) |
> systemid | integer |
> Indexes:
> "systemevents_pkey" PRIMARY KEY, btree (id)
> "fromhost_idx" btree (fromhost)
> "msgs_idx" gin (to_tsvector('english'::regconfig, message))
>
> The GIN index is to do text searching (via LogAnalyzer).
>
> Now there are two types of query plans:
>
> syslog=# explain SELECT id, devicereportedtime, facility, priority, fromhost, syslogtag, infounitid, message FROM systemevents WHERE ( ( to_tsvector('english', message) @@ to_tsquery ( '131.211.112.9')) ) ORDER BY id DESC LIMIT 100; QUERY PLAN
>
> --------------------------------------------------------------------------------
> ---------------------------------
> Limit (cost=0.00..10177.22 rows=100 width=159)
> -> Index Scan Backward using systemevents_pkey on systemevents (cost=0.00..
> 1052934.86 rows=10346 width=159)
> Filter: (to_tsvector('english'::regconfig, message) @@ to_tsquery('131.
> 211.112.9'::text))
> (3 rows)
>
> This one is useless (takes very long). However this one:
>
> syslog=# explain SELECT id, devicereportedtime, facility, priority, fromhost, syslogtag, infounitid, message FROM systemevents WHERE ( ( to_tsvector('english', message) @@ to_tsquery ( '131.211.112.9')) ) ORDER BY id DESC LIMIT 500;
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> -----------------------------------
> Limit (cost=40928.89..40930.14 rows=500 width=159)
> -> Sort (cost=40928.89..40954.76 rows=10346 width=159)
> Sort Key: id
> -> Bitmap Heap Scan on systemevents (cost=2898.06..40413.36 rows=1034
> 6 width=159)
> Recheck Cond: (to_tsvector('english'::regconfig, message) @@ to_t
> squery('131.211.112.9'::text))
> -> Bitmap Index Scan on msgs_idx (cost=0.00..2895.47 rows=10346
> width=0)
> Index Cond: (to_tsvector('english'::regconfig, message) @@
> to_tsquery('131.211.112.9'::text))
> (7 rows)
>
> works acceptable.
>
> Stats:
>
> syslog=# SELECT relname, relkind, reltuples, relpages
> FROM pg_class WHERE relname LIKE 'systemevents%'; relname | relkind | reltuples | relpages
> -------------------------------+---------+-------------+----------
> systemevents_pkey | i | 2.06915e+06 | 71985
> systemeventsproperties | r | 0 | 0
> systemeventsproperties_pkey | i | 0 | 1
> systemevents_id_seq | S | 1 | 1
> systemeventsproperties_id_seq | S | 1 | 1
> systemevents | r | 2.06915e+06 | 694826
> (6 rows)
>
> syslog=# SELECT relname, relkind, reltuples, relpages
> FROM pg_class
> WHERE relname LIKE 'msg%';
> relname | relkind | reltuples | relpages
> ----------+---------+-------------+----------
> msgs_idx | i | 2.06915e+06 | 128069
> (1 row)
>
> How to use the right plan regardless of the 'LIMIT-size'?
>
> Cheers,
> --
> Henk van Lingen, ICT-SC Netwerk & Telefonie, (o- -+
> Universiteit Utrecht, Jenalaan 18a, room 0.12 /\ |
> phone: +31-30-2538453 v_/_ |
> http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
Henk van Lingen, ICT-SC Netwerk & Telefonie, (o- -+
Universiteit Utrecht, Jenalaan 18a, room 0.12 /\ |
phone: +31-30-2538453 v_/_ |
http://henk.vanlingen.net/ http://www.tuxtown.net/netiquette/

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joachim Worringen 2010-09-02 12:16:20 coping with failing disks
Previous Message Gerhard Wiesinger 2010-09-02 05:44:51 Double iteration in the buffer cache code

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2010-09-02 13:41:15 Re: operator @> does not work with box.
Previous Message A B 2010-09-02 05:50:13 operator @> does not work with box.