Re: partitioning query planner almost always scans all tables

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robjsargent(at)gmail(dot)com
Cc: sivann(at)inaccess(dot)com, pgsql-general(at)postgresql(dot)org
Subject: Re: partitioning query planner almost always scans all tables
Date: 2015-01-20 02:33:40
Message-ID: 20150120.113340.211360232.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

Your constraint column is of 'timestamp with time zone' and the
query uses 'date'. The comparison between them is an operator
based on a static, non-immutable function so constraint exclusion
doesn't work.

SELECT o.oprname, o.oprcode, p.provolatile
FROM pg_operator o join pg_proc p on (p.oid = o.oprcode)
WHERE oprname = '<' AND
oprleft = 'timestamp with time zone'::regtype AND
oprright = 'date'::regtype;

oprname | oprcode | provolatile
---------+---------------------+-------------
< | timestamptz_lt_date | s

# '<' is a random selection. Any comparison ops will do.

The following query instead will do what you wanted. Only the
operand of '>=' is changed.

EXPLAIN ANALYZE SELECT * FROM measurement_events WHERE measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND measurement_time >= '2015-01-01 00:00:00+0' LIMIT 1;

The function in the expression used for exclusion is immutable.

SELECT o.oprname, o.oprcode, p.provolatile
FROM pg_operator o join pg_proc p on (p.oid = o.oprcode)
WHERE oprname = '<' AND
oprleft = 'timestamp with time zone'::regtype AND
oprright = 'timestamp with time zone::regtype;

oprname | oprcode | provolatile
---------+----------------+-------------
< | timestamptz_lt | i

The details about this in the following page,

http://www.postgresql.org/docs/9.3/static/ddl-partitioning.html

> 5.9.6 Cavert

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Jan 2015 06:42:53 -0700, Rob Sargent <robjsargent(at)gmail(dot)com> wrote in <31FED87E-D31B-4CF9-93F8-CC0F131CB6DF(at)gmail(dot)com>
> I don't understand having both UUID and time stamp in your PK? The first is by defn. unique and the second might be.
>
> Sent from my iPhone
>
> > On Jan 19, 2015, at 6:12 AM, Spiros Ioannou <sivann(at)inaccess(dot)com> wrote:
> >
> > Hello group,
> > we have a timeseries table, and we tried to partition it by month (with pg_partman). It seems the query planner always reads all tables regardless of WHERE, except when WHERE is equality.
> >
> > the parent table:
> >
> > ifms_db=# \dS measurement_events
> > Table "public.measurement_events"
> > Column | Type | Modifiers
> > -----------------------+--------------------------+-----------
> > measurement_source_id | uuid | not null
> > measurement_time | timestamp with time zone | not null
> > event_reception_time | timestamp with time zone | not null
> > measurement_value | character varying(200) | not null
> > quality | character varying(500) | not null
> > Indexes:
> > "measurement_events_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
> > Triggers:
> > measurement_events_part_trig BEFORE INSERT ON measurement_events FOR EACH ROW EXECUTE PROCEDURE measurement_events_part_trig_func()
> > Number of child tables: 25 (Use \d+ to list them.)
> >
> >
> > One of the children tables (2014_3)
> >
> > ifms_db=# \dS measurement_events_p2014_03
> > Table "public.measurement_events_p2014_03"
> > Column | Type | Modifiers
> > -----------------------+--------------------------+-----------
> > measurement_source_id | uuid | not null
> > measurement_time | timestamp with time zone | not null
> > event_reception_time | timestamp with time zone | not null
> > measurement_value | character varying(200) | not null
> > quality | character varying(500) | not null
> > Indexes:
> > "measurement_events_p2014_03_pkey" PRIMARY KEY, btree (measurement_source_id, measurement_time)
> > Check constraints:
> > "measurement_events_p2014_03_partition_check" CHECK (measurement_time >= '2014-03-01 00:00:00+02'::timestamp with time zone AND measurement_time < '2014-04-01 00:00:00+03'::timestamp with time zone)
> > Inherits: measurement_events
> >
> >
> >
> > The query:
> > # explain analyze select * from measurement_events where measurement_source_id='df86917e-8df0-11e1-8f8f-525400e76ceb' AND measurement_time >= DATE '2015-01-01' limit 1;
> >
> > ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Limit (cost=0.00..4.12 rows=1 width=87) (actual time=0.377..0.377 rows=1 loops=1)
> > -> Append (cost=0.00..2696.08 rows=655 width=87) (actual time=0.376..0.376 rows=1 loops=1)
> > -> Seq Scan on measurement_events (cost=0.00..0.00 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Filter: ((measurement_time >= '2015-01-01'::date) AND (measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid))
> > -> Index Scan using measurement_events_p2014_01_pkey on measurement_events_p2014_01 (cost=0.14..8.16 rows=1 width=966) (actual time=0.005..0.005 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_02_pkey on measurement_events_p2014_02 (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_03_pkey on measurement_events_p2014_03 (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_04_pkey on measurement_events_p2014_04 (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_05_pkey on measurement_events_p2014_05 (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_06_pkey on measurement_events_p2014_06 (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_07_pkey on measurement_events_p2014_07 (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_08_pkey on measurement_events_p2014_08 (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_09_pkey on measurement_events_p2014_09 (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_10_pkey on measurement_events_p2014_10 (cost=0.14..8.16 rows=1 width=966) (actual time=0.001..0.001 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_11_pkey on measurement_events_p2014_11 (cost=0.14..8.16 rows=1 width=966) (actual time=0.002..0.002 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2014_12_pkey on measurement_events_p2014_12 (cost=0.28..8.04 rows=1 width=51) (actual time=0.009..0.009 rows=0 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Bitmap Heap Scan on measurement_events_p2015_01 (cost=31.02..2500.30 rows=630 width=54) (actual time=0.345..0.345 rows=1 loops=1)
> > Recheck Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Bitmap Index Scan on measurement_events_p2015_01_pkey (cost=0.00..30.87 rows=630 width=0) (actual time=0.269..0.269 rows=718 loops=1)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2015_02_pkey on measurement_events_p2015_02 (cost=0.14..8.16 rows=1 width=966) (never executed)
> > Index Cond: ((measurement_source_id = 'df86917e-8df0-11e1-8f8f-525400e76ceb'::uuid) AND (measurement_time >= '2015-01-01'::date))
> > -> Index Scan using measurement_events_p2015_03_pkey on measurement_events_p2015_03 (cost=0.14..8.16 rows=1 width=966) (never executed)
> > .....
> >
> > More results:
> >
> > This query:
> > ifms_db=# explain analyze select * from measurement_events where measurement_source_id='bd77387a-fdb4-4531-9bb7-7ef67a8f647d' AND measurement_time = DATE '2015-01-14 15:30:01+02' limit 1;
> >
> > searches in all tables:
> >
> > This query (no date casting):
> > ifms_db=# explain analyze select * from measurement_events where measurement_source_id='bd77387a-fdb4-4531-9bb7-7ef67a8f647d' AND measurement_time = '2015-01-14 15:30:01+02' limit 1;
> >
> > searches only 1 table,
> >
> > and this query (>, no casting):
> > ifms_db=# explain analyze select * from measurement_events where measurement_source_id='bd77387a-fdb4-4531-9bb7-7ef67a8f647d' AND measurement_time > '2015-01-14 15:30:01+02' limit 1;
> >
> > searches first the correct table, then all the others.
> >
> > any ideas?

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tim Uckun 2015-01-20 02:40:00 Re: Getting truncated queries from pg_stat_statements
Previous Message Melvin Davidson 2015-01-20 02:12:49 Re: Getting truncated queries from pg_stat_statements