From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alan Jackson <ajax(at)tvsquared(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: inconsistent results querying table partitioned by date |
Date: | 2019-05-09 23:22:39 |
Message-ID: | 24525.1557444159@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Alan Jackson <ajax(at)tvsquared(dot)com> writes:
> Im having a problem with querying a table partitioned by date.
> Depending on the sequence of operations on a date parameter used in a where clause, I either get the expected results, or no results.
Yeah, this is pretty clearly broken. It looks to me like the partition
pruning code is making insupportable assumptions about a comparison to
a stable expression. Using your example table:
regression=# explain select * from dataid where id=1 and datadatetime < localtimestamp;
QUERY PLAN
----------------------------------------------------------------------------------
Bitmap Heap Scan on dataid_default (cost=4.19..11.31 rows=3 width=12)
Recheck Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
-> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.19 rows=3 width=0)
Index Cond: ((id = 1) AND (datadatetime < LOCALTIMESTAMP))
(4 rows)
It should absolutely not have pruned away the dataid_201902 partition,
but it did. It's okay with an immutable expression:
regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Append (cost=4.18..22.63 rows=6 width=12)
-> Bitmap Heap Scan on dataid_201902 (cost=4.18..11.30 rows=3 width=12)
Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.18 rows=3 width=0)
Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
-> Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12)
Recheck Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
-> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0)
Index Cond: ((id = 1) AND (datadatetime < '2019-05-09 00:00:00'::timestamp without time zone))
(9 rows)
or a volatile one:
regression=# explain select * from dataid where id=1 and datadatetime < '2019-05-09'::timestamp + random()*'1 day'::interval;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Append (cost=4.23..29.80 rows=6 width=12)
-> Bitmap Heap Scan on dataid_201902 (cost=4.23..14.88 rows=3 width=12)
Recheck Cond: (id = 1)
Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval)))
-> Bitmap Index Scan on dataid_201902_pkey (cost=0.00..4.23 rows=10 width=0)
Index Cond: (id = 1)
-> Bitmap Heap Scan on dataid_default (cost=4.23..14.88 rows=3 width=12)
Recheck Cond: (id = 1)
Filter: (datadatetime < ('2019-05-09 00:00:00'::timestamp without time zone + (random() * '1 day'::interval)))
-> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.23 rows=10 width=0)
Index Cond: (id = 1)
(11 rows)
but somebody's confused about what can be done with stable expressions.
While I'm on about it, this behavior is also insupportable:
regression=# explain select * from dataid where id=1 and datadatetime < '2018-05-09'::timestamptz ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on dataid_default (cost=4.18..11.30 rows=3 width=12)
Recheck Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
-> Bitmap Index Scan on dataid_default_pkey (cost=0.00..4.18 rows=3 width=0)
Index Cond: ((id = 1) AND (datadatetime < '2018-05-09 00:00:00-04'::timestamp with time zone))
(4 rows)
because timestamp-against-timestamptz comparison is inherently only
stable; the pruning code is way exceeding its authority by supposing
that a comparison that holds at plan time will hold at runtime,
even with a constant comparison value.
The reason for the difference in your results is that one expression
is immutable and the other is only stable:
regression=# explain verbose select
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' ) at time zone 'UTC' + '2 days'::interval) as workingdate,
(('2019-02-26T00:00:00'::timestamp::timestamp at time zone 'America/New_York' + '2 days'::interval) at time zone 'UTC') as notworkingdate;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=16)
Output: '2019-02-28 05:00:00'::timestamp without time zone, timezone('UTC'::text, ('2019-02-26 00:00:00-05'::timestamp with time zone + '2 days'::interval))
(2 rows)
the reason being that timestamptz + interval depends on the timezone
setting (for some intervals) but timestamp + interval never does.
Seems to be equally broken in v11 and HEAD. I didn't try v10.
> I hope there is something simple I can change in the partition definitions to work around this.
Until we fix the bug, I think the best you can do is not use stable
expressions in this context.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2019-05-10 01:34:03 | BUG #15797: Wrong Execution Plan |
Previous Message | Tom Lane | 2019-05-09 19:24:20 | Re: BUG #15795: ERROR: could not find pathkey item to sort |