From: | Thomas Mayer <thomas(dot)mayer(at)student(dot)kit(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | pgsql-performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: window function induces full table scan |
Date: | 2014-01-03 00:13:00 |
Message-ID: | 52C6008C.7090201@student.kit.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Just to track it down: The limitation can also be reproduced without
using views. Using views is just a use case where the suggested
optimization is actually needed.
Plus, when I remove the condition "WHERE datepos = 1", the same
behaviour still occurs. Here, I wanted to see if postgresql is
preferring the condition "WHERE datepos = 1" (datepos is the result of
the window function) over the condition "user_id = 43" for optimization.
But this is not the case.
-- workaround example: "WHERE user_id = 43" condition in subselect
SELECT user_id, latitude, longitude
FROM (
SELECT
user_id,
latitude,
longitude,
rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC)
AS datepos
FROM checkin_node
WHERE user_id = 43
) AS tmp_last_position
WHERE datepos = 1; -- takes 2 ms
-- track it down: reproduce limitation without a view:
SELECT user_id, latitude, longitude
FROM (
SELECT
user_id,
latitude,
longitude,
rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC)
AS datepos
FROM checkin_node
) AS tmp_last_position
WHERE datepos = 1
AND user_id = 43; -- takes 6621 ms
-- without datepos condition
SELECT user_id, latitude, longitude
FROM (
SELECT
user_id,
latitude,
longitude,
rank() OVER (PARTITION BY user_id ORDER BY date DESC, id DESC)
AS datepos
FROM checkin_node
) AS tmp_last_position
WHERE user_id = 43; -- takes 6574 ms
Best regards,
Thomas
Am 03.01.2014 00:12, schrieb Thomas Mayer:
>
> Am 02.01.2014 23:43, schrieb Tom Lane:
>> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>>> On Thu, Jan 2, 2014 at 1:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> It's possible that in the specific case you exhibit here, pushing down
>>>> the clause wouldn't result in changes in the window function's output for
>>>> the selected rows, but the optimizer doesn't have enough knowledge about
>>>> window functions to determine that.
>>
>>> A restriction in the WHERE clause which corresponds to the PARTITION BY
>>> should be pushable, no? I think it doesn't need to understand the internal
>>> semantics of the window function itself, just of the PARTITION BY, which
>>> should be doable, at least in principle.
>>
>> If the restriction clause must give the same answer for any two rows of
>> the same partition, then yeah, we could in principle push it down without
>> knowing anything about the specific window function. It'd be a less than
>> trivial test to make, I think. In any case, it's not a "bug" that the
>> optimizer doesn't do this currently.
>
> I agree, this is not a "bug" in v9.3.2 in terms of correctness.
>
> But it's a limitation, because the query plan is by far not optimal. You
> may consider this report as a feature request then.
>
> The optimization I suggested is normally performed, when no window
> function occurs in the statement.
>
> It seems like the optimizer is already capable of doing a check if the
> WHERE can be done first.
>
> However, this check seems to be done too conservative: I guess, the
> check is ignoring the PARTITION-BY-sets of attributes completely.
>
>>
>> regards, tom lane
>> .
>>
>
> Best regards
> Thomas
>
>
--
======================================
Thomas Mayer
Durlacher Allee 61
D-76131 Karlsruhe
Telefon: +49-721-2081661
Fax: +49-721-72380001
Mobil: +49-174-2152332
E-Mail: thomas(dot)mayer(at)student(dot)kit(dot)edu
=======================================
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Mayer | 2014-01-03 04:37:31 | Re: window function induces full table scan |
Previous Message | Tom Lane | 2014-01-02 23:55:55 | Re: window function induces full table scan |