From: | Nicolas Seinlet <nicolas(at)seinlet(dot)com> |
---|---|
To: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: second CTE kills perf |
Date: | 2021-06-24 12:53:26 |
Message-ID: | ycEXwMIlgtDQCHCwkq4Lkc7YkFeqeK6dbYVIJThISyJd4nTDDf1mruRnGlh30e6kMnr-mXATyevPfShR_a5ldJtrFikBCa_Eic0KDyqvKeU=@seinlet.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi, thanks for all. I replaced row_number() with some computed int which speeds up a lot the query.
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, June 22nd, 2021 at 15:53, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Nicolas Seinlet nicolas(at)seinlet(dot)com writes:
>
> > I'm trying to understand this behaviour and the limits of CTE, when they reach the limits, when they cannot receive parameters from a caller, ... I'm running a query on PostgreSQL 10 with a cte. the query runs in ~ 10 seconds. If I add a second CTE with the same query as the previous one and select * from second_cte as query, it now runs in ~ 10 minutes.
>
> > oversimplified example:
> >
> > 10 seconds version:
> >
> > | WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z FROM cte1 WHERE x=32;
>
> > 10 minutes version:
> >
> > | WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number() over(),x,y,z FROM cte1) SELECT * FROM cte2 WHERE x=32;
>
> [ shrug... ] You are asking for two different computations, and the
>
> second one is far more expensive.
>
> In the first case, the WHERE x=32 clause is applied before the window
>
> function, so we can (indeed must) filter out all rows not having x=32
>
> before doing the window function.
>
> In the second case, WHERE x=32 is applied above/after the window
>
> function. We cannot push down the WHERE to before the window function.
>
> (In this case, filtering beforehand would obviously change the results
>
> of row_number, but in general we don't know enough about window function
>
> behavior to risk such changes.) So row_number has to be computed over
>
> the entire contents of the "table", and that's not cheap.
>
> It does surprise me a bit that row_number is quite that expensive,
>
> but if you are expecting equivalent results from these two queries,
>
> you're simply wrong.
>
> regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
publickey - nicolas@seinlet.com - 0xCAEB7FAF.asc | application/pgp-keys | 729 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Laurenz Albe | 2021-06-24 13:14:47 | Re: insert ..... returning <column> problem |
Previous Message | Joe Conway | 2021-06-24 12:47:14 | Re: removing "serial" from table definitions. |