From: | Eric Raskin <eraskin(at)paslists(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Adding nextval() to a select caused hang/very slow execution |
Date: | 2020-11-04 19:12:07 |
Message-ID: | CAF9L-R42+L101Pg63bUFFtXWzRqGcHLkL6DFibfqUprNccTEZw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
OK - I see. And to add insult to injury, I tried creating a temporary
table to store the intermediate results. Then I was going to just do an
insert... select... to insert the rows. That would de-couple the
nextval() from the query.
Strangely, the first query I tried it on worked great. But, when I tried
to add a second set of data with a similar query to the same temporary
table, it slowed right down again. And, of course, when I remove the
insert, it's fine.
And, of course, your explanation that inserts will not be parallelized must
be the reason. I will certainly re-vacuum the tables. I wonder why
auto-vacuum didn't collect better stats. vacuum analyze <table> is all I
need, right?
As a last resort, what about a PL/PGSQL procedure loop on the query
result? Since the insert is very few rows relative to the work the select
has to do, I could just turn the insert.. select.. into a for loop. Then
the select could be parallel?
What do you think?
On Wed, Nov 4, 2020 at 2:01 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Eric Raskin <eraskin(at)paslists(dot)com> writes:
> > And, to follow up on your question, the plan shape DOES change when I
> > add/remove the nextval() on a plain explain.
> > Without nextval(): https://explain.depesz.com/s/SCdY
> > With nextval(): https://explain.depesz.com/s/oLPn
>
> Ah, there's your problem, I think: the plan without nextval() is
> parallelized while the plan with nextval() is not, because nextval() is
> marked as parallel-unsafe. It's not immediately clear why that would
> result in more than about a 4X speed difference, given that the parallel
> plan is using 4 workers. But some of the rowcount estimates seem fairly
> far off, so I'm betting that the planner is just accidentally lighting on
> a decent plan when it's using parallelism while making some poor choices
> when it isn't.
>
> The reason for the original form of your problem is likely that we don't
> use parallelism at all in non-SELECT queries, so you ended up with a bad
> plan even though the nextval() was hidden in a trigger.
>
> What you need to do is get the rowcount estimates nearer to reality
> --- those places where you've got estimated rowcount 1 while reality
> is tens or hundreds of thousands of rows are just disasters waiting
> to bite. I suspect most of the problem is join conditions like
>
> Join Filter: (CASE WHEN (c.rtype = ANY ('{0,1,7,9}'::bpchar[])) THEN
> c.rtype ELSE x.rtype END = '2'::bpchar)
>
> The planner just isn't going to have any credible idea how selective
> that is. I wonder to what extent you could fix this by storing
> generated columns that represent the derived conditions you want to
> filter on.
>
> regards, tom lane
>
--
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Eric H. Raskin
914-765-0500 x120 or *315-338-4461
(direct)*
Professional Advertising Systems Inc.
fax: 914-765-0500 or *315-338-4461 (direct)*
3 Morgan Drive #310
eraskin(at)paslists(dot)com
Mt Kisco, NY 10549
http://www.paslists.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-11-04 19:23:28 | Re: Adding nextval() to a select caused hang/very slow execution |
Previous Message | Tom Lane | 2020-11-04 19:01:51 | Re: Adding nextval() to a select caused hang/very slow execution |