Re: Adding nextval() to a select caused hang/very slow execution

From: Eric Raskin <eraskin(at)paslists(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Adding nextval() to a select caused hang/very slow execution
Date: 2020-11-04 19:12:07
Message-ID: CAF9L-R42+L101Pg63bUFFtXWzRqGcHLkL6DFibfqUprNccTEZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

OK - I see. And to add insult to injury, I tried creating a temporary
table to store the intermediate results. Then I was going to just do an
insert... select... to insert the rows. That would de-couple the
nextval() from the query.

Strangely, the first query I tried it on worked great. But, when I tried
to add a second set of data with a similar query to the same temporary
table, it slowed right down again. And, of course, when I remove the
insert, it's fine.

And, of course, your explanation that inserts will not be parallelized must
be the reason. I will certainly re-vacuum the tables. I wonder why
auto-vacuum didn't collect better stats. vacuum analyze <table> is all I
need, right?

As a last resort, what about a PL/PGSQL procedure loop on the query
result? Since the insert is very few rows relative to the work the select
has to do, I could just turn the insert.. select.. into a for loop. Then
the select could be parallel?

What do you think?

On Wed, Nov 4, 2020 at 2:01 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Eric Raskin <eraskin(at)paslists(dot)com> writes:
> > And, to follow up on your question, the plan shape DOES change when I
> > add/remove the nextval() on a plain explain.
> > Without nextval(): https://explain.depesz.com/s/SCdY
> > With nextval(): https://explain.depesz.com/s/oLPn
>
> Ah, there's your problem, I think: the plan without nextval() is
> parallelized while the plan with nextval() is not, because nextval() is
> marked as parallel-unsafe. It's not immediately clear why that would
> result in more than about a 4X speed difference, given that the parallel
> plan is using 4 workers. But some of the rowcount estimates seem fairly
> far off, so I'm betting that the planner is just accidentally lighting on
> a decent plan when it's using parallelism while making some poor choices
> when it isn't.
>
> The reason for the original form of your problem is likely that we don't
> use parallelism at all in non-SELECT queries, so you ended up with a bad
> plan even though the nextval() was hidden in a trigger.
>
> What you need to do is get the rowcount estimates nearer to reality
> --- those places where you've got estimated rowcount 1 while reality
> is tens or hundreds of thousands of rows are just disasters waiting
> to bite. I suspect most of the problem is join conditions like
>
> Join Filter: (CASE WHEN (c.rtype = ANY ('{0,1,7,9}'::bpchar[])) THEN
> c.rtype ELSE x.rtype END = '2'::bpchar)
>
> The planner just isn't going to have any credible idea how selective
> that is. I wonder to what extent you could fix this by storing
> generated columns that represent the derived conditions you want to
> filter on.
>
> regards, tom lane
>

--

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Eric H. Raskin
914-765-0500 x120 or *315-338-4461
(direct)*

Professional Advertising Systems Inc.
fax: 914-765-0500 or *315-338-4461 (direct)*

3 Morgan Drive #310
eraskin(at)paslists(dot)com

Mt Kisco, NY 10549
http://www.paslists.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2020-11-04 19:23:28 Re: Adding nextval() to a select caused hang/very slow execution
Previous Message Tom Lane 2020-11-04 19:01:51 Re: Adding nextval() to a select caused hang/very slow execution