Re: Understanding query planner cpu usage

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Lucas Fairchild-Madar <lucas(dot)madar(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Understanding query planner cpu usage
Date: 2018-02-22 03:28:05
Message-ID: 15586.1519270085@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Lucas Fairchild-Madar <lucas(dot)madar(at)gmail(dot)com> writes:
> On Wed, Feb 21, 2018 at 4:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If so, this might be some manifestation of a problem we've seen before:
>> the planner tries to find out the current live max value of the column
>> by scanning the index, and that's really slow if there are a lot of
>> recently-dead entries at the index end, because each of them has to be
>> inspected and then hinted dead.

> I took a look further in this direction and found that our cleanup process
> and this analytical rollup both happened to run very close to each other
> twice a day. I moved the cleanup process to run at a time when it should
> never collide with analytical rollups and we'll see if the load spike
> happens again.

I'm thinking that will fix it for you.

> Perhaps this is a stupid question, but would creating the index in
> descending order solve this problem (as a unique index, not a primary key)?

I think most likely it'd just move the issue to the other end of the
index, though you could try. See below.

> What is the planner doing when trying to find the current live max value of
> the column?

It's trying to estimate whether a mergejoin will be able to stop short of
reading all the tuples from the other side of the join. (For instance,
if you've got 1,3,5 on one side, and 1,4,5,7,8,9,19 on the other, the
second input doesn't have to be read past "7" because once we run off the
end of the first input, we know we couldn't see any matches later on the
second input. So the planner wants to compare the ending key value on
each side to the key distribution on the other side, to see what this might
save.) Now, that's a unidirectional question for any particular mergejoin
plan, so that for any one cost estimate it's only going to need to look at
one end of the key range. But I think it will consider merge joins with
both sort directions, so that both ends of the key range will get
investigated in this way. I might be wrong though; it's late and I've
not looked at that code in awhile ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Marcin Giedz 2018-02-22 04:59:10 oracle to postgresql conversion tool
Previous Message Melvin Davidson 2018-02-22 03:22:43 Re: PostgreSQL backup stategies