Re: Simple DELETE on modest-size table runs 100% CPU forever

From: Craig James <cjames(at)emolecules(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Simple DELETE on modest-size table runs 100% CPU forever
Date: 2019-11-16 17:17:47
Message-ID: CAFwQ8rdk4qXCKUnUzUOQSExhjEBo-=S6C22n95DoTeJnKsbdMA@mail.gmail.com
Lists: pgsql-performance

Problem solved ... see below. Thanks everyone for your suggestions and
insights!

On Sat, Nov 16, 2019 at 7:16 AM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Fri, Nov 15, 2019 at 7:27 PM Craig James <cjames(at)emolecules(dot)com> wrote:
>
>> On Fri, Nov 15, 2019 at 2:45 PM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> BTW, I'll note at this point that "analyze category_staging_8" prior to
>> this query made no difference.
>>
>
> Isn't that the wrong table to have analyzed? The offender here is
> "categories", not "category_staging_8". Is this some sort of inheritance
> situation?
>
>
>>
>>> What do you see in `select * from pg_stats where tablename='categories'
>>> and attname='category_id' \x\g\x`?
>>>
>>
>> db=> select * from pg_stats where tablename='categories' and
>> attname='category_id' \x\g\x;
>> Expanded display is on.
>>
>
>
>> ...
>>
>> n_distinct | 21
>> most_common_vals |
>> {4,3,2,10,11,13,12,16,9,6,7,5,15,23,14,25,24,1,26,28,27}
>> most_common_freqs |
>> {0.2397,0.159933,0.0926667,0.0556,0.0555667,0.0546333,0.0525333,0.0439,0.0426667,0.0346333,0.0331,0.0302333,0.0288333,0.0240667,0.0224,0.0122333,0.011,0.0035,0.00233333,0.000366667,0.0001}
>>
>
> There is a path in the analyze code where if the least-seen value in the
> sample was seen more than once (i.e. no value was seen exactly once) then
> it assumes that the seen values are all the values that exist. I think the
> logic behind that is dubious. I think it is pretty clear that that is
> kicking in here. But why? I think the simple answer is that you analyzed
> the wrong table, and the statistics shown here might be accurate for some
> time in the past but are no longer accurate. It is hard to see how a value
> present 5000 times in a table of 274602 rows could have evaded sampling if
> they were present at the time the sample was done.
>

As I mentioned in a reply to Andreas, I also added an "analyze ..." to the
other two tables as an experiment. It made no difference. However ...

Your comment about the missing 5000 values solved the problem: those values
had just been inserted by the previous SQL statement, inside the same
transaction. The code reconciles two collections across two different
servers: first it inserts all new values, then it deletes obsolete values.
So the "select ..." in question was matching against the very 5000 rows
that had just been inserted, using statistics gathered before they existed.

I added an "analyze" between the insert and the delete. Instant fix.
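For anyone who hits the same thing, the pattern looks roughly like this. It's a sketch, not our actual code: the table names (categories, category_staging_8) come from this thread, but the column list and matching logic are simplified placeholders. Note that ANALYZE, unlike VACUUM, is allowed inside a transaction block:

```sql
BEGIN;

-- Step 1: insert rows present in the staging set but missing locally.
INSERT INTO categories (category_id, name)
SELECT s.category_id, s.name
FROM category_staging_8 s
WHERE NOT EXISTS (
    SELECT 1 FROM categories c WHERE c.category_id = s.category_id
);

-- Step 2: refresh statistics so the planner sees the rows just inserted.
-- Without this, the DELETE below is planned against stale pg_stats and
-- can pick a catastrophically bad plan for values it believes don't exist.
ANALYZE categories;

-- Step 3: delete rows that are no longer in the staging set.
DELETE FROM categories c
WHERE NOT EXISTS (
    SELECT 1 FROM category_staging_8 s WHERE s.category_id = c.category_id
);

COMMIT;
```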

It also explains one other mystery: this query only caused problems on the
small test system, and has been working well on a production database with
about 100x more data. In production, each "category" is already populated
with a significant amount of data, so the statistics are already good and a
single insert/delete doesn't change them meaningfully.

> Cheers,
>
> Jeff
>

Thanks!
Craig
