Re: Sudden insert performance degradation

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Henrique Montenegro <typoon(at)gmail(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Sudden insert performance degradation
Date: 2020-07-14 00:05:07
Message-ID: CAMkU=1zZvVd5UuDYOou-mSTXbWunRne6N5ewRj2y3Re__L3+1Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro <typoon(at)gmail(dot)com>
wrote:

insert into users_no_dups (
> created_ts,
> user_id,
> name,
> url
> ) (
> select
> created_ts,
> user_id,
> name,
> url
> from
> users
> ) on conflict do nothing
>

Once the size of the only index exceeds shared_buffers by a bit (the amount
of "a bit" depends on your RAM, kernel version, settings
for dirty_background_ratio, dirty_expire_centisecs, and probably other
things, and is not easy to predict) the performance falls off a cliff when
inserting values in a random order. Every insert dirties a random index
leaf page, which quickly gets evicted from shared_buffers to make room for
other random leaf pages to be read in, and then turns into flush calls when
the kernel freaks out about the amount and age of dirty pages held in
memory.

What happens if you add an "ORDER BY user_id" to your above select?

> shared_buffers = 8GB
> RAM: 256GB
>

Or, crank up shared_buffers by a lot. Like, beyond the size of the growing
index, or up to 240GB if the index ever becomes larger than that. And make
the time between checkpoints longer. If the dirty buffers are retained in
shared_buffers longer, chances of them getting dirtied repeatedly
between writes is much higher than if you just toss them to the kernel and
hope for the best.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Henrique Montenegro 2020-07-14 01:02:14 Re: Sudden insert performance degradation
Previous Message Henrique Montenegro 2020-07-13 17:02:09 Re: Sudden insert performance degradation