Re: 600 million rows of data. Bad hardware or need partitioning?

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Arya F <arya6000(at)gmail(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: 600 million rows of data. Bad hardware or need partitioning?
Date: 2020-05-04 09:21:30
Message-ID: 20200504092130.GN28974@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, May 03, 2020 at 11:51:44PM -0400, Arya F wrote:
> On Sun, May 3, 2020 at 11:46 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:
> > What kinds of storage (ssd or old 5400 rpm)? What else is this machine running?
>
> Not an SSD, but an old 1TB 7200 RPM HDD
>
> > What configs have been customized such as work_mem or random_page_cost?
>
> work_mem = 2403kB
> random_page_cost = 1.1

I mentioned in February and March that you should plan to set shared_buffers
to fit the indexes currently being updated.

Partitioning can help with that *if* the writes mostly affect 1-2 partitions at
a time (otherwise not).

On Wed, Feb 05, 2020 at 11:15:48AM -0600, Justin Pryzby wrote:
> > Would that work? Or any recommendations how I can achieve good performance
> > for a lot of writes?
>
> Can you use partitioning so the updates are mostly affecting only one table at
> once, and its indices are of reasonable size, such that they can fit easily in
> shared_buffers.

On Sun, Mar 22, 2020 at 08:29:04PM -0500, Justin Pryzby wrote:
> On Sun, Mar 22, 2020 at 09:22:50PM -0400, Arya F wrote:
> > I have noticed that my write/update performance starts to dramatically
> > reduce after about 10 million rows on my hardware. The reason for the
> > slowdown is the index updates on every write/update.
>
> It's commonly true that the indexes need to fit entirely in shared_buffers for
> good write performance. I gave some suggestions here:
> https://www.postgresql.org/message-id/20200223101209.GU31889%40telsasoft.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jean-Christophe Boggio 2020-05-04 16:12:34 Re: Recursive query slow on strange conditions
Previous Message David Rowley 2020-05-04 04:44:03 Re: 600 million rows of data. Bad hardware or need partitioning?