Re: 600 million rows of data. Bad hardware or need partitioning?

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Arya F <arya6000(at)gmail(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: 600 million rows of data. Bad hardware or need partitioning?
Date: 2020-05-04 04:44:03
Message-ID: CAApHDvqt53Q9NSzffOGZGx=6qzutmDS1ssYsUDuZiGQskaEEMw@mail.gmail.com
Lists: pgsql-performance

On Mon, 4 May 2020 at 15:52, Arya F <arya6000(at)gmail(dot)com> wrote:
>
> On Sun, May 3, 2020 at 11:46 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:
> >
> > What kinds of storage (ssd or old 5400 rpm)? What else is this machine running?
>
> Not an SSD, but an old 1TB 7200 RPM HDD
>
> > What configs have been customized such as work_mem or random_page_cost?
>
> work_mem = 2403kB
> random_page_cost = 1.1

How long does it take if you first do:

SET enable_nestloop TO off;

If you find it's faster, then you most likely have random_page_cost set
unrealistically low. In fact, I'd say it's very unlikely that a nested
loop join will be a win in this case when random pages must be read
from a mechanical disk, but by all means, try disabling it with the
above command and see for yourself.
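
For example, something along these lines (only a sketch; substitute your
actual slow statement for the placeholder query, and expect the plan and
timings to depend entirely on your data):

SET enable_nestloop TO off;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- your slow query here
RESET enable_nestloop;

Comparing the EXPLAIN (ANALYZE, BUFFERS) output with and without
enable_nestloop should make it clear whether the nested loop is what's
costing you.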

If you set random_page_cost so low to solve some other performance
problem, then you may wish to look at the effective_cache_size
setting instead. Having that set to something realistic should allow
indexes to be used more in situations where they're not likely to
require as much random I/O from the disk.
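
If that's the situation, something like the following may be worth
experimenting with in a session (the values are illustrative assumptions,
not recommendations; size effective_cache_size to roughly the memory the
OS has available for caching):

SET random_page_cost = 4.0;        -- the default; closer to reality for a 7200 RPM HDD
SET effective_cache_size = '4GB';  -- adjust to your machine's RAM
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;  -- your slow query here

Once you've found settings that produce sensible plans, they belong in
postgresql.conf rather than per-session SETs.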

David
