Re: shared_buffers 8GB maximum

From: Vitaliy Garnashevich <vgarnashevich(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: shared_buffers 8GB maximum
Date: 2018-02-19 12:37:43
Message-ID: 98d5c829-7666-e8f7-950e-59ca6ed6a527@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> Yes.  I don't know the exact reason, but reading a buffer from OS
> cache is quite a bit more expensive than just pinning a buffer already
> in the buffer_pool, about 5 times more expensive the last time I
> tested it, which was before Meltdown.  (And just pinning a buffer
> which is already in the cache is already pretty expensive--about 15
> times as expensive as reading the next tuple from an already-pinned
> buffer).

Thanks for the numbers. Just out of curiosity, do you happen to know how
much more expensive compared to that a read from disk is? And also, how
much the pinning can be slowed down, when having to iterate using the
clock-sweep method over large shared_buffers?

> I don't think that there is any reason to think that buffers_clean >
> buffers_checkpoint is a problem.  In fact, you could argue that it is
> the way it was designed to work.  Although the background writer does
> need to tell the checkpointer about every file it dirties, so it can
> be fsynced at the end of the checkpoint.  The overhead of this was
> minimal in my testing.
>

The reason why I mentioned buffers_clean is because I was assuming that
under "healthy" conditions, most writes should be done by checkpointer,
because, as it was already mentioned, that's the most efficient way of
writing (no duplicate writes of the same buffer, write optimizations
etc.). I was thinking about bgwriter as a way of reducing latency by
avoiding the case when a backend has to write buffers by itself. So that
would mean that big numbers in buffers_clean and buffers_backend
compared to buffers_checkpoint, would mean that a lot of writes are done
not by checkpointer, and thus probably less efficiently than they could
be. That might have resulted in IO writes being more random, and more IO
writes done in general, because same buffer can be written multiple
times between checkpoints.

> But buffers_backend > buffers_checkpoint could be a problem,
> especially if they are also much larger than buffers_clean.  But the
> wrinkle here is that if you do bulk inserts or bulk updates (what
> about vacuums?), the backends by design write their own dirty
> buffers.  So if you do those kinds of things, buffers_backend being
> large doesn't indicate much.  There was a patch someplace a while ago
> to separate the counters of backend-intentional writes from
> backend-no-choice writes, but it never went anywhere.

We do daily manual vacuuming. Knowing what part of total writes is
accounted for them indeed would be nice.

When looking at buffers_checkpoint/buffers_clean/buffers_backend, I was
saving the numbers with several hours interval, knowing that there are
no vacuums running at that time, and calculated the difference.

> It is not clear to me that this is the best way to measure health. 
> Did your response time go down?  Did your throughput go up?

We have mixed type of DB usage. There is OLTP-like part with many small
read/write transactions. Predictable latency does not matter in that
case, but throughput does, because that is basically a background data
loading job. Then there is an OLAP-like part when heavier report queries
are being run. Then there are more background jobs which are a
combination of both, which at first run long queries and then do lots of
small inserts, thus pre-calculating some data for bigger reports.

After increasing shared_buffers 8GB -> 64GB, there was 7% improvement in
run time of the background pre-calculating job (measured by running
several times in a row, and caches are hot).

When we configured hugepages for the bigger shared_buffers, the
additional improvement was around 3%.

Regards,
Vitaliy

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vitaliy Garnashevich 2018-02-19 12:46:36 Re: shared_buffers 8GB maximum
Previous Message Durumdara 2018-02-19 12:02:30 Re: Connection loosing at some places - caused by firewall