Re: Why insertion throughput can be reduced with an increase of batch size?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Павел Филонов <filonovpv(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why insertion throughput can be reduced with an increase of batch size?
Date: 2016-08-22 20:13:04
Message-ID: CAMkU=1y+tHcimTqpuHj3Q4ZTyFZRZhEjMthpmbhomGBYUG=bCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Aug 21, 2016 at 11:53 PM, Павел Филонов <filonovpv(at)gmail(dot)com> wrote:

> My greetings to everybody!
>
> I recently faced with the observation which I can not explain. Why
> insertion throughput can be reduced with an increase of batch size?
>
> Brief description of the experiment.
>
> - PostgreSQL 9.5.4 as server
> - https://github.com/sfackler/rust-postgres library as client driver
> - one relation with two indices (scheme in attach)
>
> Experiment steps:
>
> - populate DB with 259200000 random records
>
> How is populating the database you do in this step different from the
insertions you do in the next step? Is it just that the goal is to measure
insertions into an already-very-large table?

>
> - start insertion for 60 seconds with one client thread and batch size
> = m
> - record insertions per second (ips) in clients code
>
> Plot median ips from m for m in [2^0, 2^1, ..., 2^15] (in attachment).
>

The median of how many points? Please plot all points, as well as the
median. In what order did you cycle through the list of m? Are you
logging checkpoints? how many checkpoints occur during the run for each
batch size? Have you tuned your database to be targeted at mass insertions?
e.g.. what are max_wal_size, archive_mode, wal_level, wal_buffers,
shared_buffers, and checkpoint_completion_target? Are you issuing manual
checkpoints between runs?

60 seconds is usually not nearly enough time to benchmark a write-heavy
workload. The chances are pretty good that what you are seeing is nothing
but statistical artefacts, caused by checkpoints happening to line up with
certain values of batch size.

Cheers,

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruno Wolff III 2016-08-22 20:44:54 Re: Unique constraint on field inside composite type.
Previous Message David G. Johnston 2016-08-22 18:40:41 Re: Unique constraint on field inside composite type.