Pointers towards identifying bulk import bottleneck (walwriter tuning?)

From: "Holtgrewe, Manuel" <manuel(dot)holtgrewe(at)bihealth(dot)de>
To: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Pointers towards identifying bulk import bottleneck (walwriter tuning?)
Date: 2019-08-27 10:06:31
Message-ID: CC941ED861F3AA469B6D50EE76C57483015EABBC14@s-mx14-bih01.charite.de
Lists: pgsql-general

Dear all,

I hope that this is the right place to ask.

I have trouble identifying the bottleneck in a bulk import; I suspect it is the walwriter. I'm using PostgreSQL 11 and my configuration is at [5]. I'm running on a ZFS raidz3 pool that can write at 300 MB/s+, yet I'm only seeing ~35 MB/s from the walwriter.

What would be a good strategy to improve bulk insertion performance, or to find out more about potential ways forward and understand this better?

Thanks,
Manuel

The full description is as follows:

I have the following use case. My table is relatively complex (see [2] for the definition; [1] contains all scripts and commands, [3] has the data I'm using for testing, and the original data is from [4]).

To summarise:

- My table has 47 fields; most are integers, but there are also two VARCHAR(64) array fields and one JSONB field.
- The data can nicely be partitioned by "case_id", as I will only query one case at a time (see the sketch after this list).
- I want to import TSV files into the table; the example data has 370k lines, which is realistic.
- Ideally I would be able to import multiple TSV files in parallel to saturate the I/O of the system.
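
For orientation, a minimal sketch of the shape I mean, assuming list partitioning on "case_id" (table and column names here are placeholders; the real 47-column definition is in [2]):

    -- Hypothetical shape only; the real table (47 columns, two VARCHAR(64)[]
    -- columns, one JSONB column) is defined in [2].
    CREATE TABLE variant (
        case_id  INTEGER NOT NULL,
        position INTEGER,
        info     JSONB
        -- ... remaining columns ...
    ) PARTITION BY LIST (case_id);

    -- one partition per case; each TSV import only touches one of these
    CREATE TABLE variant_case_1 PARTITION OF variant FOR VALUES IN (1);
    CREATE TABLE variant_case_2 PARTITION OF variant FOR VALUES IN (2);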

The import strategy is to first load each TSV file into a staging table that has the same number of columns as the actual table but is text-only, and then insert from that staging table into the actual table (I've cut the SQL out of my software, which uses a library that takes this approach for bulk import). I also tried to \copy directly into the original table, but that did not improve performance much.
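
In rough outline, the two steps look like this (a sketch only; "staging_text", "variant", the file name, and the column names are placeholders for my actual schema):

    -- Step 1: load the raw TSV into an all-text staging table.
    \copy staging_text FROM 'case1.tsv'

    -- Step 2: convert and insert into the real, typed, partitioned table.
    INSERT INTO variant
    SELECT case_id::integer,
           position::integer,
           info::jsonb
           -- ... casts for the remaining columns ...
    FROM staging_text;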

For benchmarking, I have created four separate copies of these "text tables" and pre-filled them (fill[1-4].sql). I then fill my main table from them (query[1-4].sql), with 1, 2, and 4 queryX.sql executions in parallel. Wall-clock running times are about 14 s for one import process, 21 s for two, and 44 s for four.
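
The parallel runs are simply N concurrent psql sessions, roughly like this (shell sketch; the database name is a placeholder):

    # run N import processes concurrently (here N=2)
    psql -d benchdb -f query1.sql &
    psql -d benchdb -f query2.sql &
    wait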

As the main table is partitioned and each TSV import process inserts into only one partition, I was expecting the speedup to be almost linear. I also tried inserting into the partitions directly from TSV in parallel, but that did not help either.

My Postgres configuration is at [5]. I'm running PostgreSQL 11.5 on a recent CentOS 7 (inside an LXC container). The data is stored on a ZFS raidz3 pool with 16 spinning disks, plus one SSD each for the ZFS log and cache. The file system is capable of writing 300 MB/s+, but I'm seeing 80 MB/s or less from Postgres. With one import process the CPU usage is at 80%, dropping to ~60% for two import processes and to about 30% for four.

iotop tells me that the walwriter does not go beyond ~35 MB/s, so maybe this is the culprit? Is there a way to tune walwriter I/O performance?
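
To make the question concrete, these are the kinds of WAL-related settings I am wondering about (values are purely illustrative, not what is in my actual configuration [5]):

    # postgresql.conf -- illustrative values only
    wal_buffers = 16MB                   # WAL kept in shared memory before flush
    max_wal_size = 4GB                   # allow fewer, larger checkpoints
    checkpoint_completion_target = 0.9   # spread checkpoint I/O over the interval
    synchronous_commit = off             # relax per-commit flush guarantees
    wal_compression = on                 # smaller full-page images in WAL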

[1] https://gist.github.com/holtgrewe/3e029c6c2555e0d9fd3fb247a0109b45
[2] https://gist.github.com/holtgrewe/3e029c6c2555e0d9fd3fb247a0109b45#file-from_scratch-sql-L54
[3] https://file-public.bihealth.org/transient/varfish-debug/
[4] https://figshare.com/articles/Corpasome/693052
[5] https://gist.github.com/holtgrewe/3e029c6c2555e0d9fd3fb247a0109b45#file-postgresql-conf
