From: | Alex Tokarev <dwalin(at)dwalin(dot)ru> |
---|---|
To: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Table with large number of int columns, very slow COPY FROM |
Date: | 2017-12-08 04:21:45 |
Message-ID: | D64F5359.2993D%dwalin@dwalin.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Hi,
I have a set of tables with fairly large number of columns, mostly int with
a few bigints and short char/varchar columns. I¹ve noticed that Postgres is
pretty slow at inserting data in such a table. I tried to tune every
possible setting: using unlogged tables, increased shared_buffers, etc; even
placed the db cluster on ramfs and turned fsync off. The results are pretty
much the same with the exception of using unlogged tables that improves
performance just a little bit.
I have made a minimally reproducible test case consisting of a table with
848 columns, inserting partial dataset of 100,000 rows with 240 columns. On
my dev VM the COPY FROM operation takes just shy of 3 seconds to complete,
which is entirely unexpected for such a small dataset.
Here¹s a tarball with test schema and data:
http://nohuhu.org/copy_perf.tar.bz2; it¹s 338k compressed but expands to
~50mb. Here¹s the result of profiling session with perf:
https://pastebin.com/pjv7JqxD
--
Regards,
Alex.
From | Date | Subject | |
---|---|---|---|
Next Message | Laurenz Albe | 2017-12-08 06:00:19 | Re: proposal: alternative psql commands quit and exit |
Previous Message | Jeevan Chalke | 2017-12-08 04:21:17 | Re: Partition-wise aggregation/grouping |
From | Date | Subject | |
---|---|---|---|
Next Message | Mark Kirkwood | 2017-12-08 04:51:08 | Re: Setting effective_io_concurrency in VM? |
Previous Message | Flávio Henrique | 2017-12-08 01:12:16 | Learning EXPLAIN |