| From: | Alex Tokarev <dwalin(at)dwalin(dot)ru> |
|---|---|
| To: | <pgsql-performance(at)postgresql(dot)org> |
| Subject: | Table with large number of int columns, very slow COPY FROM |
| Date: | 2017-12-08 04:21:45 |
| Message-ID: | D64F5359.2993D%dwalin@dwalin.ru |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers pgsql-performance |
Hi,
I have a set of tables with fairly large number of columns, mostly int with
a few bigints and short char/varchar columns. I¹ve noticed that Postgres is
pretty slow at inserting data in such a table. I tried to tune every
possible setting: using unlogged tables, increased shared_buffers, etc; even
placed the db cluster on ramfs and turned fsync off. The results are pretty
much the same with the exception of using unlogged tables that improves
performance just a little bit.
I have made a minimally reproducible test case consisting of a table with
848 columns, inserting partial dataset of 100,000 rows with 240 columns. On
my dev VM the COPY FROM operation takes just shy of 3 seconds to complete,
which is entirely unexpected for such a small dataset.
Here¹s a tarball with test schema and data:
http://nohuhu.org/copy_perf.tar.bz2; it¹s 338k compressed but expands to
~50mb. Here¹s the result of profiling session with perf:
https://pastebin.com/pjv7JqxD
--
Regards,
Alex.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Laurenz Albe | 2017-12-08 06:00:19 | Re: proposal: alternative psql commands quit and exit |
| Previous Message | Jeevan Chalke | 2017-12-08 04:21:17 | Re: Partition-wise aggregation/grouping |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Mark Kirkwood | 2017-12-08 04:51:08 | Re: Setting effective_io_concurrency in VM? |
| Previous Message | Flávio Henrique | 2017-12-08 01:12:16 | Learning EXPLAIN |