Performance decline maybe caused by multi-column index?

From: Jan Bauer Nielsen <jbn(at)dbc(dot)dk>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Performance decline maybe caused by multi-column index?
Date: 2016-03-18 13:26:14
Message-ID: 56EC01F6.9090400@dbc.dk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

While developing a batch processing platform using postgresql as the
underlying data store we are seeing a performance decline in our
application.

In this application a job is broken up into chunks where each chunk
contains a number of items (typically 10).

CREATE TABLE item (
id SMALLINT NOT NULL,
chunkId INTEGER NOT NULL,
jobId INTEGER NOT NULL,
-- other attributes omitted for brewity
PRIMARY KEY (jobId, chunkId, id)
);

So a job with 600.000 items results in 600.000 rows in the items table
with a fixed jobId, chunkId ranging from 0-59999 and for each chunkId an
id ranging from 0-9.

All ten inserts for a particular chunkId are handled in a single
transaction, and over time we are seeing an increase in transaction
execution time, <100ms for the first 100.000 items, >300ms when we reach
the 400.000 mark, and the trend seems to be forever increasing.

No decline is observed if we instead sequentially submit 6 jobs of
100.000 items each.

Therefore we are beginning to wonder if we are hitting some sort of
upper limit with regards to the multi column index? Perhaps something
causing it to sort on disk or something like that?

Any suggestions to the cause of this would be very much appreciated.

jobstore=> SELECT version();
version
----------------------------------------------------------------------------------------------
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
4.7.2-5) 4.7.2, 64-bit

jobstore=> SELECT name, current_setting(name), SOURCE
jobstore-> FROM pg_settings
jobstore-> WHERE SOURCE NOT IN ('default', 'override');
name | current_setting | source
----------------------------+----------------------------------------+----------------------
application_name | psql | client
client_encoding | UTF8 | client
DateStyle | ISO, YMD | configuration file
default_text_search_config | pg_catalog.english | configuration file
lc_messages | en_DK.UTF-8 | configuration file
lc_monetary | en_DK.UTF-8 | configuration file
lc_numeric | en_DK.UTF-8 | configuration file
lc_time | en_DK.UTF-8 | configuration file
listen_addresses | * | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | configuration file
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 128MB | configuration file
ssl | on | configuration file
ssl_cert_file | /etc/ssl/certs/ssl-cert-snakeoil.pem |
configuration file
ssl_key_file | /etc/ssl/private/ssl-cert-snakeoil.key |
configuration file
TimeZone | localtime | configuration file

Kind regards,

Jan Bauer Nielsen
Software developer
DBC as
http://www.dbc.dk/english

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dave Stibrany 2016-03-18 14:48:06 Re: Disk Benchmarking Question
Previous Message Albe Laurenz 2016-03-18 10:30:17 Re: using shared_buffers during seq_scan