Re: Batch insert heavily affecting query performance.

From: Jean Baro <jfbaro(at)gmail(dot)com>
To: "MichaelDBA(at)sqlexec(dot)com" <michaeldba(at)sqlexec(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Batch insert heavily affecting query performance.
Date: 2017-12-25 00:09:44
Message-ID: CA+fQee=kXxYL9v-Rb=eYEC9a7ZOiNe1hsSiqLgpRk+uTknF07Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Multiple connections, but we are going to test it with only one. Would it
make any difference?

Thanks

Em 24 de dez de 2017 21:52, "MichaelDBA(at)sqlexec(dot)com" <michaeldba(at)sqlexec(dot)com>
escreveu:

> Are the inserts being done through one connection or multiple connections
> concurrently?
>
> Sent from my iPhone
>
> > On Dec 24, 2017, at 2:51 PM, Jean Baro <jfbaro(at)gmail(dot)com> wrote:
> >
> > Hi there,
> >
> > We are testing a new application to try to find performance issues.
> >
> > AWS RDS m4.large 500GB storage (SSD)
> >
> > One table only, called Messages:
> >
> > Uuid
> > Country (ISO)
> > Role (Text)
> > User id (Text)
> > GroupId (integer)
> > Channel (text)
> > Title (Text)
> > Payload (JSON, up to 20kb)
> > Starts_in (UTC)
> > Expires_in (UTC)
> > Seen (boolean)
> > Deleted (boolean)
> > LastUpdate (UTC)
> > Created_by (UTC)
> > Created_in (UTC)
> >
> > Indexes:
> >
> > UUID (PK)
> > UserID + Country (main index)
> > LastUpdate
> > GroupID
> >
> >
> > We inserted 160MM rows, around 2KB each. No partitioning.
> >
> > Insert started at around 3.000 inserts per second, but (as expected)
> started to slow down as the number of rows increased. In the end we got
> around 500 inserts per second.
> >
> > Queries by Userd_ID + Country took less than 2 seconds, but while the
> batch insert was running the queries took over 20 seconds!!!
> >
> > We had 20 Lambda getting messages from SQS and bulk inserting them into
> Postgresql.
> >
> > The insert performance is important, but we would slow it down if needed
> in order to ensure a more flat query performance. (Below 2 seconds). Each
> query (userId + country) returns around 100 diferent messages, which are
> filtered and order by the synchronous Lambda function. So we don't do any
> special filtering, sorting, ordering or full text search in Postgres. In
> some ways we use it more like a glorified file system. :)
> >
> > We are going to limit the number of lambda workers to 1 or 2, and then
> run some queries concurrently to see if the query performance is not affect
> too much. We aim to get at least 50 queries per second (returning 100
> messages each) under 2 seconds, even when there is millions of messages on
> SQS being inserted into PG.
> >
> > We haven't done any performance tuning in the DB.
> >
> > With all that said, the question is:
> >
> > What can be done to ensure good query performance (UserID+ country) even
> when the bulk insert is running (low priority).
> >
> > We are limited to use AWS RDS at the moment.
> >
> > Cheers
> >
> >
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message MichaelDBA 2017-12-25 01:30:27 Re: Batch insert heavily affecting query performance.
Previous Message MichaelDBA@sqlexec.com 2017-12-24 23:52:07 Re: Batch insert heavily affecting query performance.