Re: Built-in connection pooling

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Built-in connection pooling
Date: 2018-02-01 15:07:03
Message-ID: d8d92459-b9e8-b89a-6a9e-2f497adf2a69@postgrespro.ru
Lists: pgsql-hackers

On 01.02.2018 16:33, Vladimir Sitnikov wrote:
> Konstantin>I have not built YCSB myself, use existed installation.
>
> Which pgjdbc version was in use?
>
postgresql-9.4.1212.jar

> Konstantin>One of the main problems of Postgres is significant degrade
> of performance in case of concurrent write access by multiple
> transactions to the same rows.
>
> I would consider that a workload "problem" rather than PostgreSQL problem.
> That is, if an application (e.g. YCSB) is trying to update the same
> rows in multiple transactions concurrently, then the outcome of such
> updates is likely to be unpredictable. Does it make sense?

I can't agree with you.
Yes, there are workloads where updates are more or less local: clients
mostly update their own private data.
But there are many systems with "shared" resources which are
concurrently accessed by different users. They may just increment an
access count or perform a deposit/withdrawal...
Just a simple example: consider that you have something like an AppStore
and there is some popular application which is bought by a lot of users.
From the DBMS point of view, a lot of clients perform concurrent updates
of the same record.
So performance on such a workload is also very important. And
unfortunately here Postgres loses the competition with MySQL and most
other DBMSes.
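Just to make the workload concrete, here is a toy single-process model of such a "hot record": many clients all incrementing one shared counter, serialized on a single lock (the table name and SQL are not shown; this only models the contention pattern, not Postgres itself):

```python
import threading

# Toy model of a hot row: 100 "clients" concurrently incrementing one
# shared purchase counter, as in the AppStore example above.
counter = 0
lock = threading.Lock()

def buy():
    global counter
    with lock:  # every update serializes on the same record
        counter += 1

threads = [threading.Thread(target=buy) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 100: all updates funneled through one lock
```

All 100 updates complete, but strictly one at a time; adding more clients only lengthens the queue, which is exactly why throughput degrades instead of scaling.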

>
> At least, I do not see why Mongo would degrade in a different way
> there. Oleg's charts suggest that Mongo does not degrade there, so I
> wonder if we compare apples to apples in the first place.

Postgres locks tuples in a very inefficient way in case of high contention.
It first locks the buffer and checks whether the tuple is locked by some
other backend.
Then it tries to take a heavy-weight lock on the tuple's ctid. If several
processes are trying to update this tuple, all of them will be queued on
this heavy-weight tuple lock.
After getting this tuple lock, the backend waits on the XID of the
transaction which updated the tuple.
Once the transaction that updated this tuple completes, Postgres wakes up
the backends waiting for it. The backend checks the status of the tuple,
releases the tuple lock, and one of the waiting clients is awakened.
Since Postgres uses MVCC, it creates a new version of the tuple on each
update.
So the tuple all clients were waiting for is not the last version of the
tuple any more.
Depending on the isolation level, they then either need to report an error
(in case of repeatable read) or update the snapshot and repeat the search
with the new snapshot...
and perform all the checks and locks mentioned above once again.
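The cost of this retry loop can be sketched with a deterministic toy model (my own simplification, assuming each re-check after a version change costs one extra attempt; real costs per attempt differ):

```python
# Toy, single-threaded model of the sequence above: n backends queue on
# one hot tuple; each time the winner commits a new tuple version, every
# remaining waiter must re-lock and re-check, finding a stale ctid.
def contended_updates(n):
    attempts = 0
    waiters = list(range(n))
    while waiters:
        waiters.pop(0)                # one backend gets the tuple lock
        attempts += 1 + len(waiters)  # winner updates; the rest re-check
    return attempts

print(contended_updates(8))  # 36 attempts for 8 backends
```

Total work grows as n(n+1)/2, i.e. quadratically in the number of competing backends, which matches the degradation trend under discussion.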

I hope it is clear from this brief and not very precise explanation
that Postgres has to do a lot of redundant work if several clients are
competing for the same tuple.
There is a well-known rule that pessimistic locking is more efficient than
optimistic locking in case of high contention.
So Postgres could provide better performance on this workload if it were
more pessimistic:
take the lock not on the ctid (the identifier of a particular tuple
version) but on the tuple's PK (primary key), and hold it till the end of
the transaction (because until the transaction is completed nobody else
will be able to update this tuple). This trick with locking the PK really
helps to improve performance on this workload, but unfortunately it cannot
reverse the trend of performance degrading as the number of competing
transactions grows.
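The two strategies can be contrasted in the same toy cost model (again my own simplification, not a measurement of Postgres):

```python
# ctid-based (optimistic) locking: each new tuple version forces all
# remaining waiters to re-check, so total attempts grow quadratically.
def attempts_optimistic(n):
    return n * (n + 1) // 2

# PK lock held to end of transaction (pessimistic): waiters queue once
# and perform exactly one update attempt when the lock is granted.
def attempts_pessimistic(n):
    return n

print(attempts_optimistic(64), attempts_pessimistic(64))  # 2080 64
```

The pessimistic variant removes the redundant re-checks, but both variants still serialize all updates on one record, which is why the overall degradation trend remains.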

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
