From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Maxim Orlov <orlovmg(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: old_snapshot_threshold bottleneck on replica
Date: 2023-01-24 10:35:21
Message-ID: CALT9ZEEwfZkBC8LHddkigJC_MUQtKsxUNN-veVDVWmmBvKn_oQ@mail.gmail.com
Lists: pgsql-hackers
Hi, Maxim!
On Mon, 23 Jan 2023 at 18:40, Maxim Orlov <orlovmg(at)gmail(dot)com> wrote:
>
> Hi!
>
> One of our customers stumbled onto a significant performance degradation while running multiple OLAP-like queries on a replica.
> After some investigation, it became clear that the problem is in accessing the old_snapshot_threshold parameter.
>
> Access to the old_snapshot_threshold parameter is guarded by mutex_threshold. This is not a problem on the primary
> server, since we rarely call GetOldSnapshotThresholdTimestamp:
>
> void
> TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
> {
>     if (RelationAllowsEarlyPruning(relation)
>         && (snapshot)->whenTaken < GetOldSnapshotThresholdTimestamp())
>         ereport(ERROR,
>                 (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
>                  errmsg("snapshot too old")));
>
> But in the case of a replica, we have to call GetOldSnapshotThresholdTimestamp much more often, so this becomes a
> bottleneck. The customer solved this issue by setting old_snapshot_threshold to 0, but I think we can
> do something about it.
>
> Some more investigation:
>
> -- On primary --
> $ ./bin/psql postgres -c "create database benchmark"
> CREATE DATABASE
> $ ./bin/pgbench -i -Uorlov -s300 benchmark
> dropping old tables...
> NOTICE: table "pgbench_accounts" does not exist, skipping
> ...
> creating tables...
> generating data (client-side)...
> 30000000 of 30000000 tuples (100%) done (elapsed 142.37 s, remaining 0.00 s)
> vacuuming...
> creating primary keys...
> done in 177.67 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 144.45 s, vacuum 0.59 s, primary keys 32.61 s).
>
> -- On secondary --
> $ touch 1.sql
> $ vim 1.sql
> $ cat 1.sql
> \set bid random(1, 300)
> BEGIN;
> SELECT sum(aid) FROM pgbench_accounts where bid = :bid GROUP BY bid;
> END;
> $ ./bin/pgbench -f 1.sql -p5433 -Uorlov -j10 -c100 -T720 -P1 -n benchmark
> pgbench (16devel)
> progress: 1.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> ...
> progress: 20.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>
> $ perf record -F 99 -a -g --call-graph=dwarf sleep 5
> $ perf script --header --fields comm,pid,tid,time,event,ip,sym,dso > file
> $ grep s_lock file | wc -l
> 3486
>
> My proposal is to use atomics for threshold_timestamp and threshold_xid. PFA 0001 patch.
> With patch 0001 we got:
>
> $ grep s_lock file2 | wc -l
> 8
>
> Maybe we should go further and remove mutex_threshold here entirely? This would allow threshold_timestamp and
> threshold_xid to be read inconsistently with each other, but is this really a problem?
>
> Thoughts?
I think optimizing locking and switching to atomics wherever it
improves performance is a good direction. If the performance improvement
could be demonstrated in a more direct way, that would be a good argument
for committing the change. Personally, I like TPS plots like in [1].
Kind regards,
Pavel Borisov,
Supabase