From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: postgresql(at)taljaren(dot)se
Subject: BUG #17717: Regression in vacuumdb (15 is slower than 10/11 and possible memory issue)
Date: 2022-12-13 10:57:39
Message-ID: 17717-6c50eb1c7d23a886@postgresql.org
Lists: pgsql-bugs pgsql-hackers
The following bug has been logged on the website:
Bug reference: 17717
Logged by: Gunnar L
Email address: postgresql(at)taljaren(dot)se
PostgreSQL version: 15.0
Operating system: Ubuntu Linux
Description:
We have observed a significant slowdown in vacuumdb performance between
different versions of PostgreSQL, and possibly a memory issue as well.
We run a specific data model where each customer has its own schema with
its own set of tables. Each database server hosts 16 databases, each
containing around 250 customer schemas. Since PostgreSQL creates at least
one file per database object, we end up with around 5 million files on each
database server. This may or may not be related to the issue we're seeing
(new algorithms with new time complexity?).
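For reference, the per-database object count behind that estimate can be
checked with a query along these lines (a sketch; the port and database
name follow the test setup described further down):

psql -h localhost -U postgres -p 15432 db01 -c "SELECT count(*) FROM pg_class WHERE relkind IN ('r', 'i', 't', 'm');"

With 250 schemas of 500 tables each, that is 125,000 tables per database,
and every relation has at least one file on disk (plus fork and segment
files), which is how the total across 16 databases reaches millions.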
We upgraded from PostgreSQL 9.5 to PostgreSQL 13 and noticed a significant
slowdown in vacuumdb. Before, we could run "vacuumdb -a -z" each night,
taking around 2 hours to complete. After the upgrade, we see constant 100%
CPU utilization during the vacuumdb run (with almost no I/O activity), and
vacuumdb cannot complete within a reasonable time; we are only able to
vacuum about 3-4 databases each night.
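While the CPU is pegged, one way to see whether the time is going into the
vacuum phases themselves is to poll pg_stat_progress_vacuum (available
since 9.6); a sketch:

psql -h localhost -U postgres -p 15432 postgres -c "SELECT pid, datname, relid, phase FROM pg_stat_progress_vacuum;"

If tables flash by quickly there, the time is presumably being spent in
per-table overhead rather than in the scan/vacuum phases.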
We are able to recreate this issue using a simple bash script that
generates a similar setup.
From local testing, here are our findings:
Concerning speed:
* Versions 9.5, 10, and 11 are fast (9.5 slower than 10 and 11)
* Versions 12, 13, and 14 are very, very slow
* Version 15 is faster (a lot faster than 12, 13, and 14) but not nearly as
fast as 10 or 11
Concerning memory usage:
* Version 15 uses a lot more shared memory, or it might not be releasing it
properly after vacuuming a database (see the check below)
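Docker gives a container a 64MB /dev/shm by default, and PostgreSQL's
dynamic shared memory segments normally live there. A simple way to watch
consumption during a run (assuming the segments in question are the
/PostgreSQL.* files under /dev/shm, which matches the error messages
further down):

docker exec -it pg15 df -h /dev/shm
docker exec -it pg15 ls -lh /dev/shm

If usage keeps growing as vacuumdb moves from one database to the next,
that would point to memory not being released.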
These are the timings for vacuuming the 16 dbs:

Version   Seconds   Completed
-------   -------   --------------------------------
9.5           412   16/16
10            178   16/16
11            166   16/16
12           8319   1/16 or 2/16 (manually aborted)
13          18853   3/16 or 4/16 (manually aborted)
14          16857   3/16 or 4/16 (manually aborted)
15            617   1/16 (crashed!)
15           4158   6/16 (crashed! --shm-size=256mb)
15           9500   16/16 (--shm-size=4096mb)
The timing of the only successful run for postgres 15 is somewhat
unreliable: the machine was suspended for about 1-1.5 hours during the run,
so 9500 seconds is only an estimate. Still, the first run (one database
completed in 10 minutes) suggests that 15 is faster than 12-14 but slower
than 10 and 11 (which completed everything in about 3 minutes).
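To get per-database timings rather than one aggregate (and to keep partial
numbers when a run crashes), the databases can also be vacuumed one at a
time; a sketch:

for db in `seq -w 1 16`; do
    echo "db$db"
    time docker exec -it pg15 bash -c "vacuumdb -z -U postgres db$db"
done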
The following describes our setup.
This is the script (called setup.sh) we're using to populate the databases
(we pass the port number as a parameter).
##### start of setup.sh
export PGPASSWORD=mysecretpassword
PORT=$1
# Generate one big SQL file with 250 schemas x 500 tables each
echo "" > tables_$PORT.sql
for schema in `seq -w 1 250`; do
    echo "create schema schema$schema;" >> tables_$PORT.sql
    for table in `seq -w 1 500`; do
        echo "create table schema$schema.table$table (id int);" >> tables_$PORT.sql
    done
done
echo "Setting up db: 01"
createdb -h localhost -U postgres -p $PORT db01
psql -q -h localhost -U postgres -p $PORT db01 -f tables_$PORT.sql
# Cloning db01 seems to be the fastest way to create the remaining databases
for db in `seq -w 2 16`; do
    echo "Setting up db: $db"
    createdb -h localhost -U postgres -p $PORT --template db01 db$db
done
####### end of setup.sh
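For what it's worth, the same data model can also be generated server-side
in a single DO block (a sketch, using the same zero-padded naming
convention as setup.sh; we have not compared its speed against the
SQL-file approach):

psql -q -h localhost -U postgres -p $PORT db01 <<'EOF'
DO $$
BEGIN
  FOR s IN 1..250 LOOP
    EXECUTE format('create schema schema%s', lpad(s::text, 3, '0'));
    FOR t IN 1..500 LOOP
      EXECUTE format('create table schema%s.table%s (id int)',
                     lpad(s::text, 3, '0'), lpad(t::text, 3, '0'));
    END LOOP;
  END LOOP;
END
$$;
EOF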
To execute a test for a particular postgres version (in this example PG
9.5), we run the following. It will set up PG 9.5 on port 15432.

docker run --rm --name pg95 -e POSTGRES_PASSWORD=mysecretpassword -p 15432:5432 -d postgres:9.5
./setup.sh 15432
date; time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"; date

(The date commands are added to keep track of when tasks were started.)
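To cycle through several versions, a small wrapper along these lines can be
used (a sketch; the port convention is ours, and the fixed sleep is a crude
substitute for properly waiting until the server accepts connections):

for v in 10 11 12 13 14 15; do
    port=$((15000 + v))
    docker run --rm --name pg$v -e POSTGRES_PASSWORD=mysecretpassword -p $port:5432 -d postgres:$v
    sleep 10   # crude wait for server startup
    ./setup.sh $port
    date; time docker exec -it pg$v bash -c "vacuumdb -a -z -U postgres"; date
    docker stop pg$v
done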
Here is the complete set of commands, output, and comments.
(We use different ports for the different versions of PG.)
time docker exec -it pg95 bash -c "vacuumdb -a -z -U postgres"
vacuumdb: vacuuming database "db01"
…<snip>...
vacuumdb: vacuuming database "db16"
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"
real 6m52,070s
user 0m0,048s
sys 0m0,029s
time docker exec -it pg10 bash -c "vacuumdb -a -z -U postgres"
vacuumdb: vacuuming database "db01"
…<snip>...
vacuumdb: vacuuming database "db16"
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"
real 2m58,354s
user 0m0,043s
sys 0m0,013s
time docker exec -it pg11 bash -c "vacuumdb -a -z -U postgres"
vacuumdb: vacuuming database "db01"
…<snip>...
vacuumdb: vacuuming database "db16"
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"
real 2m46,181s
user 0m0,047s
sys 0m0,012s
date; time docker exec -it pg12 bash -c "vacuumdb -a -z -U postgres"; date
lör 10 dec 2022 18:57:43 CET
vacuumdb: vacuuming database "db01"
vacuumdb: vacuuming database "db02"
^CCancel request sent
vacuumdb: error: vacuuming of table "schema241.table177" in database "db02" failed: ERROR: canceling statement due to user request
real 138m39,600s
user 0m0,177s
sys 0m0,418s
lör 10 dec 2022 21:16:22 CET
date;time docker exec -it pg13 bash -c "vacuumdb -a -z -U postgres"
lör 10 dec 2022 07:22:32 CET
vacuumdb: vacuuming database "db01"
vacuumdb: vacuuming database "db02"
vacuumdb: vacuuming database "db03"
vacuumdb: vacuuming database "db04"
^CCancel request sent
real 314m13,172s
user 0m0,551s
sys 0m0,663s
lör 10 dec 2022 12:37:03 CET
date;time docker exec -it pg14 bash -c "vacuumdb -a -z -U postgres"; date
lör 10 dec 2022 14:15:37 CET
vacuumdb: vacuuming database "db01"
vacuumdb: vacuuming database "db02"
vacuumdb: vacuuming database "db03"
vacuumdb: vacuuming database "db04"
^CCancel request sent
real 280m57,172s
user 0m0,586s
sys 0m0,559s
lör 10 dec 2022 18:56:34 CET
date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date
lör 10 dec 2022 12:50:25 CET
vacuumdb: vacuuming database "db01"
vacuumdb: vacuuming database "db02"
vacuumdb: error: processing of database "db02" failed: ERROR: could not resize shared memory segment "/PostgreSQL.2952321776" to 27894720 bytes: No space left on device
real 10m17,913s
user 0m0,030s
sys 0m0,049s
lör 10 dec 2022 13:00:43 CET
# it was faster, but we need to extend shared memory to make it work
docker run --rm --name pg15 --shm-size=256mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15
date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date
mån 12 dec 2022 08:56:17 CET
vacuumdb: vacuuming database "db01"
…<snip>...
vacuumdb: vacuuming database "db07"
vacuumdb: error: processing of database "db07" failed: ERROR: could not resize shared memory segment "/PostgreSQL.1003084622" to 27894720 bytes: No space left on device
real 69m18,345s
user 0m0,217s
sys 0m0,086s
mån 12 dec 2022 10:05:36 CET
docker run --rm --name pg15 --shm-size=4096mb -e POSTGRES_PASSWORD=mysecretpassword -p 55555:5432 -d postgres:15
date;time docker exec -it pg15 bash -c "vacuumdb -a -z -U postgres"; date
mån 12 dec 2022 11:16:11 CET
vacuumdb: vacuuming database "db01"
…<snip>...
vacuumdb: vacuuming database "db16"
vacuumdb: vacuuming database "postgres"
vacuumdb: vacuuming database "template1"
real 232m46,168s
user 0m0,227s
sys 0m0,467s
mån 12 dec 2022 15:08:57 CET
Here is the hardware that was used:

AMD Ryzen 7 PRO 5850U with Radeon Graphics
8 cores, 16 threads

$ free
               total        used        free      shared  buff/cache   available
Mem:        28562376     5549716      752624     1088488    22260036    21499752
Swap:         999420      325792      673628

Disk: NVMe device, Samsung SSD 980 1TB