Potential performance issues

From: "Jung, Jinho" <jinho(dot)jung(at)gatech(dot)edu>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Potential performance issues
Date: 2021-02-28 15:04:33
Message-ID: BN6PR07MB313763C4040616159DE1B6FBEE9B9@BN6PR07MB3137.namprd07.prod.outlook.com
Lists: pgsql-performance

# Performance issues discovered from differential testing

Hello. We study DBMSs at Georgia Tech and are reporting interesting queries that potentially show performance problems.

To discover such cases, we used the following procedure:

* Install the latest version of four DBMSs (PostgreSQL, SQLite, MySQL, CockroachDB)
* Import the TPC-C benchmark into each DBMS
* Generate a random query (and translate it to handle the different SQL dialects)
* Run the query and measure its execution time
* Remove `LIMIT` to prevent non-deterministic behavior
* Discard the test case if any DBMS returns an error
* Some DBMSs do not report the actual query execution time. In that case, we query the current time before and after the actual query and calculate the elapsed time.
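
The measurement steps above could be sketched roughly as follows. This is a minimal sketch, not the harness used for this report: the helper names `strip_limit` and `elapsed_seconds` are hypothetical, it assumes GNU `sed` and GNU `date` (as shipped with Ubuntu), and the `LIMIT` removal is a naive regex that does not understand string literals.

```sh
# Hypothetical helpers; names and approach are assumptions, not the report's actual harness.

# Remove "LIMIT <n>" clauses from the SQL read on stdin
# (naive and case-insensitive; the I flag requires GNU sed).
strip_limit() {
    sed -E 's/[[:space:]]+limit[[:space:]]+[0-9]+//Ig'
}

# Run a command, print the elapsed wall-clock time in seconds, and return the
# command's exit status so that failing test cases can be discarded.
elapsed_seconds() {
    start=$(date +%s.%N)                  # current time before the query (GNU date)
    status=0
    "$@" > /dev/null 2>&1 || status=$?    # run the query, remember any error
    end=$(date +%s.%N)                    # current time after the query
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.5f\n", e - s }'
    return "$status"
}

# Example usage (requires the databases set up as described in this post):
#   strip_limit < query.sql > query_nolimit.sql
#   elapsed_seconds sqlite3 tpcc_sq.db < query_nolimit.sql || echo "error: discard this test case"
```

Because the timer wraps the whole client invocation, it also counts connection and client start-up time, so it is only a rough stand-in for a DBMS's own reported execution time.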

In this report, we attach a few queries. We believe that many of them are duplicates or false positives. It would be great to get feedback on the reported queries. Once we know the root cause of each problem, or that it is a false positive, we will remove those cases and send a follow-up report.

For example, the query below runs about 1000x slower on PostgreSQL than on the other DBMSs.

select ref_0.ol_amount as c0
from order_line as ref_0
left join stock as ref_1
on (ref_0.ol_o_id = ref_1.s_w_id )
inner join warehouse as ref_2
on (ref_1.s_dist_09 is NULL)
where ref_2.w_tax is NULL;

* Query files link:

wget https://gts3.org/~jjung/report1/pg.tar.gz

* Execution results (execution time in seconds)

| Filename | PostgreSQL (s) | MySQL (s) | CockroachDB (s) | SQLite (s) | Ratio |
|---------:|---------:|---------:|------------:|---------:|---------:|
| 34065 | 1.31911 | 0.013 | 0.02493 | 1.025 | 101.47 |
| 36399 | 3.60298 | 0.015 | 1.05593 | 3.487 | 240.20 |
| 35767 | 4.01327 | 0.032 | 0.00727 | 2.311 | 552.19 |
| 11132 | 4.3518 | 0.022 | 0.00635 | 3.617 | 684.88 |
| 29658 | 4.6783 | 0.034 | 0.00778 | 2.63 | 601.10 |
| 19522 | 1.06943 | 0.014 | 0.00569 | 0.0009 | 1188.26 |
| 38388 | 3.21383 | 0.013 | 0.00913 | 2.462 | 352.09 |
| 7187 | 1.20267 | 0.015 | 0.00316 | 0.0009 | 1336.30 |
| 24121 | 2.80611 | 0.014 | 0.03083 | 0.005 | 561.21 |
| 25800 | 3.95163 | 0.024 | 0.73027 | 3.876 | 164.65 |
| 2030 | 1.91181 | 0.013 | 0.04123 | 1.634 | 147.06 |
| 17383 | 3.28785 | 0.014 | 0.00611 | 2.4 | 538.45 |
| 19551 | 4.70967 | 0.014 | 0.00329 | 0.0009 | 5232.97 |
| 26595 | 3.70423 | 0.014 | 0.00601 | 2.747 | 615.92 |
| 469 | 4.18906 | 0.013 | 0.12343 | 0.016 | 322.23 |
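
The post does not define the Ratio column. A quick check suggests it is the PostgreSQL time divided by the fastest time among the other three DBMSs. This is an inference from the numbers, not something stated in the report, and a few rows differ slightly from this formula, presumably because the displayed times are rounded. A small sketch (the function name `slowdown_ratio` is hypothetical):

```sh
# Hypothetical helper: PostgreSQL time divided by the fastest of the other three times.
# Arguments: <postgres> <mysql> <cockroachdb> <sqlite>, all in seconds.
slowdown_ratio() {
    awk -v pg="$1" -v a="$2" -v b="$3" -v c="$4" 'BEGIN {
        min = a + 0                       # force numeric comparison
        if (b + 0 < min) min = b + 0
        if (c + 0 < min) min = c + 0
        printf "%.2f\n", pg / min
    }'
}

slowdown_ratio 1.31911 0.013 0.02493 1.025     # row 34065, prints 101.47
slowdown_ratio 1.06943 0.014 0.00569 0.0009    # row 19522, prints 1188.26
```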

# Reproduce: install DBMSs, import TPC-C benchmark, run query

### CockroachDB (from binary)

```sh
# install DBMS
wget https://binaries.cockroachdb.com/cockroach-v20.2.5.linux-amd64.tgz
tar xzvf cockroach-v20.2.5.linux-amd64.tgz
sudo cp -i cockroach-v20.2.5.linux-amd64/cockroach /usr/local/bin/cockroach20

sudo mkdir -p /usr/local/lib/cockroach
sudo cp -i cockroach-v20.2.5.linux-amd64/lib/libgeos.so /usr/local/lib/cockroach/
sudo cp -i cockroach-v20.2.5.linux-amd64/lib/libgeos_c.so /usr/local/lib/cockroach/

# test
which cockroach20
cockroach20 demo

# start the DBMS (to make initial node files)
cd ~
cockroach20 start-single-node --insecure --store=node20 --listen-addr=localhost:26259 --http-port=28080 --max-sql-memory=1GB --background
# quit
cockroach20 quit --insecure --host=localhost:26259

# import DB
mkdir -p node20/extern
wget https://gts3.org/~jjung/tpcc-perf/tpcc_cr.tar.gz
tar xzvf tpcc_cr.tar.gz
# nodelocal://self/ resolves to the "extern" directory inside the store
cp tpcc_cr.sql node20/extern/tpcc.sql

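# start the DBMS again and create the database
cockroach20 start-single-node --insecure --store=node20 --listen-addr=localhost:26259 --http-port=28080 --max-sql-memory=1GB --background
cockroach20 sql --insecure --host=localhost:26259 --execute="CREATE DATABASE IF NOT EXISTS cockroachdb;"
# to start over, drop the database first:
# cockroach20 sql --insecure --host=localhost:26259 --execute="DROP DATABASE cockroachdb;"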

cockroach20 sql --insecure --host=localhost:26259 --database=cockroachdb --execute="IMPORT PGDUMP 'nodelocal://self/tpcc.sql';"

# test
cockroach20 sql --insecure --host=localhost:26259 --database=cockroachdb --execute="explain analyze select count(*) from order_line;"

# run query
cockroach20 sql --insecure --host=localhost:26259 --database=cockroachdb < query.sql
```

### PostgreSQL (from source)

```sh
# remove any previous postgres installation (if one exists)
sudo apt-get --purge remove postgresql postgresql-doc postgresql-common

# build latest postgres
git clone https://github.com/postgres/postgres.git
cd postgres
mkdir bld
cd bld
../configure
make -j 20

# install DBMS
sudo su
make install
adduser postgres
rm -rf /usr/local/pgsql/data
mkdir /usr/local/pgsql/data
chown -R postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
/usr/local/pgsql/bin/createdb jjung
#/usr/local/pgsql/bin/psql postgresdb

/usr/local/pgsql/bin/createuser -s {username}
/usr/local/pgsql/bin/createdb postgresdb
/usr/local/pgsql/bin/psql

=# ALTER USER {username} WITH SUPERUSER;

# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_pg.tar.gz
tar xzvf tpcc_pg.tar.gz
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -f tpcc_pg.sql

# test
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -c "select * from warehouse"
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -c "\\dt"

# run query
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -f query.sql
```

### SQLite (from source)

```sh
# uninstall any existing sqlite3 package
sudo apt purge sqlite3

# build latest sqlite from src
git clone https://github.com/sqlite/sqlite.git
cd sqlite
mkdir bld
cd bld
../configure
make -j 20

# install DBMS
sudo make install

# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_sq.tar.gz
tar xzvf tpcc_sq.tar.gz

# test
sqlite3 tpcc_sq.db
sqlite> select * from warehouse;

# run query
sqlite3 tpcc_sq.db < query.sql
```

### MySQL (install v8.0.x)

```sh
# remove mysql v5.x (if it exists)
sudo apt purge mysql-server mysql-common mysql-client

# install
wget https://dev.mysql.com/get/mysql-apt-config_0.8.16-1_all.deb
sudo dpkg -i mysql-apt-config_0.8.16-1_all.deb
# then select mysql 8.0 server
sudo apt update
sudo apt install mysql-client mysql-community-server mysql-server

# check
mysql -u root -p

# create user mysql (run these two statements at the mysql> prompt)
CREATE USER 'mysql'@'localhost' IDENTIFIED BY 'mysql';
alter user 'root'@'localhost' identified by 'mysql';

# modify the conf (should add "skip-grant-tables" under [mysqld])
sudo vim /etc/mysql/mysql.conf.d/mysqld.cnf

# optimize
# e.g., https://gist.github.com/fevangelou/fb72f36bbe333e059b66

# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_my.tar.gz
tar xzvf tpcc_my.tar.gz
mysql -u mysql -pmysql -e "create database mysqldb"
mysql -u mysql -pmysql mysqldb < tpcc_my.sql

# test
mysql -u mysql -pmysql mysqldb -e "show tables"
mysql -u mysql -pmysql mysqldb -e "select * from customer"

# run query
mysql -u mysql -pmysql mysqldb < query.sql
```

# Evaluation environment

* Server: Ubuntu 18.04 (64-bit)
* CockroachDB: v20.2.5
* PostgreSQL: latest commit (21 Feb, 2021)
* MySQL: v8.0.23
* SQLite: latest commit (21 Feb, 2021)
