Reporting performance issue
(1) By Jinho Jung (jinhojun) on 2021-02-23 14:33:03 [source]
Performance issues discovered from differential test
Hello. We are studying DBMS from GeorgiaTech and reporting interesting queries that potentially show performance problems.
To discover such cases, we used the following procedures:
- Install four DBMSs with latest version (PostgreSQL, SQLite, MySQL, CockroachDB)
- Import TPCC-C benchmark for each DBMS
- Generate random query (and translate the query to handle different dialects)
- Run the query and measure the query execution time
- Remove
LIMIT
to prevent any non-deterministic behaviors - Discard the test case if any DBMS returned an error
- Some DBMS does not show the actual query execution time. In this case, query the
current time
before and after the actual query, and then we calculate the elapsed time.
- Remove
In this report, we attached a few queries. We believe that there are many duplicated or false-positive cases. It would be great if we can get feedback about the reported queries. Once we know the root cause of the problem or false positive, we will make a follow-up report after we remove them all.
Query files link:
wget https://gts3.org/~jjung/report1/sq.tar.gz
Execution result (execution time (second))
Filename | Postgres | Mysql | Cockroachdb | Sqlite | Ratio |
---|---|---|---|---|---|
1753 | 0.00421 | 0.013 | 0.02623 | 4.639 | 1102.40 |
602 | 0.00194 | 0.004 | 0.01218 | 2.497 | 1286.32 |
1948 | 0.00168 | 0.001 | 0.00356 | 1.912 | 1912.14 |
2064 | 0.00117 | 0.0009 | 0.00202 | 2.271 | 2523.33 |
1009 | 0.00194 | 0.001 | 0.03355 | 2.783 | 2782.54 |
389 | 0.00112 | 0.0009 | 0.00193 | 2.375 | 2638.89 |
4293 | 0.00204 | 0.006 | 0.02212 | 2.573 | 1263.25 |
2121 | 0.00171 | 0.0009 | 0.00594 | 2.515 | 2794.44 |
4576 | 0.00168 | 0.005 | 0.0203 | 3.447 | 2050.46 |
1140 | 0.00148 | 0.005 | 0.00998 | 2.698 | 1823.14 |
3580 | 0.00132 | 0.001 | 0.0016 | 2.817 | 2817.20 |
2973 | 0.00202 | 0.005 | 0.01454 | 2.747 | 1362.71 |
1642 | 0.00243 | 0.01 | 0.00597 | 2.537 | 1046.10 |
1539 | 0.00254 | 0.004 | 0.00563 | 2.569 | 1009.48 |
2187 | 0.0121 | 0.054 | 0.05258 | 5.036 | 416.17 |
4475 | 0.00163 | 0.005 | 0.01066 | 2.822 | 1735.53 |
3733 | 0.00109 | 0.006 | 0.00729 | 2.451 | 2248.52 |
3805 | 0.00257 | 0.001 | 0.0209 | 3.735 | 3735.27 |
1292 | 0.00142 | 0.003 | 0.02727 | 1.252 | 884.20 |
2799 | 0.00227 | 0.005 | 0.05051 | 6.286 | 2770.35 |
2070 | 0.00196 | 0.005 | 0.01153 | 2.567 | 1306.49 |
4315 | 0.00193 | 0.0009 | 0.07619 | 2.779 | 3087.78 |
1159 | 0.00148 | 0.0009 | 0.00535 | 2.503 | 2781.11 |
2214 | 0.00106 | 0.005 | 0.00967 | 1.386 | 1303.72 |
2233 | 0.00164 | 0.005 | 0.01079 | 2.925 | 1786.82 |
2171 | 2.75006 | 0.285 | 0.64789 | 3.429 | 12.03 |
4383 | 0.00252 | 0.007 | 0.01062 | 2.771 | 1099.57 |
325 | 0.00154 | 0.004 | 0.00882 | 2.292 | 1489.29 |
2583 | 0.00219 | 0.001 | 0.01203 | 2.923 | 2922.51 |
1027 | 0.00186 | 0.004 | 0.00663 | 2.311 | 1243.81 |
641 | 0.00189 | 0.004 | 0.00585 | 1.367 | 724.12 |
185 | 0.68682 | 0.0009 | 0.04701 | 1.553 | 1725.56 |
4336 | 0.00105 | 0.011 | 0.01241 | 2.512 | 2387.51 |
2680 | 0.00143 | 0.006 | 0.03904 | 5.988 | 4187.31 |
1132 | 0.0849 | 0.001 | 0.24651 | 1.423 | 1422.76 |
597 | 0.00252 | 0.005 | 0.00607 | 2.55 | 1013.02 |
3315 | 0.00151 | 0.006 | 0.01007 | 2.699 | 1783.87 |
3698 | 0.00107 | 0.006 | 0.0069 | 2.49 | 2333.81 |
207 | 0.00148 | 0.004 | 0.00992 | 2.498 | 1691.26 |
3714 | 0.00165 | 0.006 | 0.01072 | 3.975 | 2406.17 |
1990 | 0.00144 | 0.126 | 0.44784 | 1.174 | 814.17 |
1858 | 0.00175 | 0.02 | 0.04427 | 5.774 | 3303.04 |
3741 | 0.00156 | 0.005 | 0.01082 | 2.677 | 1719.21 |
4369 | 0.04095 | 0.0009 | 0.5444 | 1.355 | 1505.56 |
1657 | 0.01819 | 0.001 | 0.21661 | 1.028 | 1028.07 |
4580 | 0.00149 | 0.006 | 0.01062 | 2.772 | 1860.56 |
432 | 0.00115 | 0.005 | 0.00669 | 2.332 | 2034.77 |
2333 | 0.00173 | 0.005 | 0.00685 | 2.365 | 1367.83 |
2498 | 0.00185 | 0.0009 | 0.01235 | 3.995 | 4438.89 |
999 | 0.00102 | 0.005 | 0.00596 | 2.314 | 2257.65 |
2228 | 0.00129 | 0.006 | 0.01223 | 1.428 | 1103.64 |
531 | 0.00153 | 0.0009 | 0.00501 | 2.468 | 2742.22 |
3438 | 0.00128 | 0.02 | 0.02662 | 5.674 | 4439.19 |
Reproduce: install DBMSs, import TPCC benchmark, run query
Cockroach (from binary)
# install DBMS
wget https://binaries.cockroachdb.com/cockroach-v20.2.5.linux-amd64.tgz
tar xzvf cockroach-v20.2.5.linux-amd64.tgz
sudo cp -i cockroach-v20.2.5.linux-amd64/cockroach /usr/local/bin/cockroach20
sudo mkdir -p /usr/local/lib/cockroach
sudo cp -i cockroach-v20.2.5.linux-amd64/lib/libgeos.so /usr/local/lib/cockroach/
sudo cp -i cockroach-v20.2.5.linux-amd64/lib/libgeos_c.so /usr/local/lib/cockroach/
# test
which cockroach20
cockroach20 demo
# start the DBMS (to make initial node files)
cockroach20 start-single-node --insecure --store=node20 --listen-addr=localhost:26259 --http-port=28080 --max-sql-memory=1GB --background
# quit
cockroach20 quit --insecure --host=localhost:26259
# import DB
mkdir -p node20/extern
wget https://gts3.org/~jjung/tpcc-perf/tpcc_cr.tar.gz
tar xzvf tpcc_cr.tar.gz
cp tpcc_cr.sql node20/tpcc.sql
# start the DBMS again and createdb
cockroach20 sql --insecure --host=localhost:26259 --execute="CREATE DATABASE IF NOT EXISTS cockroachdb;"
--cockroach20 sql --insecure --host=localhost:26259 --execute="DROP DATABASE cockroachdb;"
cockroach20 sql --insecure --host=localhost:26259 --database=cockroachdb --execute="IMPORT PGDUMP 'nodelocal://self/tpcc.sql';"
# test
cockroach20 sql --insecure --host=localhost:26259 --database=cockroachdb --execute="explain analyze select count(*) from order_line;"
# run query
cockroach20 sql --insecure --host=localhost --port=26259 --database=cockroachdb < query.sql
Mysql (from package manager)
# install DBMS
sudo apt install mysql-server
sudo mysql_secure_installation
sudo su
mysql
SHOW VARIABLES LIKE 'validate_password%';
SET GLOBAL validate_password_length=4;
SET GLOBAL validate_password_policy = 0;
CREATE USER 'mysql'@'localhost' IDENTIFIED BY 'mysql';
alter user 'root'@'localhost' identified by 'mysql';
# modify the conf (should add "skip-grant-tables" under [mysqld])
sudo vim /etc/mysql/mysql.conf.d
# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_my.tar.gz
tar xzvf tpcc_my.tar.gz
mysql -u mysql -pmysql -e "create database mysqldb"
mysql -u mysql -pmysql mysqldb < tpcc_my.sql
# test
mysql -u mysql -pmysql mysqldb -e "show tables"
mysql -u mysql -pmysql mysqldb -e "select * from customer"
# run query
mysql -u mysql -pmysql mysqldb < query.sql
Postgre (from SRC)
# build latest postgres
git clone https://github.com/postgres/postgres.git
mkdir bld
cd bld
../configure
make -j 20
# install DBMS
sudo su
make install
adduser postgres
rm -rf /usr/local/pgsql/data
mkdir /usr/local/pgsql/data
chown -R postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
/usr/local/pgsql/bin/createdb postgresdb
/usr/local/pgsql/bin/psql postgresdb
/usr/local/pgsql/bin/createuser -s {username}
/usr/local/pgsql/bin/createdb {username}
/usr/local/pgsql/bin/psql
=# alter {username} with superuser
# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_pg.tar.gz
tar xzvf tpcc_pg.tar.gz
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -f tpcc_pg.sql
# test
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -c "select * from warehouse"
# run query
/usr/local/pgsql/bin/psql -p 5432 -d postgresdb -f query.sql
Sqlite (from SRC)
# build latest sqlite from src
git clone https://github.com/sqlite/sqlite.git
cd sqlite
mkdir bld
cd bld
../configure
make -j 20
# install DBMS
sudo make install
# import DB
wget https://gts3.org/~jjung/tpcc-perf/tpcc_sq.tar.gz
tar xzvf tpcc_sq.tar.gz
# test
sqlite3 tpcc_sq.db
sqlite> select * from warehouse;
# run query
sqlite3 tpcc_sq.db < query.sql
(2.1) By Dan Kennedy (dan) on 2021-02-24 17:21:02 edited from 2.0 in reply to 1 [link] [source]
Thanks for posting these. Very interesting indeed.
All of them seem to be cases where the planner should be able to determine that no work (or very little) is required. I guess that's a product of your 1000x threshold - it's hard to be 1000x slower unless you miss a really good shortcut! I think instances where one db is consistently more than (say) 10x slower would be quite interesting as well.
Anyway, for the record these all seem to come down to two things:
(1) SQLite is not taking advantage of NOT NULL constraints when processing IS NULL. With SQLite, if you do:
CREATE TABLE t1(a NOT NULL);
SELECT * FROM t1 WHERE a IS NULL;
SQLite does a full-table scan. But it should be possible to determine at compile-time that any "col IS NULL" expression for which col has a NOT NULL constraint and does not come from a table on the right-hand-side of a LEFT JOIN is always false.
(2) SQLite is not ignoring columns in the result-set of an EXISTS(SELECT ...) when determining whether or not the SELECT is correlated. i.e. if you do:
CREATE TABLE t1(a, b);
CREATE TABLE t2(x, y);
SELECT * FROM t1 WHERE EXISTS (SELECT a FROM t2 WHERE x=1);
then SQLite runs the sub-query once for every row of t1 because it thinks the "a" in the sub-query makes it correlated. It doesn't - the sub-query only has to be run once.
Naive changes to support the two optimizations above fix most of these examples. But there are a couple of tricks. For example, query 1753 features the equivalent of:
CREATE TABLE t1(a, b);
CREATE TABLE t2(x, y NOT NULL);
SELECT * FROM t1 LEFT JOIN t2 WHERE t1.a=t2.x AND y IS NULL;
In the query, the LEFT JOIN can be transformed to a regular join (because t1.a=t2.x implies that t2.x cannot be NULL) which allows the database to determine that "y IS NULL" will always be false. But currently, by the time SQLite realizes it can transform the join it's tricky to tell exactly which LEFT JOIN (there may be many in the query) column "y" came from.
There's a similar problem in 1948:
CREATE TABLE t1(a, b);
CREATE TABLE t2(x, y NOT NULL);
CREATE TABLE t3(c, d);
SELECT * FROM t1 LEFT JOIN t2 WHERE t1.a=t2.x EXISTS (SELECT 1 FROM t3 WHERE t2.y IS NULL OR t3.c=10);
By the time the LEFT JOIN is transformed to a regular join, the sub-select has already been marked as correlated. So even though the "t2.y IS NULL" is transformed to "false", making the sub-select uncorrelated, SQLite still runs it once for every row visited by the query.
Apparently, other database engines do better in these cases.
I'm not sure how much of a rush we'll be in to fix these specific cases. Something for after 3.35.0 anyway.
Thanks,
Dan.
(3) By Jinho Jung (jinhojun) on 2021-02-25 18:28:15 in reply to 2.1 [link] [source]
We appreciate your feedback on our report.
From our evaluation, we have other queries that also showed a slow-down. Based on your feedback, we will apply a filter rule to remove all queries that have the same reason described above, and then we will report new cases if they are not from the same root cause.
Thanks, Jinho Jung
(4) By Keith Medcalf (kmedcalf) on 2021-02-25 18:54:34 in reply to 2.1 [link] [source]
Naive changes to support the two optimizations above fix most of these examples. But there are a couple of tricks. For example, query 1753 features the equivalent of:
CREATE TABLE t1(a, b);
CREATE TABLE t2(x, y NOT NULL);
SELECT * FROM t1 LEFT JOIN t2 WHERE t1.a=t2.x AND y IS NULL;In the query, the LEFT JOIN can be transformed to a regular join (because t1.a=t2.x implies that t2.x cannot be NULL) which allows the database to determine that "y IS NULL" will always be false. But currently, by the time SQLite realizes it can transform the join it's tricky to tell exactly which LEFT JOIN (there may be many in the query) column "y" came from.
I would say that this case (and the one following) are even simpler. There is no ON condition attached to the RHS table of the left join which means that the join type cannot be an outer join and must be an inner join and this can be done even before examination of the where clause.
(5) By Jinho Jung (jinhojun) on 2021-02-28 18:13:43 in reply to 4 [link] [source]
Dan and Keith,
We appreciate your feedback! Based on the comments above, we applied a filter to remove duplicated cases where "IS NULL" or "EXISTS (" caused the slowdown. Now we have a few queries.
For example, this query runs x300 slower than other DBMSs. It would be great if you can confirm this one as another unique class so that we can add this pattern to our testing machine.
select ref_1.o_w_id as c0
from main.customer as ref_0
inner join main.orders as ref_1
on (ref_0.c_credit_lim is not NULL)
inner join main.warehouse as ref_2
on (ref_0.c_street_2 = ref_2.w_street_1 )
where (ref_0.c_delivery_cnt is not NULL)
and (ref_1.o_entry_d is not NULL);
- Query files link:
$ wget https://gts3.org/~jjung/report2/sq.tar.gz
- Execution result (execution time (second))
Filename | Postgres | Mysql | Cockroachdb | Sqlite | Ratio | QuerySize |
---|---|---|---|---|---|---|
22620 | 0.03366 | 0.026 | 0.00765 | 5.0 | 653.18 | 1319 |
9681 | 0.00792 | 0.031 | 0.05196 | 5.0 | 631.14 | 484 |
19401 | 0.02936 | 0.085 | 0.10491 | 5.0 | 170.28 | 475 |
12652 | 0.01508 | 0.042 | 0.04319 | 5.0 | 331.54 | 307 |
24172 | 0.00291 | 0.015 | 0.33168 | 2.223 | 763.38 | 704 |