EXISTS by itself vs SELECT EXISTS much slower in query.

From: Jimmy A <jimmypsql(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: EXISTS by itself vs SELECT EXISTS much slower in query.
Date: 2021-11-07 22:27:14
Message-ID: CANU97yu0HxOdgZ_2J+kNRzeuh-XF4h2EYF8df2i--ruP0fuXwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

A description of what you are trying to achieve and what results you
expect.:
I have two equivalent queries, one with an EXISTS clause by itself and one
wrapped in a (SELECT EXISTS) and the "naked" exists is much slower.
I would expect both to be the same speed / have same execution plan.

-- slow
explain (analyze, buffers)
SELECT
parent.*,
EXISTS (SELECT * FROM child WHERE child.parent_id=parent.parent_id) AS
child_exists
FROM parent
ORDER BY parent_id LIMIT 10;

-- fast
explain (analyze, buffers)
SELECT
parent.*,
(SELECT EXISTS (SELECT * FROM child WHERE
child.parent_id=parent.parent_id)) AS child_exists
FROM parent
ORDER BY parent_id LIMIT 10;

-- slow
https://explain.depesz.com/s/DzcK

-- fast
https://explain.depesz.com/s/EftS

Setup:
CREATE TABLE parent(parent_id BIGSERIAL PRIMARY KEY, name text);
CREATE TABLE child(child_id BIGSERIAL PRIMARY KEY, parent_id bigint
references parent(parent_id), name text);

-- random name and sequential primary key for 100 thousand parents.
INSERT INTO parent
SELECT
nextval('parent_parent_id_seq'),
md5(random()::text)
FROM generate_series(1, 100000);

-- 1 million children.
-- set every odd id parent to have children. even id parent gets none.
INSERT INTO child
SELECT
nextval('child_child_id_seq'),
((generate_series/2*2) % 100000)::bigint + 1,
md5(random()::text)
FROM generate_series(1, 1000000);

CREATE INDEX ON child(parent_id);
VACUUM ANALYZE parent, child;

Both queries return the same results - I have taken a md5 of both queries
without the LIMIT clause to confirm.
Tables have been vacuumed and analyzed.
No other queries are being executed.
Reproducible with LIMIT 1 or LIMIT 100 or LIMIT 500.
Changing work_mem makes no difference.

-[ RECORD 1 ]--+---------
relname | parent
relpages | 935
reltuples | 100000
relallvisible | 935
relkind | r
relnatts | 2
relhassubclass | f
reloptions |
pg_table_size | 7700480
-[ RECORD 2 ]--+---------
relname | child
relpages | 10310
reltuples | 1e+06
relallvisible | 10310
relkind | r
relnatts | 3
relhassubclass | f
reloptions |
pg_table_size | 84516864

PostgreSQL version number you are running:
PostgreSQL 13.4 on arm-apple-darwin20.5.0, compiled by Apple clang version
12.0.5 (clang-1205.0.22.9), 64-bit

How you installed PostgreSQL:
Using homebrew for mac.
brew install postgres

Changes made to the settings in the postgresql.conf file: see Server
Configuration for a quick way to list them all.
checkpoint_completion_target | 0.9 | configuration file
checkpoint_timeout | 30min | configuration file
client_encoding | UTF8 | client
cpu_tuple_cost | 0.03 | configuration file
effective_cache_size | 4GB | configuration file
log_directory | log | configuration file
log_min_duration_statement | 25ms | configuration file
log_statement | none | configuration file
log_temp_files | 0 | configuration file
log_timezone | America/Anchorage | configuration file
maintenance_work_mem | 512MB | configuration file
max_parallel_maintenance_workers | 2 | configuration file
max_parallel_workers | 4 | configuration file
max_parallel_workers_per_gather | 4 | configuration file
max_stack_depth | 2MB | environment
variable
max_wal_size | 10GB | configuration file
max_worker_processes | 4 | configuration file
min_wal_size | 80MB | configuration file
random_page_cost | 1.1 | configuration file
shared_buffers | 512MB | configuration file
shared_preload_libraries | auto_explain | configuration file
track_io_timing | on | configuration file
vacuum_cost_limit | 1000 | configuration file
wal_buffers | 64MB | configuration file
wal_compression | on | configuration file
work_mem | 128MB | configuration file

Operating system and version:
macOS Big Sur 11.2.3
I have confirmed this to happen on ubuntu linux however.

What program you're using to connect to PostgreSQL:
psql

Is there anything relevant or unusual in the PostgreSQL server logs?:
no

Hardware specs:
MacBook Air10,1 M1
8GB RAM
APPLE SSD AP0512Q 500.28GB

Attachment Content-Type Size
setup.sql application/octet-stream 1.2 KB

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vasya Boytsov 2021-11-08 09:22:13 Re: EXISTS by itself vs SELECT EXISTS much slower in query.
Previous Message Tomas Vondra 2021-11-05 17:29:37 Re: PostgreSQLv14 TPC-H performance GCC vs Clang