PostgreSQL optimizer use seq scan instead of pkey index only scan (in queries with postgres_fdw)

From: Vitaly Baranovsky <barvetalforums(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: PostgreSQL optimizer use seq scan instead of pkey index only scan (in queries with postgres_fdw)
Date: 2019-05-06 14:43:39
Message-ID: CALWbrdE9pw4+sQjOpa_VFWfLX+A4-kjUrtYEWzxZ7jcX-emRLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello all,

I faced strange behavior of PostgreSQL during the query execution.

So, I have to databases: local and foreign. There are foreign server
definitions in the local database (via postgres_fdw). The local database
has table 'local_table'. The foreign database has table 'foreign_table'.
Both of them have only 1 column: 'primary_uuid'. This column in both
databases is a primary key column. Schema on a local server that stores
remote server definitions is 'foreign_server'. Each table has 100K rows.
Vacuum analyze has been run for both servers.

When I run a query:
SELECT *
FROM
(
SELECT foreign_table.primary_uuid
FROM foreign_server.foreign_table
UNION ALL
SELECT local_table.primary_uuid
FROM local_table
)
join_view
WHERE
join_view.primary_uuid in (select
'19b2db7e-db89-48eb-90b1-0bd468a2346b'::uuid)

I expect that the server will use the pkey index for the local table. But
it uses seq scan instead!

"Hash Semi Join (cost=100.03..3346.23 rows=51024 width=16) (actual
time=482.235..482.235 rows=0 loops=1)"
" Output: foreign_table.primary_uuid"
" Hash Cond: (foreign_table.primary_uuid =
('ef89a151-3eab-42af-8ecc-8850053aa1bb'::uuid))"
" -> Append (cost=100.00..2510.68 rows=102048 width=16) (actual
time=0.529..463.563 rows=200000 loops=1)"
" -> Foreign Scan on foreign_server.foreign_table
(cost=100.00..171.44 rows=2048 width=16) (actual time=0.528..446.715
rows=100000 loops=1)"
" Output: foreign_table.primary_uuid"
" Remote SQL: SELECT primary_uuid FROM public.foreign_table"
" -> Seq Scan on public.local_table (cost=0.00..1829.00
rows=100000 width=16) (actual time=0.021..6.358 rows=100000 loops=1)"
" Output: local_table.primary_uuid"
" -> Hash (cost=0.02..0.02 rows=1 width=16) (actual time=0.006..0.006
rows=1 loops=1)"
" Output: ('ef89a151-3eab-42af-8ecc-8850053aa1bb'::uuid)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" -> Result (cost=0.00..0.01 rows=1 width=16) (actual
time=0.001..0.001 rows=1 loops=1)"
" Output: 'ef89a151-3eab-42af-8ecc-8850053aa1bb'::uuid"
"Planning Time: 0.126 ms"
"Execution Time: 482.572 ms""Execution Time: 509.315 ms"

So, as you can see, the execution time is 509 ms! It could be very fast if
PostgreSQL used primary key index!

Also, please, note, that SQL without WHERE clause has been set to the
foreign server:
" Remote SQL: SELECT primary_uuid FROM public.foreign_table"

So, the optimizer doesn't select optimal plans for such executions :(

Looks like it's an optimizer inadequacy.

Does someone know, how to optimize this query without query rewriting
(queries like this are generated from the Data Access layer and it's hard
to rebuild that layer)?

Thank you

P.S.: Answers to standard questions:
> PostgreSQL version number you are running:
PostgreSQL 11.2, compiled by Visual C++ build 1914, 64-bit

> How you installed PostgreSQL:
By downloaded standard Windows 64 installer

> Changes made to the settings in the postgresql.conf file:
shared_preload_libraries = '$libdir/pg_stat_statements'

> Operating system and version:
Windows 10 Enterprise 64-bit

> What program you're using to connect to PostgreSQL:
pgAdmin III

> Is there anything relevant or unusual in the PostgreSQL server logs?:
Nope

P.P.S.: DDL scripts:
for the foreign database:
CREATE TABLE public.foreign_table
(
primary_uuid uuid NOT NULL,
CONSTRAINT "PKEY" PRIMARY KEY (primary_uuid)
)

for local database:
CREATE TABLE public.local_table
(
primary_uuid uuid NOT NULL,
CONSTRAINT local_table_pkey PRIMARY KEY (primary_uuid)
)


Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2019-05-06 15:32:03 Re: PostgreSQL optimizer use seq scan instead of pkey index only scan (in queries with postgres_fdw)
Previous Message Tom Lane 2019-04-30 20:38:04 Re: Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan