From: | Moises Lopez <moylop260(at)vauxoo(dot)com> |
---|---|
To: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | The query plan get all columns but I'm using only one column. |
Date: | 2020-04-24 21:11:12 |
Message-ID: | CAGro9RXViOXNBBA-SZur_UbpWrbkg_Ow4ha13FJu9dOJ=WdB7Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
We have production database that has slow queries because of the query get
all columns even if I'm using only one column.
The result is slow for tables that there are too much columns
The weird part is that there is environment that I can't reproduce it even
if they are using the same postgresql.conf
I didn't find what is the variant/configuration to avoid it
I could reproduce it using the official docker image of postgresql
* Steps to reproduce it
1. Run the following script:
docker run --name psql1 -d -e POSTGRES_PASSWORD=pwd postgres
docker exec -it --user=postgres psql1 psql
# Into docker container
CREATE DATABASE db;
\connect db;
CREATE TABLE link (
ID serial PRIMARY KEY,
url VARCHAR (255) NOT NULL,
name VARCHAR (255) NOT NULL,
description VARCHAR (255),
rel VARCHAR (50)
);
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT l1.url
FROM link l1
JOIN link l2
ON l1.url=l2.url;
2. See result of the Query Plan:
QUERY PLAN
-------------------------------------------------------------------------------------------
Hash Join (cost=10.90..21.85 rows=40 width=516) (actual
time=0.080..0.081 rows=1 loops=1)
Output: l1.url
Hash Cond: ((l1.url)::text = (l2.url)::text)
Buffers: shared hit=5
-> Seq Scan on public.link l1 (cost=0.00..10.40 rows=40 width=516)
(actual time=0.010..0.011 rows=1 loops=1)
* Output: l1.id <http://l1.id>, l1.url, l1.name
<http://l1.name>, l1.description, l1.rel*
Buffers: shared hit=1
-> Hash (cost=10.40..10.40 rows=40 width=516) (actual
time=0.021..0.021 rows=1 loops=1)
Output: l2.url
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=1
-> Seq Scan on public.link l2 (cost=0.00..10.40 rows=40
width=516) (actual time=0.010..0.011 rows=1 loops=1)
Output: l2.url
Buffers: shared hit=1
Planning Time: 0.564 ms
Execution Time: 0.142 ms
3. Notice that I'm using only the column "url" for "JOIN" and "SELECT"
section,
but the "Output" section is returning all columns.
Is there a manner to avoid returning all columns in order to get a better
performance?
Thank you in advance
* PostgreSQL version:
psql postgres -c "SELECT version()"
PostgreSQL 12.2 (Debian 12.2-2.pgdg100+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
Changes made to the settings in the postgresql.conf file: see Server
Configuration for a quick way to list them all.
without changes
Operating system and version:
cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
--
Moisés López
@moylop260
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Janes | 2020-04-24 22:59:41 | Re: PostgreSQL does not choose my indexes well |
Previous Message | Tom Lane | 2020-04-24 19:39:52 | Re: PostgreSQL does not choose my indexes well |