From: | Nandakumar M <m(dot)nanda92(at)gmail(dot)com> |
---|---|
To: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Need help with optimising simple query |
Date: | 2018-07-09 11:47:23 |
Message-ID: | CANcFUu5a9W_MORQa5Tv9vsofN_w_irU9-OBQtN34YmL7p4gP_g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi,
I am having a query that has an order by and a limit clause. The
column on which I am doing order by is indexed (default b tree index).
However the index is not being used. On tweaking the query a bit I
found that when I use left join index is not used whereas when I use
inner join the index is used.
Unfortunately, the behaviour we expect is that of left join only. My
question is, is there any way to modify/improve the query to improve
the query speed or is this the best that is possible for this case.
Please find below a simplified version of the queries. I tried the
queries on 9.3 and 10 versions and both gave similar results.
Table structure
performance_test=# \d+ child
Table "public.child"
Column | Type | Collation | Nullable | Default
| Storage | Stats target | Description
--------+--------+-----------+----------+-----------------------------------+----------+--------------+-------------
id | bigint | | not null |
nextval('child_id_seq'::regclass) | plain | |
name | text | | not null |
| extended | |
Indexes:
"child_pkey" PRIMARY KEY, btree (id)
"child_name_unique" UNIQUE CONSTRAINT, btree (name)
Referenced by:
TABLE "parent" CONSTRAINT "parent_child_id_fkey" FOREIGN KEY
(child_id) REFERENCES child(id)
performance_test=# \d+ parent
Table "public.parent"
Column | Type | Collation | Nullable | Default
| Storage | Stats target | Description
----------+--------+-----------+----------+------------------------------------+----------+--------------+-------------
id | bigint | | not null |
nextval('parent_id_seq'::regclass) | plain | |
name | text | | not null |
| extended | |
child_id | bigint | | |
| plain | |
Indexes:
"parent_pkey" PRIMARY KEY, btree (id)
"parent_name_unique" UNIQUE CONSTRAINT, btree (name)
"parent_child_id_idx" btree (child_id)
Foreign-key constraints:
"parent_child_id_fkey" FOREIGN KEY (child_id) REFERENCES child(id)
Query used to populate data
performance_test=# insert into child(name) select concat('child ',
gen.id) as name from (select generate_series(1,100000) as id) as gen;
performance_test=# insert into parent(name, child_id) select
concat('parent ', gen.id) as name, (id%100000) + 1 from (select
generate_series(1,1000000) as id) as gen;
Left join with order by using child name
performance_test=# explain analyze select * from parent left join
child on parent.child_id = child.id order by child.name limit 10;
QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=69318.55..69318.58 rows=10 width=59) (actual
time=790.708..790.709 rows=10 loops=1)
-> Sort (cost=69318.55..71818.55 rows=1000000 width=59) (actual
time=790.705..790.706 rows=10 loops=1)
Sort Key: child.name
Sort Method: top-N heapsort Memory: 27kB
-> Hash Left Join (cost=3473.00..47708.91 rows=1000000
width=59) (actual time=51.066..401.028 rows=1000000 loops=1)
Hash Cond: (parent.child_id = child.id)
-> Seq Scan on parent (cost=0.00..17353.00
rows=1000000 width=29) (actual time=0.026..67.848 rows=1000000
loops=1)
-> Hash (cost=1637.00..1637.00 rows=100000 width=19)
(actual time=50.879..50.879 rows=100000 loops=1)
Buckets: 65536 Batches: 2 Memory Usage: 3053kB
-> Seq Scan on child (cost=0.00..1637.00
rows=100000 width=19) (actual time=0.018..17.281 rows=100000 loops=1)
Planning time: 1.191 ms
Execution time: 790.797 ms
(12 rows)
Inner join with sorting according to child name
performance_test=# explain analyze select * from parent inner join
child on parent.child_id = child.id order by child.name limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.84..2.03 rows=10 width=59) (actual time=0.156..0.193
rows=10 loops=1)
-> Nested Loop (cost=0.84..119132.56 rows=1000000 width=59)
(actual time=0.154..0.186 rows=10 loops=1)
-> Index Scan using child_name_unique on child
(cost=0.42..5448.56 rows=100000 width=19) (actual time=0.126..0.126
rows=1 loops=1)
-> Index Scan using parent_child_id_idx on parent
(cost=0.42..1.04 rows=10 width=29) (actual time=0.019..0.045 rows=10
loops=1)
Index Cond: (child_id = child.id)
Planning time: 0.941 ms
Execution time: 0.283 ms
(7 rows)
Version
performance_test=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 10.4 (Ubuntu 10.4-2.pgdg14.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4, 64-bit
(1 row)
Any help from Postgres experts would be great.
Thanks,
Nanda
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-07-09 14:23:47 | Re: Need help with optimising simple query |
Previous Message | Mariel Cherkassky | 2018-07-09 06:18:28 | Re: where can I download the binaries of plpython extension |