From: | Dimitrios Apostolou <jimis(at)gmx(dot)net> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions |
Date: | 2024-05-10 19:24:00 |
Message-ID: | 69077f15-4125-2d63-733f-21ce6eac4f01@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, 10 May 2024, Dimitrios Apostolou wrote:
> I noticed that the following query is very very slow (too long to wait for it
> to finish):
>
> SELECT DISTINCT workitem_n FROM test_runs_raw ORDER BY workitem_n DESC LIMIT 10;
Update: even the simplest SELECT DISTINCT query shows similar behaviour:
EXPLAIN
SELECT DISTINCT workitem_n FROM test_runs_raw LIMIT 10;
Limit (cost=724518979.52..724518979.92 rows=10 width=4)
-> Unique (cost=724518979.52..724518987.52 rows=200 width=4)
-> Sort (cost=724518979.52..724518983.52 rows=1600 width=4) Sort Key: test_runs_raw.workitem_n
-> Gather (cost=724518732.37..724518894.37 rows=1600 width=4)
Workers Planned: 4
-> HashAggregate (cost=724517732.37..724517734.37 rows=200 width=4)
Group Key: test_runs_raw.workitem_n
-> Parallel Append (cost=0.00..704131546.90 rows=8154474186 width=4)
-> Parallel Index Only Scan using test_runs_raw__part_max9600k_pkey on test_runs_raw__part_max9600k test_runs_raw_480 (cost=0.57..1429238.50 rows=16811660 width=4)
-> Parallel Index Only Scan using test_runs_raw__part_max10140k_pkey on test_runs_raw__part_max10140k test_runs_raw_507 (cost=0.57..1081827.27 rows=12896836 width=4)
-> Parallel Seq Scan on test_runs_raw__part_max9500k test_runs_raw_475 (cost=0.00..2717185.06 rows=32060806 width=4)
-> Parallel Seq Scan on test_runs_raw__part_max11180k test_runs_raw_559 (cost=0.00..2610814.95 rows=30806095 width=4)
It also takes ages to return, so I have to interrupt it.
I believe it should exit early, as soon as it finds 10 distinct values
(which should be rather easy even with parallel seqscans, given the
pattern followed when inserting the data).
Thanks,
Dimitris
From | Date | Subject | |
---|---|---|---|
Next Message | Dimitrios Apostolou | 2024-05-10 19:35:57 | Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions |
Previous Message | Dimitrios Apostolou | 2024-05-10 18:38:07 | SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions |