From: | Hans-Jürgen Schönig <postgres(at)cybertec(dot)at> |
---|---|
To: | PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | "micro bucket sort" ... |
Date: | 2010-08-11 12:21:10 |
Message-ID: | F90E257C-AD3B-4928-A90E-1DE67D432741@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
hello all ...
i am bugged with a small issue which is basically like this ...
test=# create table t_test as select x, x % 5 as y from generate_series(1, 1000000) AS x;
SELECT
test=# create index idx_aaaaa on t_test (x) ;
CREATE INDEX
test=# ANALYZE ;
ANALYZE
test=# explain analyze select * from t_test order by x;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_aaaaa on t_test (cost=0.00..30408.36 rows=1000000 width=8) (actual time=0.057..311.832 rows=1000000 loops=1)
Total runtime: 392.943 ms
(2 rows)
we know that we get sorted output from the index and thus we do the index traversal here ...
if you add a condition to the sorting you will naturally get a sort in postgres because y is clearly now known to be sorted.
test=# explain analyze select * from t_test order by x, y;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Sort (cost=141431.84..143931.84 rows=1000000 width=8) (actual time=1086.014..1271.257 rows=1000000 loops=1)
Sort Key: x, y
Sort Method: external sort Disk: 17608kB
-> Seq Scan on t_test (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.024..143.474 rows=1000000 loops=1)
Total runtime: 1351.848 ms
(5 rows)
same with limit ...
test=# explain analyze select * from t_test order by x, y limit 20;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Limit (cost=41034.64..41034.69 rows=20 width=8) (actual time=317.939..317.943 rows=20 loops=1)
-> Sort (cost=41034.64..43534.64 rows=1000000 width=8) (actual time=317.934..317.936 rows=20 loops=1)
Sort Key: x, y
Sort Method: top-N heapsort Memory: 26kB
-> Seq Scan on t_test (cost=0.00..14425.00 rows=1000000 width=8) (actual time=0.019..144.109 rows=1000000 loops=1)
Total runtime: 317.995 ms
(6 rows)
now, the problem is: i cannot easily create additional indexes as i have too many possible "second" conditions here.
what makes it even more funny: i don't have enough space to do the resort of the entire thing (X TB).
so, a more expensive index traversal is my only option.
my question is: is there already a concept out there to make this work or does anybody know of a patch out there addressing an issue like that?
some idea is heavily appreciated. it seems our sort key infrastructure is not enough for this.
many thanks,
hans
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2010-08-11 12:38:34 | Re: Bug / shortcoming in has_*_privilege |
Previous Message | Fujii Masao | 2010-08-11 11:54:06 | pgstat_report_waiting() in hot standby |