Why DISTINCT ... DESC is slow?

From: Anton <anton200(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Why DISTINCT ... DESC is slow?
Date: 2006-12-12 07:28:22
Message-ID: 8cac8dd0612112328m5ac0520m7e3d9778a9d14b47@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi. With this table (about 800 000 rows):

=# \d n_traffic
Table "public.n_traffic"
Column | Type | Modifiers
--------------+-----------------------------+------------------------------
login_id | integer | not null
traftype_id | integer | not null
collect_time | timestamp without time zone | not null default now()
bytes_in | bigint | not null default (0)::bigint
bytes_out | bigint | not null default (0)::bigint
Indexes:
"n_traffic_collect_time" btree (collect_time)
"n_traffic_login_id" btree (login_id)
"n_traffic_login_id_collect_time" btree (login_id, collect_time)
Foreign-key constraints:
"n_traffic_login_id_fkey" FOREIGN KEY (login_id) REFERENCES
n_logins(login_id) ON UPDATE CASCADE
"n_traffic_traftype_id_fkey" FOREIGN KEY (traftype_id) REFERENCES
n_traftypes(traftype_id) ON UPDATE CASCADE

=# explain analyze SELECT DISTINCT ON (login_id) login_id,
collect_time AS dt FROM n_traffic ORDER BY login_id collect_time DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Unique (cost=91711.13..95669.41 rows=532 width=12) (actual
time=10698.860..13790.918 rows=798 loops=1)
-> Sort (cost=91711.13..93690.27 rows=791656 width=12) (actual
time=10698.851..12140.496 rows=791656 loops=1)
Sort Key: login_id, collect_time
-> Seq Scan on n_traffic (cost=0.00..14150.56 rows=791656
width=12) (actual time=0.013..2776.572 rows=791656 loops=1)
Total runtime: 14049.378 ms
(5 rows)

For me it is strange that postgres uses Seq Scan, but set
enable_seqscan = off don't get any changes.

While without DESC query goes faster... But not so fast!
=# explain analyze SELECT DISTINCT ON (login_id) login_id,
collect_time AS dt FROM n_traffic ORDER BY login_id collect_time;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=0.00..29843.08 rows=532 width=12) (actual
time=0.045..5146.768 rows=798 loops=1)
-> Index Scan using n_traffic_login_id_collect_time on n_traffic
(cost=0.00..27863.94 rows=791656 width=12) (actual
time=0.037..3682.853 rows=791656 loops=1)
Total runtime: 5158.735 ms
(3 rows)

Why? 768 rows is about 1000 times smaller than entire n_traffic. And
why Index Scan used without DESC but with DESC is not?
--
engineer

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2006-12-12 07:43:04 Re: Why DISTINCT ... DESC is slow?
Previous Message Ashish Karalkar 2006-12-12 06:57:39 8.2.0 Installation Problem