From: | Troels Arvin <troels(at)arvin(dot)dk> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Quota query with decent performance? |
Date: | 2003-11-11 22:58:09 |
Message-ID: | pan.2003.11.11.22.58.08.758402@arvin.dk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Hello,
I'm researching how "quota queries" (a term used by Fabian Pascal) may be
performed in various DBMSes with acceptable performance:
http://troels.arvin.dk/db/rdbms/#select-limit-simple-note
An example of a quota query could be to get the top-3 youngest people from
a collection of people. The complicated part is that such a query might
return more than 3 rows in some tie situations.
In MSSQL and DB2 there are very efficient facilities for such queries, but
I can't find any well-performing declarative methods for PostgreSQL. I
have tried a couple of different strategies, and I currently get the best
results from a correlated subquery like
SELECT * FROM person AS px
WHERE (
SELECT COUNT(*)
FROM person AS py
WHERE py.age < px.age
) < 3;
When my base table has 4000 rows, my query takes 27 seconds in PostgreSQL
7.2.3 (PIII 1000MHz) which is clearly unacceptable, especially comparing
to the same query in DB2 which only takes 1.4 seconds (on the same server)
- or to this non-standard-SQL DB2-query which only takes 0.02 seconds to
calculate the same result:
SELECT *
FROM (
SELECT firstname,age,RANK() OVER (ORDER BY age ASC) AS rank
FROM person
) AS foo
WHERE rank<=3;
Test-files with table definitions and randomly generated rows:
http://troels.arvin.dk/db/tests/quota.1/
Any suggestions on how to perform fast "quota queries" in PostgreSQL?
--
Greetings from Troels Arvin, Copenhagen, Denmark
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2003-11-11 23:41:34 | Re: Quota query with decent performance? |
Previous Message | Nick Fankhauser | 2003-11-11 21:23:02 | Re: Is there a more elegant way to write this query?... |