From: | David Fuhry <dfuhry(at)cs(dot)kent(dot)edu> |
---|---|
To: | Gavin Sherry <swm(at)alcove(dot)com(dot)au> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Chris Browne <cbbrowne(at)acm(dot)org> |
Subject: | Re: PostgreSQL - 'SKYLINE OF' clause added! |
Date: | 2007-03-07 16:28:22 |
Message-ID: | 45EEE826.5010407@cs.kent.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
The query Ranbeer gave - as with any skyline query - can be solved with
just pure SQL:
select * from books b where not exists(
select * from books b2 where
b2.rating >= b.rating and b2.price <= b.price and
(b2.rating > b.rating or b2.price < b.price)
);
book_name | rating | price
-------------------+--------+-------
Prodigal Daughter | 3 | 250
The Notebook | 4 | 300
Fountain Head | 5 | 350
(3 rows)
The idea of the BNL (block nested loop) skyline algorithm is to avoid
the nested loop by storing "dominating" records as the query proceeds -
in the above example, records which are relatively high in rating and
low in price - and comparing each candidate record to those first.
BNL is the most reasonable skyline algorithm in the absence of a
multidimensional (usually R-Tree) index on the columns. For answering
skyline queries where such an index exists over all query columns, the
most broadly used generalized algorithm is BBS [1].
Thanks,
Dave Fuhry
[1] Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2005. Progressive
skyline computation in database systems. ACM Trans. Database Syst. 30, 1
(Mar. 2005), 41-82. DOI= http://doi.acm.org/10.1145/1061318.1061320
Gavin Sherry wrote:
> On Tue, 6 Mar 2007, Alvaro Herrera wrote:
>
>> Also, keep in mind that there were plenty of changes in the executor.
>> This stuff is not likely to be very easy to implement efficiently using
>> our extant executor machinery; note that Ranbeer mentioned
>> implementation of "block nested loop" and other algorithms. Not sure
>> how easy would be to fold that stuff into the optimizer for multi-input
>> aggregates, instead of hardwiring it to the SKYLINE OF syntax.
>>
>
> Yes, there's been a lot of working on calculating skyline efficiently,
> with different sorting techniques and so on. This is the most interesting
> part of the idea. You could calculate the query Ranbeer gave using pure
> SQL and, perhaps, use of some covariance aggregates or something already.
> Of course, it gets harder when you want to calculate across many
> dimensions.
>
> Personally, I'd love to see some of these newer data analysis
> capabilities added to PostgreSQL -- or at least put out there as
> interesting patches.
>
> Thanks,
>
> Gavin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
From | Date | Subject | |
---|---|---|---|
Next Message | Zeugswetter Andreas ADI SD | 2007-03-07 16:45:51 | Re: Auto creation of Partitions |
Previous Message | Zoltan Boszormenyi | 2007-03-07 16:22:13 | Test report on GENERATED/IDENTITY |