From: | "Tzahi Fadida" <tzahi_ml(at)myrealbox(dot)com> |
---|---|
To: | <pgsql-general(at)postgresql(dot)org> |
Cc: | <ana_cata_hylo(at)yahoo(dot)com> |
Subject: | Re: Google Summer of Code: Full Disjunctions |
Date: | 2006-05-08 17:30:17 |
Message-ID: | 015901c672c5$130b76c0$0b00a8c0@llord |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi Lee,
First, i have no knowledge of anyone that have implemented full disjunctions(ever) aside
from the theoretical works of my colleagues.
With the exception of a corner case of it, that I believe was a simulation in 96.
(A. Rajaman and J.D. Ullman Integrating information by outerjoins and full-disjunctions).
I'd love to hear about any implementation out there (aside from my colleagues work, which
is mine also: cohen,sagiv, kimelfeld,kanza)
It can never be a binary operation since at the heart of the matter is that you need to take
each subset of the relations and join them. i.e.:
A
B
C
A,B
A,C
B,C
A,B,C
and remove subsumptions, for example if you have {t_A,t_B} from the A,B join,
you will not include just {t_A} or just {t_B} in the result (subsumption).
Usually binary operations allow for a bottom up computation approach, but FD is a TOP down approach
(Galindo-Legaria, C. outerjoins as disjunctions).
The answer what to do is a bit complex and part of it is in a paper that is currently in review,
however:
You take the cyclic component, for example FD(A,B,C,D) and then join it with another relation E.
It's not a trouble to write (A fd B) fd C but it will still have to be computed as one computation -
FD(A,B,C).
Again, note that FD is only useful for cases you can't use natural outer joins.
So only if you have a cycle you use FD.
In my current implementation I do getfdr(A,B,C) to get the RECORD since with FD you cannot
know in advance what will be the eventual RECORD (unless you save the RECORD type for A,B,C
and A,B,C,D etc...)
then I run, for example, FD('A,B,C') as RECORD (a0 int, ....)
As you can see, you can now join other relations to the result, it's a standard recordset.
So (FD('A,B,C') as RECORD (a0 int, ....))
Here is an example of a RUN (ignore all the additional parameters):
select getfdr('rand_1_0,rand_1_1,rand_1_2');
getfdr
---------------------------
(a0 int4,a1 int4,a2 int4)
(1 row)
select * from odmbfd('rand_1_0,rand_1_1,rand_1_2',true,true,100,'random',10,10) AS RECORD (a0 int,
a1 int, a2 int) natural full outer join rand_4_2;
a2 | a0 | a1 | a3
------+------+------+------
1 | | 1813 |
1 | | 3091 |
1 | | 316 |
2 | 2113 | | 2738
3 | | 2924 | 823
3 | | 2924 | 2735
.....
.....
select count(*) from odmbfd('rand_1_0,rand_1_1,rand_1_2',true,true,100,'random',10,10) AS RECORD (a0
int, a1 int, a2 int) natural full outer join rand_4_2;
count
-------
3201
(1 row)
>Hi Tzahi
>
>Since you seem to know about full disjunctions, I was wondering if you
>could answer this question:
>
>Can full disjunction be implemented as a binary operator?
>
>The algorithms I've seen for computing it seem to require that all
>input relations be supplied at once. Even in your message above you
>specified your example as FD(A,B,C) rather than say (A fd B) fd C.
>So I was wondering if the latter is possible?
>
>The reason I ask is that I'm currently trying to solve a problem that
>could in principle be solved using FD but it would fit into my current
>framework much better if it were a binary operator like the other
>joins.
>
>Thanks in advance.
>Lee
From | Date | Subject | |
---|---|---|---|
Next Message | Joao Miguel Ferreira | 2006-05-08 17:43:40 | Re: database size grows (even after vacuum (full and analyze)).... |
Previous Message | Bruno Wolff III | 2006-05-08 17:07:24 | Re: database size grows (even after vacuum (full and analyze)).... |