From: | Marc Cousin <cousinmarc(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: star schema and the optimizer |
Date: | 2015-02-27 14:27:10 |
Message-ID: | 54F07EBE.5030508@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 27/02/2015 15:08, Tom Lane wrote:
> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
>> So I gave a look at the optimizer's code to try to understand why I got this problem. If I understand correctly, the optimizer won't do cross joins, except if it has no choice.
>
> That's right, and as you say, the planning-speed consequences of doing
> otherwise would be disastrous. However, all you need to help it find the
> right plan is some dummy join condition between the dimension tables,
> which will allow the join path you want to be considered. Perhaps you
> could do something like
>
> SELECT * FROM dim1,dim2,facts WHERE facts.dim1=dim1.a and facts.dim2=dim2.a and dim1.b=12 AND dim2.b=17 and (dim1.a+dim2.a) is not null;
No I can't. I cannot rewrite the query at all, in my context.
What do you mean by disastrous ?
I've given it a few tries here, and with 8 joins (same model, 7
dimensions), planning time is around 100ms. At least in my context, it's
well worth the planning time, to save minutes of execution.
I perfectly understand that it's not something that should be "by
default", that would be crazy. But in a datawarehouse, it seems to me
that accepting one, or even a few seconds of planning time to save
minutes of execution is perfectly legetimate.
From | Date | Subject | |
---|---|---|---|
Next Message | Gilles Darold | 2015-02-27 15:29:11 | Re: Bug in pg_dump |
Previous Message | Anastasia Lubennikova | 2015-02-27 14:19:47 | Re: Index-only scans for GiST. |