From: | Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Large CASE-statement is pretty slow? |
Date: | 2004-03-14 23:21:17 |
Message-ID: | 4054E8ED.3090200@vulcanus.its.tudelft.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Tom Lane wrote:
> Arjen van der Meijden <acmmailing(at)vulcanus(dot)its(dot)tudelft(dot)nl> writes:
>
>>Anyway, I was looking into the usefullness of a INSERT INTO newtable
>>SELECT field, field, CASE pkey WHEN x1 THEN y1 WHEN x2 THEN y2 etc END
>>FROM oldtable
>
>
>>The resulting select was about 1.7MB of query-text, mostly composed of
>>the CASE-statement.
>
>
> Hm, you mean one single SELECT, one single CASE? How many WHEN clauses
> exactly? Exactly what did a typical clause of the CASE look like?
Yes, one SELECT-query with one single CASE-statement.
The CASE-statement had the simple-case-structure like:
SELECT CASE UserID WHEN 1 THEN 1 WHEN 34 THEN 2 ... etc
I noticed, by the way, that the ordering is on the THEN y parameter, the
x parameter (WHEN x THEN y) is "more or less increasing".
But some numbers:
The table I did my tests on has 88291 rows, I did the select on the
integer primary key, so the CASE was the only column in the select.
I'm running the query again on a table that has only the primary key of
my original table and it seems to be as slow.
I'm not really sure how many WHEN's there are in that CASE, but it is
supposed to be a relocation of all primary key-values to some other
value, so it will contain some number close to that 88291.
> I wouldn't be too surprised to find some bit of code that's O(N^2) in
> the number of arms of the CASE, or something like that; it's not an area
> that we've ever felt the need to optimize. But I'd like a fairly
> specific test case before trying to look into it.
Well, I have discarded this type of query as "too inefficient" and found
a better way, so don't feel the need to optimize it just because I
noticed it is slow with very large CASEs. Although CASEs with a few
hundred WHENs wont be that uncommon and might improve a bit as well?
I can send you the "primary key only"-table and the query off list if
you want to. That won't make me violate any privacy rule and is probably
a good test case?
Best regards,
Arjen van der Meijden
From | Date | Subject | |
---|---|---|---|
Next Message | scott.marlowe | 2004-03-15 15:17:49 | Re: [PERFORM] syslog slowing the database? |
Previous Message | Aaron Werman | 2004-03-14 22:11:22 | Re: Scaling further up |