From: | Robert Klemme <shortcutter(at)googlemail(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Li Jin <ljin(at)tripadvisor(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Performance penalty when using WITH |
Date: | 2011-08-03 17:30:46 |
Message-ID: | CAM9pMnNpZ4Sq5okZYGJdojki16a1nsLXXr-UuFDWJT5UfAoAHQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, Aug 3, 2011 at 6:24 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Aug 3, 2011 at 2:18 AM, Robert Klemme
> <shortcutter(at)googlemail(dot)com> wrote:
>>>> Another observation: That criterion looks suspicious to me. I would
>>>> expect any RDBMS to be better able to optimize this:
>>>>
>>>> WHERE firstname = 'Eddie' AND lastname like 'T%'
>>>>
>>>> I know it's semantically not the same but I would assume this is good
>>>> enough for the common usecase. Plus, if there is an index on
>>>> (firstname, lastname) then that could be used.
>>>
>>> disagree. just one of the ways that could be stymied would to change
>>> the function behind the '||' operator.
>>
>> I don't understand what you mean. Can you please elaborate?
>>
>> To explain my point a bit: I meant that by querying individual fields
>> separately instead of applying a criterion on a function of the two
>> the RDBMS has a better chance to use indexes and come up with a better
>> plan for this part of the query.
>
> Yes, but your assuming that it is safe and generally advantageous to
> do that. Both assumptions I think are false.
I am not sure why you say I assume this is _safe_. I said it is "good
enough for the common usecase". And it is certainly good enough for
this particular query.
As for the "generally advantageous" I'd say that an index on "raw"
column values is usually useful for more queries than an index on a
specific function. That's why I'd say generally an index on column
values is more versatile and I would prefer it. Of course you might
achieve orders of magnitude of speedup for individual queries with an
index on a function tailored to that particular query but if you need
to do that for multiple queries you pay a higher penalty for updates.
> The || operator is trivially hacked:
> create or replace function funky_concat(l text, r text) returns text as
> $$
> select textcat(textcat($1, 'abc'), $2);
> $$ language sql immutable ;
>
> update pg_operator set oprcode = 'funky_concat' where oid = 654;
>
> postgres=# select 'a' || 'b';
> ?column?
> ----------
> aabcb
> (1 row)
>
> Also even ignoring the above it's not free to have the database try
> and analyze every instance of the || operator to see if it can be
> decomposed to boolean field operations.
Even with your hacked operator you would need an index on the
expression to make it efficient. That could be done with the original
|| as well. But my point was to query
WHERE a = 'foo' and b like 'b%'
instead of WHERE a || ' ' || substring(b, 1, 1) = 'foo b'
to use an index on (a,b). That index would also be useful for queries like
WHERE a = 'foo'
WHERE a like 'fo%'
WHERE a = 'foo' and b = 'bar'
and probably also
WHERE a > 'foo'
WHERE a > 'foo' and b like 'b%'
WHERE a > 'foo' and b = 'bar'
Kind regards
robert
PS: Sorry for the earlier duplicate. Gmail had a hickup.
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Clem Dickey | 2011-08-04 01:53:19 | Re: Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time |
Previous Message | Kevin Grittner | 2011-08-03 17:29:47 | Re: Postgres performance on Linux and Windows |