Re: Changing SQL Inlining Behaviour (or...?)

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Paul Ramsey <pramsey(at)cleverelephant(dot)ca>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Changing SQL Inlining Behaviour (or...?)
Date: 2018-11-14 13:59:02
Message-ID: CAC8Q8tLjt0td4Jpt672T-RQ+UU-s2rxUVRWH4p8OJOn85hHUYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
>> At pgconf.eu, I canvassed this problem and some potential solutions:
>>
>

I wonder if there is a middle ground between #2 and #3. A proper mechanism
for deduplicating entries might be hard, but on the inlining stage we
already know they're going to get duplicated. Can we make a subplan/lateral
join/whatever for arguments and deduplicate just them, as we know about
that duplication from structure already?

Another thing I see here is PostGIS using indexes. Can we just always
inline if we see an index-accelerated operator (or just an operator) on top
level AND inside inlined function?

> * Solution #1 - Quick and dirty and visible: Add an 'INLINE' function
>> decorator, which tells PostgreSQL to just ignore costs and inline the
>> function regardless. Pros: it's not too hard to implement and I'm happy to
>> contribute this. Cons: it adds very specific single-purpose syntax to
>> CREATE FUNCTION.
>>
>> * Solution #2 - Quick and dirty and invisible. Tom suggested a hack that
>> achieves the aims of #1 but without adding syntax to CREATE FUNCTION: have
>> the inlining logic look at the cost of the wrapper and the cost of
>> parameters, and if the cost of the wrapper "greatly exceeded" the cost of
>> the parameters, then inline. So the PostGIS project would just set the cost
>> of our wrappers very high, and we'd get the behaviour we want, while other
>> users who want to use wrappers to force caching of calculations would have
>> zero coded wrapper functions. Pros: Solves the problem and easy to
>> implement, I'm happy to contribute. Cons: it's so clearly a hack involving
>> hidden (from users) magic.
>>
>> * Solution #3 - Correct and globally helpful. When first presented with
>> this problem last year, both Andres and Tom said [1] "but the right fix is
>> to avoid the double-calculation of identical entries in the target list"
>> because then it would be safe to inline functions with duplicate expensive
>> parameters. This would not only address the proximate PostGIS problem but
>> make a whole class of queries faster. There was some discussion of this
>> approach last week [2]. Pros: The right thing! Improves a whole pile of
>> other performance cases. Cons: Hard! Only experienced PgSQL developers need
>> apply.
>>
>> Naturally, I would love to see #3 implemented, but there's only so much
>> experienced developer time to go around, and it's beyond my current skill
>> set. I would like to be able to start to improve PostGIS parallelism with
>> PgSQL 12, so in order to make that not impossible, I'd like to implement
>> either #1 or #2 in case #3 doesn't happen for PgSQL 12.
>>
>> So my question to hackers is: which is less worse, #1 or #2, to implement
>> and submit to commitfest, in case #3 does not materialize in time for PgSQL
>> 12?
>>
>
> Absent any preferences, I would be inclined to go with #2, having a high
> personal tolerance for ugly hacks... :)
>
> P
>
>
>
>>
>> [1]
>> https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv36uh%40alap3.anarazel.de
>> [2]
>> https://www.postgresql.org/message-id/10355.1540926295%40sss.pgh.pa.us
>>
>> --
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-11-14 14:02:46 Re: pgsql: Add flag values in WAL description to all heap records
Previous Message Pavel Stehule 2018-11-14 13:09:58 Re: proposal: simple query profile and tracing API