From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Inheritance |
Date: | 2016-05-25 13:28:47 |
Message-ID: | CAHyXU0yxCgDzpE6LdRannwG6_8qYWWX4KWbSzFMYvH_eDi3Ukg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, May 24, 2016 at 9:47 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 24 May 2016 at 22:45, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
> wrote:
>>
>> There is one aspect of inheritance support which was not mentioned:
>> polymorphic queries.
>> Actually polymorphism is the fundamental feature of OOP, without it there
>> is no behavioral inheritance and inheritance can be considered just as
>> "syntax sugar" for sharing some common subset of attributes between tables.
>>
>> The main problem with supporting polymorphic queries is that SQL query
>> returns set of tuples, not set of objects.
>> So there is no nice way to return both instances of based and derived
>> tables. There are several alternatives
>> (for example return joined set of attributes in all derived tables,
>> leaving missed as NULLs) but none of them is good.
>
>
> Exactly. We have a sort-of-object-ish storage option, but none of the
> surrounding stuff to make it useful for actual OO / object-relational work.
>
> The "joined set of attributes" approach is exactly what ORMs already do, and
> many direct implementations of the same idea will use too. So we'd offer no
> advantage over what they already do in a way that works with multiple
> DBMSes, except we might be able to do it faster. Maybe.
>
> The lack of polymorphism is critical. It's not really usefully OO but it
> costs you important relational features if you use it. We have some very
> limited polymorphism in the sense that you can query the parent table and
> see rows in child tables, but you only get the subset of cols that exists at
> that level of the heirarchy.
>
> One thing I'd like to explore one day is a nice, user-friendly way to
> express "SELECT this row and the corresponding sets of rows from [these
> tables and their children in turn] as a structured object". Right now users
> have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
> do multiple queries (n+1 selects), possibly expensively with repeated join
> work involved. Or they have to write pretty baroque queries to construct a
> jsonb object with jsonb_agg with multiple levels of group-by in
> subqueries-in-from. We should be able to do this for them, so they can say
>
> SELECTOBJECT customer
> CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
> invoice_date > $1)
> CHILD JOIN invoiceline USING (invoice_id)
> CHILD JOIN address USING (customer_id)
> WHERE customer.in_debt_collections;
>
> instead of the current nested mess of aggregation and subqueries needed,
> like:
>
> SELECT
> to_jsonb(customer) || (
> SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
> FROM (
> SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
> jsonb_agg(invoice_line))
> FROM invoice
> LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
> invoice_line.invoice_id)
> WHERE invoice.customer_id = customer.customer_id AND invoice_date >=
> current_date
> GROUP BY invoice.invoice_id
> ) invoice_obj
> ) || (
> SELECT jsonb_build_object('addresses', jsonb_agg(address))
> FROM address
> WHERE address.customer_id = customer.customer_id
> )
> FROM customer
> WHERE customer.in_debt_collections
Well, I don't know. There's a lot of ways to write that type of thing.
Personally, I tend to prefer to delay the serialization to json as
long as possible (although it's sometimes unavoidable) because it
keeps the query cleaner. I also sometimes use the array() subquery
syntax for sake of brevity, but this query could be restructured to
use proper aggregation on all levels if you're concerned about
performance (this query would tend to underperform yours for very
large compositions because of the second subquery scan vs the hash
join OTOH, it's a faster serialization model). I didn't test the
syntax, but you get the idea.
SELECT to_json(q)
FROM
(
SELECT
c.*,
array(
SELECT
i.*,
array(
SELECT il
FROM invoice_line il
WHERE il.invoice_id = i.invoice_id
) AS invoice_lines
FROM invoice i
WHERE i.customer_id = c.customer_id AND invoice_date >= current_date
) AS invoices
FROM customer c
WHERE c.in_debt_collections
) q
The point is this: the postgresql type system is flexible enough that
you can do arbitrary constructions pretty easy and the situation has
been one of continuous improvement over the last several releases. It
isn't perfect, but json enhancements FWICT have made syntactical
approaches to the problem a dead end; json gets the job done is less
likely to cause problems with the SQL standard down the road. For the
same set of reasons I no longer use crosstab.
In the 15+ years I've been watching postgres inheritance has gone
precisely nowhere, and there other ways to do the things it can do
that also supply a much broader range of use cases. Plus, I'm biased:
I happen to think the 90's OO style of inheritance is pretty dumb :-).
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Teodor Sigaev | 2016-05-25 13:30:46 | Re: [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind |
Previous Message | ktm@rice.edu | 2016-05-25 13:22:43 | Re: Does people favor to have matrix data type? |