Re: Idea on how to simplify comparing two sets

From:	Nico Williams <nico(at)cryptonector(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Joel Jacobson <joel(at)trustly(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Idea on how to simplify comparing two sets
Date:	2017-02-23 21:55:03
Message-ID:	20170223215457.GE30233@localhost
Lists:	pgsql-hackers

On Tue, Feb 07, 2017 at 01:03:14PM -0500, Tom Lane wrote:
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>
> Actually ... now that you mention full join, I believe this works:
>
> select * from (select ...) s1 full join (select ...) s2
> on ((s1.*)=(s2.*)) where s1.* is distinct from s2.*;

You can drop the .*s:

select * from (select ...) s1 full join (select ...) s2
on s1 = s2 where s1 is distinct from s2;
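
For concreteness, here is a minimal sketch of that query with the elided subselects filled in by made-up sample data. The CTE names reuse the s1/s2 aliases from above; the column names and values are purely illustrative assumptions, not anything from the thread.

```sql
-- Hypothetical sample data standing in for the two "(select ...)" sources.
WITH s1(id, name) AS (VALUES (1, 'a'), (2, 'b')),
     s2(id, name) AS (VALUES (2, 'b'), (3, 'c'))
SELECT *
FROM s1 FULL JOIN s2
     ON s1 = s2                  -- whole-row comparison, no column list needed
WHERE s1 IS DISTINCT FROM s2;    -- keep only rows lacking an identical partner
```

With this data the row (2, 'b') appears on both sides and is filtered out, so only the rows with id 1 and id 3 should come back, each with the opposite side NULL-extended.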

And even:

select s1, s2 from (select ...) s1 natural full outer join (select ...) s2;

This makes it possible to write very generic (schema-wise) code for
comparing table sources.

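As I've mentioned elsewhere, there is an issue with NULLs in columns:
NULL = NULL is not true, so rows that contain NULLs in the join columns
never match their counterparts.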

I really, really would like either a full equijoin where equality treats
NULL = NULL -> true for this purpose, or a natural join where only
primary key or not-nullable columns are used, or a USING clause form
where I can specify such behavior without having to list all the columns
that should be used.
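
As an illustration of what this currently takes by hand, the sketch below joins on the key column only, using an explicit USING list. The sample data and the assumption that id is the primary key are hypothetical; the point is that the column list has to be spelled out, which is exactly what generic code wants to avoid.

```sql
-- Hypothetical data; "id" is assumed to be the (non-nullable) primary key.
WITH s1(id, note) AS (VALUES (1, 'x'), (2, NULL)),
     s2(id, note) AS (VALUES (1, 'x'), (2, NULL), (3, 'z'))
SELECT s1, s2
FROM s1 FULL JOIN s2 USING (id)   -- match on the key alone; NULLs in "note"
                                  -- cannot prevent the match
WHERE s1 IS DISTINCT FROM s2;     -- then compare the whole rows
```

The WHERE filter relies on PostgreSQL comparing composite values with two NULL fields treated as equal, as described in its documentation on composite type comparison, so here only the row with id 3 should be reported.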

I use NATURAL FULL OUTER JOIN for computing materialized view diffs in
my alternate view materialization system. NULLs are poison for this
purpose, yielding false positive differences. But my code also uses the
table row value form above in order to avoid having to generate column
lists for a USING clause or expressions for ON.
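
A small sketch of the false positive, using hypothetical sample data: the two sources below are identical, but one row has a NULL in a non-key column.

```sql
-- Hypothetical data: s1 and s2 have exactly the same contents.
WITH s1(id, note) AS (VALUES (1, 'x'), (2, NULL)),
     s2(id, note) AS (VALUES (1, 'x'), (2, NULL))
SELECT s1, s2
FROM s1 NATURAL FULL OUTER JOIN s2   -- joins on id = id AND note = note
WHERE s1 IS DISTINCT FROM s2;        -- rows without a matching partner
```

Because note = note is not true when both sides are NULL, the row with id 2 fails to join with itself and is reported as a difference on each side, even though nothing differs.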

These requests are not for syntactic sugar, not really. But I realize
they may be non-trivial -- I may be looking for unobtanium.

> > That said I'm not sure how much we want to go down this road on our own.
> > It'd be nice to have when it's needed but it's not something that gets much
> > visibility on these lists to suggest a large pent-up demand.
>
> Yeah, if this isn't in the standard and not in other databases either,
> that would seem to suggest that it's not a big requirement.

SQLite3 famously lacks FULL joins. It kills me because the alternative
constructions become O(N log M) instead of O(N) for a properly
implemented FULL join (assuming suitable indices anyways).
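
For reference, the usual alternative construction on an engine without FULL joins looks roughly like the sketch below: a LEFT join in one direction, plus the unmatched rows from the other direction. The tables a and b, their columns, and the data are hypothetical, and a.id is assumed to be non-nullable so that "a.id IS NULL" reliably identifies unmatched rows.

```sql
-- Hypothetical tables a(id, v) and b(id, v), inlined as CTEs.
WITH a(id, v) AS (VALUES (1, 'a'), (2, 'b')),
     b(id, v) AS (VALUES (2, 'b'), (3, 'c'))
-- Every row of a, with its partner from b if there is one ...
SELECT a.id AS a_id, a.v AS a_v, b.id AS b_id, b.v AS b_v
FROM a LEFT JOIN b ON a.id = b.id
UNION ALL
-- ... plus the rows of b that have no partner in a.
SELECT a.id, a.v, b.id, b.v
FROM b LEFT JOIN a ON a.id = b.id
WHERE a.id IS NULL;
```

This needs two join passes over the inputs where a native FULL join needs one.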

I wouldn't suggest that that's a reason not to support FULL joins in any
other RDBMS, rather, I'd suggest that SQLite3 is missing an important
feature.

Pardon the tangent. It may not really be applicable here, as here I
think OP is looking for syntactic sugar rather than an important
optimization. But the point is that sometimes you have to lead the
standards-setting and/or the competition.

Nico
--

In response to

Re: Idea on how to simplify comparing two sets at 2017-02-07 18:03:14 from Tom Lane