Re: Idea on how to simplify comparing two sets

From: Joel Jacobson <joel(at)trustly(dot)com>
To: Anders Granlund <anders(dot)granlund(at)trustly(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Idea on how to simplify comparing two sets
Date: 2017-02-07 15:43:16
Message-ID: CAASwCXf+DqVjH-MjY7rstWNX7L_g5tfa+HU4RW3xOnB0TSSnPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

But that's already a valid statement, so there is no ambiguity:

SELECT TRUE WHERE FALSE
IS NOT DISTINCT FROM
(SELECT TRUE);
bool
------
(0 rows)

If you want to compare the set (SELECT TRUE WHERE FALSE) with the set
(SELECT TRUE) then just add parenthesis:
(SELECT TRUE WHERE FALSE)
IS NOT DISTINCT FROM
(SELECT TRUE);
ERROR: syntax error at or near "IS"
LINE 2: IS NOT DISTINCT FROM
^

Which is currently invalid syntax.

On Tue, Feb 7, 2017 at 4:40 PM, Anders Granlund
<anders(dot)granlund(at)trustly(dot)com> wrote:
> What about this ambiguity?
>
> SELECT TRUE
> WHERE FALSE
> IS NOT DISTINCT FROM
> (SELECT TRUE)
>
> On Tue, Feb 7, 2017 at 4:13 PM, Joel Jacobson <joel(at)trustly(dot)com> wrote:
>>
>> Hi hackers,
>>
>> Currently there is no simple way to check if two sets are equal.
>>
>> Looks like no RDBMS in the world has a simple command for it.
>>
>> You have to do something like:
>>
>> WITH
>> T1 AS (SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 10000),
>> T2 AS (SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 10000)
>> SELECT
>> GREATEST(
>> (SELECT COUNT(*) FROM T1),
>> (SELECT COUNT(*) FROM T2)
>> )
>> =
>> (SELECT COUNT(*) FROM (
>> SELECT * FROM T1
>> INTERSECT ALL
>> SELECT * FROM T2
>> ) AS X)
>> INTO _Identical;
>>
>> or,
>>
>> SELECT 'Missmatch!' WHERE EXISTS (
>> SELECT * FROM Foo
>> FULL JOIN Bar ON (Foo.FooID = Bar.BarID AND
>> Foo IS NOT DISTINCT FROM Bar)
>> WHERE TRUE
>> AND ( Foo.FooID BETWEEN 1 AND 10000 AND
>> Bar.BarID BETWEEN 1 AND 10000 )
>> AND ( Foo.FooID IS NULL OR
>> Bar.BarID IS NULL);
>>
>> Introducing new SQL keywords is of course not an option,
>> since it would possibly break backwards compatibility.
>>
>> So here is an idea that doesn't break backwards compatibility:
>>
>> Let's give a meaning for the existing IS DISTINCT and IS NOT DISTINCT,
>> that is currently a syntax error when used between two sets.
>>
>> SELECT 1 IS DISTINCT FROM SELECT 1;
>> ERROR: syntax error at or near "SELECT"
>> LINE 1: SELECT 1 IS DISTINCT FROM SELECT 1;
>>
>> The example above could be written as:
>>
>> _Identical := (
>> SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 10000
>> IS NOT DISTINCT FROM
>> SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 10000
>> );
>>
>> Which would set _Identical to TRUE if the two sets are equal,
>> and FALSE otherwise.
>>
>> Since it's currently a syntax error, there is no risk for changed
>> behaviour for any existing executable queries.
>>
>> Thoughts?
>>
>> /Joel
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Joel Jacobson

Mobile: +46703603801
Trustly.com | Newsroom | LinkedIn | Twitter

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-02-07 15:43:47 possibility to specify template database for pg_regress
Previous Message Anders Granlund 2017-02-07 15:40:39 Re: Idea on how to simplify comparing two sets