Re: why isn't this subquery wrong?

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: jonathan vanasco <postgres(at)2xlp(dot)com>
Cc: pgsql-general general <pgsql-general(at)postgresql(dot)org>
Subject: Re: why isn't this subquery wrong?
Date: 2017-04-20 22:38:02
Message-ID: CAEfWYywDgAGoq1ndCAa=x43-+O74=y9fnQQkdX2CUVf1MUexwg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Apr 20, 2017 at 3:17 PM, jonathan vanasco <postgres(at)2xlp(dot)com> wrote:

>
> I ran into an issue while changing a database schema around. Some queries
> still worked, even though I didn't expect them to.
>
> Can anyone explain to me why the following is valid (running 9.6) ?
>
> schema
>
> CREATE TEMPORARY TABLE example_a__data (
> foo_id INT,
> bar_id INT
> );
> CREATE TEMPORARY TABLE example_a__rollup_source (
> id int primary key,
> name varchar(64),
> foo_id INT,
> check_bool BOOLEAN
> );
> CREATE TEMPORARY TABLE example_a__rollup AS
> SELECT id, name, foo_id
> FROM example_a__rollup_source
> WHERE check_bool IS TRUE
> ;
>
>
> query:
>
> SELECT foo_id
> FROM example_a__data
> WHERE foo_id IN (SELECT bar_id FROM example_a__rollup)
> ;
>
>
> a raw select of `SELECT bar_id FROM example_a__rollup;` will cause an
> error because bar_id doesn't exist
>
> postgres doesn't raise an error because example_a__data does have a bar_id
> -- but example_a__rollup doesn't and there's no explicit correlation in the
> query.
>
> can someone explain why this happens? i'm guessing there is a good reason
> -- but I'm unfamiliar with the type of implicit join/queries this behavior
> is enabling.
>
>
>
There is no requirement in this query that bar_id be in the
example_a__rollup table and since it is only in one table it is unambiguous
so the server doesn't complain.

It may be explanatory to add a couple records to your example_a_rollup
table:

insert into example_a__data values (3,4),(5,6);

Then run a simple select showing what the where clause would see:

SELECT,
foo_id,
(SELECT bar_id FROM example_a__rollup)
FROM
example_a__data
;

foo_id | bar_id
--------+--------
3 |
5 |

bar_id is null because there are no rows in example_a_rollup.

Now add a single record to example_a_rollup:

insert into example_a__rollup (id) values (10);

Rerun the query and you will get:

foo_id | bar_id
--------+--------
3 | 4
5 | 6

If you add another record to example_a__rollup and run it and you will get:

ERROR: more than one row returned by a subquery used as an expression

Although the subquery won't work as an expression it would still work in a
the where clause but I doubt it will return what you desire.
Unfortunately there
are lots of ways to write syntactically correct but logically flawed
statements.

Cheers,
Steve

In response to

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2017-04-20 22:42:06 Re: why isn't this subquery wrong?
Previous Message jonathan vanasco 2017-04-20 22:17:25 why isn't this subquery wrong?