Re: Fetch zero result rows when executing a query?

From: Shay Rojansky <roji(at)roji(dot)org>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fetch zero result rows when executing a query?
Date: 2015-02-07 17:40:34
Message-ID: CADT4RqDTQrvHVwFG1Y0K33gESzzxUiNkO69NTf2no=NiWQfjgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry everyone, I was unexpectedly very busy and couldn't respond
earlier... My apologies.

I'll clarify just a little... I am indeed talking about the PostgreSQL
network protocol, and not about query optimization (with LIMIT or omitting
RETURNING etc.). I am implementing ADO.NET's ExecuteNonQuery, through which
the user indicates they're not interested in any result rows whether those
exist or not. For the situation where a user does ExecuteNonQuery but the
query returns result rows, the driver can save the needless network
transfers. We can definitely say it's the user's fault for providing a
query with a resultset to ExecuteNonQuery, but we *do* have the user's
clear intention that no rows be fetched so why not act on it. I agree this
isn't a terribly important optimization, the trigger for this question was
first and foremost curiosity: it seems strange the protocol allows you to
specify max_rows for any value other than 0.

Here's a possible believable use-case which doesn't involve user neglect:
imagine some server-side function which has side-effects and also returns
some rows. In some situations the user is interested in the result rows,
but in others they only want the side-effect. The user would probably have
no control over the function, and their only way to *not* transfer the
result rows would be with a mechanism such as max_rows.

Marko, regarding your general criticism of max_rows:

> This seems to be a common pattern, and I think it's a *huge* mistake to
specify maxrows=1 and/or ignore rows after the first one in
> the driver layer. If the user says "give me the only row returned by
this query", the interface should check that only one row is in
> reality returned by the query. If the query returns more than one row,
the user made a mistake in formulating the query and she
> probably wants to know about it. If she genuinely doesn't care about the
rows after the first one, she can always specify LIMIT 1.

> For a sad example, look at PL/PgSQL's SELECT .. INTO ..; it's not
terribly difficult to write a query which returns more than one
> row *by mistake* and have something really bad happen later on since it
went undetected during testing because you just
> happened to get the expected row back first. And when you do want to
specifically enforce it for e.g. security critical code,
> you have to resort to really ugly hacks like window functions.

There are some problems with what you say... First, the ADO.NET API
provides a SingleRow API option which explicitly provides exactly this.
This API option doesn't at all mean that there *should* be only one row
(i.e. an error should be raised if otherwise), but simply that any other
rows beyond the first should be discarded. So regardless of what we think
best practices are on this, this behavior is mandated/specified by a major
API.

More to the point, doesn't max_rows=1 have exactly the same dangers as
LIMIT 1? The two seem to be identical, except that one is expressed in the
SQL query and the other at the network protocol level. The way I see it, if
the user specifies one of them without specifying ORDER BY, they are
explicitly saying they don't care which row comes out. And if their testing
code fails because this is wrong, then they've made a mistake - IMHo this
isn't a reason to kill the entire feature.

In general, in my view it's beneficial to separate between the SQL queries
and the features that the driver is supposed to provide in its API. The SQL
may be written or managed by one entity, reused in many places (some of
which want all rows and others which want only 1).

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G Johnston 2015-02-07 18:33:33 Re: Fetch zero result rows when executing a query?
Previous Message Corey Huinker 2015-02-07 16:50:33 Re: GRANT USAGE on FOREIGN SERVER exposes passwords