Re: [HACKERS] What about LIMIT in SELECT ?

From: jwieck(at)debis(dot)com (Jan Wieck)
To: t-ishii(at)sra(dot)co(dot)jp
Cc: jwieck(at)debis(dot)com, oleg(at)sai(dot)msu(dot)su, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] What about LIMIT in SELECT ?
Date: 1998-10-15 12:23:43
Message-ID: m0zTmRT-000EBRC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tatsuo Ishii wrote:

> I think we have understanded your point. set query_limit is just a
> easy alternative of using cursor and fetch.
>
> > I haven't looked at Tatsuo's patch very well. But if it
> > limits the amount of data going into the sort (on ORDER BY),
> > it will break it! The requested ordering could be different
> > from what the choosen index might return. The used index is
> > choosen by the planner upon the qualifications given, not the
> > ordering wanted.
>
> I think it limits the final result. When query_limit is set,
> the arg "numberTuples" of ExecutePlan() is set to it instead of 0
> (this means no limit).
>
> Talking about "offset," it shouldn't be very difficult. I guess all we
> have to do is adding a new arg "offset" to ExecutePlan() then making
> obvious modifications. (and of course we have to modify set
> query_limit syntax but it's trivial)

The offset could become

FETCH n IN cursor [OFFSET n];

and

SELECT ... [LIMIT offset,count];

The FETCH command already calls ExecutorRun() with the given
count (the tuple limit). Telling it the offset too is really
simple. And ExecutorRun() could check if the toplevel
executor node is an index scan. Skipping tuples during the
index scan requires, that all qualifications are in the
indexqual, thus any tuple returned by it will become a final
result row (as it would be in the simple 1-table-queries we
discussed). If that isn't the case, the executor must
fallback to skip the final result tuples and that is after an
eventually processed sort/merge of the complete result set.
That would only reduce communication to the client and memory
required there to buffer the result set (not a bad thing
either).

ProcessQueryDesc() in tcop/pquery.c also calls ExecutorRun()
but with a constant 0 tuple count. Having offset and count in
the parsetree would make it without any state variables or
SET command. And it's the only clean way to restrict LIMIT to
SELECT queries. Any thrown in LIMIT to ExecutorRun() from
another place could badly hurt the rewrite system. Remember
that non-instead actions on insert/update/delete are
processed before the original query! And what about SQL
functions that get processed during the evaluation of another
query (view using an SQL function for count(*))?

A little better would it be to make the LIMIT values able to
be parameter nodes. C or PL functions use the prepared plan
feature of the SPI manager for performance reasons.
Especially the offset value might there need to be a
parameter that the executor has to pick out first. If we
change the count argument of ExecutorRun to a List *limit,
this one could be NIL (to mean the old 0 count 0 offset
behaviour) or a list of two elements that both can be either
a Const or a Param of type int4. Easy for the executor to
evaluate.

The only places where ExecutorRun() is called are
tcop/pquery.c (queries from frontend), commands/command.c
(FETCH command), executor/functions.c (SQL functions) and
executor/spi.c (SPI manager). So it is easy to change the
call interface too.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladimir Litovka 1998-10-15 12:35:53 Optimizing perfomance using indexes
Previous Message D'Arcy J.M. Cain 1998-10-15 11:38:04 Re: [HACKERS] PostgreSQL v6.4 BETA2...