Re: [HACKERS] What about LIMIT in SELECT ?

From: jwieck(at)debis(dot)com (Jan Wieck)
To: oleg(at)sai(dot)msu(dot)su
Cc: hackers(at)postgreSQL(dot)org, t-ishii(at)sra(dot)co(dot)jp
Subject: Re: [HACKERS] What about LIMIT in SELECT ?
Date: 1998-10-14 14:24:56
Message-ID: m0zTRrE-000EBRC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

Oleg Bartunov wrote:
> This is not a problem for CGI-script to know which key to start.

Never meant that would be a problem. A FORM variable will of
course do this.

> Without LIMIT every CGI call backend will do *FULL* selection
> and cursor helps just in fetching a definite number of rows,
> in principle I can do this with CGI-script. Also, cursor
> returns data back in ASCII format (man l declare) and this requires
> additional job for backend to convert data from intrinsic (binary)
> format. Right implementation of LIMIT offset,number_of_rows could be
> a great win and make postgres superior free database engine for
> Web applications. Many colleagues of mine used mysql instead of

That's the point I was missing. The offset!

> postgres just because of lacking LIMIT. Tatsuo posted a patch
> for set query_limit to 'num', I just tested it and seems it
> works fine. Now, we need only possibility to specify offset,
> say
> set query_limit to 'offset,num'
> ( Tatsuo, How difficult to do this ?)
> and LIMIT problem will ne gone.

Think you haven't read my posting completely. Even with the
executor limit, the complete scan into the sort is done by
the backend. You need to specify ORDER BY to get the same
list again (without the offset doesn't make sense). But
currently, ORDER BY forces a sort node into the query plan.

What the executor limit tells is how many rows will be
returned from the sorted data. Not what goes into the sort.
Filling the sort and sorting the data consumes the most time
of the queries execution.

I haven't looked at Tatsuo's patch very well. But if it
limits the amount of data going into the sort (on ORDER BY),
it will break it! The requested ordering could be different
from what the choosen index might return. The used index is
choosen by the planner upon the qualifications given, not the
ordering wanted.

So if you select WHERE b = 1 ORDER BY a, then it will use an
index on attribute b to match the qualification. The complete
result of that index scan goes into the sort to get ordered
by a. If now the executor limit stops sort filling after the
limit is exceeded, only the same tuples will go into the sort
every time. But they have nothing to do with the requested
order by a.

What LIMIT first needs is a planner enhancement. In file
backend/optimizer/plan/planner.c line 284 it must be checked
if the actual plan is an indexscan, if the indexed attributes
are all the same as those in the given sort clause and that
the requested sort order (operator) is that what the index
will return. If that all matches, it can ignore the sort
clause and return the index scan itself.

Second enhancement must be the handling of the offset. In
the executor, the index scan must skip offset index tuples
before returning the first. But NOT if the plan isn't a
1-table-index-scan. In that case the result tuples (from the
topmost unique/join/whatever node) have to be skipped.

With these enhancements, the index tuples to be skipped
(offset) will still be scanned, but not the data tuples they
point to. Index scanning might be somewhat faster.

This all will only speedup simple 1-table-queries, no joins
or if the requested order isn't that what the index exactly
returns.

Anyway, I'll take a look if I can change the planner to omit
the sort if the tests described above are true. I think it
would be good anyway.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck(at)debis(dot)com (Jan Wieck) #

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Jan Wieck 1998-10-14 14:34:47 Re: [HACKERS] What about LIMIT in SELECT ?
Previous Message Thomas G. Lockhart 1998-10-14 13:59:56 Re: [HACKERS] What about LIMIT in SELECT ?

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 1998-10-14 14:34:47 Re: [HACKERS] What about LIMIT in SELECT ?
Previous Message Thomas G. Lockhart 1998-10-14 13:59:56 Re: [HACKERS] What about LIMIT in SELECT ?