Re: Startup cost of sequential scan

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Startup cost of sequential scan
Date: 2018-08-30 15:23:44
Message-ID: 87tvnbx55o.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Konstantin" == Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:

>> No, startup cost is not the "time to find the first row". It's
>> overhead paid before you even get to start examining rows.

Konstantin> But it seems to me that calculation of cost in LIMIT node
Konstantin> contradicts with this statement:

The model (assuming I understand it rightly) is that what we're actually
tracking is a startup cost and a per-output-row cost, but for comparison
purposes we actually store the rows and the computed total, rather than
just the per-row cost:

rows
startup_cost
total_cost = startup_cost + (rows * per_row_cost)

So what Limit is doing for the offset count is recovering the
subpath's per_row_cost from (total_cost - startup_cost)/rows, and then
scaling that by the number of rows in the offset (which are being
discarded), and adding that to the startup cost. So this is saying: the
startup cost for OFFSET N is the startup cost of the subplan, plus the
cost of fetching N rows from the subplan. (And after fetching those N
rows, we still haven't found the first row that we will actually
return.)
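
As a rough illustration of the OFFSET arithmetic described above, here is a
minimal Python sketch. It is not the planner's code: the function name
offset_startup_cost and its parameters are hypothetical, and the clamping of
the offset to the subpath's row estimate is an assumption added so the
example stays well-behaved.

```python
def offset_startup_cost(startup_cost, total_cost, rows, offset):
    """Startup cost of OFFSET applied on top of a subpath (illustrative only).

    startup_cost, total_cost and rows are the subpath's stored estimates;
    offset is the number of rows to discard.
    """
    # Recover the per-row cost from the stored total, as described above.
    per_row_cost = (total_cost - startup_cost) / rows if rows > 0 else 0.0
    # Assumption: we cannot discard more rows than the subpath produces.
    skipped = min(offset, rows)
    # The discarded rows are paid for before the first returned row.
    return startup_cost + per_row_cost * skipped


# Subpath: startup 10, total 110, 100 rows, so per-row cost is 1.0.
# Skipping 20 rows adds 20 to the startup cost.
print(offset_startup_cost(10.0, 110.0, 100, 20))  # 30.0
```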

For LIMIT N, we instead replace the old total cost with a new one
calculated from the startup cost plus N times the subplan's per-row
cost.
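
The LIMIT case can be sketched the same way. Again this is only an
illustration under stated assumptions, not the planner's implementation:
limit_total_cost is a hypothetical name, and clamping the limit to the
subpath's row estimate is an assumption of the sketch.

```python
def limit_total_cost(startup_cost, total_cost, rows, limit):
    """Total cost of LIMIT applied on top of a subpath (illustrative only).

    startup_cost, total_cost and rows are the subpath's stored estimates;
    limit is the number of rows we intend to fetch.
    """
    # Recover the per-row cost from the stored total.
    per_row_cost = (total_cost - startup_cost) / rows if rows > 0 else 0.0
    # Assumption: we cannot fetch more rows than the subpath produces.
    fetched = min(limit, rows)
    # The new total replaces the old one: startup plus N rows' worth of cost.
    return startup_cost + per_row_cost * fetched


# Subpath: startup 10, total 110, 100 rows, so per-row cost is 1.0.
# Fetching only 5 rows gives a total of 15 instead of 110.
print(limit_total_cost(10.0, 110.0, 100, 5))  # 15.0
```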

--
Andrew (irc:RhodiumToad)
