From: Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pghackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Question] Similar Cost but variable execution time in sort
Date: 2023-03-05 18:17:57
Message-ID: d3f3b586-3627-35b1-1129-542be0295eb9@gmail.com
Lists: pgsql-hackers


On 05/03/23 22:21, Tom Lane wrote:

> Ankit Kumar Pandey <itsankitkp(at)gmail(dot)com> writes:
> > From my observation, we only account for the amount of data in the cost
> > computation, but not the number of columns being sorted.
> > Should we not account for the number of columns in the sort as well?
>
> I'm not sure whether simply charging more for 2 sort columns than 1
> would help much. The traditional reasoning for not caring was that
> data and I/O costs would swamp comparison costs anyway, but maybe with
> ever-increasing memory sizes we're getting to the point where it is
> worth refining the model for in-memory sorts. But see the header
> comment for cost_sort().
>
> Also ... not too long ago we tried and failed to install more-complex
> sort cost estimates for GROUP BY. The commit log message for f4c7c410e
> gives some of the reasons why that failed, but what it boils down to
> is that useful estimates would require information we don't have, such
> as a pretty concrete idea of the relative costs of different datatypes'
> comparison functions.
>
> In short, maybe there's something to be done here, but I'm afraid
> there is a lot of infrastructure slogging needed first, if you want
> estimates that are better than garbage-in-garbage-out.
>
> regards, tom lane

Thanks, I can see the challenges involved here.
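
Just to make the idea concrete for myself, below is a rough standalone
sketch of what charging per sort key could look like. It is not the actual
cost_sort() code; it only mimics the in-memory comparison term cost_sort()
uses (two operator evals per tuple comparison, times N * log2(N)), and the
0.5 per-key weight and the function name are invented purely for
illustration.

#include <stdio.h>
#include <math.h>

/*
 * Hypothetical sketch only -- not the real cost_sort().  It mimics the
 * in-memory comparison term cost_sort() charges (two operator evals per
 * tuple comparison, times N * log2(N)) and adds an invented per-sort-key
 * scaling factor purely to illustrate the idea.
 */

#define CPU_OPERATOR_COST 0.0025    /* default cpu_operator_cost */

static double
sketch_sort_cpu_cost(double tuples, int num_sort_keys)
{
    /* cost_sort() charges two operator evals per tuple comparison */
    double      comparison_cost = 2.0 * CPU_OPERATOR_COST;

    /*
     * Invented refinement: each extra sort key adds half a comparison on
     * average (it only matters when earlier keys compare equal).  The 0.5
     * weight is made up; a real patch would need per-datatype comparison
     * costs, as noted above.
     */
    comparison_cost *= 1.0 + 0.5 * (num_sort_keys - 1);

    if (tuples < 2.0)
        tuples = 2.0;           /* guard the log2() term */

    return comparison_cost * tuples * log2(tuples);
}

int
main(void)
{
    printf("1 sort key : %.2f\n", sketch_sort_cpu_cost(1e6, 1));
    printf("3 sort keys: %.2f\n", sketch_sort_cpu_cost(1e6, 3));
    return 0;
}

With that invented weight, a three-key sort gets charged roughly twice the
comparison cost of a single-key sort, which at least separates the
estimates; whether the weight is anywhere near right would depend on the
datatypes' comparison functions, as you point out.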

Regards,
Ankit
