From: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
---|---|
To: | James Coleman <jtc331(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort |
Date: | 2021-07-06 15:03:32 |
Message-ID: | 4385594.q5DUe7zQD7@aivenronan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thank you for the review, I will address those shortly, but will answer some
questions in the meantime.
> > First, the changes are lacking any explanatory comments. Probably we
> > should follow how nodeAgg does this and add both comments to the
> > ExecSort function header as well as specific comments above the "if"
> > around the new tuplesort_begin_datum explaining the specific
> > conditions that are required for the optimization to be useful and
> > safe.
Done, since I lifted the restrictions following your questions, there isn't
much left to comment. (see below)
> >
> > That leads to a question I had: I don't follow why bounded mode (when
> > using byval) needs to be excluded. Comments should be added if there's
> > a good reason (as noted above), but maybe it's a case we can handle
> > safely?
I had test failures when trying to move the Datum around when performing a
bounded sort, but did not look into it at first.
Now I've looked into it, and the switch to a heapsort when using bounded mode
just unconditionnaly tried to free a tuple that was never there to begin with.
So if the SortTuple does not contain an actual tuple, but only a single datum,
do not do that.
I've updated the patch to fix this and enable the optimization in the case of
bounded sort.
> >
> > A second question: at first glance it's intuitively the case we might
> > not be able to handle byref values. But nodeAgg doesn't seem to have
> > that restriction. What's the difference here?
>
> I think tuplesort_begin_datum, doesn't have any such limitation, it
> can handle any type of Datum so I think we don't need to consider the
> only attbyval, we can consider any type of attribute for this
> optimization.
I've restricted the optimization to byval types because of the following
comment in nodeAgg.c:
/*
* Note: if input type is pass-by-ref, the datums returned by the
sort are
* freshly palloc'd in the per-query context, so we must be careful
to
* pfree them when they are no longer needed.
*/
As I was not sure how to handle that, I prefered the safety of not enabling
it. Since you both told me it should be safe, I've lifted that restriction
too.
> A few small code observations:
> - In my view the addition of unlikely() in ExecSort is unlikely to be
> of benefit because it's a single call for the entire node's execution
> (not in the tuple loop).
Done.
> - It seems clearer to change the "if (!node->is_single_val)" to flip
> the true/false cases so we don't need the negation.
Agreed, done.
> - I assume there are tests that likely already cover this case, but
> it'd be worth verifying that.
Yes many test cases cover that, but maybe it would be better to explictly
check for it on some cases: do you think we could add a debug message that can
be checked for ?
> Finally, I believe the same optimization likely ought to be added to
> nodeIncrementalSort. It's less likely the tests there are sufficient
> for both this and the original case, but we'll see.
I will look into it, but isn't incrementalsort used to sort tuples on several
keys, when they are already sorted on the first ? In that case, I doubt we
would ever have a single-valued tuple here, except if there is a projection to
strip the tuple from extraneous attributes.
--
Ronan Dunklau
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Allow-Sort-nodes-to-use-the-fast-single-datum-tuples.patch | text/x-patch | 4.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2021-07-06 15:08:35 | Re: Enhanced error message to include hint messages for redundant options error |
Previous Message | Alvaro Herrera | 2021-07-06 14:58:42 | Re: Pipeline mode and PQpipelineSync() |