From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Joseph Koshakow <koshy44(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Infinite Interval |
Date: | 2023-09-20 10:27:16 |
Message-ID: | CACJufxHFmB6uOWd8Rz6cppqG2-UOdqTZ6_2DG0FJ9FCfvErpHQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Sep 19, 2023 at 7:14 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
>
> I haven't reviewed this part in any detail yet, but I can confirm that
> there are some impressive performance improvements for avg(). However,
> for me, sum() seems to be consistently a few percent slower with this
> patch.
Now there should be no degradation.
> The introduction of an internal transition state struct seems like a
> promising approach, but I think there is more to be gained by
> eliminating per-row pallocs, and IntervalAggState's MemoryContext
> (interval addition, unlike numeric addition, doesn't require memory
> allocation, right?).
"eliminating per-row pallocs"
I guess I understand. If not , please point it out.
> IntervalAggState's MemoryContext
> (interval addition, unlike numeric addition, doesn't require memory
> allocation, right?).
if I remove IntervalAggState's element: MemoryContext, it will not work.
so I don't understand what the above sentence means...... Sorry. (it's
my problem)
> Also, this needs to include serialization and deserialization
> functions, otherwise these aggregates will no longer be able to use
> parallel workers. That makes a big difference to queryE, if the size
> of the test data is scaled up.
>
I tried, but failed. sum(interval) result is correct, but
avg(interval) result is wrong.
Datum
interval_avg_serialize(PG_FUNCTION_ARGS)
{
IntervalAggState *state;
StringInfoData buf;
bytea *result;
/* Ensure we disallow calling when not in aggregate context */
if (!AggCheckCallContext(fcinfo, NULL))
elog(ERROR, "aggregate function called in non-aggregate context");
state = (IntervalAggState *) PG_GETARG_POINTER(0);
pq_begintypsend(&buf);
/* N */
pq_sendint64(&buf, state->N);
/* Interval struct elements, one by one. */
pq_sendint64(&buf, state->sumX.time);
pq_sendint32(&buf, state->sumX.day);
pq_sendint32(&buf, state->sumX.month);
/* pInfcount */
pq_sendint64(&buf, state->pInfcount);
/* nInfcount */
pq_sendint64(&buf, state->nInfcount);
result = pq_endtypsend(&buf);
PG_RETURN_BYTEA_P(result);
}
SELECT sum(b) ,avg(b)
,avg(b) = sum(b)/count(*) as should_be_true
,avg(b) * count(*) = sum(b) as should_be_true_too
from interval_aggtest_1m; --1million row.
The above query expects two bool columns to return true, but actually
both returns false.(spend some time found out parallel mode will make
the number of rows to 1_000_002, should be 1_000_0000).
> This comment:
>
> + int64 N; /* count of processed numbers */
>
> should be "count of processed intervals".
fixed.
Attachment | Content-Type | Size |
---|---|---|
v21-0001-Move-integer-helper-function-to-int.h.patch | text/x-patch | 3.3 KB |
v21-0002-Check-for-overflow-in-make_interval.patch | text/x-patch | 5.0 KB |
v21-0005-doc-for-special-interval-value.patch | text/x-patch | 1.3 KB |
v21-0003-Add-infinite-interval-values.patch | text/x-patch | 96.9 KB |
v21-0004-Revert-Remove-dead-code-in-DecodeInterval.patch | text/x-patch | 1021 bytes |
v21-0006-refactor-avg-interval-sum-interval-aggregate.patch | text/x-patch | 25.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-09-20 10:49:18 | Re: Comment about set_join_pathlist_hook() |
Previous Message | Amit Kapila | 2023-09-20 10:11:29 | Re: [PoC] pg_upgrade: allow to upgrade publisher node |