Re: Add LSN <-> time conversion functionality

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ilya Kosmodemiansky <hydrobiont(at)gmail(dot)com>
Subject: Re: Add LSN <-> time conversion functionality
Date: 2024-08-09 13:15:01
Message-ID: CAAKRu_YNct_AAj7PsKA2XyEn5THeiD5e20UR2DE=KvN8+gyPpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 9, 2024 at 9:09 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
>
>
> On 8/9/24 03:02, Melanie Plageman wrote:
> > On Thu, Aug 8, 2024 at 2:34 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> >> each seconds. And we want to allow merging stuff nicely. The smallest
> >> merges we could do is 1s -> 2s -> 4s -> 8s -> ... but let's say we do
> >> 1s -> 10s -> 100s -> 1000s instead.
> >>
> >> So we start with 100x one-second buckets
> >>
> >> [A_0, A_1, ..., A_99] -> 100 x 1s buckets
> >> [B_0, B_1, ..., B_99] -> 100 x 10s buckets
> >> [C_0, C_1, ..., C_99] -> 100 x 100s buckets
> >> [D_0, D_1, ..., D_99] -> 100 x 1000s buckets
> >>
> >> We start by adding data into A_k buckets. After filling all 100 of them,
> >> we grab the oldest 10 buckets, and combine/move them into B_k. And so
> >> on, until B is gets full too. Then we grab the 10 oldest B_k entries,
> >> and move them into C. and so on. For D the oldest entries would get
> >> discarded, or we could add another layer with each bucket representing
> >> 10k seconds.
> >
> > I originally had an algorithm that stored old values somewhat like
> > this (each element stored 2x logical members of the preceding
> > element). When I was testing algorithms, I abandoned this method
> > because it was less accurate than the method which calculates the
> > interpolation error "area". But, this would be expected -- it would be
> > less accurate for older values.
> >
> > I'm currently considering an algorithm that uses a combination of the
> > interpolation error and the age of the point. I'm thinking of adding
> > to or dividing the error of each point by "now - that point's time (or
> > lsn)". This would lead me to be more likely to drop points that are
> > older.
> >
> > This is a bit different than "combining" buckets, but it seems like it
> > might allow us to drop unneeded recent points when they are very
> > regular.
> >
>
> TBH I'm a bit lost in how the various patch versions merge the data.
> Maybe there is a perfect algorithm, keeping a perfectly accurate
> approximation in the smallest space, but does that really matter? If we
> needed to keep many instances / very long history, maybe it's matter.
>
> But we need one instance, and we seem to agree it's enough to have a
> couple days of history at most. And even the super wasteful struct I
> described above would only need ~8kB for that.
>
> I suggest we do the simplest and most obvious algorithm possible, at
> least for now. Focusing on this part seems like a distraction from the
> freezing thing you actually want to do.

The simplest thing to do would be to pick an arbitrary point in the
past (say one week) and then throw out all the points (except the very
oldest to avoid extrapolation) from before that cliff. I would like to
spend time on getting a new version of the freezing patch on the list,
but I think Robert had strong feelings about having a complete design
first. I'll switch focus to that for a bit so that perhaps you all can
see how I am using the time -> LSN conversion and that could inform
the design of the data structure.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-08-09 13:16:11 Re: Add trim_trailing_whitespace to editorconfig file
Previous Message vignesh C 2024-08-09 13:12:41 Re: Logical Replication of sequences