Re: Post-2018 messages in archives

From: Noah Misch <noah(at)leadboat(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Post-2018 messages in archives
Date: 2018-12-05 01:53:14
Message-ID: 20181205015314.GA2931419@rfd.leadboat.com
Lists: pgsql-www

On Mon, Dec 03, 2018 at 10:08:20AM +0100, Magnus Hagander wrote:
> On Mon, Dec 3, 2018 at 2:40 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > At some point in the last few months, the archives of many mailing lists
> > added messages dated far in the future. For example, pgsql-hackers
> > archives gained four messages from years 2030, 2032 and 2036:
> >
> > https://www.postgresql.org/list/pgsql-hackers/since/203011010000/

> > Perhaps the fix is to set the archive date to the archives ingest time when
> > the message asserts a date substantially (15min?) earlier or later. Would
> > that be an improvement?

> Unfortunately we don't keep the ingest time separately. But for the future,
> doing so would probably be a good idea, for other reasons as well. I think
> 15 minutes might be pushing it a bit given the kind of times we see around,
> in particular with incorrectly configured timezones. But something like 24h
> would probably work.
>
> Luckily, it's not too terribly bad:
>
> archives=# select count(*) from messages where date > now();
>  count
> -------
>     10
> (1 row)
>
> (out of about 1.3M messages).
>
> So short-term I will go process those messages manually.

Data looks clean now. Thanks. If the problem remains as rare as it has been,
the automated fix I was contemplating is premature.
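
For the record, a minimal sketch of the fallback discussed above, in case it
is ever wanted. This is not the archives code: the function name
archive_date, the ingest_time argument and the MAX_SKEW constant are all
hypothetical, and it assumes the ingest time is recorded, which the quoted
reply says is not the case today. The 24h threshold follows the suggestion
quoted above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical threshold; 24h is the value suggested in the thread.
MAX_SKEW = timedelta(hours=24)


def archive_date(date_header, ingest_time, max_skew=MAX_SKEW):
    """Pick the date to file a message under.

    date_header is the raw Date: header string; ingest_time is a
    timezone-aware datetime recorded when the message arrived.
    """
    try:
        claimed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparseable Date: header, so fall back to the ingest time.
        return ingest_time
    if claimed.tzinfo is None:
        # A "-0000" zone parses as a naive datetime; treat it as UTC.
        claimed = claimed.replace(tzinfo=timezone.utc)
    if abs(claimed - ingest_time) > max_skew:
        # Claimed date is too far in the past or future to trust.
        return ingest_time
    return claimed


if __name__ == "__main__":
    ingest = datetime(2018, 12, 5, 1, 53, 14, tzinfo=timezone.utc)
    print(archive_date("Fri, 01 Nov 2030 00:00:00 +0000", ingest))
    print(archive_date("Tue, 04 Dec 2018 16:53:14 +0000", ingest))
```

With a 24h window, a message whose clock is off by a misconfigured timezone
keeps its own date, while one claiming 2030 is filed under the ingest time.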
