Re: Post-2018 messages in archives

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Post-2018 messages in archives
Date: 2018-12-06 06:14:18
Message-ID: 20181206061418.GC2945370@rfd.leadboat.com
Lists: pgsql-www

On Wed, Dec 05, 2018 at 11:31:39PM -0500, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > On Wed, Dec 05, 2018 at 09:39:18AM +0100, Magnus Hagander wrote:
> >>> Unfortunately we don't keep the ingest time separately. But for the future,
> >>> doing so would probably be a good idea, for other reasons as well.
>
> > Works for me. Pondering it more, the timestamp that matters most for archive
> > purposes is the timestamp at which list subscribers started to receive their
> > copies of the message. Based on that, I'm thinking we should ignore the Date
> > header and always use the timestamp from a particular "Received ... by
> > HOSTNAME.postgresql.org" header. Before settling on that, I'd want to check
> > how many messages change timestamp by more than ~100s, and I'd want to spot
> > check a few messages to see whether the change looks like an improvement.
>
> Another point worth considering here is moderation queue delays, which
> are not infrequently measured in days :-(. I am not quite sure whether
> it'd be better to tag a moderation-delayed message with the timestamp
> when it entered the queue or the time when it exited. But either one
> would be better than believing the Date: header.

Good point. I'd prefer to use the time when it exited the queue, which
conforms to "timestamp at which list subscribers started to receive their
copies of the message" mentioned above. I usually download November's mbox in
the first few days of December. If we use the timestamp of entering the queue
(or the Date header), there's no particular upper bound on when the November
mbox stops accruing new messages.
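
To make the check proposed above concrete, here is a minimal Python sketch of how one might measure how far a message's timestamp would move if the archive used a "Received ... by HOSTNAME.postgresql.org" header instead of the Date header. It is not the archive's actual code. The host pattern, the choice of the topmost matching Received header, and the function names are all assumptions made for illustration; the 100-second threshold comes from the "~100s" figure quoted above.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumption: any relay under postgresql.org named in a "by" clause qualifies.
# The thread does not say which particular HOSTNAME would be chosen.
BY_HOST = re.compile(r"\bby\s+([\w.-]+\.postgresql\.org)\b", re.IGNORECASE)
THRESHOLD_SECONDS = 100  # the "~100s" figure from the discussion


def received_time(msg):
    """Timestamp of the topmost Received header naming a postgresql.org host."""
    for header in msg.get_all("Received", []):
        text = " ".join(str(header).split())  # unfold continuation lines
        if BY_HOST.search(text) and ";" in text:
            try:
                # The timestamp follows the last semicolon of the header.
                return parsedate_to_datetime(text.rsplit(";", 1)[1].strip())
            except (TypeError, ValueError):
                continue
    return None


def date_shift_seconds(raw_message):
    """Seconds between the Received timestamp and the Date header, or None."""
    msg = message_from_string(raw_message)
    received = received_time(msg)
    if received is None or msg["Date"] is None:
        return None
    try:
        claimed = parsedate_to_datetime(msg["Date"])
    except (TypeError, ValueError):
        return None
    # Skip comparisons where either side lacks a usable timezone.
    if received.tzinfo is None or claimed.tzinfo is None:
        return None
    return (received - claimed).total_seconds()


def changes_timestamp(raw_message, threshold=THRESHOLD_SECONDS):
    """True if switching to the Received time moves the message by > threshold."""
    shift = date_shift_seconds(raw_message)
    return shift is not None and abs(shift) > threshold
```

Running `changes_timestamp` over each message in an mbox (for example via the standard `mailbox` module) would give the count of messages that shift by more than the threshold, and the ones it flags are natural candidates for the spot check.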
