Re: Header unfolding in archived mail

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: PostgreSQL WWW <pgsql-www(at)postgresql(dot)org>
Subject: Re: Header unfolding in archived mail
Date: 2013-12-15 16:56:13
Message-ID: CABUevExAwvB70KzXpTm249LbWAbpv8_=oneBfztmsBF0X8edhg@mail.gmail.com
Lists: pgsql-www

On Mon, Dec 9, 2013 at 1:41 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:

> On Sat, Sep 07, 2013 at 06:07:45PM -0400, Noah Misch wrote:
> > The mailing list web archives display the subject of message
> > 20130603190727(dot)GA360354(at)tornado(dot)leadboat(dot)com as follows:
> >
> > Partitioning performance: cache stringToNode() ofpg_constraint.ccbin
> >
> > Note the lack of whitespace after "of". The original message, which you can
> > see by downloading the mbox for June 2013, conveyed the subject this way:
> >
> > Subject: Partitioning performance: cache stringToNode() of
> > pg_constraint.ccbin
> >
> > Per RFC 5322, section 2.2.3:
> >
> > The process of moving from this folded multiple-line representation
> > of a header field to its single line representation is called
> > "unfolding". Unfolding is accomplished by simply removing any CRLF
> > that is immediately followed by WSP. Each header field should be
> > treated in its unfolded form for further syntactic and semantic
> > evaluation. An unfolded header field has no length restriction and
> > therefore may be indeterminately long.
> >
> > So, the archives should present the subject like this:
> >
> > Partitioning performance: cache stringToNode() of pg_constraint.ccbin
> >
> > Gmane and osdir.com do so. MARC and Gmail show a space in place of the tab,
> > but Gmail converts every subject-line tab to a space. I have attached a
> > patch, against pgarchives.git, making its unfolding code conform to RFC 5322.
> > The change also affects headers folded before a space rather than before a
> > tab, such as 50E31370(dot)5030405(at)cybertec(dot)at. Those have been displaying fine
> > despite the lack of unfolding because newline-space renders like a space in
> > HTML. I unit-tested the change, but I did not test the full archives load.
> >
> >
> > The "raw" message display feature seems to have its own set of rules, and I
> > failed to find their implementation. Here are the subject lines for the
> > aforementioned messages according to "raw" display:
> >
> > Subject: Partitioning performance: cache stringToNode() of pg_constraint.ccbin
> > Subject: Review of "pg_basebackup and pg_receivexlog to use non-blocking socket
> > communication", was: Re: Re: [BUGS] BUG #7534: walreceiver takes
> > long time to detect n/w breakdown
> >
> > In one case, "\n\t" from the true raw original (in the mbox file) became " ".
> > In the other case, two instances of "\n " became "\n\t". Any ideas where that
> > transformation is coming from?
>
> Ping. Any advice on how to more-thoroughly test the pgarchives.git change, or
> where I might find the corresponding code affecting "raw" message display?
>
>
>
Hi!

This one is entirely on me; I just haven't been able to get around to it
yet :( It's still on my TODO list though, so I haven't given up on you!

Sorry!

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
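
For reference, the unfolding rule quoted above from RFC 5322, section 2.2.3, can be sketched in a few lines of Python. This is only an illustration of the rule, not the patch attached to the original message. The name `unfold_header` is hypothetical, and accepting bare LF in addition to CRLF is an assumption, made because mbox files often store LF-only line endings.

```python
import re

# RFC 5322 section 2.2.3: unfolding removes any CRLF that is immediately
# followed by WSP (space or horizontal tab). The WSP itself is kept.
# Assumption: a bare "\n" is treated like CRLF, since mbox files commonly
# use LF-only line endings.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold_header(value: str) -> str:
    """Return the single-line form of a folded header value (hypothetical helper)."""
    return _FOLD.sub("", value)


# The subject from the thread, folded before a tab as in the mbox file.
folded = "Partitioning performance: cache stringToNode() of\n\tpg_constraint.ccbin"
unfolded = unfold_header(folded)
print(repr(unfolded))
```

The tab survives unfolding, so "of" and "pg_constraint.ccbin" stay separated by whitespace. A line break that is not followed by a space or tab is left untouched, because it is not a fold.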
