Re: PGLister fails to de-dup messages addressed twice to same list

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-www(at)lists(dot)postgresql(dot)org
Subject: Re: PGLister fails to de-dup messages addressed twice to same list
Date: 2017-11-21 15:43:38
Message-ID: 20171121154338.GZ4628@tamriel.snowman.net
Lists: pgsql-www

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> >> ... I have no doubt at all that that's
> >> going to happen a *lot* during the list domain changeover, so I'd
> >> strongly recommend putting something in place to de-dup.
>
> > Yeah, I'm already chatting w/ Magnus about this.
>
> Curiously, my replies to the same message seem to have been delivered
> only once, and that's not because I was awake enough to notice and
> remove the extra cc ;-). So my guess at this point is that you do
> have some de-dup in there, but it ain't working for gmail-originated
> messages.

As near as I can tell, GMail delivered the message to us in two
independent runs with two connections to our mail server, while your
server only delivered one message in one run to our server.

I'm guessing that your server realized it was the same MX for both
postgresql.org and lists.postgresql.org and expected our server to
handle delivering to the multiple addresses, but PGLister, for a given
email that comes in, is only going to deliver once to each of the lists
that are listed in the inbound email. On the other hand, GMail seems to
split the email on the source side for each domain/subdomain and
deliver the copies independently.
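
To illustrate the difference, here is a toy sketch of the two batching
strategies. It is not how either MTA is actually implemented; the MX
table, the host name in it, and the two recipient addresses are made up
for the example, and the only thing taken from the discussion above is
that both domains share one MX.

```python
from collections import defaultdict

# Hypothetical MX table: both domains are assumed to point at the same host.
MX = {
    "postgresql.org": "mx.example.invalid",
    "lists.postgresql.org": "mx.example.invalid",
}


def domain_of(addr):
    """Return the lower-cased domain part of an email address."""
    return addr.rsplit("@", 1)[1].lower()


def batches_by_domain(recipients):
    """One delivery per recipient domain (what GMail appears to do)."""
    groups = defaultdict(list)
    for addr in recipients:
        groups[domain_of(addr)].append(addr)
    return list(groups.values())


def batches_by_mx(recipients, mx=MX):
    """One delivery per destination mail host (what Tom's server appears to do)."""
    groups = defaultdict(list)
    for addr in recipients:
        groups[mx[domain_of(addr)]].append(addr)
    return list(groups.values())


# Illustrative addresses for the same list under the old and new domains.
rcpts = ["pgsql-www@postgresql.org", "pgsql-www@lists.postgresql.org"]
```

With those two recipients, batching by domain gives two separate
deliveries of the same message, while batching by MX gives a single
delivery that names both addresses.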

Unfortunately, we can't depend on the sender's MTA to always combine the
recipients into a single delivery to us, as GMail makes clear, and
expecting that isn't really "correct" anyway. We need a message-id
cache in the PG database that throws away dups, on a per-list basis, as
they come in. I don't anticipate it being too difficult to implement,
really, but I think we'll need entries to last at least a couple of
days, which implies having a cleanup job for it, etc.
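
Very roughly, and only as a sketch, the cache could look something like
the following. Nothing here is PGLister code: the table name, column
names, function names, and the two-day retention are all assumptions,
and sqlite3 stands in for PostgreSQL purely so the example runs on its
own.

```python
import sqlite3
import time

# "A couple of days"; the exact retention window is a guess.
RETENTION_SECONDS = 2 * 24 * 3600


def open_cache(path=":memory:"):
    """Open the cache and create the (hypothetical) table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS seen_messages (
               list_name  TEXT NOT NULL,
               message_id TEXT NOT NULL,
               seen_at    REAL NOT NULL,
               PRIMARY KEY (list_name, message_id)
           )"""
    )
    return conn


def first_delivery(conn, list_name, message_id, now=None):
    """Record the message; return True only the first time a list sees it.

    The primary key makes the check-and-insert a single statement. In
    PostgreSQL the equivalent would be INSERT ... ON CONFLICT DO NOTHING.
    """
    now = time.time() if now is None else now
    cur = conn.execute(
        "INSERT OR IGNORE INTO seen_messages (list_name, message_id, seen_at) "
        "VALUES (?, ?, ?)",
        (list_name, message_id, now),
    )
    conn.commit()
    return cur.rowcount == 1


def cleanup(conn, now=None, retention=RETENTION_SECONDS):
    """Delete entries older than the retention window; return how many went."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "DELETE FROM seen_messages WHERE seen_at < ?", (now - retention,)
    )
    conn.commit()
    return cur.rowcount
```

The delivery path would call first_delivery() once per target list and
drop the message for that list when it returns False. Keying on the
(list, message-id) pair means a message cc'd to two different lists
still reaches both, and cleanup() is the piece that would run from a
periodic job.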

Thanks!

Stephen
