[ntp:questions] forwarding
Per Hedeland
per at hedeland.org
Wed Mar 16 00:50:02 UTC 2005
First, this is long, very long, and doesn't discuss NTP at all - just
move on if this offends you. Second, I strongly disagree with those that
claim that this discussion is off-topic and should be taken to private
e-mail - meta-discussions about a newsgroup's form and/or charter are
quite relevant (within limits of course), and while there is only a
handful of people that have actually complained about what the mailing
list gateway is doing to the group, per standard Usenet/mailing-list
demographics it can reasonably be assumed that many more are silently
either suffering or simply dropping off.
That being said, I'll also tentatively disagree with those that think
that the influx from the mailing list is lowering the technical quality
of the group. For as long as I can remember, there have been "stupid"
newbie and/or off-topic questions posted directly to the newsgroup, and I
don't see the messages coming from the mailing list as having a
perceptibly higher ratio of such. Possibly the opposite, as a large
proportion of the messages from the mailing list, for some reason, are
sent by people that regularly provide potentially useful answers to
questions asked.
I can't really buy the argument that the mailing list is needed for
those that don't have access to Usenet though - surely none of the
abovementioned posters are in this category, and in this day and age the
intersection of people that a) have access to unrestricted e-mail, b)
are unable to point a web browser at groups.google.com, and c) have a
need to discuss NTP, must be vanishingly small. I can certainly
appreciate that some people *prefer* the mailing list medium to either
real Usenet or a web-based interface to same, and it's perfectly fine to
provide them with a mail gateway - as long as it doesn't negatively
impact the group.
But in my opinion the non-Usenet-standard form of this gateway is a
severe negative impact on the group, in the way that it fragments and
disconnects discussion threads, leading to redundant posts, and
responses to questions (or vice versa) being hard to find - in short, a
general disorder compared to a properly functioning Usenet newsgroup.
And in my opinion, this is not acceptable. Since Brad has explained that
he doesn't have the technical knowledge to modify the gatewaying
software, I can certainly help out with that, having implemented several
news<->mail gateways without these annoying properties. But there seem
to be some other obstacles - read on...
In article <mailman.18.1110671158.576.questions at lists.ntp.isc.org> Brad
Knowles <brad at stop.mail-abuse.org> writes:
>At 10:01 PM +0000 2005-03-12, Hans Jørgen Jakobsen wrote:
>
>> On my reader (slrn) the gateway'd articles seem to
>> 1) break threading
>
> This is the Message-id: issue. This is not going to change. See
><http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq04.059.htp>.
This is a most interesting page, since it gives an entirely different
motivation for mailman's behaviour than the one given in the code (which
you earlier - indirectly - referred to). And the motivation given here
is just plain wrong:
1) Unless you assume that all news<->mail gateways use the same
software, or that there is a standard way to rewrite Message-IDs in
news<->mail gatewaying (obviously neither is the case), modifying
Message-IDs in any way greatly *increases* the risk of infinite
loops. It isn't hard to imagine a scenario where two different
gateways each do their own munging of Message-IDs, neither
recognizing the form the other uses, with the result that a message
can circulate forever mail->news->mail->news->...
In contrast, preserving the Message-ID means that the buck in
principle stops the second time someone tries to feed the message
into Usenet - it will be rejected since the message already exists
(for those that don't know it, this is a fundamental property of the
Usenet flood distribution mechanism).
2) There is an abundance of methods to solve the "don't send messages
from the mailing list back to the mailing list when doing news->mail
gatewaying" problem, that don't include munging of Message-IDs. The
most common method is to solve it at the news server side, using the
same mechanism that normally prevents a news server from even
offering an article back to the peer it received it from (i.e. the Path
header). Another possibility is to have a small local database of
messages that have recently been gatewayed mail->news (using the
unique and unchanging Message-ID as key, of course). And the "add an
extra header" method also works fine with a properly functioning news
server.
3) The "Zawinski algorithm", which is identical to what Wayne Davison
implemented in trn (modulo the use of In-Reply-To[1]) back when
Netscape was at most a gleam in Jamie's eye, can *not* compensate for
the breakage caused by mailman's munging. A posting that doesn't have
*any* Message-ID from a Usenet posting in its References header (the
normal case for a first-level followup from the mailing list to an
original message from the mailing list) can never be properly
threaded.
Messages that have some Usenet Message-IDs and some non-Usenet in
References can possibly be placed at the right level in the thread,
but not fully linked into it - in particular it's typically not
possible to find the direct parent. This is something you have to
accept for a followup to a private or missing message, but totally
inexcusable when the parent message is sitting right there in the
group - but with a munged Message-ID.
The Subject/date/etc sorting is just a crude workaround "when all
else fails" - it is not threading, but can at most place a message in
the "general area" of related posts. And even this is frequently
defeated by mailman's Subject munging (in particular for the
first-level followup case above - those messages can't even be
"Subject sorted" but end up totally disconnected from the parent).
Again, this is something that a decent newsreader should implement as
a last resort - but it's totally inexcusable for any software to
systematically force this kind of breakage onto a newsgroup.
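To make point 3 concrete, here is a minimal sketch of References-based parent lookup. It is not trn's or any other newsreader's actual code; the function name and the Message-IDs are made up for illustration, and the only assumption is the one described above: the followup's References carries the ID the author saw on the list, while the group only holds the gateway's rewritten ID.

```python
def find_parent(references, ids_in_group):
    """Return the nearest ancestor from References that exists in the group."""
    # References lists ancestors oldest first, so scan from the end.
    for message_id in reversed(references):
        if message_id in ids_in_group:
            return message_id
    return None


# Hypothetical IDs: the original as sent on the list, and as rewritten
# by a gateway on its way into the newsgroup.
original_id = "<orig.1@list.example>"
munged_id = "<gateway.1@news.example>"

# A first-level followup written on the list refers to the ID its author saw.
followup_references = [original_id]

preserved = find_parent(followup_references, {original_id})
munged = find_parent(followup_references, {munged_id})

print(preserved)  # the parent is found when the ID was preserved
print(munged)     # None: with a munged ID the followup is an orphan
```

No amount of cleverness in the lookup helps in the second case, since the ID in References simply does not exist in the group.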
[1] In-Reply-To does not have any defined semantics for Usenet messages
(it would be reasonable to call it "invalid", if it weren't for the fact
that what standards there are specifically "allow" any "undefined"
headers). Of course a newsreader implementor is free to try to take
advantage of it, but it isn't a reasonable requirement. It is however
quite reasonable to expect that mail->news gatewaying software combines
In-Reply-To and References into the only one of them that is relevant
for Usenet, i.e. References, as part of the format conversion it
*should* do (a concept that seems to be unknown to the mailman
developers). And on that topic, a gateway that munges Message-IDs must
*of course* munge References/In-Reply-To in exactly the same way.
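As a rough illustration of the format conversion meant in [1], the sketch below folds In-Reply-To into References. This is an assumed approach, not what mailman or any particular gateway does; the function name and the regular expression are mine, and the regular expression is only a loose approximation of Message-ID syntax.

```python
import re

# Loose match for "<...>" tokens; real Message-ID syntax is stricter.
MSGID = re.compile(r"<[^<>\s]+>")


def usenet_references(references, in_reply_to):
    """Build one References value from a mail message's two headers."""
    ids = MSGID.findall(references or "")
    # In-Reply-To names the direct parent, so it belongs last unless
    # References already contains it.
    for message_id in MSGID.findall(in_reply_to or ""):
        if message_id not in ids:
            ids.append(message_id)
    return " ".join(ids)


merged = usenet_references("<a@x.example> <b@x.example>", "<c@x.example>")
print(merged)
```

A message that carries only In-Reply-To (common with some mail clients) then still ends up with a usable References header on the Usenet side.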
In contrast to this, the motivation given in the code was at least
technically correct, even if misguided. It concerned the possibility of a
message being sent simultaneously to two mailing lists, both gatewayed
to Usenet (though at different places), or of having two different
gateways from a single mailing list into different newsgroups. In these
cases, gatewaying without Message-ID munging typically means that only
the "first" gatewaying of the message is successful, the others will be
rejected since the message already exists on Usenet - and so e.g. the
"cross-mailed" message will only appear in one of the two newsgroups it
"should" be in.
This is a pretty well-known shortcoming of mail->news gatewaying, and
generally accepted given that the occurrence is relatively unusual and
that there is no good solution (the non-existent "standard way to
rewrite Message-IDs" could handle it) - certainly way preferable over
causing the type of damage to *all* messages that mailman does.
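For those who want to see the mechanism, a toy model of the duplicate check follows. It is a sketch only: a real news server keeps a persistent history database, and the class, method and group names here are hypothetical. It shows both sides of the coin, i.e. why a preserved Message-ID stops a loop and why the second gatewaying of a "cross-mailed" message is lost.

```python
class NewsServer:
    """Toy server that accepts each Message-ID at most once."""

    def __init__(self):
        self.history = {}  # Message-ID -> group of the accepted copy

    def offer(self, message_id, newsgroup):
        # A known Message-ID is refused no matter where the copy comes from.
        if message_id in self.history:
            return False
        self.history[message_id] = newsgroup
        return True


server = NewsServer()
message_id = "<post.1@list.example>"

# Two gateways feed the same mail message into different groups.
first = server.offer(message_id, "example.group.a")
second = server.offer(message_id, "example.group.b")

print(first, second)               # only the first offer is accepted
print(server.history[message_id])  # the message exists in one group only
```

The same refusal is what ends a mail->news->mail->news loop at the second injection, provided nobody has rewritten the Message-ID in between.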
>> 2) modify Subject header. ie insert "[ntp:questions]", sometimes in
>> several levels
>
> The gateway does not do this. I've checked the gateway source
>code, and when configured to do so, it goes out of its way to avoid
>adding the subject prefix to messages being posted to the newsgroup.
>This is done by default, and the gateway we run is configured this
>way.
>
> However, what does not happen in Mailman 2.1.5 is the *removal*
>of all previous examples of the prefix which may have been put in the
>subject line by MUAs.
Sorry, but this is nonsensical. Mailman inserts the string on messages
sent out on the mailing list - none of the newsgroup's business *so
far*. A properly functioning MUA will not modify the Subject on a reply
other than by prefixing with "Re: ", or allowing modifications performed
manually by the user. Ergo, all responses on the mailing list will still
have the string that mailman inserted, and it is still there when
mailman gateways the message into the newsgroup. Saying that "the MUA
put it there" makes about as much sense as saying that the MUA produced
the text of the message - technically correct, but entirely pointless.
> One solution to this problem will be
>incorporated into Mailman 2.1.6, which I am planning on installing as
>soon as it is officially released.
From the source I looked at, this would be a quick one-liner to fix - no
need to sit around waiting for the next "official release". Removal of
the Message-ID munging might actually require commenting out more than
one line, IIRC. The ability to modify and fix is after all one of the
major advantages with using open-source software.
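To show how little is involved, here is a sketch of the prefix removal. It is not the mailman code or the 2.1.6 change, just an illustration; the function name is made up, and the collapsing of repeated "Re: " is an extra that I am assuming is wanted.

```python
import re


def strip_list_prefix(subject, prefix="[ntp:questions]"):
    """Remove every copy of the list prefix and tidy up what is left."""
    # Drop each occurrence of the prefix along with surrounding whitespace.
    cleaned = re.sub(r"\s*" + re.escape(prefix) + r"\s*", " ", subject)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Collapse a leading run of "Re: " into a single one.
    return re.sub(r"^(?:Re:\s*)+", "Re: ", cleaned, flags=re.IGNORECASE)


subject = "Re: [ntp:questions] Re: [ntp:questions] forwarding"
print(strip_list_prefix(subject))  # Re: forwarding
```

Applied in the mail->news direction only, this leaves the mailing list's own Subject convention untouched.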
--Per Hedeland
per at hedeland.org