<pre>RFC 850 June 1983
<span class="h1">Standard for Interchange of USENET Messages</span>
Mark R. Horton
[ This memo is distributed as an RFC only to make this
information easily accessible to researchers in the ARPA
community. It does not specify an Internet standard. ]
<span class="h2"><a class="selflink" id="section-1" href="#section-1">1</a>. Introduction</span>
This document defines the standard format for interchange
of Network News articles among USENET sites. It describes
the format for articles themselves, and gives partial
standards for transmission of news. The news transmission
is not entirely standardized in order to give a good deal
of flexibility to the individual hosts to choose
transmission hardware and software, whether to batch news,
and so on.
There are five sections to this document. Section two
defines the article format. Section three defines the
valid control messages. Section four specifies some valid
transmission methods. Section five describes the overall
news propagation algorithm.
<span class="h2"><a class="selflink" id="section-2" href="#section-2">2</a>. Article Format</span>
The primary consideration in choosing an article format is
that it fit in with existing tools as well as possible.
Existing tools include both implementations of mail and
news. (The notesfiles system from the University of
Illinois is considered a news implementation.) A standard
format for mail messages has existed for many years on the
ARPANET, and this format meets most of the needs of
USENET. Since the ARPANET format is extensible,
extensions to meet the additional needs of USENET are
easily made within the ARPANET standard. Therefore, the
rule is adopted that all USENET news articles must be
formatted as valid ARPANET mail messages, according to the
ARPANET standard RFC 822. This standard is more
restrictive than the ARPANET standard, placing additional
requirements on each article and forbidding use of certain
ARPANET features. However, it should always be possible
to use a tool expecting an ARPANET message to process a
news article. In any situation where this standard
conflicts with the ARPANET standard, RFC 822 should be
considered correct and this standard in error.
- 1 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>An example message is included to illustrate the fields.
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
Path: cbosgd!mhuxj!mhuxt!eagle!jerry
From: jerry@eagle.uucp (Jerry Schwarz)
Newsgroups: net.general
Subject: Usenet Etiquette -- Please Read
Message-ID: <642@eagle.UUCP>
Date: Friday, 19-Nov-82 16:14:55 EST
Followup-To: net.news
Expires: Saturday, 1-Jan-83 00:00:00 EST
Date-Received: Friday, 19-Nov-82 16:59:30 EST
Organization: Bell Labs, Murray Hill
The body of the article comes here, after a blank line.
Here is an example of a message in the old format (before
the existence of this standard). It is recommended that
implementations also accept articles in this format to
ease upward conversion.
From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
Newsgroups: net.general
Title: Usenet Etiquette -- Please Read
Article-I.D.: eagle.642
Posted: Fri Nov 19 16:14:55 1982
Received: Fri Nov 19 16:59:30 1982
Expires: Mon Jan 1 00:00:00 1990
The body of the article comes here, after a blank line.
Some news systems transmit news in the "A" format, which
looks like this:
Aeagle.642
net.general
cbosgd!mhuxj!mhuxt!eagle!jerry
Fri Nov 19 16:14:55 1982
Usenet Etiquette - Please Read
The body of the article comes here, with no blank line.
An article consists of several header lines, followed by a
blank line, followed by the body of the message. The
header lines consist of a keyword, a colon, a blank, and
some additional information. This is a subset of the
ARPANET standard, simplified to allow simpler software to
handle it. The "from" line may optionally include a
full name, in the format above, or use the ARPANET angle
bracket syntax. To keep the implementations simple, other
formats (for example, with part of the machine address
after the close parenthesis) are not allowed. The ARPANET
convention of continuation header lines (beginning with a
blank or tab) is allowed.
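As a rough illustration of the layout just described, the
following Python sketch splits an article into its header
lines and body, folding continuation lines into the header
they continue. The name parse_headers is hypothetical and is
not defined by this standard; the sketch assumes lines end
with a newline character and performs no validation.

```python
def parse_headers(article: str):
    """Split an article into (headers, body).

    headers is a list of (keyword, value) pairs in their original
    order. A line beginning with a blank or tab is treated as a
    continuation of the previous header line.
    """
    # The first blank line separates the headers from the body.
    head, _, body = article.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: append to the previous header's value.
            keyword, value = headers[-1]
            headers[-1] = (keyword, value + " " + line.strip())
        else:
            # Ordinary header line: keyword, colon, blank, information.
            keyword, _, value = line.partition(":")
            headers.append((keyword, value.strip()))
    return headers, body
```

Keeping the headers as an ordered list rather than a mapping
preserves unrecognized headers and their order, which is
convenient when an article must be passed on unchanged.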
- 2 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>Certain headers are required, certain headers are
optional. Any unrecognized headers are allowed, and will
be passed through unchanged. The required headers are
Relay-Version, Posting-Version, From, Date, Newsgroups,
Subject, Message-ID, Path. The optional headers are
Followup-To, Date-Received, Expires, Reply-To, Sender,
References, Control, Distribution, Organization.
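A receiving program might check for the required headers
along the following lines. This is only a sketch: the names
REQUIRED and missing_required are hypothetical, and the
case-insensitive comparison of keywords follows RFC 822
rather than anything stated in this section.

```python
# The eight required headers, as listed in this standard.
REQUIRED = ("Relay-Version", "Posting-Version", "From", "Date",
            "Newsgroups", "Subject", "Message-ID", "Path")

def missing_required(header_names):
    """Return the required header keywords absent from header_names."""
    present = {name.lower() for name in header_names}
    return [name for name in REQUIRED if name.lower() not in present]
```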
<span class="h3"><a class="selflink" id="section-2.1" href="#section-2.1">2.1</a> Required Headers</span>
<span class="h4"><a class="selflink" id="section-2.1.1" href="#section-2.1.1">2.1.1</a> Relay-Version </span>This header line shows the version
of the program responsible for the transmission of this
article over the immediate link, that is, the program that
is relaying the article from the next site. For example,
suppose site A sends an article to site B, and site B
forwards the article to site C. The message being
transmitted from A to B would have a Relay-Version header
identifying the program running on A, and the message
transmitted from B to C would identify the program running
on B. This header can be used to interpret older headers
in an upward compatible way. Relay-Version must always be
the first in a message; thus, all articles meeting this
standard will begin with an upper case "R". No other
restrictions are placed on the order of header lines.
The line contains two fields, separated by semicolons.
The fields are the version and the full domain name of the
site. The version should identify the system program used
(e.g., "B") as well as a version number and version
date. For example, the header line might contain
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
This header should not be passed on to additional sites.
A relay program, when passing an article on, should
include only its own Relay-Version, not the Relay-Version
of some other site. (For upward compatibility with older
software, if a Relay-Version is found in a header which is
not the first line, it should be assumed to have been
moved by an older version of news, and should be deleted.)
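The two rules above (two semicolon-separated fields, and each
relay substituting its own line) can be sketched in Python as
follows. The function names are hypothetical, and headers are
assumed to be held as a list of (keyword, value) pairs.

```python
def parse_version_line(value: str):
    """Parse 'version B 2.10 2/13/83; site cbosgd.UUCP' into a dict."""
    fields = {}
    for part in value.split(";"):
        # Each field is a name ("version" or "site") followed by its text.
        name, _, rest = part.strip().partition(" ")
        fields[name] = rest.strip()
    return fields

def relay_headers(headers, own_relay_version: str):
    """Drop every inherited Relay-Version and put our own line first."""
    kept = [(k, v) for k, v in headers if k.lower() != "relay-version"]
    return [("Relay-Version", own_relay_version)] + kept
```

Removing every inherited Relay-Version, wherever it appears,
also covers the compatibility case of a stray line that an
older program moved out of first position.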
<span class="h4"><a class="selflink" id="section-2.1.2" href="#section-2.1.2">2.1.2</a> Posting-Version </span> This header identifies the
software responsible for entering this message into the
network. It has the same format as Relay-Version. It
will normally identify the same site as the Message-ID,
unless the posting site is serving as a gateway for a
message that already contains a message ID generated by
mail. (While it is permissible for a gateway to use an
externally generated message ID, the message ID should be
checked to ensure it conforms to this standard and to RFC
822.)
- 3 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h4"><a class="selflink" id="section-2.1.3" href="#section-2.1.3">2.1.3</a> From </span>The From line contains the electronic mailing
address of the person who sent the message, in the ARPA
internet syntax. It may optionally also contain the full
name of the person, in parentheses, after the electronic
address. The electronic address is the same as the entity
responsible for originating the article, unless the Sender
header is present, in which case the From header might not
be verified. Note that in all site and domain names,
upper and lower case are considered the same, thus
mark@cbosgd.UUCP, mark@cbosgd.uucp, and mark@CBosgD.UUcp
are all equivalent. User names may or may not be case
sensitive, for example, Billy@cbosgd.UUCP might be
different from BillY@cbosgd.UUCP. Programs should avoid
changing the case of electronic addresses when forwarding
news or mail.
<a href="./rfc822">RFC 822</a> specifies that all text in parentheses is to be
interpreted as a comment. It is common in ARPANET mail to
place the full name of the user in a comment at the end of
the From line. This standard specifies a more rigid
syntax. The full name is not considered a comment, but an
optional part of the header line. Either the full name is
omitted, or it appears in parentheses after the electronic
address of the person posting the article, or it appears
before an electronic address enclosed in angle brackets.
Thus, the three permissible forms are:
From: mark@cbosgd.UUCP
From: mark@cbosgd.UUCP (Mark Horton)
From: Mark Horton <mark@cbosgd.UUCP>
Full names may contain any printing ASCII characters from
space through tilde, with the exceptions that they may not
contain parentheses "(" or ")", or angle brackets
"<" or ">". Additional restrictions may be placed on
full names by the mail standard, in particular, the
characters comma ",", colon ":", and semicolon ";"
are inadvisable in full names.
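The three permissible forms can be recognized with a small
set of patterns. The Python sketch below is illustrative
only: parse_from is a hypothetical name, and the address
pattern (no white space, parentheses, or angle brackets, with
one "@") is a simplifying assumption, not the full RFC 822
address grammar.

```python
import re

# Simplified address: assumed to be user@domain with no blanks,
# parentheses, or angle brackets.
_ADDR = r"[^\s()<>]+@[^\s()<>]+"
# Full name: printing ASCII from space through tilde, excluding
# "(", ")", "<", and ">".
_NAME = r"[\x20-\x27\x2a-\x3b\x3d\x3f-\x7e]+"

_FORMS = [
    re.compile(rf"^(?P<addr>{_ADDR})$"),
    re.compile(rf"^(?P<addr>{_ADDR}) \((?P<name>{_NAME})\)$"),
    re.compile(rf"^(?P<name>{_NAME}) <(?P<addr>{_ADDR})>$"),
]

def parse_from(value: str):
    """Return (address, full_name_or_None), or None if not a legal form."""
    for form in _FORMS:
        match = form.match(value.strip())
        if match:
            return match.group("addr"), match.groupdict().get("name")
    return None
```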
<span class="h4"><a class="selflink" id="section-2.1.4" href="#section-2.1.4">2.1.4</a> Date </span>The Date line (formerly "Posted") is the
date, in a format that must be acceptable both to the
ARPANET and to the getdate routine, that the article was
originally posted to the network. This date remains
unchanged as the article is propagated throughout the
network. One format that is acceptable to both is
Weekday, DD-Mon-YY HH:MM:SS TIMEZONE
Several examples of valid dates appear in the sample
article above. Note in particular that ctime format:
Wdy Mon DD HH:MM:SS YYYY
- 4 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>is not acceptable because it is not a valid ARPANET date.
However, since older software still generates this format,
news implementations are encouraged to accept this format
and translate it into an acceptable format.
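One way to perform such a translation is sketched below in
Python. Since the ctime format carries no time zone, the
zone must be supplied by the caller; that parameter, like
the function name ctime_to_usenet, is an assumption of this
sketch. It also assumes English day and month names (the
default "C" locale).

```python
from datetime import datetime

def ctime_to_usenet(stamp: str, zone: str) -> str:
    """Convert 'Wdy Mon DD HH:MM:SS YYYY' to
    'Weekday, DD-Mon-YY HH:MM:SS TIMEZONE'."""
    # strptime accepts the space-padded day that ctime produces.
    parsed = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    weekday = parsed.strftime("%A")
    rest = parsed.strftime("%b-%y %H:%M:%S")
    return f"{weekday}, {parsed.day}-{rest} {zone}"
```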
The contents of the TIMEZONE field are currently subject to
worldwide time zone abbreviations, including the usual
American zones (PST, PDT, MST, MDT, CST, CDT, EST, EDT),
the other North American zones (Bering through
Newfoundland), European zones, Australian zones, and so
on. Lacking a complete list at present (and unsure if an
unambiguous list exists), authors of software are
encouraged to keep this code flexible, and in particular
not to assume that time zone names are exactly three
letters long. Implementations are free to edit this
field, keeping the time the same, but changing the time
zone (with an appropriate adjustment to the local time
shown) to a known time zone.
<span class="h4"><a class="selflink" id="section-2.1.5" href="#section-2.1.5">2.1.5</a> Newsgroups </span>The Newsgroups line specifies which
newsgroup or newsgroups the article belongs in. Multiple
newsgroups may be specified, separated by a comma.
Newsgroups specified must all be the names of existing
newsgroups, as no new newsgroups will be created by simply
posting to them.
Wildcards (e.g., the word "all") are never allowed in a
Newsgroups line. For example, a newsgroup "net.all" is
illegal, although a newsgroup name "net.sport.football"
is permitted.
If an article is received with a Newsgroups line listing
some valid newsgroups and some invalid newsgroups, a site
should not remove invalid newsgroups from the list.
Instead, the invalid newsgroups should be ignored. For
example, suppose site A subscribes to the classes
"btl.all" and "net.all", and exchanges news articles
with site B, which subscribes to "net.all" but not
"btl.all". Suppose A receives an article with
"Newsgroups: net.micro,btl.general". This article is
passed on to B because B receives net.micro, but B does
not receive btl.general. A must leave the Newsgroups line
unchanged. If it were to remove "btl.general", the
edited header could eventually reenter the "btl.all"
class, resulting in an article that is not shown to users
subscribing to "btl.general". Also, followups from
outside "btl.all" would not be shown to such users.
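The rules above might be sketched in Python as follows. The
function names are hypothetical. The matching rule used
here, that a subscription pattern "X.all" covers every
newsgroup whose name begins with "X.", is an assumption made
for illustration; this section does not define pattern
matching precisely.

```python
def parse_newsgroups(value: str):
    """Split a Newsgroups line; reject the wildcard word 'all'."""
    groups = [g.strip() for g in value.split(",") if g.strip()]
    for group in groups:
        # "net.all" is illegal, but "net.sport.football" is fine:
        # only a whole component equal to "all" is a wildcard.
        if "all" in group.split("."):
            raise ValueError(f"wildcard not allowed: {group}")
    return groups

def subscribed(group: str, patterns) -> bool:
    """True if group falls under any subscription pattern (assumed rule)."""
    for pattern in patterns:
        if pattern.endswith(".all"):
            if group.startswith(pattern[:-3]):  # keeps the trailing "."
                return True
        elif group == pattern:
            return True
    return False

def should_forward(newsgroups_value: str, neighbor_patterns) -> bool:
    """Forward if the neighbor receives at least one listed newsgroup.

    The Newsgroups line itself is never edited; newsgroups the
    neighbor does not receive are simply ignored.
    """
    return any(subscribed(g, neighbor_patterns)
               for g in parse_newsgroups(newsgroups_value))
```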
- 5 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h4"><a class="selflink" id="section-2.1.6" href="#section-2.1.6">2.1.6</a> Subject </span> The Subject line (formerly "Title")
tells what the article is about. It should be suggestive
enough of the contents of the article to enable a reader
to make a decision whether to read the article based on
the subject alone. If the article is submitted in
response to another article (e.g., is a "followup") the
default subject should begin with the four characters
"Re: " and the References line is required. (The user
might wish to edit the subject of the followup, but the
default should begin with "Re: ".)
<span class="h4"><a class="selflink" id="section-2.1.7" href="#section-2.1.7">2.1.7</a> Message-ID </span>The Message-ID line gives the article a
unique identifier. The same message ID may not be reused
during the lifetime of any article with the same message
ID. (It is recommended that no message ID be reused for
at least two years.) Message ID's have the syntax
"<" "string not containing blank or >" ">"
In order to conform to <a href="./rfc822">RFC 822</a>, the Message-ID must have
the format
"<" "unique" "@" "full domain name" ">"
where "full domain name" is the full name of the host at
which the article entered the network, including the domain
that the host is in, and "unique" is any string of printing
ASCII characters, not including "<", ">", or "@". For
example, the "unique" part could be an integer
representing a sequence number for articles submitted to
the network, or a short string derived from the date and
time the article was created. For example, a valid message
ID for an article submitted from site ucbvax in domain
Berkeley.ARPA would be "<4123@ucbvax.Berkeley.ARPA>".
Programmers are urged not to make assumptions about the
content of message ID fields from other hosts, but to
treat them as unknown character strings. It is not safe,
for example, to assume that a message ID will be under 14
characters, nor that it is unique in the first 14
characters.
The angle brackets are considered part of the message ID.
Thus, in references to the message ID, such as the
ihave/sendme and cancel control messages, the angle
brackets are included. White space characters (e.g.,
blank and tab) are not allowed in a message ID. All
characters between the angle brackets must be printing
ASCII characters.
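A syntax check for the RFC 822 conforming form might look
like the Python sketch below. The name is_valid_message_id is
hypothetical, and applying the same character restrictions to
the domain part as to the "unique" part is an assumption of
the sketch. Note that the check deliberately imposes no
length limit.

```python
import re

# Printing ASCII (no blank or tab), excluding "<", ">", and "@".
_PART = r"[\x21-\x3b\x3d\x3f\x41-\x7e]+"
_MESSAGE_ID = re.compile(rf"<{_PART}@{_PART}>")

def is_valid_message_id(text: str) -> bool:
    """True if text has the form <unique@full.domain.name>.

    The angle brackets are part of the message ID and must be present.
    """
    return _MESSAGE_ID.fullmatch(text) is not None
```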
<span class="h4"><a class="selflink" id="section-2.1.8" href="#section-2.1.8">2.1.8</a> Path </span>This line shows the path the article took to
reach the current system. When a system forwards the
message, it should add its own name to the list of systems
in the Path line. The names may be separated by any
punctuation character or characters, thus
- 6 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>"cbosgd!mhuxj!mhuxt", "cbosgd, mhuxj, mhuxt", and
"@cbosgd.uucp,@mhuxj.uucp,@mhuxt.uucp" and even
"teklabs, zehntel, sri-unix@cca!decvax" are valid
entries. (The last path indicates a message that passed
through decvax, cca, sri-unix, zehntel, and teklabs, in
that order.) Additional names should be added from the
left; for example, the most recently added name in the
fourth example was "teklabs". Letters, digits, periods
and hyphens are considered part of site names; other
punctuation, including blanks, are considered separators.
Normally, the rightmost name will be the name of the
originating system. However, it is also permissible to
include an extra entry on the right, which is the name of
the sender. This is for upward compatibility with older
systems.
The Path line is not used for replies, and should not be
taken as a mailing address. It is intended to show the
route the message travelled to reach the local site.
There are several uses for this information. One is to
monitor USENET routing for performance reasons. Another
is to establish a path to reach new sites. Perhaps the
most important is to cut down on redundant USENET traffic
by failing to forward a message to a site that is known to
have already received it. In particular, when site A
sends an article to site B, the Path line includes "A",
so that site B will not immediately send the article back
to site A. The site name each site uses to identify
itself should be the same as the name by which its
neighbors know it, in order to make this optimization
possible.
A site adds its own name to the front of a path when it
receives a message from another site. Thus, if a message
with path A!X!Y!Z is passed from site A to site B, B will
add its own name to the path when it receives the message
from A, e.g., B!A!X!Y!Z. If B then passes the message on
to C, the message sent to C will contain the path
B!A!X!Y!Z, and when C receives it, C will change it to
C!B!A!X!Y!Z.
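The handling of the Path line can be sketched in Python as
follows. The function names are hypothetical. The sketch
treats letters, digits, periods, and hyphens as part of site
names and everything else as a separator, and compares site
names without regard to case.

```python
import re

def path_sites(path: str):
    """Return the names in a Path line, most recently added first.

    The rightmost entry may be a user name rather than a site.
    """
    return re.findall(r"[A-Za-z0-9.-]+", path)

def add_site(path: str, site: str) -> str:
    """Add the receiving site's name to the front of the path."""
    return f"{site}!{path}"

def already_seen(path: str, site: str) -> bool:
    """True if site appears in the path, so forwarding can be skipped."""
    return site.lower() in (name.lower() for name in path_sites(path))
```

For example, add_site("A!X!Y!Z", "B") gives "B!A!X!Y!Z", and
a site holding that article would find already_seen true for
neighbor A and so would not send the article back to it.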
Special upward compatibility note: Since the From, Sender,
and Reply-To lines are in internet format, and since many
USENET sites do not yet have mailers capable of
understanding internet format, it would break the reply
capability to completely sever the connection between the
Path header and the reply function. Thus, sites are
required to continue to keep the Path line in a working
reply format as much as possible, until January 1, 1984.
It is recognized that the path is not always a valid reply
string in older implementations, and no requirement to fix
this problem is placed on implementations. However, the
- 7 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>existing convention of placing the site name and an "!"
at the front of the path, and of starting the path with
the site name, an "!", and the user name, should be
maintained at least until 1984.
<span class="h3"><a class="selflink" id="section-2.2" href="#section-2.2">2.2</a> Optional Headers</span>
<span class="h4"><a class="selflink" id="section-2.2.1" href="#section-2.2.1">2.2.1</a> Reply-To </span>This line has the same format as From.
If present, mailed replies to the author should be sent to
the name given here. Otherwise, replies are mailed to the
name on the From line. (This does not prevent additional
copies from being sent to recipients named by the replier,
or on To or Cc lines.) The full name may be optionally
given, in parentheses, as in the From line.
<span class="h4"><a class="selflink" id="section-2.2.2" href="#section-2.2.2">2.2.2</a> Sender </span>This field is present only if the submitter
manually enters a From line. It is intended to record the
entity responsible for submitting the article to the
network, and should be verified by the software at the
submitting site.
For example, if John Smith is visiting CCA and wishes to
post an article to the network, using friend Sarah Jones'
account, the message might read
From: smith@ucbvax.uucp (John Smith)
Sender: jones@cca.arpa (Sarah Jones)
If a gateway program enters a mail message into the
network at site sri-unix, the lines might read
From: John.Doe@CMU-CS-A.ARPA
Sender: network@sri-unix.ARPA
The primary purpose of this field is to be able to track
down articles to determine how they were entered into the
network. The full name may be optionally given, in
parentheses, as in the From line.
<span class="h4"><a class="selflink" id="section-2.2.3" href="#section-2.2.3">2.2.3</a> Followup-To </span>This line has the same format as
Newsgroups. If present, follow-up articles are to be
posted to the newsgroup(s) listed here. If this line is
not present, followups are posted to the newsgroup(s)
listed in the Newsgroups line, except that followups to
"net.general" should instead go to "net.followup".
<span class="h4"><a class="selflink" id="section-2.2.4" href="#section-2.2.4">2.2.4</a> Date-Received </span>This line (formerly "Received") is
in a legal USENET date format. It records the date and
time that the article was first received on the local
system. If this line is present in an article being
transmitted from one host to another, the receiving host
should ignore it and replace it with the current date.
Since this field is intended for local use only, no site
is required to support it. However, no site should pass
this field on to another site unchanged.
- 8 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h4"><a class="selflink" id="section-2.2.5" href="#section-2.2.5">2.2.5</a> Expires </span>This line, if present, is in a legal
USENET date format. It specifies a suggested expiration
date for the article. If not present, the local default
expiration date is used.
This field is intended to be used to clean up articles
with a limited usefulness, or to keep important articles
around for longer than usual. For example, a message
announcing an upcoming seminar could have an expiration
date the day after the seminar, since the message is not
useful after the seminar is over. Since local sites have
local policies for expiration of news (depending on
available disk space, for instance), users are discouraged
from providing expiration dates for articles unless there
is a natural expiration date associated with the topic.
System software should almost never provide a default
Expires line. Leave it out and allow local policies to be
used unless there is a good reason not to.
<span class="h4"><a class="selflink" id="section-2.2.6" href="#section-2.2.6">2.2.6</a> References </span>This field lists the message ID's of
any articles prompting the submission of this article. It
is required for all follow-up articles, and forbidden when
a new subject is raised. Implementations should provide a
follow-up command, which allows a user to post a follow-up
article. This command should generate a Subject line
which is the same as the original article, except that if
the original subject does not begin with "Re: " or "re: ",
the four characters "Re: " are inserted before the
subject. If there is no References line on the original
header, the References line should contain the message ID
of the original article (including the angle brackets).
If the original article does have a References line, the
followup article should have a References line containing
the text of the original References line, a blank, and the
message ID of the original article.
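A follow-up command could derive the two header lines as in
the Python sketch below. The function names are hypothetical,
and the original article's headers are assumed to be
available as a mapping from keyword to value.

```python
def followup_subject(subject: str) -> str:
    """Insert 'Re: ' unless the subject already begins with it."""
    if subject.startswith(("Re: ", "re: ")):
        return subject
    return "Re: " + subject

def followup_references(original_headers) -> str:
    """Build the References line for a follow-up article."""
    message_id = original_headers["Message-ID"]  # includes angle brackets
    previous = original_headers.get("References")
    if previous:
        # Original References text, a blank, then the original's ID.
        return f"{previous} {message_id}"
    return message_id
```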
The purpose of the References header is to allow articles
to be grouped into conversations by the user interface
program. This allows conversations within a newsgroup to
be kept together, and potentially users might shut off
entire conversations without unsubscribing to a newsgroup.
User interfaces need not make use of this header, but all
automatically generated followups should generate the
References line for the benefit of systems that do use it,
and manually generated followups (e.g. typed in well after
the original article has been printed by the machine)
should be encouraged to include them as well.
<span class="h4"><a class="selflink" id="section-2.2.7" href="#section-2.2.7">2.2.7</a> Control </span>If an article contains a Control line, the
article is a control message. Control messages are used
for communication among USENET host machines, not to be
read by users. Control messages are distributed by the
same newsgroup mechanism as ordinary messages. The body
of the Control header line is the message to the host.
- 9 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>For upward compatibility, messages that match the
newsgroup pattern "all.all.ctl" should also be
interpreted as control messages. If no Control: header is
present on such messages, the subject is used as the
control message. However, messages on newsgroups matching
this pattern do not conform to this standard.
<span class="h4"><a class="selflink" id="section-2.2.8" href="#section-2.2.8">2.2.8</a> Distribution </span> This line is used to alter the
distribution scope of the message. It has the same format
as the Newsgroups line. User subscriptions are still
controlled by Newsgroups, but the message is sent to all
systems subscribing to the newsgroups on the Distribution
line instead of the Newsgroups line. Thus, a car for sale
in New Jersey might have headers including
Newsgroups: net.auto,net.wanted
Distribution: nj.all
so that it would only go to persons subscribing to
net.auto or net.wanted within New Jersey. The intent of
this header is to further restrict the distribution of a
newsgroup, not to increase it. A local newsgroup, such as
nj.crazy-eddie, will probably not be propagated by sites
outside New Jersey that do not show such a newsgroup as
valid. Wildcards in newsgroup names in the Distribution
line are allowed. Followup articles should default to the
same Distribution line as the original article, but the
user can change it to a more limited one, or escalate the
distribution if it was originally restricted and a more
widely distributed reply is appropriate.
<span class="h4"><a class="selflink" id="section-2.2.9" href="#section-2.2.9">2.2.9</a> Organization </span>The text of this line is a short
phrase describing the organization to which the sender
belongs, or to which the machine belongs. The intent of
this line is to help identify the person posting the
message, since site names are often cryptic enough to make
it hard to recognize the organization by the electronic
address.
<span class="h2"><a class="selflink" id="section-3" href="#section-3">3</a>. Control Messages</span>
This section lists the control messages currently defined.
The body of the Control header is the control message.
Messages are a sequence of zero or more words, separated
by white space (blanks or tabs). The first word is the
name of the control message; the remaining words are
parameters to the message. The remainder of the header
and the body of the message are also potential parameters;
for example, the From line might suggest an address to
which a response is to be mailed.
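Splitting the body of a Control header into a message name
and its parameters is straightforward; a minimal Python
sketch follows. The name parse_control is hypothetical, and
returning None for an empty message is a choice made for the
sketch, not a requirement of this standard.

```python
def parse_control(value: str):
    """Return (name, parameters) for the body of a Control header."""
    words = value.split()  # splits on runs of white space
    if not words:
        return None, []
    return words[0], words[1:]
```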
- 10 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>Implementors and administrators may choose to allow
control messages to be automatically carried out, or to
queue them for manual processing. However, manually
processed messages should be dealt with promptly.
<span class="h3"><a class="selflink" id="section-3.1" href="#section-3.1">3.1</a> Cancel</span>
cancel <message ID>
If an article with the given message ID is present on the
local system, the article is cancelled. This mechanism
allows a user to cancel an article after the article has
been distributed over the network.
Only the author of the article or the local super user is
allowed to use this message. The verified sender of a
message is the Sender line, or if no Sender line is
present, the From line. The verified sender of the cancel
message must be the same as either the Sender or From
field of the original message. A verified sender in the
cancel message is allowed to match an unverified From in
the original message.
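The authorization rule can be sketched in Python as follows.
The names are hypothetical. The sketch assumes that header
values have already been reduced to bare electronic
addresses, that domain names compare without regard to case,
and (conservatively) that user names are case sensitive.

```python
def normalize(address: str):
    """Lower-case the domain; leave the user name untouched."""
    local, _, domain = address.partition("@")
    return local, domain.lower()

def may_cancel(cancel_headers, original_headers,
               is_local_superuser: bool = False) -> bool:
    """Decide whether a cancel message should be honored."""
    if is_local_superuser:
        return True
    # Verified sender: the Sender line, or From if there is no Sender.
    verified = normalize(cancel_headers.get("Sender")
                         or cancel_headers["From"])
    # It must match either Sender or From of the original article.
    candidates = [original_headers[key] for key in ("Sender", "From")
                  if key in original_headers]
    return any(normalize(c) == verified for c in candidates)
```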
<span class="h3"><a class="selflink" id="section-3.2" href="#section-3.2">3.2</a> Ihave/Sendme</span>
ihave <message ID list> <remotesys>
sendme <message ID list> <remotesys>
This message is part of the "ihave/sendme" protocol,
which allows one site (say "A") to tell another site
("B") that a particular message has been received on A.
Suppose that site A receives article "ucbvax.1234", and
wishes to transmit the article to site B. A sends the
control message "ihave ucbvax.1234 A" to site B (by
posting it to newsgroup "to.B"). B responds with the
control message "sendme ucbvax.1234 B" (on newsgroup
to.A) if it has not already received the article. Upon
receiving the Sendme message, A sends the article to B.
This protocol can be used to cut down on redundant traffic
between sites. It is optional and should be used only if
the particular situation makes it worthwhile. Frequently,
the outcome is that, since most original messages are
short, and since there is a high overhead to start sending
a new message with UUCP, it costs as much to send the
Ihave as it would cost to send the article itself.
One possible solution to this overhead problem is to batch
requests. Several message ID's may be announced or
requested in one message. If no message ID's are listed
in the control message, the body of the message should be
scanned for message ID's, one per line.
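The receiving side of this exchange, including batched
requests, might be sketched in Python as below. The function
names and the "have" set of already-received message IDs are
hypothetical. The sketch assumes the control message words
have already been split, with the remote system name last,
and it will raise ValueError if no words are given at all.

```python
def split_ihave(args, body: str = ""):
    """Return (message_ids, remote_system) for ihave or sendme."""
    *ids, remote = args
    if not ids:
        # No IDs in the control line: scan the body, one per line.
        ids = [line.strip() for line in body.splitlines() if line.strip()]
    return ids, remote

def sendme_reply(args, body: str, have, own_name: str):
    """Answer an ihave with (control_text, newsgroup), or None.

    Only message IDs not yet received locally are requested.
    """
    ids, remote = split_ihave(args, body)
    wanted = [i for i in ids if i not in have]
    if not wanted:
        return None
    return "sendme " + " ".join(wanted + [own_name]), "to." + remote
```

With the example above, site B receiving "ihave ucbvax.1234 A"
and holding no copy of the article would produce the control
message "sendme ucbvax.1234 B" for newsgroup "to.A".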
- 11 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h3"><a class="selflink" id="section-3.3" href="#section-3.3">3.3</a> Newgroup</span>
newgroup <groupname>
This control message creates a new newsgroup with the name
given. Since no articles may be posted or forwarded until
a newsgroup is created, this message is required before a
newsgroup can be used. The body of the message is
expected to be a short paragraph describing the intended
use of the newsgroup.
<span class="h3"><a class="selflink" id="section-3.4" href="#section-3.4">3.4</a> Rmgroup</span>
rmgroup <groupname>
This message removes a newsgroup with the given name.
Since the newsgroup is removed from every site on the
network, this command should be used carefully by a
responsible administrator.
<span class="h3"><a class="selflink" id="section-3.5" href="#section-3.5">3.5</a> Sendsys</span>
sendsys (no arguments)
The "sys" file, listing all neighbors and which
newsgroups are sent to each neighbor, will be mailed to
the author of the control message (Reply-to, if present,
otherwise From). This information is considered public
information, and it is a requirement of membership in
USENET that this information be provided on request,
either automatically in response to this control message,
or manually, by mailing the requested information to the
author of the message. This information is used to keep
the map of USENET up to date, and to determine where
netnews is sent.
The format of the file mailed back to the author should be
the same as that of the "sys" file. This format has one
line per neighboring site (plus one line for the local
site), containing four colon-separated fields. The first
field has the site name of the neighbor; the second field
has a newsgroup pattern describing the newsgroups sent to
the neighbor. The third and fourth fields are not defined
by this standard. A sample response:
From cbosgd!mark Sun Mar 27 20:39:37 1983
Subject: response to your sendsys request
To: mark@cbosgd.UUCP
- 12 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'> Responding-System: cbosgd.UUCP
cbosgd:osg,cb,btl,bell,net,fa,to,test
ucbvax:net,fa,to.ucbvax:L:
cbosg:net,fa,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews/cbosg
cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
sescent:net,fa,bell,btl,cb,to.sescent:F:/usr/spool/outnews/sescent
npois:net,fa,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
mhuxi:net,fa,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
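
A program consuming such a response might split each line
as in the following Python sketch. The function name is
hypothetical, and padding short lines with empty fields is an
assumption made because the first line of the sample carries
only two fields.

```python
def parse_sys_line(line):
    """Return (site, patterns, third_field, fourth_field) for one line."""
    # Split on at most three colons so the fourth field is kept whole.
    fields = line.rstrip("\n").split(":", 3)
    # Pad short lines; the third and fourth fields are undefined here.
    fields += [""] * (4 - len(fields))
    site, patterns, third, fourth = fields
    return site, patterns.split(","), third, fourth
```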
<span class="h3"><a class="selflink" id="section-3.6" href="#section-3.6">3.6</a> Senduuname</span>
senduuname (no arguments)
The "uuname" program is run, and the output is mailed to
the author of the control message (Reply-to, if present,
otherwise From). This program lists all uucp neighbors of
the local site. This information is used to make maps of
the UUCP network. The sys file is not the same as the
UUCP L.sys file. The L.sys file should never be
transmitted to another party without the consent of the
sites whose passwords are listed therein.
It is optional for a site to provide this information.
Some reply should be made to the author of the control
message, so that a transmission error won't be blamed. It
is also permissible for a site to run the uuname program
(or in some other way determine the uucp neighbors) and
edit the output, either automatically or manually, before
mailing the reply back to the author. The file should
contain one site per line, beginning with the uucp site
name. Additional information may be included, separated
from the site name by a blank or tab. The phone number or
password for the site should NOT be included, as the reply
is considered to be in the public domain. (The uuname
program will send only the site name and not the entire
contents of the L.sys file; thus phone numbers and
passwords are not transmitted.)
The purpose of this message is to generate and maintain
UUCP mail routing maps. Thus, connections over which mail
can be sent using the site!user syntax should be included,
regardless of whether the link is actually a UUCP link at
the physical level. If a mail router should use it, it
should be included. Since all information sent in
response to this message is optional, sites are free to
edit the list, deleting secret or private links they do
not wish to publicise.
<span class="h3"><a class="selflink" id="section-3.7" href="#section-3.7">3.7</a> Version</span>
version (no arguments)
The name and version of the software running on the local
system is to be mailed back to the author of the article
(Reply-to if present, otherwise From).
- 13 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h2"><a class="selflink" id="section-4" href="#section-4">4</a>. Transmission Methods</span>
USENET is not a physical network, but rather a logical
network resting on top of several existing physical
networks. These networks include, but are not limited to,
UUCP, the ARPANET, an Ethernet, the BLICN network, an NSC
Hyperchannel, and a Berknet. What is important is that
two neighboring systems on USENET have some method to get
a new article, in the format listed here, from one system
to the other, and once on the receiving system, processed
by the netnews software on that system. (On UNIX systems,
this usually means the "rnews" program being run with
the article on the standard input.)
It is not a requirement that USENET sites have mail
systems capable of understanding the ARPA Internet mail
syntax, but it is strongly recommended. Since From,
Reply-To, and Sender lines use the Internet syntax,
replies will be difficult or impossible without an
internet mailer. A site without an internet mailer can
attempt to use the Path header line for replies, but this
field is not guaranteed to be a working path for replies.
In any event, any site generating or forwarding news
messages must have an internet address that allows it to
receive mail from sites with internet mailers, and it
must include its internet address on its From line.
<span class="h3"><a class="selflink" id="section-4.1" href="#section-4.1">4.1</a> Remote Execution</span>
Some networks permit direct remote command execution. On
these networks, news may be forwarded by spooling the
rnews command with the article on the standard input. For
example, if the remote system is called "remote", news
would be sent over a UUCP link with the command "uux -
remote!rnews", and on a Berknet, "net -mremote rnews".
It is important that the article be sent via a reliable
mechanism, normally involving the possibility of spooling,
rather than direct real-time remote execution. This is
because, if the remote system is down, a direct execution
command will fail, and the article will never be
delivered. If the article is spooled, it will eventually
be delivered when both systems are up.
<span class="h3"><a class="selflink" id="section-4.2" href="#section-4.2">4.2</a> Transfer by Mail</span>
On some systems, direct remote spooled execution is not
possible. However, most systems support electronic mail,
and a news article can be sent as mail. One approach is
to send a mail message which is identical to the news
message: the mail headers are the news headers, and the
mail body is the news body. By convention, this mail is
sent to the user "newsmail" on the remote machine.
- 14 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>One problem with this method is that it may not be
possible to convince the mail system that the From line of
the message is valid, since the mail message was generated
by a program on a system different from the source of the
news article. Another problem is that error messages
caused by the mail transmission would be sent to the
originator of the news article, who has no control over
news transmission between two cooperating hosts and does
not know who to contact. Transmission error messages
should be directed to a responsible contact person on the
sending machine.
A solution to this problem is to encapsulate the news
article into a mail message, such that the entire article
(headers and body) is part of the body of the mail
message. The convention here is that such mail is sent to
user "rnews" on the remote system. A mail message body
is generated by prepending the letter "N" to each line
of the news article, and then attaching whatever mail
headers are convenient to generate. The N's are attached
to prevent any special lines in the news article from
interfering with mail transmission, and to prevent any
extra lines inserted by the mailer (headers, blank lines,
etc.) from becoming part of the news article. A program
on the receiving machine receives mail to "rnews",
extracting the article itself and invoking the "rnews"
program. An example in this format might look like this:
Date: Monday, 3-Jan-83 08:33:47 MST
From: news@cbosgd.UUCP
Subject: network news article
To: rnews@npois.UUCP
NRelay-Version: B 2.10 2/13/83 cbosgd.UUCP
NPosting-Version: B 2.9 6/21/82 sask.UUCP
NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
NFrom: derek@sask.UUCP (Derek Andrew)
NNewsgroups: net.test
NSubject: necessary test
NMessage-ID: <176@sask.UUCP>
NDate: Monday, 3-Jan-83 00:59:15 MST
N
NThis really is a test. If anyone out there more than 6
Nhops away would kindly confirm this note I would
Nappreciate it. We suspect that our news postings
Nare not getting out into the world.
N
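
The encapsulation and extraction steps can be sketched in
Python as follows. The function names are assumptions of
this example. Note that the extraction shown here would be
fooled by a mailer-inserted line that happens to begin with
"N"; a real implementation may need a stricter rule.

```python
def encapsulate(article):
    """Prefix every line of a news article with 'N' to form a mail body."""
    return "".join("N" + line for line in article.splitlines(keepends=True))


def extract(mail_body):
    """Recover the article: keep only 'N' lines and drop the prefix."""
    return "".join(
        line[1:]
        for line in mail_body.splitlines(keepends=True)
        if line.startswith("N")
    )
```

Lines added by the mailer, such as extra headers or blank
lines, do not begin with "N" and are therefore dropped on
extraction.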
Using mail solves the spooling problem, since mail must
always be spooled if the destination host is down.
However, it adds more overhead to the transmission process
(to encapsulate and extract the article) and makes it
harder for software to give different priorities to news
and mail.
- 15 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h3"><a class="selflink" id="section-4.3" href="#section-4.3">4.3</a> Batching</span>
Since news articles are usually short, and since a large
number of messages are often sent between two sites in a
day, it may make sense to batch news articles. Several
articles can be combined into one large article, using
conventions agreed upon in advance by the two sites. One
such batching scheme is described here; its use is still
considered experimental.
News articles are combined into a script, separated by a
header of the form:
#! rnews 1234
where 1234 is the length, in bytes, of the article. Each
such line is followed by an article containing the given
number of bytes. (The newline at the end of each line of
the article is counted as one byte, for purposes of this
count, even if it is stored as CRLF.) For example, a batch
of articles might look like this:
#! rnews 374
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
Path: cbosgd!mhuxj!mhuxt!eagle!jerry
From: jerry@eagle.uucp (Jerry Schwarz)
Newsgroups: net.general
Subject: Usenet Etiquette -- Please Read
Message-ID: <642@eagle.UUCP>
Date: Friday, 19-Nov-82 16:14:55 EST
Here is an important message about USENET Etiquette.
#! rnews 378
Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
Path: cbosgd!mhuxj!mhuxt!eagle!jerry
From: jerry@eagle.uucp (Jerry Schwarz)
Newsgroups: net.followup
Subject: Notes on Etiquette article
Message-ID: <643@eagle.UUCP>
Date: Friday, 19-Nov-82 17:24:12 EST
There was something I forgot to mention in the last message.
Batched news is recognized because the first character in
the message is "#". The message is then passed to the
unbatcher for interpretation.
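
An unbatcher for this scheme might look like the following
Python sketch. It is an illustration only: the function names
are hypothetical, the batch is assumed to be held in memory
as bytes with newline line endings, and error handling is
kept to a minimum.

```python
def unbatch(batch):
    """Split a batch (bytes) into the list of articles it contains."""
    articles = []
    pos = 0
    while pos < len(batch):
        # Each article is preceded by a line "#! rnews <length>".
        end_of_header = batch.index(b"\n", pos)
        words = batch[pos:end_of_header].split()
        if len(words) != 3 or words[:2] != [b"#!", b"rnews"]:
            raise ValueError("expected '#! rnews <length>' separator")
        length = int(words[2])
        start = end_of_header + 1
        if start + length > len(batch):
            raise ValueError("batch ends before the announced length")
        articles.append(batch[start:start + length])
        pos = start + length
    return articles


def make_batch(articles):
    """Inverse of unbatch: join articles with their length headers."""
    return b"".join(b"#! rnews %d\n" % len(a) + a for a in articles)
```

Because each article is delimited by its byte count rather
than by a marker, a line inside an article that begins with
"#!" is not mistaken for a separator.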
- 16 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'><span class="h2"><a class="selflink" id="section-5" href="#section-5">5</a>. The News Propagation Algorithm</span>
This section describes the overall scheme of USENET and
the algorithm followed by sites in propagating news to the
entire network. Since all sites are affected by
incorrectly formatted articles and by propagation errors,
it is important for the method to be standardized.
USENET is a directed graph. Each node in the graph is a
host computer; each arc in the graph is a transmission
path from one host to another host. Each arc is labelled
with a newsgroup pattern, specifying which newsgroup
classes are forwarded along that link. Most arcs are
bidirectional, that is, if site A sends a class of
newsgroups to site B, then site B usually sends the same
class of newsgroups to site A. This bidirectionality is
not, however, required.
USENET is made up of many subnetworks. Each subnet has a
name, such as "net" or "btl". The special subnet
"net" is defined to be USENET, although the union of all
subnets may be a superset of USENET (because of sites that
get local newsgroup classes but do not get net.all). Each
subnet is a connected graph, that is, a path exists from
every node to every other node in the subnet. In
addition, the entire graph is (theoretically) connected.
(In practice, some political considerations have caused
some sites to be unable to post articles reaching the rest
of the network.)
An article is posted on one machine to a list of
newsgroups. That machine accepts it locally, then
forwards it to all its neighbors that are interested in at
least one of the newsgroups of the message. (Site A deems
site B to be "interested" in a newsgroup if the
newsgroup matches the pattern on the arc from A to B.
This pattern is stored in a file on the A machine.) The
sites receiving the incoming article examine it to make
sure they really want the article, accept it locally, and
then in turn forward the article to all their interested
neighbors. This process continues until the entire
network has seen the article.
An important part of the algorithm is the prevention of
loops. Without a safeguard, the above process would cause
a message to loop along a cycle forever. In particular,
when site A sends
an article to site B, site B will send it back to site A,
which will send it to site B, and so on. One solution to
this is the history mechanism. Each site keeps track of
all articles it has seen (by their message ID) and
whenever an article comes in that it has already seen, the
incoming article is discarded immediately. This solution
is sufficient to prevent loops, but additional
optimizations can be made to avoid sending articles to
sites that will simply throw them away.
- 17 -</pre>
<hr class='noprint'/><!--NewPage--><pre class='newpage'>One optimization is that an article should never be sent
to a machine listed in the Path line of the header. When
a machine name is in the Path line, the message is known
to have passed through the machine. Another optimization
is that, if the article originated on site A, then site A
has already seen the article. (Origination can be
determined by the Posting-Version line.)
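
The history mechanism and the Path optimization can be
sketched together in Python. Everything named here is an
assumption of the example: the network is a dictionary
mapping each site to its neighbors, each site's history is a
set of message IDs, and newsgroup patterns and the
Posting-Version check are left out for brevity.

```python
def receive(site, article, network, histories):
    """Accept `article` at `site`, then forward it to eligible neighbors."""
    message_id = article["Message-ID"]
    if message_id in histories[site]:
        return  # history mechanism: already seen, discard immediately
    histories[site].add(message_id)

    # Prepend this site to the Path, e.g. "B!A!user".
    forwarded = dict(article, Path=site + "!" + article["Path"])
    # Every Path component except the last (the user name) is a site.
    visited = forwarded["Path"].split("!")[:-1]

    for neighbor in network[site]:
        if neighbor not in visited:  # Path optimization
            receive(neighbor, forwarded, network, histories)
```

The history check alone is what guarantees termination; the
Path test only avoids transmissions that the receiving site
would discard anyway.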
Thus, if an article is posted to newsgroup "net.misc",
it will match the pattern "net.all" (where "all" is a
metasymbol that matches any string), and will be forwarded
to all sites that subscribe to net.all (as determined by
what their neighbors send them). These sites make up the
"net" subnetwork. An article posted to "btl.general"
will reach all sites receiving "btl.all", but will not
reach sites that do not get "btl.all". In effect, the
article reaches the "btl" subnetwork. An article
posted to newsgroups "net.micro,btl.general" will reach
all sites subscribing to either of the two classes.
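
The matching rule can be sketched in Python as follows. The
function names are hypothetical, and the sketch assumes that
"all" acts as a wildcard only when it forms a whole
dot-separated component of the pattern. Bare class names such
as "net", as seen in sample sys files, are not given any
special meaning here because this section does not define
one.

```python
import re


def pattern_to_regex(pattern):
    """Translate a pattern such as "net.all" into a compiled regex."""
    parts = [
        ".*" if part == "all" else re.escape(part)
        for part in pattern.split(".")
    ]
    return re.compile(r"\.".join(parts))


def is_interested(patterns, newsgroups):
    """True if any newsgroup in the list matches any pattern in the list.

    Both arguments are comma-separated strings, as in the Newsgroups
    header line.
    """
    regexes = [pattern_to_regex(p) for p in patterns.split(",")]
    return any(
        rx.fullmatch(group)
        for group in newsgroups.split(",")
        for rx in regexes
    )
```

For instance, is_interested("btl.all", "net.micro,btl.general")
is true because one of the two newsgroups matches, which is
why the cross-posted article above reaches both subnetworks.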
- 18 -
</pre>