<head><title>Dark Side of the HTML</title></head>
<body bgcolor="#101010" text="#d0d0d0" link="#ffc0c0" vlink="#ff8080">

<table border="0" width="100%">
<tr>
<td width="20%"> </td>
<td><p><font size="-1">
<p>&quot;...Decent surfing value...&quot;<br>
&quot;...A long rant...&quot;<br>
&quot;... If you ignore the spelling and grammatical errors, ... you will find this enlightening...&quot;<br>
<p>
</font>
</td>
</tr>
</table>

<br>
<h1><tt>D a r k</tt><br>
<tt>S i d e</tt><br>
<tt>O f</tt><br>
<tt>T h e</tt><br>
<tt><img src="html.gif" align = top alt = "HTML"></tt></h1>

<table border="0" width="100%">
<tr>
<td width="40%"> </td>
<td>
There are more things in heaven and earth, Horatio,<br>
Than are dreamt of in your philosophy.
</td>
</tr>
<tr><td width = "40%"> </td><td align=right>Shakespeare.</td></tr>
</table>
<p>
this is <b><i>overlapping bold italic text</b></i>

<br>
<h3><a name="intro">Historical Note.</a></h3>
<dl><dd>
<font size="-1">This page has been around for so damn long (at least by Internet Time Standards) without
<i>any</i> modifications that it ended up living in its own space-time continuum without any
visible correlation to ours.
I felt compelled to do something about it, and after spending countless sleepless 
nights thinking about how to avoid rewriting the whole thing, I came up with this
<a href="bkgrnote.html">historical background</a> note.
</font>
</dl>

<h4>Latest developments.</h4>
<dl><dd>
<font size="-1">
At the <a href="#darkest">end</a>. Mind-shattering news. <b>The</b> Darkest stuff.<br>
[<tt>updated 26-December-1996</tt>]
</font>
</dl>


<h2><a name="intro">Intro</a></h2>
<dl><dd>
It's common knowledge that all documents on the WWW should be written in 
so-called HTML, aka HyperText Markup Language.

<p>Much less is known about what to <i>count</i> as HTML.

<p>There's a rather vague relationship between HTML (Hypertext Markup Language) 
as (almost) standardized by the Internet Engineering Task Force and whatever is 
called HTML as implemented by WEB browsers.
To add to the confusion, there are <i>levels</i> and <i>revisions(?)</i> of HTML.
For example, there's HTML-2.0 Level-1, and there's HTML-3.0. HTML-2.0 is supposed to be 
some sort of standard that most browsers are trying to support.

<p>To make things really obfuscated, it should be noted that HTML is a 
<b><i>markup</i></b> language. Markup means that it will <i>mark</i>
the different elements of your document, but how this document will be seen by 
WEB wandering individuals
is left entirely to the miscellaneous WEB browsers running on a multitude of 
various operating systems.

<p>What's interesting, some of the browsers are rushing to support the 
yet-to-be-defined HTML-3.0 while ignoring some basic features of 
the (almost) standard HTML-2.0. Looks like HTML profanation is getting 
really profound.

<p>A sudden paroxysm of critical paranoia struck me and I wrote this page.
</dl>

<h2><a name="tags">Tags</a></h2>
<dl><dd>
<p>Tags are the base of HTML. Tags are what differentiates HTML from simple 
dull dumb boring plain vanilla ASCII text. Sometimes it may seem that HTML is just
a loose collection of various tags. Unfortunately, HTML is also a 
<i>language</i> -- in its own right. 
Language is what the last letter in HTML stands for.
But the story so far will be about tags.

<p>Very little attention is paid to what exactly a tag <i>is</i>. 
Common sense says that a tag is some word (called the tag identifier) 
surrounded by angle brackets. 
For example, &lt;H1&gt; declares the beginning of a heading of level ONE, and a law-abiding browser
should display the aforementioned heading in a rather big font. 
The left angle bracket, a.k.a. the less-than sign, is
called the start-tag open symbol, and the right angle bracket, 
a.k.a. the greater-than sign, is called the tag close symbol. 

<p>Most tags should be balanced -- when they belong
to an element with some content, like 
<pre>
  &lt;H1&gt;This is heading number one&lt;/H1&gt;
</pre>
-- where the opening tag is followed by the closing tag. 
Note that the closing-tag open symbol is a less-than sign followed by a slash, 
<tt><b>&lt;/</b></tt>.   
For some HTML elements the opening tag, the closing tag, or even both can be omitted. 
For example, the paragraph close tag &lt;/P&gt; can be omitted.
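
<p>A small sketch of what omitting the paragraph close tag looks like in practice 
(the sample text is made up for illustration; as I read the rules, each new &lt;P&gt; start tag 
implies the end of the paragraph before it):
<pre>
  &lt;P&gt;First paragraph, with no close tag.
  &lt;P&gt;Second paragraph; its start tag ends the first one.
  &lt;P&gt;Third paragraph, closed explicitly.&lt;/P&gt;
</pre>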

<h3>Diversion</h3>

<p>Being a lousy typist, I have severe trouble typing markup.
For example, to get angle brackets you have to keep pressing and releasing the SHIFT 
key while tapping on the less-or-greater-than keys. Also, typing proper 
closing tags is quite boring, especially when they are nested.

<p>So I was quite excited when I found that the HTML-2.0 language definition 
 allows <i>Tag minimization</i>: 
buried deep inside the dark mess of the HTML-2.0 SGML declaration (don't confuse it with the 
DTD - Document Type Definition) was <i>the</i> magick word 
SHORTTAG in the FEATURES section, and it was set to <b>YES</b>.

<p>I rushed to my keyboard and typed this:
<pre>
        &lt;H1/First minimized HTML tag ever typed by humanity/
</pre>
Nothing happened. 

<p>Netscape (which I, like millions of other people, am evaluating for 90
days to decide whether to purchase an ongoing license to the Software 
or not) just ignored this thing as if it weren't there. Mosaic for Windows 
didn't get much further either.

<p>Slightly puzzled whether my knowledge was wrong or the browsers were screwed, 
I went to the <a href="http://www.webtechs.com/html-val-svc/">
HTML Validation Service</a> (went in the cyber sense, you know; in fact
I just ran a search on Yahoo)
and found that minimized tags are perfectly legal even under the strong arm of 
the <b><i>Strict</i> HTML</b> law.

<p>Now I'm entertaining myself by hounding various web browsers with 
several test pages shown below.

<p>Here they are -- perfectly HTML-compliant and utterly useless, tragically
invisible and infernally hostile to any existing HTML rendering device ... 
<p><b>Minimization TAGS</b>

<p><ul>
<li><b>Empty tags</b> : tags whose identifier can be omitted and will be
 implied by the HTML reader (I cannot type browser since there's none 
 capable of doing so).<br>
<b>Empty start-tag</b> : consists of the start-tag open and tag close symbols 
(<tt>&lt;</tt> and <tt>&gt;</tt> respectively) without any space in between.<br>
If such a tag is encountered by the HTML reader, the program will give the empty 
tag the identifier of the most recently started element.
<pre>
    &lt;UL&gt;
       &lt;LI&gt;  this is the first item of the list
       &lt;&gt;  this is second one -- implied identifier is LI
    &lt;/&gt;
</pre>
which is rendered as:
<ul>
<li> this is the first item of the list
<> this is second one -- implied identifier is LI
</>

- note that this unordered list was ended by an <i>empty end-tag</i>.<br>
<b>Empty end-tag</b>: consists of the end-tag open and tag close symbols 
(i.e. <tt>&lt;/&gt;</tt>)<br>
The identifier given to such a tag by the HTML program is always that of the 
last element to be opened:
<pre>
     Some &lt;B&gt;bold text with empty end tag &lt;/&gt; -- right here.
</pre>
Now check out how your browser chews up a <a href="empty.html">page</a> with such tags.
Doesn't it look like <a href="empty1.html">this one</a>?

<p><li><b>Unclosed tags</b> : where two or more consecutive tags are required 
in a document, the closing delimiters of all tags except the very last one in the sequence
can be omitted:
<pre>
   This text is &lt;b&lt;i&gt; bold and italic at once &lt;/i&lt;/b&gt;.
</pre>
Take a look at the <a href="unclosed.html">page</a> filled with such tags. 
Obviously it should look like <a href="unclosed1.html">this</a>, eh?

<p><li><b>Null-end tags</b> : allow you to specify the end of an element with a single character, 
like this:
<pre>
       &lt;H1/Header with null-end tag/
</pre>
A null-end tag consists of the start-tag open symbol followed by the tag identifier and 
the textual data enclosed within two null-end tag symbols (slashes).<br>

Appreciate how your browser will screw up <a href="null_end.html">such a page</a>,
which is obviously gonna look like <a href="null_end1.html">this</a>. 
</ul>
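
<p>For comparison, here is a sketch of what the minimized samples above stand for once 
every tag is spelled out in full. This is only a reading of the minimization rules 
just described, not the output of any parser; the list is shown closed with an 
explicit &lt;/UL&gt;, and the bold-italic end tags are written in properly nested order:
<pre>
    &lt;UL&gt;
       &lt;LI&gt;  this is the first item of the list
       &lt;LI&gt;  this is second one -- implied identifier is LI
    &lt;/UL&gt;

    Some &lt;B&gt;bold text with empty end tag &lt;/B&gt; -- right here.

    This text is &lt;b&gt;&lt;i&gt; bold and italic at once &lt;/i&gt;&lt;/b&gt;.

    &lt;H1&gt;Header with null-end tag&lt;/H1&gt;
</pre>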
 
<p>So far I've stress-tested the following browsers :
<ul>
 <li> Netscape for Windows NT, version 1.2 and 2.0b1.
 <li> Netscape for X-Windows, version 1.1N
 <li> Mosaic for Windows, version 2.0
 <li> Arena, version 0.98
</ul>
<p>Needless to say, none of them was capable of handling minimized tags.

<p>I have been told that Harmony Hyper-G Text Viewer can cope with MINIMIZED tags, 
but since it cannot work from behind the firewall, I was unable to verify its
capabilities.

<p>Experience with the Arena browser was the most inspiring. This browser is supposed
to be a testbed for the upcoming HTML-3.0 standard and has a little indicator telling
you whether an HTML document is bad (i.e. incorrect) or not. For <i>all</i> the
sample pages shown above that use minimized tags, it undoubtedly flashed the "Bad HTML"
sign. But these pages were Strict HTML-3.0 checked! What HTML are we
talking about after all?

<p>Oh, yes - if you think this all is a joke - go to the
<a href="http://www.webtechs.com/html-val-svc/"> HTML Validation Service</a>
and check it for yourself. 
</dl></dl>

<h2> Wait, there's more ...</h2>
<dl><dd>
<p>During HTML validation, I found some things that contradicted something
I'd heard just before. Nothing serious, just another little critical paranoia splash: 

<h3>Ubiquitous &lt;P&gt; tag</h3>

The paragraph element is one of the few elements whose <i>end-tag</i> 
can be omitted.<br>
Roaming around various WEB tutorials and various sorts of wisdom stores, 
I've seen mentions of the bad style of having a paragraph break after something
which implies a paragraph break by itself. Like having a &lt;P&gt; tag right
after &lt;/h1&gt;. The ultimate link led to the 
<a href="correction2.html">HTML spec. page</a> which showed
two examples (I've edited them for brevity's sake): 
<p><b>"Bad"</b>
<pre>        &lt;h1&gt;What not to do&lt;/h1&gt;
        &lt;p&gt;This is like bad or something...
</pre>  
<p><b>"Good"</b>
<pre>        &lt;h1&gt;What to do&lt;/h1&gt;
        This is like good &lt;p&gt;or something...&lt;p&gt;
</pre>  

<p>Without much hesitation, I fed both examples to the 
<a href="http://www.webtechs.com/html-val-svc/"> HTML Validation Service</a> 
and slammed them against strict HTML-2.0.<br>
As I expected, the results were exactly the reverse of the examples' names : the "Bad"
example passed the test and the "Good" example caused the wrath of the parser.

<p>After consulting the HTML-2.0 spec., I found that the Validation Service
was 100% right (it would be surprising if it weren't). In case someone doesn't know -
in HTML-2.0 the paragraph is a <i>non-empty</i> element with a mandatory start tag
and an optional end tag (<tt>&lt;P&gt; and &lt;/P&gt;</tt> respectively). Therefore
a paragraph element in HTML-2.0 can contain an arbitrary amount of text-level content -
text data, phrase markup, anchors, etc. In HTML-1.0 the paragraph was an <i>EMPTY</i> element, 
which actually represented not a paragraph, but rather a paragraph break, and 
had only a start tag, <tt>&lt;P&gt;</tt> -- similar to the line break 
<tt>&lt;BR&gt;</tt>. I failed to find the HTML-1 DTD, but I think things are 
pretty close to what I've described. 

<p>Note, that wedging paragraph tags <i>within</i> &lt;h1&gt;...&lt;/h1&gt; <b>is</b>
an error, so don't try to catch me on this. 

<p>Another note: non-strict HTML-2.0 is more relaxed, so both examples would be ok.
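
<p>A hedged sketch of how the "Good" example could be rearranged to satisfy the strict rules, 
assuming the only complaint is the text sitting directly after the heading 
without a paragraph start tag (this exact fragment is an illustration and has not been 
run through the validator):
<pre>        &lt;h1&gt;What to do&lt;/h1&gt;
        &lt;p&gt;This is like good
        &lt;p&gt;or something...
</pre>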

<p>After testing there was one question left - where did such an interesting 
HTML specification page
come from? My guess (and I think I'm right with probability about 0.99) :
these were remnants of the HTML-1.0 spec. safely decomposing in some of the dark
corners of the W3 consortium. I failed to find the head or TOC of this document.
Interestingly enough, the 
<a href="correction1.html">link</a> to the HTML-1.0 spec. on the 
<a href="http://www.w3.org/hypertext/WWW/MarkUp/MarkUp.html">W3 page</a> was 
hopelessly broken too.

<h3>Minimal HTML document</h3>

<p>Looking at different HTML tutorials, I found a surprising multitude of opinions on 
what should be considered a <i>minimal</i> HTML document. Id est, what is the minimal
number of tags you should put in to make your ASCII text look like a valid HTML document?
In other words, what is this thin metaphysical boundary beyond which plain ASCII
text becomes <b>HYPER</b>?
To cut off the fuzziness of the word "valid", I decided <i>valid</i> 
would mean <i>fully conforming to the
HTML-2.0 (strict) specification</i> (or DTD -- for those who behold). 
Since I wasn't sure myself what the minimal valid document is, 
I dug up the HTML-2.0 spec and stared at it for a moment.<p>

I found the following amazing (or maybe not) facts:
<ul>
<li> An HTML document is <b>content</b> surrounded by the tags &lt;HTML&gt;...&lt;/HTML&gt;
<li> <b>content</b> is <b>HEAD</b> followed by <b>BODY</b>.
<li> <b>HEAD</b> is a mandatory <b>TITLE</b> plus <i>optional</i> <b>ISINDEX</b> and <b>BASE</b>. 
<li> <b>BODY</b> is a collection of <b>headings</b>, <b>text</b>, etc. repeated 
<i>0 (zero)</i> or more times (note the emphasis on zero -- it means all actual
information in an HTML document is <i>optional</i>).
</ul>

<p>Summing up all of the above, the minimal document would look like :
<pre>
    &lt;HTML&gt;
      &lt;HEAD&gt;
         &lt;TITLE&gt;Minimal HTML Document&lt;/TITLE&gt;
      &lt;/HEAD&gt;      
      &lt;BODY&gt;
      &lt;/BODY&gt;
    &lt;/HTML&gt;
</pre>

<p>But...all the elements except <b>TITLE</b> happen to have optional 
start-tags and end-tags! So unless you are a typing maniac, the minimal
HTML document would be:
<pre>
    &lt;TITLE&gt;Minimal HTML Document&lt;/TITLE&gt;
</pre>
<p>If an HTML document is to convey any sort of information, the minimal 
<i>HTML-2.0 strict</i>-conforming document would be 
(note the <b>&lt;P&gt;</b> tag!) : 
<pre>
    &lt;TITLE&gt;Minimal HTML Document&lt;/TITLE&gt;
    &lt;P&gt;Some text without any spark of sense.
</pre>
<p>Oh, yes, if we want to treat our minimal HTML document as an <i>SGML</i> one, a
<b>document type declaration</b> should precede everything, like:
<pre>
    &lt;!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"&gt;
    &lt;TITLE&gt;Minimal HTML Document&lt;/TITLE&gt;
</pre>
<p>-- but now we're starting to play by the SGML rules, and no browser can stand where
SGML reigns... 

<h3>Non-breaking space</h3>

<p>This character can be used whenever you want to protect your precious spaces
from an all-spaces-in-one jamming browser.
<p>The non-breaking space's value is 160, its numeric reference is <tt><b>&amp;#160;</b></tt> and its entity name is 
<tt><b>&amp;nbsp;</b></tt>. The entity <tt>&amp;nbsp;</tt> is a part of HTML-2.0, 
by the way.
<p>Some browsers, like any version of X-Mosaic, ignore both
<tt>&amp;#160;</tt> and <tt>&amp;nbsp;</tt> or, like Arena, only <tt>&amp;#160;</tt>, 
while Mosaic for Windows and Netscape 
(for everything) can cope with both.
<p>It should be noted that a proportional font (the default in many browsers) usually has a 
pretty narrow space character, so it is advisable to switch to a fixed
font before using non-breaking spaces. 
<p>Here's an example:
<pre>
   &lt;dl&gt;&lt;dd&gt;
   &lt;tt&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/tt&gt;Look, 
   here's some paragraph with indent,&lt;br&gt; 
   whoa -- check it out.
   &lt;/dl&gt;
</pre>
<p> will look like
<p>   <dl><dd>
   <tt>&nbsp;&nbsp;&nbsp;</tt>Look, 
   here's some paragraph with indent,<br> whoa -- check it out.
   </dl>
<p>A useful side effect of the non-breaking space is that it is not 
considered by browsers to be a space at all,
so it can be used whenever you want to protect words from being broken apart.
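
<p>A minimal sketch of that side effect, assuming a browser that honors <tt>&amp;nbsp;</tt> 
(the sentence is made up for illustration): the version numbers below should stay glued to 
their browser names even if the line gets wrapped.
<pre>
   Netscape&amp;nbsp;2.0b1 and Arena&amp;nbsp;0.98 were among the victims.
</pre>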

</dl>


<h2>Moral of the story</h2>

<dl><dd>
<p>Now that I've had enough of the subject, it is just the right 
time to outline what I've tried to say and what would be the best approach
to coping with HTML:
<ul>
<li> Today's HTML is ruled by the Lynch Mob of the various WEB browsers. <br>
  Whatever any of these browsers is capable of grinding - is <i>the</i> HTML. 
  Obviously such HTML changes from browser to browser.
<li> "Standard" HTML (HTML-2.0) is a rather abstract thing : you may create 
files complying with the HTML standard, or spec., you may verify them using a
free public service (if it gives you any relief), but there's no
guarantee that your document can be viewed at all (a little exaggeration here).
<li> If your page can be viewed by a certain browser -- stick with it. Put in a
disclaimer like "This page is optimized for NetZillaSoft Naviplorer, v. 0.003", 
and consider all other people not using your browser to be losers. This is much 
better than the previous case : at least you can be sure your document will be
accepted by at least one type of browser.
<li> If you have some information that you think is really KEWL, use a 
minimal amount of tags. As long as the content is k00l, nobody will actually care 
about the absence of glitzy pics and fancy adornments. 
<li> Don't trust any HTML tutorials, manuals, collections of advice, etc., 
including this one.
<li> Don't try to feed this page to an HTML Validator, it won't pass anyway. 
Actually, who cares?

<li> Browse safely.
</ul>
</dl>

<h2>Credits.</h2>
<dl><dd>
<p><b>HAIL</b> to the folks at WebTechs (formerly HAL)
 for the pretty useful HTML Validation Service referred to 
throughout this manuscript. It saved me quite some time on running SP manually.<br>
<p><b>HAIL</b> to James Clark @ jclark.com, creator of the most profound SGML parser so far.
One of the previous versions of this parser is used in the HTML Validation Service.
</dl>

<h2><a name="darkest">The end.</a></h2>
<dl>
<dt><tt>26-December-1996</tt>
<dd><p>Beyond the dark side: loads of incredibly odd information about tables
in <a href="http://www.absurd.org/absurd/tablemaquia">TABLEMAQUIA</a>.
</dl>

<hr>

<h5> You can send your frustrated comments to 
<a href = "mailto:sur_html@sem.vip.best.com">me</a>. Take care then. </h5>

<h6>Disclaimer: This page is not information.</h6>

</body>