File: autolinks.md

package info (click to toggle)
haskell-commonmark-extensions 0.2.5.5-2
  • links: PTS, VCS
  • area: main
  • in suites: forky, sid, trixie
  • size: 368 kB
  • sloc: haskell: 2,574; makefile: 9
file content (210 lines) | stat: -rw-r--r-- 8,089 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
## Autolinks (extension)

GFM enables the `autolink` extension, where autolinks will be recognised in a
greater number of conditions.

[Autolink]s can also be constructed without requiring the use of `<` and to `>`
to delimit them, although they will be recognized under a smaller set of
circumstances.  All such recognized autolinks can only come at the beginning of
a line, after whitespace, or any of the delimiting characters `*`, `_`, `~`,
and `(`.

An [extended www autolink](@) will be recognized when the text `www.` is found
followed by a [valid domain]. A [valid domain](@) consists of alphanumeric
characters, underscores (`_`), hyphens (`-`) and periods (`.`).  There must be
at least one period, and no underscores may be present in the last two segments
of the domain.

The scheme `http` will be inserted automatically:

```````````````````````````````` example
www.commonmark.org
.
<p><a href="http://www.commonmark.org">www.commonmark.org</a></p>
````````````````````````````````

After a [valid domain], zero or more non-space non-`<` characters may follow:

```````````````````````````````` example
Visit www.commonmark.org/help for more information.
.
<p>Visit <a href="http://www.commonmark.org/help">www.commonmark.org/help</a> for more information.</p>
````````````````````````````````

We then apply [extended autolink path validation](@) as follows:

Trailing punctuation (specifically, `?`, `!`, `.`, `,`, `:`, `*`, `_`, and `~`)
will not be considered part of the autolink, though they may be included in the
interior of the link:

```````````````````````````````` example
Visit www.commonmark.org.

Visit www.commonmark.org/a.b.

Visit www.commonmark.org/~jm/foo/bar.pdf.
.
<p>Visit <a href="http://www.commonmark.org">www.commonmark.org</a>.</p>
<p>Visit <a href="http://www.commonmark.org/a.b">www.commonmark.org/a.b</a>.</p>
<p>Visit <a href="http://www.commonmark.org/~jm/foo/bar.pdf">www.commonmark.org/~jm/foo/bar.pdf</a>.</p>
````````````````````````````````

((commonmark-hs: We depart from the GFM spec here. Alternative spec and tests
can be found after these commented-out ones. For motivation for this departure from
the GFM spec, see #147.))

> When an autolink ends in `)`, we scan the entire autolink for the total number
> of parentheses.  If there is a greater number of closing parentheses than
> opening ones, we don't consider the last character part of the autolink, in
> order to facilitate including an autolink inside a parenthesis:
>
> ```````````````````````````````` example
> www.google.com/search?q=Markup+(business)
>
> (www.google.com/search?q=Markup+(business))
> .
> <p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
> <p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p>
> ````````````````````````````````
>
> This check is only done when the link ends in a closing parentheses `)`, so if
> the only parentheses are in the interior of the autolink, no special rules are
> applied:
>
> ```````````````````````````````` example
> www.google.com/search?q=(business))+ok
> .
> <p><a href="http://www.google.com/search?q=(business))+ok">www.google.com/search?q=(business))+ok</a></p>
> ````````````````````````````````

Autolinks can contain balanced pairs of parentheses, or unbalanced `)`.
We don't allow unbalanced `)`, in order to facilitate including
an autolink inside a parenthesis:

```````````````````````````````` example
www.google.com/search?q=Markup+(business)

(www.google.com/search?q=Markup+(business))
.
<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p>
````````````````````````````````

Issue #147:
```````````````````````````````` example
[link](https://baidu.com)aaa<span></span>bbb
.
<p><a href="https://baidu.com">link</a>aaa<span></span>bbb</p>
````````````````````````````````

((End of diverging section.))

If an autolink ends in a semicolon (`;`), we check to see if it appears to
resemble an [entity reference][entity references]; if the preceding text is `&`
followed by one or more alphanumeric characters.  If so, it is excluded from
the autolink:

```````````````````````````````` example
www.google.com/search?q=commonmark&hl=en

www.google.com/search?q=commonmark&hl;
.
<p><a href="http://www.google.com/search?q=commonmark&amp;hl=en">www.google.com/search?q=commonmark&amp;hl=en</a></p>
<p><a href="http://www.google.com/search?q=commonmark">www.google.com/search?q=commonmark</a>&amp;hl;</p>
````````````````````````````````

`<` immediately ends an autolink.

```````````````````````````````` example
www.commonmark.org/he<lp
.
<p><a href="http://www.commonmark.org/he">www.commonmark.org/he</a>&lt;lp</p>
````````````````````````````````

An [extended url autolink](@) will be recognised when one of the schemes
`http://`, `https://`, or `ftp://`, followed by a [valid domain], then zero or
more non-space non-`<` characters according to
[extended autolink path validation]:

```````````````````````````````` example
http://commonmark.org

(Visit https://encrypted.google.com/search?q=Markup+(business))

Anonymous FTP is available at ftp://foo.bar.baz.
.
<p><a href="http://commonmark.org">http://commonmark.org</a></p>
<p>(Visit <a href="https://encrypted.google.com/search?q=Markup+(business)">https://encrypted.google.com/search?q=Markup+(business)</a>)</p>
<p>Anonymous FTP is available at <a href="ftp://foo.bar.baz">ftp://foo.bar.baz</a>.</p>
````````````````````````````````


An [extended email autolink](@) will be recognised when an email address is
recognised within any text node.  Email addresses are recognised according to
the following rules:

* One ore more characters which are alphanumeric, or `.`, `-`, `_`, or `+`.
* An `@` symbol.
* One or more characters which are alphanumeric, or `.`, `-`, or `_`. At least
  one of the characters here must be a period (`.`).  The last character must
  not be one of `-` or `_`.  If the last character is a period (`.`), it will
  be excluded from the autolink.

The scheme `mailto:` will automatically be added to the generated link:

```````````````````````````````` example
foo@bar.baz
.
<p><a href="mailto:foo@bar.baz">foo@bar.baz</a></p>
````````````````````````````````

`+` can occur before the `@`, but not after.

```````````````````````````````` example
hello@mail+xyz.example isn't valid, but hello+xyz@mail.example is.
.
<p>hello@mail+xyz.example isn't valid, but <a href="mailto:hello+xyz@mail.example">hello+xyz@mail.example</a> is.</p>
````````````````````````````````

`.`, `-`, and `_` can occur on both sides of the `@`, but only `.` may occur at
the end of the email address, in which case it will not be considered part of
the address:

```````````````````````````````` example
a.b-c_d@a.b

a.b-c_d@a.b.

a.b-c_d@a.b-

a.b-c_d@a.b_
.
<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a></p>
<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a>.</p>
<p>a.b-c_d@a.b-</p>
<p>a.b-c_d@a.b_</p>
````````````````````````````````

The autolinks extension should not interfere with regular links
(#65).


```````````````````````````````` example
[a link](http://www.google.com/)stuff?
.
<p><a href="http://www.google.com/">a link</a>stuff?</p>
````````````````````````````````

Autolinks with punctuation (#151):

```````````````````````````````` example
https://en.wikipedia.org/wiki/St._Petersburg_paradox

https://en.wikipedia.org/wiki/Liaison_(French)

https://en.wikipedia.org/wiki/Frederick_III,_German_Emperor
.
<p><a href="https://en.wikipedia.org/wiki/St._Petersburg_paradox">https://en.wikipedia.org/wiki/St._Petersburg_paradox</a></p>
<p><a href="https://en.wikipedia.org/wiki/Liaison_(French)">https://en.wikipedia.org/wiki/Liaison_(French)</a></p>
<p><a href="https://en.wikipedia.org/wiki/Frederick_III,_German_Emperor">https://en.wikipedia.org/wiki/Frederick_III,_German_Emperor</a></p>
````````````````````````````````