# url-verifier

🔗 A Go library for URL validation and verification: does this URL actually
work?

[![Build Status](https://github.com/davidmytton/url-verifier/actions/workflows/go.yml/badge.svg)](https://github.com/davidmytton/url-verifier/actions)
[![codecov](https://codecov.io/gh/davidmytton/url-verifier/branch/main/graph/badge.svg?token=HXSXEHU79J)](https://codecov.io/gh/davidmytton/url-verifier)

## Features

- **URL Validation:** validates whether a string is a valid URL.
- **Different Validation Types:** checks the URL against a "human" definition
  of a correct URL, strict compliance with
  [RFC3986](https://www.rfc-editor.org/rfc/rfc3986) (Uniform Resource Identifier
  (URI): Generic Syntax), and/or compliance with RFC3986 with the addition of a
  scheme, e.g. HTTPS.
- **Reachability:** verifies whether the URL is actually reachable via an HTTP
  GET request and provides the status code returned.

## Rationale

There are several ways to validate URLs in Go, depending on what you're trying
to achieve. Strict, technical validation needs only a call to
[`url.Parse`](https://pkg.go.dev/net/url#Parse) in Go's standard library, while
a more "human" definition of a valid URL is available through
[govalidator](https://github.com/asaskevich/govalidator) (which is what this
library uses internally for syntax verification).

However, these approaches accept all kinds of URLs, from relative paths through
to hostnames without a scheme. Often, when building user-facing applications,
what we actually want is a way to check whether the URL provided will actually
work, i.e. it's valid, it resolves, and it can be loaded in a web browser.
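
To illustrate the gap described above, here is a minimal sketch using only the standard library. The helper names `parses` and `looksLoadable` are hypothetical and are not part of this library; `looksLoadable` is just one possible stricter check (an `http` or `https` scheme plus a non-empty host), not the check this library performs.

```go
package main

import (
	"fmt"
	"net/url"
)

// parses reports whether url.Parse accepts the input without an error.
func parses(raw string) bool {
	_, err := url.Parse(raw)
	return err == nil
}

// looksLoadable is a hypothetical stricter check: the input must parse,
// use the http or https scheme, and carry a non-empty host.
func looksLoadable(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

func main() {
	inputs := []string{"https://example.com/", "example.com", "/relative/path"}
	for _, raw := range inputs {
		fmt.Printf("%q parses=%t loadable=%t\n", raw, parses(raw), looksLoadable(raw))
	}
}
```

All three inputs parse without error, but only the first has both a scheme and a host, which is why a successful `url.Parse` call alone says little about whether a URL can be opened in a browser.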

## Install

Use `go get` to install this package.

```shell
go get -u github.com/davidmytton/url-verifier
```

## Usage

### Basic usage

Use `Verify` to check whether a URL is correct:

```go
package main

import (
 "fmt"

 urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
 url := "https://example.com/"

 verifier := urlverifier.NewVerifier()
 ret, err := verifier.Verify(url)

 if err != nil {
  // fmt.Errorf only builds an error value; print the message instead.
  fmt.Printf("Error: %s\n", err)
  return
 }

 fmt.Printf("Result: %+v\n", ret)
 /*
   Result: &{
    URL:https://example.com/
    URLComponents:https://example.com/
    IsURL:true
    IsRFC3986URL:true
    IsRFC3986URI:true
    HTTP:<nil>
   }
 */
}

```

### URL reachability check

Call `EnableHTTPCheck()` to issue a `GET` request to the HTTP or HTTPS URL and
check whether it is reachable and returns a successful response: a success
(2xx) or success-like (3xx) status code. Non-HTTP(S) URLs will return an error.

```go
package main

import (
 "fmt"

 urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
 url := "https://example.com/"

 verifier := urlverifier.NewVerifier()
 verifier.EnableHTTPCheck()
 ret, err := verifier.Verify(url)

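 if err != nil {
  // fmt.Errorf only builds an error value; print the message and stop,
  // so ret.HTTP is not dereferenced after a failed check.
  fmt.Printf("Error: %s\n", err)
  return
 }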

 fmt.Printf("Result: %+v\n", ret)
 fmt.Printf("HTTP: %+v\n", ret.HTTP)

 if ret.HTTP.IsSuccess {
  fmt.Println("The URL is reachable with status code", ret.HTTP.StatusCode)
 }
 /*
   Result: &{
    URL:https://example.com/
    URLComponents:https://example.com/
    IsURL:true
    IsRFC3986URL:true
    IsRFC3986URI:true
    HTTP:0x140000b6a50
   }
   HTTP: &{
    Reachable:true
    StatusCode:200
    IsSuccess:true
   }
   The URL is reachable with status code 200
 */
}
```
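
As a rough illustration of the success rule described in this section, the sketch below classifies a status code as success (2xx) or success-like (3xx) using only the standard library. The names `isSuccessLike`, `httpResult` and `checkHTTP` are hypothetical; the struct fields merely mirror the output shown above, and this is not the library's actual implementation. Note that Go's default `http.Client` follows redirects, so a 3xx code is usually only observed when redirects are not followed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// isSuccessLike reports whether a status code is in the 2xx or 3xx range.
func isSuccessLike(status int) bool {
	return status >= 200 && status < 400
}

// httpResult is a hypothetical result type mirroring the fields printed above.
type httpResult struct {
	Reachable  bool
	StatusCode int
	IsSuccess  bool
}

// checkHTTP issues a GET request and summarises the response.
// A transport error yields the zero value (not reachable).
func checkHTTP(client *http.Client, rawURL string) httpResult {
	resp, err := client.Get(rawURL)
	if err != nil {
		return httpResult{}
	}
	defer resp.Body.Close()
	return httpResult{
		Reachable:  true,
		StatusCode: resp.StatusCode,
		IsSuccess:  isSuccessLike(resp.StatusCode),
	}
}

func main() {
	// A local test server stands in for a real site.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/missing" {
			http.NotFound(w, r)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	fmt.Printf("%+v\n", checkHTTP(client, srv.URL+"/"))
	fmt.Printf("%+v\n", checkHTTP(client, srv.URL+"/missing"))
}
```

In this sketch a URL can be reachable without being a success: the `/missing` path gets a response, but its 404 status falls outside the 2xx and 3xx ranges.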

## HTTP checks against internal URLs

By default, the reachability checks are only executed if the host resolves to a
non-internal IP address. An internal IP address is defined as any of:
[private](https://pkg.go.dev/net#IP.IsPrivate),
[loopback](https://pkg.go.dev/net#IP.IsLoopback), [link-local
unicast](https://pkg.go.dev/net#IP.IsLinkLocalUnicast), [link-local
multicast](https://pkg.go.dev/net#IP.IsLinkLocalMulticast), [interface-local
multicast](https://pkg.go.dev/net#IP.IsInterfaceLocalMulticast), or
[unspecified](https://pkg.go.dev/net#IP.IsUnspecified).

This is one layer of protection against [Server Side Request
Forgery](https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html#application-layer_1)
(SSRF).

To allow internal HTTP checks, call `verifier.AllowHTTPCheckInternal()`:

```go
urlToCheck := "http://localhost:3000"

verifier := urlverifier.NewVerifier()
verifier.EnableHTTPCheck()
// Danger: Makes SSRF easier!
verifier.AllowHTTPCheckInternal()
ret, err := verifier.Verify(urlToCheck)
// ...
```

## Credits

This library is heavily inspired by
[`email-verifier`](https://github.com/AfterShip/email-verifier).

## License

This package is licensed under the MIT License.