# url-verifier
🔗 A Go library for URL validation and verification: does this URL actually
work?
## Features
- **URL Validation:** validates whether a string is a valid URL.
- **Different Validation Types:** validates the URL against a "human"
definition of a correct URL, strict compliance with
[RFC3986](https://www.rfc-editor.org/rfc/rfc3986) (Uniform Resource Identifier
(URI): Generic Syntax), and/or compliance with RFC3986 with the addition of a
scheme, e.g. HTTPS.
- **Reachability:** verifies whether the URL is actually reachable via an HTTP
GET request and provides the status code returned.
## Rationale
There are several methods of validating URLs in Go, depending on what you're
trying to achieve. Strict, technical validation can be done with a simple call
to [`url.Parse`](https://pkg.go.dev/net/url#Parse) in Go's standard library. A
more "human" definition of a valid URL is available through
[govalidator](https://github.com/asaskevich/govalidator) (which is what this
library uses internally for syntax verification).
However, this kind of syntax-only validation accepts all types of URLs, from
relative paths through to hostnames without a scheme. Often, when building
user-facing applications, what we actually want is a way to check whether the
URL input provided will actually work, i.e. it's valid, it resolves, and it can
be loaded in a web browser.
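To illustrate the gap described above, here is a minimal sketch using only the
standard library. It shows `url.Parse` accepting a relative path and a bare
hostname without error. The helpers `parsesCleanly` and `hasSchemeAndHost` are
hypothetical names invented for this example and are not part of this library.

```go
package main

import (
	"fmt"
	"net/url"
)

// parsesCleanly reports whether url.Parse accepts the input without error.
func parsesCleanly(raw string) bool {
	_, err := url.Parse(raw)
	return err == nil
}

// hasSchemeAndHost is a hypothetical stricter check: the input must parse
// and must also carry both a scheme and a host.
func hasSchemeAndHost(raw string) bool {
	u, err := url.Parse(raw)
	return err == nil && u.Scheme != "" && u.Host != ""
}

func main() {
	inputs := []string{"https://example.com/", "/relative/path", "example.com"}
	for _, raw := range inputs {
		fmt.Printf("%-22q parses=%t schemeAndHost=%t\n",
			raw, parsesCleanly(raw), hasSchemeAndHost(raw))
	}
}
```

All three inputs parse, but only the first has both a scheme and a host, which
is why a parse-only check is rarely enough for user-supplied input.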
## Install
Use `go get` to install this package.
```shell
go get -u github.com/davidmytton/url-verifier
```
## Usage
### Basic usage
Use `Verify` to check whether a URL is correct:
```go
package main

import (
	"fmt"

	urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
	url := "https://example.com/"

	verifier := urlverifier.NewVerifier()
	ret, err := verifier.Verify(url)
	if err != nil {
		// Report the error and stop.
		fmt.Printf("Error: %s\n", err)
		return
	}

	fmt.Printf("Result: %+v\n", ret)
	/*
	  Result: &{
	    URL:https://example.com/
	    URLComponents:https://example.com/
	    IsURL:true
	    IsRFC3986URL:true
	    IsRFC3986URI:true
	    HTTP:<nil>
	  }
	*/
}
```
### URL reachability check
Call `EnableHTTPCheck()` to issue a `GET` request to the HTTP or HTTPS URL and
check whether it is reachable and returns a successful response: a success
(2xx) or success-like (3xx) status code. Non-HTTP(S) URLs will return an error.
```go
package main

import (
	"fmt"

	urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
	url := "https://example.com/"

	verifier := urlverifier.NewVerifier()
	verifier.EnableHTTPCheck()
	ret, err := verifier.Verify(url)
	if err != nil {
		// Report the error and stop.
		fmt.Printf("Error: %s\n", err)
		return
	}

	fmt.Printf("Result: %+v\n", ret)
	fmt.Printf("HTTP: %+v\n", ret.HTTP)

	// Guard against a nil HTTP result before reading its fields.
	if ret.HTTP != nil && ret.HTTP.IsSuccess {
		fmt.Println("The URL is reachable with status code", ret.HTTP.StatusCode)
	}
	/*
	  Result: &{
	    URL:https://example.com/
	    URLComponents:https://example.com/
	    IsURL:true
	    IsRFC3986URL:true
	    IsRFC3986URI:true
	    HTTP:0x140000b6a50
	  }
	  HTTP: &{
	    Reachable:true
	    StatusCode:200
	    IsSuccess:true
	  }
	  The URL is reachable with status code 200
	*/
}
```
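The success criterion described above (a 2xx or 3xx status code) can be
sketched as a plain range check. This is an illustration under that stated
assumption, not this library's code, and `isSuccessStatus` is a hypothetical
helper name.

```go
package main

import (
	"fmt"
	"net/http"
)

// isSuccessStatus is a hypothetical helper: it returns true for success
// (2xx) and success-like (3xx) status codes, and false for everything else.
func isSuccessStatus(code int) bool {
	return code >= http.StatusOK && code < http.StatusBadRequest
}

func main() {
	codes := []int{
		http.StatusOK,
		http.StatusMovedPermanently,
		http.StatusNotFound,
		http.StatusInternalServerError,
	}
	for _, code := range codes {
		fmt.Printf("%d %s success=%t\n", code, http.StatusText(code), isSuccessStatus(code))
	}
}
```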
## HTTP checks against internal URLs
By default, the reachability checks are only executed if the host resolves to a
non-internal IP address. An internal IP address is defined as any of:
[private](https://pkg.go.dev/net#IP.IsPrivate),
[loopback](https://pkg.go.dev/net#IP.IsLoopback), [link-local
unicast](https://pkg.go.dev/net#IP.IsLinkLocalUnicast), [link-local
multicast](https://pkg.go.dev/net#IP.IsLinkLocalMulticast), [interface-local
multicast](https://pkg.go.dev/net#IP.IsInterfaceLocalMulticast), or
[unspecified](https://pkg.go.dev/net#IP.IsUnspecified).
Skipping internal addresses is one layer of protection against [Server Side
Request
Forgery](https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html#application-layer_1)
(SSRF).
To allow internal HTTP checks, call `verifier.AllowHTTPCheckInternal()`:
```go
urlToCheck := "http://localhost:3000"

verifier := urlverifier.NewVerifier()
verifier.EnableHTTPCheck()
// Danger: Makes SSRF easier!
verifier.AllowHTTPCheckInternal()
ret, err := verifier.Verify(urlToCheck)
...
```
## Credits
This library is heavily inspired by
[`email-verifier`](https://github.com/AfterShip/email-verifier).
## License
This package is licensed under the MIT License.