File: url_parse.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/RcppExports.R
\name{url_parse}
\alias{url_parse}
\title{Split URLs into their component parts}
\usage{
url_parse(urls)
}
\arguments{
\item{urls}{a vector of URLs}
}
\value{
a data.frame consisting of the columns scheme, domain, port, path, query
and fragment. See the \href{http://tools.ietf.org/html/rfc3986}{relevant IETF RFC} for
definitions. If an element cannot be identified, it is represented by an empty string.
}
\description{
\code{url_parse} takes a vector of URLs and splits each one into its component
parts, as recognised by RFC 3986.
}
\details{
It is useful to be able to split a URL into its component parts, for example
for hostname extraction or for analysing API calls. Base R does not provide this
functionality, although \code{\link[httr]{parse_url}} does; that implementation
is written entirely in R, uses regular expressions, and is not vectorised. It is
perfectly suitable for its intended purpose (decomposing URLs in the context of
automated HTTP requests from R), but not for large-scale analysis.

Note that user authentication/identification information is not extracted;
this can be found with \code{\link{get_credentials}}.
}
\examples{
url_parse("https://en.wikipedia.org/wiki/Article")
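# A sketch of vectorised use, the main advantage over \code{httr::parse_url}:
# url_parse accepts a whole vector of URLs and returns one data.frame row per
# URL. The URLs and the columns inspected below are illustrative assumptions,
# not guaranteed output.
urls <- c("https://en.wikipedia.org/wiki/Article",
          "http://example.org:8080/path?q=1#top")
parsed <- url_parse(urls)
parsed$domain
parsed$port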

}
\seealso{
\code{\link{param_get}} for extracting values associated with particular keys in a URL's
query string, and \code{\link{url_compose}}, which is \code{url_parse} in reverse.
}