---
title: "Query structure"
layout: default
canonical: "/puppetdb/latest/api/query/v4/query.html"
---
# Query structure
[prefix]: http://en.wikipedia.org/wiki/Polish_notation
[jetty]: ../../../configure.markdown#jetty-http-settings
[urlencode]: http://en.wikipedia.org/wiki/Percent-encoding
[ast]: ./ast.markdown
[tutorial]: ../tutorial.markdown
[curl]: ../curl.markdown
[paging]: ./paging.markdown
[entities]: ./entities.markdown
[root]: ./overview.markdown
[pql]: ./pql.markdown
## Summary
PuppetDB's query API can retrieve data objects from PuppetDB for use in other
applications. For example, the PuppetDB-termini for Puppet Servers use this
API to collect exported resources.
The query API is implemented as HTTP URLs on the PuppetDB server. By default,
it can only be accessed over the network via host-verified HTTPS; [see the
jetty settings][jetty] if you need to access the API over unencrypted HTTP.
## Query structure
A query consists of an HTTP GET request to an endpoint URL, which can optionally include:
* A `query` URL parameter, whose value is a **query string**.
* Other URL parameters, to configure [paging][] or other behavior.
That is, most queries will look like a GET request to a URL that resembles the following:
`https://puppetdb:8081/pdb/query/v4/<ENDPOINT>?query=<QUERY STRING>`
Alternatively, you can omit the `<ENDPOINT>` suffix and provide the entity context as part of the query itself:
`https://puppetdb:8081/pdb/query/v4?query=<QUERY STRING>`
Consult the [root][] endpoint documentation for more details.
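As a rough illustration of the two URL shapes above, the following Python sketch builds one of each. The `certname` value is hypothetical, and the AST `from` form used for the root URL is an assumption based on the [AST][ast] and [root][] documentation; verify it there before relying on it.

```python
import json
from urllib.parse import urlencode

# Host and port are taken from the example URLs above.
BASE = "https://puppetdb:8081/pdb/query/v4"


def endpoint_url(endpoint, ast_query):
    """Build the <ENDPOINT>?query=<QUERY STRING> form."""
    return f"{BASE}/{endpoint}?" + urlencode({"query": json.dumps(ast_query)})


def root_url(entity, ast_query):
    """Build the root form, naming the entity inside the query (assumed AST "from" syntax)."""
    return f"{BASE}?" + urlencode({"query": json.dumps(["from", entity, ast_query])})


# Hypothetical certname, used only for illustration.
condition = ["=", "certname", "foo.example.com"]
print(endpoint_url("nodes", condition))
print(root_url("nodes", condition))
```

`urlencode` handles the URL-encoding of the query string, so the JSON can be written naturally in the calling code.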
### API URLs
API URLs generally look like this:
`https://<SERVER>:<PORT>/pdb/query/<API VERSION>/<ENDPOINT>?<PARAMETER>=<VALUE>&<PARAMETER>=<VALUE>`
For example: `https://puppetdb:8081/pdb/query/v4/resources?limit=50&offset=50`.
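To make the URL layout concrete, this small Python sketch splits the example URL into the pieces named in the template. It uses only the standard library, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://puppetdb:8081/pdb/query/v4/resources?limit=50&offset=50"
parts = urlsplit(url)

# Path layout: /pdb/query/<API VERSION>/<ENDPOINT>
_, prefix, kind, version, endpoint = parts.path.split("/", 4)

# parse_qs returns a list per parameter; keep the first value of each.
params = {name: values[0] for name, values in parse_qs(parts.query).items()}

print(parts.hostname, parts.port)  # <SERVER> and <PORT>
print(version, endpoint)           # <API VERSION> and <ENDPOINT>
print(params)                      # <PARAMETER>=<VALUE> pairs
```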
### API version
After the `/pdb/query/` prefix, the first part of an API URL is the
**API version,** written in the `v4` format. This section describes version
4 of the API, so every URL will begin with `/pdb/query/v4`.
### Entity endpoints
After the version, URLs are organized into a number of **endpoints** that identify the entity you want to query.
Conceptually, an entity endpoint represents a PuppetDB entity. Each version of the PuppetDB API defines a fixed set of endpoints.
See the [entities documentation][entities] for a list of the available endpoints. Each endpoint may have additional sub-endpoints under it; these are generally just shortcuts for the most common types of query, so that you can write terser and simpler query strings.
### URL parameters
Finally, the URL may include some **URL parameters.** Some endpoints require certain parameters; for others they're optional or disallowed. Each endpoint's page lists the parameters it accepts, and most endpoints also support the [paging][] parameters.
A group of parameters begins with a question mark (`?`). Each parameter is formatted as `<PARAMETER>=<VALUE>`, and additional parameters are separated by ampersands (`&`). All parameter values must be [URL-encoded.][urlencode]
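The formatting rules above can be sketched by hand in Python. The helper name `format_params` is hypothetical, and the sketch assumes parameter names never need encoding; only the values are percent-encoded.

```python
from urllib.parse import quote, unquote


def format_params(params):
    """Join parameters as ?<PARAMETER>=<VALUE>&<PARAMETER>=<VALUE>, encoding each value."""
    pairs = [f"{name}={quote(str(value), safe='')}" for name, value in params.items()]
    return "?" + "&".join(pairs)


# The query value is an illustrative AST string with a hypothetical certname.
raw_query = '["=", "certname", "foo.example.com"]'
encoded = format_params({"query": raw_query, "limit": 50})
print(encoded)

# Decoding the value recovers the original text.
value = encoded.split("&")[0].split("=", 1)[1]
print(unquote(value) == raw_query)
```

Characters such as `[`, `"`, `=`, `,`, and spaces are replaced by `%XX` escapes, which keeps them from being confused with the `?`, `=`, and `&` separators.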
#### `query`
The most common URL parameter is `query`, which lets you define the set of results returned by most endpoints.
PuppetDB provides two query languages. Consult the documentation for each for more details.
* [AST query language][ast]: a JSON-based query language.
* [Puppet query language][pql]: a newer query language designed to make querying
easier for human users than the legacy AST language.
A complete query string describes a comparison operation. When you submit a query, PuppetDB checks every _possible_
result from the endpoint against the comparison in the query string and returns only those objects
that match.
> **Note:** Only the [root][] endpoint supports PQL.
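The matching behavior described above can be pictured with a toy evaluator. This Python sketch is a conceptual model only, not how PuppetDB evaluates queries internally; it handles a tiny subset of prefix-notation operators, and the records and field names are hypothetical.

```python
def matches(query, record):
    """Evaluate a small subset of prefix-notation comparisons against one record."""
    op, *args = query
    if op == "=":
        field, value = args
        return record.get(field) == value
    if op == "and":
        return all(matches(sub, record) for sub in args)
    if op == "or":
        return any(matches(sub, record) for sub in args)
    if op == "not":
        (sub,) = args
        return not matches(sub, record)
    raise ValueError(f"unsupported operator: {op!r}")


# Hypothetical records standing in for the possible results of an endpoint.
candidates = [
    {"certname": "a.example.com", "environment": "production"},
    {"certname": "b.example.com", "environment": "testing"},
]
query = ["and",
         ["=", "environment", "production"],
         ["not", ["=", "certname", "b.example.com"]]]

selected = [r["certname"] for r in candidates if matches(query, r)]
print(selected)  # ['a.example.com']
```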
#### Paging
The next most common URL parameters are the **paging** parameters.
Most PuppetDB query endpoints support paged results via a set of shared URL parameters. For more information, please see the documentation on [paging][paging].
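As a brief sketch of how paged requests might be addressed, the following Python helper computes the URL for a given page using the `limit` and `offset` parameters from the earlier example. The helper name and the zero-based page numbering are assumptions made for illustration; see the paging documentation for the authoritative parameter list.

```python
from urllib.parse import urlencode


def page_url(endpoint, page, page_size=50):
    """Return the URL for a zero-based page of results (illustrative helper)."""
    params = {"limit": page_size, "offset": page * page_size}
    return f"https://puppetdb:8081/pdb/query/v4/{endpoint}?" + urlencode(params)


print(page_url("resources", 0))
print(page_url("resources", 1))
```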
## Query responses
All queries return data with a content type of `application/json`. Each endpoint's page describes the format of its return data.
### Rich data
Puppet 6 supports
[rich_data](https://github.com/puppetlabs/puppet-specifications/blob/master/language/types_values_variables.md#richdata)
types like Timestamp and SemVer, and enables rich data by default.
When rich data is enabled, readable string representations of rich
data values may appear in the report resource event `old_value` and
`new_value` fields, and in catalog parameter values.
For example, a Timestamp value would be recorded in PuppetDB as a
string like `"2012-10-10T00:00:00.000000000 UTC"`, and a Deferred value
would be recorded as a string like
`"Deferred({'name' => 'join', 'arguments' => [[1, 2, 3], ':']})"`.
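If a client needs the Timestamp string as a date value again, something like the following Python sketch may help. It assumes every such string has the same layout as the example above (nanosecond digits followed by ` UTC`), which is inferred from that single example rather than from a documented format. Python's `datetime` stores microseconds only, so the extra digits are truncated.

```python
from datetime import datetime, timezone


def parse_rich_timestamp(text):
    """Parse a string like '2012-10-10T00:00:00.000000000 UTC' (assumed layout)."""
    stamp, zone = text.rsplit(" ", 1)
    if zone != "UTC":
        raise ValueError(f"unexpected zone: {zone!r}")
    whole, _, fraction = stamp.partition(".")
    # Keep at most six fractional digits; pad if fewer are present.
    micros = int((fraction + "000000")[:6])
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


print(parse_rich_timestamp("2012-10-10T00:00:00.000000000 UTC").isoformat())
```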
## Tutorial and tips
For a walkthrough on constructing queries, see [the query tutorial page][tutorial]. For quick tips on using curl to make ad hoc queries, see [the curl tips page][curl].
## Experimental query termination
PuppetDB now monitors all queries for client disconnects, and
terminates the query (including the database work) as soon as the
client is gone. The same mechanism also helps enforce relevant query
timeouts promptly.
For now, this subsystem can be disabled by setting the environment
variable `PDB_PROMPTLY_END_QUERIES` to `false`, which might be helpful
if you encounter
[this issue with PuppetDB 8.1.0](https://github.com/puppetlabs/puppetdb/issues/3866),
but the variable is likely to be removed in a future release.
## Experimental query optimization
> **Note:** This feature is experimental and may be altered or removed
> in a future release. It is expected to be safe and is now enabled by
> default, but some caution is still advisable: if something were to go
> wrong, the result set returned by the query might be incorrect. See
> below for one way to double-check the results if you suspect
> something might be amiss.
PuppetDB has an experimental query optimizer that may be able to
substantially decrease the cost (and correspondingly decrease the
response time) of some queries. It does this by attempting to avoid
retrieving unnecessary data when generating a response.
At the moment, this optimization can only be applied for queries that
ask for a subset of the available query response fields, for example,
a query against the nodes endpoint that only extracts the `certname`.
Further, for any given query it may or may not have any effect at all,
and the effect may vary across PuppetDB releases.
This optimization is now enabled by default, but the default can be
changed by setting the `PDB_QUERY_OPTIMIZE_DROP_UNUSED_JOINS`
environment variable to `by-request` before starting puppetdb.
To enable the optimization for an individual query, add
`optimize_drop_unused_joins=true` as a URL parameter. To determine
whether PuppetDB attempted to optimize a query, check the logs: any
optimization efforts are logged at debug level.
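One way to add the parameter to an existing query URL is sketched below in Python. The helper name `with_optimizer` is hypothetical; the only detail taken from the text above is the parameter `optimize_drop_unused_joins=true`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PARAM = "optimize_drop_unused_joins"


def with_optimizer(url):
    """Return the URL with optimize_drop_unused_joins=true set exactly once."""
    parts = urlsplit(url)
    # Drop any existing copy of the parameter, then append it.
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != PARAM]
    params.append((PARAM, "true"))
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_optimizer("https://puppetdb:8081/pdb/query/v4/nodes?limit=50"))
```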
(If you happen to have `diff` and `jq` installed, you should be able
to compare the results of a given query with and without the
optimizer by saving the data in each case to a file and then running
a command like this:
`diff -u <(jq -S . result-1.json) <(jq -S . result-2.json)`.)
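Where `diff` and `jq` are not available, a roughly equivalent comparison can be sketched in Python with the standard library. The sample response bodies are hypothetical; in practice the two strings would be read from the saved result files.

```python
import difflib
import json


def normalized_lines(text):
    """Re-serialize JSON with sorted object keys, roughly like `jq -S .`."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()


def diff_results(first, second):
    """Return unified-diff lines; an empty list means the results match."""
    return list(difflib.unified_diff(
        normalized_lines(first), normalized_lines(second),
        "result-1.json", "result-2.json", lineterm=""))


# Hypothetical response bodies that differ only in key order.
a = '[{"certname": "a.example.com", "deactivated": null}]'
b = '[{"deactivated": null, "certname": "a.example.com"}]'
print(diff_results(a, b))
```

As with `jq -S`, only object keys are sorted. Array order is preserved, so the same rows returned in a different order still show up as a difference.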