File: 2022-02-09-programmatic-inspection-in-go-with-atlas.md

package info (click to toggle)
golang-ariga-atlas 0.7.2-2
  • links: PTS, VCS
  • area: main
  • in suites: bookworm, forky, sid, trixie
  • size: 5,676 kB
  • sloc: javascript: 592; sql: 404; makefile: 10
file content (382 lines) | stat: -rw-r--r-- 16,943 bytes parent folder | download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
---
title: Programmatic inspection of databases in Go using Atlas 
date: "2022-02-09"
author: Rotem Tamir 
authorURL: "https://github.com/rotemtam"
authorImageURL: "https://s.gravatar.com/avatar/36b3739951a27d2e37251867b7d44b1a?s=80"
authorTwitter: _rtam 
url: /programmatic-inspection-of-databases-in-go-using-atlas/
image: https://release.ariga.io/images/assets/inspector-carbon.png
---
Database inspection is the process of connecting to a database to extract metadata about the way data is structured
inside it. In this post, we will present some use cases for inspecting a database, demonstrate why it is a non-trivial
problem to solve, and finally show how it can be solved using [Atlas](https://atlasgo.io), an open-source package (and
command-line tool) written in [Go](https://go.dev) that we are maintaining at Ariga.

As an infrastructure engineer, I have wished many times to have a simple way to programmatically inspect a database.
Database schema inspection can be useful for many purposes. For instance, you might use it to create visualizations of
data topologies, or use it to find table columns that are no longer in use and can be deprecated. Perhaps you would like
to automatically generate resources from this schema (such as documentation or GraphQL schemas), or to use to locate
fields that might carry personally-identifiable information for compliance purposes. Whatever your use case may be,
having a robust way to get the schema of your database is the foundation for many kinds of infrastructure applications.

When we started working on the core engine for Atlas, we quickly discovered that there wasn't any established tool or
package that could parse the information schema of popular databases and return a data structure representing it. Why is
this the case? After all, most databases provide some command-line tool to perform inspection. For example,
`psql`, the standard CLI for Postgres, supports the `\d` command to describe a table:

```text
postgres=# \d users;
                       Table "public.users"
 Column |          Type          | Collation | Nullable | Default
--------+------------------------+-----------+----------+---------
 id     | integer                |           | not null |
 name   | character varying(255) |           |          |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
```

So what makes inspection a non-trivial problem to solve? In this post, I will discuss two aspects that I think are
interesting. The first is the variance in how databases expose schema metadata and the second is the complexity of the
data model that is required to represent a database schema.

#### How databases expose schema metadata

Most of the SQL that we use in day-to-day applications is pretty standard. However, when it comes to exposing schema
metadata, database engines vary greatly in the way they work. The way to retrieve information about things like
available schemas and tables, column types and their default values and many other aspects of the database schema looks
completely different in each database engine. For instance, consider this query
([source](https://github.com/ariga/atlas/blob/2e0886e03c5862c54247f41f906f60d64f9c7eaf/sql/postgres/inspect.go#L728))
which can be used to get the metadata about table columns from a Postgres database:

```sql
SELECT t1.table_name,
       t1.column_name,
       t1.data_type,
       t1.is_nullable,
       t1.column_default,
       t1.character_maximum_length,
       t1.numeric_precision,
       t1.datetime_precision,
       t1.numeric_scale,
       t1.character_set_name,
       t1.collation_name,
       t1.udt_name,
       t1.is_identity,
       t1.identity_start,
       t1.identity_increment,
       t1.identity_generation,
       col_description(to_regclass("table_schema" || '.' || "table_name")::oid, "ordinal_position") AS comment,
       t2.typtype,
       t2.oid
FROM "information_schema"."columns" AS t1
         LEFT JOIN pg_catalog.pg_type AS t2
                   ON t1.udt_name = t2.typname
WHERE table_schema = $1
  AND table_name IN (%s)
ORDER BY t1.table_name, t1.ordinal_position
```

As you can see, while it's definitely possible to get the needed metadata, information about the schema is stored in
multiple tables in a way that isn't particularly well documented, and often requires delving into the actual source code
to understand fully. Here's a query to get similar information from
MySQL ([source](https://github.com/ariga/atlas/blob/2e0886e03c5862c54247f41f906f60d64f9c7eaf/sql/mysql/inspect.go#L631)):

```sql
SELECT `TABLE_NAME`,
       `COLUMN_NAME`,
       `COLUMN_TYPE`,
       `COLUMN_COMMENT`,
       `IS_NULLABLE`,
       `COLUMN_KEY`,
       `COLUMN_DEFAULT`,
       `EXTRA`,
       `CHARACTER_SET_NAME`,
       `COLLATION_NAME`
FROM `INFORMATION_SCHEMA`.`COLUMNS`
WHERE `TABLE_SCHEMA` = ?
  AND `TABLE_NAME` IN (%s)
ORDER BY `ORDINAL_POSITION`
```

While this query is much shorter, you can see that it's completely different from the one we ran to inspect Postgres
column metadata. This demonstrates just one way in inspecting Postgres is difference from inspecting MySQL.

#### Mapping database schemas into a useful data structure

To be a solid foundation for building infrastructure, inspection must produce a useful data structure that can be
traversed and analyzed to provide insights, in other words, a graph representing the data topology. As mentioned
above, such graphs can be used to create ERD (entity-relation diagram) charts, such as the schema visualizations
on the [Atlas Management UI](https://atlasgo.io/ui/intro):

![Schema ERD open](https://atlasgo.io/uploads/images/docs/schema-erd-open.png)

Let's consider some aspects of database schemas that such a data structure should capture:

* Databases are split into logical schemas.
* Schemas contain tables, and may have attributes (such as default collation).
* Tables contain columns, indexes and constraints.
* Columns are complex entities that have types, that may be standard to the database engine (and version) or
  custom data types that are defined by the user. In addition,  Columns may have attributes, such as default 
  values, that may be a literal or an expression (it is important to be  able to discern between `now()` and `"now()"`).
* Indexes contain references to columns of the table they are defined on.
* Foreign Keys contain references to column in other tables, that may reside in other schemas.
* ...and much, much more!

To capture any one of these aspects boils down to figuring out the correct query for the specific database engine you
are working with. To be able to provide developers with a data structure that captures all of it, and to do it well
across different versions of multiple database engines we've learned, is not an easy task. This is a perfect opportunity
for an infrastructure project: a problem that is annoyingly complex to solve and that if solved well, becomes a
foundation for many kinds of applications. This was one of our motivations for
creating [Atlas](https://atlasgo.io) ([GitHub](https://github.com/ariga/atlas)) - an open-source project that we
maintain here at [Ariga](https://ariga.io).

Using Atlas, database schemas can be inspected to product Go structs representing a graph of the database
schema topology. Notice the many cyclic references that make it hard to print (but very ergonomic to travere :-)):

```go
&schema.Realm{
    Schemas: {
        &schema.Schema{
            Name:   "test",
            Tables: {
                &schema.Table{
                    Name:    "users",
                    Schema:  &schema.Schema{(CYCLIC REFERENCE)},
                    Columns: {
                        &schema.Column{
                            Name: "id",
                            Type: &schema.ColumnType{
                                Type: &schema.IntegerType{
                                    T:        "int",
                                    Unsigned: false,
                                },
                                Null: false,
                            },
                        },
                    },
                    PrimaryKey: &schema.Index{
                        Unique: false,
                        Table:  &schema.Table{(CYCLIC REFERENCE)},
                        Attrs:  nil,
                        Parts:  {
                            &schema.IndexPart{
                                SeqNo: 0,
                                Desc:  false,
                                C:     &schema.Column{(CYCLIC REFERENCE)},
                            },
                        },
                    },
                },
                &schema.Table{
                    Name:    "posts",
                    Schema:  &schema.Schema{(CYCLIC REFERENCE)},
                    Columns: {
                        &schema.Column{
                            Name: "id",
                            Type: &schema.ColumnType{
                                Type: &schema.IntegerType{
                                    T:        "int",
                                    Unsigned: false,
                                },
                                Null: false,
                            },
                        },
                        &schema.Column{
                            Name: "author_id",
                            Type: &schema.ColumnType{
                                Type: &schema.IntegerType{
                                    T:        "int",
                                    Unsigned: false,
                                },
                                Null: true,
                            },
                        },
                    },
                    PrimaryKey: &schema.Index{
                        Unique: false,
                        Table:  &schema.Table{(CYCLIC REFERENCE)},
                        Parts:  {
                            &schema.IndexPart{
                                SeqNo: 0,
                                Desc:  false,
                                C:     &schema.Column{(CYCLIC REFERENCE)},
                            },
                        },
                    },
                    ForeignKeys: {
                        &schema.ForeignKey{
                            Symbol:  "owner_id",
                            Table:   &schema.Table{(CYCLIC REFERENCE)},
                            Columns: {
                                &schema.Column{(CYCLIC REFERENCE)},
                            },
                            RefTable:   &schema.Table{(CYCLIC REFERENCE)},
                            RefColumns: {
                                &schema.Column{(CYCLIC REFERENCE)},
                            },
                            OnDelete: "SET NULL",
                        },
                    },
                },
            },
        },
    },
}
```

#### Inspecting databases in Go using Atlas

While Atlas is commonly used as a [CLI tool](https://atlasgo.io/cli/getting-started/setting-up), all of Atlas's
core-engine capabilities are available as a [Go module](https://pkg.go.dev/ariga.io/atlas) that you can use
programmatically. Let's get started with database inspection in Go:

To install Atlas, use:

```shell
go get ariga.io/atlas@master
```

#### Drivers

Atlas currently supports three core capabilities for working with SQL schemas.

* "Inspection" - Connecting to a database and understanding its schema.
* "Plan" - Compares two schemas and produces a set of changes needed to reconcile the target schema to the source
  schema.
* "Apply" - creates concrete set of SQL queries to migrate the target database.

In this post we will dive into the inspection with Atlas. The way inspection is done varies greatly between the
different SQL databases. Atlas currently has four supported drivers:

* MySQL
* MariaDB
* PostgreSQL
* SQLite

Atlas drivers are built on top of the standard library [`database/sql`](https://pkg.go.dev/database/sql)
package. To initialize the different drivers, we need to initialize a `sql.DB` and pass it to the Atlas driver
constructor. For example:

```go
package tutorial

import (
	"database/sql"
	"log"
	"testing"
	_ "github.com/mattn/go-sqlite3"
	"ariga.io/atlas/sql/schema"
	"ariga.io/atlas/sql/sqlite"
)

func Test(t *testing.T) {
	// Open a "connection" to sqlite.
	db, err := sql.Open("sqlite3", "file:example.db?cache=shared&_fk=1&mode=memory")
	if err != nil {
		log.Fatalf("failed opening db: %s", err)
	}
	// Open an atlas driver.
	driver, err := sqlite.Open(db)
	if err != nil {
		log.Fatalf("failed opening atlas driver: %s", err)
	}
	// ... do stuff with the driver
}
```

#### Inspection

As we mentioned above, inspection is one of Atlas's core capabilities. Consider the `Inspector`
interface in the [sql/schema](https://pkg.go.dev/ariga.io/atlas@master/sql/schema#Inspector)
package:

```go
package schema

// Inspector is the interface implemented by the different database
// drivers for inspecting multiple tables.
type Inspector interface {
	// InspectSchema returns the schema description by its name. An empty name means the
	// "attached schema" (e.g. SCHEMA() in MySQL or CURRENT_SCHEMA() in PostgreSQL).
	// A NotExistError error is returned if the schema does not exists in the database.
	InspectSchema(ctx context.Context, name string, opts *InspectOptions) (*Schema, error)

	// InspectRealm returns the description of the connected database.
	InspectRealm(ctx context.Context, opts *InspectRealmOption) (*Realm, error)
}
```

As you can see, the `Inspector` interface provides methods for inspecting on different levels:

* `InspectSchema` - provides inspection capabilities for a single schema within a database server.
* `InspectRealm` - inspects the entire connected database server.

Each database driver (for example [MySQL](https://pkg.go.dev/ariga.io/atlas@master/sql/mysql#Driver),
[Postgres](https://pkg.go.dev/ariga.io/atlas@master/sql/postgres#Driver) or
[SQLite](https://pkg.go.dev/ariga.io/atlas@master/sql/sqlite#Driver)) implements this interface. Let's see how we can
use this interface by inspecting a "dummy" SQLite database. Continuing on the example from above:

```go
package tutorial

func TestInspect(t *testing.T) {
	// ... skipping driver creation
	ctx := context.Background()
	// Create an "example" table for Atlas to inspect.
	_, err = db.ExecContext(ctx, "create table example ( id int not null );")
	if err != nil {
		log.Fatalf("failed creating example table: %s", err)
	}
	// Open an atlas driver.
	driver, err := sqlite.Open(db)
	if err != nil {
		log.Fatalf("failed opening atlas driver: %s", err)
	}
	// Inspect the created table.
	sch, err := driver.InspectSchema(ctx, "main", &schema.InspectOptions{
		Tables: []string{"example"},
	})
	if err != nil {
		log.Fatalf("failed inspecting schema: %s", err)
	}
	tbl, ok := sch.Table("example")
	require.True(t, ok, "expected to find example table")
	require.EqualValues(t, "example", tbl.Name)
	id, ok := tbl.Column("id")
	require.True(t, ok, "expected to find id column")
	require.EqualValues(t, &schema.ColumnType{
		Type: &schema.IntegerType{T: "int"}, // An integer type, specifically "int".
		Null: false,                         // The column has NOT NULL set.
		Raw:  "INT",                         // The raw type inspected from the DB.
	}, id.Type)
}
```

The full source-code for this example is available in
the [atlas-examples repo](https://github.com/ariga/atlas-examples/blob/fb7fef80ca0ad635f056c40a0a1ea223ccf0a9c0/inspect_test.go#L15)
.

And voila! In this example, we first created a table named "example" by executing a query directly against the database.
Next, we used the driver's `InspectSchema` method to inspect the schema of the table we created. Finally, we made some
assertions on the returned `schema.Table` instance to verify that it was inspected correctly.

#### Inspecting using the CLI

If you don't want to write any code and just want to get a document representing your database schema, you can always
use the Atlas CLI to do it for you. To get
started, [head over to the docs](https://atlasgo.io/cli/getting-started/setting-up).

#### Wrapping up

In this post we presented the Go API of Atlas, which we initially built around our use case of building a new database
migration tool, as part of
the [Operational Data Graph Platform](https://blog.ariga.io/data-access-should-be-an-infrastructure-problem/)
that we are creating here at Ariga. As we mentioned in the beginning of this post, there are a lot of cool things you
can build if you have proper database inspection, which raises the question, what will **you** build with it?

#### Getting involved with Atlas

* Follow the [Getting Started](https://atlasgo.io/cli/getting-started/setting-up) guide.
* Join our [Discord Server](https://discord.gg/zZ6sWVg6NT).
* Follow us [on Twitter](https://twitter.com/ariga_io).