File: mysql.rst

package info (click to toggle)
pgloader 3.6.9-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 12,040 kB
  • sloc: sql: 32,321; lisp: 14,792; makefile: 439; sh: 85; python: 26
file content (178 lines) | stat: -rw-r--r-- 9,378 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
Migrating from MySQL to PostgreSQL
----------------------------------

If you want to migrate your data over to `PostgreSQL
<http://www.postgresql.org>`_ from MySQL then pgloader is the tool of
choice!

Most tools around are skipping the main problem with migrating from MySQL,
which is to do with the type casting and data sanitizing that needs to be
done. pgloader will not leave you alone on those topics.

In a Single Command Line
^^^^^^^^^^^^^^^^^^^^^^^^

As an example, we will use the f1db database from <http://ergast.com/mrd/>
which which provides a historical record of motor racing data for
non-commercial purposes. You can either use their API or download the whole
database at `http://ergast.com/downloads/f1db.sql.gz
<http://ergast.com/downloads/f1db.sql.gz>`_. Once you've done that load the
database in MySQL::

    $ mysql -u root
    > create database f1db;
    > source f1db.sql

Now let's migrate this database into PostgreSQL in a single command line::

    $ createdb f1db
    $ pgloader mysql://root@localhost/f1db pgsql:///f1db

Done! All with schema, table definitions, constraints, indexes, primary
keys, *auto_increment* columns turned into *bigserial* , foreign keys,
comments, and if you had some MySQL default values such as *ON UPDATE
CURRENT_TIMESTAMP* they would have been translated to a `PostgreSQL before
update trigger
<https://www.postgresql.org/docs/current/static/plpgsql-trigger.html>`_
automatically.

::

    $ pgloader mysql://root@localhost/f1db pgsql:///f1db
    2017-06-16T08:56:14.064000+02:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
    2017-06-16T08:56:14.068000+02:00 LOG Data errors in '/private/tmp/pgloader/'
    2017-06-16T08:56:19.542000+02:00 LOG report summary reset
                   table name       read   imported     errors      total time
    -------------------------  ---------  ---------  ---------  --------------
              fetch meta data         33         33          0          0.365s 
               Create Schemas          0          0          0          0.007s 
             Create SQL Types          0          0          0          0.006s 
                Create tables         26         26          0          0.068s 
               Set Table OIDs         13         13          0          0.012s 
    -------------------------  ---------  ---------  ---------  --------------
      f1db.constructorresults      11011      11011          0          0.205s 
                f1db.circuits         73         73          0          0.150s 
            f1db.constructors        208        208          0          0.059s 
    f1db.constructorstandings      11766      11766          0          0.365s 
                 f1db.drivers        841        841          0          0.268s 
                f1db.laptimes     413578     413578          0          2.892s 
         f1db.driverstandings      31420      31420          0          0.583s 
                f1db.pitstops       5796       5796          0          2.154s 
                   f1db.races        976        976          0          0.227s 
              f1db.qualifying       7257       7257          0          0.228s 
                 f1db.seasons         68         68          0          0.527s 
                 f1db.results      23514      23514          0          0.658s 
                  f1db.status        133        133          0          0.130s 
    -------------------------  ---------  ---------  ---------  --------------
      COPY Threads Completion         39         39          0          4.303s 
               Create Indexes         20         20          0          1.497s 
       Index Build Completion         20         20          0          0.214s 
              Reset Sequences          0         10          0          0.058s 
                 Primary Keys         13         13          0          0.012s 
          Create Foreign Keys          0          0          0          0.000s 
              Create Triggers          0          0          0          0.001s 
             Install Comments          0          0          0          0.000s 
    -------------------------  ---------  ---------  ---------  --------------
            Total import time     506641     506641          0          5.547s 

You may need to have special cases to take care of tho, or views that you
want to materialize while doing the migration. In advanced case you can use
the pgloader command.

The Command
^^^^^^^^^^^

To load data with pgloader you need to define in a *command* the operations
in some details. Here's our example for loading the `MySQL Sakila Sample
Database <http://dev.mysql.com/doc/sakila/en/>`_.

Here's our command::

    load database
         from      mysql://root@localhost/sakila
         into postgresql:///sakila
    
     WITH include drop, create tables, no truncate,
          create indexes, reset sequences, foreign keys
    
      SET maintenance_work_mem to '128MB', work_mem to '12MB', search_path to 'sakila'
    
     CAST type datetime to timestamptz
                    drop default drop not null using zero-dates-to-null,
          type date drop not null drop default using zero-dates-to-null
         
     MATERIALIZE VIEWS film_list, staff_list
    
     -- INCLUDING ONLY TABLE NAMES MATCHING ~/film/, 'actor'
     -- EXCLUDING TABLE NAMES MATCHING ~<ory>
    
     BEFORE LOAD DO
     $$ create schema if not exists sakila; $$;

Note that here pgloader will benefit from the meta-data information found in
the MySQL database to create a PostgreSQL database capable of hosting the
data as described, then load the data.

In particular, some specific *casting rules* are given here, to cope with
date values such as `0000-00-00` that MySQL allows and PostgreSQL rejects
for not existing in our calendar. It's possible to add per-column casting
rules too, which is useful is some of your `tinyint` are in fact `smallint`
while some others are in fact `boolean` values.

Finaly note that we are using the *MATERIALIZE VIEWS* clause of pgloader:
the selected views here will be migrated over to PostgreSQL *with their
contents*.

It's possible to use the *MATERIALIZE VIEWS* clause and give both the name
and the SQL (in MySQL dialect) definition of view, then pgloader creates the
view before loading the data, then drops it again at the end.

Loading the data
^^^^^^^^^^^^^^^^

Let's start the `pgloader` command with our `sakila.load` command file::

    $ pgloader sakila.load
    ... LOG Starting pgloader, log system is ready.
    ... LOG Parsing commands from file "/Users/dim/dev/pgloader/test/sakila.load"
       <WARNING: table "xxx" does not exists have been edited away>
    
                table name       read   imported     errors            time
    ----------------------  ---------  ---------  ---------  --------------
               before load          1          1          0          0.007s
           fetch meta data         45         45          0          0.402s
              create, drop          0         36          0          0.208s
    ----------------------  ---------  ---------  ---------  --------------
                     actor        200        200          0          0.071s
                   address        603        603          0          0.035s
                  category         16         16          0          0.018s
                      city        600        600          0          0.037s
                   country        109        109          0          0.023s
                  customer        599        599          0          0.073s
                      film       1000       1000          0          0.135s
                film_actor       5462       5462          0          0.236s
             film_category       1000       1000          0          0.070s
                 film_text       1000       1000          0          0.080s
                 inventory       4581       4581          0          0.136s
                  language          6          6          0          0.036s
                   payment      16049      16049          0          0.539s
                    rental      16044      16044          0          0.648s
                     staff          2          2          0          0.041s
                     store          2          2          0          0.036s
                 film_list        997        997          0          0.247s
                staff_list          2          2          0          0.135s
    Index Build Completion          0          0          0          0.000s
    ----------------------  ---------  ---------  ---------  --------------
            Create Indexes         41         41          0          0.964s
           Reset Sequences          0          1          0          0.035s
              Foreign Keys         22         22          0          0.254s
    ----------------------  ---------  ---------  ---------  --------------
         Total import time      48272      48272          0          3.502s

The *WARNING* messages we see here are expected as the PostgreSQL database
is empty when running the command, and pgloader is using the SQL commands
`DROP TABLE IF EXISTS` when the given command uses the `include drop`
option.

Note that the output of the command has been edited to facilitate its
browsing online.