File: README.md

package info (click to toggle)
pgloader 3.6.9-1
  • links: PTS, VCS
  • area: main
  • in suites: bookworm
  • size: 12,040 kB
  • sloc: sql: 32,321; lisp: 14,792; makefile: 439; sh: 85; python: 26
file content (42 lines) | stat: -rw-r--r-- 1,442 bytes parent folder | download | duplicates (5)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
# Citus Multi-Tenant Automatic Distribution

In this test case we follow the following documentation:

  https://docs.citusdata.com/en/v7.5/use_cases/multi_tenant.html
  
We install the schema before Citus migration, and load the data without the
backfilling that is already done. For that we use pgloader to ignore the
company_id column in the tables that didn't have this column prior to the
Citus migration effort.

Then the following `company.load` file contains the pgloader command that
runs a full migration from PostgreSQL to Citus:

```
load database
   from pgsql:///hackathon
   into pgsql://localhost:9700/dim

   with include drop, reset no sequences

   distribute companies using id;
```

Tables are marked distributed, the company_id column is added where it's
needed, primary keys and foreign keys definitions are altered to the new
model, and finally the data is backfilled automatically in the target table
thanks to generating queries like the following:

~~~
SELECT "campaigns".company_id::text,
       "impressions".id::text,
       "impressions".ad_id::text,
       "impressions".seen_at::text,
       "impressions".site_url::text,
       "impressions".cost_per_impression_usd::text,
       "impressions".user_ip::text,
       "impressions".user_data::text
  FROM "public"."impressions"  
        JOIN "public"."ads" ON impressions.ad_id = ads.id
        JOIN "public"."campaigns" ON ads.campaign_id = campaigns.id
~~~