Swift Global Cluster (Multi-Region)

Intro


Swift Global Cluster [1] is a feature that allows Swift to span multiple regions; by default Swift operates in a single-region mode. Setting up Swift Global Cluster is not difficult, but the configuration overhead is, as usual, high. Fortunately, there are application modelling tools like Juju [2] available which facilitate software installation and configuration. I have recently added support for the Swift Global Cluster feature to the Swift charms [3]. In the following article I will present how to set up Swift in multi-region mode with Juju.

Design


Let's assume that you have two geographically-distributed sites, dc1 and dc2, and you want to deploy Swift regions 1 and 2 in them respectively. We will use Juju for modelling purposes and MaaS [4] as a provider for Juju. Each site has MaaS installed and configured, with three nodes enlisted and commissioned. The 10.0.1.0/24 and 10.0.2.0/24 subnets are routed and there are no restrictions between them. The whole environment is managed from a Juju client which is external to the sites. This concept is presented in the following figure:


Each node will host Swift storage services and an LXD container [5] with the Swift proxy service. Swift proxy will be deployed in HA mode. Each node belongs to a different zone and has three disks (sdb, sdc and sdd) for object storage. The end goal is to have three replicas of each object in each site.

P.S.: If you have more than two sites, don't worry. Swift Global Cluster scales out, so you can easily add more regions later on.

Initial deployment


Let's assume that you already have the Juju client installed, two MaaS clouds added to the client and Juju controllers bootstrapped in each cloud. If you don't know how to do this, refer to the Juju documentation [2]. You can list Juju controllers by executing the following command:

$ juju list-controllers

Controller  Model    User   Access     Cloud/Region  Models  Machines  HA    Version
juju-dc1*   default  admin  superuser  maas-dc1      2       1         none  2.5.1
juju-dc2    default  admin  superuser  maas-dc2      2       1         none  2.5.1

NOTE: Make sure you use Juju version 2.5.1 or later.

The asterisk character indicates the current controller in use. You can switch between them by executing the following command:

$ juju switch <controller_name>

Before we start, we have to download the patched charms from the branches I created (they haven't been merged into the upstream code yet):

$ cd /tmp

$ git clone git@github.com:tytus-kurek/charm-swift-proxy.git

$ git clone git@github.com:tytus-kurek/charm-swift-storage.git

$ cd charm-swift-proxy

$ git checkout 1815879

$ cd ../charm-swift-storage

$ git checkout 1815879

Then we create Juju bundles which will be used to deploy the models:

$ cat <<EOF > /tmp/swift-dc1.yaml
series: bionic
services:
  swift-storage-dc1-zone1:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 1
      zone: 1
  swift-storage-dc1-zone2:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 1
      zone: 2
  swift-storage-dc1-zone3:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 1
      zone: 3
  swift-proxy-dc1:
    charm: /tmp/charm-swift-proxy
    num_units: 3
    options:
      enable-multi-region: true
      read-affinity: "r1=100, r2=200"
      region: "RegionOne"
      replicas: 3
      vip: "10.0.1.254"
      write-affinity: "r1, r2"
      write-affinity-node-count: 3
      zone-assignment: manual
    to:
    - lxd:0
    - lxd:1
    - lxd:2
  hacluster-swift-proxy-dc1:
    charm: cs:hacluster
relations:
  - [ "haproxy-swift-proxy-dc1:ha", "swift-proxy-dc1:ha" ]
  - [ "swift-proxy-dc1:swift-storage", "swift-storage-dc1-zone1:swift-storage" ]
  - [ "swift-proxy-dc1:swift-storage", "swift-storage-dc1-zone2:swift-storage" ]
  - [ "swift-proxy-dc1:swift-storage", "swift-storage-dc1-zone3:swift-storage" ]
EOF

$ cat <<EOF > /tmp/swift-dc2.yaml
series: bionic
services:
  swift-storage-dc2-zone1:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 2
      zone: 1
  swift-storage-dc2-zone2:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 2
      zone: 2
  swift-storage-dc2-zone3:
    charm: /tmp/charm-swift-storage
    num_units: 1
    options:
      block-device: sdb sdc sdd
      region: 2
      zone: 3
  swift-proxy-dc2:
    charm: /tmp/charm-swift-proxy
    num_units: 3
    options:
      enable-multi-region: true
      read-affinity: "r2=100, r1=200"
      region: "RegionTwo"
      replicas: 3
      vip: "10.0.2.254"
      write-affinity: "r2, r1"
      write-affinity-node-count: 3
      zone-assignment: manual
    to:
    - lxd:0
    - lxd:1
    - lxd:2
  hacluster-swift-proxy-dc2:
    charm: cs:hacluster
relations:
  - [ "haproxy-swift-proxy-dc2:ha", "swift-proxy-dc2:ha" ]
  - [ "swift-proxy-dc2:swift-storage", "swift-storage-dc2-zone1:swift-storage" ]
  - [ "swift-proxy-dc2:swift-storage", "swift-storage-dc2-zone2:swift-storage" ]
  - [ "swift-proxy-dc2:swift-storage", "swift-storage-dc2-zone3:swift-storage" ]
EOF

Note that we mark all storage nodes in dc1 as Swift region 1 and all storage nodes in dc2 as Swift region 2. The affinity settings of the Swift proxy application determine how data is read and written.
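
For reference, once the proxy units are up, these options end up in the [app:proxy-server] section of /etc/swift/proxy-server.conf, roughly as follows for a region-1 proxy (a sketch only; the file is fully managed by the charm):

[app:proxy-server]
...
sorting_method = affinity
read_affinity = r1=100, r2=200
write_affinity = r1, r2
write_affinity_node_count = 3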

Finally we create the models and deploy the bundles:

$ juju switch juju-dc1

$ juju add-model swift-dc1

$ juju deploy /tmp/swift-dc1.yaml

$ juju switch juju-dc2

$ juju add-model swift-dc2

$ juju deploy /tmp/swift-dc2.yaml

This takes a while. Monitor Juju status and wait until all units in both models enter the active state.
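
If you want to keep an eye on a model while it converges, a simple watch loop is handy (purely a convenience; plain juju status works too):

$ juju switch juju-dc1

$ watch -c juju status --color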

Setting up Swift Global Cluster


In order to set up Swift Global Cluster, we have to relate the storage nodes from dc1 with the Swift proxy application in dc2 and vice versa. Moreover, a master-slave relation has to be established between the swift-proxy-dc1 and swift-proxy-dc2 applications. However, as these applications don't belong to the same model / controller / cloud, we have to create offers [6] first (offers enable cross-model / cross-controller / cross-cloud relations):
 
$ juju switch juju-dc1

$ juju offer swift-proxy-dc1:master swift-proxy-dc1-master

$ juju offer swift-proxy-dc1:swift-storage swift-proxy-dc1-swift-storage

$ juju switch juju-dc2

$ juju offer swift-proxy-dc2:swift-storage swift-proxy-dc2-swift-storage

Then consume the offers:

$ juju switch juju-dc1

$ juju consume maas-dc2:admin/swift-proxy-dc2-swift-storage

$ juju switch juju-dc2

$ juju consume maas-dc1:admin/swift-proxy-dc1-master

$ juju consume maas-dc1:admin/swift-proxy-dc1-swift-storage

Add required relations:

$ juju switch juju-dc1

$ juju relate swift-storage-dc1-zone1 swift-proxy-dc2-swift-storage

$ juju relate swift-storage-dc1-zone2 swift-proxy-dc2-swift-storage

$ juju relate swift-storage-dc1-zone3 swift-proxy-dc2-swift-storage

$ juju switch juju-dc2

$ juju relate swift-storage-dc2-zone1 swift-proxy-dc1-swift-storage

$ juju relate swift-storage-dc2-zone2 swift-proxy-dc1-swift-storage

$ juju relate swift-storage-dc2-zone3 swift-proxy-dc1-swift-storage

$ juju relate swift-proxy-dc2:slave swift-proxy-dc1-master

Finally increase the replication factor to 6:

$ juju switch juju-dc1

$ juju config swift-proxy-dc1 replicas=6

$ juju switch juju-dc2

$ juju config swift-proxy-dc2 replicas=6

This setting, together with the affinity settings, ensures that three replicas of each object are created in each site.
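
To verify the resulting ring, inspect the object builder on one of the proxy units; it should report 6 replicas and list devices from both regions (assuming the builder files live under /etc/swift on the proxy units, which is where the charm keeps them):

$ juju ssh swift-proxy-dc1/0

$ sudo swift-ring-builder /etc/swift/object.builder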

Site failure


At this point we have Swift Global Cluster configured. There are two sites, each acting as a different Swift region. As each node belongs to a different zone and the replication factor has been set to 6, each storage node hosts one replica of each object. Both proxies can be used to read and write data. Such a cluster is highly available and geo-redundant, meaning it can survive the failure of an entire site; however, due to the eventually consistent nature of Swift, some data can be lost during the failure event.
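
As a basic liveness check, each proxy VIP can be probed through Swift's healthcheck middleware (assuming the proxies listen on the charm's default port 8080):

$ curl http://10.0.1.254:8080/healthcheck
OK

$ curl http://10.0.2.254:8080/healthcheck
OK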

Failover


In case of a dc1 failure, the Swift proxy application in dc2 can be used to read and write data in both regions. However, if dc1 cannot be recovered, swift-proxy-dc2 has to be manually transitioned to master so that additional regions can be deployed. In order to transition swift-proxy-dc2 to master, execute the following commands:

$ juju switch juju-dc2

$ juju config swift-proxy-dc2 enable-transition-to-master=True

Note that this should be used with extra caution. After that, additional regions can be deployed based on the instructions from the previous sections. Don't forget to update the affinity settings when deploying additional regions.






How to make the Samsung Xpress C480W scanner work on Ubuntu Bionic

Intro


As I don't have time today, this is going to be one of the shortest posts on this blog. But I really want to save it, as I've already spent a couple of hours trying to figure it out. To make a long story short: I reinstalled my laptop with Ubuntu Bionic and my Samsung Xpress C480W scanner stopped working. Nooooo!

Fixing the scanner


OK, so in order to make the scanner work again, download and install the Samsung Unified Linux Driver:

sudo apt install libusb-0.1-4

cd /tmp

wget https://www.bchemnet.com/suldr/driver/UnifiedLinuxDriver-1.00.39.tar.gz

tar -xzf UnifiedLinuxDriver-1.00.39.tar.gz

cd uld

./install.sh

Easy peasy. So what's wrong? The problem is that the installer places the SANE module under the "/usr/lib/sane" directory, while Ubuntu Bionic expects it under the "/usr/lib/x86_64-linux-gnu/sane" directory! Sigh ... I don't know whether this is a bug or not; I basically didn't have time to check. But the problem can be easily solved by linking the module into the proper location:

sudo ln -sf /usr/lib/sane/libsane-smfp.so* /usr/lib/x86_64-linux-gnu/sane/
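
After linking, you can verify that SANE detects the scanner again (scanimage ships with the sane-utils package):

sudo apt install sane-utils

scanimage -L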

I hope it helps.


How to set up Canonical Identity Service (Candid) in HA mode

Intro


Canonical Identity Service (Candid) does not require any HA-specific extensions or settings and can be deployed in HA mode in various ways. Which one is the best, however? It is hard to say, but the reference architecture is always the most obvious choice. In the following post we will set up Candid in HA mode based on the following tool set:
  • PostgreSQL - reliable and highly available backend
  • Corosync & Pacemaker - messaging and service management
  • HAProxy - load balancing and SSL termination
This architecture is shown in the following diagram:


The following part contains step-by-step instructions to install and configure Candid in HA mode. All services are set up on LXD containers running Ubuntu Bionic. It is assumed that LDAP is used as the identity provider.

Prerequisites



PostgreSQL


On all containers install and stop PostgreSQL:

# apt -y install postgresql

# systemctl stop postgresql

On candid-ha-0 container disable PostgreSQL:

# systemctl disable postgresql

On all containers change ownership and permissions of PostgreSQL certificate and key:

# chown postgres:postgres /var/lib/postgresql/10/main/server.crt

# chown postgres:postgres /var/lib/postgresql/10/main/server.key

# chmod 600 /var/lib/postgresql/10/main/server.key

On all containers configure PostgreSQL:

# sed -i "s^ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'^#ssl_cert_file = '/etc/ssl/certs/ssl-cert-snakeoil.pem'^" /etc/postgresql/10/main/postgresql.conf

# sed -i "s^ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'^#ssl_key_file = '/etc/ssl/private/ssl-cert-snakeoil.key'^" /etc/postgresql/10/main/postgresql.conf

# cat <<EOF >> /etc/postgresql/10/main/postgresql.conf
listen_addresses = '*'
max_connections = 300
wal_level = hot_standby
synchronous_commit = on
archive_mode = off
max_wal_senders = 10
wal_keep_segments = 256
hot_standby = on
restart_after_crash = off
hot_standby_feedback = on
ssl_cert_file = '/var/lib/postgresql/10/main/server.crt'
ssl_key_file = '/var/lib/postgresql/10/main/server.key'
EOF

# cat /etc/postgresql/10/main/pg_hba.conf | head -n -3 > /tmp/pg_hba.conf; mv /tmp/pg_hba.conf /etc/postgresql/10/main/pg_hba.conf

# cat <<EOF >> /etc/postgresql/10/main/pg_hba.conf
host    replication     postgres        10.130.194.10/32        trust
host    replication     postgres        10.130.194.11/32        trust
host    replication     postgres        10.130.194.12/32        trust
host    replication     postgres        10.130.194.253/32       trust
host    replication     postgres        10.130.194.254/32       trust
hostssl candid          candid          10.130.194.10/32        md5
hostssl candid          candid          10.130.194.11/32        md5
hostssl candid          candid          10.130.194.12/32        md5
hostssl candid          candid          10.130.194.253/32       md5
hostssl candid          candid          10.130.194.254/32       md5
EOF

NOTE: Replace IP addresses with IP addresses of all containers and IP addresses reserved for Candid and PostgreSQL VIP.

On candid-ha-1 container start PostgreSQL:

# systemctl start postgresql

On candid-ha-2 container initiate the replication and start PostgreSQL:

# rm -rf /var/lib/postgresql/10/main/

# sudo -u postgres pg_basebackup -h 10.130.194.11 -D /var/lib/postgresql/10/main -v --wal-method=stream

# systemctl start postgresql

NOTE: Replace the IP address with the IP address of the candid-ha-1 container.
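
Before moving on, it is worth confirming that streaming replication is established. On candid-ha-1 (the current master) the standby should appear in pg_stat_replication:

# sudo -u postgres psql -c "SELECT client_addr, state FROM pg_stat_replication;"

The state column should report streaming.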

HAProxy


On all containers install HAProxy:

# apt -y install haproxy

On all containers configure HAProxy:

# cat <<EOF > /etc/haproxy/haproxy.cfg
defaults
    timeout connect 10s
    timeout client 1m
    timeout server 1m

global
    tune.ssl.default-dh-param 2048

frontend candid
    bind *:443 ssl crt /etc/ssl/private/candid.pem
    reqadd X-Forwarded-Proto:\ https
    option http-server-close
    default_backend candid

backend candid
    balance source
    hash-type consistent
    server candid-ha-0 10.130.194.10:8081 check
    server candid-ha-1 10.130.194.11:8081 check
    server candid-ha-2 10.130.194.12:8081 check
EOF

NOTE: candid.pem is a concatenation of candid.crt and candid.key files (cat candid.crt candid.key > candid.pem)

NOTE: Replace IP addresses with IP addresses of containers.

On all containers restart HAProxy:

# systemctl restart haproxy

Corosync & Pacemaker


On all containers install Corosync and Pacemaker:

# apt -y install corosync pacemaker crmsh

On all containers configure Corosync:

# cat <<EOF > /etc/corosync/corosync.conf
totem {
  version: 2
  token: 3000
  token_retransmits_before_loss_const: 10
  join: 60
  consensus: 3600
  vsftype: none
  max_messages: 20
  clear_node_high_bit: yes
  secauth: off
  threads: 0
  ip_version: ipv4
  rrp_mode: none
  transport: udpu
}

quorum {
  provider: corosync_votequorum
  }

nodelist {
  node {
    ring0_addr: 10.130.194.10
    nodeid: 1000
  }
  node {
    ring0_addr: 10.130.194.11
    nodeid: 1001
  }
  node {
    ring0_addr: 10.130.194.12
    nodeid: 1002
  }
}

logging {
  fileline: off
  to_stderr: yes
  to_logfile: no
  to_syslog: yes
  syslog_facility: daemon
  debug: off
  logger_subsys {
    subsys: QUORUM
    debug: off
  }
}
EOF

NOTE: Replace IP addresses with IP addresses of containers.

On all containers restart Corosync:

# systemctl restart corosync
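
Before configuring Pacemaker, check that all three nodes have joined the ring and the cluster is quorate:

# corosync-quorumtool -s

All three node IDs should be listed and the output should report "Quorate: Yes".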

On candid-ha-0 container configure Pacemaker (run the command and replace the data):

# crm configure edit

node 1000: candid-ha-0 \
        attributes pgsql-data-status=DISCONNECT
node 1001: candid-ha-1 \
        attributes pgsql-data-status=LATEST
node 1002: candid-ha-2 \
        attributes pgsql-data-status=DISCONNECT
primitive haproxy lsb:haproxy \
        op monitor interval=15s
primitive pgsql pgsql \
        params rep_mode=sync pgctl="/usr/lib/postgresql/10/bin/pg_ctl" psql="/usr/bin/psql" pgdata="/var/lib/postgresql/10/main/" socketdir="/var/run/postgresql" config="/etc/postgresql/10/main/postgresql.conf" logfile="/var/log/postgresql/postgresql-10-ha.log" master_ip=10.130.194.253 node_list="candid-ha-0 candid-ha-1 candid-ha-2" primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" restart_on_promote=true \
        op start timeout=60s interval=0s on-fail=restart \
        op monitor timeout=60s interval=4s on-fail=restart \
        op monitor timeout=60s interval=3s on-fail=restart role=Master \
        op promote timeout=60s interval=0s on-fail=restart \
        op demote timeout=60s interval=0s on-fail=stop \
        op stop timeout=60s interval=0s on-fail=block \
        op notify timeout=60s interval=0s
primitive res_candid_vip IPaddr2 \
        params ip=10.130.194.254 cidr_netmask=32 \
        op monitor interval=10s \
        meta migration-threshold=0
primitive res_pgsql_vip IPaddr2 \
        params ip=10.130.194.253 cidr_netmask=32 \
        op monitor interval=10s \
        meta migration-threshold=0 target-role=Started
ms ms_pgsql pgsql \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
clone haproxy-clone haproxy
location cli-prefer-res_pgsql_vip res_pgsql_vip role=Started inf: candid-ha-0
order ord_demote 0: ms_pgsql:demote res_pgsql_vip:stop symmetrical=false
order ord_promote inf: ms_pgsql:promote res_pgsql_vip:start symmetrical=false
location pgsql_on_two_nodes ms_pgsql -inf: candid-ha-0
colocation pgsql_vip inf: res_pgsql_vip ms_pgsql:Master
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.18-2b07d5c5a9 \
        cluster-infrastructure=corosync \
        cluster-name=debian \
        stonith-enabled=false \
        last-lrm-refresh=1534598484
rsc_defaults rsc-options: \
        resource-stickiness=INFINITY \
        migration-threshold=10

NOTE: Replace IP addresses with IP addresses reserved for Candid and PostgreSQL VIP.
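
Once the configuration is committed, Pacemaker starts the resources. You can inspect the cluster state with a one-shot monitor run; expect the two VIPs, the haproxy clone and a promoted PostgreSQL master (on candid-ha-1) to be listed:

# crm_mon -1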

Candid


On all containers install Candid:

# apt -y install candid

On candid-ha-1 container create PostgreSQL user and database for Candid:

# su postgres -c "createuser candid -P"

# su postgres -c "createdb candid -O candid"

On all containers export CANDID_URL variable:

# export CANDID_URL="https://candid.example.com"

# echo "export CANDID_URL=\"https://candid.example.com\"" >> /root/.bashrc

On candid-ha-0 container create admin credentials for Candid API:

# candid put-agent --admin --agent-file admin.agent

# candid put-agent --admin --agent-file services.keys

On all containers configure Candid:

# cat <<EOF > /etc/candid/config.yaml
listen-address: 10.130.194.10:8081
location: 'https://candid.example.com'
private-addr: 10.130.194.10
storage:
  type: postgres
  connection-string: dbname=candid user=candid password=candid host=postgres.example.com
private-key: Xh/hbA92cqSAFunu3IgVK0VeZrZvtpR7E50OXR39S48=
public-key: Olf//8WpzSnIFm0HwJX4WCoXlTkw1ndAAvFGP1nj71U=
admin-agent-public-key: JsJOh7kXuONBGvgF2kunmbn+gcg8MpoBfMVMrB8RrTw=
resource-path: /usr/share/candid/
access-log: /var/log/identity/access.log
identity-providers:
  - type: ldap
    name: ldap
    domain: example
    url: ldap://ldap.example.com/dc=example,dc=com
    ca-cert: |
      -----BEGIN CERTIFICATE-----
      MIIEGzCCAwOgAwIBAgIJAI67J2tCUZWMMA0GCSqGSIb3DQEBCwUAMIGjMQswCQYD
      VQQGEwJQTDETMBEGA1UECAwKU29tZS1TdGF0ZTEPMA0GA1UEBwwGS3Jha293MRww
      GgYDVQQKDBNJVHN0ZWVyIFR5dHVzIEt1cmVrMRMwEQYDVQQLDApDb25zdWx0aW5n
      MRgwFgYDVQQDDA93d3cuaXRzdGVlci5jb20xITAfBgkqhkiG9w0BCQEWEm9mZmlj
      ZUBpdHN0ZWVyLmNvbTAeFw0xODAyMDExMTQ1NTZaFw0yODAxMzAxMTQ1NTZaMIGj
      MQswCQYDVQQGEwJQTDETMBEGA1UECAwKU29tZS1TdGF0ZTEPMA0GA1UEBwwGS3Jh
      a293MRwwGgYDVQQKDBNJVHN0ZWVyIFR5dHVzIEt1cmVrMRMwEQYDVQQLDApDb25z
      dWx0aW5nMRgwFgYDVQQDDA93d3cuaXRzdGVlci5jb20xITAfBgkqhkiG9w0BCQEW
      Em9mZmljZUBpdHN0ZWVyLmNvbTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoC
      ggEBAM3cE1zSJgQw3XNzOn0Z7pcwlHg6B2/ubOQ1L6UDmQNFqdz0Zmg5nSTPpeU6
      VlxrUz8YogiISEl549v92TjBSw7SrDTexUNqKNeHdF6wdVQpEsU8hZbndP1sgYH8
      2ONYTKG1sqs03JS8gdbb8ZBJYQGiqT2owOLU43QTlVl1KE5yq5b7PwgUlqCfSMbG
      FUlE51YBcbv0DYDILJ5trbslAT3xXCk9Lbxyi7cW87fB9mfvkmd48jZb1yl2EY1V
      qFiHTrLw0TK+JcI49psxccOy1aXzKJjVbjTt3l/d2mCUIh76S5AlBOiLmn9zOo2G
      GA0LtTm7lgJVD1kahpf5NbNAVTUCAwEAAaNQME4wHQYDVR0OBBYEFN8zFHBCIM4T
      BPEBeRonPyf7h7K4MB8GA1UdIwQYMBaAFN8zFHBCIM4TBPEBeRonPyf7h7K4MAwG
      A1UdEwQFMAMBAf8wDQYJKoZIhvcNAQELBQADggEBAAvpzYUcMT2Z7ohbUpV94ZOz
      8bL919UozNY0V9lrcbnI5v/GlibnNDd/lE7/kBZAdpJMFpzYLxQdBdukXNsQ66fu
      UCj7OVZbnhykN6aiAmB7NHHb4gp6Eu9Aan5Cfky2UE66FmZRMulNMH+l0B64AJ9h
      crRUIGpsK0BrLl6KITE7OB9Qbjm8VSsRBxDy1MrdwGjDyeWCVIU5YRGcs/j5X45k
      OeK8S1KwpuU8/wjkP5lYKUeNRDXIbduWsNAYbLLY8N1wWh+373IuZg3OkfSkIEV7
      ApcRh/uwdJKsx0ebO8aHTDCiBi4AYGDcAumsmpY1CAaWBDzdja77bQocI3qDV60=
      -----END CERTIFICATE-----
    dn: cn=admin,dc=example,dc=com
    password: admin
    user-query-filter: (objectClass=posixAccount)
    user-query-attrs:
      id: uid
      email: mail
      display-name: displayName
    group-query-filter: (&(objectClass=groupOfNames)(member={{.User}}))
EOF

NOTE: Replace the following entries:
  • listen-address - IP address of the container with port
  • private-addr - IP address of the container
  • password - password of PostgreSQL candid user
  • private-key - value of private from services.keys file
  • public-key - value of public from services.keys file
  • admin-agent-public-key - value of public from admin.agent file
  • domain - LDAP domain
  • url - LDAP URL
  • ca-cert - content of the ca.pem file
  • dn - LDAP bind credentials
  • password - LDAP bind password
On all containers restart Candid service:

# systemctl restart candid

At this point you should be able to access Candid at https://candid.example.com
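
A quick probe confirms that HAProxy terminates SSL and forwards requests to a Candid backend (use -k only if the certificate is not trusted by your client):

$ curl -sk -o /dev/null -w '%{http_code}\n' https://candid.example.com/

Any HTTP status code here (rather than a connection error) means the chain of VIP, HAProxy and Candid is up.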

Circular asynchronous MySQL replication between multiple geographically-distributed Percona XtraDB Clusters with Juju

Intro


I have recently shown you how to replicate databases between two Percona XtraDB Clusters using asynchronous MySQL replication with Juju [1]. Today I am going to take you one step further: I will show you how to configure circular asynchronous MySQL replication between geographically-distributed Percona XtraDB Clusters. I will use Juju for this purpose again, as it simplifies not only the deployment but the entire life cycle management. So ... grab a cup of coffee and watch the world around you change!

Design


Let's assume that you have three geographically-distributed sites: dc1, dc2 and dc3, and you want to replicate the example database across the Percona XtraDB Clusters located in each site. We will use Juju for modelling purposes and MaaS [2] as a provider for Juju. Each site has MaaS installed and configured, with nodes enlisted and commissioned. The whole environment is managed from a Juju client which is external to the sites. This is presented in the following diagram:

Percona XtraDB Cluster circular asynchronous MySQL replication

P.S.: If you have more than three sites, don't worry. Circular replication scales out, so you can replicate the database across any number of Percona XtraDB Clusters.

Initial deployment


Let's assume that you already have the Juju client installed, all three MaaS clouds added to the client and Juju controllers bootstrapped in each cloud. If you don't know how to do this, refer to the MaaS documentation [3]. You can list Juju controllers by executing the following command:

$ juju list-controllers

Controller  Model    User   Access     Cloud/Region  Models  Machines  HA    Version
juju-dc1*   default  admin  superuser  maas-dc1      2       1         none  2.3.7
juju-dc2    default  admin  superuser  maas-dc2      2       1         none  2.3.7
juju-dc3    default  admin  superuser  maas-dc3      2       1         none  2.3.7

The asterisk character indicates the current controller in use. You can switch between them by executing the following command:

$ juju switch <controller_name>

In each cloud, on each controller, we create a model and deploy Percona XtraDB Cluster within it. I'm going to use bundles [4] today to make the deployment easier:

$ juju switch juju-dc1

$ juju add-model pxc-rep1

$ cat <<EOF > pxc1.yaml

series: xenial
services:
  pxc1:
    charm: "/tmp/charm-percona-cluster"
    num_units: 3
    options:
      cluster-id: 1
      databases-to-replicate: "example"
      root-password: "root"
      vip: 10.0.1.100
  hacluster-pxc1:
    charm: "cs:hacluster"
    options:
      cluster_count: 3
relations:
  - [ pxc1, hacluster-pxc1 ]
EOF

$ juju deploy pxc1.yaml

$ juju switch juju-dc2

$ juju add-model pxc-rep2

$ cat <<EOF > pxc2.yaml

series: xenial
services:
  pxc2:
    charm: "/tmp/charm-percona-cluster"
    num_units: 3
    options:
      cluster-id: 2
      databases-to-replicate: "example"
      root-password: "root"
      vip: 10.0.2.100
  hacluster-pxc2:
    charm: "cs:hacluster"
    options:
      cluster_count: 3
relations:
  - [ pxc2, hacluster-pxc2 ]
EOF

$ juju deploy pxc2.yaml

$ juju switch juju-dc3

$ juju add-model pxc-rep3

$ cat <<EOF > pxc3.yaml

series: xenial
services:
  pxc3:
    charm: "/tmp/charm-percona-cluster"
    num_units: 3
    options:
      cluster-id: 3
      databases-to-replicate: "example"
      root-password: "root"
      vip: 10.0.3.100
  hacluster-pxc3:
    charm: "cs:hacluster"
    options:
      cluster_count: 3
relations:
  - [ pxc3, hacluster-pxc3 ]
EOF

$ juju deploy pxc3.yaml

Re-fill your cup of coffee and after some time check the Juju status:

$ juju switch juju-dc1

$ juju status
Model     Controller  Cloud/Region  Version  SLA
pxc-rep1  juju-dc1    maas-dc1      2.3.7    unsupported

App             Version       Status  Scale  Charm            Store  Rev  OS      Notes
hacluster-pxc1                active      3  hacluster        local    0  ubuntu  
pxc1            5.6.37-26.21  active      3  percona-cluster  local   45  ubuntu  

Unit                 Workload  Agent  Machine  Public address  Ports     Message
pxc1/0*              active    idle   0        10.0.1.1        3306/tcp  Unit is ready
  hacluster-pxc1/0*  active    idle            10.0.1.1                  Unit is ready and clustered
pxc1/1               active    idle   1        10.0.1.2        3306/tcp  Unit is ready
  hacluster-pxc1/1   active    idle            10.0.1.2                  Unit is ready and clustered
pxc1/2               active    idle   2        10.0.1.3        3306/tcp  Unit is ready
  hacluster-pxc1/2   active    idle            10.0.1.3                  Unit is ready and clustered

Machine  State    DNS       Inst id        Series  AZ  Message
0        started  10.0.1.1  juju-83da9e-0  xenial      Running
1        started  10.0.1.2  juju-83da9e-1  xenial      Running
2        started  10.0.1.3  juju-83da9e-2  xenial      Running

Relation provider      Requirer               Interface        Type         Message
hacluster-pxc1:ha      pxc1:ha                hacluster        subordinate  
hacluster-pxc1:hanode  hacluster-pxc1:hanode  hacluster        peer         
pxc1:cluster           pxc1:cluster           percona-cluster  peer

If all units have turned to the active state, you're ready to go. Remember to check the status in all models.
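
Rather than switching controllers one by one, you can also query each model directly, as juju status accepts a controller:model target:

$ for i in 1 2 3; do juju status -m juju-dc$i:pxc-rep$i | head -n 5; done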

Setting up circular asynchronous MySQL replication


In order to set up circular asynchronous MySQL replication between all three Percona XtraDB Clusters, we have to relate them. However, as they don't belong to the same model / controller / cloud, we have to create offers [5] first (offers enable cross-model / cross-controller / cross-cloud relations):

$ juju switch juju-dc1

$ juju offer pxc1:slave

$ juju switch juju-dc2

$ juju offer pxc2:slave


$ juju switch juju-dc3

$ juju offer pxc3:slave

Then we have to consume the offers:

$ juju switch juju-dc1

$ juju consume juju-dc2:admin/pxc-rep2.pxc2 pxc2


$ juju switch juju-dc2

$ juju consume juju-dc3:admin/pxc-rep3.pxc3 pxc3


$ juju switch juju-dc3

$ juju consume juju-dc1:admin/pxc-rep1.pxc1 pxc1

Finally, we can add the cross-cloud relations:

$ juju switch juju-dc1

$ juju relate pxc1:master pxc2

$ juju switch juju-dc2

$ juju relate pxc2:master pxc3

$ juju switch juju-dc3

$ juju relate pxc3:master pxc1

Wait a couple of minutes and check whether all units have turned to the active state:

$ juju switch juju-dc1

$ juju status
Model     Controller  Cloud/Region  Version  SLA
pxc-rep1  juju-dc1    maas-dc1      2.3.7    unsupported

SAAS  Status  Store      URL
pxc2  active  maas-dc2  admin/pxc-rep2.pxc2

App             Version       Status  Scale  Charm            Store  Rev  OS      Notes
hacluster-pxc1                active      3  hacluster        local    0  ubuntu  
pxc1            5.6.37-26.21  active      3  percona-cluster  local   45  ubuntu  

Unit                 Workload  Agent  Machine  Public address  Ports     Message
pxc1/0*              active    idle   0        10.0.1.1        3306/tcp  Unit is ready
  hacluster-pxc1/0*  active    idle            10.0.1.1                  Unit is ready and clustered
pxc1/1               active    idle   1        10.0.1.2        3306/tcp  Unit is ready
  hacluster-pxc1/1   active    idle            10.0.1.2                  Unit is ready and clustered
pxc1/2               active    idle   2        10.0.1.3        3306/tcp  Unit is ready
  hacluster-pxc1/2   active    idle            10.0.1.3                  Unit is ready and clustered

Machine  State    DNS       Inst id        Series  AZ  Message
0        started  10.0.1.1  juju-83da9e-0  xenial      Running
1        started  10.0.1.2  juju-83da9e-1  xenial      Running
2        started  10.0.1.3  juju-83da9e-2  xenial      Running

Offer  Application  Charm            Rev  Connected  Endpoint  Interface                Role
pxc1   pxc1         percona-cluster  48   1/1        slave     mysql-async-replication  requirer

Relation provider      Requirer               Interface        Type         Message
hacluster-pxc1:ha      pxc1:ha                hacluster        subordinate  
hacluster-pxc1:hanode  hacluster-pxc1:hanode  hacluster        peer         
pxc1:cluster           pxc1:cluster           percona-cluster  peer
pxc1:master            pxc2:slave             mysql-async-replication  regular

At this point you should have circular asynchronous MySQL replication working between all three Percona XtraDB Clusters. The asterisk character in the output above indicates the leader unit. Let's check whether it's actually working by connecting to the MySQL console on the leader unit:

$ juju ssh pxc1/0

$ mysql -u root -p

First, check whether, as a master, it has granted access to the pxc2 application units:

mysql> SELECT Host FROM mysql.user WHERE User='replication';
+----------+
| Host     |
+----------+
| 10.0.2.1 |
| 10.0.2.2 |
| 10.0.2.3 |
+----------+
3 rows in set (0.00 sec)

Then check its slave status as a slave of pxc3:

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.3.100
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 338
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 338
              Relay_Log_Space: 457
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: e803b085-739f-11e8-8f7e-00163e391eab
             Master_Info_File: /var/lib/percona-xtradb-cluster/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
1 row in set (0.00 sec)

Finally, create a database:

mysql> CREATE DATABASE example;
Query OK, 1 row affected (0.01 sec)

and check whether it has been created on pxc2:

$ juju switch juju-dc2

$ juju ssh pxc2/0

$ mysql -u root -p

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| example            |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)

It is there! It should be created on pxc3 as well. Go there and check it.

At this point you can write to the example database from all units of all Percona XtraDB Clusters. This is how circular asynchronous MySQL replication works. Isn't that easy? Of course it is - thanks to Juju!
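
As a final check, you can write from a different cluster and watch the change propagate around the ring. For example, on a pxc2 unit (the table and values here are just an illustration):

mysql> CREATE TABLE example.test (id INT PRIMARY KEY, msg VARCHAR(32));

mysql> INSERT INTO example.test VALUES (1, 'written in dc2');

A SELECT * FROM example.test; on pxc1 and pxc3 should return the same row after a moment.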

References