Features/COLO/Managed HOWTO
Lukas Straub
Revision as of 18:32, 6 June 2020
Overview
This is a step-by-step guide to installing qemu-colo, configuring a pacemaker cluster, and configuring and running a qemu-colo cluster resource. The pacemaker setup shown here is minimal and should not be used in production. For more information about pacemaker configuration, see the pacemaker (https://clusterlabs.org/pacemaker/doc/) and corosync (https://manpages.debian.org/buster/corosync/corosync.conf.5.en.html) documentation.
This guide assumes two cluster nodes with the following IP addresses:
test-cluster-01 192.168.220.244
test-cluster-02 192.168.220.245
Setup
$ = run as normal user
# = run as root
On every node do the following:
Install Debian buster amd64:
https://www.debian.org/distrib/
Install packages:
# apt-get -y install git build-essential wget nano bridge-utils corosync pacemaker crmsh python3 pkg-config libglib2.0-dev libpixman-1-dev
Workaround for a bug (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=960271):
# wget https://snapshot.debian.org/archive/debian/20200129T091834Z/pool/main/l/linux/linux-libc-dev_4.19.98-1_amd64.deb
# dpkg -i linux-libc-dev_4.19.98-1_amd64.deb
Install qemu:
$ git clone --single-branch --depth 1 -b new_build https://github.com/Lukey3332/qemu.git
$ cd qemu
$ ./configure --target-list=x86_64-softmmu,i386-softmmu --enable-replication --enable-colo-ra --enable-kvm --prefix=/usr
$ make -j4; make
# make install
Configure networking:
Replace /etc/network/interfaces with the following. Adjust eth0 and the IP address as needed for the node.
auto lo
iface lo inet loopback
iface eth0 inet manual
auto br0
iface br0 inet static
    mtu 1500
    bridge_ports eth0
    address 192.168.220.244
    netmask 255.255.255.0
    gateway 192.168.220.1
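Only the interface name and the address change from node to node, so the file can be generated instead of edited by hand. The following is a minimal sketch and not part of the original guide: `render_interfaces` is a hypothetical helper, and the netmask is fixed to the value used above.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/network/interfaces file for one node.
#   $1 = physical interface, $2 = node address, $3 = gateway
render_interfaces() {
    cat <<EOF
auto lo
iface lo inet loopback
iface $1 inet manual
auto br0
iface br0 inet static
    mtu 1500
    bridge_ports $1
    address $2
    netmask 255.255.255.0
    gateway $3
EOF
}

# Example for the second node; review the output before installing it.
render_interfaces eth0 192.168.220.245 192.168.220.1
```

Redirecting the output to a scratch file first and comparing it with the listing above is safer than writing straight to /etc/network/interfaces.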
Configure your DNS server in /etc/resolv.conf:
nameserver 192.168.220.1
Apply changes:
# ifdown eth0
# ifup br0
Configure local dns:
Replace /etc/hosts on test-cluster-01 with the following:
127.0.0.1 localhost
127.0.1.1 test-cluster-01.home.intra test-cluster-01
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.220.245 test-cluster-02.home.intra test-cluster-02
Replace /etc/hosts on test-cluster-02 with the following:
127.0.0.1 localhost
127.0.1.1 test-cluster-02.home.intra test-cluster-02
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.220.244 test-cluster-01.home.intra test-cluster-01
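The two files differ only in which node is local (the 127.0.1.1 line) and which is the peer (the last line). As an illustration, the sketch below prints either file from three arguments. `render_hosts` is a hypothetical helper, not part of the original guide, and it assumes the home.intra domain used above.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/hosts for one node.
#   $1 = this node's short name, $2 = peer's short name, $3 = peer's address
render_hosts() {
    cat <<EOF
127.0.0.1 localhost
127.0.1.1 $1.home.intra $1
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
$3 $2.home.intra $2
EOF
}

# File for test-cluster-01, whose peer is test-cluster-02.
render_hosts test-cluster-01 test-cluster-02 192.168.220.245
```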
Configure corosync:
Replace /etc/corosync/corosync.conf with the following:
# Please read the corosync.conf.5 manual page
totem {
    version: 2
    cluster_name: test-cluster
}
logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to yes. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: yes
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: yes
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to hires (or on)
    #timestamp: hires
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    provider: corosync_votequorum
    two_node: 1
}
nodelist {
    node {
        # Hostname of the node
        name: test-cluster-01
        # Cluster membership node identifier
        nodeid: 1
        ring0_addr: 192.168.220.244
    }
    node {
        # Hostname of the node
        name: test-cluster-02
        # Cluster membership node identifier
        nodeid: 2
        ring0_addr: 192.168.220.245
    }
}
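The names and addresses in the nodelist have to agree with /etc/hosts on both nodes. One way to compare them is to print the name and address pairs from the file. The sketch below is an assumption-laden illustration, not a corosync tool: `list_nodes` is a hypothetical helper and relies on each node block setting `name:` before `ring0_addr:`, as in the file above.

```shell
#!/bin/sh
# Hypothetical helper: print "name address" for every node block in a
# corosync.conf. Relies on "name:" appearing before "ring0_addr:".
list_nodes() {
    awk '$1 == "name:"       { name = $2 }
         $1 == "ring0_addr:" { print name, $2 }' "$1"
}

# Run against the live file only if it is readable on this machine.
conf=/etc/corosync/corosync.conf
if [ -r "$conf" ]; then
    list_nodes "$conf"
fi
```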
Apply changes:
# systemctl enable corosync
# systemctl restart corosync
# systemctl enable pacemaker
# systemctl restart pacemaker
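The services may take a moment to come up after a restart. A small retry loop can confirm that both are active before continuing. This is a sketch only: `wait_for` and the `WAIT_DELAY` variable are hypothetical names introduced here, not part of pacemaker or corosync.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
#   $1 = maximum attempts, remaining arguments = command to run
# WAIT_DELAY (seconds between attempts) is a hypothetical knob, default 1.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${WAIT_DELAY:-1}"
    done
    return 1
}

# Usage on a cluster node, as root (commented out so the sketch is safe to paste):
# wait_for 30 systemctl is-active --quiet corosync  || echo "corosync did not start"
# wait_for 30 systemctl is-active --quiet pacemaker || echo "pacemaker did not start"
```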
Configure a qemu-colo cluster resource
Create images on all nodes:
# qemu-img create -f qcow2 /mnt/vms/vma.qcow2 10g
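The command above assumes that /mnt/vms already exists, and running it again would replace an image that is already there. A guarded variant might look like the following sketch, where `create_image` and the `QEMU_IMG` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Create the image only when it is missing, so re-running the step is harmless.
#   $1 = image path, $2 = size
# QEMU_IMG is a hypothetical override for the qemu-img binary.
create_image() {
    if [ -e "$1" ]; then
        echo "exists: $1"
        return 0
    fi
    mkdir -p "$(dirname "$1")"
    "${QEMU_IMG:-qemu-img}" create -f qcow2 "$1" "$2"
}

# Usage on each node, as root:
# create_image /mnt/vms/vma.qcow2 10g
```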
Show the resource agent's user guide for an explanation of the parameters and more:
# crm ra info ocf:qemu:colo
Configure the resource (on one node only):
# crm configure primitive vma ocf:qemu:colo \
    meta target-role=Stopped \
    params active_hidden_dir="/mnt/vms" \
    options="-vnc :0 -enable-kvm -cpu qemu64,+kvmclock -m 512 -netdev bridge,br=br0,id=hn0 -device e1000,netdev=hn0 -device virtio-blk,drive=colo-disk0 -drive if=none,node-name=parent0,format=qcow2,file=/mnt/vms/vma.qcow2" \
    op start timeout=30s interval=0 \
    op stop timeout=10s interval=0 \
    op monitor role=Master interval=1000ms timeout=30s \
    op monitor role=Slave interval=1001ms timeout=30s \
    op notify timeout=30s interval=0 \
    op promote timeout=30s interval=0 \
    op demote timeout=120s interval=0
# crm configure clone vma_ms vma \
    meta promotable=true clone-max=2 promoted-max=1 notify=true target-role=Started
# crm_master -r vma -v 10
Show cluster status:
# crm_mon
The resource should be 'Master' on one node and 'Slave' on the other.
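For a scripted check, the one-shot output of `crm_mon -1` can be filtered for both roles. The sketch below rests on an assumption: that the output lists the promotable clone with `Masters: [ ... ]` and `Slaves: [ ... ]` lines, as Pacemaker 2.0 on Debian buster does. Other versions may label the roles differently. `check_roles` is a hypothetical helper.

```shell
#!/bin/sh
# Read `crm_mon -1` output on stdin; succeed only if both a "Masters:" and a
# "Slaves:" line with at least one node are present.
# Assumption: Pacemaker 2.0-style output; other versions may differ.
check_roles() {
    awk '/Masters: \[ .* \]/ { m = 1 }
         /Slaves: \[ .* \]/  { s = 1 }
         END { exit !(m && s) }'
}

# Usage on a cluster node, as root:
# crm_mon -1 | check_roles && echo "vma is running on both nodes"
```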
For detailed error messages and resync status, look at the system log:
# journalctl -f