In this article we assume that you are familiar with Salt, formulas, reclass and OpenStack-Salt. This is my cheatsheet that walks you step by step through deploying OpenStack Mitaka based on the latest model, using a new cluster class. If you feel lost, feel free to go back to the beginning.
What’s new
The mk-lab-salt-model, built for training and development, introduces the following new features
- Salt Master now runs on Ubuntu 16.04 LTS
- OpenStack packages come from Mirantis repositories
- New top level cluster class allows a simple modeling of multiple deployments in a single reclass model.
- Mirantis StackLight, the logging, monitoring and alerting tooling, is now integrated
Modeling your Infrastructure
In this article, I'll be using my forked repository of the Mirantis mk-lab-salt-model for infrastructure modeling.
Fork and Clone upstream Mirantis model
To fork the Mirantis repository, log into your GitHub account, go to the Mirantis repository and click on Fork.
Once it's done, you can clone your fork
# git clone git@github.com:planetrobbie/mk-lab-salt-model.git /srv/salt/reclass -b master
Enter passphrase for key '/root/.ssh/id_rsa':
remote: Counting objects: 1528, done.
remote: Total 1528 (delta 0), reused 0 (delta 0), pack-reused 1528
Receiving objects: 100% (1528/1528), 162.76 KiB | 210.00 KiB/s, done.
Resolving deltas: 100% (704/704), done.
Checking connectivity... done.
Note: avoid HTTPS while cloning and prefer SSH; it will let you push your updates using just your SSH key instead of typing your password at each push to origin!
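If you have already cloned over HTTPS, you don't need to start over; just point origin at the SSH URL of your fork (using my fork's URL from above as the example) and check the result:
# cd /srv/salt/reclass
# git remote set-url origin git@github.com:planetrobbie/mk-lab-salt-model.git
# git remote -v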
Tweak your model to your needs
First of all, each of your minions should have an ID that corresponds to a node described in the reclass model; you can verify this as shown right after the list below. Currently the nodes should have the following IDs:
- ctl01.mk22-lab-advanced.local
- ctl02.mk22-lab-advanced.local
- ctl03.mk22-lab-advanced.local
- cmp01.mk22-lab-advanced.local
- mon01.mk22-lab-advanced.local
- prx01.mk22-lab-advanced.local
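As a quick sanity check (a generic Salt sketch, nothing model-specific), the ID of a minion lives in /etc/salt/minion_id on each node, and the IDs registered on the master can be listed with salt-key:
# salt-key -L
# salt '*' cmd.run "cat /etc/salt/minion_id"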
We've introduced the concept of a cluster class to give you an easy way to describe multiple deployments in the same model.
Each deployment is now defined in its own cluster directory classes/cluster/<deployment_name>. Our current model offers the following ones:
- mk20_lab_basic
- mk20_lab_advanced
- mk20_lab_expert
- mk20_stacklight_basic
- mk20_stacklight_advanced
- mk22_lab_basic
- mk22_lab_advanced
- mk22_scale_mirantis
- mk22_full_scale
Let's review the mk22_lab_advanced cluster class, which describes the following nodes
- 1 config node - Salt Master Node
- 3 control nodes - OpenStack/OpenContrail control plane
- 1 compute node - KVM node
- 3 monitoring nodes - StackLight backend
- 1 proxy node - frontend to the web UIs
The cluster class structure of our mk22_lab_advanced deployment looks like this
├── fuel
│ ├── config.yml
│ └── init.yml
├── init.yml
├── openstack
│ ├── compute.yml
│ ├── control.yml
│ ├── dashboard.yml
│ ├── init.yml
│ └── proxy.yml
└── stacklight
├── client.yml
├── init.yml
├── proxy.yml
└── server.yml
path | details |
---|---|
fuel/config.yml | Salt Master parameters: git repository, git branch, IPs, interfaces, Galera nodes declaration |
fuel/init.yml | Declaration of cfg01 host |
init.yml | cluster domain, compute node declaration, IPs |
openstack/init.yml | OpenStack and OpenContrail version, passwords, Controllers IPs, VIP, nodes declaration |
openstack/compute.yml | compute nodes params: OpenContrail GW, Data plane interface (vhost0) |
openstack/control.yml | Keepalived interface |
openstack/dashboard.yml | repository |
openstack/proxy.yml | declaration for nginx ssl endpoint to proxy access to API and UI |
stacklight/init.yml | passwords, monitoring node declaration |
stacklight/client.yml | classes for collectd, heka, sensu nodes |
stacklight/proxy.yml | declaration for nginx ssl endpoint to proxy access to Kibana and Grafana |
stacklight/server.yml | Kibana host |
Apart from the cluster classes, you can also look at the following files
path | details |
---|---|
classes/system/openssh/server/single.yml | declare additional users with their SSH keys |
nodes/control/cfg01.mk22-lab-advanced.local.yml | model repository branch; can also override the timezone |
Tweak all the necessary parameters listed above, depending on your infrastructure requirements.
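If you are not sure where a given value lives, a simple grep over the cluster class usually finds it; for example, to locate every occurrence of the lab addressing used throughout this article:
# cd /srv/salt/reclass
# grep -rn "172.16.10" classes/cluster/mk22_lab_advanced/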
Check if it looks good
After adapting your model as described above, check that everything looks good
# reclass-salt --top
Verify that your minions are responding and are running the same version as your Salt Master
# salt '*' test.version
Note: As of today, December 2017, it should be 2016.3.3 (Boron).
If versions aren't in sync, refer to the official installation guide or just reinstall salt-minion from the official SaltStack repository on the target nodes as follows:
# echo "deb http://repo.saltstack.com/apt/ubuntu/14.04/amd64/2016.3 trusty main" > /etc/apt/sources.list.d/saltstack.list
# wget -O - https://repo.saltstack.com/apt/ubuntu/14.04/amd64/2016.3/SALTSTACK-GPG-KEY.pub | apt-key add -
# apt-get clean
# apt-get update
# apt-get install -y salt-minion
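To quickly spot which minions are out of sync, you can target on the saltversion grain with a compound matcher (adjust the version string to whatever your Salt Master reports), then run the reinstall commands above on the nodes it returns:
# salt -C 'not G@saltversion:2016.3.3' test.ping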
Commit your changes
Before you can apply the model to any nodes, you need to commit the changes to your repository
# cd /srv/salt/reclass
# git add -A .
# git commit -m "model updated to lab requirements"
# git push origin
Run the reclass.storage state to generate all the nodes within /srv/salt/reclass/nodes/_generated
salt 'cfg01*' state.sls salt.master,reclass
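You can confirm that the node definitions were generated:
# ls /srv/salt/reclass/nodes/_generated/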
Refresh the minions' pillar data
# salt '*' saltutil.refresh_pillar
ctl03.mk22-lab-advanced.local:
True
ctl01.mk22-lab-advanced.local:
True
ctl02.mk22-lab-advanced.local:
True
cfg01.mk22-lab-advanced.local:
True
Sync all Salt resources
# salt '*' saltutil.sync_all
Salt Master
To install your Salt Master, follow the instructions given in our previous article; the only step that changes is the one above where I use another model instead of the workshop-salt-model. I'm copying the command below for reference
git clone git@github.com:planetrobbie/mk-lab-salt-model.git /srv/salt/reclass -b master
Formulas
In this development lab, we will be cloning formulas from their respective source repositories using a script that you can find in my repository.
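I won't reproduce the script here, but as a rough sketch of what it does for a single formula (the repository URL, destination paths and linking convention below are assumptions based on the usual salt-formulas development layout; the script itself is authoritative):
# FORMULA=nova
# git clone https://github.com/salt-formulas/salt-formula-${FORMULA}.git /usr/share/salt-formulas/env/_formulas/${FORMULA}
# ln -s /usr/share/salt-formulas/env/_formulas/${FORMULA}/${FORMULA} /usr/share/salt-formulas/env/${FORMULA}
# ln -s /usr/share/salt-formulas/env/_formulas/${FORMULA}/metadata/service /srv/salt/reclass/classes/service/${FORMULA}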
Salt Master provisioning cheatsheet
To provision your Salt Master run
salt "cfg01*" state.sls salt.master
salt "cfg01*" state.sls linux,openssh,salt.minion,ntp
Everything should be green; re-run the states if that's not the case.
Deploy Common environment and Support Services
Controllers > linux | openssh | salt.minion | ntp
Let’s run the first batch of states
salt "ctl*" state.sls linux,openssh,salt.minion,ntp
Controllers > keepalived
Provision Keepalived, a daemon that provides a cluster VIP based on VRRP. Here we are using a [Compound matcher](https://docs.saltstack.com/en/latest/topics/targeting/compound.html) which matches on pillar data; the state will then be applied on all corresponding nodes.
# salt -C 'I@keepalived:cluster' state.sls keepalived -b 1
-b 1 defines a batch size of 1: instead of executing on all targeted minions at once, Salt executes on a progressive set of minions.
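Compound matchers can also combine several criteria; purely as an illustration, this pings only the controllers that carry the keepalived pillar:
# salt -C 'I@keepalived:cluster and ctl*' test.ping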
Check Keepalived
# salt -C 'I@keepalived:cluster' cmd.run "ip a | grep '\/32'"
ctl03.mk22-lab-advanced.local:
inet 172.16.10.254/32 scope global eth0
As you can see above, our 172.16.10.254 VIP is now present on ctl03.
Controllers > Gluster
Set up the Gluster service
# salt -C 'I@glusterfs:server' state.sls glusterfs.server.service
Now prepare the GlusterFS volumes
# salt -C 'I@glusterfs:server' state.sls glusterfs.server.setup -b 1
Check Gluster
# salt -C 'I@glusterfs:server' cmd.run "gluster peer status; gluster volume status" -b 1
If anything goes wrong, start over by deleting the affected volume, then re-create it manually and start it.
ctl01# gluster volume delete glance
ctl01# gluster volume create keystone-keys replica 3 172.16.10.101:/srv/glusterfs/keystone-keys 172.16.10.102:/srv/glusterfs/keystone-keys 172.16.10.103:/srv/glusterfs/keystone-keys force
ctl01# gluster volume start glance force
# salt-call state.sls glusterfs.server.setup
Controllers > RabbitMQ
Let’s now install and configure RabbitMQ on our cluster
# salt -C 'I@rabbitmq:server' state.sls rabbitmq
Check RabbitMQ
# salt -C 'I@rabbitmq:server' cmd.run "rabbitmqctl cluster_status"
Controllers > MySQL Galera
Let's now deploy our database cluster, starting with the master
# salt -C 'I@galera:master' state.sls galera
Once the previous command terminates, deploy the remaining Galera nodes
# salt -C 'I@galera:slave' state.sls galera
The Galera state also creates the databases and users for the OpenStack services.
Note: After the first failed run on the slaves, I had to apply the following patches to the Galera formula
--- a/galera/files/init_bootstrap.sh
+++ b/galera/files/init_bootstrap.sh
@@ -6,7 +6,7 @@ counter=60
while [ $counter -gt 0 ]
do
- mysql -u root -e"quit"
+ mysql -u root -e"quit" -pworkshop
if [[ $? -eq 0 ]]; then
exit 0
fi
diff --git a/galera/slave.sls b/galera/slave.sls
index 5a58186..d813370 100644
--- a/galera/slave.sls
+++ b/galera/slave.sls
@@ -91,7 +91,8 @@ galera_init_start_service:
galera_bootstrap_set_root_password:
cmd.run:
- - name: mysqladmin password "{{ slave.admin.password }}"
+ - name: echo "patched - can't set root password two times"
+# - name: mysqladmin password "{{ slave.admin.password }}"
- require:
- cmd: galera_init_start_service
Check Galera
# salt -C 'I@galera:master' mysql.status | grep -A1 wsrep_cluster_size
# salt -C 'I@galera:slave' mysql.status | grep -A1 wsrep_cluster_size
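On a healthy three-controller cluster you should see a wsrep_cluster_size of 3; if you want a slightly wider check (same mysql.status module, just more wsrep fields):
# salt -C 'I@galera:master' mysql.status | grep -E -A1 "wsrep_(cluster_size|ready|connected)"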
Controllers > HAProxy
# salt -C 'I@haproxy:proxy' state.sls haproxy
Check HAProxy
# salt -C 'I@haproxy:proxy' service.status haproxy
Now restart rsyslog
# salt -I 'haproxy:proxy' service.restart rsyslog
Controllers > memcached
# salt -C 'I@memcached:server' state.sls memcached
OpenStack Control Services
Controllers > Keystone | Glance
Provision Keystone
# salt -C 'I@keystone:server' state.sls keystone.server -b 1
Populate keystone services/tenants/admins
# salt -C 'I@keystone:client' state.sls keystone.client
Check Keystone
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; keystone service-list"
Continue with the Glance state
# salt -C 'I@glance:server' state.sls glance -b 1
Run the GlusterFS client state
# salt -C 'I@glance:server' state.sls glusterfs.client
Re-run the Keystone state to re-create the fernet keys within the Gluster-mounted filesystem
# salt -C 'I@keystone:server' state.sls keystone.server
Check Glance
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; glance image-list"
Controllers > Nova
Install Nova. In this deployment we will use the LVM backend for Cinder.
# salt -C 'I@nova:controller' state.sls nova -b 1
Check Nova
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; nova service-list"
Controllers > Cinder
# salt -C 'I@cinder:controller' state.sls cinder -b 1
Check Cinder
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; cinder list"
Controllers > Neutron
# salt -C 'I@neutron:server' state.sls neutron -b 1
Check Neutron
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; neutron agent-list"
Getting errors here is totally normal: OpenContrail is not there yet!
Controllers > Heat
# salt -C 'I@heat:server' state.sls heat -b 1
Check Heat
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; heat resource-type-list"
Horizon | Nginx
For the Horizon deployment you need to bring your prx01 proxy node to the baseline
salt "prx01*" state.sls linux,openssh,salt.minion,ntp
You can then deploy Horizon and Nginx
# salt -C 'I@horizon:server' state.sls horizon
Note: This model uses a custom theme stored on prx01 at /usr/share/openstack-dashboard-mirantis-theme/static/mirantis
In this lab we will be using our Salt Master as the reverse proxy to access our web UIs, so deploy Nginx on it.
# salt -C 'I@nginx:server' state.sls nginx
Check Horizon
Horizon should be available at http://172.16.10.121:8078; the default login/password is admin/workshop, but before you can really use it you have to finish the deployment.
OpenContrail
Install the OpenContrail database services
# salt -C 'I@opencontrail:database' state.sls opencontrail.database -b 1
Install the OpenContrail control services
# salt -C 'I@opencontrail:control' state.sls opencontrail -b 1
Contrail Post installation
Provision the OpenContrail control services
# salt -C 'I@opencontrail:control:id:1' cmd.run "/usr/share/contrail-utils/provision_control.py --api_server_ip 172.16.10.254 --api_server_port 8082 --host_name ctl01 --host_ip 172.16.10.101 --router_asn 64512 --admin_password workshop --admin_user admin --admin_tenant_name admin --oper add"
# salt -C 'I@opencontrail:control:id:1' cmd.run "/usr/share/contrail-utils/provision_control.py --api_server_ip 172.16.10.254 --api_server_port 8082 --host_name ctl02 --host_ip 172.16.10.102 --router_asn 64512 --admin_password workshop --admin_user admin --admin_tenant_name admin --oper add"
# salt -C 'I@opencontrail:control:id:1' cmd.run "/usr/share/contrail-utils/provision_control.py --api_server_ip 172.16.10.254 --api_server_port 8082 --host_name ctl03 --host_ip 172.16.10.103 --router_asn 64512 --admin_password workshop --admin_user admin --admin_tenant_name admin --oper add"
Check Contrail
# salt -C 'I@opencontrail:control' cmd.run "contrail-status"
# salt -C 'I@keystone:server' cmd.run ". /root/keystonerc; neutron net-list; nova net-list"
Access the OpenContrail web UI at https://172.16.10.254:8143/ using the admin/workshop login/password.
You should be able to create networks, subnets and routers; consult the OpenStack documentation for the corresponding workflows.
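As a quick smoke test from a controller (network and subnet names below are only illustrative), you can create a network through the Neutron API; it should also show up in the OpenContrail UI:
# salt 'ctl01*' cmd.run ". /root/keystonerc; neutron net-create test-net; neutron subnet-create test-net 192.168.100.0/24 --name test-subnet"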
StackLight
StackLight is Mirantis' solution to monitor your private cloud. Let's bring your monitoring nodes to the baseline. This model deploys StackLight on three Ubuntu 16.04 nodes in high availability mode.
# salt "mon*" state.sls linux,openssh,salt.minion,ntp
Now, to deploy the StackLight monitoring backends on these nodes, run the following states:
# salt -C 'I@elasticsearch:server' state.sls elasticsearch.server -b 1
# salt -C 'I@influxdb:server' state.sls influxdb -b 1
# salt -C 'I@kibana:server' state.sls kibana.server -b 1
# salt -C 'I@grafana:server' state.sls grafana.server -b 1
# salt -C 'I@nagios:server' state.sls nagios -b 1
# salt -C 'I@elasticsearch:client' state.sls elasticsearch.client.service
# salt -C 'I@kibana:client' state.sls kibana.client.service
# salt -C 'I@kibana:client or I@elasticsearch:client' --async service.restart salt-minion
Wait and continue with
# salt -C 'I@elasticsearch:client' state.sls elasticsearch.client
# salt -C 'I@kibana:client' state.sls kibana.client
If the command above fails when trying to bind to the StackLight VIP (172.16.10.253), check on which interface the VIP is configured in the following file:
# vi /srv/salt/reclass/classes/cluster/mk22_lab_advanced/stacklight/server.yml
And run the following state to update the VIP.
# salt -C 'I@keepalived:cluster' state.sls keepalived -b 1
If it still fails to connect, also run the following state
# salt -C 'I@haproxy:proxy' state.sls haproxy
Now connect to the web UIs at
- Kibana / https://172.16.10.100:5601 or http://172.16.10.253:5601/
- Grafana / https://172.16.10.100:8084/login or http://172.16.10.253:3000/login l: admin p: password
- Nagios / https://172.16.10.100:8001 or http://172.16.10.253 l: nagiosadmin p:nagios
All of the above should be mostly empty; that's normal, we aren't done yet.
In case of issues while trying to connect to them run
# salt -C 'I@nginx:server' state.sls nginx
Compute node
To provision your compute nodes, run the following (the state is applied twice on purpose)
# salt 'cmp*' state.apply
# salt 'cmp*' state.apply
Note: I had to patch the heka formula by replacing
swap_size: {{ salt['ps.swap_memory']()['total'] }}
by
swap_size: 8192
In our workshop lab we have a single NIC, so the Salt state cannot be used to configure the network, or it would cut the connection. For a production deployment it is much better to have at least two network cards, one for the data plane and one for management: ideally a bond of two for production traffic and a 1G interface for out-of-band management (PXE and Salt).
In our case you can configure the compute node networking as shown below
cmp01# vi /etc/network/interfaces
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet manual
pre-up ifconfig eth0 up
post-down ifconfig eth0 down
auto vhost0
iface vhost0 inet static
pre-up /usr/lib/contrail/if-vhost0
address 172.16.10.105
network_name application
netmask 255.255.255.0
gateway 172.16.10.1
dns-nameservers <YOUR DNS SERVER>
Now reboot your compute node
salt "cmp*" system.reboot
Check that your IP is now bound to vhost0, which is required for Contrail. Also check the Contrail status
cmp01# contrail-status
<snip>
To register your vRouter, run
# salt -C 'I@opencontrail:control:id:1' cmd.run "/usr/share/contrail-utils/provision_vrouter.py --host_name cmp01 --host_ip 172.16.10.105 --api_server_ip 172.16.10.254 --oper add --admin_user admin --admin_password workshop --admin_tenant_name admin"
You can proceed the same way to deploy your second compute node.
Check Compute node
From one of your controllers, check that your compute node is present
root@ctl01:~# source /root/keystonerc
root@ctl01:~# nova hypervisor-list
+----+-------------------------------+-------+---------+
| ID | Hypervisor hostname | State | Status |
+----+-------------------------------+-------+---------+
| 3 | cmp01.mk22-lab-advanced.local | up | enabled |
+----+-------------------------------+-------+---------+
Monitoring
Let's finish our deployment by deploying the remaining monitoring components.
Start by flushing the Salt Mine to make sure it is clean; the Mine holds data about your minions, available from the master for other minions to consume.
# salt "*" mine.flush
Install StackLight services, and gather the Collectd and Heka metadata
# salt "*" state.sls collectd
# salt "*" state.sls heka
Gather the Grafana metadata as grains
# salt -C 'I@grafana:collector' state.sls grafana.collector
Update Salt Mine
# salt "*" state.sls salt.minion.grains
# salt "*" saltutil.refresh_modules
# salt "*" mine.update
Update Heka
# salt -C 'I@heka:aggregator:enabled:True or I@heka:remote_collector:enabled:True' state.sls heka
Update Collectd
# salt -C 'I@collectd:remote_client:enabled:True' state.sls collectd
Update Nagios
# salt -C 'I@nagios:server' state.sls nagios
Finalize the configuration of Grafana (add the dashboards…)
# salt -C 'I@grafana:client' state.sls grafana.client.service
# salt -C 'I@grafana:client' --async service.restart salt-minion; sleep 10
# salt -C 'I@grafana:client' state.sls grafana.client
Get the StackLight VIP
# vip=$(salt-call pillar.data _param:stacklight_monitor_address --out key|grep _param: |awk '{print $2}')
# vip=${vip:=172.16.10.253}
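Before starting the services, check that the variable holds what you expect and that it actually matches one of the monitoring nodes:
# echo $vip
# salt -G "ipv4:$vip" test.ping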
Manually start the services that are bound to the monitoring VIP
# salt -G "ipv4:$vip" service.start remote_collectd
# salt -G "ipv4:$vip" service.start remote_collector
# salt -G "ipv4:$vip" service.start aggregator
Stop Nagios on the monitoring nodes (because the package starts it by default)
# salt -C 'I@nagios:server:automatic_starting:False' service.stop nagios3
Then start Nagios where the VIP is running.
# salt -G "ipv4:$vip" service.start nagios3
Stacklight Dashboards
In this model, the dashboards are reverse-proxied by our Salt Master, so you can access them at the URLs below
Kibana / https://172.16.10.100:5601 or http://172.16.10.253:5601/
Grafana / https://172.16.10.100:8084/login or http://172.16.10.253:3000/login l: admin p: password
Nagios / https://172.16.10.100:8001 or http://172.16.10.253 l: nagiosadmin p:nagios
Salt Tips & Tricks
After editing your model it's good practice to check it; just run
# reclass-salt --top
Sometimes you can get lost in all the interpolation that reclass is doing on your classes. To check the pillar or top data of a node, you can use
# reclass-salt --pillar prx01.int.cloudvps.com
If you want to know which states are associated with a node
# salt 'prx01*' state.show_top
To look into a specific state
# salt 'ctl01*' state.show_sls nova
Refresh Pillar and sync_all
# salt '*' saltutil.refresh_pillar && salt '*' saltutil.sync_all
Salt troubleshooting
Salt can be run in debug mode with
cfg01# salt-call state.sls linux -l info   # or -l debug for more verbosity
You can look at currently running jobs
# salt 'ctl01*' saltutil.runner jobs.list_jobs
And kill a specific one
# salt 'ctl01*' saltutil.kill_job 20161208213122517028
Pulling formulas
While developing formulas we aren't packaging them, so it's possible to provision them from their respective git repositories. If you've done so, you can later update all of them with the following trick
# cd /usr/share/salt-formulas/env
# for dir in $(find . -name ".git"); do cd ${dir%/*}; git pull ; cd -; done
Merge your model with upstream
If you want to merge your fork with upstream, start by configuring Git
# git config --global user.email "youname@yourdomain"
# git config user.name "yourname"
Add the corresponding upstream repository to your fork
# cd /srv/salt/reclass
# git remote add upstream https://github.com/Mirantis/mk-lab-salt-model.git
Fetch it
# git fetch upstream
Check out your fork's local master branch and merge it
# git checkout master
# git merge upstream/master
You now have to resolve potential conflicts as usual, by removing the conflicting sections, and commit to your repo. Consult the GitHub documentation for further details.
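A typical conflict-resolution round looks like this (the file name is just an example):
# git status
# vi classes/cluster/mk22_lab_advanced/init.yml
# git add classes/cluster/mk22_lab_advanced/init.yml
# git commit
# git push origin master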
Conclusion
I hope this cheatsheet is useful. I'll update it regularly to keep it relevant with the current Mk22 advanced model.
Links
- Syncing a forked repository