Nixpanic's Blog

Setting up a test-environment for Apache CloudStack and Gluster

This is an example of how to configure an environment where you can test CloudStack and Gluster. It uses two machines on the same LAN: one acts as a KVM hypervisor, the other as storage and management server. Because the (virtual) networking in the hypervisor is a little more complex than the networking on the management server, the hypervisor will be set up with an OpenVPN connection so that the local LAN is not affected by 'foreign' network traffic.


Using Gluster as Primary Storage in CloudStack

CloudStack could use a Gluster environment for different kinds of storage:
  1. Primary Storage: mount over the GlusterFS native client (FUSE)
    This post shows how it works and refers to the patches that make this possible.
  2. Volumes for virtual machines: use the libgfapi integration in QEMU
    This is the next upcoming task; an initial, untested patch is in the wip-branch.
  3. Secondary Storage: mount over the GlusterFS native client (FUSE)
The current work-in-progress repository on the Gluster Community Forge already has functional support for creating Primary Storage on existing Gluster environments:
  • Infrastructure -> Primary Storage -> Add Primary Storage
    Add Primary Storage
  • Infrastructure -> Zones -> Add Zone - [wizard]
    Add Primary Storage through the Zone Wizard
Via the Infrastructure -> Primary Storage menu, the details of the newly created storage can be displayed.
Primary Storage Details

After creating a virtual machine from the standard CentOS template, it can be verified that the Primary Storage Pool on the Gluster environment is functioning. On the hypervisor that runs the VM:

[root@agent ~]# mount | grep gluster
gluster.cloudstack.example.net:/primary on /mnt/dd697445-f67c-33bc-af52-386de3ff7245 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

[root@agent ~]# ps -C qemu-kvm -o command | grep i-2-3-VM
/usr/libexec/qemu-kvm -name i-2-3-VM ... -drive file=/mnt/dd697445-f67c-33bc-af52-386de3ff7245/1afd48d2-c5e1-44ce-bcb3-051cc4d59716,if=none,id=drive-virtio-disk0,format=qcow2,cache=none ...
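
The same check can be scripted. This is a minimal sketch that assumes the `mount` output format shown above and mount points without spaces; the function name gluster_mounts is made up for this example and is not part of CloudStack or Gluster:

```shell
#!/bin/sh
# gluster_mounts: read `mount` output on stdin and print "<volume> <mountpoint>"
# for every GlusterFS FUSE mount. Hypothetical helper for illustration.
gluster_mounts() {
    # mount prints: <source> on <mountpoint> type <fstype> (<options>)
    awk '$2 == "on" && $4 == "type" && $5 == "fuse.glusterfs" { print $1, $3 }'
}

# Example input: the line taken from the hypervisor output above.
sample='gluster.cloudstack.example.net:/primary on /mnt/dd697445-f67c-33bc-af52-386de3ff7245 type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)'
printf '%s\n' "$sample" | gluster_mounts
```

On a live hypervisor, `mount | gluster_mounts` would list each Gluster volume next to its mount point, and the path after file= in the qemu-kvm command line should start with one of those mount points.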

The changes to CloudStack that make this possible are located on the Gluster Community Forge and have been posted for review:
  • [#15932] Add support for Primary Storage on Gluster using the libvirt backend
  • [#15933] Add Gluster to the list of protocols in the Management Server

Initial work on Gluster integration with CloudStack

Last week there was a CloudStack Conference at the Beurs van Berlage in Amsterdam. I attended the first day and joined the Hackathon. Without any prior knowledge of CloudStack, I was asked by some of the Gluster community people to have a look at adding support for Gluster in CloudStack. An interesting topic, and of course I'll happily have a go at it.
CloudStack seems quite a nice project. The conference showed an awesome part of the community, loads of workshops and a surprising number of companies that sponsor and contribute to CloudStack. Very impressive!
One of the attendees at the CloudStack Conference was Wido den Hollander. Wido has experience with integrating Ceph in CloudStack, and gave an explanation and some pointers on how storage is implemented.

Integration Notes

libvirt

It seems that the most useful way to integrate Gluster with CloudStack is to make sure libvirt knows how to use a Gluster backend. Checking with some of my colleagues who are part of the group that supports libvirt quickly showed that libvirt already knows about Gluster (Add new net filesystem glusterfs).
This suggests that it should be possible to create a storage pool in libvirt that is hosted on a Gluster environment. A little trial and error shows that a command like this creates the pool:

# virsh pool-create-as --name primary_gluster --type netfs --source-host $(hostname) --source-path /primary --source-format glusterfs --target /mnt/libvirt/primary_gluster
The components that the above command uses are:
  • primary_gluster: the name of the storage pool in libvirt
  • netfs: the type of the pool, netfs mounts the 'pool' under the given --target
  • $(hostname): one of the Gluster servers that is part of the Trusted Storage Pool that provides the Gluster volume
  • /primary: the name of the Gluster volume
  • /mnt/libvirt/primary_gluster: directory where libvirt will mount the Gluster volume
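
The same pool can also be described as libvirt pool XML, which is closer to what a management layer would generate. The sketch below prints such a definition; the function name gluster_pool_xml is hypothetical, the host name is only an example value, and the arguments are not XML-escaped:

```shell
#!/bin/sh
# gluster_pool_xml NAME HOST VOLUME TARGET
# Print a libvirt netfs pool definition that mounts a Gluster volume.
# Hypothetical helper; values are inserted as-is, so keep them simple.
gluster_pool_xml() {
    cat <<EOF
<pool type='netfs'>
  <name>$1</name>
  <source>
    <host name='$2'/>
    <dir path='$3'/>
    <format type='glusterfs'/>
  </source>
  <target>
    <path>$4</path>
  </target>
</pool>
EOF
}

gluster_pool_xml primary_gluster gluster.cloudstack.example.net /primary /mnt/libvirt/primary_gluster
```

Saving the output to a file and passing it to `virsh pool-create` (transient pool) or `virsh pool-define` (persistent pool) should give the same result as the pool-create-as command above.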
Creating a volume (a libvirt volume, which is a file on the Gluster volume) can be done through libvirt:

# virsh vol-create-as --pool primary_gluster --name virsh-created-vol.img --capacity 512M --format raw
This will create the file /mnt/libvirt/primary_gluster/virsh-created-vol.img, and that file can be used as a storage backend for a virtual machine. An example of a snippet for the disk that can be attached to a VM:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source protocol='gluster' name='/primary/virsh-created-vol.img'>
        <host name='HOSTNAME' port='24007'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

There are some important prerequisites that need to be applied to the Gluster volume so that libvirt can start a virtual machine with the appropriate user. After setting these options on the Gluster volume and in /etc/glusterfs/glusterd.vol, a test virtual machine can be started. The log of the VM (/var/log/libvirt/qemu/just-a-vm.log) shows the QEMU command line, and this contains the path to the storage:

... /usr/libexec/qemu-kvm -name just-a-vm ... -drive file=gluster+tcp://HOSTNAME:24007/primary/virsh-created-vol.img,if=none,id=drive-virtio-disk0,format=raw,cache=none ...
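
The mapping from the disk XML to this URI looks mechanical: the protocol with its transport, the host and port from the <host> element, and the name attribute as the path. A small sketch of that composition, with a made-up function name and tcp assumed as the transport:

```shell
#!/bin/sh
# gluster_drive_uri HOST PORT VOLUME IMAGE
# Compose the gluster+tcp:// URI that QEMU receives in its -drive file= option.
# Hypothetical helper for illustration only.
gluster_drive_uri() {
    # Strip a leading slash so "/primary" and "primary" both work.
    volume=${3#/}
    printf 'gluster+tcp://%s:%s/%s/%s\n' "$1" "$2" "$volume" "$4"
}

gluster_drive_uri HOSTNAME 24007 /primary virsh-created-vol.img
```

With the values from the disk snippet above, this prints the same URI that appears in the QEMU log line.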

Design Overview

Since CloudStack uses libvirt, it should be relatively straightforward to add support for Gluster in CloudStack. A diagram that shows the main interactions and their components looks like this:

                     .--------------.
                     |  CloudStack  |
                     '-------+------'
                             |
                       .-----+-----.
                       |  libvirt  |
                       '-----+-----'
                             |
            .----------------+--------------.
            |                               |
  .---------+----------.         .----------+----------.
  | / storage pool /   |         | virtual machine     |
  | image management   |         | management          |
  '---------+----------'         | / XML description / |
            |                    '----------+----------'
            V                               |
........................                    V
: / vfs/fuse /         :      .............................
: mount -t glusterfs   :      : / QEMU + libgfapi /       :
:......................:      : qemu file=gluster://...   :
                              :...........................:

The parts that are already functioning are these:
  • libvirt mounts a Gluster volume as a netfs/fuse-filesystem
  • create an XML definition for the disk and pass gluster:// on to QEMU

The actual development work will be in teaching CloudStack to instruct libvirt to use a Storage Pool backed by a Gluster Volume and attach disks to a virtual machine with the gluster protocol.

CloudStack Storage Subsystem modifications

Wido pointed out that most of the storage changes will be needed in the LibvirtStoragePoolDef and LibvirtStorageAdapter Java classes. Also the Storage Core would need to know about the new storage backend.
After some browsing and reading the sources, the needed modifications looked straightforward. The Gluster backend is comparable to the NFS backend, which can be used as an example.
Changing the code is the easy part, compared to testing it. Remember that I have no CloudStack background whatsoever... Setting up a CloudStack environment to see if the modifications do anything is far from trivial. Compared to the time I spent on changing the source code, trying to get a minimal test environment functioning took most of my time. At this moment, my patches are untested and therefore I have not posted them for review yet :-/

Setting up a CloudStack environment for testing

Some pointers to set up a development environment:
  • Building CloudStack manually (non RPMs)
  • Maven 3.0.4 has been deprecated; use Maven 3.0.5 instead
  • Installation Guide
  • RHEL6 requires the Optional Channel for jsvc from the jakarta-commons-daemon-jsvc package
  • install the cloudstack-agent (and -common) package
  • set guid and local.storage.uuid in /etc/cloudstack/agent/agent.properties
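
Setting the two properties by hand works, but a small helper avoids duplicate keys when the agent is reinstalled a few times. This is only a sketch: the function name set_property is made up, and it assumes the file uses plain key=value lines:

```shell
#!/bin/sh
# set_property FILE KEY VALUE
# Replace an existing "KEY=..." line in FILE, or append one if KEY is missing.
# Hypothetical helper; VALUE must not contain '|', '&' or backslashes.
set_property() {
    file=$1 key=$2 value=$3
    # Escape dots so that "local.storage.uuid" is matched literally.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if [ -f "$file" ] && grep -q "^${key_re}=" "$file"; then
        # Edit through a temporary file and copy the content back, so the
        # original file keeps its owner and permissions.
        sed "s|^${key_re}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

Assuming both properties take a UUID and uuidgen is installed, a call like `set_property /etc/cloudstack/agent/agent.properties guid "$(uuidgen)"`, repeated for local.storage.uuid, would fill them in.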

Running the CloudStack Management server is easy enough once the sources are checked out and built. A command like this works for me:

# mvn -pl :cloud-client-ui jetty:run
To deploy the changes for the cloudstack-agent, I prefer to build and install RPMs. Building these is made easy by the packaging/centos63/package.sh script:

# cd packaging/centos63 ; ./package.sh ; cd -
This script and the resulting packages work well on RHEL-6.5.

Upcoming work

With the test environment in place, I can now start to make changes to the Management Server. The current modifications in the JavaScript code make it possible to select Gluster as a primary storage pool. Unfortunately, I'm no web developer and changing JavaScript isn't something I'm very good at. I will be hacking on it every now and then, and hope to be able to have something suitable for review soon.
Of course, any assistance is welcome! I'm happy to share my work in progress if there is an interest. No guarantees about any working functionality though ;-)