High Availability in OpenStack
Submitted by florian on Wed, 2012-03-21 14:08
A few thoughts on high availability features (or the current absence thereof) in OpenStack.
I've just proposed a session for the OpenStack Folsom design summit which Jay Pipes was nice enough to invite me to (thanks!), and I thought I'd write up a few thoughts of mine ahead of time to get the discussion started.
A little while back, Tristan van Bokkem started a discussion on high availability for Nova on the OpenStack mailing list. So in Nova specifically, there are a few components where high availability is readily available; you just have to use it.
- MySQL. That's a no-brainer. MySQL HA with Pacemaker has been done so many times that I won't rehash it here. What's nice in this regard is that Galera (included in Percona XtraDB Cluster) now promises to do away with the limitations of both DRBD and traditional MySQL replication, and provide multi-node, multi-master synchronous replication for MySQL. As I'm sure you're aware, classic MySQL replication isn't synchronous, and DRBD can't do multi-node master-master. The Galera-based solution looks promising, if not as mature as the other two. Of course, I don't understand why the Galera folks had to reinvent not only replication (which makes sense) but also cluster membership and management (which doesn't), but that's a different discussion altogether.
- RabbitMQ. Its HA considerations are somewhat similar to MySQL's. A Pacemaker/DRBD-based solution exists, but the RabbitMQ maintainers consider it deprecated. Enter mirrored queues, where again the developers seemingly threw the baby out with the bath water: rather than just reimplementing replication (sensible), they also came up with their own cluster manager (questionable). Their mirrored queues would probably have played very nicely with master/slave sets in Pacemaker.
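To make the synchronous-versus-asynchronous distinction above concrete, here is a deliberately tiny Python model. It is not how Galera, DRBD, or MySQL replication are implemented; the `Node` class and the `write_async`, `flush_backlog`, and `write_sync` helpers are hypothetical names invented for this sketch. It only shows why an acknowledged write can be lost on failover when replication lags behind the acknowledgement.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy database node that just keeps an ordered log of writes."""
    name: str
    log: list = field(default_factory=list)


def write_async(primary, value, backlog):
    """Acknowledge as soon as the primary has the write.

    The write is queued in `backlog` and reaches replicas only later.
    """
    primary.log.append(value)
    backlog.append(value)
    return True  # the client sees success immediately


def flush_backlog(replicas, backlog):
    """Deliver all queued writes to every replica (the catch-up step)."""
    for value in backlog:
        for replica in replicas:
            replica.log.append(value)
    backlog.clear()


def write_sync(nodes, value):
    """Acknowledge only after every node has applied the write."""
    for node in nodes:
        node.log.append(value)
    return True


def demo():
    # Asynchronous case: the primary fails before the backlog is shipped.
    primary, replica = Node("a"), Node("b")
    backlog = []
    write_async(primary, "order-1", backlog)
    flush_backlog([replica], backlog)
    write_async(primary, "order-2", backlog)  # acknowledged, not yet shipped
    async_survivor = list(replica.log)        # primary dies, replica takes over

    # Synchronous case: every acknowledged write is already on all nodes.
    nodes = [Node("a"), Node("b"), Node("c")]
    write_sync(nodes, "order-1")
    write_sync(nodes, "order-2")
    sync_survivor = list(nodes[1].log)        # any single node may fail

    return async_survivor, sync_survivor
```

In the asynchronous case the surviving replica is missing `order-2` even though the client was told it succeeded; in the synchronous case any surviving node holds both writes.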
As Tom Ellis pointed out in another email in the previously mentioned thread, there are more HA considerations for services in Nova proper.
- nova-volume still has a lot of work to do. It has an iSCSI driver which can of course be used as an iSCSI proxy pointed at a highly available, potentially DRBD-backed, software iSCSI target, or at an iSCSI-based hardware solution that has HA built in, such as HP LeftHand. Alternatively, we could just operate on RBD volumes (part of Ceph), which will also take care of redundancy for us and add seamless scale-out and remirroring. That being said, there is currently no real HA provision for the nova-volume service itself, and that's something that will be required.
- Compute nodes can all run their own instance of nova-api.
- Front-end API servers can all run nova-scheduler, with a load balancer in front of them.
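The "several identical instances behind a load balancer" pattern mentioned above can be sketched in a few lines of Python. This is only an illustration of round-robin selection with health tracking, not the behaviour of ldirectord or any particular balancer; the `RoundRobinBalancer` class and the backend names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        """Record a failed health check for one backend."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record a recovered backend (ignored if it was never configured)."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none is left."""
        # Look at each configured backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

Because the services are stateless, losing one instance only shrinks the pool: `pick()` keeps returning the remaining healthy backends, and a recovered instance rejoins the rotation once `mark_up()` is called.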
The Pacemaker stack has the potential of being a nice fit for most of the above. It comes with iSCSI target support (RBD doesn't need Pacemaker on the server end, as Ceph takes care of its own HA). Pacemaker also ties in directly with upstart, so any upstart job can be monitored as a Pacemaker service. And Pacemaker's clone facility makes it easy to run multiple instances of inherently stateless services with minimal configuration. What's more, Pacemaker comes with full integration for the ldirectord load-balancing service. Of course, Pacemaker adds a reliable communications layer (Corosync) and a multi-master, self-replicating configuration facility.
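As a rough illustration of how little configuration a cloned upstart job needs, the Python helper below renders a crm shell snippet consisting of one primitive and one clone. Treat it as a sketch under assumptions: the `clone_config` function, the `p_`/`cl_` resource ID prefixes, and the 30-second monitor interval are made up for this example, and whether a given OpenStack service ships an upstart job by that name depends on your distribution.

```python
def clone_config(job, monitor_interval="30s"):
    """Render a crm shell snippet that runs one upstart job as a clone.

    `job` is the upstart job name; the resource IDs derived from it are
    purely illustrative naming conventions.
    """
    primitive_id = f"p_{job}"
    clone_id = f"cl_{job}"
    return "\n".join([
        # The trailing backslash continues the primitive definition.
        f"primitive {primitive_id} upstart:{job} \\",
        f'    op monitor interval="{monitor_interval}"',
        f"clone {clone_id} {primitive_id}",
    ])


if __name__ == "__main__":
    print(clone_config("nova-api"))
```

The generated text would then be reviewed and loaded with the crm shell; the point is simply that a stateless service needs no more than a primitive with a monitor operation and a clone wrapping it.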
As for non-Nova OpenStack services, Glance could use some Pacemaker integration (not hard to do; it's just that someone has to do it).
Ceph, in my opinion, has the very interesting potential of being a redundant, scalable storage one-stop shop for OpenStack. It serves the purposes of both volume/block storage (with RBD) and object storage (with RADOS/radosgw). And, as already pointed out, it comes with HA, replication, and scalability built in.
Comments and feedback on the above are much appreciated. For OpenStack developers visiting this blog for the first time: to combat comment spam, we require you to log in before posting comments, but you can simply use your Launchpad OpenID to do so.