What's a Totem "Retransmit List" all about in Corosync?
Occasionally, you may see errors similar to this in your system logs:
```
corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
```
Here's what causes them, and what you can do to fix the issue.
Corosync, or more specifically its Totem protocol implementation, defines a maximum number of cluster messages that can be sent during one token rotation. By default, that number is 50, but you may modify it by setting the `window_size` parameter in your `corosync.conf` configuration file.
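As a sketch, the corresponding `totem` stanza in `corosync.conf` might look like this (the other values shown are merely illustrative defaults, not recommendations):

```
totem {
        version: 2
        # Maximum number of cluster messages that may be sent
        # during one token rotation (Corosync default: 50)
        window_size: 50
}
```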
When one or a few slow nodes ("processors" in Totem parlance) participate in a cluster of otherwise fast ones, the slow nodes' kernel receive buffers can't keep up, messages get lost, and Corosync must retransmit them. This is what produces the `Retransmit List` notifications in your syslogs. It doesn't mean you're actually losing messages or data, but it does mean your cluster performance degrades while it happens, so you should fix the underlying problem.
There are a few considerations that apply to tuning Corosync's `window_size` parameter:
- If you have a small cluster (say, 8 nodes or fewer) whose nodes can all be expected to perform equally well because they run on identical or nearly-identical hardware, then setting a large `window_size` of up to 300 should be fine.
- If your cluster is rather heterogeneous, then you should probably stick with the default of 50. Definitely don't go higher than 256000/MTU, where MTU is that of the network interface(s) Corosync communicates over. For a standard Ethernet interface with the default MTU of 1500, that works out to a maximum `window_size` of about 170.
- If you're running at the generally safe default of 50 and you're still getting `Retransmit List` notifications, then one of your nodes is most likely significantly slower than the others, and you had better find the cause of that and fix it. The node could be under constant excessive load, have a problem with its network driver, or be plugged into an incorrectly-configured switch port.
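As a quick sanity check, the 256000/MTU ceiling mentioned above is easy to compute for common MTU values. The helper function below is just for illustration and is not part of Corosync itself:

```python
# Rough upper bound for Corosync's window_size: 256000 / MTU
# (per the rule of thumb above; the function name is ours, not Corosync's)
def max_window_size(mtu: int) -> int:
    return 256000 // mtu

print(max_window_size(1500))   # standard Ethernet -> 170
print(max_window_size(9000))   # jumbo frames -> 28
```

Note that with jumbo frames the ceiling drops well below the default of 50, so on such networks the default `window_size` itself may already be too high.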
Of course, if you need help in figuring out the root cause of the problem, we're always available to you on short notice.