
IP Failover notification scripts

Having IP failover on OS X Server is a beautiful thing if you have services that absolutely, positively must remain running. You can also configure it to email a notification to a specified address, though the default OS X notification is pretty uninformative. A couple of weeks ago, I got a notification from one of my sets of servers that looked like this:
 
From: root@localhost.localhost 
Subject: Failover Notification: IP address[es] have gone down 
Date: June 25, 2006 8:26:46 AM EDT 
To: me@work.com 
 
That was it. So I started looking around, by consulting both a coworker and AFP548.com’s writeups, to see if I could make this a little more informative. The answer was yes, you can. Here’s how. 
 
1. Edit the following script on the failover server: /usr/libexec/NotifyFailover.
 
2. Look for the following section:  
 
if [ "${_state}" = "up" ]; then
    _state="IP address[es] have come up"
elif [ "${_state}" = "down" ]; then
    _state="IP address[es] have gone down"
fi
 
Edit the “IP address[es] have come up” and “IP address[es] have gone down” strings to something more informative. I edited mine to say “Primary.server.name has gone down. Failover.server.name has taken over for primary and will function as the mobile home server until Primary.server.name comes back on-line.” and “Primary.server.name has come up and has become available again. Failover.server.name has stopped all its failover services and is resuming its position as the failover server.”
 
The resulting emails look like this:
 
From: root@localhost.localhost 
Subject: Failover Notification: Primary.server.name has gone down. Failover.server.name has taken over for primary and will function as the mobile home server until Primary.server.name comes back on-line. 
Date: July 21, 2006 8:40:57 AM EDT 
To: me@work.com 
 
From: root@localhost.localhost 
Subject: Failover Notification: Primary.server.name has come up and has become available again. Failover.server.name has stopped all its failover services and is resuming its position as the failover server. 
Date: July 21, 2006 8:45:28 AM EDT 
To: me@work.com 
 
 
Much more informative. 
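
If you would rather not type the host names into each message by hand, the edited section can be sketched with the names held in variables. This is only an illustration: `_state` comes from the original script, while `primary`, `failover` and the `describe_state` wrapper are hypothetical names added so the snippet runs on its own. In NotifyFailover itself you would edit the if/elif block in place.

```shell
#!/bin/bash
# Hypothetical host names; substitute your own servers.
primary="Primary.server.name"
failover="Failover.server.name"

# Wrapper used here only so the snippet can run by itself; in
# NotifyFailover the if/elif block below is edited in place.
describe_state() {
    _state="$1"
    if [ "${_state}" = "up" ]; then
        _state="${primary} has come up and has become available again. ${failover} has stopped all its failover services and is resuming its position as the failover server."
    elif [ "${_state}" = "down" ]; then
        _state="${primary} has gone down. ${failover} has taken over for primary and will function as the mobile home server until ${primary} comes back on-line."
    fi
    # Any other value is passed through unchanged
    echo "${_state}"
}
```

Calling `describe_state down` prints the "has gone down" message that ends up in the email subject.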
 
There’s still nothing in the body of the email, though, and you may also want to log what’s happened. The fix for that is to put a couple of new scripts in the /Library/IPFailover/ip.address.of.primary/ directory. (For an explanation of how this is set up, see AFP548.com’s great writeup on this topic.) With a correctly configured failover, there should already be a few scripts in here whose names begin with PostAcq, PreAcq and PreRel. All should have numbers after them, like PostAcq10 and PreRel20. You’ll want to add your notification scripts as the last PostAcq and PreRel scripts.
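
To pick a name that lands last in its group, a helper along these lines could work. It is a rough sketch that assumes scripts sharing a prefix run in order of their numeric suffix, which the numbering suggests but which you should confirm on your own servers. The function name `next_script_name` and the step of 10 are my own choices, not part of OS X Server.

```shell
#!/bin/bash
# Hypothetical helper: print a script name whose number is higher than every
# existing script with the given prefix (PostAcq, PreAcq or PreRel).
next_script_name() {
    local dir="$1" prefix="$2" highest=0 file suffix
    for file in "${dir}/${prefix}"*; do
        [ -e "${file}" ] || continue      # the glob matched nothing
        suffix="${file##*/}"              # drop the directory part
        suffix="${suffix#"${prefix}"}"    # drop the prefix, leaving the number
        case "${suffix}" in
            ''|*[!0-9]*) continue ;;      # skip names without a purely numeric suffix
        esac
        if [ "${suffix}" -gt "${highest}" ]; then
            highest="${suffix}"
        fi
    done
    # expr treats a value such as 08 as decimal, unlike $(( )) in bash
    echo "${prefix}$(expr "${highest}" + 10)"
}
```

For example, in a directory that already holds PostAcq10 and PostAcq20, `next_script_name /Library/IPFailover/ip.address.of.primary PostAcq` would print PostAcq30.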
 
My PostAcq notification script, to let the right people know that the primary’s gone down and the failover has taken over:
 
 
#!/bin/bash

# Post-Acquire failover notification script

subject="Primary.server.name has failed"
to="me@work.com,me@mac.com,me@gmail.com,someonelse@work.com"
body="Primary.server.name has failed. Failover.server.name has taken over all mobile home folder hosting responsibilities from Primary and will take over Primary.server.name's functions until the primary server comes back online."


# Send e-mail advising that a failover event has occurred
echo "${body}" | mail -s "${subject}" "${to}"

# Record the alert in the system log
logger "Sent alert email '${subject}' to ${to}."
 
 
My PreRel notification script, to let the right people know that the primary is back up and the failover has gone back to waiting: 
 
 
#!/bin/bash

# Pre-Release script - run on failover server before returning priority to main server

subject="Failover.server.name is returning priority to Primary.server.name"
to="me@work.com,me@mac.com,me@gmail.com,someonelse@work.com"
body="Primary.server.name has become available again. Failover.server.name has stopped all its failover services, unmounted the RAIDs that house the mobile home folders and is resuming its role as the failover server."


# Send e-mail advising that a failback event has occurred
echo "${body}" | mail -s "${subject}" "${to}"

# Record the alert in the system log
logger "Sent failback alert email '${subject}' to ${to}."
 
 
 
You’ll get emails that look like this: 
 
From: root@localhost.localhost 
Subject: Primary.server.name has failed 
Date: July 21, 2006 10:19:27 AM EDT 
To: me@work.com,me@mac.com,me@gmail.com,someonelse@work.com 
 
Primary.server.name has failed. Failover.server.name has taken over all mobile home folder hosting responsibilities from Primary and will take over Primary.server.name’s functions until the primary server comes back online.
 
 
From: root@localhost.localhost 
Subject: Failover.server.name is returning priority to Primary.server.name 
Date: July 21, 2006 10:22:14 AM EDT 
To: me@work.com,me@mac.com,me@gmail.com,someonelse@work.com 
 
Primary.server.name has become available again. Failover.server.name has stopped all its failover services, unmounted the RAIDs that house the mobile home folders and is resuming its role as the failover server. 
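
One thing to note: both notification scripts write their "Sent" line to the log whether or not mail actually succeeded. Here is a minimal sketch of a shared helper that checks mail's exit status before logging. The function name `send_alert` and the `MAIL_CMD` and `LOG_CMD` override variables are hypothetical additions of mine (the overrides make dry runs possible); by default the real mail and logger commands are used, as in the scripts above.

```shell
#!/bin/bash
# Hypothetical helper shared by the PostAcq and PreRel notification scripts.
# MAIL_CMD and LOG_CMD are optional overrides; they default to mail and logger.
send_alert() {
    local subject="$1" to="$2" body="$3"
    if printf '%s\n' "${body}" | "${MAIL_CMD:-mail}" -s "${subject}" "${to}"; then
        # mail exited with status 0, so record the alert as sent
        "${LOG_CMD:-logger}" "Sent alert email '${subject}' to ${to}."
    else
        # mail reported a failure; log that instead and signal it to the caller
        "${LOG_CMD:-logger}" "FAILED to send alert email '${subject}' to ${to}."
        return 1
    fi
}
```

Each notification script would then set subject, to and body as before and finish with `send_alert "${subject}" "${to}" "${body}"`. Keep in mind that a zero exit status from mail only means the message was handed off, not that it was delivered.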
 
 
Do you need to do all this if you have only one set of servers that fail over? Probably not. But if there’s any possibility of confusion, a specific notification goes a long way toward figuring out what’s happened when the alert first arrives.
 
 
UPDATE: It was pointed out to me that my notification email setup is wrong up above. There should be no spaces following the commas separating the email addresses. I’ve made the corrections in the entry. 
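
If the recipient list gets pasted in from somewhere else, a small guard could strip stray whitespace before the list is handed to mail. This is just a sketch; `normalize_recipients` is a hypothetical helper name, not something that ships with OS X.

```shell
#!/bin/bash
# Hypothetical helper: remove all whitespace from a comma-separated
# list of email addresses.
normalize_recipients() {
    printf '%s' "$1" | tr -d '[:space:]'
}

to=$(normalize_recipients "me@work.com, me@mac.com, me@gmail.com")
echo "${to}"   # prints me@work.com,me@mac.com,me@gmail.com
```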
