Notes on replacing a bad tape drive

February 13, 2009

About two years ago, I had problems with a tape library at work. Well, in late January 2009, the tape drive broke again and this time my management decided to replace the library entirely with a new one. Based on my research, a Tandberg StorageLibrary T24 with an LTO-3 drive seemed like a good bet, as it had a fibre channel option, was supported by Retrospect (our backup software) and would let us re-use our existing LTO-2 tapes. Two weeks of hell later, here are the lessons learned, which will hopefully help someone else.

1. On Mac OS X 10.4.x and Mac OS X Server 10.4.x, Retrospect has some compatibility issues with Apple’s fibre channel cards. Upgrading to Mac OS X 10.5.x and Mac OS X Server 10.5.x should help with this.

2. You really want to segregate your tape library from your attached storage if they’re both connecting over fibre channel. Tandberg has a write-up on this as it applies to Apple products. For my own setup, I wound up putting a second fibre channel card in my Xserve (an Apple-branded LSI 7202XP), setting up a spare fibre channel switch, and plugging the second fibre channel card and my tape library into that. You can also zone your existing fibre channel switch as mentioned in Tandberg’s write-up, but honestly I just wanted the pain to stop and I had the spare kit available.

3. If you’re using Retrospect, and you get error -36 when it’s writing to a new catalog, switch to saving your catalogs on another drive. This is a disk I/O error, and comes through Retrospect straight from the Finder’s error reporting. I’d recommend checking out the drive for problems. In my case, it was a RAID that was used for storing Retrospect catalogs and restores. Shortly after the error was reported, the RAID spontaneously unmounted. I backed up the essential files off of the RAID, destroyed the RAID array and rebuilt it from scratch. No problems since then.

4. Posting rants to EMC’s Retrospect forum can be therapeutic but may not get you any feedback (useful or otherwise).

5. Sometimes, you get a dud. I spent three days with the first (of two) Tandberg StorageLibrary T24s all but ripping my hair out when it didn’t work, convinced I was missing something. Then it started rebooting itself spontaneously and repeatedly, which made me feel a little better because it was obviously a dud and I wasn’t a gigantic drooling idiot.

6. Sometimes you’ll get the absolute last of something in the United States, and the replacement has to come from the manufacturer in China. No joke. When I called to get our dud library returned and replaced, our vendor’s warehouse was out, and so was Tandberg’s. To Tandberg’s credit, I reported the problems with the first tape library on Tuesday and I had a replacement unit sent to me via International Warp Speed Overnight Before 10AM FedEx delivery by Friday morning.

7. The Tandberg StorageLibrary T24 is also known as the Magnum 224. Two names, same product. However, searching on Google for “Magnum 224” gets you more useful information than “StorageLibrary T24”.

8. Double-check your fibre channel optical cables by shining a bright flashlight or a small handheld laser pointer down one end and seeing what comes out the other. The ones I’d been using with my old library didn’t seem to be conducting light as well as they should, so I swapped them out for another set of cables. Shorter cables are generally better.
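
For item 3, one rough way to check a drive before trusting it with catalogs is to write some data to it, read it back, and compare checksums. What follows is a minimal sketch in Python under stated assumptions, not anything Retrospect provides: the function name volume_roundtrip_ok and the catalog_probe_ file prefix are made up for illustration, and a read-back can be served from the operating system's cache, so a pass is only weak evidence while a failure is a strong hint to look closer.

```python
import hashlib
import os
import tempfile


def volume_roundtrip_ok(directory, size_mb=8, chunk=1024 * 1024):
    """Write random data under `directory`, read it back, compare SHA-256.

    Returns True on a clean round trip, False on a checksum mismatch or
    any OSError (which is how a disk I/O failure would surface here).
    """
    written = hashlib.sha256()
    path = None
    try:
        # Hypothetical probe file; it is deleted again in the finally block.
        fd, path = tempfile.mkstemp(dir=directory, prefix="catalog_probe_")
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb):
                block = os.urandom(chunk)
                written.update(block)
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                read_back.update(block)
        return written.digest() == read_back.digest()
    except OSError as err:
        print(f"I/O problem on {directory}: {err}")
        return False
    finally:
        if path and os.path.exists(path):
            os.remove(path)
```

Calling volume_roundtrip_ok("/Volumes/Catalogs") (a hypothetical mount point) a few times with a larger size_mb gives a quick yes/no before pointing Retrospect at the volume. It is no substitute for the RAID vendor's own diagnostics.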

Hope this helps the next guy or girl. For myself, I’m just glad I’m able to sleep and eat normally again (stress tends to keep me up and depress my appetite).

AirPort Extreme update: Rebooting nightly wasn’t the answer.

February 6, 2009

Back in November, I’d posted an entry describing how I’d started rebooting my AirPort Extreme on a nightly basis to fix a problem where it was becoming unresponsive every couple of days. I was still having the problem (even with a nightly reboot) through December and January, so a few days ago, I decided to move the AirPort upstairs so that at least I wouldn’t have to head down into the basement all the time. I disconnected it from the gigabit switch downstairs, unplugged everything, then brought it upstairs. Once there, I installed it in our entertainment center behind the TV and connected it to the small 10/100 switch that I use to connect my home theater Mac mini to our home’s network. Lo and behold, it’s been five days since the change and the wireless network hasn’t gone offline once.

At this point, I’m inclined to blame the gigabit switch in the basement for the problem, but I don’t have anything to base that on other than the fact that the problem went away.

