Today Duncan Keefe, Senior Manager in Apple’s Information Systems and Technology, presented on campus about how Apple’s IT department functions.
Somewhat understandably, the talk incorporated a fair amount of sales pitch for Mac OS X Server, not to mention a lot about how Apple’s IT department is as awesome as the rest of Apple (which certainly seemed to be true based on the numbers we saw today). But some of the discussion on how to effectively communicate with your userbase would have been interesting for anyone who works in support, and there were a couple of interesting technical tidbits as well, including one in particular that still has me excited.
For example, did you know that Apple’s IT infrastructure is 71% based on open-source solutions? While I know as well as anyone that a lot of pieces of OS X itself are open source, Apple also makes use of a lot of enterprise-grade proprietary systems like SAP, which I thought would pull that number down further.
Or another interesting fact: after migrating large parts of their infrastructure to Mac OS X Server and Xserves instead of Solaris or AIX or other systems, the sysadmin-to-server ratio for OS X servers is 1:276. To compare that to an organization I know, SIPB has about 30 or 40 servers in its machine room, and there are about 20 people who have access to the machine room, not counting people who maintain servers on XVM or who maintain servers without having physical access. Now granted, Apple’s number is for maintaining a network with completely homogeneous hardware and operating systems, and our tiny farm of servers probably runs more services per server than theirs, but the idea of a single person being able to run the entire SIPB machine room is…stunning.
But the truly interesting thing that Duncan mentioned was in response to someone’s question about virtualization. He responded that Apple currently doesn’t use virtualization for their IT infrastructure. Instead, they developed an in-house app that allows them to dynamically shuffle services around their servers based on the resources those services need. Apparently their average server utilization is 60%.
And that is the dream of the cloud – by providing an environment large enough to contain your entire enterprise, you can smooth out what would otherwise be debilitating spikes in individual services. And in particular, this is the perfect answer to server virtualization.
If you look at your average, non-virtualized, single-purpose server, it’s probably at about 10% resource utilization, which makes it hard to justify buying a new server for each individual application. Virtualization is often touted as the solution to this problem – you run a bunch of single-purpose virtual machines on a single physical host. You can take advantage of features like guest migration to balance load dynamically. But if a service doesn’t need to exist in a completely independent instance of the operating system, you’re probably losing on the operating system overhead, in terms of disk space and RAM and probably processor usage as well. I’m willing to guess that the cost could be as much as 5% or 10%, which matters when you have hundreds of systems.
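To make that guess a little more concrete, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the 300 services, treating the overhead as a flat 10% added to each service’s load, and a 60% target utilization per host. None of these figures describe Apple’s actual setup, and `hosts_needed` is just a name I made up.

```python
def hosts_needed(n_services, load_pct, overhead_pct, target_pct):
    """Hosts required to run n_services at or below target_pct utilization.

    All percentages are whole numbers so the arithmetic stays in integers.
    """
    # Total demand, in units of 1/100 of a percent of one host.
    demand = n_services * load_pct * (100 + overhead_pct)
    # Usable capacity of one host, in the same units.
    capacity_per_host = target_pct * 100
    # Ceiling division without floating point.
    return -(-demand // capacity_per_host)


# Assumed scenario: 300 services, each using about 10% of a server.
dedicated = 300                              # one box per service
virtualized = hosts_needed(300, 10, 10, 60)  # 10% guest-OS overhead (a guess)
shuffled = hosts_needed(300, 10, 0, 60)      # no per-guest overhead

print(dedicated, virtualized, shuffled)      # prints: 300 55 50
```

Under those assumptions the overhead costs five extra machines out of fifty-odd, and the gap grows in proportion to the number of services.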
By dynamically shuffling applications across servers instead, you can take advantage of the economies of scale without the extra overhead of full OS virtualization. Which is just awesome. And the 60% average utilization? Also amazing. It’s just about the perfect number: high enough that your servers aren’t twiddling their thumbs, but low enough that any one server should be able to handle a sudden spike.
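I have no idea how Apple’s in-house app actually decides where things go, but the basic idea can be sketched as a bin-packing problem. The version below is a minimal first-fit-decreasing placement that refuses to push any server past a target utilization. The `Server` class, the `place` function, the service names, and the resource numbers are all hypothetical, and a real system would track several resources (CPU, RAM, disk, network) rather than a single number.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: int                                  # abstract resource units
    services: dict = field(default_factory=dict)   # service name -> demand

    @property
    def load(self):
        return sum(self.services.values())


def place(services, servers, target=0.60):
    """Place services largest-first on the first server that stays at or
    under the target utilization. Returns the names that did not fit."""
    unplaced = []
    by_demand = sorted(services.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in by_demand:
        for server in servers:
            if (server.load + demand) / server.capacity <= target:
                server.services[name] = demand
                break
        else:
            # No server could take this service without exceeding the target.
            unplaced.append(name)
    return unplaced


servers = [Server("xserve-a", 100), Server("xserve-b", 100)]
demands = {"mail": 40, "web": 30, "wiki": 25, "dns": 15}
leftover = place(demands, servers)

for s in servers:
    print(s.name, s.load, sorted(s.services))
print("unplaced:", leftover)
```

With these made-up numbers both servers end up at 55 units of load, just under the 60% cap, which leaves each one some headroom for a spike.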
I’ve been kind of excited about this idea all day, although I can’t really think of a scenario that I could apply the concept to. MIT’s infrastructure is too heterogeneous in terms of both hardware and operating system to benefit, and all of the servers I maintain for SIPB are too specialized, or too heavily used already to benefit, or run services that are un-migratable – or some combination thereof.
But it’s fun to think about what it would take to implement a system like that. For one thing, you’d have to be sure never to assume that a service lives on a fixed IP address. And how often do you re-balance services? Unfortunately, I was a little too busy dragging my jaw across the floor to actually ask any interesting questions while Duncan was there.
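The “no fixed IP address” part amounts to adding a layer of indirection: clients ask for a service by name, and something else knows where it currently lives. Here is a toy sketch of that idea. The `ServiceRegistry` class and its methods are invented for illustration, the addresses come from the reserved documentation range, and in practice this role would more likely be played by DNS or a similar directory than by an in-memory dictionary.

```python
class ServiceRegistry:
    """Maps service names to whatever address currently hosts them."""

    def __init__(self):
        self._locations = {}

    def register(self, service, address):
        self._locations[service] = address

    def resolve(self, service):
        # Clients call this every time instead of caching an IP address.
        return self._locations[service]

    def migrate(self, service, new_address):
        """Point the service at a new host and return the old address."""
        if service not in self._locations:
            raise KeyError(f"unknown service: {service}")
        old_address = self._locations[service]
        self._locations[service] = new_address
        return old_address


registry = ServiceRegistry()
registry.register("wiki", "192.0.2.10")
before = registry.resolve("wiki")

# The balancer decides to move the wiki to a less busy machine.
registry.migrate("wiki", "192.0.2.11")
after = registry.resolve("wiki")

print(before, "->", after)   # prints: 192.0.2.10 -> 192.0.2.11
```

The sketch says nothing about the harder questions, like how to move a service’s state along with it or how often re-balancing is worth the disruption.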
Anyway, that was my exciting tech story for today.