(Is there a non-ambiguous abbreviation for “appliance”? I don’t want to use “app builders”, because people would obviously get the wrong impression…)

I’m still looking for an appliance builder that has everything I want. Right now the three software packages on my list are Cobbler+koan, Thincrust, and maybe Kiwi.

I started looking into Kiwi a while back, but backed off because they seem to have DRY problems. Not to mention it’s written in Perl.

Cobbler and Thincrust look a little more promising, at least on the surface, but it’s hard to get a good sense of the kind of flexibility I can get out of them. It certainly doesn’t look like either of them has the ability to install a Debian/Ubuntu system without being handed the 20 lines of required preseed, but I could be wrong.

Does anybody have experience with these? Does anybody know if they fit the 4 features from last time, or could be hammered into fitting them?

I’ve said it before – there are a lot of appliance builders out there. With virtualization and the cloud being the hot ticket items of the day, everybody wants to try their hand at writing the software to provision those VMs.

Unfortunately, they all seem to suck. At least, the Debian/Ubuntu ones do. I haven’t found a VM or appliance builder application that I like, mostly because they all seem to be bad knock-offs of the actual debian-installer or ubuntu-installer.

The appliance builder I want has four key features:

  1. It should run unattended.

    This one is kind of obvious, but rules out options like just running the debian-installer by hand and answering the questions as they come up. I do a lot of repetitive installs, and it’s important that I can hand my appliance builder a pre-crafted config file and get a customized, but totally unattended install.

  2. It should run trivially in a virtual environment, and seamlessly support multiple hypervisors.

    All of the appliance builders that anybody uses, or at least the ones I’ve attempted to use (VMBuilder and xen-create-image), run on the hypervisor host itself. This is anywhere from an inconvenience to an actual security threat.

    I want to be able to offer users a high degree of customizability, but my users are generally untrusted, and you simply can’t allow any flexibility when the appliance installer runs as root on your hypervisor. You certainly can’t allow your users to install packages out of their own apt repositories, including PPAs – a targeted attacker can easily break out of the chroot they’re put into when their package installs, and any package can include code that runs as root. Even if you don’t allow your users to customize appliances, the principle of least privilege says you shouldn’t be running the installs as root when you can run them as not-root, and you pretty clearly can.

    Therefore, being able to run the appliance builder in a VM is an absolute must, regardless of the performance hit. We were able to adapt xen-create-image to do this for Invirt, but it wasn’t pretty, it took a lot of shoehorning, and it’s still pretty fragile.

    Not only do I want to be able to install my appliances in a guest, but I also want to be able to run that guest under various virtualization environments. Many of my deployments are still heavily dependent on Xen. I have other deployments using KVM. Ideally, I’d like my appliance builder to work fairly transparently with multiple virtualization environments, although it’s probably OK for me if the resulting appliance image only works with the particular hypervisor that created it.

  3. It should use the distribution’s installer mechanism instead of jury-rigging its own.

    All of the appliance building applications I know of use their own installation code. For Debian/Ubuntu installers, this means running debootstrap and then frobbing the output. Even Kiwi, the software behind the very shiny SUSE Studio, effectively starts by unpacking a list of RPMs by hand.

    There’s a lot of complexity in the Debian/Ubuntu installers. When you try to duplicate it, you will get it wrong. The resulting system will not be equivalent to the same system installed using a CD. I’ve certainly seen cases before where an installer-built image was different than an appliance-builder-built image, and it’s incredibly frustrating. Maybe this is something that could be fixed by actively developing the appliance builder (Ubuntu’s VMBuilder seems to be getting help from the ubuntu-installer developers), but it inherently seems like a waste of time to have this kind of code duplication.

  4. It should have a layer of abstraction that keeps me from repeating myself.

    Simply booting the debian-installer or ubuntu-installer with a preseed file would certainly address the first three points. However, the preseed file needed simply to get an unattended Ubuntu install with no other bells and whistles is more than 20 lines long. Even if I have a template I can copy around, it’s gross from a DRY perspective.

    I want my appliance builder to be configured through a config format that abstracts that away. I only want to specify that which can’t be reasonably guessed, not everything that I might want to have a say about.
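
To make the fourth point concrete, here is a minimal sketch of that kind of abstraction layer. It is not an existing tool: the spec keys, the `DEFAULTS` table, and the `render_preseed` function are all hypothetical names made up for illustration, and the handful of preseed lines it emits is nowhere near a complete unattended configuration. The idea is only that the caller states what can’t be guessed and everything else falls back to a default.

```python
# Hypothetical sketch: expand a short appliance spec into debian-installer
# preseed lines. The spec keys and DEFAULTS below are invented for this
# example; a real unattended install needs many more preseed entries.

DEFAULTS = {
    "locale": "en_US",
    "hostname": "appliance",
    "mirror": "archive.ubuntu.com",
    "timezone": "US/Eastern",
    "packages": [],
}


def render_preseed(spec):
    """Return preseed text for spec, filling unspecified keys from DEFAULTS."""
    unknown = set(spec) - set(DEFAULTS)
    if unknown:
        raise KeyError("unknown spec keys: %s" % ", ".join(sorted(unknown)))
    cfg = dict(DEFAULTS, **spec)
    lines = [
        "d-i debian-installer/locale string %s" % cfg["locale"],
        "d-i netcfg/choose_interface select auto",
        "d-i netcfg/get_hostname string %s" % cfg["hostname"],
        "d-i mirror/http/hostname string %s" % cfg["mirror"],
        "d-i clock-setup/utc boolean true",
        "d-i time/zone string %s" % cfg["timezone"],
        "d-i finish-install/reboot_in_progress note",
    ]
    # Only mention extra packages when the spec actually asks for some.
    if cfg["packages"]:
        lines.append("d-i pkgsel/include string %s" % " ".join(cfg["packages"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two lines of spec instead of a copied-around preseed template.
    print(render_preseed({"hostname": "web1", "packages": ["openssh-server"]}))
```

Rejecting unknown keys is a deliberate choice in this sketch: a typo in a spec should fail loudly rather than silently fall back to a default.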

All of the virtualization projects I’m involved in right now – Invirt, Virtigo, and some smaller personal projects – could really benefit from this kind of infrastructure piece, which means I’m likely to attempt to write it if it doesn’t exist. And as far as I know, this kind of appliance building application doesn’t exist for Debian and Ubuntu, at the very least. I’ll admit that I know almost nothing about other Linux distributions. Do any of them get this more right?

Today Duncan Keefe, Senior Manager in Apple’s Information Systems and Technology, presented on campus about how Apple’s IT department functions.

Somewhat understandably, the talk incorporated a lot of sales pitch for Mac OS X Server, not to mention a lot about how Apple’s IT department is as awesome as the rest of Apple (which certainly seemed to be true based on the numbers we saw today). But some of the discussion on how to communicate effectively with your userbase would have been interesting for anyone who works in support, and there were a couple of interesting technical tidbits as well, including one in particular that still has me excited.

For example, did you know that Apple’s IT infrastructure is 71% based on open-source solutions? While I know as well as anyone that a lot of pieces of OS X itself are open source, they’re making use of a lot of enterprise-grade systems like SAP, which I thought would offset that number more.

Or another interesting fact: after migrating large parts of their infrastructure to Mac OS X Server and Xserves in place of Solaris, AIX, and other systems, the sysadmin-to-server ratio for OS X servers is 1:276. To compare that to the organization I know, SIPB has about 30 or 40 servers in its machine room, and there are about 20 people who have access to the machine room, not counting people who maintain servers on XVM without physical access. Now granted, Apple’s number is for maintaining a network with completely homogeneous hardware and operating systems, and our tiny farm of servers probably runs more services per server than theirs, but the idea of a single person being able to run the entire SIPB machine room is…stunning.

But the truly interesting thing that Duncan mentioned was in response to someone’s question about virtualization. He responded that Apple currently doesn’t use virtualization for their IT infrastructure. Instead, they developed an in-house app that allows them to dynamically shuffle services around their servers based on the resources those servers need. Apparently their average server utilization is 60%.

And that is the dream of the cloud – by providing an environment large enough to contain your entire enterprise, you can smooth out what would otherwise be debilitating spikes in individual services. And in particular, this is the perfect answer to server virtualization.

If you look at your average, non-virtualized, single-purpose server, it’s probably at about 10% resource utilization, which makes it hard to justify buying a new server for each individual application. Virtualization is often touted as the solution to this problem – you run a bunch of single-purpose virtual machines on a single physical host. You can take advantage of features like guest migration to balance load dynamically. But if a service doesn’t need to exist in a completely independent instance of the operating system, you’re probably losing on the operating system overhead, in terms of disk space and RAM and probably processor usage as well. I’m willing to guess that the cost could be as much as 5% or 10%, which matters when you have hundreds of systems.

By dynamically shuffling applications around instead of virtualizing a full OS for each one, you can take advantage of the economies of scale without paying that overhead. Which is just awesome. And the 60% average utilization? Also amazing. It’s just about the perfect number: high enough that your servers aren’t twiddling their thumbs, but low enough that any one server should be able to handle a sudden spike.
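
A back-of-the-envelope calculation shows why those percentages matter at scale. The sketch below assumes 300 identical services that each use 10% of a host (the 300 is an arbitrary stand-in for "hundreds of systems") and treats the guessed 5% or 10% virtualization cost as inflating each service's load by that fraction. That is one simple way to model the overhead, not a measurement. The function name `hosts_needed` is made up for this example.

```python
def hosts_needed(services, load_pct=10, target_pct=60, overhead_pct=0):
    """Hosts required to pack `services` equal workloads at `target_pct` utilization.

    All arguments are whole percentages so the arithmetic stays exact.
    `overhead_pct` inflates each workload's load (an assumed model of
    per-guest virtualization cost).
    """
    demand = services * load_pct * (100 + overhead_pct)  # in 1/10000ths of a host
    capacity = target_pct * 100                           # usable 1/10000ths per host
    return -(-demand // capacity)                         # ceiling division


if __name__ == "__main__":
    print(hosts_needed(300, target_pct=10))      # one service per host: 300
    print(hosts_needed(300))                     # packed to 60%, no overhead: 50
    print(hosts_needed(300, overhead_pct=5))     # 5% overhead: 53
    print(hosts_needed(300, overhead_pct=10))    # 10% overhead: 55
```

Under these assumptions, packing to 60% cuts 300 hosts to 50, and the guessed overhead costs three to five hosts on top of that.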

I’ve been kind of excited about this idea all day, although I can’t really think of a scenario that I could apply the concept to. MIT’s infrastructure is too heterogeneous in terms of both hardware and operating system to benefit, and all of the servers I maintain for SIPB are too specialized, or too heavily used already to benefit, or run services that are un-migratable – or some combination thereof.

But it’s fun to think about what a system like that would take to implement – you’d have to be sure to never assume that a service lived on a fixed IP address. How often do you re-balance services? Unfortunately, I was a little too busy dragging my jaw across the floor to actually ask any interesting questions while Duncan was there.
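
As a thought experiment only, the core of such a system might look something like the sketch below. Nothing here reflects how Apple's in-house tool works, since the talk gave no details. The function name `place_services`, the greedy least-loaded strategy, and the example loads are all assumptions. The sketch places the heaviest services first, always on the currently least-loaded host, and refuses any placement that would push a host past the 60% target.

```python
def place_services(loads, hosts, target=60):
    """Assign each service to the least-loaded host; return {host: [services]}.

    loads: {service_name: load as a whole percentage of one host}
    hosts: list of host names
    Raises ValueError if a service would push every host past `target`.
    """
    placement = {h: [] for h in hosts}
    used = {h: 0 for h in hosts}
    # Place the heaviest services first so the big ones are not left stranded.
    for name, load in sorted(loads.items(), key=lambda kv: (-kv[1], kv[0])):
        host = min(hosts, key=lambda h: used[h])  # ties go to the earliest host
        # If even the least-loaded host lacks room, no host has room.
        if used[host] + load > target:
            raise ValueError("no host has room for %s" % name)
        placement[host].append(name)
        used[host] += load
    return placement


if __name__ == "__main__":
    loads = {"mail": 30, "web": 25, "dns": 10, "wiki": 20}
    print(place_services(loads, ["a", "b"]))
    # {'a': ['mail', 'dns'], 'b': ['web', 'wiki']}
```

Because the result maps services to hosts by name, clients would have to look a service up in that mapping (or something like DNS fed from it) rather than hard-coding an address, which is the fixed-IP concern again. How often to re-run the placement is left open here too.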

Anyway, that was my exciting tech story for today.

© 2012 No Name Blog