After Kevin’s post on commenting, I realized that I tend to be really bad about following through with blog comment conversations.

Kevin pointed out that he’s more likely to take the discussion to zephyr, the mostly-MIT-internal chat server. In fact, Nelson started the Iron Blogger event as a way to combat the fact that we tend to have all our interesting discussions on zephyr, instead of with the rest of the world. So blogging openly but replacing “commenting” with zephyr really defeats a lot of the point.

I know that for me the biggest reason I like having discussions on zephyr is that it’s easy to have a discussion. I don’t have to go seek out replies to my commentary – they show up automatically.

On the other hand, I read blogs through an RSS reader. I don’t tend to visit sites directly. And certainly I don’t go back through a blog’s history looking for replies to my replies. This means that it’s far too easy to make a comment and never look at the comment site again.

To try to combat this, at least for my blog, I’ve installed the “Subscribe to Comments” plugin. It was really easy – the plugin automatically adds the subscription checkbox to the comments form, although I decided to move it above the comment textarea.

I’d encourage the rest of you to do the same – let’s bring the discussion, as well as the blogs, out of the MIT bubble.

Paravirtualized Clocks

In theory, Xen domUs are supposed to forcibly sync their system clocks to the dom0’s. In practice, due to some incompatibility in Ubuntu’s version of either the dom0 or domU patches, that doesn’t work even though the feature is enabled, which leads to clock drift and the occasional weird clock lockup bug.

The easiest way to fix this is to disable the Xen clock syncing entirely, and rely on the standard Linux clock mechanism. You can do that by adding these two lines just before exit 0 in /etc/rc.local:

echo '1' > /proc/sys/xen/independent_wallclock
echo 'jiffies' > /sys/devices/system/clocksource/clocksource0/current_clocksource

You’ll want to be sure to run NTP or some other service to keep your clock in sync.
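To confirm that the two rc.local lines above took effect after a reboot, you can read the same files back. This is only a minimal sketch: the two paths come from the commands above, while the helper names `read_setting` and `clock_settings_applied` are made up for illustration.

```python
from pathlib import Path

# The same files the rc.local lines write to.
WALLCLOCK = Path("/proc/sys/xen/independent_wallclock")
CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0/current_clocksource")


def read_setting(path):
    """Return the stripped contents of a kernel setting file, or None if absent."""
    try:
        return Path(path).read_text().strip()
    except FileNotFoundError:
        return None


def clock_settings_applied(wallclock=WALLCLOCK, clocksource=CLOCKSOURCE):
    """Return True only when both files hold the values the rc.local lines write."""
    return read_setting(wallclock) == "1" and read_setting(clocksource) == "jiffies"
```

Calling `clock_settings_applied()` with no arguments checks the real files; on a machine without the Xen entry in /proc it simply returns False.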

Somebody complained today that an e-mail I sent got caught in MIT’s spam filters, so I took a look at the message to see if I could figure out why it went through for me.

(For the record, using just the headers at your disposal to figure out why spam filtering is doing something strange is always a futile endeavor.)

I didn’t figure out what was going on, but then I noticed this in the headers:

Received: by 10.102.218.17 with SMTP id q17cs56702mug;
        Thu, 11 Feb 2010 09:42:54 -0800 (PST)
Received: by 10.224.59.28 with SMTP id j28mr118230qah.109.1265910085825;
        Thu, 11 Feb 2010 09:41:25 -0800 (PST)
Return-Path:

Received: from dmz-mailsec-scanner-4.mit.edu (DMZ-MAILSEC-SCANNER-4.MIT.EDU [18.9.25.15])
        by mx.google.com with ESMTP id 17si5760822qyk.35.2010.02.11.09.41.25;
        Thu, 11 Feb 2010 09:41:25 -0800 (PST)
Received-SPF: softfail (google.com: domain of transitioning prvs=165867b240=uptrack@ksplice.com does not designate 18.9.25.15 as permitted sender) client-ip=18.9.25.15;
Authentication-Results: mx.google.com; spf=softfail (google.com: domain of transitioning prvs=165867b240=uptrack@ksplice.com does not designate 18.9.25.15 as permitted sender) smtp.mail=prvs=165867b240=uptrack@ksplice.com
Received: from mailhub-dmz-1.mit.edu (MAILHUB-DMZ-1.MIT.EDU [18.9.21.41])
	by dmz-mailsec-scanner-4.mit.edu (Symantec Brightmail Gateway) with SMTP id A4.AB.13801.441447B4; Thu, 11 Feb 2010 12:41:24 -0500 (EST)
Received: from dmz-mailsec-scanner-1.mit.edu (DMZ-MAILSEC-SCANNER-1.MIT.EDU [18.9.25.12])
	by mailhub-dmz-1.mit.edu (8.13.8/8.9.2) with ESMTP id o1BHdVsa009188
	for ; Thu, 11 Feb 2010 12:41:23 -0500
X-AuditID: 1209190f-b7bbfae0000035e9-cf-4b7441445cb2
Received: from mail-qy0-f202.google.com (mail-qy0-f202.google.com [209.85.221.202])
	by dmz-mailsec-scanner-1.mit.edu (Symantec Brightmail Gateway) with SMTP id 6B.AF.10714.241447B4; Thu, 11 Feb 2010 12:41:22 -0500 (EST)
Received: by qyk40 with SMTP id 40so1272989qyk.14
        for ; Thu, 11 Feb 2010 09:41:22 -0800 (PST)
Received: by 10.229.130.205 with SMTP id u13mr94912qcs.47.1265910082497;
        Thu, 11 Feb 2010 09:41:22 -0800 (PST)
Received: from ksplice.com ([64.27.0.149])
        by mx.google.com with ESMTPS id 20sm1621806qyk.9.2010.02.11.09.41.20
        (version=TLSv1/SSLv3 cipher=RC4-MD5);
        Thu, 11 Feb 2010 09:41:21 -0800 (PST)

A lot of spew, of course, but the interesting lines are the two SPF softfails near the top: “domain of transitioning prvs=165867b240=uptrack@ksplice.com does not designate 18.9.25.15 as permitted sender”.

It took me a little while to figure out what was going on – I know that Ksplice sends its e-mails through Gmail, and I know that ksplice.com’s SPF record includes the Gmail mail servers.

But that e-mail went to a list at MIT, which then expanded to my MIT e-mail address, which then forwards to my GAFYD address. That SPF validation was performed by Gmail when it received the e-mail from MIT’s mail servers, and MIT’s mail servers aren’t authorized to send mail from ksplice.com.
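The check Gmail is doing boils down to asking whether the connecting IP is in the set of networks the sender’s domain has authorized. Here is a rough sketch of that idea. The network range in `PERMITTED_NETWORKS` is a stand-in I made up for illustration, not ksplice.com’s real SPF record, and a real SPF evaluation involves DNS lookups and several mechanisms that this ignores.

```python
import ipaddress

# Hypothetical stand-in for the networks a domain's SPF record authorizes.
# This range is illustrative only; it is not read from any real DNS record.
PERMITTED_NETWORKS = [ipaddress.ip_network("209.85.128.0/17")]


def spf_permits(client_ip, networks=PERMITTED_NETWORKS):
    """Return True if the connecting IP falls inside any authorized network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


# Direct delivery: the receiver sees the Gmail server from the headers.
print(spf_permits("209.85.221.202"))  # True

# After forwarding: the receiver sees MIT's scanner instead.
print(spf_permits("18.9.25.15"))  # False
```

The message itself is identical in both cases; only the IP that hands it to the final receiver changes, and that is the only thing this kind of check looks at.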

I’m not sure how this is avoidable for this sort of mail forwarding – MIT’s mail servers could just as easily be spammers pretending to forward mail from ksplice.com. Maybe the solution is some way for me to tell Gmail that MIT is authorized to send my mail to me. Either way, it’s just more proof that SPF doesn’t work.

Update: Anders points out that this can be solved with the “Sender Rewriting Scheme”, which basically just changes the envelope sender on the message to an address that contains an obfuscated form of the original sender’s address, but whose domain is that of the forwarder.
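To make the rewriting idea concrete, here is a simplified sketch of what a forwarder could do. It borrows the general `SRS0=hash=timestamp=domain=local@forwarder` layout, but the hash length, the placeholder timestamp, the secret, and the function name `srs_rewrite` are all my own assumptions, so the output is not meant to interoperate with a real SRS implementation.

```python
import hashlib
import hmac


def srs_rewrite(sender, forwarder_domain, secret, timestamp="TT"):
    """Build a simplified SRS-style envelope sender for a forwarded message.

    The timestamp is a fixed placeholder and the hash is a truncated HMAC;
    both are simplifications, not the encoding a real implementation uses.
    """
    local_part, _, original_domain = sender.rpartition("@")
    payload = f"{timestamp}={original_domain}={local_part}"
    # The keyed hash lets the forwarder recognize addresses it generated itself.
    digest = hmac.new(secret, payload.encode(), hashlib.sha1).hexdigest()[:4]
    return f"SRS0={digest}={payload}@{forwarder_domain}"


# Hypothetical forwarder domain and secret, for illustration only.
print(srs_rewrite("uptrack@ksplice.com", "mit.edu", b"example-secret"))
```

Because the rewritten address is in the forwarder’s domain, the receiver checks SPF against the forwarder’s record rather than the original sender’s, and bounces can still be mapped back to the original address.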

Now for part two in my ongoing series on making Xen suck less. Last time we looked at making networking work for hardware virtualized machines. Networking for paravirtualized VMs does work out of the box, but this hint might help if you’re running into performance problems.

Paravirtualized Networking Performance

If you’re running a web server or some other server that’s sending large files (or sometimes small files), you may find that your VM seems to hang inexplicably on those transfers.

For some reason, the paravirtualized Xen networking drivers advertise that they support TCP segmentation offload (TSO). In fact, they seem to pass the packets onto the wire un-segmented, which frequently causes the packets to be dropped for exceeding the MTU.
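A bit of arithmetic shows why this bites on larger transfers. The sketch below assumes a 1500-byte MTU and 40 bytes of IPv4 and TCP headers with no options; those numbers and the function names are my assumptions, not something taken from the Xen drivers.

```python
import math


def segments_needed(payload_bytes, mtu=1500, header_bytes=40):
    """Number of TCP segments needed to carry the payload within the MTU."""
    mss = mtu - header_bytes  # maximum payload per segment
    return math.ceil(payload_bytes / mss)


def exceeds_mtu(payload_bytes, mtu=1500, header_bytes=40):
    """True if sending the payload as one un-segmented packet would exceed the MTU."""
    return payload_bytes + header_bytes > mtu


# A 64000-byte chunk handed to the driver in one piece:
print(segments_needed(64000))  # 44
print(exceeds_mtu(64000))      # True

# A small response fits in a single packet either way:
print(exceeds_mtu(1000))       # False
```

When the kernel believes the device will do the segmentation, it hands over the large chunk as is; if nothing actually splits it, the oversized packet is what ends up being dropped.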

If you’re using xen-create-image, there’s a commented-out line in /etc/network/interfaces that runs ethtool -K eth0 tx off. That’s close, but it’s not quite the right setting: tx controls transmit checksum offload, not segmentation offload. What you actually want is a post-up line that turns off tso, so that your /etc/network/interfaces looks something like this:

auto eth0
iface eth0 inet static
 address 18.181.0.80
 gateway 18.181.0.1
 netmask 255.255.0.0

 post-up ethtool -K eth0 tso off
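Once the interface is back up, ethtool -k eth0 (lowercase k) lists the current offload settings, and you can check that TSO really is off. The sketch below just parses that kind of output. The sample text is typed up by hand for illustration rather than captured from a real machine, and since the exact label differs between ethtool versions, the parser accepts both the spaced and the hyphenated spelling.

```python
def tso_enabled(ethtool_output):
    """Return True or False for the TSO state, or None if no TSO line is found."""
    for line in ethtool_output.splitlines():
        name, _, value = line.partition(":")
        # Treat "tcp segmentation offload" and "tcp-segmentation-offload" alike.
        if name.strip().replace(" ", "-") == "tcp-segmentation-offload":
            return value.strip().startswith("on")
    return None


# Hand-written sample in the style of `ethtool -k eth0` output.
sample = """Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
"""

print(tso_enabled(sample))  # False
```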

Stay tuned for more hints, including how to deal with clock issues and magic sysrqs. I’ll also be pulling walkthroughs together on converting paravirtualized to hardware virtualized VMs, and how to upgrade older Ubuntu releases to more recent ones safely.

Right now, all of my Xen dom0s run Ubuntu Hardy with Xen 3.3 from hardy-backports. Before we even talk about making Xen work, that statement bears some looking at.

I use Xen for a variety of reasons. Some are historical – the Invirt Project was built on top of Xen, and migrating away from Xen to a solution like KVM or VMware would require working with users that are running paravirtualized operating systems. Some are circumstantial – I still have hardware that doesn’t support hardware virtualization.

My reasons for using Ubuntu are far less logical – I know how to use it, and don’t want to learn anything else if I don’t have to.

And I use Xen 3.3 because it’s way more stable than Xen 3.2.

In any case, if you find yourself using Xen 3.3 on Ubuntu Hardy as a dom0, there are a lot of tricks I’ve picked up for making it work better. Over the next few weeks, I’ll be working my way through them. I’ll be tagging them all with xen-tips for easy retrieval later.

As a disclaimer, I have no idea if these problems have been fixed in later versions of Xen or Linux, or if they’re specific to the Xen and/or kernel shipped by Ubuntu. For me, there’s a lot of value in getting all of my software from my distribution, so these instructions are designed to help do that.

HVM Networking

I have no idea whose fault this is, but HVM networking just doesn’t seem to work out of the box. qemu-dm, which emulates the VM’s devices, hooks the VM to a tap net device, while Xen sets up networking for a vifN.0 device. As far as I can tell, the intent was to connect the tap and vif devices, but nothing does.

For Invirt, we worked around this by writing a wrapper script around qemu-dm to make sure everything was set up correctly. If you want to use this script, you can drop qemu-dm-invirt in /usr/sbin and qemu-ifup in /etc/xen. (You’ll probably want to replace vif-invirtroute in qemu-ifup with vif-bridge or vif-route or whatever networking script you’re using.)

/usr/lib/xen/bin/qemu-dm is hard-coded to run /etc/xen/qemu-ifup, if it exists. Without the qemu-dm-invirt wrapper, though, qemu-ifup doesn’t have any access to the domain ID for the domain it’s setting up. qemu-ifup then sets up and triggers the normal Xen networking script, which repeats the same setup it did for the vifN.0 interface.

Then, in your Xen config file, be sure to set device_model = '/usr/sbin/qemu-dm-invirt'.

© 2011 No Name Blog