Perl, VMWare, and Virtual Solutions

The Krang Farm is an automated build and test system created using VMWare and Perl.


December 01, 2004
URL:http://drdobbs.com/web-development/perl-vmware-and-virtual-solutions/184416172

December, 2004: Perl, VMWare, and Virtual Solutions

Sam is a Perl programmer at the PIRT Group (http://thepirtgroup.com/) and author of Writing Perl Modules for CPAN (Apress, 2002). He can be contacted at [email protected].




Krang is an open-source content-management system that runs on a variety of operating systems, including various Linux and BSD flavors. To make Krang easy to install, we produce binary distributions for each supported platform. For example, if you're running Redhat Linux 9 with Perl 5.8.4, you would download this file to install Krang:

krang-1.020-Redhat9-perl5.8.4-i686-linux.tar.gz
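The file name encodes the Krang version, the platform, the Perl version, and the architecture. As a rough sketch, it can be taken apart with a regular expression. The pattern and the parse_dist_name() helper are assumptions inferred from the one example name above, not code from Krang itself:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a Krang binary distribution file name into its parts.
# The pattern is inferred from the example name in the text and is
# an assumption, not Krang's actual naming code.
sub parse_dist_name {
    my ($file) = @_;
    my ($version, $platform, $perl, $arch) = $file =~ m{
        ^krang-
        ([\d.]+)-          # Krang version, e.g. 1.020
        ([^-]+)-           # platform, e.g. Redhat9
        perl([\d.]+)-      # Perl version, e.g. 5.8.4
        (.+)               # architecture and OS, e.g. i686-linux
        \.tar\.gz$
    }x or return;
    return { version  => $version,
             platform => $platform,
             perl     => $perl,
             arch     => $arch };
}

my $info = parse_dist_name('krang-1.020-Redhat9-perl5.8.4-i686-linux.tar.gz');
print "$_: $info->{$_}\n" for sort keys %$info;
```

A name that does not fit the pattern makes the helper return nothing, so a caller can skip unrelated files.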

A binary distribution contains the Krang code, written in Perl, an Apache/mod_perl web server, and all the external Perl modules needed to run Krang. All you need to supply is a base operating-system installation and MySQL 4.0.13 or later. If there are other requirements on your platform, they're listed in a platform-specific README file.

As we ported Krang to more and more platforms, generating these binary distributions for each release became a real chore. Each build would have to be performed by hand on each target platform. Worse, each distribution needs to be tested to make sure Krang still works on all our platforms.

My solution is the Krang Farm (http://krang.sf.net/), an automated build and test system created using VMWare GSX Server and Perl. With the Krang Farm, I can enter a single command to build and test Krang across all our supported platforms, all on a single physical machine. In this article, I'll explain how the farm works and how you can create your own virtual farms using VMWare GSX Server and Perl.

VMWare Backgrounder

VMWare (http://vmware.com/) produces virtual-machine software. A virtual machine is basically a simulated machine that runs its own operating system. Each virtual machine has its own virtual hardware (processor, hard drive, video card, and so on), and is completely separate from the host system. The real machine that runs the VMWare software is known as the "host," while the virtual machines are "guests." Thus, the term "guest operating system" refers to the operating system running inside a virtual machine.

VMWare's software isn't an emulator like Bochs or Virtual PC. Instead, VMWare runs most guest code directly on the host's x86 processor rather than interpreting it in software. This lets the virtual machines run nearly as fast as software that runs directly on the real hardware. However, it does have one major drawback: the virtual machines are all x86 systems. VMWare won't run PowerPC or Sparc virtual machines, for example. (See the sidebar "Why Not Build a Real Farm?")

The first VMWare product I used was VMWare Workstation, which I use to run Microsoft Windows on my Linux desktop. When I need to test a web application under IE or open an Excel spreadsheet that won't work with OpenOffice, VMWare Workstation is indispensable. A coworker uses VMWare Workstation in the reverse situation: to do development on a Linux guest running under VMWare on a Windows host.

I've also used VMWare Workstation to do some software development for the Bricolage project. I set up virtual machines running a few guest operating systems supported by Bricolage: FreeBSD, Debian Linux, and Redhat Linux. I used these machines to test the Bricolage installation system while it was under development.

However, for this project, VMWare Workstation just won't work. First, it requires a local X server and won't work over a network connection. Second, it doesn't offer a scripting interface to automate operations on the virtual machines.

VMWare's entry-level server product, VMWare GSX Server, remedies both these deficiencies. It works on a headless server and provides a complete scripting system via the VMPerl and VMCOM APIs. This is the product I used to produce the Krang Farm. The rest of the article will refer to VMWare GSX Server simply as "VMWare."

VMWare can be controlled through two interfaces: the web-based VMWare Management Interface and the Virtual Machine Console rich client. Figure 1 shows the VMWare Management Interface running in the Firefox web browser. This interface lets you provision new machines, reconfigure existing ones, and start and stop machines. It also has links to download the Virtual Machine Console, which runs as a native GUI on Linux and Windows.

The Virtual Machine Console is the interface used to set up a new machine because it gives you access to the virtual console. Users of VMWare Workstation will be at home in the interface as it is virtually identical. The only significant difference is the need to log in to the server before accessing the guest systems. Figure 2 shows the Virtual Machine Console accessing a Redhat Linux 9 guest at the beginning of the installation process.

Scripting VMWare

VMWare offers two APIs—VMPerl for Perl scripting on Linux and Windows, and VMCOM for COM programming on Windows only. Since I'm primarily a Perl programmer, I went straight to the VMPerl API.

Example 1 shows a simple script called "enumerate_vms.pl" that lists all available virtual machines. Example 2 shows the result of running this script on my VMWare server.

As you can see, VMPerl is an object-oriented API. Example 1 works by creating a VMware::VmPerl::Server object and calling connect() on it. The connect() method takes a VMware::VmPerl::ConnectParams object, which I created using the new() method. Not passing any arguments to new() causes VMware::VmPerl::ConnectParams to use all the default values for hostname, port, username, and password. If this script were run on a different machine from the VMWare installation, then these values would have to be filled in. Finally, a call is made to the registered_vm_names() method on the server object. This returns a list of virtual machine identifiers, which are in fact the paths of configuration files.

The VMPerl API contains methods for starting and stopping machines, as well as a mechanism for exchanging data with programs running inside the guest operating system. Throughout, I found the API to be well designed and easy to pick up. The excellent documentation and plentiful examples that come with the software are a great help.

Time Travel

Software testing becomes much harder if the results of one test run can affect the results of the next run. It is very hard to ensure this won't happen because bugs have a way of defying your specifications. Very hard, that is, unless you can travel back in time. Imagine setting up your system in a "known-good state." Then, every time you want to run a test, you just travel back in time and run your test on your system in that state.

VMWare lets you do just that. VMWare's virtual storage system has a "nonpersistent" mode. This means that every time the virtual machine is booted, its hard drive has the contents it had when it went into nonpersistent mode, the contents of the known-good state. During the run, changes are made on disk, but nothing lasts beyond the next boot. This gives each test run a clean slate. No matter how badly it bombs, there's no way it can affect the next run.

Running Commands On Virtual Machines

So far, I've described how new machines are created and how they can be controlled programmatically via the VMPerl API. The final piece needed to run automated tests and builds on virtual machines is a way to run commands and get output from software running on the machines.

The simplest way to solve this problem is to use a network connection to control a shell session running on the virtual machine. Telnet and RSH would have worked fine, but I used SSH because most modern operating systems have SSH running after installation.

I used the Expect module (http://www.cpan.org/authors/id/R/RG/RGIERSIG/) to interact with shell sessions running inside the guest operating systems. Expect is a Perl implementation of the venerable Expect Tcl system created by Don Libes, and it operates in much the same way. The Perl implementation was written by Austin Schutz and Roland Giersig. You give Expect a command to spawn, and it handles setting up a pseudoterminal (pty) for the new process. You can then run pattern matches against the command's output and send input to the command.

Example 3 shows how to run the date command on another machine (virtual or real) via ssh. When prompted, the script provides the user's password. The script then captures the output from date and prints it. When I run this script against my Redhat Linux 9 virtual machine (conveniently named Redhat9), I see:

$ ./get_date.pl
The date on Redhat9 is Thu Jun 17 08:41:24 EDT 2004

The VMPerl API includes an alternative mechanism for communicating with software running on the guest OS, using set_guest_info() and get_guest_info(). I did not use these methods for two reasons: First, they lack sufficient flexibility to control an interactive process easily. Second, to use them, you must install the VMWare Tools software on the guest operating system. This would add an extra step to the process of setting up a new machine, one that is otherwise unnecessary.

Putting It All Together

With the VMPerl API to start and stop machines and the Expect module to run commands via ssh, the rest is, as a former boss used to say, merely a simple matter of programming.

I started by designing a configuration file format to contain a description of all the machines in the farm. The format follows the same conventions as the Apache web server, made possible by the Config::ApacheFormat module created as part of the Krang project (http://www.cpan.org/authors/id/S/SA/SAMTREGAR/). Example 4 is a single machine's configuration from the file, farm.conf. Each machine in the farm gets a <Machine> block listing the username and password used to log in to the machine, a short description used in script output, and a list of all the Perl binaries on the machine paired with the Krang builds they generate.
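The Perls directive in Example 4 is a flat list in which Perl binaries alternate with build names. The following sketch shows one way such a list could be turned into pairs. It uses only core Perl; the pair_perls() helper is hypothetical, and the way the farm's real code reads the values through Config::ApacheFormat may well differ:

```perl
use strict;
use warnings;

# The flat list of values a Perls directive might yield once the
# continuation lines are joined (values copied from Example 4).
my @perls = qw(
    /usr/bin/perl            Redhat7_3-perl5.6.1-i686-linux
    /usr/local/bin/perl5.6.2 Redhat7_3-perl5.6.2-i686-linux
    /usr/local/bin/perl5.8.3 Redhat7_3-perl5.8.3-i686-linux
    /usr/local/bin/perl5.8.4 Redhat7_3-perl5.8.4-i686-linux
);

# Hypothetical helper: turn the alternating list into
# [ binary, build ] pairs, rejecting an odd-length list.
sub pair_perls {
    my @values = @_;
    die "Perls needs binary/build pairs\n" if @values % 2;
    my @pairs;
    while (my ($binary, $build) = splice(@values, 0, 2)) {
        push @pairs, [ $binary, $build ];
    }
    return @pairs;
}

for my $pair (pair_perls(@perls)) {
    my ($binary, $build) = @$pair;
    print "$binary builds krang-VERSION-$build.tar.gz\n";
}
```

Rejecting an odd-length list up front catches a dropped build name in the configuration before any virtual machine is started.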

One thing you might have expected to see in farm.conf that isn't there is the IP address of the machine. I decided to store that information in /etc/hosts because it is convenient to be able to ssh to the machine manually to debug problems. For example, the machine in Example 4 has a corresponding entry in /etc/hosts like this:

192.168.1.10 Redhat7_3_i686

With configuration out of the way, I created a class called KrangFarm::Machine to encapsulate operations on the machines. Example 5 shows a script that will start each machine on the farm and execute the date command. Notice how KrangFarm::Machine completely abstracts interaction with the VMPerl API and the farm configuration file.

The actual scripts in the Krang Farm system, krang_farm_build and krang_farm_test, aren't much more complicated than Example 5. The build script, krang_farm_build, transfers a source tarball to each machine, runs make build and make dist, and fetches the resulting binary distribution. The test script, krang_farm_test, transfers a binary distribution to each machine, installs it, runs make test, and parses the output to determine success or failure.
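To illustrate the last step, here is a minimal sketch of deciding success or failure from captured make test output. It assumes the output ends with a Test::Harness-style summary line ("All tests successful." or "Failed N/M test scripts"). The tests_passed() helper is hypothetical, and the real krang_farm_test script may apply different rules:

```perl
use strict;
use warnings;

# Hypothetical helper: decide pass/fail from captured "make test" output.
# Assumes a Test::Harness-style summary line; the real krang_farm_test
# script may use different rules.
sub tests_passed {
    my ($output) = @_;
    return 0 if $output =~ /^Failed \d+\/\d+ test scripts/m;
    return $output =~ /^All tests successful/m ? 1 : 0;
}

# Two made-up transcripts, one for each outcome.
my $good = "t/basic....ok\nAll tests successful.\n";
my $bad  = "t/basic....FAILED test 3\nFailed 1/1 test scripts, 0.00% okay.\n";

printf "%-4s run: %s\n", $_->[0], tests_passed($_->[1]) ? 'PASS' : 'FAIL'
    for [ good => $good ], [ bad => $bad ];
```

Output with neither summary line counts as a failure, which is the safer default when a run dies before the test suite finishes.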

Building and testing on all configured platforms is as simple as:

  krang_farm_build --from-cvs && krang_farm_test --version 1.020

Plans for the Future

The Krang Farm is working well, but there's always room for improvement. Now that building and testing are automated, I plan to add a script to perform test runs automatically every night against the latest source in CVS. If the tests fail, the script will send mail to the Krang development mailing list.

Another possible enhancement would be a system to test upgrades from one version of Krang to another. Krang includes an upgrade facility that lets users move from one version of Krang to another without losing their data. Testing upgrades from an arbitrarily old version of Krang to the latest release is a time-consuming process, and automating it could help us find upgrade bugs faster.

This work may well be done by the time you read this article; drop by the Krang web site to find out, or even to lend a hand! Like Krang, the Krang Farm software is 100 percent open source.

Problems

For all the cheerleading in this article, the project wasn't without problems. One of Krang's supported platforms, Fedora Linux, isn't officially supported by VMWare. While Fedora Core 1 worked great, Fedora Core 2 ran extremely slowly under VMWare. Compiling Krang took around an hour versus 10 minutes on the rest of the machines. I eventually solved this problem by compiling a new kernel on the Fedora Core 2 machine with a lower value for HZ (100 versus the default of 1000), a tactic I found on the VMWare community message boards.

Additionally, I am unable to start machines as a nonroot user in the Virtual Machine Console. Judging by the documentation, I'm sure this is supposed to work, but I get a fatal error whenever I try it. However, starting machines using the VMPerl API as a nonroot user works fine.

TPJ


#!/usr/bin/perl -w
use strict;
use VMware::VmPerl::Server;
use VMware::VmPerl::ConnectParams;

# connect to the server using all the default settings
my $server = VMware::VmPerl::Server::new();
$server->connect(VMware::VmPerl::ConnectParams::new()) or
  die "Could not connect to server: ", ($server->get_last_error())[1];

# get a list of virtual machines
my @vm_list = $server->registered_vm_names();
die "Could not get list of VMs from server: ", ($server->get_last_error())[1]
  unless @vm_list;

# print them out
print "$_\n" for @vm_list;

Example 1: enumerate_vms.pl lists all virtual machines.


$ perl enumerate_vms.pl
/var/lib/vmware/Virtual Machines/Redhat7_3-0/Redhat7_3.vmx
/var/lib/vmware/Virtual Machines/Redhat7_3_i686/Redhat7_3_i686.vmx
/var/lib/vmware/Virtual Machines/Redhat9/Redhat9.vmx
/var/lib/vmware/Virtual Machines/Redhat9_i686/Redhat9_i686.vmx
/var/lib/vmware/Virtual Machines/Fedora1/Fedora1.vmx
/var/lib/vmware/Virtual Machines/Fedora2/Fedora2.vmx

Example 2: Output from enumerate_vms.pl.


#!/usr/bin/perl -w
use strict;
use Expect;

# connection parameters
my $SERVER = 'Redhat9';
my $USER   = 'krang';
my $PASS   = 'krang';

# spawn the date command on $SERVER running as $USER
my $spawn = Expect->spawn(qq{ssh $USER\@$SERVER date})
  or die "Unable to spawn ssh.\n";
$spawn->log_stdout(0);

# provide the password when prompted, waiting up to 5 seconds
if ($spawn->expect(5, 'password:')) {
    $spawn->send($PASS . "\n");
}

# wait for the date and print it out
if ($spawn->expect(5, -re => qr/^.*?\d{4}\r?\n/)) {
    print "The date on $SERVER is " . $spawn->match();
}

Example 3: get_date.pl gets the date via ssh.


# each machine gets a Machine block
<Machine Redhat7_3_i686>

   # a reminder of what's on this machine
   Description "Redhat 7.3 Server w/ custom Perls for i686"

   # the user and password the farm system will use to login, needs sudo
   User krang
   Password krang
   # the Perl binaries and the builds they generate
   Perls /usr/bin/perl            Redhat7_3-perl5.6.1-i686-linux \
         /usr/local/bin/perl5.6.2 Redhat7_3-perl5.6.2-i686-linux \
         /usr/local/bin/perl5.8.3 Redhat7_3-perl5.8.3-i686-linux \
         /usr/local/bin/perl5.8.4 Redhat7_3-perl5.8.4-i686-linux
</Machine>

Example 4: A single machine's configuration block.


#!/usr/bin/perl -w
use strict;
use lib '/home/sam/krang-farm/lib';
use KrangFarm::Machine;

# loop through all configured machines
foreach my $name (KrangFarm::Machine->list()) {
    my $machine = KrangFarm::Machine->new(name => $name);
    $machine->start();

    # call the date command and extract the output
    my $spawn = $machine->spawn(command => 'date');
    if ($spawn->expect(5, -re => qr/^.*?\d{4}\r?\n/)) {
        print "The date on $name is " . $spawn->match();
    }

    # stop the machine
    $machine->stop();
}

Example 5: Script that starts each machine on the farm.


Figure 1: VMWare Management Interface running in the Firefox browser.


Figure 2: Virtual Machine Console accessing a machine installing Redhat Linux 9.


Why Not Build a Real Farm?

The biggest reason to build a virtual farm rather than a real farm is cost. At my company, we can host and administrate a single machine for approximately $2400 per year. The cheapest new machine that meets our hosting requirements costs at least $3000. Thus, to build a farm of six machines costs $18,000 up-front and $14,400 per year. Each new machine will cost another $3000 plus $2400 per year.

Contrast this with the virtual farm. The machine running the Krang farm costs $4275; it's a ProLiant DL360 G3 from Hewlett-Packard with dual 2.8-GHz Intel Xeon processors and 1.5 GB of RAM. The VMWare GSX Server software costs $2500. It costs $2400 per year to host and administrate, just like any other machine on our network. Thus, the total cost to set up the virtual farm is $6775 and $2400 per year. That's already much lower than setting up six machines in a real farm. But even better, adding a new machine to the virtual farm is free. It doesn't require any new hardware or administration overhead.
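The arithmetic behind this comparison can be written out as a short script. This is only a sketch that uses the dollar figures quoted in this sidebar. The function names are made up for illustration, and the cumulative totals assume hosting costs stay flat from year to year:

```perl
use strict;
use warnings;

# Figures from the sidebar, in US dollars.
my %cost = (
    hosting_per_year => 2400,   # per physical machine
    real_machine     => 3000,   # cheapest acceptable new machine
    farm_server      => 4275,   # ProLiant DL360 G3
    gsx_license      => 2500,   # VMWare GSX Server
);

# Cumulative cost of a real farm of $machines machines after $years years.
sub real_farm_cost {
    my ($machines, $years) = @_;
    return $machines * ($cost{real_machine} + $years * $cost{hosting_per_year});
}

# Cumulative cost of the virtual farm; the number of guests does not matter.
sub virtual_farm_cost {
    my ($years) = @_;
    return $cost{farm_server} + $cost{gsx_license}
         + $years * $cost{hosting_per_year};
}

for my $years (0 .. 3) {
    printf "after %d year(s): real \$%d, virtual \$%d\n",
        $years, real_farm_cost(6, $years), virtual_farm_cost($years);
}
```

With six machines, the up-front figures match the ones above ($18,000 versus $6775), and the gap widens by $12,000 each year because the real farm pays for hosting six times over.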

Another reason to build a virtual farm is the added flexibility. Adding new machines to a real farm takes time; a new machine must be ordered and set up on the network. Adding a new machine to the virtual farm can be done in just a few hours, depending only on how long the operating-system installation takes to run.

However, a real farm has advantages over a virtual farm. First, it's likely to be faster. Because all the machines in the farm can run independently, it will scale better as machines are added. Since run-time performance isn't very important for a build and test farm, this wasn't an issue for the Krang Farm project. Second, a single machine means a single point of failure. It's much more important to have a working backup system when all your eggs are in one basket!

—S.T.
