Wednesday, January 18, 2012

Creating VirtualBox VMs command-line style

When most people run VirtualBox (vbox for short), they install the package, double-click the VirtualBox icon, and off they go. That is fine when you are running it on your desktop or on a server that has a GUI (a Windows server comes to mind, immediately followed by its OS X brethren). But what if you are connecting to a server remotely through ssh, or just do not want to run the GUI? Or what if you want vbox to start/stop the VMs automagically (say, during a reboot)?

Well, you can do that: like VMware, KVM, and Xen, vbox lets you do a lot of interesting things through the command line. Since I wrote VirtualBox in the title, I had better focus on vbox. In this episode, how about we create a little test vm from scratch?

We will use a 64-bit Ubuntu server as an example, assuming we have already downloaded the ISO for it (I also like to convert install CDs/DVDs to ISOs so I can do it all remotely). We will create the VM as user root instead of your normal user.


  1. The first step is to create a directory to put the VMs in. Because I came from Solaris, I
    tend to put stuff in /export; Ubuntu likes to put it in /srv. It is your server, so do whatever you want. Anywhoo,
    sudo mkdir -p /export/vms

    NOTE: I also like to have /export in a different partition. Well, I do like to partition the hard drive both for security and to avoid the important partitions running out of space. LVM is great here too.

  2. Define the vm
    sudo VBoxManage createvm --name "testbox" --basefolder /export/vms --register

    Output should look like this:

    Virtual machine 'testbox' is created and registered.
    UUID: 260ac334-d914-47af-842d-378275130c0f
    Settings file: '/export/vms/testbox/testbox.vbox'

    Note the vm has its own UUID; this can be used to refer to it down the line, but I rarely need it. Just be aware it is there. Now, if you go to /export/vms, you will see a new directory called testbox was created.

  3. Let's configure testbox:
    sudo VBoxManage modifyvm "testbox" --ostype "Ubuntu_64" --memory "256" \
        --hwvirtex on --nestedpaging on --cpus 1

    I am only giving it 256MB because I do not plan on running apache or
    other memory-intensive applications in it. Also, note I am only giving it 1 CPU but am telling it to use the hardware virtualization provided by my CPU; do check whether yours is enabled or not.

  4. We need to define its network card. I will set it to be an intel gigabit (since it is nicely supported by all the Linux distros I have met) and set it to bridged mode, so it will be visible in the same network as the vm host:
    sudo VBoxManage modifyvm "testbox" --nic1 bridged --nictype1 82543GC \
        --bridgeadapter1 eth0
  5. Now we need a hard drive. I think a 10GB hard drive will do; this
    is just a test. I will use the .vdi format so it will expand itself up
    to 10GB as needed. Note we can always add more drives later. Or move
    stuff around. Or even resize.
    sudo VBoxManage createhd --filename "/export/vms/testbox/drive1.vdi" \
        --size 10000 --format VDI

    We will attach it to the SATA chain.

    sudo VBoxManage modifyvm "testbox" --sata on
    sudo VBoxManage modifyvm "testbox" --sataportcount 4
    sudo VBoxManage storageattach "testbox"  \
        --storagectl SATA --port 1 --device 0 \
        --medium "/export/vms/testbox/drive1.vdi" --type hdd
  6. Remember the installation disk we downloaded as an .iso? It is time to feed it to the vm. Just to be different, let's put it in the IDE chain. I am assuming the ISO is in, say, /export/ISO.
    sudo VBoxManage storagectl "testbox" --name IDE --add ide  --controller PIIX4
    sudo VBoxManage storageattach "testbox" \
        --storagectl IDE --port 1 --device 0 \
        --medium "/export/ISO/ubuntu-12.04-desktop-amd64.iso" \
        --type dvddrive
  7. And now to define the vrdp/vrde port (say, 5050) so we can do a
    remote install. This will not work right out of the box; you must first download and install the VirtualBox Extension Pack, which is conveniently found at https://www.virtualbox.org/wiki/Downloads. Once that is installed, you can then do
    sudo VBoxManage modifyvm "testbox" --vrdeport 5050 --vrde on \
      --vrdeauthtype external
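
Going back to step 3 for a moment: one rough way to check whether the host CPU offers hardware virtualization on Linux is to look for the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo. This is only a sketch; has_hw_virt is a name made up for this example, and a flag being listed only says the CPU supports the feature, so it may still be switched off in the BIOS.

```shell
#!/bin/bash
# has_hw_virt is a hypothetical helper, not part of VirtualBox.
# It succeeds if the given cpuinfo file (default: /proc/cpuinfo)
# lists the vmx (Intel VT-x) or svm (AMD-V) CPU flag.
has_hw_virt() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    grep -Eqw 'vmx|svm' "$cpuinfo" 2>/dev/null
}

if has_hw_virt; then
    echo "CPU advertises hardware virtualization"
else
    echo "no vmx/svm flag found; --hwvirtex on will probably not help here"
fi
```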

And that should be it. Now we need to start the VM. Something like

sudo VBoxHeadless --startvm "testbox" &

should do the trick. Now, we can and should have a script to start and stop the vm and any vms we have running in the machine (say, when vm host/server is shut down/restarted), but we can talk about that later.
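
As a rough sketch of the "stop" half of such a script: VBoxManage list runningvms prints one line per running VM in the form "name" {uuid}, so the names can be pulled out and handed to VBoxManage controlvm. The helper names vm_names and stop_all_vms are made up for this example, and savestate is just one choice (acpipowerbutton would ask the guest to shut itself down instead).

```shell
#!/bin/bash
# vm_names and stop_all_vms are hypothetical helpers;
# only "list runningvms" and "controlvm" belong to VirtualBox.

# Read lines like:  "testbox" {260ac334-d914-47af-842d-378275130c0f}
# from stdin and print just the VM names, one per line.
vm_names() {
    sed -n 's/^"\(.*\)" {[0-9a-fA-F-]*}$/\1/p'
}

# Save the state of every VM currently running under the calling user.
stop_all_vms() {
    local vm
    VBoxManage list runningvms | vm_names | while IFS= read -r vm; do
        echo "Saving state of $vm"
        VBoxManage controlvm "$vm" savestate
    done
}

# Example: show what the parser extracts from a sample line.
echo '"testbox" {260ac334-d914-47af-842d-378275130c0f}' | vm_names
```

Since we created the VM as root, stop_all_vms would have to run as root too; VMs registered by one user are not listed for another.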

You probably noticed you would be happier writing a script to do all the steps we did above; we shall talk about that in another episode.
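
In the meantime, here is a minimal sketch of how such a script could start out. It only wraps steps 2 and 3, with the same options used above; the vbox wrapper, the create_vm function, and the DRY_RUN variable are inventions for this example. With DRY_RUN=1 (the default here) it prints the commands instead of running them.

```shell
#!/bin/bash
# DRY_RUN, vbox and create_vm are hypothetical names for this sketch.
DRY_RUN="${DRY_RUN:-1}"   # default to printing; set DRY_RUN=0 to really run

# Run VBoxManage with the given arguments, or just print the command.
vbox() {
    if [[ "$DRY_RUN" == "1" ]]; then
        echo "VBoxManage $*"
    else
        VBoxManage "$@"
    fi
}

# create_vm NAME BASEFOLDER MEMORY_MB: steps 2 and 3 from the list above.
create_vm() {
    local name="$1" basefolder="$2" memory="$3"
    vbox createvm --name "$name" --basefolder "$basefolder" --register || return 1
    vbox modifyvm "$name" --ostype Ubuntu_64 --memory "$memory" \
        --hwvirtex on --nestedpaging on --cpus 1
}

create_vm testbox /export/vms 256
```

Run for real, it would be something like sudo DRY_RUN=0 ./create_vm.sh, where the file name is whatever you saved it as.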

Friday, January 13, 2012

Poor Man's LDAP Replication checker (Kerberos Involved!)

Warning: this article is a bit short, perhaps even curt. As such it might brush over too many concepts and assume you know how to set up replication in ldap. You have been warned!

If you are using ldap, you probably want to have more than one ldap server. You know, one dies and the other takes over kinda thing. Now, that involves some means to keep the ldap databases in the different ldap servers in sync. Currently the preferred method is called syncrepl, and you can find info on setting it up online (I might add some thoughts on that later). Problem is you need to make sure they are all in sync.

One way to do so is by monitoring the contextCSN. So if you have two ldap servers, master.domain.com and slave.domain.com (as the names imply, one master and one slave), after you start replication you could go to the master and run:

ldapsearch -z1 -LLLQY EXTERNAL -H ldapi:/// -s base contextCSN
dn: dc=domain,dc=com
contextCSN: 20120113185836.364944Z#000000#000#000000

Then you would go to the slave and run the same command. And then compare the output, namely the funny number after contextCSN:, between the two. If they match, all should be well. If not, time to go check the log files in the two machines; depending on how you set it up, that would mean starting with auth.log and syslog.
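
The funny number is less opaque than it looks. As far as I understand the OpenLDAP format, it is four fields separated by #: a timestamp, a change count, a server ID, and a modification number. The sketch below assumes that layout; csn_time and csn_sid are helper names made up for this example, and the two values are hard-coded samples rather than live query results.

```shell
#!/bin/bash
# csn_time and csn_sid are hypothetical helper names for this sketch.
# Assumed CSN layout: timestamp#change-count#server-id#modification-number

# Print the timestamp part of a contextCSN value (everything before the first #).
csn_time() {
    local csn="$1"
    echo "${csn%%#*}"
}

# Print the server ID part (third #-separated field).
csn_sid() {
    local csn="$1"
    echo "$csn" | cut -d'#' -f3
}

# Sample values; in practice these come from ldapsearch on each server.
master_csn='20120113185836.364944Z#000000#000#000000'
slave_csn='20120113185836.364944Z#000000#000#000000'

if [[ "$master_csn" == "$slave_csn" ]]; then
    echo "in sync as of $(csn_time "$master_csn")"
else
    echo "out of sync: master $(csn_time "$master_csn"), slave $(csn_time "$slave_csn")"
fi
```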

Now that we know what we need, what if we made a little script to compare these values between all the ldap servers you have (that are replicating)? We should be able to query those ldap servers for contextCSN from any machine that can connect to them, and then it is just a matter of comparing the answers. In the following script, let's assume we have 3 ldap servers: one master and two slaves.

#!/bin/bash

# Uncomment to use the host's Kerberos ticket cache
# export KRB5CCNAME=/tmp/host.tkt
LDAPs=( master.domain.com slave1.domain.com slave2.domain.com )
LDAP_NUMBER=${#LDAPs[@]}
ldap_reply[0]=""

# Ask each server for its contextCSN and store the value in ldap_reply
function getLDAPinfo()
{
   local i
   for ((i=0; i < LDAP_NUMBER; i++))
      do
         ldap_reply[$i]=$(ldapsearch -z1 -LLLQ -H "ldap://${LDAPs[$i]}" -s base contextCSN | grep contextCSN | awk '{ print $2 }')
         echo "${ldap_reply[$i]}"
   done
}

# Compare every pair of servers (i against each j > i)
function checkSyncStatus()
{
   local i
   local j
   for ((i=0; i < LDAP_NUMBER - 1; i++))
      do
         echo -n "$i x "
         for ((j=i + 1; j < LDAP_NUMBER; j++))
            do
               echo -n " $j"
               if [[ "${ldap_reply[$i]}" != "${ldap_reply[$j]}" ]]; then
                  echo -n "(Bad)"
               else
                  echo -n "(Ok) "
               fi
         done
         echo
   done
}

echo "Number of LDAP servers: ${LDAP_NUMBER}"
getLDAPinfo
checkSyncStatus

Note the arguments for ldapsearch might change a bit (you might need a -x -Z or whatever; you know what you need to do to run ldapsearch in your environment). If the three machines are in sync, when you run the above code, the output should look something like:

Number of LDAP servers: 3
20120113185836.364944Z#000000#000#000000
20120113185836.364944Z#000000#000#000000
20120113185836.364944Z#000000#000#000000
0 x  1(Ok) 2(Ok)
1 x  2(Ok)

Now, the code is not complete, and that is for a reason: I really wanted to show what it is doing. The echo "Number of LDAP servers: ${LDAP_NUMBER}" is there just to verify the number of ldap servers: we have 3, and in the LDAPs array they would be [0], [1], and [2]. It should be commented out or removed in the production version of the script. The echo statement in getLDAPinfo() is there just to show the output of the ldapsearch command (and how to get it) so you can see the values all match; you can take it out too. The same goes for the echo statements in checkSyncStatus(): they show which contextCSN values we are comparing and whether they match ((Ok)) or not ((Bad)). Note we are checking not only the master against each slave but also each slave against the others. It is a bit overkill, but why not?

As I said, this code is incomplete; what you would need to do, after removing/commenting out the echo statements, is to decide how to use this information. What I have done is when ${ldap_reply[$i]} != ${ldap_reply[$j]}, it then writes down a message saying the contextCSN values for LDAPs[$i] and LDAPs[$j] do not match in an email that is then sent to me. Maybe you want to do something else, but you get the idea.
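
A minimal sketch of that reporting step follows. The build_report function and the sample values are made up so the example runs on its own; in the real script LDAPs and ldap_reply would be filled in by getLDAPinfo. The mail line is left commented out because it assumes a working mail command and a real address.

```shell
#!/bin/bash
# build_report and the sample values below are hypothetical.
LDAPs=( master.domain.com slave1.domain.com slave2.domain.com )
ldap_reply=( "20120113185836.364944Z#000000#000#000000"
             "20120113185836.364944Z#000000#000#000000"
             "20120113170000.000000Z#000000#000#000000" )

# Print one line for every pair of servers whose contextCSN differs.
build_report() {
    local i j n=${#LDAPs[@]}
    for ((i = 0; i < n - 1; i++)); do
        for ((j = i + 1; j < n; j++)); do
            if [[ "${ldap_reply[$i]}" != "${ldap_reply[$j]}" ]]; then
                echo "contextCSN mismatch: ${LDAPs[$i]} (${ldap_reply[$i]}) vs ${LDAPs[$j]} (${ldap_reply[$j]})"
            fi
        done
    done
}

report=$(build_report)
if [[ -n "$report" ]]; then
    echo "$report"
    # To email it instead (assumes mail(1) works and the address exists):
    # echo "$report" | mail -s "LDAP replication mismatch" admin@domain.com
fi
```

With the sample values, slave2 is behind, so both the master/slave2 and slave1/slave2 pairs get reported.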

The only missing step now is to create a cron job to call this script every so often.

Ok, smart guy, you might say, what about the kerberos part you mentioned in the title? Well, if you go back to the script, you will notice a line containing KRB5CCNAME=/tmp/host.tkt commented out. We authenticate ldap access against kerberos. Also, since each machine has its own kerberos principal and keytab, we use it to create a kerberos cache named /tmp/host.tkt:

FQDN=$(hostname -f)
sudo kinit -k -t /etc/krb5.keytab -c /tmp/host.tkt "host/$FQDN@DOMAIN.COM"

which is owned by the root user and used by different services in each ldap/kerberos client. Well, if it is there, we might as well use it, right?
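
Concretely, using it means exporting KRB5CCNAME before the ldapsearch calls so they pick up that cache. Here is a hedged sketch, assuming a klist that supports -s (which reports a valid ticket only through its exit status) and an ldapsearch built with SASL/GSSAPI support; have_valid_ticket is a name made up for this example.

```shell
#!/bin/bash
# have_valid_ticket is a hypothetical helper name.
# Succeeds only if the given credential cache holds a usable ticket.
have_valid_ticket() {
    local cache="$1"
    KRB5CCNAME="$cache" klist -s 2>/dev/null
}

export KRB5CCNAME=/tmp/host.tkt

if have_valid_ticket "$KRB5CCNAME"; then
    echo "ticket cache $KRB5CCNAME looks usable"
    # ldapsearch picks the cache up from KRB5CCNAME, e.g.:
    # ldapsearch -z1 -LLLQ -Y GSSAPI -H ldap://master.domain.com -s base contextCSN
else
    echo "no valid ticket in $KRB5CCNAME; run kinit first" >&2
fi
```

Since the cache is owned by root, the checker script (and its cron job) would have to run as root as well, or use a cache of its own.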

Thursday, January 12, 2012

getting getent passwd of members of other ldap groups

Usually when you deploy ldap you also want to make sure that getent passwd shows not only local users but also the ldap users. This is usually done in nslcd.conf if you are using nss-pam-ldapd, and might look something like this (shortened a bit for the sake of brevity):

# /etc/nslcd.conf
# nslcd configuration file. See nslcd.conf(5)
# for details.

# The user and group nslcd should run as.
uid nslcd
gid nslcd

uri ldap://ldap-thingie.domain.com
base dc=domain,dc=com
# [...]
# Customize certain database lookups
base   passwd   ou=people,dc=domain,dc=com
base   group    ou=groups,dc=domain,dc=com
# [...]
scope  passwd   one
scope  group    one
scope  netgroup one
scope  networks one

Now, let's say for whatever reason you want to create another group, say, vegetables, whose members are not people (we will discuss the metaphysical implications of that some other time):

dn: cn=vegetables,ou=groups,dc=domain,dc=com
objectClass: posixGroup
cn: vegetables
gidNumber: 2424

Since vegetables belongs to ou=groups, you will see it if you do, say, getent group vegetables. So, let's add a member to that group, say swampthing:

dn: uid=swampthing,ou=vegetables,dc=domain,dc=com
uid: swampthing
cn: Swamp Thing
givenName: Swamp 
sn: Thing
objectClass: inetOrgPerson
objectClass: posixAccount
loginShell: /usr/bin/treebark
uidNumber: 1995
gidNumber: 2424
homeDirectory: /home/swamp
gecos: Swamp Thing
mail: swampthing@domain.com

When you try to look for it using getent passwd swampthing, it will not show up. But doing a quick ldapsearch -x "(objectClass=posixAccount)" will find our green fellow. What is going on here? Well, look back at our nslcd.conf at the top of this article. base passwd ou=people,dc=domain,dc=com really translates to "hey man! If you are looking for a user (someone under passwd) in ldap, check ou=people,dc=domain,dc=com!" Problem is, Mr. Thing is not in ou=people! So we should tell nslcd that looking for a user in ou=vegetables is ok too:

base   passwd   ou=people,dc=domain,dc=com
base   passwd   ou=vegetables,dc=domain,dc=com
base   group    ou=groups,dc=domain,dc=com

And now, when we try getent passwd swampthing again or even id swampthing, we will get info on Mr. Swamp.
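
A quick way to script that check, assuming a system with getent; user_resolves is a helper name made up for this example. Keep in mind that nslcd needs a restart to pick up the edited nslcd.conf (how you do that depends on your distro), and if nscd is running, its cache may still hold the old "not found" answer for a while.

```shell
#!/bin/bash
# user_resolves is a hypothetical helper name.
# Succeeds if the name service switch (local files, ldap, ...) knows the user.
user_resolves() {
    getent passwd "$1" > /dev/null
}

for user in root swampthing; do
    if user_resolves "$user"; then
        echo "$user: found"
    else
        echo "$user: not found"
    fi
done
```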

Of course, I picked a rather silly name and group for this example, but there is nothing stopping the ou from being, say, ou=services and the user mysql-backup. Does that give you evil ideas?
