Thursday, December 31, 2015

Remotely controlling ESXi with bash and without Ansible or Windows

I would expect the more knowledgeable ESXi admins amongst you will rightfully mention the proper way to run a remote command on an ESXi box is using the vSphere (Web) Client or the vSphere Command-Line Interface. I agree completely, which is why I will ignore the advice and see what else I can use.

Shattering a Dream

Just to let you know, I really really REally wanted to use Ansible here. In fact, this was supposed to be a post on how to use Ansible to shut down an ESXi server, saving the running vm clients, and then restore those clients after the server was back up. I spent time laying things out but could not get it working. I will not go into details, but at first I thought my issues were due to ESXi not having python-simplejson. I then learned that assumption was wrong: Python in ESXi has json in it.

I will try again some other time. For now, I must put my pride away, accept I cannot be one of the cool kids running Ansible for everything, and see what I can do with what I have.

Back to the Drawing Board

Let's start by getting away from the computer (figuratively speaking, unless you printed this) and thinking about what the goals of this mess are:

  1. Connect to the ESXi-based vm host, who henceforth shall be named vmhost2.
  2. Tell vmhost2 to save all the running vm clients, preserving their state.
  3. Shut vmhost2 down.
  4. Connect back to vmhost2 after it is back up and running, or when we feel like.
  5. Tell vmhost2 to resume all those saved vm clients.
We already wrote the script to do the suspending and resuming business a while ago, so two of those goals have been taken care of.

My code usually starts as a (Linux) shell script unless that is not appropriate for the task (say I need a Windows-only solution), so let me do some thinking aloud in shell before going fancy.

So, we need to connect to vmhost2 in a somewhat secure way. How about ssh? We can create a key pair using ssh-keygen; I would suggest making it at least 4096 bits long. Now we just need to copy the public key to vmhost2. I will break some security principles and say we will connect to vmhost2 as root; adjust this as needed. With that in mind, we put the key in /etc/ssh/keys-root/authorized_keys, which probably behaves like your garden-variety authorized_keys file, i.e. command restrictions like we mentioned in a previous article should work. So, let's test it by connecting to vmhost2 and seeing what junk we have in the root dir:

[raub@desktop ~]$ ssh -C -tt -o IdentityFile=keys/vmhost2 -o User=root vmhost2 ls
altbootbank      lib64            sbin             var
bin              local.tgz        scratch          vmfs
bootbank         locker           store            vmimages
bootpart.gz      mbr              tardisks         vmupgrade
dev              opt              tardisks.noauto
etc              proc             tmp
lib              productLocker    usr
Connection to vmhost2 closed.
[raub@desktop ~]$
I think we can use that. Some of you might have noticed we are being a bit explicit with our options, using -o SomeOption when a shortcut would do, but we should make it easier for us to see what is going on, especially if we plan on coding that. And we shall: here is a Bourne (!) shell version of what we just did:
#!/bin/sh

user=root
host="vmhost2"
KEY="keys/$host"
what="/bin/sh -c 'hostname' && /bin/sh -c 'date' && /bin/sh -c 'uptime'"

ssh -C -tt -o IdentityFile="$KEY"  -o PasswordAuthentication=no \
    -o PreferredAuthentications=publickey \
    -o ConnectTimeout=10 \
    -o User=$user $host \
    $what

Yes, we told it to run a few more commands while it was connected, but the principle is the same: we can send commands to the ESXi box through ssh. Now, those of you who are well-versed with ssh could not be remotely impressed even if you tried really hard. And I agree; I am just laying down the building blocks. If you have been in this blog before, you would know that is how I roll.

Note: I will be honest with you and say sending out commands is great, but we need to get replies for this to really be useful. Later on we will revisit this; for now assume it is all rainbows and unicorns.

I think we should stop messing around and switch to Bash so we can use functions. Maybe something like this?

#!/bin/bash

send_remote_command ()
{
    # Reading function parameters in bash is not fun
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local command="$4"

    ssh -C -tt -o IdentityFile="$keyfile"  -o PasswordAuthentication=no \
        -o PreferredAuthentications=publickey \
        -o ConnectTimeout=10 \
        -o User=$username $hostname \
        $command
}

# Let's try our little function out
user=root
host="vmhost2"
KEY="keys/$host"
what="/bin/sh -c 'hostname' && /bin/sh -c 'date' && /bin/sh -c 'uptime'"

send_remote_command $host $user $KEY $what

As others have pointed out, passing parameters to a bash function is not fun. No matter, let's try it out:

[raub@desktop esxisave]$ ./esxitest.sh
~ # whoami
Er, that was not supposed to happen. Did I just connect to the vmhost2?
~ # whoami
/bin/sh: whoami: not found
~ # pwd
/
~ # hostname
vmhost2.example.com.
~ # exit
Connection to vmhost2 closed.
[raub@desktop esxisave]$
What's going on?
[raub@desktop esxisave]$ bash -x esxisave.sh
+ user=root
+ host=vmhost2
+ KEY=keys/vmhost2
+ what='/bin/sh -c '\''hostname'\'' && /bin/sh -c '\''date'\'' && /bin/sh -c '\''uptime'\'''
+ send_remote_command vmhost2 root keys/vmhost2 /bin/sh -c ''\''hostname'\''' '&&' /bin/sh -c ''\''date'\''' '&&' /bin/sh -c ''\''uptime'\'''
+ local hostname=vmhost2
+ local username=root
+ local keyfile=keys/vmhost2
+ local command=/bin/sh
+ ssh -C -tt -o IdentityFile=keys/vmhost2 -o PasswordAuthentication=no -o PreferredAuthentications=publickey -o ConnectTimeout=10 -o User=root vmhost2 /bin/sh
~ # exit
Connection to vmhost2 closed.
[raub@desktop esxisave]$
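Duh! Quoting. We called the function with an unquoted $what, so bash split it into words and $4 ended up being just /bin/sh; that is why we landed in an interactive shell. All that nested /bin/sh -c business is not needed either; I think we are overthinking this.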
#!/bin/bash

send_remote_command()
{
    # Reading function parameters in bash is not fun
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local command="$4"

    ssh -C -tt -o IdentityFile="$keyfile"  -o PasswordAuthentication=no \
        -o PreferredAuthentications=publickey \
        -o ConnectTimeout=10 \
        -o User=$username $hostname \
        $command
}

# Let's try our little function out
user=root
host="vmhost2"
KEY="keys/$host"
what="hostname && date && uptime"

send_remote_command $host $user $KEY "$what"
So we run it again:
[raub@desktop esxisave]$ ./esxitest.sh
vmhost2.example.com.
Sat Sept 26 05:05:00 UTC 2015
 05:05:00 up 04:50:20, load average: 0.01, 0.01, 0.01
Connection to vmhost2 closed.
[raub@desktop esxisave]$

Much better. Now we could add other functions to our script; first thing that comes to mind is making the script able to upload a file. You see, a while ago we wrote a script to suspend and resume vm clients; we might as well find a way to upload it. Since we are using ssh, it would make sense to then ask scp for help. And we shall.
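
Before we move on, here is the quoting trap from above in miniature. This is just a sketch: count_args is a made-up helper that is not part of esxitest.sh; all it does is report how many parameters it was handed and what the first one looks like.

```shell
#!/bin/bash
# count_args is a hypothetical helper for this demo only: it reports how many
# parameters the function received and what ended up in the first one.
count_args()
{
    echo "$# parameter(s); first one is: $1"
}

what="hostname && date && uptime"

# Unquoted: the shell splits $what on whitespace before calling the function
count_args $what

# Quoted: the whole string travels as one single parameter
count_args "$what"
```

The first call reports 5 parameters and the second just 1, which is why the unquoted call left our function with only /bin/sh in $4.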

Let's put a file over there

We have established we can remotely execute commands in the ESXi box vmhost2 without adding any extra packages to our desktop. In the last paragraph we promised we would show how to upload our old script to shut down vmhost2; we better deliver it, and we shall do it using scp if for no other reason than we hinted we were going to use it. Now, based on what we have already tested using ssh, to upload the script we might do something like

scp -i keys/vmhost2.example.com files/save_runningvms.sh root@vmhost2:/var/tmp/save_runningvms.sh

For now we will be using /var/tmp/ since it tends to survive a reboot but it is still a temp directory; I am not ready yet to start cluttering vmhost2's OS drive. We could modify esxitest.sh to include uploading support, say

#!/bin/bash

send_remote_command()
{
    # Reading function parameters in bash is not fun
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local command="$4"
 
    ssh -C -tt -o IdentityFile="$keyfile"  -o PasswordAuthentication=no \
        -o PreferredAuthentications=publickey \
        -o ConnectTimeout=10 \
        -o User=$username $hostname \
        $command
}

send_file()
{
    # Reading function parameters in bash is still not fun
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local localfile="$4"
    local remotefile="$5" # a bit of a misnomer; it is more like the full path

    scp -i "$keyfile" "$localfile" "$username@$hostname:$remotefile"
}

# Let's try our little function out
user=root
host="vmhost2.in.kushana.com"
KEY="keys/$host"
what="hostname && date && uptime"

send_remote_command $host $user $KEY "$what"

# Now we send a file
sourcepath="files/save_runningvms.sh" # Yes, relative path because we can
destinationpath="/var/tmp/save_runningvms.sh" # Absolute path when it makes sense

send_file $host $user $KEY $sourcepath $destinationpath

Let's try it out: First we will check if there is nobody on /var/tmp

[root@vmhost2:~] ls /var/tmp/
sfcb_cache.txt
[root@vmhost2:~]

There is somebody there but nobody relevant to this article. So let's run the new script:

[raub@desktop esxisave]$ ./esxisave.sh 
vmhost2
Fri Jan  1 05:04:47 UTC 2016
  5:04:47 up 64 days, 22:37:34, load average: 0.01, 0.01, 0.01
Connection to vmhost2.in.kushana.com closed.
save_runningvms.sh                            100%  918     0.9KB/s   00:00    
[raub@desktop esxisave]$ 

Did you notice the very last line? It sure looks very scpish to me. Were all of our efforts in vain, or do we now have a new file in /var/tmp?

[root@vmhost2:~] ls /var/tmp/
save_runningvms.sh  sfcb_cache.txt
[root@vmhost2:~] 

Of course we could have loaded the file and then remotely set its permissions and executed it, all from our little bash script. That hopefully gives you lots of ideas of where to go next.
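
To make that idea a bit more concrete, here is a minimal sketch of it. The two building blocks are the same as before (with the variables quoted); upload_and_run is a hypothetical helper I am making up for this example, and I am assuming the remote side lets us run scripts out of /var/tmp.

```shell
#!/bin/bash

send_remote_command()
{
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local command="$4"

    ssh -C -tt -o IdentityFile="$keyfile" -o PasswordAuthentication=no \
        -o PreferredAuthentications=publickey \
        -o ConnectTimeout=10 \
        -o User="$username" "$hostname" \
        "$command"
}

send_file()
{
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local localfile="$4"
    local remotefile="$5"

    scp -i "$keyfile" "$localfile" "$username@$hostname:$remotefile"
}

# upload_and_run is a hypothetical helper (not in the original script):
# upload a script, make it executable, run it, and hand back the exit
# status that ssh reports for the remote command.
upload_and_run()
{
    local hostname="$1"
    local username="$2"
    local keyfile="$3"
    local localfile="$4"
    local remotefile="$5"

    # Do not bother connecting again if the upload failed
    send_file "$hostname" "$username" "$keyfile" "$localfile" "$remotefile" \
        || return 1

    send_remote_command "$hostname" "$username" "$keyfile" \
        "chmod 700 '$remotefile' && '$remotefile'"
}
```

We would call it as upload_and_run vmhost2 root keys/vmhost2 files/save_runningvms.sh /var/tmp/save_runningvms.sh. Keep in mind the uploaded script will do whatever it does when run without arguments, so try this with something harmless first.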

That is fine, but why can't I run this script by passing the parameters from the command line with options so it knows when I want to send a command or a file? And why doesn't it process the return messages? Do you also want me to give you some warm Ovomaltine and fluff your pillow? Sheesh! I am just giving a basic example, a building block to solve whatever task we are talking about in a given article; you can embellish it later. I am not here to hold your hand but (hopefully) set you in the right direction.
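
That said, here is one possible direction, as a sketch only. It uses getopts to pick between sending a command and sending a file; the option letters, the usage text, and the run_cli name are all made up for this example and are not part of the original script.

```shell
#!/bin/bash

send_remote_command()
{
    ssh -C -tt -o IdentityFile="$3" -o PasswordAuthentication=no \
        -o PreferredAuthentications=publickey \
        -o ConnectTimeout=10 \
        -o User="$2" "$1" \
        "$4"
}

send_file()
{
    scp -i "$3" "$4" "$2@$1:$5"
}

usage()
{
    echo "Usage: $0 -H host [-u user] [-k keyfile] (-c command | -f localfile -d remotefile)" >&2
}

# run_cli is a hypothetical front end: parse the options, then call one of
# the two building blocks. Its return value is whatever ssh or scp returned.
run_cli()
{
    local host="" user="root" key="" cmd="" src="" dst="" opt
    OPTIND=1
    while getopts "H:u:k:c:f:d:" opt; do
        case "$opt" in
            H) host="$OPTARG" ;;
            u) user="$OPTARG" ;;
            k) key="$OPTARG" ;;
            c) cmd="$OPTARG" ;;
            f) src="$OPTARG" ;;
            d) dst="$OPTARG" ;;
            *) usage; return 2 ;;
        esac
    done

    [ -n "$host" ] || { usage; return 2; }
    # Same convention as before: the key is named after the host
    [ -n "$key" ] || key="keys/$host"

    if [ -n "$cmd" ]; then
        send_remote_command "$host" "$user" "$key" "$cmd"
    elif [ -n "$src" ] && [ -n "$dst" ]; then
        send_file "$host" "$user" "$key" "$src" "$dst"
    else
        usage
        return 2
    fi
}

# Only do something when the script was given arguments
if [ "$#" -gt 0 ]; then
    run_cli "$@"
fi
```

Something like ./esxitest.sh -H vmhost2 -c "hostname && date && uptime" would then send a command, and echo $? right after it shows the exit status ssh passed back from the remote side (or 255 if ssh itself could not connect).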

Tuesday, December 22, 2015

Joining two circles/cylinders in openscad 1: Geometry

Warning!

This article has more to do with geometry than IT in general. You might possibly be scared of it, and I think that repulsion to math in general and geometry specifically is a valid reaction, so I would suggest you skip to the next posting. Or you might want to go on if just for the pretty pictures. That suits me fine also; you are the decider here.

You have been warned!

Recently I decided to learn how to use openscad. As the name kinda implies, it is an (open source) program to build 3D solids. The main difference from the usual suspects is that it is not interactive: you create a file using your favourite text editor (or sed for that matter) and then compile it. That might sound a bit annoying but it can be quite convenient if you are doing engineering objects, as you can define the different objects that make it up as separate functions, which can even become libraries. However, it does not have some features you expect to find in such programs, like filleting. They are still working on providing an official solution to that, but as of right now it ain't there.

Now, sometimes you can go around these limitations by using a bit of geometry. Let me show what I mean by using a simple example: Let's say I want to build a bracket that will hold two pipes together (think chainlink fence gate) that looks like this sketch:

For now we will ignore the pipe holes and look at the external shape of the object. Since we are going to have two circles internally, it would be nice if the ends had a circly feel to them. So, we shall start with two circles around the pipe holes. To make it look pretty, and easy when we do get to build it, how about if the pipe holes are 90% in diameter of those circles we are using to make the body? You know, something like the picture (done in inkscape without doing any justice to its potential) on the right.

So we have the ends of the body, but what about the sides? We need to connect the two ends somehow.

The easiest thing I can think of is to use those lines representing the radius of each circle, which happen to be both vertical. We could just connect them on each side (showing just the upper side to keep figure clean and since the object will be symmetric with respect to the horizontal, henceforth called X, axis).

I don't know about you, but even though it might work around the larger (right) circle, it would look odd on the smaller (left) circle: the surface would become parallel to the X axis just before going vertical and then it would make a sharp turn and start moving away from this axis. It just looks like it was put together in a hurry, and I think we can do better.

And how do you plan on doing that? you might ask. Or not. But this is my article so I will answer that question. Our goal is to have the side transition gently away from the circle. I think we are looking for some kind of curve that starts tangent to one circle and then, without much ado, ends tangent to the other circle. Kinda like the picture on the left, with some simple gentle curve in the middle. That does look pretty, but it requires us to determine where it will move away from each circle. In other words, we might need to do some fiddling.

I am lazy, so I would like to keep any fiddling to a minimum. Therefore, I want a solution that makes more work for the computer and less for me: a curve for which I provide as few parameters as possible and that still ends up nice and pretty. Euclidean Geometry has the answer: the easiest way to come up with a curved surface that touches both circles tangentially with the minimal amount of input is to use a third circle.

Here is what I have in mind in a drawing (most of the green circle has been omitted so we would not focus on it): the green circle touches each of the other two circles tangentially as we want. I know the curve in the drawing does not look as nice as the one in the previous drawing, but where it touches and the shape of the curve it creates between those points is determined solely by its radius. Specifically, the bigger the green circle is the closer the line between the blue circles will be to a straight line. The other required information is determined by the two pipes we mentioned before, namely:

  1. Radius of the two blue circles
  2. Distance between the two blue circles.

So far so good, thanks to geometry. Let's see if we can continue relying on it and come up with equations that will make our life lazier. First thing we will need is some lines connecting the centers of the three circles together (as I mentioned before, I am ignoring the other side due to the symmetry the object has). As to be expected, that defines a triangle, which in our little drawing is represented by the black triangle. Its shape is arbitrary since we did not specify any relationship between the radii of the 3 different circles; we will revisit that later.

Before we continue, we must do some labeling. So, the centers of the 3 circles are called A, B, and C, which as we know are the vertices of the triangle ABC. The sides of the triangle are labelled a, b, and c, where b is opposite to B and so on. We can also name each circle after its center, so circle C is the green circle. The distance between C and side c is h.

From the picture we can also figure out how the radius of each circle is named, and that a passes through the point where circles B and C touch (and b through the point where circles A and C touch). Finally, h divides c into the segments c1 and c2 and creates two right triangles. With those parameters defined, we need to start defining a few equations.

First we restate the relationship between the sides a, b, and c and known quantities:

Well, we really do not know c1 or c2, but can find them by applying Pythagoras on the left triangle:

Since we have c2, we can find c1. So, now we can locate the circle C. Next time we will put all these findings in openscad and then do some doodling.
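
As a worked sketch of those relationships, assume the radii are called r_A, r_B, and r_C, that the green circle touches both blue circles from the outside, that c is the distance between the centers A and B, and that c_1 is the piece of c next to A while c_2 is the piece next to B. All of those names and choices are assumptions made for this sketch; the drawing may label things differently.

```latex
\begin{align*}
a &= r_B + r_C, \qquad b = r_A + r_C, \qquad c = c_1 + c_2 \\
h^2 &= b^2 - c_1^2 = a^2 - c_2^2 \\
c_1 &= \frac{c^2 + b^2 - a^2}{2c}, \qquad c_2 = c - c_1 \\
h &= \sqrt{b^2 - c_1^2}
\end{align*}
```

The third line comes from replacing c_2 with c - c_1 in the second line and solving for c_1. With the two blue radii, the distance c, and whichever r_C we pick, c_1 and h then locate the center of circle C relative to A.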

Tuesday, November 24, 2015

Listing software versions in SCCM

And here is another windows-related posting. So I needed to look for which Windows computers had a given program and which version they had installed. Since over here they use SCCM, I thought it might be something right up its alley. And it was.. after a fashion.

So, under \Monitoring\Overview\Queries it has a query called "All Systems with Specified Product Name and Version,"


which when you run it asks for the, well, product name and version and then returns the hosts that match that. So, when you click it, it pops a window

Where you enter the product name. Once you press "OK" then it asks for the product version. After that it should run and spit out the hosts matching the criteria and some other info, which we do not know what yet since we are good kids and did not look under the skirt.

Thing is, it returned nothing.

Hmmm, maybe I did not enter the proper data. So how about if we pick a program that we know to be installed in at least some of the machines? First we need to see what it looks like under SCCM:

What we are interested in are

  • Product Name = Mozilla Firefox 41.0.1 US(x86 en US)
  • Product Version = 41.0.1

Don't ask me why Mozilla decided to put the Product Version in the Product Name; I've asked them before and never had an answer. Just between you and me, it really does annoy me because it caused some issues deploying packages using puppet. That said, I think we should save that conversation for another time.

I think we have been good children for long enough. Time for some skirt pulling! And here is what the query looks like (I did some pretty formatting so we can see what is going on):

select distinct 
   sys.Name, 
   sys.SMSAssignedSites, 
   sys.OperatingSystemNameandVersion, 
   sys.ResourceDomainORWorkgroup, 
   sys.LastLogonUserName, 
   sys.IPAddresses, 
   sys.IPSubnets, 
   sys.ResourceId, 
   sys.ResourceType 
from
   SMS_G_System_SoftwareProduct as prod 
   inner join SMS_R_System as sys 
on 
   sys.ResourceId = prod.ResourceID 
where 
   prod.ProductName like ##PRM:SMS_G_System_SoftwareProduct.ProductName## 
and 
   prod.ProductVersion like ##PRM:SMS_G_System_SoftwareProduct.ProductVersion##

The ##PRM:SMS_G_System_SoftwareProduct.ProductName## means that it will stop and ask for the SMS_G_System_SoftwareProduct.ProductName, which means we can have someone run the query and enter the search data without needing to mess with the code. So, we can see what they were trying to accomplish even though it does not work. So, let's do some thinking with our fingers and try something else:

select 
distinct 
   SMS_R_System.Name 
from  
   SMS_R_System 
   inner join SMS_G_System_INSTALLED_SOFTWARE 
on 
   SMS_G_System_INSTALLED_SOFTWARE.ResourceId = SMS_R_System.ResourceId 
where 
   SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName like "Mozilla Firefox%"

And that works! It prints out a list of hostnames which have that product. I will not be able to show the screen capture since it would show the name of production machines instead of my lab. So you will have to have some faith. While we are at it, how about if we add more useful info? I don't know about you, but I would like to know which version of the software is installed and when that installation (or upgrade) took place. I will not comment on the extra attributes we are getting back because I think you can figure them out:

select 
distinct 
   SMS_R_System.Name,
   SMS_G_System_INSTALLED_SOFTWARE.ProductVersion,
   SMS_G_System_INSTALLED_SOFTWARE.InstallDate
from  
   SMS_R_System 
   inner join SMS_G_System_INSTALLED_SOFTWARE 
on 
   SMS_G_System_INSTALLED_SOFTWARE.ResourceId = SMS_R_System.ResourceId 
where 
   SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName like "Mozilla Firefox%"

We could make this interactive by changing

   SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName like "Mozilla Firefox%"
to
   SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName like ##PRM:SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName##
Incidentally, it seems that SMS_G_System_INSTALLED_SOFTWARE.ARPDisplayName gave the same result as SMS_G_System_INSTALLED_SOFTWARE.ProductName, but that does not mean that will always be the case.

Tuesday, November 10, 2015

Memory size and Memory location in Linux

Quick easy post: I right now need to find (I am typing this as I am solving the problem, so this will be a rough post) how many memory slots the motherboard of a machine running Linux (actually Xenserver, but close enough for our needs) has, which ones are being occupied, and which kind of SIMM cards it is using. This machine is in a server room about a mile from me and I do not want to go face the rain to get there. So, how to do the deed?

The Xencenter interface does not seem to be helpful with that, so we will don our battle moustaches (ladies, it is ok to buy a nice handlebar moustache to use on these occasions; the look of your coworkers should be reason enough to do it) and go to the command line.

The command in question is dmidecode, which can be found in most Linux distros to probulate the system management BIOS. Some of you have used it before, so let's rush through it a bit. This is how it starts:

# dmidecode 2.11
SMBIOS 2.7 present.
77 structures occupying 4848 bytes.
Table at 0xCF42C000.

Handle 0xDA00, DMI type 218, 11 bytes
OEM-specific Type
        Header and Data:
                DA 0B 00 DA B2 00 17 20 0E 10 01

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: Dell Inc.
        Version: 2.3.3
        Release Date: 07/10/2014
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 8192 kB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
                5.25"/360 kB floppy services are supported (int 13h)
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Function key-initiated network boot is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 2.3

As you can see, the machine in question is a Dell (they do make servers you know; this one is a 1U) and we can see some of its specs. Next would be stuff like CPU specs (cache, features, speed), which we don't care about right now. What we care about is the memory:

Memory Device
        Array Handle: 0x1000
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: 1
        Locator: DIMM_A1
        Bank Locator: Not Specified
        Type: DDR3
        Type Detail: Synchronous Registered (Buffered)
        Speed: 1600 MHz
        Manufacturer: 00CE00B300CE
        Serial Number: 0296F2E0
        Asset Tag: 01150664
        Part Number: M393B2G70EB0-YK0
        Rank: 2
        Configured Clock Speed: 1333 MHz

Handle 0x1101, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x1000
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: 1
        Locator: DIMM_A2
        Bank Locator: Not Specified
        Type: DDR3
        Type Detail: Synchronous Registered (Buffered)
        Speed: 1600 MHz
        Manufacturer: 00CE00B300CE
        Serial Number: 0296F43E
        Asset Tag: 01150664
        Part Number: M393B2G70EB0-YK0
        Rank: 2
        Configured Clock Speed: 1333 MHz

Handle 0x1102, DMI type 17, 34 bytes
Memory Device
        Array Handle: 0x1000
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: No Module Installed
        Form Factor: DIMM
        Set: 2
        Locator: DIMM_A3
        Bank Locator: Not Specified
        Type: DDR3
        Type Detail: Synchronous
        Speed: Unknown
        Manufacturer:
        Serial Number:
        Asset Tag:
        Part Number:
        Rank: Unknown
        Configured Clock Speed: Unknown

Handle 0x1103, DMI type 17, 34 bytes
Memory Device
[...]

As you can see, we have a 16GB SIMM (fine, be grammar police and call it a DIMM) on each of the DIMM_A1 and DIMM_A2 slots, but nobody on DIMM_A3; just between us, this machine only has those 2 SIMMs. So, how about we find out how many SIMM slots this machine has?

[root@vmhost3 ~]# dmidecode | grep 'Locator: DIMM_'
        Locator: DIMM_A1
        Locator: DIMM_A2
        Locator: DIMM_A3
        Locator: DIMM_A4
        Locator: DIMM_A5
        Locator: DIMM_A6
[root@vmhost3 ~]#

Hmm, we can do better; let's be lazy and let the script do the count. And, just to show we are good, we will use sed instead of grep because the name is shorter

[root@vmhost3 ~]# dmidecode | sed -n '/Locator: DIMM_/p'|wc -l
6
[root@vmhost3 ~]#

Six slots. Not bad. So, how many of those are being populated? We know that an empty slot is flagged by Size: No Module Installed, while a populated one reports its size in MB. Let's then count the entries reporting a size in MB, which is easy to do using grep:

[root@vmhost3 ~]# dmidecode | grep -A5 'Memory Device' | grep -c 'MB'
2
[root@vmhost3 ~]#

Note we dropped the wc since grep can count how many times we got successful matches. Also, the -A5 means that we are printing the first 5 lines after the matched pattern; this way the second grep has something to bite. How about if we spit out the name of which memory slots have memory and how big they are? And maybe the type and model number while we are at it. Here's how to do the deed using grep:

[root@vmhost3 ~]# dmidecode |  grep -A18  'Memory Device' | grep -B4 -A13 -e 'Size:.* MB' | grep -e 'Locator: D' -e 'Size' -e 'Part Number'
        Size: 16384 MB
        Locator: DIMM_A1
        Part Number: M393B2G70EB0-YK0
        Size: 16384 MB
        Locator: DIMM_A2
        Part Number: M393B2G70EB0-YK0
[root@vmhost3 ~]# 

I used 3 greps here to make it easier to see what is going on:

  1. First grep finds from the dmidecode output the entries related to memories and feeds the complete entry (each is 18 lines long) to the next grep.
  2. This one then only looks for entries that have memories being reported in megabytes (MB); the fragile assumption here is that if there is a SIMM in the slot, its Size: will be reported as X MB, otherwise as we found out above it will be Size: No Module Installed. A cleverer way to do this would be to test for Size: No Module Installed and use the entry if it is not there. But, I never claimed to be that clever.

    Now, if a matching pattern is found, we print the entire entry for this memory device, hence the -A13 (after) and -B4 (before); they print the lines before and after the one which contains the pattern.

  3. Finally we print only the lines we want to use, namely Part Number, Size, and Locator (which SIMM slot the memory is sitting on).

Now, I know how many memory slots are available, which ones are being occupied, and which memory card models are installed. I can now decide if I want to buy more matching ones or lookup the specs and find the largest cards that will work in this device. Not bad for a rainy day.
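
For those who do want to be that clever, here is a minimal sketch of the "test for No Module Installed" approach using awk instead of three greps. report_dimms is a hypothetical helper name; it reads dmidecode output on stdin, and it assumes (as in the output above) that entries are separated by blank lines and that an empty slot says Size: No Module Installed.

```shell
#!/bin/bash
# report_dimms is a made-up helper: it reads dmidecode output on stdin and
# prints one line per populated memory slot plus a final count. A slot is
# considered empty only if its Size is "No Module Installed".
report_dimms()
{
    awk '
    BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one entry per record
    {
        is_dev = 0; size = ""; loc = ""; part = ""
        for (i = 1; i <= NF; i++) {
            line = $i
            sub(/^[ \t]+/, "", line)          # drop the indentation
            if (line == "Memory Device")     is_dev = 1
            if (line ~ /^Size: /)            size = substr(line, 7)
            if (line ~ /^Locator: /)         loc  = substr(line, 10)
            if (line ~ /^Part Number: /)     part = substr(line, 14)
        }
        if (!is_dev) next                     # not a memory slot entry
        total++
        if (size != "No Module Installed") {
            used++
            printf "%s: %s, part number %s\n", loc, size, part
        }
    }
    END { printf "%d of %d slots populated\n", used, total }
    '
}
```

On the machine itself we would feed it with dmidecode | report_dimms (or dmidecode -t 17 | report_dimms to hand it only the Memory Device entries).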

Thursday, October 01, 2015

Adding (local) storage in XenServer

Let's start by getting something straight: This is not something that is normally done in an environment that has a cluster of XenServer servers. In those, you are probably using some kind of external storage like a SAN which feeds LUNs to the servers. But, there are those who did not set up their clusters like that for whatever reason -- cost, space, resources -- and as a result built their xen servers with local storage for the vm clients. Or maybe they just have one single xen server. For those this might be a bit useful.

Note: As usual most of this will be done in the command line; if you have read my previous articles you would expect nothing less from me. I am just not that original; just deal with it already.

For this example, let's say you bought a server with lots of drive bays (say, 15) and a nice RAID controller that is supported by XenServer. And you got it running by creating, say, a RAID1 array using two small drives to install the OS. And they happen to have enough free disk space to build a few test vm clients. This server shall henceforth be called vmhost3.

Note: If you had built a Linux box and then installed Xen on the top of it, like RedHat used to recommend before they switched to KVM, this is much easier because you can use the usual LVM commands. Since we are using XenServer things are conceptually the same but the implementation slightly different.

So you built a few test vm clients and verified all is nice and peachy. I mean, even your VLAN trunks work properly. So, now it is time to go into production.

  1. You used 2 drive bays to put the OS on; you still have 13 left to play with. So, you grab, say, 6 SSDs and make a nice RAID6 out of them. I will not go into details here because it is out of scope of this discussion, but I can write another article showing how to create the new RAID from the command line in xenserver. Let's just assume you did it and it works.
  2. Xenserver by default creates local storage for vm clients by using LVM: it allocates a logical volume (lv) as the drive(s) for those clients, which then format them to whatever filesystem they want. So, we need to grab the drive created by the RAID6 and then configure it to do some LVMing. Every journey starts with the first step, and in this case that is to take a quick look at the drive we created when we did the raid; I will save you some time and tell you it is called /dev/sdc:
    [root@vmhost3 ~]# fdisk -l /dev/sdc
    
    Disk /dev/sdc: 1999.8 GB, 1999844147200 bytes
    255 heads, 63 sectors/track, 243133 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    
    Disk /dev/sdc doesn't contain a valid partition table
    [root@vmhost3 ~]#

    Yes, our drive is not that large, but it will be good enough for this example.

  3. Before we start passing LVs out like farts we need a lvm physical volume(pv). We could create a single partition in sdc that is as big as the disk, or just make the entire drive as the pv. I will do the latter.

    Do you remember my warning about XenServer vs Linux+Xen? This is where things diverge a bit. In the latter we would do something like

    pvcreate /dev/sdc
    vgcreate data /dev/sdc

    To create the pv and a volume group (vg) called data, but since this is XenServer, we should use their command line thingie, xe:

    [root@vmhost3 ~]# xe sr-create name-label=data shared=false \
    device-config:device=/dev/sdc type=lvm
    f70d9ff8-3567-ef23-42c2-7c7997a4abc6
    [root@vmhost3 ~]#

    which does the same thing, but tells xenserver that it has a new local storage device.

  4. From here on, it is business as usual.
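
If you would rather check from the command line that the new storage repository (SR) is really there, a minimal sketch could look like the one below. sr_uuid and show_sr are hypothetical helper names I am making up here; the xe subcommands themselves (sr-list with --minimal, and sr-param-list) are the stock ones.

```shell
#!/bin/bash
# sr_uuid is a made-up helper: ask xe for the uuid of the SR with the given
# name-label; --minimal makes xe print just the uuid.
sr_uuid()
{
    xe sr-list name-label="$1" --minimal
}

# show_sr is also made up for this sketch: look the SR up by name and, if it
# exists, print its parameters (type, sizes, and so on).
show_sr()
{
    local uuid
    uuid=$(sr_uuid "$1")
    if [ -z "$uuid" ]; then
        echo "No SR named $1 found" >&2
        return 1
    fi
    xe sr-param-list uuid="$uuid"
}
```

Running show_sr data on vmhost3 should then list the SR we just created; the uuid it finds should match the one sr-create printed above.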

Tuesday, September 08, 2015

md5 checksum in powershell

It seems I have been stuck in Windows recently. The main annoyance with that is it loves to fight me and hates to tell me what is going on. As a result, sometimes I need to write my own tools. And, yes, they do tend to emulate what I have used in Linux. Take for example md5sum, an example of a command that just happens to be related to the topic of this article.

Here is how md5sum works in Linux:

raub@desktop:~$ md5sum disallowedcert.sst
6e8bf95470e7f1ca0626012ea21e55f7 disallowedcert.sst
raub@desktop:~$

As the man page tells us, there are other options we can use. But, I do not use most of them. So, how about if we write a quick powershell script to emulate at least the basic usage of md5sum?

Warning

MD5 should not be used to, say, verify the integrity of a file if you care about security; the MD5 algorithm is vulnerable to collision attacks.

The code

Here is a first shot at the code

# emulate md5sum in powershell
foreach ($file in $args)
{
   $md5 = New-Object -typeName `
          System.Security.Cryptography.MD5CryptoServiceProvider
   $hash = [System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($file)))
   echo "$hash $file"
}

Let's try it out

PS C:\Users\User\test> powershell -executionpolicy bypass -file .\md5.ps1 .\disallowedcert.sst
6E-8B-F9-54-70-E7-F1-CA-06-26-01-2E-A2-1E-55-F7
PS C:\Users\User\test> powershell -executionpolicy bypass -file .\md5.ps1 .\disallowedcert.sst .\e098
cf355c19953274d84772a1cec96dc3356ca.crt
6E-8B-F9-54-70-E7-F1-CA-06-26-01-2E-A2-1E-55-F7
1C-BC-A2-83-77-BA-25-04-3D-6D-42-6E-E2-11-34-55
PS C:\Users\User\test>

Let's improve on it a bit then

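# emulate md5sum in powershell
# Create the hasher once and reuse it for every file
$md5 = New-Object -typeName `
       System.Security.Cryptography.MD5CryptoServiceProvider
foreach ($file in $args)
{
   # Resolve to a full path first: .NET resolves relative paths against the
   # process working directory, which may differ from the PowerShell location
   $path = (Resolve-Path $file).ProviderPath
   $hash = [System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($path))) `
           -replace "-",""
   echo "$hash $file"
}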

Let's try it out

PS C:\Users\User\test> powershell -executionpolicy bypass -file md5.ps1 md5sum.ps1 ntpTime.ps1 
F1930269F287A434993172A362269F8E md5sum.ps1
FC6ED0787947524543D34CFBA3BE641A ntpTime.ps1

PS C:\Users\User\test> 

That's much more like it.
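
For comparison, here is a minimal sketch of the same idea in Python. It is not part of my script, and the function name md5sum and the chunk size are just my choices for this example. Unlike ReadAllBytes, it reads the file in chunks, so a large file does not have to fit in memory.

```python
import hashlib


def md5sum(path, chunk_size=65536):
    """Return the MD5 digest of the file at path as lowercase hex."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5sum("disallowedcert.sst") returns the digest as a string; print it followed by the file name and you have the basic md5sum output.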

Other versions

If you do not like my script there are a few versions out there you can download and run instead. Pick your poison!

Sunday, August 30, 2015

Sending mail in powershell (poor man's ssmtp in Windows)

A lot of times in a server you want to have a service be able to send an email. Sometimes this service is nothing but a script you wrote. As mentioned in a previous post, sometimes you just want a lightweight solution. In Linux I like to use ssmtp, but it is not available in Windows. So let's see if we can come up with the next best thing.

If you have known me or my posts long enough, this is usually the cue for Powershell. And you would be right: it does provide the cmdlet send-mailmessage, which allows you to craft an email that can then be sent to a proper SMTP server.

Here's a quick proof of concept that I called mailtest.ps1:

$touser = "That guy <80sguy@email.example.com>"
$hostname = hostname
$SMTP = "smtp.example.com"
$fromuser = "Me "
$body = "Nothing to see. Move along"

send-mailmessage -to $touser `
                 -from $fromuser `
                 -subject "Test mail from $($hostname)" `
                 -body $body `
                 -SmtpServer $SMTP

All it does is send a test mail to that guy from the machine we are currently at. It does grab the hostname and creates a return address based on that. I did that because in a production script I like to know where my email came from. How would I use it in a production system? Let's say I have a script that monitors the default gateway and lets me know if it changed (long story). In that script I might have something like

$touser = "Me "
$hostname = hostname
$SMTP = "smtp.example.com"
$fromuser = "Gateway monitoring on $hostname "

function Send-Mail ($message) {
    send-mailmessage -to $touser `
                     -from $fromuser `
                     -subject "Default Gateway on $($hostname) Changed!!!" `
                     -body $message `
                     -SmtpServer $SMTP 
}
Now, we have a function that sends an email to Me indicating the default gateway has changed. It could be called by something like this
function Verify-Gateway($originalGateway){
    [...]
        if( $currentGateway -ne $originalGateway ){
            $Card.SetGateways($originalGateway)
            $message = @"
$hostname lost its default gateway. 
It thinks it is $currentGateway, but it should 
be $originalGateway. So, we changed it back.
"@
            # Let's write to a log
            Log-Msg ($message)
            # Then email message
            Send-Mail($message)
        }
    }
}
Makes sense? You can add more features to it, but the above works fine for me, including my multiline string. Notice that in my example the smtp server does not require me to authenticate since I am in a "blessed" network. If that were not the case, fear not since send-mailmessage has a way to use authentication. I will leave that as an exercise to you.
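
If you ever need the same alert from a box without Powershell, here is a rough sketch in Python using only the standard library. Take it as an illustration, not as what my script does: the function names, the recipient and the monitor@ return address are made up for this example, and smtp.example.com is the same placeholder used above. Building the message is kept separate from sending it so you can inspect it first.

```python
import smtplib
from email.message import EmailMessage

SMTP_SERVER = "smtp.example.com"  # same placeholder server as in the post


def build_alert(hostname, current_gw, original_gw,
                to_addr="me@example.com"):  # hypothetical recipient
    """Build (but do not send) the gateway-changed alert."""
    msg = EmailMessage()
    msg["To"] = to_addr
    # Hypothetical return address derived from the hostname
    msg["From"] = f"Gateway monitoring <monitor@{hostname}.example.com>"
    msg["Subject"] = f"Default Gateway on {hostname} Changed!!!"
    msg.set_content(
        f"{hostname} lost its default gateway.\n"
        f"It thinks it is {current_gw}, but it should\n"
        f"be {original_gw}. So, we changed it back.\n"
    )
    return msg


def send_alert(msg, server=SMTP_SERVER):
    """Hand the message to an SMTP server that needs no authentication."""
    with smtplib.SMTP(server) as smtp:
        smtp.send_message(msg)
```

Something like send_alert(build_alert("desktop", "10.0.0.254", "10.0.0.1")) would then do the deed, assuming the SMTP server accepts unauthenticated mail from your network.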

Tuesday, August 11, 2015

Comments on a better than average Phishing Expedition

I take it quite a few of you have received phishing emails. You know, some email that tries to compel the reader to click on a link to a site where for whatever reason they enter all their information so their identity, and credit card, can be stolen. And, maybe also get infected with a trojan in the process.

Some, with all respect, are rather lame. By that I mean the authors could not be bothered to

  1. Spell and grammar check the letter. Yes, I know that those phishers might not speak natively the language of their target audience, but can't you be bothered to find someone to check it out for you?
  2. Read it and make sure it makes sense. I have received some that I read 10 times and still could not figure out what the point was. There is nothing relating the text in the paragraphs to the reason to click on the link.
  3. Take a few moments to study the target. Chances are our phisher phriend wants to hit a corporation or someone who is using a corporate service like facebook or microsoft or gmail. If you are trying to impersonate them, why not try to find out what the real letter you are trying to fake looks like?

Come on! I know it is harder than running the "EZ-Phisher" program and hitting a button, but please try to make me feel pleased to see the phishing email instead of insulted. The fact those badly crafted phishing attacks work tells more about the target than the phisher, and what it tells is not pretty. If it comes as a shock to you that I

So this morning I received an impressive-looking phishing email that claimed to be from google. You probably want to know what it looks like, so I did a screen capture and put it here to the left. Don't panic, it is safe: you can click on it until you get a sore finger and it will not take you anywhere.

Now, this one I think is much better than your everyday phishing email. I mean, it is in a totally different timezone. Let's examine the phishing email and see what they did right and wrong:

What went right

  1. Got the corporate colours right. The email claims to be coming from google, so they took the time to get the google icons and colours and even layout close enough to look credible.

  2. Used knowledge of the official corporate website to hide their real url. If you hover the button, it would show the following link (am allowing it to overflow):

    Note they crafted their link so it is hidden after a few proper -- from the company they claim the email is coming from, google -- links. The link is long since our phisher phriends hope the real link will fall off the screen. Also, the last link, the one that probably points to their fake site, is behind a goo.gl link: you would need to click on it to first see what the link is, and then you are attacked. Especially if you use Internet Explorer, a browser that does like to be vulnerable to cross-site scripting.

  3. Hid the return email behind corporate-looking address. I like to see the source of the email, what gmail would call "Show Original". That usually is the 4th item from the bottom:

    When I tried to open it, the return address overwhelmed the menu bar, hiding all other options:

    Note that also worked when the email showed the return address (first picture in this article). That is good for a phishing email as it would make most users not bother examining the email. Since we know the location of the option we want -- Show Original -- we can use it regardless.

  4. Avoided whenever possible relaying emails across too many non-corporate sites. The email looks like this:
    Delivered-To: phishing-victim@gmail.com
    Received: by 10.112.204.33 with SMTP id kv1csp2118911lbc;
            Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    X-Received: by 10.180.108.35 with SMTP id hh3mr31922715wib.48.1439282133631;
            Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    Return-Path: <perry_rhodan@interplanetary.edu>
    Received: from mail-wi0-f194.google.com (mail-wi0-f194.google.com. [209.85.212.194])
            by mx.google.com with ESMTPS id bl10si2265464wib.9.2015.08.11.01.35.33
            for <phishing-victim@gmail.com>
            (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
            Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    Received-SPF: pass (google.com: domain of perry_rhodan@interplanetary.edu designates 209.85.212.194 as permitted sender) client-ip=209.85.212.194;
    Authentication-Results: mx.google.com;
           spf=pass (google.com: domain of perry_rhodan@interplanetary.edu designates 209.85.212.194 as permitted sender) smtp.mailfrom=perry_rhodan@interplanetary.edu
    Received: by mail-wi0-f194.google.com with SMTP id p15so26093395wij.0
            for <phishing-victim@gmail.com>; Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
            d=1e100.net; s=20130820;
            h=x-gm-message-state:mime-version:from:date:message-id:subject:to
             :content-type;
            bh=LsemyedQTWVhfKEpSMfhlXsF9zPIYwDrlKawPrrmsog=;
            b=XAxoOiKIC3vJ4RIxmejkhdVXXBox3/I4nQYeu5Ml9F8Rq0Sjh+QKaY5M26FkPjX0fa
             Fu/v9fj+a451Aoc5AijUjqtrLEL8vH3Rhx7Kln7m6XrDo1P17HRVSDQCpoc7PlLZicSQ
             Mrnh6AFmYJ7PDCf1RuiJ8ACBZZ7+RwLLYtyEHnwxphUkCglVnJn75Vd8GDEBBv+G/BZw
             PISTui9PGA/jET733HHcLyA1FdYmYjJfWnOkM7oQBQv2/uR/xx9N06k21hbXKvYdOkkN
             oUafnLsuWnI8yBzJUaw2YpCr2HB2vw+ap9kTa4EkCzRntOhuBlpwzg60ukF6J+bdvbVf
             TpgQ==
    X-Gm-Message-State: ALoCoQmFD3lES/88ZcNCZQPJfj+ifiEDNpX5k0tRWJ0tphpzP282BDg6hmsHzb7Cf5n0u26AsTME
    X-Received: by 10.180.84.72 with SMTP id w8mr23619994wiy.71.1439282133247;
     Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    MIME-Version: 1.0
    Received: by 10.28.48.198 with HTTP; Tue, 11 Aug 2015 01:35:13 -0700 (PDT)
    From: "Google <no-reply@accounts.google.com> Google<no-reply@accounts.google.com> Google<no-reply@ac..." <perry_rhodan@interplanetary.edu>
    Date: Tue, 11 Aug 2015 09:35:13 +0100
    Message-ID: 
    Subject: Our Final Report
    To: undisclosed-recipients:;
    Content-Type: multipart/alternative; boundary=f46d044403ba48b9da051d04fc2f
    Bcc: phishing-victim@gmail.com
    
    --f46d044403ba48b9da051d04fc2f
    Content-Type: text/plain; charset=UTF-8
    Content-Transfer-Encoding: quoted-printable
    
    =E2=80=8B

    One of the classic telltales that an email is a scam or phishing is found by following the Received: headers. So, this email pretends to be from Google. A lousily put together phishing one might have the last hop on google, but then the previous hops would bounce all over the place. This one, however, came from Google. We even know it was submitted directly to a google SMTP server and when.

    Received: from mail-wi0-f194.google.com (mail-wi0-f194.google.com. [209.85.212.194])
            by mx.google.com with ESMTPS id bl10si2265464wib.9.2015.08.11.01.35.33
            for <phishing-victim@gmail.com>
            (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
            Tue, 11 Aug 2015 01:35:33 -0700 (PDT)
    And we know who supposedly submitted it:
    Received-SPF: pass (google.com: domain of perry_rhodan@interplanetary.edu designates 209.85.212.194 as permitted sender) client-ip=209.85.212.194;
    It was someone from a university. Now, why would this person use gmail for SMTP? Well, let's ask the university itself:
    bash-3.2$ dig interplanetary.edu MX
    
    ; <<>> DiG 9.8.5-P1 <<>> interplanetary.edu MX
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21325
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 6, ADDITIONAL: 7
    
    ;; QUESTION SECTION:
    ;interplanetary.edu.                     IN      MX
    
    ;; ANSWER SECTION:
    interplanetary.edu.     3077    IN      MX      5 alt2.aspmx.l.google.com.
    interplanetary.edu.     3077    IN      MX      5 alt1.aspmx.l.google.com.
    interplanetary.edu.     3077    IN      MX      1 aspmx.l.google.com.
    interplanetary.edu.     3077    IN      MX      10 alt4.aspmx.l.google.com.
    interplanetary.edu.     3077    IN      MX      10 alt3.aspmx.l.google.com.
    
    ;; AUTHORITY SECTION:
    edu.                    46983   IN      NS      a.edu-servers.net.
    edu.                    46983   IN      NS      c.edu-servers.net.
    edu.                    46983   IN      NS      f.edu-servers.net.
    edu.                    46983   IN      NS      d.edu-servers.net.
    edu.                    46983   IN      NS      g.edu-servers.net.
    edu.                    46983   IN      NS      l.edu-servers.net.
    
    ;; ADDITIONAL SECTION:
    a.edu-servers.net.      46983   IN      A       192.5.6.30
    c.edu-servers.net.      46983   IN      A       192.26.92.30
    d.edu-servers.net.      46983   IN      A       192.31.80.30
    f.edu-servers.net.      46983   IN      A       192.35.51.30
    g.edu-servers.net.      46983   IN      A       192.42.93.30
    g.edu-servers.net.      46983   IN      AAAA    2001:503:cc2c::2:36
    l.edu-servers.net.      46983   IN      A       192.41.162.30
    
    ;; Query time: 34 msec
    ;; SERVER: 10.0.0.10#53(10.0.0.10)
    ;; WHEN: Tue Aug 11 08:55:08 EDT 2015
    ;; MSG SIZE  rcvd: 380
    
    bash-3.2$
    So, this university uses gmail to send its emails. I guess that means it decided to move to the cloud by outsourcing its emails. I know a lot of people who are rabid advocates of pushing as much of the IT infrastructure as possible to be hosted by commercial vendors in the cloud. I wonder how many did the due diligence, but I digress. The upshot is that it makes it easier for a careful phisher to do research on targets: instead of having to go to many different sites, the eggs of many businesses are now literally being held in few baskets. For instance, if the target companies are, say, using the Microsoft cloud, a Microsoft-looking email to all the users in the target corporations would look legit. And that allows phishers and other hackers with criminal intents to concentrate their efforts.
  5. Email passed DKIM and SPF, which adds to its legit feel.
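
If you want to follow the Received: headers from item 4 without squinting at the raw source, here is a rough sketch in Python. It is only an illustration: the function name and the regular expression are my own, and the regex only covers the "from host ... by host" shape seen in the headers above, not every format a mail server can produce.

```python
import re
from email import message_from_string

# Matches the optional "from <host>" and the "by <host>" parts of a header
HOP_RE = re.compile(
    r"(?:\bfrom\s+(?P<src>\S+)\s.*?)?\bby\s+(?P<dst>\S+)", re.S
)


def received_hops(raw_message):
    """Return (from_host, by_host) pairs, oldest hop first."""
    msg = message_from_string(raw_message)
    hops = []
    # Each relay prepends its header, so reverse to get chronological order
    for header in reversed(msg.get_all("Received", [])):
        match = HOP_RE.search(header)
        if match:
            hops.append((match.group("src"), match.group("dst")))
    return hops
```

Fed the source of this email, every hop it returns is either a google.com host or a private 10.x address, which is what makes this one look so much better than the lousy ones that bounce all over the place.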

What went Wrong

Unfortunately this phishing email was not perfect. The issues are few and far between but are there.
  1. Grammar. It is not horrible, but it did make me stop long enough to see it was spam/phishing. The "We always protect you" and "Our Final Report" did sound a bit odd. The first one was close to what Google might have written but no cigar. Still, since it does not have the usual chicken little warning messages bad phishing uses, I honestly passed through it without noticing much. And besides, "We always protect you" gives a warm and fuzzy feeling, not as much as a box of kittens but you get my point.

    The first paragraph is a different story. It does not sound like what a native English speaker would say -- it is too wordy and convoluted -- and Google would have spent the money to put their message in nice clear 8th grade English. The second paragraph started with a lowercase; don't know any language that has upper and lower cases that would do so.

  2. No spaces between paragraphs. The paragraphs are short as they would in a Google announcement -- keep it simple and fast so user won't lose interest -- but because of the lack of spaces between them they look too cluttered. Still from a distance it looks clean and pleasant.

Overall I would give it a 7/10: good effort, better than average delivery, and took some time to know target. But, this phishing attempt could be better.

Keep up with the good work!

Saturday, August 08, 2015

Using a juniper SRX router as ntp proxy

I am in a Juniper router mood since last week, so we will talk about another quick and easy project: ntp proxy. I know that before I have said a given article would be fast and easy and it ended up longer than planned, but I feel lucky today!

Just to let you know, this post is really not about how to do the deed, but why.

The reason we want to use the router, uranus, as ntp proxy is that the vlans we care about for this discussion have to go through it to get somewhere else. So, if they have to see uranus every day, they might as well do something with it. Why not tell them their ntp server is the router as seen in their network, and then let the router talk to the actual ntp server?

The Steps

  1. Get the name of the ntp servers you want to use. We'll be lazy and use pool.ntp.org as primary server for now. Before you ask, yes we could have used an internal ntp server by just providing its hostname or IP.
  2. Set them
    set time-zone America/New_York
    set system ntp server pool.ntp.org version 4 prefer
    set system ntp boot-server pool.ntp.org
    

    boot-server is the default ntp server to use when the router is booted. server is one of the other ntp servers to pick if the boot-server does not like us anymore; we can have more than one here.

    We should end up with something like this

    system {
        ntp {
            boot-server 198.55.111.50;
            server 198.110.48.12 version 4;
        }
    }
    Note that even though we enter pool.ntp.org for both boot-server and server, the IPs used are different. All that means is that, as the name hints at, pool.ntp.org is a pool of machines. Every time the router resolves pool.ntp.org it might get a different IP.
  3. Check. We could see how off we are
    janitor@uranus# run show system uptime | match current
    Current time: 2015-08-06 12:33:56 EDT
    janitor@uranus#
    
    Or be a bit more blunt:
    janitor@uranus> show ntp associations
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *time01.muskegon 204.9.54.119     2 -  706 1024  377   60.728   -4.806   2.384
    
    janitor@uranus>
    I don't know about you but that sure reminds me of the output of ntpq -p. If we need more verbosity, we could say
    janitor@uranus> show ntp status
    status=06c4 leap_none, sync_ntp, 12 events, event_peer/strat_chg,
    version="ntpd 4.2.0-a Wed Apr 23 21:04:45 UTC 2014 (1)",
    processor="octeon", system="JUNOS12.1X45-D25.1", leap=00, stratum=3,
    precision=-17, rootdelay=70.394, rootdispersion=85.850, peer=10620,
    refid=198.110.48.12,
    reftime=d96e9a76.c04e615d  Thu, Aug  6 2015 22:46:14.751, poll=10,
    clock=d96ea14a.a9c6ff3a  Thu, Aug  6 2015 23:15:22.663, state=4,
    offset=-2.422, frequency=17.545, jitter=1.532, stability=0.115
    
    janitor@uranus>
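
A side note on that last output: the reftime and clock values are NTP timestamps, that is, a hexadecimal count of seconds since 1900-01-01 UTC, a dot, and then a 32-bit binary fraction of a second. Here is a small sketch in Python (nothing to do with Junos itself; the function name is my own) showing how the hex maps to the date printed next to it:

```python
from datetime import datetime, timedelta, timezone

# NTP counts seconds from 1900-01-01 00:00:00 UTC
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_hex_to_datetime(stamp):
    """Convert 'ssssssss.ffffffff' (hex) into an aware UTC datetime."""
    seconds_hex, fraction_hex = stamp.split(".")
    seconds = int(seconds_hex, 16)
    # The fraction is a 32-bit binary fraction of one second
    fraction = int(fraction_hex, 16) / 2**32
    return NTP_EPOCH + timedelta(
        seconds=seconds, microseconds=round(fraction * 1_000_000)
    )


# The reftime value from the "show ntp status" output above
print(ntp_hex_to_datetime("d96e9a76.c04e615d"))
```

The result is in UTC: 02:46:14.751 on Aug 7, 2015. Take off the four hours of EDT and you get the 22:46:14.751 on Aug 6 the router printed.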

Well, we got it running, but what about the ntp server choice we made? Ideally we might want to run our own ntp server, ntp.example.com, which then can be configured in the router, say

set system ntp server pool.ntp.org version 4 prefer
set system ntp server pool.ntp.org version 4 prefer
set system ntp boot-server ntp.example.com

Which would make ntp.example.com the primary and would pick two servers off pool.ntp.org as failovers. However, what counts at the end of the day is that we are using uranus to abstract ntp for the ntp clients. As a result, they will be in sync so Kerberos and derivatives will be happy. So to show I am not lying, here is what my desktop thinks the ntp server is:

raub@desktop:~$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*uranus.example. 198.110.48.12    3 u  950 1024  377    3.519    0.043   1.605
raub@desktop:~$

As you can see, it does like uranus. Since it is so easy to change it (you can even push the changes using ansible) without having to reconfigure the ntp clients (they only have to reach uranus), why not just set up the ntp proxy thingie on uranus against an external source to get your network up and running and then worry about having an ntp.example.com later on? Pick your battles!


Friday, August 07, 2015

Juniper SRX router booted from backup image (or, orange LED light again!)

So, uranus is in pain once more. Power went out and it was not properly shut down before the UPS gave its last gasp (I do need to do something about that). When I rebooted it, it seemed to have come back without an issue -- it was working fine -- but it had The Light on again just like last time. I ssh'd into it and this is what the motd looked like:

raub@desktop:~$ ssh janitor@uranus.example.com
janitor@uranus.example.com's password:
--- JUNOS 12.1X45-D25.1 built 2014-04-23 20:45:48 UTC

***********************************************************************
**                                                                   **
**  WARNING: THIS DEVICE HAS BOOTED FROM THE BACKUP JUNOS IMAGE      **
**                                                                   **
**  It is possible that the primary copy of JUNOS failed to boot up  **
**  properly, and so this device has booted from the backup copy.    **
**                                                                   **
**  Please re-install JUNOS to recover the primary copy in case      **
**  it has been corrupted.                                           **
**                                                                   **
***********************************************************************

janitor@uranus>
Hmmm, so the normal image (does that mean the entire OS partition or just the boot stuff?) got corrupted when power went bye-bye. Good thing it keeps a backup. Out of curiosity, let's see if that is related to the LED being on:
janitor@uranus> show chassis alarms
1 alarms currently active
Alarm time               Class  Description
2015-02-26 05:38:12 EST  Minor  Host 0 Boot from backup root

janitor@uranus>
It seems to be the case. Funny that it is labelled minor but important enough to become the motd, but I digress. Personally, I think we should take care of that as soon as we have a scheduled maintenance window.

Ok, maintenance window time. Before rebooting, let's prepare a few things. If we know the backup copy is good (configuration files are also backed up, and you can push them back with ansible or whatnot if you feel uncomfortable), you could be lazy like me and copy the backup into the primary partition.

janitor@uranus> request system snapshot slice alternate
Formatting alternate root (/dev/da0s1a)...
Copying '/dev/da0s2a' to '/dev/da0s1a' .. (this may take a few minutes)
The following filesystems were archived: /

janitor@uranus>
If all went well, we should see the primary snapshot having the creation date of when we ran the request system snapshot slice alternate command.
janitor@uranus> show system snapshot media internal
Information for snapshot on       internal (/dev/da0s1a) (primary)
Creation date: Apr 19 22:34:04 2015
JUNOS version on snapshot:
  junos  : 12.1X45-D25.1-domestic
Information for snapshot on       internal (/dev/da0s2a) (backup)
Creation date: Feb 26 05:33:18 2015
JUNOS version on snapshot:
  junos  : 12.1X45-D25.1-domestic

janitor@uranus>
As you noticed, I did this a while ago but never got around to making an article about it, but there it is. However, I still had the evil LED staring at me. Time to turn it off
request system reboot media internal
So I rebooted and did not get that motd anymore, nor the LED:
janitor@uranus> show chassis alarms
No alarms currently active

janitor@uranus>

Friday, July 31, 2015

Detecting text file format using hexdump

Another quick one: we had a text file that had text with accented words and we had to figure out which format they were in. You see, for a while the "standard" text format for computers was ASCII, more precisely 7bit ASCII (characters 0 to 127 in decimal), which was created in the 1960s and whose character set assumed the English language only. Before some of you get all excited please note that ASCII stands for American Standard Code for Information Interchange, so it stands to reason they picked English. As this standard became adopted by other countries, it became clear that some of them used characters that were not representable with only those characters, and that led to many attempts to solve that. One of the earliest was to extend the original ASCII table, where another 128 possible characters were added, which after a few adventures evolved into ISO-8859-1 a.k.a. ISO-Latin-1, and UTF-8. There are other character sets, but the principle is the same.

Thanks for the history lesson, but how about getting to the point? How to identify the text format given a file? Let's answer that by using a couple of examples.

Example 1

Let's say we got a text file that has a name, say Luis de La Peña, in it somewhere. Depending on how you look at the file -- how helpful your text viewer is -- it might either show "ñ" or some garbled character; the latter happens if the text viewer only knows 7bit ASCII. For instance, cat would spit out something like this in my Ubuntu laptop:

bash-3.2$ cat text_test1 
Luis de La Pe�a 
bash-3.2$

Don't know about you, but that "�" does nothing to me; it's just cat's way of telling us it cannot represent the character so it is putting a placeholder. Let's try something else; since the title of this article mentions hexdump, I propose to look at it through that program (I am telling it to print the value of each character and then the ascii representation of those characters):

bash-3.2$ hexdump -Cv text_test1 
00000000  4c 75 69 73 20 64 65 20  4c 61 20 50 65 f1 61 20  |Luis de La Pe.a |
00000010  0a                                                |.|
00000011
bash-3.2$ 

First thing we notice is that it too does not know how to show "ñ", so it is using "." as placeholder. That is a different character than what cat in my ubuntu box used; just deal with it. What we really care about is that the hex side tells us that

0xF1 = ñ

That is very important: it tells us that "ñ" is represented by one 8-Bit character, so UTF8 is out. So, we need to look for an 8-bit charset. After hours of agonizing search and heavy drinking, we find that the extended ASCII and/or the ISO-8859-1 tables match all characters (don't believe me? Check the other characters in the text including space). Not bad at all, so we can read the text and convert it to a different char set.
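
If you would rather not agonize over tables, a quick sanity check is to let something decode the very same bytes both ways. A minimal sketch in Python, using the bytes hexdump printed above (my own illustration, not part of the original investigation):

```python
# The bytes hexdump showed for text_test1
raw = bytes.fromhex(
    "4c 75 69 73 20 64 65 20 4c 61 20 50 65 f1 61 20 0a"
)

# ISO-8859-1 maps every byte to exactly one character, so 0xF1 becomes "ñ"
print(raw.decode("iso-8859-1"))

# As UTF-8 the lone 0xF1 is invalid: it would have to start a multi-byte
# sequence, and the bytes that follow are not continuation bytes
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("not UTF-8:", err.reason)
```

The first decode gives the name back with its "ñ"; the second one fails, which agrees with UTF8 being out.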

Example 2

So we feel all good about ourselves, and we need another example. This time, I will steal a real life example from a previous article, where we had a text containing Italienisches Olivenöl which would cause DKIM email body authentication failures. Yeah, something as seemingly harmless as a character set can create some annoying problems.

As before, we begin by asking cat what it thinks about the text:

bash-3.2$ cat text_test2
Italienisches Olivenöl
bash-3.2$

Hold on right there. Why is cat able to print that "ö" but could not print that "ñ" earlier? Now you begin to see some of the limitations of cat compared to hexdump for these probulations: depending on how cat was compiled, it will handle some character sets but not others. hexdump knows nothing about character sets: it only knows of ASCII; anything else becomes a ".". Of course, it would suck to use hexdump all the time, so you need to know your tools and when to use each one. Since we talked about hexdump, let's see what it sees:

bash-3.2$ hexdump -Cv text_test2
00000000  49 74 61 6c 69 65 6e 69  73 63 68 65 73 20 4f 6c  |Italienisches Ol|
00000010  69 76 65 6e c3 b6 6c 0a                           |iven..l.|
00000018
bash-3.2$

Some kind of funny business happening in the second row:

  1. all the English-looking characters not only seem to be represented by one 8bit value but also the same ones we saw earlier in the ASCII example:
    0x69 = i
    0x76 = v
    0x65 = e
    0x6e = n
    0x6c = l
  2. There are two "." characters (0xc3 and 0xb6) where "ö" should be.
  3. There is a 0x0a after the "l".
What's going on here? Hold onto that question and let's check the "ö". According to the above, the two "." are there to tell us "ö" is represented as two 8bit values:
0xc3b6
If we look at any UTF8 conversion table such as this one (picked at random), we will see that is the UTF8 HEX for "ö" (Unicode code would be U+00F6).

Ok, smartypants, what about the 0x0a after the "l"? Yes that. You might have not noticed it was also on text_test1 on the first example. That is the line feed character, which in Linux means end of line.
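
The reasoning from both examples can be rolled into a crude guesser. This is only a sketch in Python under a big assumption: that the file is one of ASCII, UTF-8 or ISO-8859-1. The function name is mine, and since every byte value is valid ISO-8859-1, that last answer is a fallback and not a proof.

```python
def guess_encoding(data):
    """Guess among ASCII, UTF-8 and ISO-8859-1 for a bytes object."""
    for encoding in ("ascii", "utf-8"):
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    # Every byte value is valid ISO-8859-1, so this is only a fallback guess
    return "iso-8859-1"


# The bytes hexdump showed for the two example files
test1 = bytes.fromhex("4c 75 69 73 20 64 65 20 4c 61 20 50 65 f1 61 20 0a")
test2 = bytes.fromhex(
    "49 74 61 6c 69 65 6e 69 73 63 68 65 73 20 4f 6c"
    "69 76 65 6e c3 b6 6c 0a"
)

for name, data in (("text_test1", test1), ("text_test2", test2)):
    encoding = guess_encoding(data)
    print(name, encoding, repr(data.decode(encoding)))
```

It tries the strictest encoding first: pure ASCII bytes are also valid UTF-8, and valid UTF-8 bytes would also decode (as garbage) in ISO-8859-1, so the order matters.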

Insert Boring Conclusion Here

I hope this was useful to you; I thought it was fun and even learned a few things while writing this. The thought process here is similar to what, say, you would do when you are examining an encrypted document: try to find known patterns to work with before going after the really unknown stuff.

Sunday, July 26, 2015

Setting NTP server and time in Windows using Powershell

And here we have yet another Windows-related post! Yes, I too make fun of Windows as much as required to be in the IT business... ok sometimes more. But, as I have said again and again, being able to solve problems using command line (powershell specifically) makes it feel more like Unix. I can handle that and so can you!

Most of the Windows boxes I met that use a time server to set their time use the Microsoft one, time.windows.com, no matter if they are the sole computer in a car shop or one of the thousands of desktops and servers in a university. That is nice until you have to move away from local-only user accounts and deal with Kerberos and, by extension, Active Directory. You see, Kerberos likes its clients to be within 5 minutes of the authentication servers (KDCs). Syncing against the Microsoft time server assumes your machine is in a network that can access the Internet. Well, I have 8 of them which are in a vlan that can't (and really shouldn't). Updates to them are pushed through SCCM (when it feels like working, but I digress) and AD.

On top of that, I have a perfectly good ntp server in my network that this vlan can reach anyway. And its address is passed by dhcp. To add insult to injury, Microsoft does not support the dhcp option that carries ntp servers. Here is a list of the DHCP options supported right from their official docs.

So, as always, I need to do something to make it stop pissing me off. And, it will be in a script of some sort. This is Windows so bash is out and Powershell is in.

The plan is to be able to find which ntp server the Windows host is using and change it if we do not like it. And, while we are there, make sure the host's time is in sync with that of the ntp server. Windows uses W32Time and stores all of that in the registry, namely in HKLM:\SYSTEM\CurrentControlSet\services\W32Time, so if you want you can unleash regedit and go at it. Taking a cue from Unix and Linux, powershell treats the registry as a file tree. So, as far as it is concerned, the above is just a path which can be accessed and modified using Get-ItemProperty and Set-ItemProperty. Let's try it out by taking a look at what we have currently defined:

PS C:\> $timeRoot = "HKLM:\SYSTEM\CurrentControlSet\services\W32Time"
PS C:\> Get-ItemProperty  -path "$timeroot\parameters"


PSPath                 : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time\parameters
PSParentPath           : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time
PSChildName            : parameters
PSDrive                : HKLM
PSProvider             : Microsoft.PowerShell.Core\Registry
ServiceDll             : C:\Windows\system32\w32time.dll
ServiceMain            : SvchostEntry_W32Time
ServiceDllUnloadOnStop : 1
Type                   : NT5DS
NtpServer              : time.windows.com,0x9



PS C:\>

The 3 blank lines below NtpServer are not a typo; don't ask me why it spits those lines because they add absolutely no value to the output besides wasting screen real estate. As you can see, it wants to use time.windows.com as the NtpServer. But, what is this 0x9 on the end of the name of the ntp server? Well, here is what I know about what the 0x flags mean:

  • 0x01 SpecialInterval: use a fixed interval, in seconds, between the times W32Time polls for time. Requires HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\SpecialPollInterval to be set up. By default W32Time checks the time at intervals based on the network speed, traffic, and phases of the moon. But, if you turn SpecialInterval on, it will check every SpecialPollInterval seconds. So, SpecialPollInterval = 3600 means it will check the time every 3600s (or 1h).
  • 0x02 UseAsFallbackOnly
  • 0x04 SymmetricActive
  • 0x08 Client
  • 0x09 = 0x01 + 0x08. Yes, we can do math.
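If you want to double check that math without trusting me, here is a minimal sketch in plain sh. It is meant to be run on whatever Unix box you have handy, not on the Windows host, and decode_ntp_flags is a name I made up for illustration; it only knows the four flags listed above.

```sh
#!/bin/sh
# Decode the flag suffix of an NtpServer entry (e.g. the 0x9 in
# "time.windows.com,0x9") into the flag names listed above.
# decode_ntp_flags is a hypothetical helper, not part of any Windows tool.
decode_ntp_flags() {
    flags=$(( $1 ))          # accepts hex (0x9) or decimal (9)
    names=""
    [ $(( flags & 0x01 )) -ne 0 ] && names="$names SpecialInterval"
    [ $(( flags & 0x02 )) -ne 0 ] && names="$names UseAsFallbackOnly"
    [ $(( flags & 0x04 )) -ne 0 ] && names="$names SymmetricActive"
    [ $(( flags & 0x08 )) -ne 0 ] && names="$names Client"
    # Strip the leading space left by the first append
    echo "${names# }"
}

decode_ntp_flags 0x9    # prints: SpecialInterval Client
```

Since the flags are individual bits, adding them and OR-ing them give the same result as long as each flag is used only once.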

If we want to change it to, say, ntp.example.com, in powershell we would begin by

PS C:\> Set-ItemProperty  -path "$timeroot\parameters" -name NtpServer -Value "ntp.example.com,0x9"
PS C:\>
And then checking again
PS C:\> Get-ItemProperty  -path "$timeroot\parameters"


PSPath                 : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time\parameters
PSParentPath           : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time
PSChildName            : parameters
PSDrive                : HKLM
PSProvider             : Microsoft.PowerShell.Core\Registry
ServiceDll             : C:\Windows\system32\w32time.dll
ServiceMain            : SvchostEntry_W32Time
ServiceDllUnloadOnStop : 1
Type                   : NT5DS
NtpServer              : ntp.example.com,0x9



PS C:\>

We changed the config, but we then need to restart the time service for that to take effect:

Restart-Service -Name w32Time -Force

Let's see if we can put some of that together in a script, which I shall call ntpTime.ps1:

# SEE ALSO
# https://technet.microsoft.com/en-us/library/ee176960.aspx

$timeRoot = "HKLM:\SYSTEM\CurrentControlSet\services\W32Time"

# Name of ntp server(s) currently known by this host
function Get-NTPServer {
   $ntpserver = (Get-ItemProperty  -path "$timeroot\parameters" `
                 -name NtpServer).NtpServer -replace ",.*"
   return $ntpserver
}

# So we do not like the ntp servers this host knows and want to change them.
# Remember the 0x flags!
#
# 0x01 SpecialInterval    
# 0x02 UseAsFallbackOnly  
# 0x04 SymmetricActive
# 0x08 Client
#
function Set-NTPServer ($ntpServer) {
   Set-ItemProperty -path "$timeroot\parameters" -name NtpServer -Value $ntpServer
}

function Restart-Time {
   Restart-Service -Name w32Time -Force
}

# How far off are our time (in seconds) from the one in our ntp server?
function Get-NTPOffset ($ntpServer) {
   (w32tm /stripchart /computer:$ntpServer /samples:1)[-1].split("[")[0] `
   -replace ".*:" -replace "s.*"
}

# Adjust time by using the offset
function SetTime ($offsetSeconds) {
   set-date (Get-Date).AddSeconds($offsetSeconds)
}

## Using those silly functions ----------------------------------
$myNTP = "ntp.example.com"
$leserver = Get-NTPServer
# Only touch the registry if the host is not already using $myNTP
if ( $leserver -ne $myNTP ){
   Set-NTPServer "$myNTP,0x9"
}
SetTime (Get-NTPOffset $myNTP)
Restart-Time

I will put a more complete version in my github account, but the above is good enough to be productive. So, what it does is first see whether we are using the right ntp server ($myNTP since I needed a lame variable name). If not, it changes it. And then it adjusts the time as needed. The script can then be run (schtasks anyone?) at regular intervals or when the machine wakes up if it is a vm or laptop.

Tuesday, June 30, 2015

Finding the file format of Microsoft Outlook attachment.

A couple of days ago I was asked by one of my customers to help her with her email. Over there they use Exchange (they are a big Microsoft shop) so her mail client is Outlook. She received an email regarding some appointment she had, and it included an attachment she could not open.

I have to say I froze. Usually they are careful about not opening suspicious attachments; I hope this was legit and her computer was not infected.
When I got to her office, she showed me the email, which looked proper. And then showed that the attachment was being presented in Outlook as a file called

Appointment Info Jul2015.save
which Outlook thought was an Adobe Acrobat document(?). Maybe Outlook was set up not to show extensions, like Windows does by default. So, I decided that we should contact the sender; he did say he sent the file and even read out the filename of the attachment. Next step was to save the file and see what's what. The quickest way I had to find the extension of the saved file was checking its properties. It did show it to me, but it was not .pdf. It was
Appointment Info Jul2015.save.copy
Ookay. As you know I am not a Windows guru; I just am someone who understands linux and Unix in general who happens to dabble with Windows at times. So, I tried to treat it as a linux problem.

Well, we do not have the command file in Windows by default as far as I know. So, how about if we open the file in Notepad and look at the header? Here is what I saw:

PK^C^D
^@^@^@^@^@^Nc▒F^@^@^@^@^

I guess we need to find which file format uses that header. A bit of searching indicates it is probably a .zip file. I renamed it as

Appointment Info Jul2015.save.zip
and unzipped it. It then created a new directory with a file inside it. And that file was a pdf:
%PDF-1.4
%▒▒▒▒
1 0 obj
<</Type/XObject/Subtype/Image/Width 1602/Height 1037/Mask [252 252 ]/Length 65127/ColorSpace[/Indexed/DeviceRGB 255(^@^@^@^@^@
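Had I been on a linux box, I would have let the file command do the header peeking for me. For the curious, here is a rough sketch in sh of the same idea: read the first four bytes and compare them against known signatures. guess_type is a name I made up, and it only knows the two headers we ran into here (PK followed by 0x03 0x04 for zip, %PDF for pdf), so treat it as an illustration and not a replacement for file.

```sh
#!/bin/sh
# Guess a file's type from its first four bytes, roughly what file(1) does.
# guess_type is a hypothetical helper; it only knows the two signatures
# that show up in this post.
guess_type() {
    # First four bytes as lowercase hex with no separators, e.g. 504b0304
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    case "$magic" in
        504b0304) echo "zip" ;;      # "PK" followed by 0x03 0x04
        25504446) echo "pdf" ;;      # "%PDF"
        *)        echo "unknown" ;;
    esac
}

# Example (the file name is the one from this post):
#   guess_type "Appointment Info Jul2015.save.copy"
```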

Thoughts

  1. Notepad is not like one of those helpful programs that try to detect and process any file you leave too close to them. And that is good. All I wanted was to see the top of the file without something running without my consent. Kinda like running vi
  2. Don't let Outlook open attachments using any program by default. If you do not know what it is, it should not try to open it.
  3. I really do not like how Windows by default hides file extensions. Really hate that. Great way to camouflage a bad program from your average user.
  4. Sometimes you might not have a program that you can click on and will do something. You need to be creative. Hackers do that all day and twice on Sunday.
  5. Finding out the contents of a mysterious file inside another and so on is a classic event in hacking competitions. Sometimes those tasks become very Rube Goldbergish in a completely new and frustrating level.

Tuesday, June 23, 2015

Ubuntu Docker container with multiple architectures

Here's yet another quickie: let's say you created a docker Ubuntu container to do some sort of cross compilation. You know, maybe your container is 64bit Intel and you need to spit out some static 32bit Intel stuff. Or PPC for that matter. Now your development environment needs the 32bit version of libncurses-dev, which for Ubuntu 14.04 would be libncurses5-dev:i386. We do need to specify the architecture (:i386) if we are not using the default one.

So you scribble up a quick Dockerfile:

############################################################
# Dockerfile to build a ubuntu container image thingie
# Based on Ubuntu (Duh!)
############################################################

# Set the base image to Ubuntu
FROM ubuntu:14.04

################## BEGIN INSTALLATION ######################

RUN apt-get update && apt-get -y upgrade 

RUN apt-get install -y build-essential zip bzr curl libc6-dev libncurses5-dev:i386

##################### INSTALLATION END #####################
And then you do some building:
ducker@boot2docker:~/docker/test$ docker build -t raubvogel/test .
Sending build context to Docker daemon 4.096 kB
Sending build context to Docker daemon
Step 0 : FROM ubuntu:14.04
 ---> 6d4946999d4f
Step 1 : RUN apt-get update && apt-get -y upgrade
 ---> Running in 2d035053a431
Ign http://archive.ubuntu.com trusty InRelease
Ign http://archive.ubuntu.com trusty-updates InRelease
Ign http://archive.ubuntu.com trusty-security InRelease
Hit http://archive.ubuntu.com trusty Release.gpg
[...]
Processing triggers for ureadahead (0.100.0-16) ...
Setting up initscripts (2.88dsf-41ubuntu6.2) ...
guest environment detected: Linking /run/shm to /dev/shm
 ---> c1434c42218e
Removing intermediate container 2d035053a431
Step 2 : RUN apt-get install -y build-essential zip bzr curl libc6-dev libncurses5-dev:i386
 ---> Running in 2ac39f9430c2
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package libncurses5-dev
INFO[0017] The command [/bin/sh -c apt-get install -y build-essential zip bzr curl libc6-dev libncurses5-dev:i386] returned a non-zero code: 100
ducker@boot2docker:~/docker/test$
And then it barks at the libncurses5-dev entry. What's going on?

The Debian Multiarch HOWTO seems to have a bit of a clue. I was going to write a long winded explanation but I am bored. So here is the short version:

dpkg --add-architecture i386
apt-get update
apt-get install libncurses5-dev:i386
See the dpkg --add-architecture i386 line? It tells the host that we can do 32bit Intel thingies. The next line is there just to feed us with the 32bit repository data, so when we look for a 32bit package we can get it. Let's apply that to our Dockerfile:
############################################################
# Dockerfile to build a ubuntu container image thingie
# Based on Ubuntu (Duh!)
############################################################

# Set the base image to Ubuntu
FROM ubuntu:14.04

################## BEGIN INSTALLATION ######################

# We need i386 crap
RUN dpkg --add-architecture i386
# Business as usual
RUN apt-get update && apt-get -y upgrade 

RUN apt-get install -y build-essential zip bzr curl libc6-dev libncurses5-dev:i386

##################### INSTALLATION END #####################
And build it again.
ducker@boot2docker:~/docker/test$ docker build -t raubvogel/test .
Sending build context to Docker daemon 4.608 kB
Sending build context to Docker daemon
Step 0 : FROM ubuntu:14.04
 ---> 6d4946999d4f
Step 1 : RUN dpkg --add-architecture i386
 ---> Running in 89382c3c3469
 ---> f367d3357fc4
Removing intermediate container 89382c3c3469
Step 2 : RUN apt-get update && apt-get -y upgrade
 ---> Running in 6d0fc9519029
Ign http://archive.ubuntu.com trusty InRelease
Ign http://archive.ubuntu.com trusty-updates InRelease
[...]
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 build-essential : Depends: gcc (>= 4:4.4.3) but it is not going to be installed
                   Depends: g++ (>= 4:4.4.3) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
INFO[0002] The command [/bin/sh -c apt-get install -y build-essential  zip bzr curl libc6-dev libncurses5-dev:i386] returned a non-zero code: 100
ducker@boot2docker:~/docker/test$
Hmmm, that didn't work. Crap. Let's massage Dockerfile a bit:
############################################################
# Dockerfile to build a ubuntu container image thingie
# Based on Ubuntu (Duh!)
############################################################

# Set the base image to Ubuntu
FROM ubuntu:14.04

################## BEGIN INSTALLATION ######################

# We need i386 crap
RUN dpkg --add-architecture i386
# Business as usual
RUN apt-get update && apt-get -y upgrade

RUN apt-get install -y build-essential  &&\
    apt-get install -y zip bzr curl libc6-dev libncurses5-dev:i386

##################### INSTALLATION END #####################
And this time it works fine:
ducker@boot2docker:~/docker/test$ docker build -t raubvogel/test .
Sending build context to Docker daemon 4.608 kB
Sending build context to Docker daemon
Step 0 : FROM ubuntu:14.04
[...]
Setting up libpam-systemd:amd64 (204-5ubuntu20.12) ...
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
invoke-rc.d: unknown initscript, /etc/init.d/systemd-logind not found.
invoke-rc.d: policy-rc.d denied execution of start.
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...
Processing triggers for ca-certificates (20141019ubuntu0.14.04.1) ...
Updating certificates in /etc/ssl/certs... 173 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d....done.
Processing triggers for sgml-base (1.26+nmu4ubuntu1) ...
Processing triggers for ureadahead (0.100.0-16) ...
 ---> d86a5113c577
Removing intermediate container 32e0cc9a0b5a
Successfully built d86a5113c577
ducker@boot2docker:~/docker/test$ 

What I found out is that it does not like to have build-essential in the same install statement as libncurses5-dev:i386. The easiest solution is to have build-essential on its own install statement and everyone else in the next one. And life is once again good.
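If you ever need to check from a script whether the i386 architecture was already added (say, in an entrypoint or a build helper), here is a small sketch. arch_enabled is a made-up helper name; it just reads a list of architectures, one per line, which is the format dpkg --print-foreign-architectures prints.

```sh
#!/bin/sh
# Succeeds if the architecture named in $1 appears in the list read from
# standard input (one architecture per line, as dpkg prints them).
# arch_enabled is a hypothetical helper name.
arch_enabled() {
    grep -qxF "$1"
}

# Intended use inside the container, before installing :i386 packages:
#   dpkg --print-foreign-architectures | arch_enabled i386 \
#       || dpkg --add-architecture i386
```

Reading from standard input keeps the helper testable on boxes that do not have dpkg at all. Remember you still need the apt-get update afterwards to pull the 32bit package lists.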

Thursday, May 28, 2015

Save/Suspend and Resume a VMware ESXi vm client command line style

Here is an interesting project: let's say you have one or more UPS (one per power supply) attached to your ESXi vm host (or hosts; this is completely scalable). Yes, it goes without saying providing uninterrupted power to your servers is a good idea. But, unless you are a large company chances are this power will only last so long. You can make it last even longer by having a plan that will decide in which order your physical servers will be shut down based on load and remaining power. That does mean shutting down your vm servers; for the sake of this discussion, we will assume they are ESXi-based.

I have seen interesting articles on shutting down ESXi hosts in case of power failure, but many assume you are monitoring the UPS through the ESXi host. That might be thinking small; what if that is not the case? What if you have a UPS or two on the bottom feeding the entire cabinet? Chances are you will be monitoring it from a host, be it a vm or not, that is running some monitoring program such as Nagios, that is set to do something in case of a power failure. Of course, if you have a monitoring vm you can talk to your UPS using either ethernet or USB passthrough depending on how sophisticated that model is. And it will decide when to tell our ESXi box it is time to shut down.

I do not know about you but I would like to gracefully save/shutdown the vm clients running in that host before that.

The plan is to have the host monitoring the UPS tell the ESXi host to run a shutdown procedure, which would need to first save the vm guests. And, once the vm server is back up and running, it would resume -- by its own accord or by the order of another server -- the saved vm clients. Yes, you will have to worry about how the monitoring and the ESXi hosts will talk to each other and how the client's clock will catch up, but for this article we will focus on creating a tool that only cares about saving and resuming all of the vm guests running in this ESXi box. We can expand later.

If we want to save the running vm clients, we probably should find out which ones are running. In a previous article we wrote a script to see if a given vm client is running, off, or saved. For the script we will be creating, we want to use something else, vmdumper. Here is what the help screen for the program says.

/tmp # vmdumper -h
vmdumper: [options]  
         -f: ignore vsi version check
         -h: print friendly help message
         -l: print information about running VMs
         -g: log specified text to the vmkernel log
/tmp #
Note that -l shows only the running vms, which is what we want. So, let's run that and see what it spits back (I will break them a bit so they will kinda fit the screen):
~ # /sbin/vmdumper -l
wid=264397      pid=-1  cfgFile="/vmfs/volumes/52a08b50-984b4bf0-219f-d067
e51ce7b7/boot2docker/boot2docker.vmx" uuid="56 4d 11 2b 63 bc 88 fb-d9 e1 
93 fc 69 36 66 45"  displayName="boot2docker"       vmxCartelID=264396
wid=13080       pid=-1  cfgFile="/vmfs/volumes/52a08b50-984b4bf0-219f-d067
e51ce7b7/Windows 2012/Windows 2012.vmx"       uuid="56 4d e7 cb 24 11 63 
13-04 0d 9b 41 08 f9 a3 be"  displayName="Windows 2012"      vmxCartelID=13079
wid=527962      pid=-1  cfgFile="/vmfs/volumes/52a08b50-984b4bf0-219f-d067
e51ce7b7/devcentos/devcentos.vmx"     uuid="56 4d d7 e8 25 6c de 91-09 38 
60 ce ab 5d 43 ca"  displayName="devcentos" vmxCartelID=527961
~ #
As you can see, it shows the path for the config file the vm guest is using (cfgFile), its name (displayName), and something called wid. And a few other things I do not feel like caring about. So, how do we save a vm anyway? We know we can start a vm using vim-cmd vmsvc/power.on, so maybe the command to save one sounds similar. Some frustrating searching later we find that http://www.vi-toolkit.com/wiki/index.php/Vmsvc/power.hibernate might be a candidate. Thing is it needs vmid as the argument. I will save some time and state (have faith, brother!) it can be obtained by
vim-cmd vmsvc/getallvms | grep "${displayName}" | awk '{ print "vmid=" $1}'
But, does it really work? We shall try with devcentos, which happens to have vmid=3 (again, I cheat because I have spent loads of time testing this):
/tmp # vim-cmd  vmsvc/power.hibernate 3
(vim.fault.ToolsUnavailable) {
   dynamicType = <unset>,
   faultCause = (vmodl.MethodFault) null,
   msg = "Cannot complete operation because VMware Tools is not running in this virtual machine.",
}
/tmp #
And it does not seem to want to work. It needs VMware Tools, and I do not want to worry about it. So let's see what else we can use. After some looking I found vmdumper. To save devcentos we could do
/tmp # vmdumper 527962 suspend_vm
Suspending VM...
/tmp # 
The weird number 527962 is the world id or wid for devcentos, which happens to be the first column in the output of vmdumper -l associated with that vm client.
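To make the "first column" business concrete, here is a rough sketch of pulling the wid and displayName out of each line. It assumes the lines look like the vmdumper -l output shown above (wid=... at the start, displayName="..." somewhere later); parse_vmdumper is a name I made up for this illustration.

```sh
#!/bin/sh
# Read "vmdumper -l" style lines on standard input and print one
# "wid displayName" pair per line. Lines that do not match are skipped.
# parse_vmdumper is a hypothetical helper; the line format is assumed to
# match the output shown in this post.
parse_vmdumper() {
    sed -n 's/^wid=\([0-9][0-9]*\).*displayName="\([^"]*\)".*/\1 \2/p'
}

# Intended use on the ESXi host:
#   vmdumper -l | parse_vmdumper
```

Matching on the field names instead of counting columns means it should keep working even if the number of fields before displayName changes, as long as the names stay the same.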

Pet Peeve: If you remember, the output of vmdumper -h, which should be the help page for that command, mentions nothing about suspend_vm. Good job, VMware! That does make me wonder what else you are not documenting...

Now my venting is done, let's see what we need.

  1. We need the wid to suspend a vm client with vmdumper.
  2. We can resume (I tested already, and so can you!) the vm client using vim-cmd vmsvc/power.on. Thing is it needs vmid as the argument, which we figured out how to get above.
  3. We then need a way to save vmid so we can restore the saved vms. Probably saving the names of the vms would also be a nice touch.
So, here is the script I wrote to save and restore the running vms. As you can see, it is rather dumb since it is an all or nothing kinda deal. It is also unforgiving: if you run it again to save vms, the old /var/tmp/save_vms file will be overwritten. For what I wrote this script for, that is but a small annoyance.
cat > save_runningvms.sh  << 'EOF'
#!/bin/sh
# Split only on newlines ($'\n' would be a Bashism)
IFS='
'
USAGE="Usage: $0 {save|resume}"
SAVE_FILE=/var/tmp/save_vms

if [ "$#" -eq 0 ]; then
        echo "$USAGE"
        exit 1
fi

selection=$1

case $selection in
   # If we want to save them
   save )
      rm -f ${SAVE_FILE}

      # Find which vms are currently running
      for i in $(vmdumper -l \
         | awk ' BEGIN { FS = "\t" }; { print $1 ";" $5 }')
      do
         eval $i
         # Start saving them
         vmid=$(vim-cmd vmsvc/getallvms | grep "${displayName}" \
            | awk '{ print "vmid=" $1}')
         vmdumper $wid suspend_vm

         # Write list of saved guests in $SAVE_FILE
         echo $vmid ";" $i >> ${SAVE_FILE}
      done
      ;;
   # If we want to restore them
   resume )
      # Get list of saved guests
      for i in $(cat ${SAVE_FILE})
      do
         # Wake them up
         eval $i
         vim-cmd vmsvc/power.on $vmid
      done
      ;;
esac
EOF
chmod +x save_runningvms.sh
You will note that I avoid using Bashisms because the shell in busybox is closer to Bourne than Bash.
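One thing worth knowing: the resume branch runs eval on whatever is in the save file. If that bothers you, here is a sketch of an alternative that only extracts the numeric vmid from each line. It assumes lines of the form vmid=24 ; wid=5718058;displayName="boot2docker", and saved_vmids is a hypothetical helper name, not something from the script above.

```sh
#!/bin/sh
# Print the numeric vmid from each save-file line, one per line.
# Reads the files given as arguments, or standard input if there are none.
# Lines without a numeric vmid (e.g. when the lookup found nothing) are
# skipped. saved_vmids is a hypothetical helper name.
saved_vmids() {
    sed -n 's/^vmid=\([0-9][0-9]*\) .*/\1/p' "$@"
}

# Intended use on the ESXi host:
#   for id in $(saved_vmids /var/tmp/save_vms); do
#       vim-cmd vmsvc/power.on "$id"
#   done
```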

I think you probably want to see it running. So, let's run it. First we do some saving

/tmp # ./save_runningvms.sh save
Suspending VM...
Suspending VM...
Suspending VM...
/tmp # 
Did it create the /var/tmp/save_vms file? If so, what does it look like?
/tmp # cat /var/tmp/save_vms
vmid=24 ; wid=5718058;displayName="boot2docker"
vmid=23 ; wid=5714001;displayName="Windows 2012"
vmid=3 ; wid=5715871;displayName="devcentos"
/tmp # 
Ok, I am not convinced. You must be lying. Lemme go to the other vmhost, vmhost, and ping devcentos
[raub@vmhost tmp]# ping devcentos
PING devcentos.example.com (10.0.0.112) 56(84) bytes of data.
From vmhost.example.com (10.0.0.19) icmp_seq=2 Destination Host Unreachable
From vmhost.example.com (10.0.0.19) icmp_seq=3 Destination Host Unreachable
From vmhost.example.com (10.0.0.19) icmp_seq=4 Destination Host Unreachable
^C
--- devcentos.example.com ping statistics ---
7 packets transmitted, 0 received, +3 errors, 100% packet loss, time 6125ms
pipe 3
[raub@vmhost tmp]# 
Hmmmm, okay. But maybe it was off and you were lying to me. So, let's see about waking up the sleeping vms.
/tmp # ./save_runningvms.sh resume
Powering on VM:
Powering on VM:
Powering on VM:
/tmp #
And then pinging devcentos
[raub@vmhost tmp]# ping devcentos
PING devcentos.example.com (10.0.0.112) 56(84) bytes of data.
64 bytes from devcentos.example.com (10.0.0.112): icmp_seq=1 ttl=64 time=212 ms
64 bytes from devcentos.example.com (10.0.0.112): icmp_seq=2 ttl=64 time=0.316 ms
64 bytes from devcentos.example.com (10.0.0.112): icmp_seq=3 ttl=64 time=0.313 ms
^C
--- devcentos.example.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2078ms
rtt min/avg/max/mdev = 0.313/70.992/212.349/99.954 ms
[raub@vmhost tmp]#
I guess the script does work after all. What's the world coming to?

Monday, May 25, 2015

Get status of local VMware ESXi guests using command line

I have been talking a lot about KVM, and showing examples of how to control it from the command line. In fact, just a few posts ago we were talking about USB passthrough in KVM. I guess it is high time to do some in ESXi. After all, some of you -- yours truly included -- also have to deal with ESXi, which is VMware's product.

At this point in the presentation I can see some of you thumping your chests shouting "ESXi is Enterprise level product, not to be compared to the amateur hour likes of Kay-Vee-Em!". I have bad news for you, sunshine: they are all the same. They all can do live migration and clustering and so on. RedHat builds their enterprise turnkey solutions around KVM, which they also own. The differences between them are moving targets. And they are not the only games in town. Deal with it.

Back to the topic, yes ESXi has a nice, albeit C#-dependent (i.e. Windows only), GUI client... which they have been trying to get rid of for a while. But, sometimes I (I will take full blame for this) want to do something that is not available in the GUI. Now, I am aware I could be using PowerCLI, but what if I do want to have something running automagically in the ESXi host? To see what I mean, let's use a simple example: suppose we want to have a nice list of which vm clients are in this ESXi host and what they are up to (running/paused/etc). If you can't figure out why we would want to do that right now, hold onto your seat until the later parts of this article.

Of course this assumes you can ssh into the ESXi host.

The command we want to use is vm-support, and the option is --listvms:

vm-support --listvms
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/mail_1/mail_1.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/Windows 2012/Windows 2012.vmx (Running)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/coreos/coreos.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/devnetbsd/devnetbsd.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/centos64/centos64.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/boot2docker/boot2docker.vmx (Running)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/TheOnion/TheOnion.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/devubuntu/devubuntu.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/freebsd/freebsd.vmx (Registered)
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/devcentos/devcentos.vmx (Registered)
#

As you can see, it shows where each VM disk image lives (yes, yes, I am lazy but this is just a little blog post. Focus, focus, focus) and a status of Running or Registered. It is a good start; we feel good about ourselves until we realize that Registered can mean the vm is turned off or just saved. Bummer.

Now I know that when you save a vm, it creates a file with a .vmss extension that stores the state of the vm when it was saved. Let me show an example: I know that devcentos is saved for I did that last week. If what I said before is not a lie, we should find a .vmss file:

# ls /vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/devcentos/*.vmss
/vmfs/volumes/52a08b50-984b4bf0-219f-d067e51ce7b7/devcentos/devcentos-aaf17b2a.vmss
#
For those of you who read this blog (why would you do that?), that sounds just like the .save file KVM uses to do the very same thing. Didn't I mention that at the end of the day they are the same? So, what we need to do is, if we find a vm labelled as Registered, also see if it has a .vmss file. If so, it is a saved vm. I could bore you with the details and trials, but here is what I got:

cat > vm_status.sh  << 'EOF'
#!/bin/sh
# Split only on newlines ($'\n' would be a Bashism)
IFS='
'

for i in $(vm-support --listvms)
do 
   vm_path=$(dirname "$i")
   vm_name=$(basename "$vm_path")
   vm_status=$(echo "$i" | awk '{ print $NF}' )

   # Now check status of each vm
   case $vm_status in
      '(Running)' )
         echo "$vm_name is currently running"
         ;;
      '(Registered)' )
         test -f $vm_path/*.vmss \
            && echo "$vm_name is currently saved" \
            || echo "$vm_name is currently off"
         ;;
   esac

done
EOF
chmod +x vm_status.sh
It is not perfect, but it does what I want:
# ./vm_status.sh 
mail_1 is currently off
Windows 2012 is currently running
coreos is currently off
ctfbox is currently off
devnetbsd is currently off
centos64 is currently off
boot2docker is currently running
TheOnion is currently saved
devubuntu is currently off
freebsd is currently off
devcentos is currently saved
#
One thing you need to be aware of is that ESXi uses busybox, so do not try to write a full blown Bash script; you will end up very disappointed.

"So, what is the point of this script as it runs in the ESXi host?" you might ask, and you have a point. By itself it makes more sense to wire it in PowerCLI and run it that way. But, what if you want to have a script send you periodic reports of which vms are alive and which ones are off? Or what if you need to see if you need to start a given vm or just restore it? This script is small and a bit of a simpleton, but it shows the potential we have for writing proper scripts to be run in the ESXi host. I will show an even more practical example in an upcoming article.
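As a taste of the periodic report idea, here is a minimal sketch that boils the output of vm_status.sh down to one summary line. It assumes every line ends in running, saved, or off, as in the run shown above; summarize_status is a name I made up, and how you mail or log the result is up to you.

```sh
#!/bin/sh
# Count how many vms are running, saved, or off, based on the last word
# of each "NAME is currently STATE" line read from standard input.
# summarize_status is a hypothetical helper name.
summarize_status() {
    awk '
        { count[$NF]++ }
        END {
            printf "running=%d saved=%d off=%d\n",
                   count["running"], count["saved"], count["off"]
        }'
}

# Intended use on the ESXi host:
#   ./vm_status.sh | summarize_status
```

Keying on the last word means vm names with spaces in them, like Windows 2012, do not confuse the count.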