Friday, July 31, 2015

Detecting text file format using hexdump

Another quick one: we had a text file containing accented words and needed to figure out which character encoding it used. You see, for a while the "standard" text format for computers was ASCII, more precisely 7-bit ASCII (characters 0 to 127 in decimal), which was created in the 1960s and whose character set assumed English only. Before some of you get all excited, please note that ASCII stands for American Standard Code for Information Interchange, so it stands to reason they picked English. As other countries adopted the standard, it became clear that some of them needed characters it could not represent, and that led to many attempts to solve the problem. One of the earliest was to extend the original ASCII table with another 128 characters, which after a few adventures evolved into ISO-8859-1, a.k.a. ISO Latin-1. A later approach, UTF-8, keeps the 128 ASCII values as they are and encodes everything else as sequences of two or more bytes. There are other character sets, but the principle is the same.

Thanks for the history lesson, but how about getting to the point? How to identify the text format given a file? Let's answer that by using a couple of examples.

Example 1

Let's say we got a text file that has a name in it somewhere, say Luis de La Peña. Depending on how you look at the file (that is, how helpful your text viewer is), it might show either "ñ" or some garbled character; the latter happens when the viewer or terminal expects a different encoding than the one the file uses. For instance, cat spits out something like this on my Ubuntu laptop:

bash-3.2$ cat text_test1 
Luis de La Pe�a 
bash-3.2$

Don't know about you, but that "�" does nothing for me; it is just a placeholder shown because the byte coming out of cat cannot be rendered as a character. Let's try something else; since the title of this article mentions hexdump, I propose to look at the file through that program (I am telling it to print the hex value of each byte and then the ASCII representation of those bytes):

bash-3.2$ hexdump -Cv text_test1 
00000000  4c 75 69 73 20 64 65 20  4c 61 20 50 65 f1 61 20  |Luis de La Pe.a |
00000010  0a                                                |.|
00000011
bash-3.2$ 

The first thing we notice is that it too does not know how to show "ñ", so it uses "." as a placeholder. That is a different placeholder than the one we got from cat on my Ubuntu box; just deal with it. What we really care about is that the hex side tells us that

0xF1 = ñ

That is very important: it tells us that "ñ" is represented by a single 8-bit value, so UTF-8 is out (UTF-8 needs at least two bytes for anything outside 7-bit ASCII). So we need to look for an 8-bit charset. After hours of agonizing search and heavy drinking, we find that the extended ASCII and/or ISO-8859-1 tables match all the characters (don't believe me? Check the other characters in the text, including the space). Not bad at all: now we can read the text and convert it to a different character set.
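As a quick cross-check of that reasoning, here is a minimal Python sketch (Python is my choice for illustration; it was not part of the original investigation). It takes the bytes exactly as hexdump printed them and tries both decodings. The names raw, as_latin1 and is_utf8 are made up for this example.

```python
# Bytes copied from the hexdump of text_test1.
raw = bytes.fromhex("4c 75 69 73 20 64 65 20 4c 61 20 50 65 f1 61 20 0a")

# ISO-8859-1 maps every byte value to a character, so this always succeeds.
as_latin1 = raw.decode("iso-8859-1")
print(repr(as_latin1))

# UTF-8 rejects the file: 0xf1 announces a multi-byte sequence, but the
# next byte (0x61, "a") is not a continuation byte.
try:
    raw.decode("utf-8")
    is_utf8 = True
except UnicodeDecodeError as err:
    is_utf8 = False
    print("not UTF-8:", err.reason, "at offset", err.start)
```

Note that a successful ISO-8859-1 decode proves nothing by itself, since every byte sequence decodes in that charset; the useful signal is the UTF-8 failure.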

Example 2

So we feel all good about ourselves, and we need another example. This time, I will steal a real-life example from a previous article, where we had a text containing Italienisches Olivenöl which would cause DKIM email body authentication failures. Yeah, something as seemingly harmless as a character set can create some annoying problems.

As before, we begin by asking cat what it thinks about the text:

bash-3.2$ cat text_test2
Italienisches Olivenöl
bash-3.2$

Hold on right there. Why is cat able to print that "ö" but could not print that "ñ" earlier? Now you begin to see some of the limitations of cat compared to hexdump for these probulations: cat just passes the bytes along, so whether you get a readable character depends on whether the file's encoding matches what your terminal expects. hexdump knows nothing about character sets: it only knows ASCII; anything else becomes a ".". Of course, it would suck to use hexdump all the time, so you need to know your tools and when to use each one. Since we talked about hexdump, let's see what it sees:

bash-3.2$ hexdump -Cv text_test2
00000000  49 74 61 6c 69 65 6e 69  73 63 68 65 73 20 4f 6c  |Italienisches Ol|
00000010  69 76 65 6e c3 b6 6c 0a                           |iven..l.|
00000018
bash-3.2$

Some kind of funny business happening in the second row:

  1. All the English-looking characters are not only represented by one 8-bit value each, but by the same values we saw earlier in the ASCII example:
    0x69 = i
    0x76 = v
    0x65 = e
    0x6e = n
    0x6c = l
  2. There are two "." characters (0xc3 and 0xb6) where "ö" should be.
  3. There is a 0x0a after the "l".
What's going on here? Hold onto that question and let's check the "ö". According to the above, the two "." are there to tell us "ö" is represented as two 8-bit values:
0xc3 0xb6
If we look at any UTF-8 conversion table such as this one (picked at random), we will see that this is the UTF-8 hex for "ö" (the Unicode code point is U+00F6).
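If you would rather not trust a table picked at random, the two bytes can be unpacked by hand. The following Python sketch (my own illustration, with made-up variable names) applies the standard two-byte UTF-8 layout, 110xxxxx 10yyyyyy, to the bytes from the hexdump.

```python
raw = bytes([0xC3, 0xB6])

# A two-byte UTF-8 sequence has the bit layout 110xxxxx 10yyyyyy;
# the code point is made of the 11 payload bits xxxxxyyyyyy.
lead, cont = raw
assert lead >> 5 == 0b110   # lead byte of a two-byte sequence
assert cont >> 6 == 0b10    # continuation byte
code_point = ((lead & 0x1F) << 6) | (cont & 0x3F)

print(f"U+{code_point:04X}", chr(code_point))   # U+00F6 ö
print(raw.decode("utf-8") == chr(code_point))   # True
```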

OK, smartypants, what about the 0x0a after the "l"? Yes, that. You might not have noticed it was also in text_test1 in the first example. That is the line feed character, which in Linux marks the end of a line.
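The same byte-level view tells you which end-of-line convention a file uses. Here is a small Python sketch, assuming the usual conventions (LF alone on Linux, CR followed by LF on Windows-style files); the function name line_ending is hypothetical.

```python
def line_ending(data: bytes) -> str:
    """Classify the line terminator found in a chunk of bytes."""
    if b"\r\n" in data:
        return "CRLF"   # 0x0d 0x0a
    if b"\n" in data:
        return "LF"     # 0x0a alone
    if b"\r" in data:
        return "CR"     # 0x0d alone
    return "none"

# Second row of the text_test2 hexdump.
tail_of_test2 = bytes.fromhex("69 76 65 6e c3 b6 6c 0a")
print(line_ending(tail_of_test2))   # LF
```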

Insert Boring Conclusion Here

I hope this was useful to you; I thought it was fun and even learned a few things while writing it. The thought process here is similar to what you would do when, say, examining an encrypted document: try to find known patterns to work with before going after the really unknown stuff.
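For the impatient, the two examples boil down to a short decision procedure. The Python sketch below is only a rough heuristic under my own assumptions, not a full charset detector: guess_encoding is a made-up name, and it cannot tell ISO-8859-1 apart from other 8-bit charsets because the bytes alone do not carry that information.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best guess among ASCII, UTF-8 and 'some 8-bit charset'."""
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Could be ISO-8859-1, Windows-1252 or any other 8-bit set.
        return "8-bit (ISO-8859-1 or similar)"

# Bytes copied from the two hexdumps in this article.
text_test1 = bytes.fromhex("4c 75 69 73 20 64 65 20 4c 61 20 50 65 f1 61 20 0a")
text_test2 = bytes.fromhex(
    "49 74 61 6c 69 65 6e 69 73 63 68 65 73 20 4f 6c 69 76 65 6e c3 b6 6c 0a"
)

print(guess_encoding(text_test1))
print(guess_encoding(text_test2))
```

The order of the checks matters: plain ASCII is also valid UTF-8, so it is tested first, and the 8-bit answer is only the fallback when UTF-8 decoding fails.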

Sunday, July 26, 2015

Setting NTP server and time in Windows using Powershell

And here we have yet another Windows-related post! Yes, I too make fun of Windows as much as required to be in the IT business... OK, sometimes more. But, as I have said again and again, being able to solve problems using the command line (PowerShell specifically) makes it feel more like Unix. I can handle that and so can you!

Most of the Windows boxes I have met that use a time server to set their clocks use the Microsoft one, time.windows.com, no matter if they are the sole computer in a car shop or one of the thousands of desktops and servers at a university. That is nice until you have to move away from local-only user accounts and deal with Kerberos and, by extension, Active Directory. You see, Kerberos likes its clients to be within 5 minutes of the authentication servers (KDCs). Syncing against the Microsoft time server assumes your machine is in a network that can access the Internet. Well, I have 8 of them in a vlan that can't (and really shouldn't). Updates to them are pushed through SCCM (when it feels like working, but I digress) and AD.

On top of that, I have a perfectly good NTP server in my network that this vlan can reach. And its address is handed out by DHCP. To add insult to injury, the Windows DHCP client does not support the DHCP option that carries NTP servers. Here is a list of the DHCP options supported, right from their official docs.

So, as always, I need to do something to make it stop pissing me off. And, it will be in a script of some sort. This is Windows so bash is out and Powershell is in.

The plan is to be able to find which NTP server the Windows host is using and change it if we do not like it. And, while we are there, make sure the host's time is in sync with that of the NTP server. Windows uses W32Time and stores all of that in the registry, namely in HKLM:\SYSTEM\CurrentControlSet\services\W32Time, so if you want you can unleash regedit and go at it. Taking a cue from Unix and Linux, PowerShell treats the registry as a file tree. So, as far as it is concerned, the above is just a path which can be accessed and modified using Get-ItemProperty and Set-ItemProperty. Let's try it out by taking a look at what we have currently defined:

PS C:\> $timeRoot = "HKLM:\SYSTEM\CurrentControlSet\services\W32Time"
PS C:\> Get-ItemProperty  -path "$timeroot\parameters"


PSPath                 : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time\parameters
PSParentPath           : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time
PSChildName            : parameters
PSDrive                : HKLM
PSProvider             : Microsoft.PowerShell.Core\Registry
ServiceDll             : C:\Windows\system32\w32time.dll
ServiceMain            : SvchostEntry_W32Time
ServiceDllUnloadOnStop : 1
Type                   : NT5DS
NtpServer              : time.windows.com,0x9



PS C:\>

The 3 blank lines below NtpServer are not a typo; don't ask me why it spits out those lines, because they add absolutely no value to the output besides wasting screen real estate. As you can see, it wants to use time.windows.com as the NtpServer. But what is this 0x9 at the end of the name of the NTP server? Well, here is what I know about what the 0x flags mean:

  • 0x01 SpecialInterval: poll for the time at a fixed interval. Requires HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient\SpecialPollInterval to be set up. By default W32Time checks the time at intervals based on the network speed, traffic, and phases of the moon. But if you turn SpecialInterval on, it will check every SpecialPollInterval seconds. So SpecialPollInterval = 3600 means it will check the time every 3600 s (or 1 h).
  • 0x02 UseAsFallbackOnly
  • 0x04 SymmetricActive
  • 0x08 Client
  • 0x09 = 0x01 + 0x08. Yes, we can do math.
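Since the flags are individual bits, the "math" is really a bitwise OR, and decoding a value is a bitwise AND against each flag. Here is a hedged Python sketch of that decoding; the function name parse_ntp_server is made up, and the handling of several space-separated peers in one NtpServer string is my assumption rather than something shown in this article.

```python
# Flag names and values as listed in the bullets above.
NTP_FLAGS = {
    0x01: "SpecialInterval",
    0x02: "UseAsFallbackOnly",
    0x04: "SymmetricActive",
    0x08: "Client",
}

def parse_ntp_server(value: str):
    """Split an NtpServer registry string into (host, [flag names]) pairs."""
    peers = []
    # Assumption: multiple peers, if present, are separated by spaces.
    for entry in value.split():
        host, _, flags = entry.partition(",")
        bits = int(flags, 16) if flags else 0
        names = [name for bit, name in NTP_FLAGS.items() if bits & bit]
        peers.append((host, names))
    return peers

print(parse_ntp_server("time.windows.com,0x9"))
# [('time.windows.com', ['SpecialInterval', 'Client'])]
```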

If we want to change it to, say, ntp.example.com, in powershell we would begin by

PS C:\> Set-ItemProperty  -path "$timeroot\parameters" -name NtpServer -Value "ntp.example.com,0x9"
PS C:\>

And then checking again

PS C:\> Get-ItemProperty  -path "$timeroot\parameters"


PSPath                 : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time\parameters
PSParentPath           : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE
                         \SYSTEM\CurrentControlSet\services\W32Time
PSChildName            : parameters
PSDrive                : HKLM
PSProvider             : Microsoft.PowerShell.Core\Registry
ServiceDll             : C:\Windows\system32\w32time.dll
ServiceMain            : SvchostEntry_W32Time
ServiceDllUnloadOnStop : 1
Type                   : NT5DS
NtpServer              : ntp.example.com,0x9



PS C:\>

We changed the config, but we then need to restart the time service for that to take effect:

Restart-Service -Name w32Time -Force

Let's see if we can put some of that together in a script, which I shall call ntpTime.ps1:

# SEE ALSO
# https://technet.microsoft.com/en-us/library/ee176960.aspx

$timeRoot = "HKLM:\SYSTEM\CurrentControlSet\services\W32Time"

# Name of ntp server(s) currently known by this host
function Get-NTPServer {
   $ntpserver = (Get-ItemProperty  -path "$timeroot\parameters" `
                 -name NtpServer).NtpServer -replace ",.*"
   return $ntpserver
}

# So we do not like the ntp servers this host knows and want to change them.
# Remember the 0x flags!
#
# 0x01 SpecialInterval    
# 0x02 UseAsFallbackOnly  
# 0x04 SymmetricActive
# 0x08 Client
#
function Set-NTPServer ($ntpServer) {
   Set-ItemProperty -path "$timeroot\parameters" -name NtpServer -Value $ntpServer
}

function Restart-Time {
   Restart-Service -Name w32Time -Force
}

# How far off is our time (in seconds) from that of our ntp server?
function Get-NTPOffset ($ntpServer) {
   (w32tm /stripchart /computer:$ntpServer /samples:1)[-1].split("[")[0] `
   -replace ".*:" -replace "s.*"
}

# Adjust time by using the offset
function SetTime ($offsetSeconds) {
   set-date (Get-Date).AddSeconds($offsetSeconds)
}

## Using those silly functions ----------------------------------
$myNTP = "ntp.example.com"
$leserver = Get-NTPServer
# Only touch the registry if the host is NOT already using our server
if ( $leserver -ne $myNTP ){
   Set-NTPServer "$($myNTP),0x9"
}
SetTime (Get-NTPOffset $myNTP)
Restart-Time

I will put a more complete version in my github account, but the above is good enough to be productive. So, what it does is first see whether we are using the right NTP server ($myNTP, since I needed a lame variable name). If not, it changes it. And then it adjusts the time as needed. The script can then be run (schtasks anyone?) at regular intervals, or when the machine wakes up if it is a VM or laptop.
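The least obvious part of the script is the string surgery in Get-NTPOffset, so here is the same idea as a Python sketch with a regular expression. The sample line is hypothetical: its layout is an assumption based on what the PowerShell one-liner expects from the last line of w32tm /stripchart, and parse_offset is a made-up name.

```python
import re

# Hypothetical last line of `w32tm /stripchart` output (assumed layout).
sample = "10:00:00 d:+00.0156250s o:-00.0012345s  [            *            ]"

def parse_offset(line: str) -> float:
    """Extract the o:<offset>s field as a float number of seconds."""
    match = re.search(r"\bo:([+-]\d+\.\d+)s", line)
    if match is None:
        # No offset field at all, e.g. if the server could not be reached.
        raise ValueError(f"no offset found in: {line!r}")
    return float(match.group(1))

print(parse_offset(sample))
```

Failing loudly when no offset is found is a deliberate choice here: feeding an empty or garbled value into the clock adjustment would be worse than skipping it.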