Monday, February 27, 2017

Create output filename based on input filename using powershell

Here's a situation that happened to me many times in Linux: let's say we create a script which expects the user to enter the input and output filenames. Now, what if the user forgets to enter the output filename? Should we bark or come up with a filename based on the input filename? This of course depends on the situation, but whenever I decided to create the output filename, I would tack today's date onto the input filename so the two would be different.

But that was Linux and bash and python and this is Windows with powershell. And, yes, we could keep on writing in Bash using cygwin, but that would be cheating. Given the constraint of only running what came in Windows 7 and above (I am dating myself), let's see what we can do:

  1. Date. The date formats I like are (using Linux here, so focus on the output, not the command):

    raub@desktop:~$ date +%F
    2017-01-30
    raub@desktop:~$ 
    and
    raub@desktop:~$ date +%Y%m%d
    20170130
    raub@desktop:~$ 

    Both write the year using four digits, followed by two digits for the month and two for the day. I know some people will cry and moan and demand the traditional US format, month/day/year, but the format I like makes sorting in a directory much easier, as we are putting what changes the fastest at the end of the filename. But we are talking about powershell, not Bash or Bourne shell. That is true, but now we know what we want to accomplish.

    To make it easier, I will pick one of the two formats -- YYYYMMDD -- and run with it; you can later modify the code to use the other one as an exercise. The equivalent in powershell is:

    PS C:\Users\raub> get-date -format "yyyyMMdd"
    20170130
    PS C:\Users\raub> 

    Looks just like what we did above in Linux.
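
    For the other format, the one `date +%F` prints, a minimal sketch is below. It assumes the default Gregorian calendar; the variable name `$isoDate` is only for illustration.

    ```powershell
    # ISO-style date with dashes, the counterpart of `date +%F` in Linux
    $isoDate = Get-Date -Format "yyyy-MM-dd"

    # Print it; the value is a plain string, ready to be glued into a filename
    $isoDate
    ```

    Either format string can be dropped into the rest of the script, since Get-Date -Format returns a string in both cases.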

  2. Create filename if not given. We are reading the input filename into the script in some way or fashion. How we are doing it depends on the script and whether we should be passing options with or without flags to identify them. For now, we are going to be lazy and do the simplest thing possible: using param() at the beginning of the script.

    param($inputFile, $outputFile)

    If we have just one argument, it shall be the inputfile. If two, the second one is the outputfile. What if no arguments are passed? We can just send an error message and get out.

    function Usage
    {
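       Write-Host "Usage:", $script:MyInvocation.MyCommand.Name, "inputfile [outputfile]"
       exit
    }
    
    if (!$inputfile)
    {
       Usage
    }

    The $script:MyInvocation.MyCommand.Name is a lazy way for the script to get its own name. The script: scope prefix matters here because, inside a function, plain $MyInvocation describes the function call, so its MyCommand.Name would be "Usage" instead of the script's filename.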

  3. Do something if there is no $outputFile. This is a variation of the same test we did to see if we had an $inputFile:

    function LazyOutputFilename($ifile)
    {
       $ofile = (Get-Item $ifile ).DirectoryName + '\' +  `
                (Get-Item $ifile ).BaseName + `
                '_' + (get-date -format "yyyyMMdd") + `
                (Get-Item $ifile ).Extension
       return $ofile
    }
    
    function GetOutputFilename($ifile, $ofile)
    {
       # $ofile cannot be $ifile
       # Create a $ofile if one was not given
       if (( [string]::IsNullOrEmpty($ofile) ) -or ( $ofile -eq $ifile ))
       {  
          $ofile = LazyOutputFilename $ifile
       }
    
       return $ofile
    }
    
    $outputFile = GetOutputFilename $inputFile $outputFile

    • In LazyOutputFilename() we are creating the output filename. We are putting it in the same directory as the input filename and then adding the formatted date right before the file extension.

    • The ( [string]::IsNullOrEmpty($ofile) ) checks whether the output file, called $ofile inside this function, is empty. The reason we also want to make sure the output file is not the input file is that we might be reading the input file a chunk at a time (line by line if text) so the script can handle large files without using up all the memory. If we are reading it line by line and then writing right back to it, bad things might happen.

    • And, yes, we are overwriting the $outputFile variable if the name gets changed in GetOutputFilename().
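
    As a variation on LazyOutputFilename(), here is a sketch that calls Get-Item once and lets Join-Path take care of the directory separator. The function name LazyOutputFilenameJoin is made up for this example, and like the original it assumes the input file already exists.

    ```powershell
    # Hypothetical variant of LazyOutputFilename(): same result, fewer moving parts
    function LazyOutputFilenameJoin($ifile)
    {
       # Look the input file up once; this fails if the file does not exist
       $item  = Get-Item $ifile
       $stamp = Get-Date -Format "yyyyMMdd"

       # name_YYYYMMDD.ext
       $name  = $item.BaseName + '_' + $stamp + $item.Extension

       # Join-Path inserts the separator between directory and filename
       return (Join-Path $item.DirectoryName $name)
    }
    ```

    Because the name is built in one expression, there is no need for the backtick line continuations.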

  4. Put everything together.

    param($inputFile, $outputFile)
    
    function Usage
    {
       Write-Host "Usage:", $script:MyInvocation.MyCommand.Name, "inputfile [outputfile]"
       exit
    }
    
    <#
     Create output filename based on the full path of the input filename +
     today's date appended somewhere
     #>
    function LazyOutputFilename($ifile)
    {
       $ofile = (Get-Item $ifile ).DirectoryName + '\' +  `
                (Get-Item $ifile ).BaseName + `
                '_' + (get-date -format "yyyyMMdd") + `
                (Get-Item $ifile ).Extension
       return $ofile
    }
    
    function GetOutputFilename($ifile, $ofile)
    {
       # $ofile cannot be $ifile
       # Create a $ofile if one was not given
       if (( [string]::IsNullOrEmpty($ofile) ) -or ( $ofile -eq $ifile ))
       {  
          $ofile = LazyOutputFilename $ifile
       }
    
       return $ofile
    }
    
    if (!$inputfile)
    {
       Usage
    }
    
    $outputFile = GetOutputFilename $inputFile $outputFile
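
    One case the string comparison in GetOutputFilename() does not catch is the same file written two ways, say a relative name for the input and the full path for the output. A possible way to tighten that, sketched here with a made-up helper called Test-SamePath, is to expand both names to full paths before comparing them. It only compares the expanded strings, so it will not see through links or 8.3 short names.

    ```powershell
    # Hypothetical helper: true when two path strings expand to the same full path
    function Test-SamePath($pathA, $pathB)
    {
       # Expand each path relative to PowerShell's current location;
       # unlike Resolve-Path, this does not require the file to exist
       $fullA = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($pathA)
       $fullB = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($pathB)

       # -eq on strings ignores case by default, which suits Windows filesystems
       return ($fullA -eq $fullB)
    }

    # A relative name and its full path are treated as the same file
    Test-SamePath 'report.txt' (Join-Path (Get-Location) 'report.txt')
    ```

    To use it, the ( $ofile -eq $ifile ) test could become ( Test-SamePath $ofile $ifile ). It should stay after the IsNullOrEmpty check so that an empty $ofile never reaches the helper.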