PowerShell: Resolve Partial File Paths To Full Paths

My Use-case:

I am building a continuous integration tool in PowerShell for deploying scripts and database files (.sql, .rdl, .xmla, etc.) to SQL Server/Oracle with all the bells and whistles (over a hundred PowerShell functions so far 🙂 ). It needed reasonably flexible file path resolution, so I decided to share that piece. Please read on.

Two important inputs are

  1. A base-folder where the files to be deployed are located
  2. A list that has the files listed in the order of deployment

The file name references could change depending on the context. E.g., when a developer lists the files, his/her references are to a source control based location. When a configuration manager builds, he/she would probably stage it to a UNC path. When DBAs deploy it, the reference could be to a secure drop-off location.

In all this, the base path and file list should remain separated enough such that we are not taking a dependency on a hard-coded path. I will jump through hoops to keep code clean and re-usable and this is one of those tangents I took for that purpose.

Let us see an example:

For simplicity, let us say the following are the files involved.

  • C:\Temp\DB\OLTP\File1.sql
  • C:\Temp\DB\OLTP\File2.sql
  • C:\Temp\DB\DW\File1.sql
  • C:\Temp\DB\DW\File2.sql

This could translate to a base path of

  • C:\Temp

and a file listing of

  • \OLTP\File1.sql
  • \OLTP\File2.sql
  • \DW\File1.sql
  • \DW\File2.sql

Notice that we did not bother to include “DB” in the base path reference.

Now, our function (that we are about to see) should be able to scan the base path and resolve every listed file to its full path without collisions. If there is a collision, it should report an error with specifics.

If we later staged the files to a different location, all the code can remain the same and only the base path reference would change.
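The matching idea can be sketched on its own, without touching the disk. This is only a minimal illustration: `Resolve-PartialPath` is a hypothetical helper name (it is not the function shown later), and the `$allFiles` array is a stand-in for whatever a scan of the base path would return.

```powershell
# Hypothetical in-memory stand-in for the files found under the base path C:\Temp
$allFiles = @(
    'C:\Temp\DB\OLTP\File1.sql',
    'C:\Temp\DB\OLTP\File2.sql',
    'C:\Temp\DB\DW\File1.sql',
    'C:\Temp\DB\DW\File2.sql'
)

# Illustrative helper (assumed name): returns the single candidate that contains the partial path
function Resolve-PartialPath
{
    param([string[]] $Candidates, [string] $PartialPath)

    # Escape the partial path so '\' and '.' are matched literally
    $pattern = [regex]::Escape($PartialPath)
    $found = @($Candidates | Where-Object { $_ -match $pattern })

    if ($found.Count -eq 0) { throw "No match for [$PartialPath]" }
    if ($found.Count -gt 1) { throw "Multiple matches for [$PartialPath]: $($found -join ', ')" }

    $found[0]
}

# '\OLTP\File1.sql' identifies exactly one file even though "DB" is not mentioned
$resolved = Resolve-PartialPath -Candidates $allFiles -PartialPath '\OLTP\File1.sql'

# A bare file name is ambiguous here because both OLTP and DW contain File1.sql
$collision = $null
try { Resolve-PartialPath -Candidates $allFiles -PartialPath 'File1.sql' | Out-Null }
catch { $collision = $_.Exception.Message }
```

Including the folder in the listing is what keeps the references unambiguous; a listing of bare file names would only work if every name were unique under the base path.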

The additional parameters you will see keep it flexible: the list can be passed either as a string or as the name of a file that contains the list. There are also parameters for recursion, for the file list separator (usually a linefeed, but that can change), and for the comment character(s) that mark lines in the list to be ignored.
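As a rough sketch of how the separator and comment settings shape the list (the variable names and list text here are made up for illustration and are not taken from the function below):

```powershell
# Hypothetical list text: one comment line, one blank line and two real entries
$listText      = "# OLTP scripts first`n\OLTP\File1.sql`n`n\DW\File1.sql"
$separator     = "`n"
$commentPrefix = '#'

# Split on the separator, trim each entry, then drop blank lines and comment lines
$entries = @(
    $listText -split [regex]::Escape($separator) |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_.Length -gt 0 -and -not $_.StartsWith($commentPrefix) }
)
```

Only the two path entries survive, in their original order.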

The code



####################### 
<# 
.SYNOPSIS 
   Converts a list of partial or full paths to a verified list of full paths

.DESCRIPTION 
    Checks the given base folder and looks for every file whose name may be a full path 
        or a partial name. Tries to match to 1 file within the base folder.
        If it cannot match to a single file or matches to more than 1 file, it errors out with detailed messages

.INPUTS 
    Base folder to look in and an ordered list of file names (could be relative path)

.OUTPUTS 
    Fully valid paths for all the files in the ordered list of potentially partial paths

.EXAMPLE 
     
     $partialFileNameList = (gci c:\Temp -Filter *.txt -Recurse | select -ExpandProperty FullName | Out-String).Replace('C:\Temp','')

     "This is the Input stored into a file"
     $partialFileNameList

     Set-Content -Value $partialFileNameList -LiteralPath c:\Temp\test.txt -Force

     "This is the output with fully qualified paths"
     $returnHashTable = Resolve-FilePaths `
                        -BaseFolder 'C:\Temp' `
                        -OrderedFileListFileName c:\Temp\test.txt `
                        -Recurse: $true `
                        -FileListSeparator "`n" `
                        -LineCommentChar '#' `
                        -Verbose
    
    #HashTable keys have full file path and values have addl. details
    $returnHashTable.Keys

.EXAMPLE 
     
     #Same as above example but instead of a file, uses a string as input for list of files
     $partialFileNameList = (gci c:\Temp -Filter *.txt -Recurse | select -ExpandProperty FullName | Out-String).Replace('C:\Temp','')

     "This is the input, held in a string, with one full path added at the top"
     $partialFileNameList = "C:\Windows\system32\notepad.exe`n" + $partialFileNameList

     "This is the output with fully qualified paths"
     $returnHashTable = Resolve-FilePaths `
                        -BaseFolder 'C:\Temp' `
                        -OrderedFileListString $partialFileNameList `
                        -Recurse: $true `
                        -FileListSeparator "`n" `
                        -LineCommentChar '#' `
                        -Verbose
     
     $returnHashTable.Values
     

.NOTES 
    
    Created this to assist with automatic deploy file creation

Version History 
    v1.0   - Jana Sattainathan - Mar.16.2017 

.LINK 
    N/A
#>


function Resolve-FilePaths
{ 
    [CmdletBinding()] 
    param
    ( 	
        [ValidateScript({Test-Path $_ -PathType Container})]
        [Parameter(Mandatory=$false)] 
        [string] $BaseFolder = $null,

        #The file list (could be partial path with filename or just filename) ordered in execution order
        [Parameter(Mandatory=$true, 
                    ParameterSetName='OrderedFileListString')] 
        [string] $OrderedFileListString,

        #Same as above but instead of a physical list, this is the file containing the ordered list of files
        [ValidateScript({Test-Path $_ -PathType Leaf})]
        [Parameter(Mandatory=$true, 
                    ParameterSetName='OrderedFileListFileName')] 
        [string] $OrderedFileListFileName,

        [Parameter(Mandatory=$false)] 
        [switch] $Recurse = $true,

        [Parameter(Mandatory=$false)] 
        [string] $FileListSeparator = "`n",

        #Lines in the file list starting with this string will be ignored
        [Parameter(Mandatory=$false)] 
        [string] $LineCommentChar = '#'
    )

    [string] $fn = $MyInvocation.MyCommand
    [string] $stepName = "Begin [$fn]"    
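    #Ordered dictionary so that keys and values enumerate in the order of the input list
    [System.Collections.Specialized.OrderedDictionary]$returnValuesHashTable = [ordered]@{}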
    [Object] $returnObj = New-Object PSObject

    [string[]] $rawFileList = @()
    [System.Collections.ArrayList] $refinedFileList = @()
    
    [bool]$ignoreLine = $false
    [int]$counter = 0

    try
    {
    
        $stepName = "[$fn]: Validate parameters"
        #--------------------------------------------        
        Write-Verbose $stepName


        $stepName = "[$fn]: Get the raw list of files (could be partial names)"
        #--------------------------------------------        
        Write-Verbose $stepName  

        if ($PSCmdlet.ParameterSetName -eq 'OrderedFileListString')
        {
            $rawFileList = $OrderedFileListString.Split($FileListSeparator)
        }
        if ($PSCmdlet.ParameterSetName -eq 'OrderedFileListFileName')
        {
            $rawFileList = (Get-Content $OrderedFileListFileName).Split($FileListSeparator)
        }

    
        $stepName = "[$fn]: Resolve to literal paths in [$BaseFolder] for files"
        #--------------------------------------------        
        Write-Verbose $stepName  

        foreach($rawFile in $rawFileList)
        {
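            #Reset for every line; otherwise one blank/comment line would cause all later lines to be skipped
            $ignoreLine = $false

            #http://stackoverflow.com/questions/38044236/test-path-illegal-characters-in-path
            #Get rid of special characters to avoid error "Test-Path : Illegal characters in path"
            #Keep forward slashes and the colon that follows a leading drive letter (e.g. C:)
            #Note: hyphens are stripped as well, so a file name containing '-' will not resolve
            $file = ($rawFile -replace '(-|#|\||"|,|(?<!^\s*[A-Za-z]):|â|€|™|\?)').Trim()
            
            #Ignore blank lines
            if ($file.Length -eq 0)
            {
                $ignoreLine = $true
            }

            #Ignore comments (checked on the raw line because the cleanup above strips characters such as '#')
            if ($rawFile.Trim().ToUpper().StartsWith($LineCommentChar.ToUpper()))
            {
                $ignoreLine = $true
            }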


            if ($ignoreLine -eq $false)
            {
                #If it is already a fully valid file path, just use it!
                #------------------------------------------------------
                if ((Test-Path -LiteralPath $file -PathType Leaf) -eq $true)
                {
                    $refinedFileList.Add($file) | Out-Null
                }
                #File is not a fully valid file path, we need to resolve to one!
                #------------------------------------------------------
                else
                {
                    #We need the $BaseFolder to be valid to search for the file at this point!
                    if ($BaseFolder.Trim().Length -eq 0)
                    {
                        Throw "[$fn]: File [{0}] is not a fully valid leaf. Valid BaseFolder is required to search for partial file path/name!" -f $file
                    }

                    #Find all files within base folder whose full path and name matches our partial filename with or without path
                    #The partial name is regex-escaped so '.' and other special characters match literally, and folders are skipped
                    #($matchedFiles is used instead of $matches, which is an automatic variable populated by -match)
                    $filePattern = [regex]::Escape($file.Replace('\','/'))
                    $matchedFiles = @(Get-ChildItem $BaseFolder -Recurse:$Recurse | 
                                    Where-Object {(-not $_.PSIsContainer) -and ($_.FullName.Replace('\','/') -match $filePattern)})

                    #No matches were found!
                    if ($matchedFiles.Count -eq 0)
                    {
                        Throw "[$fn]: Unable to locate file: [{0}] in base path [{1}]!" -f $file, $BaseFolder
                    }

                    #Too many matches were found!
                    if ($matchedFiles.Count -gt 1)
                    {
                        Write-Warning ('Multiple matches were found for file: [{0}] in base path [{1}] {2}' -f
                                            $file, $BaseFolder, ($matchedFiles | Select-Object -ExpandProperty FullName | Out-String))
                        
                        Throw "[$fn]: Multiple matches were found for file: [{0}] in base path [{1}]!" -f $file, $BaseFolder
                    }

                    #Exactly one match was found!
                    if ($matchedFiles.Count -eq 1)
                    {
                        $refinedFileList.Add($matchedFiles[0].FullName) | Out-Null
                    }

                }  #if ((Test-Path -LiteralPath

            }  #if ($ignoreLine

        }  #foreach($file


        $stepName = "[$fn]: Check if files qualified"
        #--------------------------------------------        
        Write-Verbose $stepName  

        #No files qualified
        if ($refinedFileList.Count -eq 0)
        {
            Throw 'No files from the list are valid/qualified in given base path [{0}]!' -f $BaseFolder
        }

        $stepName = "[$fn]: Check if input list qualified for duplicate fully qualified paths!"
        #--------------------------------------------        
        Write-Verbose $stepName  
        
        if ($refinedFileList.Count -gt (@($refinedFileList | Select-Object -Unique)).Count)
        {
            $duplicatesListString = Compare-Object -CaseSensitive:$false -ReferenceObject $refinedFileList -DifferenceObject ($refinedFileList | Select-Object -Unique) | Out-String

            Write-Warning ("[$fn]: Duplicates found: {0}" -f $duplicatesListString)

            Throw 'File list results in duplicates! Please check. {0}' -f $duplicatesListString
        }

        $stepName = "[$fn]: Loop through the files and produce extended output!"
        #--------------------------------------------        
        Write-Verbose $stepName  


        foreach($file in $refinedFileList)
        {
            $psObjectFile = New-Object PSObject
            $fileObject = Get-ChildItem -LiteralPath $file

            $psObjectFile | Add-Member -NotePropertyName 'FullFilePathAndName' -NotePropertyValue $file
            $psObjectFile | Add-Member -NotePropertyName 'FullFilePath' -NotePropertyValue $fileObject.Directory
            $psObjectFile | Add-Member -NotePropertyName 'FileNameLeaf' -NotePropertyValue $fileObject.Name
            $psObjectFile | Add-Member -NotePropertyName 'FileExtension' -NotePropertyValue $fileObject.Extension
            $psObjectFile | Add-Member -NotePropertyName 'CreationTime' -NotePropertyValue $fileObject.CreationTime
            $psObjectFile | Add-Member -NotePropertyName 'LastWriteTime' -NotePropertyValue $fileObject.LastWriteTime
            $psObjectFile | Add-Member -NotePropertyName 'LastAccessTime' -NotePropertyValue $fileObject.LastAccessTime
            $psObjectFile | Add-Member -NotePropertyName 'SizeBytes' -NotePropertyValue $fileObject.Length
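            #Note: $counter indexes the refined list, so it lines up with $rawFileList only when no input lines were skipped
            $psObjectFile | Add-Member -NotePropertyName 'InputFileName' -NotePropertyValue $rawFileList[$counter]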
            $psObjectFile | Add-Member -NotePropertyName 'InputBaseFolder' -NotePropertyValue $BaseFolder
            # RelativeFilePath is relative to given BaseFolder if one is provided. 
            #   Preserve leading slash of filename by removing trailing slash of BaseFolder (if it has one)
            $psObjectFile | Add-Member -NotePropertyName 'RelativeFilePath' -NotePropertyValue ($psObjectFile.FullFilePathAndName -ireplace [regex]::Escape($BaseFolder.TrimEnd('\')), '') 
            #True only when a BaseFolder was given and the file does not live under it
            $psObjectFile | Add-Member -NotePropertyName 'IsFileOutsideBaseFolder' -NotePropertyValue (($BaseFolder.Trim().Length -gt 0) -and (-not $psObjectFile.FullFilePathAndName.ToLower().StartsWith($BaseFolder.ToLower())))
            
            #Add as key/value pairs for easy access later.
            $returnValuesHashTable.Add($psObjectFile.FullFilePathAndName, $psObjectFile) | Out-Null
            $counter = $counter + 1               
        }
       


        #Finally....if we get to this point, it is a good list!
        $returnValuesHashTable

    }
    catch
    {
        [Exception]$ex = $_.Exception
        Throw "Unable to resolve file paths. Error in step: `"{0}`" `n{1}" -f `
                        $stepName, $ex.Message
    }
    finally
    {
        #Return value if any

    }
}

The results:

The result is a hashtable-style collection with one entry per file that you input (blank and comment lines excluded). Each key is the full path of a file, and each value is a custom PS object with a lot of good information, including the original input.
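To give a feel for how the result might be consumed, here is a small sketch. The `$returnHashTable` below is a hand-built stand-in with made-up values, not real output; only a few of the property names from the function are reproduced.

```powershell
# Hand-built stand-in (assumed values) shaped like the function's return value
$returnHashTable = @{
    'C:\Temp\DB\OLTP\File1.sql' = [pscustomobject]@{
        FullFilePathAndName = 'C:\Temp\DB\OLTP\File1.sql'
        FileNameLeaf        = 'File1.sql'
        FileExtension       = '.sql'
        RelativeFilePath    = '\DB\OLTP\File1.sql'
    }
    'C:\Temp\DB\DW\File1.sql' = [pscustomobject]@{
        FullFilePathAndName = 'C:\Temp\DB\DW\File1.sql'
        FileNameLeaf        = 'File1.sql'
        FileExtension       = '.sql'
        RelativeFilePath    = '\DB\DW\File1.sql'
    }
}

# Look up the details of one file by its full path
$details = $returnHashTable['C:\Temp\DB\OLTP\File1.sql']

# Pick out only the files under the DW folder using the relative path
$dwFiles = @($returnHashTable.Values | Where-Object { $_.RelativeFilePath -like '\DB\DW\*' })
```

Keying by full path makes individual lookups cheap, while the values can be filtered like any other collection of objects.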

Please give it a try and let me know if you find it useful.
