PowerShell: Abort after encountering [x] errors within the last minute

This short article shows how to abort a PowerShell loop once it has encountered [x] errors within the last minute. The technique can be reapplied to any similar situation.

Before I elaborate on how this is done, I want to give kudos to Brian Wilhite, my friend and organizer of the Charlotte PowerShell user group, for showing us a neat trick using hashtables, which I applied for this purpose. Basically, assigning a value to an item with a hitherto unknown hashtable key creates a new item with that key. So, for example, "$myHash['Jana'] += 1" either creates a new item in the hashtable with the key Jana and initializes its value to 1 when run for the first time, or increments the current value of the item on subsequent runs.

In my case, I have developed a PowerShell-based copy server that runs 24×7. By definition, a server should not die except under extreme conditions! If there are issues, it logs errors and limps along until it absolutely cannot run. However, this had the negative side effect of generating a ton of log entries and notification emails. In reality, if the server encounters more than [x] errors within 1 minute (could be [y] minutes), then we know that it is a lost cause and we can safely abort the server! In my case, I wanted to abort the server if it encountered more than 50 errors within the last minute (the values come from a configuration table and are not hardcoded).

The logic goes like this within the Catch block:

  • Keep a count of errors per minute in a HashTable, keyed by the current minute
  • If there is no entry in the HashTable for the current minute yet, clear the HashTable before counting the error (housekeeping)
  • If the current minute has more than [x] errors, then abort

This is where I am using the HashTable behavior that Brian showed us last night. Specifically,

  • If a value is assigned to a HashTable key that does not already exist, the key is created and the value is assigned (it does not throw an error!)
  • If a value is incremented using the "+=" operator, an existing value is incremented; if the key does not exist yet, the missing value is $null, which is treated as 0, so the new entry starts at 1
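
The two behaviors above can be tried in isolation. This is a minimal sketch; the key names are only illustrative, and the values in the trailing comments are what a fresh session should show.

```powershell
# Start with an empty hashtable; no keys exist yet.
$myHash = @{}

# First use: the key 'Jana' does not exist, so $null + 1 evaluates to 1
# and the key is created with that value.
$myHash['Jana'] += 1

# Subsequent use: the key exists, so the stored value is incremented.
$myHash['Jana'] += 1

# Plain assignment to an unknown key also creates the entry without an error.
$myHash['Brian'] = 10

$myHash['Jana']    # 2
$myHash['Brian']   # 10
$myHash.Count      # 2
```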

This is the exact logic that I have in the CATCH block of my function (feel free to modify it in any form to suit your needs):

$stepName = 'Abort engine if more than [x] errors have occurred in the last minute'
#--------------------------------------------
#This was declared at the top of my function
#[HashTable] $errorCountByMin = @{}

#Build the key for the current minute, e.g. 202001011000
[string]$thisMin = Get-Date -Format 'yyyyMMddHHmm'

#Clear the error counter hashtable if this is a brand new "minute".
#This check must run BEFORE the increment below: once the increment
#has created the key, ContainsKey() is always true.
if (-not $errorCountByMin.ContainsKey($thisMin))
{
    #This minute has no entry yet and a new one is about to be created
    #..so let us use the opportunity to do housecleaning!
    #  (periodic clearing of hashtable to keep it from growing too big)
    $errorCountByMin.Clear()
}

# ...add a hashtable element (or increment if element already there) for current minute (as key)
$errorCountByMin[$thisMin] += 1

#Abort if we crossed the max errors per minute threshold!
if ($errorCountByMin[$thisMin] -gt $rowConfig.EngineAbortAfterMaxErrorsPerMin)
{
    $stepName = "More than [$($rowConfig.EngineAbortAfterMaxErrorsPerMin)] errors occurred within the last minute on main engine loop. See email or error log table - EngineLog. Aborting engine!"

    #DO YOUR OWN LOGGING HERE

    throw $stepName
}

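Yes, it was that simple. Note that the above is the Catch block code of my main server loop. Also note that errors are counted per calendar minute (the key changes when the clock rolls over to a new minute), not over a rolling 60-second window.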
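
If you want to try the counting logic outside of a real Catch block, here is a minimal, self-contained sketch. The function name Test-ErrorThresholdExceeded, its parameters, and the fixed timestamps are hypothetical and exist only for this illustration; in my server the threshold comes from the configuration table instead.

```powershell
# Hypothetical helper: records one error for the minute of $Now and
# returns $true when that minute's count exceeds $MaxErrorsPerMin.
function Test-ErrorThresholdExceeded {
    param(
        [hashtable] $ErrorCountByMin,
        [int]       $MaxErrorsPerMin,
        [datetime]  $Now = (Get-Date)
    )

    # Key for the current minute, in the same format as the Catch block code.
    $thisMin = $Now.ToString('yyyyMMddHHmm')

    # Housekeeping has to run before the increment creates the key.
    if (-not $ErrorCountByMin.ContainsKey($thisMin)) {
        $ErrorCountByMin.Clear()
    }

    # Creates the entry with 1, or increments the existing count.
    $ErrorCountByMin[$thisMin] += 1

    return ($ErrorCountByMin[$thisMin] -gt $MaxErrorsPerMin)
}

# Simulate errors with a threshold of 2 errors per minute.
$counts = @{}
$t = [datetime]'2020-01-01T10:00:00'

$r1 = Test-ErrorThresholdExceeded -ErrorCountByMin $counts -MaxErrorsPerMin 2 -Now $t
$r2 = Test-ErrorThresholdExceeded -ErrorCountByMin $counts -MaxErrorsPerMin 2 -Now $t.AddSeconds(10)
$r3 = Test-ErrorThresholdExceeded -ErrorCountByMin $counts -MaxErrorsPerMin 2 -Now $t.AddSeconds(20)

# A new minute starts: the old entry is cleared and counting restarts at 1.
$r4 = Test-ErrorThresholdExceeded -ErrorCountByMin $counts -MaxErrorsPerMin 2 -Now $t.AddMinutes(1)
```

The third call is the first one to exceed the threshold, and the fourth call starts over because it falls into a new minute. Passing the time in as a parameter is only there to make the sketch repeatable; in a real Catch block you would let it default to Get-Date and throw when the function returns $true.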

Thanks for the spark Brian!
