Usually, PowerShell is used as “glue” to stitch a bunch of commands and programs together. It does not need to be a speed demon to do that (and nobody says it is); flexibility comes at a price. But there are cases where you’re doing seemingly trivial things, yet your script takes ages to finish.
There is a useful cmdlet, Measure-Command, that measures how long a piece of code takes to run. The usage is very simple:
$timespan = Measure-Command {
    # do whatever you want to measure here
}
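Measure-Command returns a TimeSpan, so you can report the elapsed time however you like. A quick example (the Start-Sleep below is just a placeholder for whatever you actually want to time):
$timespan = Measure-Command {
    Start-Sleep -Milliseconds 200   # placeholder for the real work
}
Write-Host "that took $($timespan.TotalMilliseconds) ms"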
That’s nice if you know or suspect which part of the code is slow. But I would like to have something that’s more like instrumentation. What I want is a list of called functions with their total run times and number of calls.
That’s why I created a little wrapper around Measure-Command, called Measure-Function, that makes it easy to gather measurements for multiple functions. So now, if I have a function that I want to measure:
function Get-Something {
    # i'm doing some heavy loading here
    return $something
}
I just wrap the body with Measure-Function like this:
function Get-Something {
    Measure-Function "$($MyInvocation.MyCommand.Name)" {
        # i'm doing some heavy loading here
        return $something
    }
}
Measure-Function takes care of aggregating the measurements and makes sure not to measure recursive invocations. To get the results, do:
$global:perfcounters | format-table -AutoSize -Wrap | out-string | write-host
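If you’d like to roll your own, a stripped-down wrapper could look roughly like this. This is a minimal sketch rather than the exact code I use - it only relies on Measure-Command and the $global:perfcounters collection shown above:
$global:perfcounters = New-Object System.Collections.Generic.List[object]
$script:measuring = @{}   # names currently being measured (recursion guard)

function Measure-Function {
    param(
        [Parameter(Mandatory = $true)] [string] $name,
        [Parameter(Mandatory = $true)] [scriptblock] $body
    )

    # nested call of a function we're already measuring - run it unmeasured,
    # so recursive invocations are not counted twice
    if ($script:measuring[$name]) {
        return (& $body)
    }

    $script:measuring[$name] = $true
    try {
        # [ref] lets the block's result escape the Measure-Command script block
        $capture = [ref]$null
        $elapsed = Measure-Command { $capture.Value = & $body }

        $entry = $global:perfcounters | Where-Object { $_.name -eq $name }
        if (-not $entry) {
            $entry = [pscustomobject]@{ name = $name; elapsed = [timespan]::Zero; count = 0 }
            $global:perfcounters.Add($entry)
        }
        $entry.elapsed += $elapsed
        $entry.count++

        return $capture.Value
    }
    finally {
        $script:measuring.Remove($name)
    }
}
The [ref] wrapper is just a safe way to get the script block’s result back out of Measure-Command, regardless of which scope the block ends up running in.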
Now, to pinpoint bottlenecks in your code, you can follow these steps:
- Start with the entry point of your script and add Measure-Function to it and to the functions it calls.
- Run the code and see which function takes the most time.
- Repeat these steps with the slowest functions, until you find the bottleneck.
PowerShell hashtable quirks
One of the things I discovered using the aforementioned method was in a place I really wasn’t expecting - enumerating over a hashtable. It should be blazingly fast even in PowerShell! As it turns out, it can be awfully slow - if you’re not careful enough.
Take a look at these three simple scenarios:
# $h is a hashtable of size 10000
$size = 10000
$h = @{}
for ($i = 0; $i -lt $size; $i++) {
    $h += @{ "key$i" = "value$i" }
}

measure-function "enumerating $($h.count) items by enumerator" {
    foreach ($e in $h.GetEnumerator()) {
        $k = $e.key
        $v = $e.value
    }
}

measure-function "enumerating $($h.count) items by keys" {
    foreach ($k in $h.keys) {
        $v = $h[$k]
    }
}

measure-function "enumerating $($h.count) items with property accessor" {
    foreach ($k in $h.keys) {
        $v = $h.$k
    }
}
$global:perfcounters | format-table -AutoSize -Wrap | out-string | write-host
Each loop enumerates the hashtable and accesses the stored values. Should be a matter of milliseconds, right? Well, let’s see…
name elapsed count
---- ------- -----
enumerating 10000 items with property accessor 00:00:30.4342957 1
enumerating 10000 items by keys 00:00:00.0479557 1
enumerating 10000 items by enumerator 00:00:00.1173057 1
As it turns out, accessing hashtable values through the property accessor takes more than 600 times longer than plain index access!
At first glance, I would think that the form $h.$k is just syntactic sugar for $h[$k]. But it really isn’t (and can’t be) that simple. $k may not only be a key inside the hashtable - it may just as well be a property, like Count, or a method, like ContainsKey. So underneath, PowerShell has to do some really time-consuming stuff - reflection, dynamic member resolution and whatnot - just to get you a value out of the hashtable.
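If you want to convince yourself that the dot form really goes through member resolution rather than a plain key lookup, create a key whose name collides with a real property - the psbase workaround below is the one described in the about_Hash_Tables help topic:
$ht = @{ keys = 'surprise!' }
$ht.keys          # member access: the key shadows the real Keys property
$ht.psbase.Keys   # psbase gets you to the underlying property
$ht['keys']       # indexing is unambiguous - it always means the key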
The conclusion is simple: if you know you’re working with a potentially big hashtable, don’t go for shortcuts - use plain old $h[$k]. But if you’re not in a tight loop, just go with whatever you find more readable.
References:
- Measure-Command
- There is also a discussion on PowerShell hashtable insert.