PowerShell Performance Tips for Large Text Operations – Part 1: Reading Files
Posted in Windows Powershell | 1 Comment | 2,948 views | 28/02/2015 04:08

I want to share some performance tips for large text operations in PowerShell.

Test File: 424390 lines, 200 MB Microsoft IIS Log

1. First of all, we have to read the file :) Let's try our alternatives:

a. Native command: Get-Content

$LogFilePath = "C:\large.log"
$Lines = Get-Content $LogFilePath
[int]$LineNumber = 0;
 
# Read Lines
foreach ($Line in $Lines)
{
	$LineNumber++
}

With this option, the script takes 13.3727013 seconds to read the file and loop through its 424390 lines.
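The post does not say how the timings were taken. One way to get comparable numbers is a System.Diagnostics.Stopwatch around the read and the loop. This is only a sketch: the small temp file stands in for the real 200 MB log and is an assumption so that the snippet runs anywhere.

```powershell
# Stand-in for the real log: a small temp file (assumption, so the sketch runs anywhere).
$LogFilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-iis.log'
1..1000 | ForEach-Object { "2015-02-28 04:08:00 GET /page$_ 200" } | Set-Content -Path $LogFilePath

[int]$LineNumber = 0

# Start timing just before the read, stop right after the loop.
$Stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

$Lines = Get-Content $LogFilePath
foreach ($Line in $Lines)
{
    $LineNumber++
}

$Stopwatch.Stop()

"{0} lines in {1} seconds" -f $LineNumber, $Stopwatch.Elapsed.TotalSeconds
```

Swapping the two lines that read the file for one of the other methods gives a like-for-like comparison on your own log.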

But how about memory usage?

[Screenshot: memory usage with Get-Content]

Because the result of Get-Content is assigned to $Lines, the whole file is held in memory, so high memory usage is expected.
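The high memory figure comes from collecting every line in $Lines. If you want to stay with Get-Content, piping its output instead of assigning it should keep memory lower, and -ReadCount sends the lines in batches to cut pipeline overhead. I have not timed this on the test file above, so treat it as a sketch. The batch size of 500 and the sample temp file are assumptions.

```powershell
# Stand-in for the real log: a small temp file (assumption, so the sketch runs anywhere).
$LogFilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-iis.log'
1..1000 | ForEach-Object { "2015-02-28 04:08:00 GET /page$_ 200" } | Set-Content -Path $LogFilePath

[int]$LineNumber = 0

# -ReadCount 500 sends lines down the pipeline in arrays of up to 500,
# so the whole file is never collected into a single variable.
Get-Content $LogFilePath -ReadCount 500 | ForEach-Object {
    foreach ($Line in $_)
    {
        $LineNumber++
    }
}

"Lines read: $LineNumber"
```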

b. Using .NET method: [io.file]::ReadAllLines

$LogFilePath = "C:\large.log"
$Lines = [io.file]::ReadAllLines($LogFilePath)
[int]$LineNumber = 0;
 
# Read Lines
foreach ($Line in $Lines)
{
	$LineNumber++
}

With this option, the script takes 2.0082615 seconds to read the file and loop through its 424390 lines, which is much faster than Get-Content.

Memory usage is lower than with Get-Content but still high. I couldn't capture it in a screenshot, but CPU usage peaks at 13%.

[Screenshot: memory usage with [io.file]::ReadAllLines]
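ReadAllLines returns the whole file as a string array, which is where the memory goes. A closely related call, [System.IO.File]::ReadLines (available from .NET 4), returns a lazy enumeration instead, so foreach can walk the file without building the full array. This is a different method from the one timed above, and I have no measurement for it on the test file. The sample temp file is an assumption.

```powershell
# Stand-in for the real log: a small temp file (assumption, so the sketch runs anywhere).
$LogFilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-iis.log'
1..1000 | ForEach-Object { "2015-02-28 04:08:00 GET /page$_ 200" } | Set-Content -Path $LogFilePath

[int]$LineNumber = 0

# ReadLines yields one line at a time; nothing is read until foreach asks for it.
foreach ($Line in [System.IO.File]::ReadLines($LogFilePath))
{
    $LineNumber++
}

"Lines read: $LineNumber"
```

Unlike the StreamReader option, there is no way here to pass a FileShare value, so this may fail on a log file that another process still has open for writing.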

c. Using .NET method: System.IO.StreamReader

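$LogFilePath = "C:\large.log"
# Open read-only and let other processes keep reading and writing the file
$FileStream = New-Object -TypeName IO.FileStream -ArgumentList ($LogFilePath), ([System.IO.FileMode]::Open), ([System.IO.FileAccess]::Read), ([System.IO.FileShare]::ReadWrite);
# ASCII is the fallback encoding; $true lets the reader detect the encoding from a byte order mark
$ReadLogFile = New-Object -TypeName System.IO.StreamReader -ArgumentList ($FileStream, [System.Text.Encoding]::ASCII, $true);

[int]$LineNumber = 0;

try
{
	# Read Lines
	while (!$ReadLogFile.EndOfStream)
	{
		$LogContent = $ReadLogFile.ReadLine()
		$LineNumber++
	}
}
finally
{
	# Closing the reader also closes the underlying FileStream, even if the loop throws
	$ReadLogFile.Close()
}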

With this option, the script takes 1.7062244 seconds to read the file and loop through its 424390 lines. This seems to be the fastest method.

Memory usage is also very low because the file is read line by line, so PowerShell doesn't hold the whole file in memory.

[Screenshot: memory usage with System.IO.StreamReader]
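The memory figures in this post come from screenshots. If you would rather have numbers in the script itself, one rough approach is to read the process working set before and after the read. Get-WorkingSetMB is a hypothetical helper written for this sketch, and the sample temp file is an assumption. The working set is a coarse measure, so expect some noise.

```powershell
# Stand-in for the real log: a small temp file (assumption, so the sketch runs anywhere).
$LogFilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-iis.log'
1..1000 | ForEach-Object { "2015-02-28 04:08:00 GET /page$_ 200" } | Set-Content -Path $LogFilePath

# Hypothetical helper: working set of this PowerShell process, in MB.
function Get-WorkingSetMB
{
    [math]::Round((Get-Process -Id $PID).WorkingSet64 / 1MB, 1)
}

$Before = Get-WorkingSetMB

$ReadLogFile = New-Object -TypeName System.IO.StreamReader -ArgumentList $LogFilePath
[int]$LineNumber = 0
try
{
    while (!$ReadLogFile.EndOfStream)
    {
        # Discard the line; only the count and the memory figures matter here.
        $null = $ReadLogFile.ReadLine()
        $LineNumber++
    }
}
finally
{
    $ReadLogFile.Close()
}

$After = Get-WorkingSetMB

"Working set: {0} MB before, {1} MB after, {2} lines" -f $Before, $After, $LineNumber
```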

CPU usage is still high in this case, though. The script is probably saturating one core of the server while it runs, but that's something I can't help :)

Winner: System.IO.StreamReader

In the next part, I'll show you text manipulation tips. See you.


Comments (1)

Duane

January 29th, 2016
02:26:16

Dude, you are not kidding, System.IO.StreamReader was crazy fast! Thank you for this!


