PowerShell Performance Tips for Large Text Operations – Part 1: Reading Files
Posted in Windows Powershell | 4 Comments | 7,792 views | 28/02/2015 04:08

I want to share some performance tips for large text operations in PowerShell.

Test File: 424390 lines, 200 MB Microsoft IIS Log

1. First of all, we have to read the file :) Let's try the alternatives:

a. Native command: Get-Content

$LogFilePath = "C:\large.log"
$Lines = Get-Content $LogFilePath
[int]$LineNumber = 0;
 
# Read Lines
foreach ($Line in $Lines)
{
	$LineNumber++
}

With this option, the script takes 13.3727013 seconds to read and loop over 424390 lines.
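
The post does not say how the timings were taken. One way to reproduce this kind of measurement is to wrap the read-and-loop in a System.Diagnostics.Stopwatch. The sketch below assumes that approach and uses a small generated temp file ($SamplePath, a hypothetical stand-in for the 200 MB log) so it can run anywhere:

```powershell
# Hypothetical stand-in for the large IIS log: a temp file with 1000 lines.
$SamplePath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($SamplePath, [string[]](1..1000 | ForEach-Object { "line $_" }))

# Start the stopwatch, then read the file and loop over its lines.
$Stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
[int]$LineNumber = 0
foreach ($Line in (Get-Content $SamplePath))
{
	$LineNumber++
}
$Stopwatch.Stop()

# Report the line count and the elapsed time in seconds.
"{0} lines in {1} seconds" -f $LineNumber, $Stopwatch.Elapsed.TotalSeconds

# Clean up the sample file.
Remove-Item $SamplePath
```

Point $SamplePath at the real log (and drop the WriteAllLines and Remove-Item lines) to time an actual file. Measure-Command { ... } is another option if only the elapsed time is needed.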

But what about memory usage?

[Screenshot: memory usage while the Get-Content script runs]

Because the script assigns the output of Get-Content to $Lines, the whole file is held in memory, so high memory usage is expected.
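
If Get-Content has to stay, it does not need to hold every line at once. Here is a minimal sketch, assuming the lines can be processed in batches: the -ReadCount parameter sends arrays of lines down the pipeline instead of collecting the whole file in a variable. The temp file and the batch size of 100 are illustrative assumptions, and this variant was not timed in the test above.

```powershell
# Hypothetical stand-in for the large IIS log: a temp file with 1000 lines.
$SamplePath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($SamplePath, [string[]](1..1000 | ForEach-Object { "line $_" }))

[int]$LineNumber = 0

# Stream the file through the pipeline in batches of 100 lines.
Get-Content $SamplePath -ReadCount 100 | ForEach-Object {
	# $_ is an array holding up to 100 lines.
	foreach ($Line in $_)
	{
		$LineNumber++
	}
}

$LineNumber

# Clean up the sample file.
Remove-Item $SamplePath
```

Only one batch is alive at a time, so memory stays roughly proportional to the batch size rather than to the file size.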

b. Using .NET method: [io.file]::ReadAllLines

$LogFilePath = "C:\large.log"
$Lines = [io.file]::ReadAllLines($LogFilePath)
[int]$LineNumber = 0;
 
# Read Lines
foreach ($Line in $Lines)
{
	$LineNumber++
}

With this option, the script takes 2.0082615 seconds to read and loop over 424390 lines, which is roughly 6.7 times faster than Get-Content.

Memory usage is lower than with Get-Content but still high. I couldn't capture it in a screenshot, but CPU usage peaks at 13%.

[Screenshot: memory usage while the [io.file]::ReadAllLines script runs]
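
ReadAllLines builds the full array of lines before the loop starts, which is where the memory goes. A related method, [System.IO.File]::ReadLines (available since .NET 4), returns a lazy enumerable instead. The sketch below assumes that swap and was not part of the measurements above. $SamplePath is a hypothetical temp file standing in for the log.

```powershell
# Hypothetical stand-in for the large IIS log: a temp file with 1000 lines.
$SamplePath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($SamplePath, [string[]](1..1000 | ForEach-Object { "line $_" }))

[int]$LineNumber = 0

# ReadLines yields one line at a time instead of loading the whole file.
foreach ($Line in [System.IO.File]::ReadLines($SamplePath))
{
	$LineNumber++
}

$LineNumber

# Clean up the sample file.
Remove-Item $SamplePath
```

One thing to watch with any [System.IO.File] call: relative paths are resolved against the process working directory, not the current PowerShell location, so a full path is the safer choice.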

c. Using .NET method: System.IO.StreamReader

$LogFilePath = "C:\large.log"
$FileStream = New-Object -TypeName IO.FileStream -ArgumentList ($LogFilePath), ([System.IO.FileMode]::Open), ([System.IO.FileAccess]::Read), ([System.IO.FileShare]::ReadWrite);
$ReadLogFile = New-Object -TypeName System.IO.StreamReader -ArgumentList ($FileStream, [System.Text.Encoding]::ASCII, $true);
 
[int]$LineNumber = 0;
 
# Read Lines
while (!$ReadLogFile.EndOfStream)
{
	$LogContent = $ReadLogFile.ReadLine()
	$LineNumber++
}
 
$ReadLogFile.Close()

With this option, the script takes 1.7062244 seconds to read and loop over 424390 lines. This appears to be the fastest method.

Memory usage is also very low because the file is read line by line, so PowerShell never holds the whole file in memory.

[Screenshot: memory usage while the StreamReader script runs]

CPU usage is still high in this case, though. It is probably maxing out one of the server's cores while the script runs, but that's something I can't help :)
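
If the loop body throws, the script above never reaches Close() and the file handle stays open. Here is a hedged variant of the same StreamReader loop, wrapped in try/finally so the reader is always released. It assumes the simpler StreamReader(path) constructor rather than the FileStream with FileShare ReadWrite used above, so it is a sketch for files that are not being written to at the same time. $SamplePath is a hypothetical temp file standing in for the log.

```powershell
# Hypothetical stand-in for the large IIS log: a temp file with 1000 lines.
$SamplePath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllLines($SamplePath, [string[]](1..1000 | ForEach-Object { "line $_" }))

[int]$LineNumber = 0
$Reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $SamplePath

try
{
	# ReadLine() returns $null once the end of the file is reached.
	while ($null -ne ($Line = $Reader.ReadLine()))
	{
		$LineNumber++
	}
}
finally
{
	# Always release the file handle, even if the loop body throws.
	$Reader.Dispose()
}

$LineNumber

# Clean up the sample file.
Remove-Item $SamplePath
```

Comparing against $null rather than testing EndOfStream is a style choice. Both stop at the end of the file, and empty lines come back as empty strings, so they are still counted.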

Winner: System.IO.StreamReader

In the next part, I'll show you text manipulation tips. See you.


Comments (4)

Duane

January 29th, 2016
02:26:16

Dude, you are not kidding, System.IO.StreamReader was crazy fast! Thank you for this!


Mauricio

January 1st, 2018
01:30:23

Excellent post.
Thank you so much!!!


Faisal

January 15th, 2020
16:50:45

I’m a newbie to PowerShell and programming in general. Can someone guide me on how I can convert a very large XML file with UTF8 encoding using PowerShell?


Mark

February 27th, 2020
20:45:40

Try $count = 0; switch -File $filepath { default { ++$count } } … it is faster in my tests. :)


