Saturday, March 10, 2012

Greening the IT landscape with an eight year desktop replacement plan

Soon after I was hired into my current position I was told that we had a five year desktop replacement schedule.  In practice it was a verbal agreement, made at some point in the past between parties who were no longer involved in the process.  While attempting to maintain that replacement schedule I was asked to justify it in order to obtain funding.  In the ensuing conversations it became clear that what the business needed was computers that function reliably and allow employees to be productive, not computers less than five years old.  Once I realized that I was not going to be able to maintain a five year replacement schedule, I set out to determine what replacement schedule would meet the business requirements while maintaining the highest possible quality of service.  I had been presented with what a friend of mine would call an "enabling constraint".  The denial of funding served as a catalyst to reevaluate the existing assumptions about hardware refresh cycles, in my own mind as well as within the organization and the larger IT community.  Ultimately I created a plan for an eight year PC lifecycle that I can manage and that the business can fund, with the added benefit of being more environmentally sustainable than a shorter hardware refresh cycle.

I used the concepts of systems administration philosophy as a basis for creating a plan for an eight year replacement schedule and I will also need to follow them to successfully execute it.  This is how they are presented on the Red Hat website:  "document everything", "know your resources", "know your users", "know your business", "plan ahead" and "expect the unexpected".

Document Everything:

One of the tenets of Systems Administration philosophy is "document everything".  This applies as much to building a business case for making a purchase as it does for configuration change management.  Creating a document or actually a handful of documents that outlines your long term hardware replacement plan is essential to being able to follow through and stick to that schedule.  In addition to being necessary for managing short term processes, documenting everything is essential to creating long-term IT strategy in an organization.  The process of documentation should not only record the current process, it should also serve as a catalyst for thinking critically about the process being documented. 

Know your resources:

I have an inventory of PCs that is kept current by a centralized monitoring and management tool, so I know what hardware I have and can keep track of when it needs to be replaced or upgraded.  The limiting constraint will be knowing what financial resources I have at my disposal during a given time period and allocating them appropriately to meet the goals that are documented in the eight year plan.  What to upgrade or replace, and when, will be determined by knowing my users and knowing my business.

Know your users and know your business:

In our business, as in many others, computers are used for a variety of functions by a variety of people.  Some functions require more computing resources than others and people use their computers in different ways.  People who are content producers, like those in the marketing department who use their computers to edit graphics and photos, will require a computer with more resources than somebody who is primarily a content consumer, reading documents and using web-mail.  Scientists who are analyzing large data sets will require more resources than somebody doing word processing.  In our business I know that we have users and applications that fall everywhere along the spectrum.  Over time I can rotate new equipment into the positions where the applications require more resources, while the older equipment can be used where fewer resources are required.

Plan ahead:

Once the applications and required resources are documented, a replacement schedule can be created.  With the replacement schedule in hand a budget can be made, resource acquisition can be planned, and management buy-in can be sought.  Planning ahead will also allow for communication with the departments or individuals whose machines will be replaced in a given time frame.  Planning and scheduling the upgrades well in advance will allow for the changes to be managed efficiently and for requirements and expectations to be managed effectively.  Planning changes well in advance makes the transitions easier for everyone involved, from management and finance to IT staff and end-users.

Expect the Unexpected:

There will be hardware failures; this is just a fact of life.  All systems tend toward entropy, and computers, no matter how well designed and well built, are not exempt from the laws of physics.  Hardware failures will happen, and when they do, spare machines will need to be available so that the business impact of the failure is minimized.  Spare parts will also need to be kept on hand for some of the minor failures where a full computer replacement would be wasteful.

Additional considerations:

Another thing to consider when planning to keep hardware longer is maintenance.  Rather than simply moving the PC to a new user as-is at the 4 year mark, the computer can be given a tune up before being given to the next person with no down-time to end users.  The computer can be re-imaged to a clean basic configuration and additional RAM can be added at a relatively low cost if necessary.  With the right free and open source tools in place this can be accomplished with little additional expenditure of both financial and time resources.

Greening the IT landscape:

In addition to providing the business with a desktop replacement plan that meets the financial goals of the organization by reducing annual desktop replacement cost by 27%, this plan also contributes to resource conservation in a more global way.  By increasing the working life of each piece of equipment, fewer natural resources will need to be consumed over time to perform computing functions for the business.  Many hardware vendors have "green computing" marketing campaigns that advertise the energy efficiency of their new hardware, but not a single one of them advocates buying fewer computers and keeping them for longer periods of time as a way to decrease the environmental impact of IT infrastructure.  The total number of computers consumed in a given period can be reduced by 36% by increasing the planned hardware refresh interval from 5 years to 8 years.  This change is not only good for the business' bottom line, but for the environment as well.
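
As a rough check on the effect of the longer interval, the steady-state arithmetic can be sketched in a few lines of PowerShell.  The fleet size below is a made-up number for illustration, and the calculation assumes that purchases are spread evenly across the cycle, which is a simplification and not the actual plan.  Under those assumptions the reduction works out to 37.5 percent, in the same neighborhood as the figure quoted above, which presumably reflects the details of the real schedule.

```powershell
# Hypothetical fleet size, for illustration only
$fleetSize = 200

# Average purchases per year if replacements are spread evenly over the cycle
$perYearFiveYearCycle  = $fleetSize / 5
$perYearEightYearCycle = $fleetSize / 8

# Relative reduction in machines purchased per year
$reduction = 1 - ($perYearEightYearCycle / $perYearFiveYearCycle)
"{0:P1} fewer machines purchased per year" -f $reduction
```

The fleet size cancels out of the ratio, so only the two intervals matter to the result.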

Saturday, March 3, 2012

PowerShell scripting for automation and documentation

In a previous post I shared a PowerShell script that automated finding folder sizes.  That script was meant to be reused as-is by having parameters passed to it when it is called from a PowerShell window.  The script that I'm posting today is different.  This one is meant not only to automate a process but also to document what I did at a particular time.  I cover two rules of systems administration in one script: "automate everything" and "document everything".  I save a copy of these one-time scripts to use as a template when I need to perform the same operation under different conditions.  They also serve as a record of how I did something in a particular instance because I save them as part of my change management documentation.  Here's a sanitized version of the script that I have saved as my template.

# PowerShell Script
# Name:    robocopyFilesToArchive.ps1
# Purpose: use robocopy to move files from one drive to another
#          and keep a log of the move
# Author:  Matthew Sanaker,
# Date:    3/2/2012
# USAGE:   this is a one-time script written to automate and document this process
#          to reuse it save it with a new name and change variables as necessary

$date = Get-Date;
$year = $date.Year;
$month = "{0:D2}" -f $date.Month;
$day = "{0:D2}" -f $date.Day;
$timestamp = $year.ToString() + $month.ToString() + $day.ToString();

$source = "D:\Data";
$destination = "E:\Archives\Data";
$logFolder = "E:\robocopyLogs";
$logFile = $logFolder + "\" + "robocopyData_$timestamp.log";

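# create the log folder if it does not exist yet
if (!(Test-Path $logFolder))
{
    New-Item $logFolder -type directory;
}

# create the destination folder if it does not exist yet
if (!(Test-Path $destination))
{
    New-Item $destination -type directory;
}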

robocopy $source $destination /E /ZB /COPY:DATOU /MOVE /R:2 /W:1 /LOG+:$logFile /NP /TEE;

What the script does:

The first block of code retrieves the current date from the system, takes it apart and reassembles it into a time stamp string variable that resembles a conventional BIND DNS serial number.  I like to use these serial numbers at the end of configuration files, log files and other documents to keep them simultaneously sorted by name and date when displayed in the filesystem.  First I use Get-Date to build my $date object.  I then get the year, month and day by turning those properties from my $date object into new objects.  I use .Net number formatting when creating the month and day objects so that I always have two digit months and days.  ' "{0:D2}" -f ' tells PowerShell to take object 0 and display it as an integer with at least 2 digits, adding leading zeros if necessary. (follow footnote 1 for more options and information)
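
The format operator is easy to try on its own.  The short sketch below uses a fixed example date rather than the current one so that the result is predictable.  It also shows that Get-Date can produce the same string in one step with its -Format parameter, which is an alternative to the approach above and not what the script uses.

```powershell
# A fixed example date so that the output is predictable
$date = Get-Date -Year 2012 -Month 3 -Day 2

# "{0:D2}" pads an integer with leading zeros to at least two digits
$month = "{0:D2}" -f $date.Month
$day   = "{0:D2}" -f $date.Day
$timestamp = $date.Year.ToString() + $month + $day

# Alternative: let Get-Date build the same string directly
$sameTimestamp = Get-Date -Date $date -Format "yyyyMMdd"

$timestamp        # 20120302
$sameTimestamp    # 20120302
```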

In the next block I create the variables that I pass to robocopy to move my data.  Those are self-explanatory.  The next two blocks test the paths for both my log folder and the destination of my data and create them if they don't exist.  Robocopy will not create the folder that you want your log written to, so it has to exist before you run robocopy, otherwise robocopy will throw an error and exit; I create the destination up front in the same way.  In PowerShell, the "!" operator negates a statement.  Test-Path returns a boolean: if a path exists it returns true, otherwise it returns false.  So the statement "if (!(Test-Path $destination))" says "if the folder $destination doesn't exist".  The if statement evaluates the condition in parentheses, and if it is true PowerShell executes the code between the curly braces.  If the condition is false, PowerShell skips the code in the curly braces and moves on.  The statement in the curly braces in each block creates the folder in question.
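
The test-and-create pattern can be tried safely against a scratch location.  In the sketch below the folder name is hypothetical and lives under the temp directory, so nothing important is touched.

```powershell
# Hypothetical scratch folder under the temp directory, for illustration only
$scratch = Join-Path ([System.IO.Path]::GetTempPath()) "robocopyLogsDemo"

# Create the folder only if it is not already there
if (!(Test-Path $scratch))
{
    New-Item $scratch -type directory | Out-Null;
}

# True at this point, whether or not the folder existed beforehand
$exists = Test-Path $scratch
$exists
```

Piping New-Item to Out-Null only hides the directory listing that New-Item would otherwise print; it does not change what gets created.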

Once everything is set, robocopy is executed.  Robocopy has a lot of options, so I will explain the ones that I have chosen.  $source and $destination are obvious.  "/E" says include empty subfolders.  "/ZB" helps recover from errors where a copy gets interrupted or access is denied.  /COPY:DATOU says copy the data, file attributes, timestamp, owner and audit information.  I chose not to copy permissions because I was moving this data to a read-only archive.  /MOVE says copy both files and folders and delete them from the source when finished.  /R and /W are retry options; I specified retry twice and wait 1 second between retries.  Since I wanted to log robocopy's success or failure, I chose /LOG+ to append output to my specified log file, /NP (no progress) so that the file is actually readable and /TEE so that I can also see the output in the shell while it's happening.

Of course robocopy could have been called from a single line right in PowerShell or a cmd window.  The log file alone could serve as good documentation and I always open them up and search for the word "error", but having both files makes the documentation more complete.
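
Searching the log for errors can be scripted as well.  This is a sketch using Select-String, not part of the original script.  It writes a small stand-in log file to the temp directory, with invented contents, so that the example is self-contained; in practice $logFile would point at the real robocopy log.

```powershell
# Stand-in log with invented contents, for illustration only
$logFile = Join-Path ([System.IO.Path]::GetTempPath()) "robocopyData_demo.log"
@(
    "New File   1024   report.doc",
    "ERROR 5 (0x00000005) Copying File D:\Data\locked.doc",
    "New File   2048   notes.txt"
) | Set-Content -Path $logFile

# Find every line that contains the word "error" (matching is case-insensitive by default)
$errors = @(Select-String -Path $logFile -Pattern "error")

"{0} line(s) mention an error" -f $errors.Count
$errors | ForEach-Object { "line {0}: {1}" -f $_.LineNumber, $_.Line }
```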


Thursday, March 1, 2012

Getting folder sizes with PowerShell

Today was a good day.  I had an excuse to do some scripting.  One of the Systems Administration rules to live by is "Automate Everything".  Code is reusable; time spent clicking buttons in a GUI to get information is just that, time spent.  Time spent writing a script that gets the information for you in a repeatable way is time invested.  It may seem like the same amount of time the first time around, but it will pay dividends the next time, when you don't have to spend time at the GUI clicking, not to mention that you can easily capture the information that you are looking for and analyze it further.  Another nice thing about scripting is that you can schedule the collection of information and have it delivered to you.

PowerShell is an interesting environment.  It reminds me somewhat of a Linux shell and it can be scripted like Bash and Perl.  That's what the PowerShell developers were going for, I know, but it does make it much more likeable and useable than batch files and vbScript.  The other thing that I like about PowerShell is that it can easily give me access to .Net namespaces and their properties and methods.  I'm no C# programmer, but I've played with it a bit and it seems that PowerShell is the scripting version of C#.

So my task for today was to get the sizes of a bunch of folders on one of my file servers.  I wanted to know which ones would give up the most space if they were moved to another virtual-disk on that virtual machine.  I had one hard disk that was getting full and I would rather create another disk and move folders to an "archive" location than grow the disk or add another disk under a mount point.  Luckily for me the folders I am dealing with are already sorted by year, so it's just a matter of going back far enough to get the space I want without moving newer files that would inconvenience my users.

So I put together a little script, tested it and when I was satisfied that it would do what I wanted I set it running and went to lunch.  When I got back I had the data that I wanted.  The primary function in the script was something that I came across a couple of years ago and tucked away in a code snippet file.  I honestly can't remember where I found it otherwise I'd give credit where credit is due.  Here is the script:

# PowerShell Script
# Name:    CalculateFolderSize.ps1
# Purpose: calculate the size of a folder and its subfolders
#          and return data in csv format
# Author:  Matthew Sanaker,
# Date:    3/1/2012
#    USAGE:  CalculateFolderSize.ps1 calculates the size of a folder and its subfolders
#            the root folder and output file are passed as command-line arguments
#            data is returned in two comma delimited fields: folder name, size in GB
#            the output is one row per subfolder 


param (
    [parameter(Mandatory = $true)][System.IO.DirectoryInfo]$folder,
    [parameter(Mandatory = $true)][string]$outFile,
    [parameter(Mandatory = $false)][switch]$help
)

$showHelp = " `
    USAGE:  CalculateFolderSize.ps1 calculates the size of a folder and its subfolders
            the root folder and output file are passed as command-line arguments
            data is returned in two comma delimited fields:  folder name, size in GB
            the output is one row per subfolder
            example:  ./CalculateFolderSize.ps1 -folder C:\Data -outFile C:\dataSize.txt"

# show the help message and stop if -help was given
if ($help)
{
    $showHelp;
    exit;
}

# recursively add up the length of every file under $dir
function Get-DirSize {
    param ([System.IO.DirectoryInfo] $dir)
    [decimal] $size = 0;

    $files = $dir.GetFiles();
    foreach ($file in $files)
    {
        $size += $file.Length;
    }

    $dirs = $dir.GetDirectories();
    foreach ($d in $dirs)
    {
        $size += Get-DirSize($d);
    }
    return $size;
}

try
{
    $subDirectories = $folder.GetDirectories();

    # write one "folder name,size in GB" row per subfolder
    foreach ($dir in $subDirectories)
    {
        $size = Get-DirSize $dir;
        $GB = $size / 1GB;
        $foldername = $dir.FullName;
        $foldername + "," + $GB | Out-File -FilePath $outFile -Append -NoClobber;
    }
}
catch
{
    # $_ is the error that was caught; display it, show the help message and exit
    "Something does not compute, please check your input";
    "PowerShell Error Message: " + $_.Exception.Message;
    $showHelp;
    exit;
}

I'll now go over this a bit to explain what I did and why, so that if it doesn't suit your particular needs you will have a good idea of where to start taking it apart and changing it.

I always start my scripts out from a template with a header and I like to set parameters from the command line instead of hard-coding things.  Setting "[parameter(Mandatory = $true)]" will cause PowerShell to prompt the user for parameters if they are not given when the script is called.  The first parameter, "[system.IO.DirectoryInfo]$folder", is the parent folder that you want to start searching from.  I used the .Net class as the object type because it seemed more straightforward than getting input as a string and converting it later to the object type that I want to work with.  The second parameter is "[string]$outFile", which I will use as the name of the output file.  The next bit is "[switch]$help", which I like to use to pass a friendly help message.
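
To see what the typed parameter does, the same declaration can be tried in a small function.  Show-FolderInfo is a hypothetical name used only for this sketch; the point is that a plain string passed on the command line arrives inside the function as a DirectoryInfo object.

```powershell
# Hypothetical function, used only to illustrate the typed parameter
function Show-FolderInfo {
    param (
        [parameter(Mandatory = $true)][System.IO.DirectoryInfo]$folder
    )
    # $folder is already a DirectoryInfo object, so its properties are available
    "{0} is a {1}" -f $folder.FullName, $folder.GetType().Name;
}

# The string is converted to a DirectoryInfo object when it is bound to the parameter
$result = Show-FolderInfo -folder ([System.IO.Path]::GetTempPath())
$result
```

One caveat: .Net resolves relative paths against the working directory of the process, which is not always the same as PowerShell's current location, so passing a full path is the safer habit.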

The function that does all of the work is "Get-DirSize", which uses the "system.IO.DirectoryInfo" .Net class to work with the filesystem.  For some reason the "Get-ChildItem" cmdlet doesn't give you folder sizes in an intuitive way, so using the .Net class is actually more direct.  First each folder is entered and all of the file lengths are added up, then each subfolder is entered and the function is run recursively, adding the file lengths to the overall size that is kept in the variable "$size", which is finally returned as the value of the function.
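
For comparison, the same total can be reached with cmdlets alone by listing every file under a folder and summing the Length property.  This is an alternative sketch, not the approach used in the script, and the scratch folder and file sizes are made up so that the example can run anywhere.

```powershell
# Build a small scratch tree with known file sizes (hypothetical names)
$root = Join-Path ([System.IO.Path]::GetTempPath()) "dirSizeDemo"
$sub  = Join-Path $root "sub"
New-Item $sub -type directory -Force | Out-Null
[System.IO.File]::WriteAllText((Join-Path $root "a.txt"), ("a" * 1000))
[System.IO.File]::WriteAllText((Join-Path $sub "b.txt"), ("b" * 500))

# List everything recursively, keep only files, and add up their lengths
$total = (Get-ChildItem -Path $root -Recurse |
    Where-Object { -not $_.PSIsContainer } |
    Measure-Object -Property Length -Sum).Sum

$total    # 1500 bytes for this scratch tree
```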

First I call the "system.IO.DirectoryInfo.GetDirectories()" method on our "$folder" parameter to get the list of subfolders, which I keep in the variable "$subDirectories".  I then run a "foreach" loop over the array of sub-directory objects, which passes each folder through the Get-DirSize function to return its size.  Before going on to the next sub-directory object I convert the size, which is returned in bytes, to something more useful, which for me today was gigabytes, by saying "$GB = $size / 1GB".  Next I pull the name of the folder including the path using the "system.IO.DirectoryInfo.FullName" property of the sub-directory object.  Finally I concatenate the folder name, a comma and the folder size and pipe them out to "Out-File", passing it the "$outFile" parameter that I specified on the command line along with the "-Append" and "-NoClobber" switches, so that when my script is done I have a nice little csv file.

The last thing that I want to point out is the error handling.  It's just a simple "try" and "catch" block, which could give you the opportunity to attempt to do something other than fail with an error.  In this case I display the error, show the help message and exit the script.  For a small script like this it's probably overkill, but I keep it in my template as a reminder for when I want to do something more complicated.

Now I can open that file in Excel to run auto-sum on all of the folder sizes, strip off meaningless decimal places or otherwise manipulate my data to get the answer that I'm looking for.
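
The summing step can also stay in PowerShell.  The sketch below is an alternative to opening the file in Excel.  Because the script writes no header row, Import-Csv is given column names, and those names (Folder, SizeGB) as well as the sample rows are assumptions made for the example.

```powershell
# Stand-in output file with invented rows, in the same two-field layout
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) "dataSizeDemo.txt"
@(
    "D:\Data\2008,12.5",
    "D:\Data\2009,20.25",
    "D:\Data\2010,7.25"
) | Set-Content -Path $outFile

# The file has no header row, so supply column names when importing
$rows = Import-Csv -Path $outFile -Header "Folder", "SizeGB"

# Fields are imported as strings; convert to numbers before adding them up
$totalGB = ($rows | ForEach-Object { [decimal]$_.SizeGB } | Measure-Object -Sum).Sum

# Largest folders first, then the total
$rows | Sort-Object { [decimal]$_.SizeGB } -Descending | ForEach-Object { $_.Folder + "," + $_.SizeGB }
"Total: $totalGB GB"
```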