Movin’ on Up

The time is right to move on to a better blog. I picked up a sweet domain name and have copied all my previous posts from here over to it. The format is a little bland at the moment, but I’ll work on it over time. I just wanted somewhere free and simple to post my random thoughts.

The new URL is http://www.coffeeandpizza.net.   If you’re reading this page, I hope you’ll try out the new one.

Sharing with the Community vs. Keeping a Job

Microsoft has done a whole lot to open source its tools for the IT professional community, and this is awesome. Some of the stuff I’m using pretty regularly includes DSC modules that are freely available on GitHub. I can actually modify the code and put in pull requests to get my changes merged into the actual main product.

As a Windows guy, this is totally new to me. When I find something that I don’t like, I can now change it to the way I want it and, if I think others would use it, I can put in a pull request (PR) to get it merged into the real deal. On the other hand, if I do something that’s just for my own use, I can do it without worrying about breaking the real stuff. It’s pretty neat and very useful. I’ve been able to take some things, like the Windows Firewall DSC module, and modify them to give me more power over the rules when I use them. I’ve even made my own DSC module for configuring Windows Update.

The question really now is, what can I share with the community and what do I need to keep to myself?  Obviously, if there is secret stuff, like passwords or private info, in the scripts, I need to keep that to myself.  But what if absolutely nothing in the code is specific to my employer?

I work for Trek.   I don’t do this kind of work on my own time, and even if I did, I still feel like it’s for Trek.   Every part of improving myself technically or improving my status in the community affects my employment, at least at a high level.   If I was to become the biggest contributor to the DSC modules “on my own time”, would that not benefit Trek?   What if I did something “on my own” and then put it in place at Trek?   Would that suddenly make the work “Trek’s time”?   If I was to build an awesome personal Git repository full of my own work, and never put in the PRs to merge them into the main code, would that be mine?  What if I was to leave Trek?   Would I be able to take all that work with me, thereby benefiting my next employer?

These questions keep me from posting really anything into my own GitHub account or doing PRs for what I do.   I just don’t see where risking my employment or some IP infringement problems is worth it.   I know there are some employers out there who totally encourage giving to the community.   That makes perfect sense, particularly if you’re a technology company running open source stuff all the time or a consulting company making money in open source development and support.

But I work for a bike company.   We have some high-tech stuff, but we aren’t a technology company.   If I’m to ever leave Trek, and my next employer wants me to do similar work to what I’ve already done, then that employer needs to pay for the time it would take me to regenerate it.  If that company is open to having me put everything I work on out in GitHub, I’ll jump on it to do it.   From what I can tell now, though, I can bring my skills and network along with me, but not my old work.

Career Tip: Learn PowerShell

I don’t remember if I’ve posted on this before or not, but it’s worth repeating anyway.   If you are a systems administrator or engineer who works in the Windows environment, you need to learn PowerShell now.   Even if you are primarily a Linux guy, if you work in Windows AT ALL, you need to learn PowerShell.

PowerShell v5 is going to be released basically any day now.   v5 has all kinds of nice features in it, particularly in regards to Package Management.   Windows Server 2016 has a “Nano Server” installation option, which basically is a headless server that can only be managed remotely via PowerShell or other tools.  Windows Server Core, which has been an installation option since Windows 2008, has PowerShell built-in, so you can do anything you want to on those machines from within the shell.

Microsoft is NOT moving away from this direction any time soon.   Jeffrey Snover, the inventor of PowerShell, has recently been promoted to be Distinguished Engineer, architecting everything on the Windows Server and System Center products.   As such, you can expect that PowerShell will become even more ingrained into everything, and all of these products are going to be even more manageable with it.   There are PowerShell modules for Exchange, Active Directory, SQL Server, SCCM, you name it, so a little foundational learning will carry you a long way.
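
To give you an idea of how far the basics carry, here’s a quick sketch using the Active Directory module. The cmdlets are real; the filter and the properties I picked are just for illustration.

----------------------------------------------
# One foundation, many products: once you know the verb-noun pattern and the
# pipeline, each product's module is just more of the same.
Import-Module ActiveDirectory            # ships with RSAT / the AD management tools

# Find disabled user accounts and sort them by name
Get-ADUser -Filter 'Enabled -eq $false' |
    Select-Object Name, SamAccountName |
    Sort-Object Name
----------------------------------------------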

Learn it now.   If you don’t, in 5 years, you’re going to be playing catchup.  In 10 years, you may just be looking for another career.   The GUI on Windows Server is slowly dying, and I don’t think Snover will have it any other way.   When you’re finished learning PowerShell, don’t forget that PowerShell Desired State Configuration (DSC) is available and right around the corner, too, and you’re going to have to skill up on it.   The days of doing server management via RDP or console sessions are over.
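
If you’ve never seen DSC, here’s a bare-bones sketch of what a configuration looks like. The configuration name and output path are placeholders I made up; the WindowsFeature resource is one of the built-in ones.

----------------------------------------------
# A minimal DSC configuration: declare the state you want, not the steps to get there.
Configuration WebServerBaseline
{
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node 'localhost'
    {
        WindowsFeature IIS
        {
            Ensure = 'Present'
            Name   = 'Web-Server'
        }
    }
}

# Compile the configuration to a MOF file, then push it to the local machine
WebServerBaseline -OutputPath 'C:\DSC\WebServerBaseline'
Start-DscConfiguration -Path 'C:\DSC\WebServerBaseline' -Wait -Verbose
----------------------------------------------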

Here are a couple of readings and videos to get you started. These are a little older now, but the concepts still apply.

  • PowerShell v3 Jump Start Class – Taught by Snover himself.   It’s a great start down this path.  I don’t see one for v5, but I’m sure there will be one eventually.
  • The Monad Manifesto – Years ago, Snover published this as a guide to show the direction he thought things should go.   Over time, PowerShell has become what he was calling “Monad” initially.   A Must-Read.
  • The PowerShell Team Blog – Great resource on what’s coming down the pipeline.   If you’re staying on top of this, you’re staying on top of Windows Server, too.


The Problems with IT Support

It goes without question that the biggest headache most knowledge workers deal with is the IT department. Getting basic stuff done, like setting permissions so you can access a file you need or getting your computer fixed, seems to take forever. I’ve worked in both large and small environments, and in almost every case, working with the IT staff has been a pain in the butt.

Why is this?   It’s because it all starts at the help desk.

The only number most companies give for getting IT support is the help desk phone number, so everyone knows it.   The folks on the IT end of the phone are either entry-level IT pros, old timers who haven’t skilled themselves out of the department, or, worst case, a call center person reading a script to simply fill in a form with your issue.

In only the most common cases (password resets and “reboot your computer”) does the first-level support person fix the issue without further research.  That is, Googling while you are either waiting on the phone or waiting for a callback.   If it’s beyond that, you can bet that the ticket is moving up to an administrator or engineer for further examination.

In the ITIL model, which basically all service desk groups follow, the only person allowed to talk to the end user is the service desk person.    That means that the engineer who works on your ticket isn’t usually allowed to contact you back.   You have to wait for the engineer to get back to the service desk person, who then contacts you.   It’s a back and forth routine that is inefficient and doesn’t work for anyone, other than keeping the engineer from having to actually speak to the end user.  Heaven forbid!

If all of that isn’t enough, you get larger companies having a “service catalog” and a “support form”, which are actually different things.   If you call the help desk to do something like, “I need to borrow a projector,” you get yelled at for not using the service catalog.   In other words, the ITIL process itself makes it such that the end user has a prescribed “right” way to contact IT.

What’s the solution?   Have all the engineers, administrators, and everyone else rotate through help desk duty one week a month.   This puts higher-end knowledge right there and keeps from passing the tickets around, plus forces everyone to remember that the end users really are the important thing.  And when the engineers go back to their “regular” work, they know what the hell has been going on and can look for long-term solutions for the problems.   I can tell you that engineers and admins aren’t digging through support ticket reports to find stuff to work on, as they have enough to do already.  If it directly affects them, though, they’ll definitely try to fix it.

How about, when end users call, instead of routing them to the “right” form, you just fill it out for them and get it done? It is not the responsibility of the end user community to contact you the way you want to be contacted. If you get contacted, respond and do your job. I understand that some forms of communication work better than others (I don’t like outages being posted on Yammer at 3am on a Sunday morning, either), and you cannot be responsible 24/7/365 for responding to everything, but the help desk number is already well-known. Use it.

During business hours, how about having all IT folks available for support?   This doesn’t mean that they are all going to fix the problems, but a warm body answering the phone is definitely better than a scripted call center employee.   At least the IT folks would have a chance of knowing how to route things appropriately, too.

Finally, and this is just a pet peeve of mine, when the user asks for help and doesn’t provide you all the troubleshooting information you think you need to come up with an answer, contact them to get it.   Don’t send them an email with 75 things that they need to provide…you need to call them and walk them through it, or even connect remotely and get the information yourself.   If they knew how to obtain all the troubleshooting information, chances are good that they could read it themselves and fix their own problem.   It is our job to get the information and get things moving.

Just my 2 cents.   Or 3.   Or however much.


DB Indexes and the Card Catalog

To start, I’m going to say “indexes” instead of “indices”, even though that drives me crazy.    I’ve read “indexes” in too many books now and I guess it’s right.

I’ve struggled for a while to really grasp how database indexes work. I’ve been searching for some guidance, and everything I find basically tells me to think about a phone book. That has never seemed to help me, so I decided to think about it like a card catalog instead. My intuition was right, at least according to the DBA I asked to confirm my thinking. Here’s my explanation, for what it’s worth.

If you are old enough, you can remember the card catalog at the library.   It was a big box, or group of boxes, in the room that had a few index cards (nice name) for each book on the shelves.  You could find cards based upon author, title, or subject, so if you wanted a book by Stephen King, you could look under the Ks, find the cards going with that, and then go to the locations and find the books.    You could do the same with title or subject.   The subject one always seemed iffy to me, because I didn’t know if the librarians actually read every book they added and came up with subjects themselves, or if the publisher told them the subjects, or whatever.   It was still better than starting at the front of the library and looking through everything myself.


This is, essentially, how a database index works. Imagine a bookcase of 1000 books, all the same size, built such that each shelf holds exactly 250 books. Also, suppose that the books are ordered alphabetically by title. You would have 4 full shelves, then, and maybe an empty one on the bottom to add books later.

In this case, if you were looking for a book that started with a G, you could scan through to the Gs and find your book.   Not too hard.   What if you bought a new book that began with an E?   To insert that into the right shelf, it would probably go on the top shelf, and you’d have to move every book to the right and below it, and you’d then have one book, probably beginning with Z, on the once-empty bottom shelf.   What if you get rid of a book beginning with R?    You could just pull it and leave a hole, but then, if you got a new book that didn’t begin with R, you’d have to keep that hole as you shift everything to make room for it.  What a pain.

To make it worse, what if you need to find a book by a specific author?   You would have to start at the top left, and go through the entire bookcase to find the books.   Then you’d have to find the one out of that set that you want to read.   Alternatively, if you only want exactly one item and you don’t care which one, you can just take the first one by that author, but you could feasibly end up scanning the whole bookcase anyway, if the one you want is at the bottom.

You may have a neatly-ordered bookcase, but adding and removing books has proven to be difficult, plus searching by any means other than by title is hard.  This is where an index can help.

First, you have to choose what to index on. Since my example is a search by author, let’s just use that. If you created an alphabetical list of the authors on your bookshelf, you could note next to each one which shelf and position their books are in. For instance, if you have one book by a guy named “James”, you would have an entry for “James” => “Shelf 3, Item 8”, or something like that.

If you got rid of that book by James, you could remove it from the shelf and then, either immediately or at a later time when you can, you would just delete the entry from your index list.   If you move the book to a new location for some reason, you can just update the index accordingly.  Easy-peasy.

A better way to organize your books, though, would be to assign each one a number when you buy it. Maybe you have books 1-1000 in no particular alphabetical order, sitting on the shelf in numerical order by this number. If you want to add a new one, you can just put it on the bottom OR fill in a hole if one has been removed from circulation. Then, you could have two indexes: one by title and one by author. In each case, you’d just have a sorted list and the location on the shelf.

This is the basic idea with databases and indexes.   Each table you create should have a primary key, which is really a “clustered index”, and that determines the order the records are stored.   In our example, that numbering 1-1000 is the primary key, and the shelf is ordered accordingly.

“Non-clustered indexes” are basically just like our two catalogs. They are sorted by some other column, like title or author, and have a pointer that tells the system where the exact record is stored. When you add a record to the table or delete one, you simply put it in (or take it out) wherever, and then immediately update the indexes to reflect the change. That’s it. It’s basically instantaneous, so the index is always up-to-date.
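
To make the analogy concrete, here’s a tiny PowerShell sketch of the same idea: the “shelf” is an array ordered by book number (the clustered index), and the author catalog is just a lookup table pointing back at shelf positions. The books and authors are obviously made up.

----------------------------------------------
# The "shelf": books stored in order of their number (the clustered index).
$shelf = @(
    [pscustomobject]@{ Number = 1; Title = 'It';          Author = 'King'    }
    [pscustomobject]@{ Number = 2; Title = 'Dune';        Author = 'Herbert' }
    [pscustomobject]@{ Number = 3; Title = 'The Shining'; Author = 'King'    }
)

# The "card catalog": a non-clustered index on Author, pointing at shelf positions.
$authorIndex = @{}
foreach ($book in $shelf) {
    $authorIndex[$book.Author] += @($book.Number)
}

# Without the index: scan the whole shelf.
$shelf | Where-Object { $_.Author -eq 'King' }

# With the index: jump straight to the right positions.
$authorIndex['King'] | ForEach-Object { $shelf[$_ - 1] }
----------------------------------------------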

Does this cause some issues?   Sure.  Updating stuff in the table means you have to update the indexes on the table, so it makes things a little slower.   Also, if you delete things from the table, you might leave holes in the indexes that need to be reorganized.   This is done normally by rebuilding the index during a maintenance window.

Now, I think it’s safe to say “I Get It”, at least at this high level.   Now maybe I can move on and not be so confused.


Jumping and Gifts

Here’s a video I’ve seen on Facebook a few times featuring Steve Harvey, talking about how you need to “jump” to make your life better.    This should be required viewing for everyone.   Ignore the religious parts of this if you want….that’s up to you and I don’t really know if it’s Biblical or not.   The core of this message is the most important message I can give my kids as they grow up.

Just to build on this, I hope to also impart the idea that when you find your gift, not only do you need to run with it, you need to nurture it. There’s an awesome book called Now, Discover Your Strengths, and a follow-up called Strengths Finder, that throw out a lot of the rules when it comes to personal development.

The main idea in these books is that it doesn’t make any sense for someone to take a listing of their weaknesses and work to make them better.   The better option is to consider your strengths and go out and nurture them.   This is the exact opposite of every corporate individual development plan thing I ever did, at least until I read and worked through these books.   Sure, there are things you have to do in your career that you suck at, and you have to become at least somewhat proficient in them, but, by and large, you should focus on what you’re good at.  Basically, most people like doing things they’re good at and hate doing things they aren’t, so not only could you make more money nurturing your gifts, but you’ll also be happier.

If you’re good at logic and doing stuff like programming, you should make a career out of it and try to be the best programmer in the world. If you enjoy working outdoors and building things, you should go become the best damn carpenter around. If you enjoy meeting people and talking things up, you should probably be in sales. If you’re a great public speaker, maybe teaching or the ministry is up your alley.

Too many technical people are put in a career path where they are expected to someday go into management.   Being in IT is pretty nice in that you can make a darn fine living either remaining technical or going into management, but there’s a fork in the road where you kinda need to decide.   I was always on the fence as to what to do, until I read these books.    Afterward, the Strengths Finder made clear to me that my place is as a technical resource.   I don’t like dealing with conflict, I don’t like holding people accountable for their work, and I don’t like being responsible for things that I cannot directly fix.  I DO like learning new things.   I DO enjoy solving problems.   IT is basically a perfect fit.

If I can get my kids to see that they need to jump at some point and that they need to nurture those gifts that they have, I think I’ll have done a pretty good job of parenting.   Even if they end up doing art and living in my basement….


Brain Changes

I’ve always heard that as you get older, you start to learn in different ways. I know that when I was a kid, I could pretty much pick up a book or listen to a lecture and “get it.” I didn’t need any hands-on kind of experience to learn how something worked. They say that adults learn by doing, but that never applied to me.

Now, I don’t think that I need to “do” to learn, but I’ve found that it’s harder to grasp things from just reading some documentation.   For instance, nowadays, if someone gives me a high-level overview of what some software does, I can then take the documentation and be off to the races.   This is exactly what happened this week with Chef Provisioning Services.

I’d seen the software on Github for a while and had no idea what it did.   I could read the main readme file and all that, but I didn’t get it.   The examples of usage were there, but that didn’t make sense to me.  Then, in a call with Steve Murawski, he said, “Have you tried Chef Provisioning Services?   It’s like recipes for deploying VMs in Azure.”

That is all the man had to say.   I know what recipes do, as I’ve worked with Chef on them for over a year.   I know how to deploy VMs in a variety of ways in Azure, so that’s no issue.   Just hearing “it’s like recipes for deploying VMs in Azure” is all it took for me to get it.

I went out to Github immediately and looked at the example.   Well, d’uh…that makes perfect sense now!   I took the example and within an hour I had the example deployment working.   Since then, I’ve changed the template to use a custom image that I have in Azure already, and I’m a stone’s throw away from being able to spin up as many VMs as I want to from the MS gallery Windows 2012 R2 image.

Why didn’t I see this before?   Maybe there’s a section of your brain that makes the clicks when things click, and it eventually moves slower.   Hell, I don’t know.   I just hope it keeps clicking.   🙂


More Fun with ARM Templates

All this time, all I’ve wanted to do with an Azure Resource Manager (ARM) template was create a bunch of identical computers.   The biggest roadblock I’ve faced is figuring out how to make the ARM template not one gigantic document with similarly-named resources in it.

Each VM requires a public IP address, a NIC, and a VM resource.   When I’ve put those into a template, if I named everything to make sense to me (like “Computer1” would have “Computer1-IP” and “Computer1-NIC” attached to it), the document gets really hard to follow.    I can copy and paste the definitions for these three objects, but eventually it gets ugly and confusing, and when something doesn’t work, it’s a pain to work around.

Enter the “copy” element and its “count” property! There’s a neat way to tell the ARM engine to iterate over a resource definition and create however many copies you want. The engine duplicates the guts of the definition “n” times, and the copyIndex() function lets you stick the iteration number onto the end of each item’s name.

Below is my ARM template that creates as many VMs as you want, each with a public IP address and a NIC.

———————

{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "userImageStorageAccountName": {
      "type": "string",
      "metadata": {
        "description": "This is the name of your storage account"
      }
    },
    "userImageStorageContainerName": {
      "type": "string",
      "metadata": {
        "description": "This is the name of the container in your storage account"
      }
    },
    "userImageVhdName": {
      "type": "string",
      "metadata": {
        "description": "This is the name of your customized VHD"
      }
    },
    "adminUserName": {
      "type": "string",
      "metadata": {
        "description": "UserName for the Virtual Machine"
      }
    },
    "adminPassword": {
      "type": "securestring",
      "metadata": {
        "description": "Password for the Virtual Machine"
      }
    },
    "osType": {
      "type": "string",
      "allowedValues": [
        "windows",
        "linux"
      ],
      "metadata": {
        "description": "This is the OS that your VM will be running"
      }
    },
    "vmSize": {
      "type": "string",
      "metadata": {
        "description": "This is the size of your VM"
      }
    },
    "vmNameBase": {
      "type": "string",
      "metadata": {
        "description": "This is the base name used for each VM and its related resources"
      }
    },
    "vmCount": {
      "type": "int",
      "defaultValue": 1,
      "metadata": {
        "description": "The number of VMs to build."
      }
    }
  },
  "variables": {
    "location": "[resourceGroup().location]",
    "virtualNetworkName": "VNetName",
    "addressPrefix": "10.6.0.0/16",
    "subnet1Name": "default",
    "subnet1Prefix": "10.6.0.0/24",
    "publicIPAddressType": "Dynamic",
    "vnetID": "[resourceId('Microsoft.Network/virtualNetworks',variables('virtualNetworkName'))]",
    "subnet1Ref": "[concat(variables('vnetID'),'/subnets/',variables('subnet1Name'))]",
    "userImageName": "[concat('http://',parameters('userImageStorageAccountName'),'.blob.core.windows.net/',parameters('userImageStorageContainerName'),'/',parameters('userImageVhdName'))]"
  },
  "resources": [
    {
      "apiVersion": "2015-05-01-preview",
      "type": "Microsoft.Network/publicIPAddresses",
      "name": "[concat(parameters('vmNameBase'),copyIndex(),'-PublicIP')]",
      "copy": {
        "name": "publicIPAddressCopy",
        "count": "[parameters('vmCount')]"
      },
      "location": "[variables('location')]",
      "properties": {
        "publicIPAllocationMethod": "[variables('publicIPAddressType')]",
        "dnsSettings": {
          "domainNameLabel": "[concat(parameters('vmNameBase'),copyIndex())]"
        }
      }
    },
    {
      "apiVersion": "2015-05-01-preview",
      "type": "Microsoft.Network/networkInterfaces",
      "name": "[concat(parameters('vmNameBase'),copyIndex(),'-NIC')]",
      "location": "[variables('location')]",
      "copy": {
        "name": "networkInterfacesCopy",
        "count": "[parameters('vmCount')]"
      },
      "dependsOn": [
        "[concat('Microsoft.Network/publicIPAddresses/', parameters('vmNameBase'),copyIndex(),'-PublicIP')]"
      ],
      "properties": {
        "ipConfigurations": [
          {
            "name": "ipconfig1",
            "properties": {
              "privateIPAllocationMethod": "Dynamic",
              "publicIPAddress": {
                "id": "[resourceId('Microsoft.Network/publicIPAddresses/',concat(parameters('vmNameBase'),copyIndex(),'-PublicIP'))]"
              },
              "subnet": {
                "id": "[variables('subnet1Ref')]"
              }
            }
          }
        ]
      }
    },
    {
      "apiVersion": "2015-06-15",
      "type": "Microsoft.Compute/virtualMachines",
      "name": "[concat(parameters('vmNameBase'),copyIndex())]",
      "copy": {
        "name": "vmCopy",
        "count": "[parameters('vmCount')]"
      },
      "location": "[variables('location')]",
      "dependsOn": [
        "[concat('Microsoft.Network/networkInterfaces/',parameters('vmNameBase'),copyIndex(),'-NIC')]",
        "[concat('Microsoft.Network/publicIPAddresses/', parameters('vmNameBase'),copyIndex(),'-PublicIP')]"
      ],
      "properties": {
        "hardwareProfile": {
          "vmSize": "[parameters('vmSize')]"
        },
        "osProfile": {
          "computername": "[concat(parameters('vmNameBase'),copyIndex())]",
          "adminUsername": "[parameters('adminUserName')]",
          "adminPassword": "[parameters('adminPassword')]"
        },
        "storageProfile": {
          "osDisk": {
            "name": "[concat(parameters('vmNameBase'),copyIndex(),'-osDisk')]",
            "osType": "[parameters('osType')]",
            "caching": "ReadWrite",
            "createOption": "FromImage",
            "image": {
              "uri": "[variables('userImageName')]"
            },
            "vhd": {
              "uri": "[concat('http://',parameters('userImageStorageAccountName'),'.blob.core.windows.net/vhds/',parameters('vmNameBase'),copyIndex(),'-osDisk.vhd')]"
            }
          }
        },
        "networkProfile": {
          "name": "[concat(parameters('vmNameBase'),copyIndex(),'-networkProfile')]",
          "networkInterfaces": [
            {
              "id": "[resourceId('Microsoft.Network/networkInterfaces/',concat(parameters('vmNameBase'),copyIndex(),'-NIC'))]"
            }
          ]
        },
        "diagnosticsProfile": {
          "bootDiagnostics": {
            "enabled": "true",
            "storageUri": "[concat('http://',parameters('userImageStorageAccountName'),'.blob.core.windows.net')]"
          }
        }
      }
    }
  ]
}

——————————————————-

The “copyIndex()” function is what tells the ARM engine to use the current value from the loop. Using “dependsOn” with it means that Computer1 will depend upon Computer1-NIC and Computer1-PublicIP, so nothing gets built until the pieces it needs are ready.

This template is set up to be used with a user image, something custom that I’ve put up there that’s already sysprepped and ready to go. I haven’t yet made it add the resources for the extensions or gotten it to join the domain after being built, but at least I have the VMs running now.
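
For reference, here’s roughly how a deployment of this template can be kicked off with the AzureRM PowerShell cmdlets. The resource group name, template file path, and parameter values below are placeholders, not my actual environment; the parameter names match the template above.

----------------------------------------------
# Placeholders: resource group, template path, and all parameter values are examples only.
$params = @{
    userImageStorageAccountName   = 'mystorageaccount'
    userImageStorageContainerName = 'images'
    userImageVhdName              = 'win2012r2-sysprepped.vhd'
    adminUserName                 = 'localadmin'
    adminPassword                 = (Read-Host -Prompt 'Admin password' -AsSecureString)
    osType                        = 'windows'
    vmSize                        = 'Standard_A2'
    vmNameBase                    = 'WEB'
    vmCount                       = 3
}

New-AzureRmResourceGroupDeployment -ResourceGroupName 'MyResourceGroup' `
    -TemplateFile '.\multi-vm.json' `
    -TemplateParameterObject $params `
    -Verbose
----------------------------------------------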

Network Security Groups

Two posts in a day?   Yes….because I’m afraid if I wait until Monday, I’ll forget this stuff.

I’ve been working with Azure Network Security Groups (NSGs) today. For the uninitiated, imagine firewall rules that you can apply to subnets within your Azure network or to specific VMs. NSGs can basically replace all the ACLs on your VM endpoints, let you more fully control network traffic between your subnets, and give you granular control over individual VMs. I envision replacing all of my endpoint ACLs and all of my Windows Firewall configuration with these, once I get them right.

What did I learn today?

  1. The moment you apply an NSG to a subnet, the endpoint ACLs on any VMs within the subnet are basically deemed useless. If you have an endpoint ACL that restricts connections, then as soon as you turn on the NSG without that restriction configured in it, your connection is wide open to the Internet.
  2. Each rule is made up of a Name, Direction, Action, Source Address, Source Port, Destination Address, Destination Port, and Protocol, just like every other firewall rule I’ve ever put in (see the sketch after this list). One tip here, though: on Inbound rules, if you want to open, say, port 443, put that on the Destination Port and leave Source Port at “Any” or “*”. I was putting the port in both parts, and inbound connections wouldn’t work at all that way.
  3. The new Azure portal is pretty nice in that you can open a subnet, see all the VMs in the subnet, and then use that to browse into the endpoints.    Much easier than in the old portal, which I don’t think could show you what VMs were on what subnets at all.
  4. This reiterated to me that you should always do staging first. I left one endpoint off one NSG and basically crippled SQL connections to a db server there. Do your homework first and be careful.
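
Here’s a sketch of what building one of these rules looks like with the Resource Manager (AzureRm) cmdlets. The NSG name, resource group, location, and priority are placeholders, and I’m reusing the subnet names from the ARM template earlier; if you’re still on the classic deployment model, the cmdlets are different.

----------------------------------------------
# Placeholders throughout: NSG name, resource group, location, priority, and prefixes.
# Allow HTTPS in from anywhere: the port goes on DestinationPortRange,
# and SourcePortRange stays at '*'.
$httpsRule = New-AzureRmNetworkSecurityRuleConfig -Name 'Allow-HTTPS-In' `
    -Direction Inbound -Access Allow -Protocol Tcp -Priority 100 `
    -SourceAddressPrefix '*' -SourcePortRange '*' `
    -DestinationAddressPrefix '*' -DestinationPortRange '443'

# Create the NSG with that rule, then attach it to a subnet.
$nsg = New-AzureRmNetworkSecurityGroup -ResourceGroupName 'MyResourceGroup' `
    -Location 'East US 2' -Name 'Web-Subnet-NSG' -SecurityRules $httpsRule

$vnet = Get-AzureRmVirtualNetwork -ResourceGroupName 'MyResourceGroup' -Name 'VNetName'
Set-AzureRmVirtualNetworkSubnetConfig -VirtualNetwork $vnet -Name 'default' `
    -AddressPrefix '10.6.0.0/24' -NetworkSecurityGroup $nsg
Set-AzureRmVirtualNetwork -VirtualNetwork $vnet
----------------------------------------------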

These worked as advertised and I think it’s going to be the right path for me.   I’m still trying to figure out how to do things on VM-level NSGs for stuff like “RPC Dynamic Ports” and ICMP, but at least I can do some network-level stuff now.    Pretty cool.

End goal will be to have the subnet NSG and some templated VM NSG in such a condition that I can just add a VM, assign it to the right subnet, give it a VM NSG, and never have to worry about Windows Firewall on the machine.  Hell, I could even disable Windows Firewall altogether if I’m lucky.   Good times.

Formatting Disks After Deployment

I don’t usually post code snippets or whatever, but I think it’s time for me to do so.   I have some sweet little ones here and there, and I might as well publish them this way so others can use them if they need them.

One of the first difficulties I had when deploying new VMs was formatting all the disks that I attached to the machine after it was deployed.   In Azure, the system disk always formats to be the C: drive, and there’s always a D: drive attached for “temporary storage”.    However, as I was automating my server builds, I didn’t really have a way to format additional data disks that I attached.

Enter the Custom Script Extension.   This extension allows you to push a PowerShell script into a VM and have it just run.   It’s pretty slick in that it runs locally, just as if you were logged into the machine, and it puts some nice logging under

“C:\WindowsAzure\Logs\Plugins\Microsoft.Compute.CustomScriptExtension”

so you can find out what happened during the execution.
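
For context, here’s roughly what pushing a script with the extension looks like from the deployment side. The resource group, VM name, and blob URL are made up; the script just needs to be somewhere the VM can download it from, which typically means staging it in blob storage first.

----------------------------------------------
# Placeholders: resource group, VM name, extension name, and the blob URL of the script.
Set-AzureRmVMCustomScriptExtension -ResourceGroupName 'MyResourceGroup' `
    -VMName 'WEB0' `
    -Location 'East US 2' `
    -Name 'FormatDataDisks' `
    -FileUri 'https://mystorageaccount.blob.core.windows.net/scripts/SetupDisks.ps1' `
    -Run 'SetupDisks.ps1'
----------------------------------------------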

In my deployment script, after the VM is built, I simply pass in the following script to format all the additional drives and assign them the next available drive letters.

----------------------------------------------
Function SetupDisks
{
    # Initialize and format any unpartitioned disks on the system and
    # assign the next available drive letters to them.
    $diskstoformat = Get-Disk | Where-Object { $_.NumberOfPartitions -eq 0 } | Sort-Object Number

    foreach ($disk in $diskstoformat)
    {
        $disknum = $disk.Number
        $label   = 'Data' + $disknum

        # Initialize the disk; ignore the error if it is already initialized
        $disk | Initialize-Disk -PartitionStyle MBR -ErrorAction SilentlyContinue

        # Create one partition using all available space, without assigning a drive letter yet
        New-Partition -DiskNumber $disknum -UseMaximumSize -AssignDriveLetter:$false
        $part = Get-Partition -DiskNumber $disknum -PartitionNumber 1

        if ($disknum -gt 1)
        {
            # SQL-style data disks: format with a 64 KB allocation unit size
            $part | Format-Volume -FileSystem NTFS -NewFileSystemLabel $label -AllocationUnitSize 65536 -Confirm:$false -Force
        }
        else
        {
            # Standard data disk: default NTFS allocation unit size
            $part | Format-Volume -FileSystem NTFS -NewFileSystemLabel $label -Confirm:$false -Force
        }

        # Give the new volume the next available drive letter
        $part | Add-PartitionAccessPath -AssignDriveLetter
    }
}

# The Custom Script Extension just runs this file top to bottom,
# so the function has to be called here for anything to happen.
SetupDisks
----------------------------------------------
Taking a look at this, here’s what’s happening:

  1. First, I pull all the disks that have no partitions.
  2. For each one, I grab its disk number, build a label from that number, and initialize the disk.
  3. Next I create a single maximum-sized partition on the disk.
  4. If I’m on the first disk, I just format it with plain NTFS….otherwise, I change the allocation unit size to 65536 to make better use of the disk. (This isn’t necessary generally, but I do it because only my SQL VMs have more than one data disk, and I use those to store my data, logs, and backups. My “standard” VM has a single attached data disk, which I don’t use the larger allocation unit size on.)
  5. Finally, I assign a drive letter.

This works every time.   What took me the longest was figuring out how to get the disks to initialize in a reliable way.   Once I got the initialization down, it was pretty straightforward.