February 5, 2012

Setup Guide for Multiple NetApp ONTAP 8.1 Simulators

I have written previously on deploying the ONTAP simulator on a vSphere host, and they seem to be some of my more popular posts.  Now that the ONTAP 8.1 simulator has been released, I thought I would do an updated post that is a little more comprehensive than my previous posts.

In my most recent post on this topic, I ran through the configuration I use to optimize usable space and get the necessary license keys installed.  I wanted to build on that for the 8.1 simulator, specifically around getting multiple simulators to work with OnCommand.  If you install multiple copies of the simulator and then try to add them into the NetApp Management Console (NMC), you will get an error similar to this:

Even though they are separate virtual machines with different IP addresses and hostnames, their simulator system IDs are identical, so the NMC thinks you are trying to add a duplicate host.  Luckily, it is possible to change the serial number and system ID so we can add multiple simulators and use things like Protection Manager and Provisioning Manager.  The easiest way to do this is to change these values before you run setup, so you don’t have to reassign the disks.

To begin, I still use VMware Converter to bring it into my vSphere environment.  There are other methods out there, but this one has worked well for me.  I won’t screenshot the entire process, as it’s mostly just taking the defaults and deciding what name you want to use, but here is the summary screen:

Summary screen for VMware Converter

I changed the disk type to thin provisioned instead of thick and changed the NICs to the relevant networks for my lab.  I also didn’t set the VM to power on automatically, because I wanted to make sure I could interrupt the very first boot to change the serial number.  When it boots, press a key other than Enter to break the boot, then run the following commands to make this simulator unique (screenshot below):

set bootarg.nvram.sysid=1111111101
set SYS_SERIAL_NUM=1111111101
boot

Interrupt boot process to change serial number

These commands were documented in this post on the NetApp Communities, and I’ve followed the same pattern: eight 1s with a unique two-digit string at the end that matches the hostname (e.g. STO-FAS1 is 01, STO-FAS2 is 02, etc.).
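The naming pattern is easy to script if you are building several simulators. The following is a minimal sketch, assuming hostnames that end in a one- or two-digit number; the function names are hypothetical helpers of my own, not part of any NetApp tool.

```python
import re

SYSID_PREFIX = "1" * 8  # eight 1s, following the pattern described above


def sysid_for_hostname(hostname):
    """Derive the 10-digit ID from the trailing number in a hostname."""
    match = re.search(r"(\d+)$", hostname)
    if match is None:
        raise ValueError("hostname %r does not end in a number" % hostname)
    suffix = int(match.group(1))
    if suffix > 99:
        raise ValueError("only two-digit suffixes fit this pattern")
    return "%s%02d" % (SYSID_PREFIX, suffix)


def loader_commands(hostname):
    """Return the three commands to type after interrupting the boot."""
    sysid = sysid_for_hostname(hostname)
    return [
        "set bootarg.nvram.sysid=%s" % sysid,
        "set SYS_SERIAL_NUM=%s" % sysid,
        "boot",
    ]
```

Calling loader_commands("STO-FAS2") gives the same three lines shown above with the ID ending in 02, which can be pasted into the console of the second simulator.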

After you hit Enter to boot the simulator, you need to wipe the configuration before setting it up.  Press Ctrl-C to bring up the boot menu, then choose option 4 to wipe the configuration:

Boot menu options

Once you run through the wizard, this is the config I apply (I usually use a text snippet to insert the following into my SSH session):

options security.passwd.rules.enable off
snap reserve -A aggr0 0
snap sched -A aggr0 0
options autosupport.enable off
aggr options aggr0 raidsize 28
disk assign all
license add DZDACHD
license add PZKEAZL
yes
license add NAZOMKC
license add ANLEAZL
license add BSLRLTG
license add NQBYFJJ
license add ELNRLTG
license add MTVVGAF
license add BQOEAZL
license add RKBAFSN
license add HNGEAZL
license add BCJEAZL
license add DFVXFJJ
license add XJQIVFK
license add DNDCBQH
license add JQAACHD
license add ZYICXLC
license add PVOIVFK
license add PDXMQMI
license add RQAYBFE
license add ZOFNMID
license add ZOPRKAM
license add RIQTKCL
ndmpd on
options nfs.export.auto-update off

I turn off the password rules since this is just my lab and I typically use a very easy password (yep, you probably already guessed it) for most of the lab stuff that can only be accessed from inside.

The next step is to add some more disks to the simulator.  This info was found on this thread over on the NetApp Communities.  Not all of the commands worked for me: I wasn’t able to successfully enter the commands in part 2, step 3.  The thread mentions that step works around a glitch in the way the program was compiled, so my guess is that the glitch has since been resolved and the step is no longer necessary, as I was still able to add disks without issue.

priv set advanced
useradmin diaguser unlock
useradmin diaguser password

Enter a password to use for the diaguser (again, it’s my lab so I use a relaxed password)

systemshell

Log in with the user diag and the password you just created:

setenv PATH "${PATH}:/sim/bin"
cd /sim/dev
sudo makedisks.main -n 14 -t 23 -a 2
sudo makedisks.main -n 14 -t 23 -a 3
exit
useradmin diaguser lock
priv set admin
reboot

After the system comes back up, log in and assign the new disks:

disk assign all

If you run into an error where you get a bad disk label, that can be easily fixed by:

aggr status -f

This output will give you the list of the failed disks.  Make note of the disk IDs; they should be similar to v6.32.

priv set advanced
disk unfail -s v6.32

At this point you should have a total of 56 disks on the simulator, with an aggregate RAID group size of 28.  I added 52 of the disks to my aggregate, leaving 1 as a spare so I don’t constantly get errors about a low spare count (you can disable the option that warns about low spares, but unfortunately that only works on systems with 16 disks or fewer).

aggr add aggr0 52
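The disk counts are worth a quick sanity check before running aggr add. This is a small sketch of the arithmetic only, assuming every disk that is neither being added nor kept as a spare already belongs to aggr0; the function name is a hypothetical helper.

```python
import math


def plan_aggregate(total_disks, disks_to_add, spares_to_keep, raid_group_size):
    """Work out what the aggregate looks like after the add."""
    # Assumption: whatever is left over is already in the aggregate.
    already_in_aggregate = total_disks - disks_to_add - spares_to_keep
    if already_in_aggregate < 0:
        raise ValueError("not enough disks for that add and spare count")
    disks_in_aggregate = already_in_aggregate + disks_to_add
    return {
        "already_in_aggregate": already_in_aggregate,
        "disks_in_aggregate": disks_in_aggregate,
        "raid_groups": math.ceil(disks_in_aggregate / raid_group_size),
    }


plan = plan_aggregate(total_disks=56, disks_to_add=52,
                      spares_to_keep=1, raid_group_size=28)
```

With the numbers from this post, the figures imply that aggr0 already holds 3 disks, ends up with 55, and needs two RAID groups at a RAID group size of 28.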

Now you should have the first simulator completed.  I took a VM snapshot at this point so I can revert to it after I do my testing.  Next up is configuring the second simulator.  The VMware Converter steps are basically identical except for the VM name, so I won’t repeat that part.  The only thing to be aware of is that when you first power on the VM, you need to press a key other than Enter, just like you did previously, so you can set a different serial number:

 

Configuring unique serial number and system id on second simulator instance

Run through the rest of the configuration steps above and now you should be able to add both simulators into the NetApp Management Console:

Both simulators showing within the NetApp Management Console.

Hopefully that helps you get started with the ONTAP 8.1 simulator.  I’ll have some more posts coming up that will build off of this.

NetApp Introduces New Controllers

One of the more popular controllers (at least, in my experience) that NetApp offered was the FAS250/FAS270.  From the front it looked like a standard DS14 disk shelf, but in the back it contained controller module(s).  You could deploy it as a single controller or in an HA pair, and it made a great option for SMBs: it didn’t consume a lot of space since the controller was built into the shelf, and the price was entry level.  These models went away in favor of the 2000 series controllers, which still offered internal drives (12 in the FAS2020/2040 and 20 in the FAS2050) but used their own form factor and were a dedicated storage appliance.  A problem with these controllers was that when you upgraded the system, the internal drives could not be taken out and put into a disk shelf (such as the DS4243).  There were ways around this limitation, most notably not ordering internal drives in the controllers and using only external storage to ease any concerns over future upgrades, but it was still a limitation in my opinion.

Today NetApp is announcing two new controller models that are reminiscent of the FAS200 line: the FAS2240-2 and the FAS2240-4, which are 2U and 4U in size respectively.  Early performance numbers indicate a 2-3x performance improvement over the FAS2040, depending on workload type.  The FAS2040 will stick around to complete the FAS2000 lineup.  This means all current controllers will be able to run the latest ONTAP software from NetApp (the FAS2020 and FAS2050 did not support ONTAP 8.x).  As I alluded to previously, the FAS2240 is a storage shelf with the controllers inserted into the back.  The 2240-2 is a 2U system based on the current DS2246 SAS shelf, while the 2240-4 is a 4U system based on the current DS4243 shelf.  The FAS2240-2 uses 2.5″ SAS drives and supports either 450GB or 600GB drives as of today.  The FAS2240-4 uses 3.5″ SATA drives and supports 1, 2 or 3TB drives as of today.  Both systems can be ordered with either 12 or 24 drives.

Some quick notes on the new models:

  • Will require ONTAP 8.1+
  • Supports a mezzanine card, which can be either a 2 port FC card or 2 port 10 GbE card
    • If you put an FC card in the mezzanine slot, the ports can be either target or initiator ports, much like onboard FC ports on other controllers today
  • Will support cluster mode, but you have to use the 10 GbE mezzanine card for cluster communication so only iSCSI/CIFS/NFS will work and must be served out of the GbE ports
  • Will come with ONTAP Essentials, which means all storage protocols are included (as well as things like Operations Manager, Protection & Provisioning Manager, DSM/MPIO)
  • Ability to convert from a controller into a disk shelf (much like the FAS200 line)
  • Will not have support for the FlashCache card or FCoE

Front view of the FAS2240-2 controller

Rear view of the FAS2240-2 controller

Front view of the FAS2240-4 controller

Rear view of the FAS2240-4 controller

Also, and in my opinion this is a big one, the maximum volume size is 54TB on the FAS2240, and the maximum volume size with dedupe and/or compression enabled is ALSO 54TB!  This is one of the best features of 8.1.  64-bit aggregates in 8.x allowed us to grow beyond the 16TB aggregate limit, but we were still limited (at least in some environments) to a maximum volume size of 16TB (depending on the controller model) when using compression and/or dedupe.  To reiterate, as of ONTAP 8.1 the maximum volume size for dedupe/compression is now equal to the maximum volume size for the controller, which means it could be anywhere from 30TB on the (now) entry level FAS2040 to 100TB on the highest end FAS6280.  To determine the maximum volume size for your controller, check the System Configuration Guide.

Anyone out there looking at these new controllers from NetApp?

On my way to NetApp Insight 2011

I’m off to the airport for NetApp Insight 2011, which allows me to trade the 50 degree Minnesota temperatures right now for what looks to be a week of low 80 degree temperatures.

For me, one of the big focus areas this year is digging into ONTAP 8.1 cluster mode, which NetApp has said is going to be their future.  The traditional 7-mode will still be around, but (almost?) all R&D will be put into the cluster mode features.  The cluster mode product isn’t anything new to NetApp: they acquired Spinnaker around 2004 and have been offering ONTAP GX, which allowed 24 nodes in a single namespace cluster.  The issue was that the commands were not similar between ONTAP 7G and ONTAP GX, and the feature sets differed.  The paths are starting to converge, and 8.1 offers some features that weren’t found in ONTAP GX, such as:

  • Support for SAN (ONTAP GX only supported NAS protocols)
  • SnapMirror for replication
  • Deduplication

There are a few things that aren’t yet supported (as far as I know) in cluster mode (SnapVault is one that comes to mind) but it’s nice to see it coming closer to feature parity with the 7-mode offering.

A quick summary of my schedule to give you an idea of the things I might be tweeting about:

Tuesday:

  • Virtualization Technologies Technical Keynote

Wednesday:

  • Advanced MultiStore with ONTAP 8.1 7-Mode Hands on Lab
  • Performance Sizing with System Performance Modeler
  • Cluster-Mode Scalable SAN Hands on Lab
  • Automated Failback with Configuring NetApp and VMware SRM 5

Thursday:

  • Technical Guide to Implementing V-Series
  • Monitoring NetApp Storage Through New Eyes with OnCommand
  • VMware View on NetApp: Solution Architecture, Best Practices, and Pitfalls

Friday:

  • Best Practices for Configuring and Converting to 64-bit Aggregates

I’ll be sharing content as much as I can (at least, for the things that aren’t under NDA).  Check my twitter and this blog for updates throughout the week.  Thanks for reading!

Touring the NetApp RTP Datacenter

About a year ago at one of the local Tech OnTap events NetApp did a presentation about their new datacenter and I had been wanting to tour the facility since then.   I was out in the RTP area a few weeks ago for some NetApp training, and luckily I had an extra day where I could finally see the datacenter NetApp calls the Global Dynamic Lab.

The datacenter is impressive, not only in terms of its looks and attention to detail – but also with the amount of equipment they can support at just a fraction of the cost of a typical datacenter.  NetApp was able to reduce construction costs on this datacenter by more than 2/3 and also reduce operating costs by about 60%, and in doing so was still able to deliver more power and cooling per rack than the industry average.  My first thought after hearing this from our tour guide was that it really matches NetApp’s current tag line: Do more with less.

Once you walk onto the datacenter floor, one of the first screens you see gives some environmental info.

At a glance you can see today’s PUE, as well as the month’s average, among other things.  It also shows the outside air info.  Being from Minnesota, my first thought when I landed in Raleigh was how amazing the weather is, and this datacenter takes advantage of it.  Since the weather is 70 degrees or cooler approximately 60% of the time, the datacenter can automatically pull in as much outside air as possible to help with cooling.  The average PUE for this datacenter is around 1.2, one of the lowest in the industry.  A PUE of 1.2 gives an estimated savings of $7 million in operating expenses annually versus a PUE of 2.0.
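For anyone who hasn’t worked with PUE before, it is total facility energy divided by the energy used by the IT equipment alone, so the gap between two PUE values translates directly into overhead energy you do or don’t pay for. Here is a rough sketch of that arithmetic; the IT load and electricity price below are made-up inputs for illustration, not NetApp’s figures, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def annual_energy_cost(it_load_kw, pue, price_per_kwh):
    """Yearly electricity cost for the whole facility.

    PUE = total facility energy / IT equipment energy, so on average
    the facility draws it_load_kw * pue.
    """
    return it_load_kw * pue * HOURS_PER_YEAR * price_per_kwh


def annual_savings(it_load_kw, pue_low, pue_high, price_per_kwh):
    """Cost difference between running the same IT load at two PUE values."""
    return (annual_energy_cost(it_load_kw, pue_high, price_per_kwh)
            - annual_energy_cost(it_load_kw, pue_low, price_per_kwh))


# Hypothetical inputs: 1,000 kW of IT load at $0.05 per kWh.
example_savings = annual_savings(1000, pue_low=1.2, pue_high=2.0,
                                 price_per_kwh=0.05)
```

The savings scale linearly with both the IT load and the price of power, which is why a difference of 0.8 in PUE adds up to millions at datacenter scale.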

After reviewing the screen I was ready to head in and see the rest of the place.  Looking down the main aisle gives you this view:

There are 36 cold rooms, each based on a maximum of 720 kW.  Each rack can go up to 42 kW as long as the total cooling load per cold room does not exceed 720 kW.  There were rows and rows of various NetApp controllers and storage (both demo and a subset of production are run from this location), all different kinds of server vendors, and primarily Cisco for the switching (if there were other vendors, I didn’t see them in the aisles I walked through).  This was definitely a more easy-going tour than the SwitchNAP datacenter tour I had gone on just a month earlier at VMworld.  I kept expecting security guards to be following our every step.  Joking aside, I’d have to say these are without a doubt two of the most impressive data centers I’ve seen.
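The per-rack and per-room limits interact, so here is a quick sketch of how they could be checked together. The three constants come from the tour; the function name and the idea of validating a list of rack loads are my own illustration, not something NetApp showed us.

```python
COLD_ROOMS = 36
ROOM_LIMIT_KW = 720
RACK_LIMIT_KW = 42


def room_is_within_limits(rack_loads_kw):
    """True if no rack exceeds 42 kW and the room total stays within 720 kW."""
    racks_ok = all(load <= RACK_LIMIT_KW for load in rack_loads_kw)
    return racks_ok and sum(rack_loads_kw) <= ROOM_LIMIT_KW


# How many racks could run at the full 42 kW in a single cold room.
max_full_density_racks = ROOM_LIMIT_KW // RACK_LIMIT_KW

# Upper bound on the cooling load across all cold rooms, in kW.
total_cooling_capacity_kw = COLD_ROOMS * ROOM_LIMIT_KW
```

In other words, only 17 racks per cold room could run at the full 42 kW before the room limit kicks in, and the 36 rooms together top out at 25,920 kW.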

There is also a video tour on YouTube that gives a little more detail:

All in all it was a great tour and I’d like to thank @chrisgeb and @that1guynick and other (gasp) non-Twitter users for helping to facilitate.  The only downside was I wasn’t able to take a spare FAS3210 home for my own lab.  At one point I did hear “if you can get it on the plane you can take it home,” though I didn’t take the challenge.

There is a really good whitepaper that NetApp has published on the GDL called Breaking Down the Glass House: NetApp Global Dynamic Lab Delivers Higher Power Density, Greater Efficiency, and Lower Capital and Operating Costs, which may also set a record for the longest name for a whitepaper.  If you are interested in more info on the GDL, I’d highly recommend checking it out.

Deploying NetApp FlexPod? Info on downgrading to ONTAP 7G

The main reason for this blog post is an internal/partner-only Technical Report (specifically, TR-3892).  If you purchased a new NetApp system and are planning to deploy it in a FlexPod configuration, then you will likely need to downgrade the version of ONTAP that the controllers shipped with.  Most likely, they shipped with some flavor of ONTAP 8.  In order to use MultiStore we have to use 7G, as MultiStore isn’t yet supported in ONTAP 8.

As you would expect, this is one of the first steps under the NetApp configuration piece, and the problem is that the steps listed are incorrect.  The process in the guide has you netboot from the LOADER> prompt, enter the special boot menu, and run the 4a command to initialize disks and create a new root volume.

The problem with this method is that it won’t do anything with the disks that were part of the ONTAP 8 install, and once the process is complete you will see output similar to the following when you run aggr status -f:


filer1> aggr status -f

Broken disks

RAID Disk      Device   HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------      ------   ------------- ---- ---- ---- ----- --------------    --------------
label version  0a.10.0  0a    10  0   SA:B   0  BSAS  7200 847555/1735794176 847884/1736466816
label version  0a.10.1  0a    10  1   SA:B   0  BSAS  7200 847555/1735794176 847884/1736466816

filer1>

In the past, I believe there was a way around this (disk unfail -s from priv set advanced and/or label makespare from maintenance mode).  However, those methods don’t seem to work now.  The fix is to upgrade back to 8.0.1 and then follow the proper procedure for downgrading the controllers to 7G, which is to use the revert_to command.  Once you upgrade back to 8.0.1 and reboot the system, you will likely see the following:

PANIC: 2 root volumes found, 2 of which are online.

in SK process config_thread on release NetApp Release 8.0.1

The reason for this is that, now that you have upgraded back to ONTAP 8, the “failed” disks work again, and they hold their own root aggregate and root volume.  To fix that, boot into maintenance mode and run the following:

aggr offline aggr0

aggr options aggr0(1) root

Now the system should boot, and you can delete the still-offline aggregate with the aggr destroy command.  Note that even though I took aggr0 offline and made aggr0(1) the root, the names “flip” once I booted up, since the (1) is assigned to the duplicate aggregate that the controller isn’t booting from.

aggr destroy aggr0(1)

Now that you are back to square 1, the next part is to properly downgrade the system.  Depending on how long the system was running you might have some snapshots you need to delete before reverting to 7G.  In my case, since it’s a new install I deleted all the snapshots on vol0 as well as aggr0 with the following commands:

filer1> snap delete -a vol0

filer1> snap delete -A -a aggr0

You will also need to terminate CIFS & NFS as well as disable SnapVault and SnapMirror before you can successfully run the revert_to command.

cifs terminate

nfs off

options snapvault.enable off

options snapmirror.enable off

Another thing I noticed is that the revert procedure listed in the MyAutoSupport upgrade advisor has a small typo in the command.  It lists the command as simply revert_to when it should be revert_to 7.3.  This is pretty minor, and just running revert_to will give you the possible arguments that should follow the command (I show the output from just running revert_to below).

filer1> software update 7351P4_setup_q.exe -r

filer1> revert_to

usage: revert_to [-f] 7.2 (for 7.2 and 7.2.x)

revert_to [-f] 7.3 (for 7.3 and 7.3.x)

-f   Attempt to force revert.

filer1> revert_to 7.3

When the process is done it will automatically halt the system and leave it at the LOADER> prompt.  Boot the system back up and you should be back on ONTAP 7G with no failed disks, ready to configure the system.  I have another post on the steps I follow to configure a NetApp storage array.

Removing VM snapshots left by CommVault or NetApp

Since I tend to focus on CommVault and NetApp, this post mentions those two products in particular, but if you’ve used any VM-level backup product that quiesces the VM prior to taking the backup, you have likely seen that these VMware snapshots aren’t always removed by the application.  I’ve seen a few different ways to handle this, but since I am a beginner at PowerShell I thought I’d see how I could remove them without using the vSphere Client.

This all started with a power outage over the weekend in the middle of a VMware backup, which left me with a number of VMs carrying the all too familiar __GX_BACKUP__ snapshot name.  I confirmed this by running the following command:

Get-VM | Get-Snapshot | Select Name,VM,SizeMB

Output from PowerCLI showing the existing VM snapshots

This part didn’t tell me anything I didn’t already know: I had a number of VM snapshots left behind by CommVault, all with the same snapshot name.  In this particular case the backup job had failed after creating the snapshots but before doing the backup, which also meant it hadn’t yet come through and removed the VM snapshots.  Next up was running the following command to delete all of those snapshots:

Get-Vm | Get-Snapshot -Name __GX_BACKUP__ | Remove-Snapshot -Confirm:$false

Snapshots being committed

This should kick off a Remove Snapshot job if you have the vSphere Client open.  Let it run and you should have all of your VM snapshots initiated by CommVault committed to disk.

If you are using NetApp SnapManager for VI to back up your virtual machines, your snapshot name will be different.  Instead of __GX_BACKUP__ it will be something along the lines of smvi_<string of numbers>.  To remove those snapshots you can run the following:

Get-Vm | Get-Snapshot -Name smvi_* | Remove-Snapshot -Confirm:$false

Keep in mind that this will delete ALL snapshots that start with smvi_.  If you have valid snapshots taken by SMVI that you’d like to keep, that command isn’t for you.  That should be it.  This is pretty basic, but I always seem to have to deal with leftover snapshots, so I wanted to document this for easy reference.
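Because the wildcard is so broad, it can help to preview which names a pattern would catch before deleting anything. The sketch below only illustrates the matching logic on a list of names; it does not talk to vCenter, the snapshot names are made up, and the function name is a hypothetical helper. It lowercases both sides because PowerShell wildcard matching is case-insensitive by default.

```python
from fnmatch import fnmatchcase


def snapshots_matching(names, pattern):
    """Return the snapshot names that a wildcard pattern would select."""
    pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


# Made-up snapshot names for illustration only.
existing = ["__GX_BACKUP__", "smvi_1234", "smvi_keep_me", "pre-patch"]

to_remove = snapshots_matching(existing, "smvi_*")
```

Here both smvi_1234 and smvi_keep_me are selected, which is exactly the situation to watch out for if one of them is a snapshot you wanted to hold on to.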

CommVault SnapProtect with NetApp NFS Volumes

Introduction

One of the newer features of CommVault Simpana is the ability to perform SnapProtect backups for various applications such as Exchange, SQL or VMware.  This post will focus on VMware and specifically on virtual machines that are on an NFS datastore hosted on a NetApp storage array.

If you aren’t familiar with SnapProtect, it allows you to take hardware-based snapshots called from CommVault.  Being a NetApp reseller, I would compare it to SnapManager-like functionality that you can control from within the CommCell console.  If you are already a CommVault shop I think there are some pretty compelling reasons to take a look at it (and if you are currently licensed by capacity then you really should be looking at it, as there is no extra charge).  If you’ve seen CommVault’s demo video of taking a backup of 500 virtual machines in 17 minutes, they are taking advantage of this functionality.  The only thing I would point out is that if SnapProtect is your only method for backing up your virtual machines, all you are getting for a backup is a hardware-based snapshot.  Ideally you should configure a backup copy to pull that data from the snapshot off the storage array and into your magnetic library within CommVault, and then replicate it over to a disaster recovery location as well.  All of that is possible within CommVault; it just requires a few extra things to configure in addition to the basic SnapProtect functions.

Configuration

One of the things we ran into immediately when configuring this was an issue over whether NFS was even supported (yet) with SnapProtect.  We were originally implementing this shortly after v9 was released and there wasn’t yet a lot of documentation on it.  NFS is fully supported as a datastore type for SnapProtect backups and hopefully this post will help with a few of the stumbling blocks you may hit while configuring it.

Initial SnapProtect error

The first challenge I hit was network related.  I installed the Virtual Server Agent on my Media Agent, which is on the server network VLAN (for the purposes of this post, I’ll use VLAN 10).  We have another VLAN specifically for NFS/IP storage traffic used in our vSphere environment (again, for the purposes of this post, I’ll use VLAN 20).  Initially the media agent wasn’t configured on the storage VLAN 20, and the screenshot to the right shows the initial error I ran into.  The solution was simple enough: the network link from the media agent to the switch was reconfigured as a trunk link with access to VLANs 10 and 20.

Array Management Control Panel

Now that I had the media agent configured with a storage VLAN, my next issue was that it was still trying to use the server VLAN when it connected to the NetApp array.  This ended up being caused by how I set up the NetApp array within the Array Management Control Panel in the CommCell.  After removing the entry that was configured with a hostname and instead using the IP address of the storage VLAN on the NetApp, I was able to connect successfully.  The lesson learned was that this isn’t just a communication channel between the media agent and the storage device: CommVault matches the IP address listed in the Array Management Control Panel to the datastore IP address that is used within vSphere.  One thing we didn’t run into, but that you should keep in mind, is that if you are using IP aliases you need to add multiple entries for the same NetApp host, one entry per IP alias.  It will continue through the list trying to find a match, so the order of the arrays within the Array Management Control Panel isn’t important.
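To make the matching behavior concrete, here is a toy model of it. This is only my mental picture of what I observed, not CommVault’s actual implementation; the entry layout, the example addresses (from the documentation-only 192.0.2.0/24 range), and the function name are all assumptions for illustration.

```python
def find_array_for_datastore(array_entries, datastore_ip):
    """Walk the configured array entries and return the first exact match.

    The comparison is on the address string itself, so an entry stored
    as a hostname never matches a datastore mounted by IP address.
    """
    for entry in array_entries:
        if entry["address"] == datastore_ip:
            return entry
    return None


# Hypothetical entries: one by hostname, then one per storage IP alias.
entries = [
    {"name": "netapp1 (hostname entry)", "address": "netapp1.example.com"},
    {"name": "netapp1 (storage VLAN)", "address": "192.0.2.20"},
    {"name": "netapp1 (IP alias)", "address": "192.0.2.21"},
]

match = find_array_for_datastore(entries, "192.0.2.21")
```

Since every entry is checked until one matches, the position of an entry in the list doesn’t change the result; what matters is that each IP the datastores are mounted from has its own entry.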

Current Limitations

Having followed NetApp’s best practice guide, I had the page file on a separate VMDK file stored on a separate datastore in the vSphere environment.  Unfortunately SnapProtect doesn’t support this today; the VMDK files must all reside within the same datastore.  You also don’t have the ability to exclude a VMDK file or datastore, something that I hope will be added in a future release.  This is one area where NetApp’s SnapManager tool has an advantage, as you can choose not to snap the volume used for pagefiles and keep the space used for backups to a minimum.  I like that approach because it keeps the snapshot sizes as lean as possible; I don’t need to back up transient data like the pagefile.  If you followed the (optional) recommendation in NetApp’s best practices for vSphere (TR-3749), you would need to Storage vMotion the pagefile VMDK to the same datastore as the rest of the virtual machine files before continuing with SnapProtect.

Another limitation I’ve run into with NFS volumes and NetApp is that if you follow best practices and disable the setting for automatically exporting a volume, copying the snapshot into your magnetic library will not work.  When you configure the backup copy to CommVault’s magnetic library, it creates a FlexClone of the volume and uses that to bring the data in.  Without options nfs.export.auto-update set to on, the FlexClone won’t have the necessary NFS permissions and will fail to mount in vSphere.  After installing CommVault SP2 recently I no longer have this issue, although I don’t see it mentioned in the release notes for the service pack.

The SnapProtect features are relatively new and hopefully these items are on the roadmap to change as it really is impressive seeing array level snapshots being orchestrated by CommVault.

Upgrading to NetApp ONTAP 8.0.1

I sat in on a good WebEx the other day that discussed ONTAP 8.0.1 upgrade best practices and considerations.  The majority of information I already had in my checklist but I added a few things and wanted to post it as I’m sure others are about to/in the process of upgrading to ONTAP 8.0.1 and this can help cover some of the bases.

The first thing to be aware of is that, of the currently shipping products from NetApp, the 2020 and 2050 are not supported with ONTAP 8.x.  Other common systems that are not capable of running v8 are the 3020 and 3050.  I’d recommend checking the System Configuration Guide for the appropriate FAS model.  While you are in the System Configuration Guide, another item to note is that it tells you what the minimum root volume size should be, and there is a good chance you will need to increase the size of your current vol0 to perform the upgrade.  As an example, the FAS3070 lists the minimum vol0 size as 230GB.

You will also want to make sure you don’t require a feature that isn’t yet supported in ONTAP 8.0.1.  The main ones not yet supported are:

  • Data Motion for MultiStore vFiler
  • IPv6
  • SnapLock
  • IPsec

The Flash Cache (formerly PAM II) card is now supported with 64-bit aggregates in 8.0.1, which wasn’t the case in 8.0.

Make sure you have enough free space in your aggregate; the Tool Chest has a utility to identify the amount of free space required.  If you have any volumes with LUNs, make sure they have at least 1 MB of free space in them.

Next I would run the HA Config Checker tool.  I blogged previously about how to use this to spot issues that might cause problems with a controller failover/giveback.

If you are using SnapMirror you will want to upgrade your destination systems first.  Double check the Interoperability Matrix and make sure everything will be supported with the new ONTAP version.  Also be aware of switch firmware requirements in the case of a V-Series controller.

Run the upgrade advisor.  This utility is excellent; it used to require premium AutoSupport, but I believe it is now available to all customers.  Include the options for verbose output and a back-out plan, and save the output file.

Now that the planning/documentation phase is done, the next step is to backup the config and take a snapshot of vol0:

config dump pre-upgrade
snap create vol0 pre-upgrade
logger starting ONTAP upgrade

The config dump file is saved to /etc/configs, and I would copy it to your local machine before starting the upgrade.  The other step I follow is to enable logging on the console session and run the following commands:

options
igroup show -v
lun show -v
rdfile /etc/exports
rdfile /etc/cifsconfig_share.cfg
rdfile /etc/rc

Comparing options between 7G and 8.0.1

Those are the main ones I run, but there may be others depending on what you use in your environment.  Logging the output of the options command before and after the upgrade is nice because you can run a diff on the two and see what the new options are, as well as whether any of the defaults have changed.  I use DeltaWalker on OS X, although WinMerge (free) or Beyond Compare (not free) are good options for Windows people.  Almost all of what is logged is also saved to the MyAutoSupport page under Raw AutoSupport Data, but if you are performing the upgrade in the wee hours of the morning it’s generally helpful to be able to reference that information as quickly as possible.
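If you would rather script the comparison than eyeball it in a diff tool, something like the following works on the two logged outputs. It is a minimal sketch that assumes each line is an option name followed by its value, and it compares by name rather than doing a textual diff; the sample option example.new.option is invented purely for illustration, and the function names are hypothetical.

```python
def parse_options(text):
    """Turn logged `options` output into a {name: value} dictionary."""
    options = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        value = parts[1].strip() if len(parts) > 1 else ""
        options[parts[0]] = value
    return options


def compare_options(before_text, after_text):
    """Return (added, removed, changed) between two option listings."""
    before = parse_options(before_text)
    after = parse_options(after_text)
    added = {k: v for k, v in after.items() if k not in before}
    removed = {k: v for k, v in before.items() if k not in after}
    changed = {k: (before[k], after[k])
               for k in before if k in after and before[k] != after[k]}
    return added, removed, changed


before_log = "autosupport.enable off\nnfs.export.auto-update on\n"
after_log = ("autosupport.enable off\n"
             "nfs.export.auto-update off\n"
             "example.new.option on\n")  # invented option, illustration only

added, removed, changed = compare_options(before_log, after_log)
```

The changed dictionary is the interesting one after an upgrade, since it shows options whose values moved without you touching them.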

If the system is on a version older than 7.3, it is recommended to go to 7.3 first and then go to 8.0.1.  As mentioned previously you may need to resize the root volume, a safe size is 250GB which covers all but the 6xxx series controllers (again, reference the System Configuration Guide to find out the minimum root volume size for your controller model).  Also make sure you don’t have any SnapMirror/Dedupe processes running.  If you have a vif created on the controller, the upgrade won’t automatically update the naming.  The new term is ifgrp so you will have to manually edit the /etc/rc file to the new naming convention.
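The vif to ifgrp edit in /etc/rc can be done by hand, but here is a rough sketch of the same change in script form for anyone with a lot of lines to touch. It only rewrites lines where vif is the command itself, leaves interface names such as vif0 alone, and assumes you work on a copy of the file rather than the live one; the sample lines and function names are illustrative assumptions, so review the result before using it.

```python
import re

# Match "vif" only when it is the first word on the line (the command).
_VIF_COMMAND = re.compile(r"^(\s*)vif(\s+)")


def convert_rc_line(line):
    """Rename a leading `vif` command to `ifgrp`; leave other lines as-is."""
    return _VIF_COMMAND.sub(r"\1ifgrp\2", line)


def convert_rc(text):
    """Apply the rename to every line of an /etc/rc file's contents."""
    return "\n".join(convert_rc_line(line) for line in text.splitlines())


# Illustrative /etc/rc fragment, not taken from a real system.
sample_rc = "\n".join([
    "hostname filer1",
    "vif create lacp vif0 -b ip e0a e0b",
    "ifconfig vif0 192.0.2.10 netmask 255.255.255.0",
])

converted = convert_rc(sample_rc)
```

Reversing the substitution (ifgrp back to vif) would cover the revert case mentioned in the list that follows.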

Now for a couple of new items I took from the WebEx:

  • Software update is now the preferred method instead of software install, the main difference being that software update has some error-checking built into it whereas software install does not
  • SnapMirror restart checkpoints are deleted during the upgrade (this doesn’t mean you have to re-baseline the SnapMirror, just that if the last incremental transfer was 20% done – you’d have to start over on that incremental transfer, you cannot resume from 20%)
  • You cannot revert to a release prior to 7.3 (using the revert_to command)
  • Delete all snapshots taken on the 8.x system before reverting to 7G
  • Need to change ifgrp back to vif in /etc/rc

The only thing left is testing after the upgrade is complete: test cluster failover in both directions and ensure connectivity with all protocols.

NetApp ONTAP 8.0.1 is now GA

This morning NetApp (finally) posted the GA version of ONTAP 8.0.1, a release I’ve been anxiously awaiting.  If you aren’t familiar with the benefits of 8.0.1, some of the features that interest me the most are:

  • VMware VAAI support with vSphere 4.1
  • Root volume (vol0) can be on a 64 bit aggregate
  • Data Motion for Volumes to non-disruptively migrate block storage from one aggregate to another*
  • Compression for performance insensitive data

The full release notes are available here: https://now.netapp.com/NOW/knowledge/docs/ontap/rel801/html/ontap/rnote/frameset.html

The software is available for download here: https://now.netapp.com/NOW/download/software/ontap/8.0.1/

* There are some caveats with Data Motion for Volumes, see my previous blog post on it for the details

Using VAAI with VMware and the NetApp 8.0.1 Simulator

2011-12-01 Update: I have a new post on using the latest version of the NetApp ONTAP Simulator (8.1) here; follow those instructions to deploy the newest simulator in your environment.

This one is pretty straightforward, but I had a few people ask so I am making a post on it.  As you may know, with vSphere 4.1 and NetApp ONTAP 8.0.1, VAAI is enabled by default on an iSCSI or FC connection.

I’m using the simulator setup I described in my last post, and as you can see below I have added it into my vSphere environment, where it is now displayed within the NetApp Virtual Storage Console.  You can see that even though I haven’t “configured” anything for it on the NetApp side, the VAAI capable column for TST-NA1 shows enabled, while the other 2 arrays (a FAS3020, which is not running ONTAP 8.0.1; also of note is that this platform is not capable of running ONTAP 8.0.1, only ONTAP 7G) are displayed as not being VAAI capable.

You can also verify this from the vSphere side, using Remote Tech Support (SSH) on ESXi, the vMA, or the Service Console on ESX classic.  In my case I am using ESXi and chose to use the vSphere Management Assistant (vMA).  I ran vicfg-scsidevs -l and was able to see that my NetApp iSCSI LUN is supported for VAAI.  NetApp has also published a good Technical Report on using VAAI, which goes into detail on how to view the VAAI statistics from the array side using the stats show vstorage command.  It does a good job of explaining what the counters are, not only from the stats show command on the array but also from the esxtop output.
