February 5, 2012

Deploying NetApp FlexPod? Info on downgrading to ONTAP 7G

I wrote this post because of an internal/partner-only Technical Report (specifically, TR-3892).  If you purchased a new NetApp system and plan to deploy it in a FlexPod configuration, you will likely need to downgrade the version of ONTAP that the controllers shipped with.  Most likely, they shipped with some flavor of ONTAP 8.  To use MultiStore we have to run 7G, as MultiStore isn’t yet supported in ONTAP 8.

As you would expect, this is one of the first steps in the NetApp configuration section, and the problem is that the steps listed are incorrect.  The guide has you netboot from the LOADER> prompt, enter the special boot menu and run the 4a command to initialize disks and create a new root volume.

The problem with this method is that it won’t do anything with the disks that were part of the ONTAP 8 install, and once the process is complete you will see output similar to the following when you run aggr status -f:


filer1> aggr status -f

Broken disks

RAID Disk       Device   HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------   ------------- ---- ---- ---- ----- --------------    --------------
label version   0a.10.0  0a    10  0   SA:B   0  BSAS  7200 847555/1735794176 847884/1736466816
label version   0a.10.1  0a    10  1   SA:B   0  BSAS  7200 847555/1735794176 847884/1736466816

filer1>

In the past, I believe there was a way around this (disk unfail -s from priv set advanced and/or label makespare from maintenance mode).  However, those methods don’t seem to work now.  The fix is to upgrade back to 8.0.1 and then follow the proper procedure for downgrading the controllers to 7G, which uses the revert_to command.  Once you upgrade back to 8.0.1 and reboot the system you will likely see the following:

PANIC: 2 root volumes found, 2 of which are online.

in SK process config_thread on release NetApp Release 8.0.1

The reason is that now that you have upgraded back to ONTAP 8, the “failed” disks work again, and they hold a root aggregate and a root volume.  To fix that, boot into maintenance mode and run the following:

aggr offline aggr0

aggr options aggr0(1) root

Now the system should boot, and you can delete the still-offline aggregate with the aggr destroy command.  Note that even though I took aggr0 offline and made aggr0(1) the root, the names “flip” once the system boots, since the (1) is assigned to the duplicate aggregate that the controller isn’t booting from.

aggr destroy aggr0(1)

Now that you are back to square one, the next part is to properly downgrade the system.  Depending on how long the system was running, you might have some snapshots you need to delete before reverting to 7G.  In my case, since it’s a new install, I deleted all the snapshots on vol0 as well as aggr0 with the following commands:

filer1> snap delete -a vol0

filer1> snap delete -A -a aggr0

You will also need to terminate CIFS & NFS as well as disable SnapVault and SnapMirror before you can successfully run the revert_to command.

cifs terminate

nfs off

options snapvault.enable off

options snapmirror.enable off
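
The pre-revert steps above can be collected into one ordered checklist. This is only a rough Python sketch, not a NetApp tool: the function name pre_revert_commands and its parameters are made up for illustration, and it does nothing more than build the command strings quoted in this post so they can be reviewed or pasted into the console.

```python
def pre_revert_commands(volumes, aggregates):
    """Return the console commands from this post, in the order they were run.

    Hypothetical helper: it only formats strings, it does not talk to a filer.
    """
    commands = []
    # Delete all volume snapshots first, then all aggregate snapshots.
    for vol in volumes:
        commands.append(f"snap delete -a {vol}")
    for aggr in aggregates:
        commands.append(f"snap delete -A -a {aggr}")
    # Stop the protocols and features that have to be off before revert_to.
    commands += [
        "cifs terminate",
        "nfs off",
        "options snapvault.enable off",
        "options snapmirror.enable off",
    ]
    return commands


if __name__ == "__main__":
    for line in pre_revert_commands(["vol0"], ["aggr0"]):
        print(line)
```

On a new install like mine the only inputs are vol0 and aggr0; a system that has been running for a while may have more volumes to list.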

Another thing I noticed is that the revert procedure listed in the MyAutoSupport Upgrade Advisor has a small typo in the command.  It lists the command as simply revert_to when it should be revert_to 7.3.  This is pretty minor, and running revert_to on its own prints the arguments it accepts (I show that output below).

filer1> software update 7351P4_setup_q.exe -r

filer1> revert_to

usage: revert_to [-f] 7.2 (for 7.2 and 7.2.x)

revert_to [-f] 7.3 (for 7.3 and 7.3.x)

-f   Attempt to force revert.

filer1> revert_to 7.3

When the process is done it will automatically halt the system and leave it at the LOADER> prompt.  Boot the system back up and you should be back on ONTAP 7G with no failed disks, ready to configure the system.  I have another post on the steps I follow to configure a NetApp storage array.
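
Since the usage output only lists 7.2 and 7.3 as valid targets, the argument can be derived from the 7G release you installed with software update. Here is a small Python sketch of that mapping; revert_target is a hypothetical name, and the only rule it encodes is the one shown in the usage text above.

```python
def revert_target(release):
    """Map a 7G release string such as '7.3.5.1P4' to the revert_to argument."""
    # Keep only the first two dotted fields, e.g. '7.3.5.1P4' -> '7.3'.
    family = ".".join(release.split(".")[:2])
    if family not in ("7.2", "7.3"):
        raise ValueError(f"revert_to only lists 7.2 and 7.3, got {release!r}")
    return family


if __name__ == "__main__":
    print("revert_to", revert_target("7.3.5.1P4"))
```

For the 7.3.5.1P4 image used in this post that gives revert_to 7.3, which is the command I ran.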


Upgrading to NetApp ONTAP 8.0.1

I sat in on a good WebEx the other day that discussed ONTAP 8.0.1 upgrade best practices and considerations.  The majority of the information I already had in my checklist, but I added a few things and wanted to post it, as I’m sure others are about to upgrade or in the process of upgrading to ONTAP 8.0.1 and this can help cover some of the bases.

The first thing to be aware of is that of the current shipping products from NetApp, the 2020 and 2050 are not supported with ONTAP 8.x.  Other common systems that are not capable of running v8 are the 3020 and 3050.  I’d recommend checking the System Configuration Guide for the appropriate FAS model.  While you are in the System Configuration Guide, another item to note is that it tells you what the minimum root volume size should be, and there is a good chance you will need to increase the size of your current vol0 to perform the upgrade.  As an example, the FAS3070 lists the minimum vol0 size as 230GB.
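
The two checks above (platform support and minimum vol0 size) are easy to write down as a lookup. This Python sketch is only an illustration: the names are made up, the model list and the 230GB figure are the ones quoted in this post, and every other model still has to be looked up in the System Configuration Guide.

```python
# Models this post lists as unable to run ONTAP 8.x.
UNSUPPORTED_MODELS = {"2020", "2050", "3020", "3050"}

# Minimum root volume size in GB. Only the FAS3070 value comes from this post;
# add other models from the System Configuration Guide.
MIN_VOL0_GB = {"3070": 230}


def upgrade_blockers(model, vol0_gb):
    """Return a list of reasons the upgrade cannot start yet (empty if none)."""
    blockers = []
    if model in UNSUPPORTED_MODELS:
        blockers.append(f"FAS{model} cannot run ONTAP 8.x")
    minimum = MIN_VOL0_GB.get(model)
    if minimum is not None and vol0_gb < minimum:
        blockers.append(f"vol0 is {vol0_gb}GB, minimum is {minimum}GB")
    return blockers


if __name__ == "__main__":
    print(upgrade_blockers("3070", 100))
```

An empty list only means nothing in these two tables objected, not that the system is ready.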

You will also want to make sure you don’t require a feature that isn’t yet supported in ONTAP 8.0.1.  The main ones not yet supported are:

  • Data Motion for MultiStore vFiler
  • IPv6
  • SnapLock
  • IPsec

The Flash Cache (formerly PAM II) card is now supported with 64 bit aggregates in 8.0.1; this combination wasn’t supported in 8.0.

Make sure you have enough free space in your aggregate; the Tool Chest has a utility to identify the amount of free space required.  If you have any volumes with LUNs, make sure they have at least 1 MB of free space in them.

Next I would run the HA Config Checker tool.  I blogged previously about how to use it to spot issues that might cause problems with a controller failover/giveback.

If you are using SnapMirror you will want to upgrade your destination systems first.  Double check the Interoperability Matrix and make sure everything will be supported with the new ONTAP version.  Also be aware of switch firmware requirements in the case of a V-Series controller.

Run the Upgrade Advisor.  This utility is excellent; it used to require premium AutoSupport but I believe it is now available to all customers.  Include the options for verbose output and a back-out plan, and save the output file.

Now that the planning/documentation phase is done, the next step is to back up the config and take a snapshot of vol0:

config dump pre-upgrade
snap create vol0 pre-upgrade
logger starting ONTAP upgrade

The config dump file is saved to /etc/configs and I would copy that to your local machine before starting the upgrade.  The other step I follow is to enable logging on the console session and run the following commands:

options
igroup show -v
lun show -v
rdfile /etc/exports
rdfile /etc/cifsconfig_share.cfg
rdfile /etc/rc

Comparing options between 7G and 8.0.1

Those are the main ones I do but there may be others depending on what you utilize in your environment.  Logging the output of the options command before and after the upgrade is nice because you can run a diff on the two and see what the new options are, as well as if any of the defaults have changed.  I use DeltaWalker on OSX although WinMerge (free) or Beyond Compare (not free) are good options for Windows people.  Almost all of what is logged is saved to the MyAutoSupport page as well under Raw AutoSupport Data, but if you are performing the upgrade in the wee hours of the morning it’s generally helpful to be able to reference that information as quickly as possible.
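
If you would rather script the comparison than open a diff tool, the idea can be sketched in a few lines of Python. This is a minimal sketch under one assumption: that each line of the logged options output is an option name followed by its value, separated by whitespace. The function names are made up, and example.new_option in the sample is a placeholder rather than a real ONTAP option.

```python
def parse_options(text):
    """Parse 'name value' lines from logged `options` output into a dict."""
    parsed = {}
    for line in text.splitlines():
        fields = line.split(None, 1)  # split on the first run of whitespace
        if not fields:
            continue  # skip blank lines
        name = fields[0]
        value = fields[1].strip() if len(fields) > 1 else ""
        parsed[name] = value
    return parsed


def compare_options(before_text, after_text):
    """Return (added names, removed names, {name: (old, new)} for changes)."""
    before = parse_options(before_text)
    after = parse_options(after_text)
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = {
        name: (before[name], after[name])
        for name in sorted(set(before) & set(after))
        if before[name] != after[name]
    }
    return added, removed, changed


if __name__ == "__main__":
    # Made-up sample values, not real before/after output.
    before = "autosupport.enable on\ncifs.show_snapshot off\n"
    after = "autosupport.enable on\ncifs.show_snapshot on\nexample.new_option off\n"
    print(compare_options(before, after))
```

Anything that appears after the value on a line is kept as part of the value, so trailing notes in the real output will show up in the comparison too.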

If the system is on a version older than 7.3, it is recommended to go to 7.3 first and then go to 8.0.1.  As mentioned previously you may need to resize the root volume, a safe size is 250GB which covers all but the 6xxx series controllers (again, reference the System Configuration Guide to find out the minimum root volume size for your controller model).  Also make sure you don’t have any SnapMirror/Dedupe processes running.  If you have a vif created on the controller, the upgrade won’t automatically update the naming.  The new term is ifgrp so you will have to manually edit the /etc/rc file to the new naming convention.
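
The /etc/rc edit for the vif naming can be sketched as a small text rewrite. This Python example assumes that only the command word at the start of a line changes from vif to ifgrp and that existing interface names such as vif1 can stay as they are; the function name and the sample rc lines are made up for illustration, so check the result against your own file before copying it back.

```python
import re

# Match `vif` only when it is the command at the start of a line, so that
# interface names such as vif1 elsewhere on the line are left alone.
_VIF_COMMAND = re.compile(r"^([ \t]*)vif([ \t]+)", re.MULTILINE)


def rename_vif_commands(rc_text):
    """Rewrite leading `vif` commands in /etc/rc text to `ifgrp`."""
    return _VIF_COMMAND.sub(r"\g<1>ifgrp\g<2>", rc_text)


if __name__ == "__main__":
    # Hypothetical /etc/rc excerpt.
    sample = "vif create multi vif1 -b ip e0a e0b\nifconfig vif1 up\n"
    print(rename_vif_commands(sample), end="")
```

Swapping the two words in the pattern and the replacement gives the reverse edit needed when going back to 7G.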

Now for a couple of new items I took from the WebEx:

  • Software update is now the preferred method instead of software install, the main difference being that software update has some error-checking built into it whereas software install does not
  • SnapMirror restart checkpoints are deleted during the upgrade (this doesn’t mean you have to re-baseline the SnapMirror, just that if the last incremental transfer was 20% done – you’d have to start over on that incremental transfer, you cannot resume from 20%)
  • You cannot revert to a release prior to 7.3 (using the revert_to command)
  • Delete all snapshots taken on the 8.x system before reverting to 7G
  • When reverting, you need to change ifgrp back to vif in /etc/rc

The only thing left is testing after the upgrade is complete: test cluster failover in both directions and ensure connectivity with all protocols.


NetApp ONTAP 8.0.1 is now GA

This morning NetApp (finally) posted the GA version of ONTAP 8.0.1, a release I’ve been anxiously awaiting.  If you aren’t familiar with the benefits of 8.0.1, some of the features that interest me the most are:

  • VMware VAAI support with vSphere 4.1
  • Root volume (vol0) can be on a 64 bit aggregate
  • Data Motion for Volumes to non-disruptively migrate block storage from one aggregate to another*
  • Compression for performance insensitive data

The full release notes are available here: https://now.netapp.com/NOW/knowledge/docs/ontap/rel801/html/ontap/rnote/frameset.html

The software is available for download here: https://now.netapp.com/NOW/download/software/ontap/8.0.1/

* There are some caveats with Data Motion for Volumes, see my previous blog post on it for the details


Using VAAI with VMware and the NetApp 8.0.1 Simulator

2011-12-01 Update: I have a new post on using the latest version of the NetApp ONTAP Simulator (8.1) here; follow those instructions to deploy the newest simulator in your environment.

This one is pretty straightforward, but I had a few people ask so I am making a post on it.  As you may know, with vSphere 4.1 and NetApp ONTAP 8.0.1, VAAI is enabled by default on an iSCSI or FC connection.


I’m using the simulator setup I described in my last post, and I have now added it to my vSphere environment, where it is displayed within the NetApp Virtual Storage Console.  Even though I haven’t “configured” anything for it on the NetApp side, the VAAI capable column for TST-NA1 shows enabled, while the other 2 arrays (a FAS3020 which is not running ONTAP 8.0.1; also of note, this platform is not capable of running ONTAP 8.0.1, only ONTAP 7G) are displayed as not being VAAI capable.


You can also verify this from the vSphere side, either by enabling Remote Tech Support (SSH) or using the vMA if you are on ESXi, or from the Service Console if you are on ESX classic.  In my case I am using ESXi and chose to use the vSphere Management Assistant (vMA).  I ran vicfg-scsidevs -l and was able to see that my NetApp iSCSI LUN is supported for VAAI.  NetApp has also published a good Technical Report on using VAAI, which goes into detail on how to view the VAAI statistics from the array side using the stats show vstorage command.  It does a good job of explaining the counters, not only from the stats show command on the array but also from the esxtop output.


Getting Started with the NetApp ONTAP 8.0.1 Simulator

2011-12-01 Update: I have a new post on using the latest version of the NetApp ONTAP Simulator (8.1) here; follow those instructions to deploy the newest simulator in your environment.

Now that the ONTAP 8.0.1 simulator is (finally) available, I thought I’d mention a few basic tweaks to get the most out of running the sim.  There are some settings that you probably don’t need enabled when running a simulator (such as AutoSupport), and since it comes with 28GB worth of “disk” you will likely want to maximize the usable space.

If you want to use the 8.0.1 simulator on a VMware ESX or ESXi server, see my other post on this – the same steps still work.

The first thing I normally do on the simulator is disable AutoSupport; there’s really no reason to have AutoSupport enabled on a system I’m using for testing purposes.

options autosupport.enable off

Next up is assigning the rest of the disks.  By default only the minimum disks required to boot the system are assigned (3 for RAID-DP), and there are 25 other disks to assign to the system.

disk assign all

Now we need to add the license keys.  The following license keys work for the simulator (note, these will not work on a physical system):

a_sis                 MTVVGAF
cifs                  DZDACHD
disk_sanitization     PZKEAZL
http                  NAZOMKC
flex_clone            ANLEAZL
iscsi                 BSLRLTG
multistore            NQBYFJJ
nearstore_option      ELNRLTG
nfs                   BQOEAZL
smdomino              RKBAFSN
smsql                 HNGEAZL
snapmanagerexchange   BCJEAZL
snapmirror            DFVXFJJ
snapmirror_sync       XJQIVFK
snaprestore           DNDCBQH
snapvalidator         JQAACHD
sv_linux_pri          ZYICXLC
sv_ontap_pri          PVOIVFK
sv_ontap_sec          PDXMQMI
sv_unix_pri           RQAYBFE
sv_windows_ofm_pri    ZOFNMID
sv_windows_pri        ZOPRKAM
syncmirror_local      RIQTKCL

Note: These licenses are also listed on this NetApp Communities page.  My list does not include the licenses that do not work on the 8.0.1 simulator (such as FCP or SnapLock).  Also, you will want to run cifs setup and iscsi start after you enter the license keys.

At this point we need to decide how to carve up the remaining disks.  The root volume is on a 3 disk aggregate which is 32 bit.  We could add the remaining disks to the existing aggregate, or we could create a new 64 bit aggregate (although since we won’t come anywhere near maxing out a 32 bit aggregate, I’m going to continue down that route).  Keep in mind that in 8.0.1 you can now have the root volume on a 64 bit aggregate, although there are some challenges getting the data from a 32 bit aggregate to a 64 bit aggregate.  I’ve just used the ndmpcopy command to do it.

The default RAID group size is 16.  Since I don’t want to lose 2 more of my “disks” to parity drives, I’m going to set the RAID group size to 28:

aggr options aggr0 raidsize 28

Now we can add the remaining spare disks into the aggregate.  I’m going to add all of them, leaving no spares.  Normally there is an option to silence the complaints about having no spare disks (options raid.min_spare_count), but unfortunately that doesn’t work with more than 16 drives on the system.  If you don’t want to see that error, just add 24 instead of 25 in the following step:

aggr add aggr0 25
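
The reason for raising the RAID group size comes down to simple arithmetic, sketched here in Python. It assumes all 28 simulator disks end up in aggr0 with no spares and that RAID-DP costs two parity disks per RAID group; the function names are made up for illustration.

```python
import math

PARITY_PER_GROUP = 2  # RAID-DP uses two parity disks per RAID group


def parity_disks(total_disks, raid_group_size):
    """Parity disks needed when total_disks are packed into full-size groups."""
    groups = math.ceil(total_disks / raid_group_size)
    return groups * PARITY_PER_GROUP


def data_disks(total_disks, raid_group_size):
    """Disks left over for data after parity is taken out."""
    return total_disks - parity_disks(total_disks, raid_group_size)


if __name__ == "__main__":
    for size in (16, 28):
        print(size, parity_disks(28, size), data_disks(28, size))
```

With the default size of 16 the 28 disks split into two RAID groups and 4 parity disks; with a size of 28 they fit in one group with 2 parity disks, which is where the 2 extra data “disks” come from.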

Next up is to reclaim the aggregate snap reserve space of 5%:

snap reserve -A aggr0 0
snap sched -A aggr0 0

That should be enough to get you going.  The nice part about the 8.0.1 simulator is that it allows you to play around with VAAI with vSphere 4.1 (using iSCSI).  I’ll cover that in my next post.


NetApp Setup Guide

I’ve had this posted for a while at the top of my NetApp Reference Page, but I wanted to move it into a post so that it has a permalink.

Setup Guide for NetApp Installs

Pre-Install: Infrastructure

  • Ensure physical cabling is done (verify that the appropriate number of Ethernet/FC switch ports are available and the cables have been run to the location where the NetApp will reside)
    • Ethernet
      • Management port cable run (1 per head)
      • IP address to use for management port(s)
      • Ethernet port(s) cable(s) run (varies per configuration)
      • Document all IP addresses that will be assigned to interfaces
      • Ethernet switches configured properly and relevant information sent to implementation engineer
        • Jumbo frames if required
        • EtherChannel configuration if required
        • Necessary vlans configured
    • Fibre Channel
      • Cables run (varies per configuration)
      • Appropriate zoning done on switches
    • Power available

Pre-Install: Administrative

  • Document contact info for all parties involved (technical resources, project managers, etc)
  • Document address for where the NetApp controller(s) will be located
  • Document licenses for all protocols
  • Document hostname(s) to use for NetApp controller(s)
  • Document the following information and verify compatibility with support matrix
    • For all hosts connecting to the NetApp via a FC/iSCSI connection
      • OS Version (eg Windows Server 2003 R2 Standard x86)
      • HBA Models
        • Current driver version
        • Current firmware version
      • MPIO version
      • Host Utilities Kit version
      • SnapDrive version
    • FC Switches
      • Switch model(s)
      • Firmware version
    • ONTAP version

Install: Administrative

  • Register customer info on NOW site
  • Obtain, document and install all software licenses

Install: Upgrade to Latest Versions

  • Upgrade ONTAP to latest
  • Upgrade RLM/BMC to latest
  • Upgrade system firmware to latest
  • Upgrade disk firmware to latest
  • Upgrade shelf firmware to latest

Install: Initial Setup Best Practices

  • Configure RLM/BMC
  • Configure SSH
options ssh.access "host=host1,host2 AND if=e0"
  • Disable telnet
  • Configure SSL
  • Disable HTTP
  • Configure AutoSupport
  • Configure NTP/timed
options timed.enable on
options timed.proto ntp
options timed.servers 0.us.pool.ntp.org,1.us.pool.ntp.org,2.us.pool.ntp.org
options timed.sched 1h
options timed.window 5m
options timed.max_skew 3h
options timed.log off
  • Adjust aggregate snap reserve (if necessary)
snap reserve -A aggr0 0
  • Set RAID group size appropriately
  • Assign odd disks to 1st head and even disks to 2nd head
    • Ensure appropriate spare disks are kept
  • Resize vol0
    • Set to 20g for smaller filers and 50g for larger filers
  • Set ucode options on vol0 (new volumes will inherit this option)
vol options vol0 create_ucode on
vol options vol0 convert_ucode on
  • Set appropriate security style for vol0 (unix or ntfs)

options wafl.default_security_style unix (or ntfs)
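
The “odd disks to 1st head, even disks to 2nd head” item in the list above can be planned ahead of time. This Python sketch assumes the number after the last dot in the disk name is what decides odd or even; the function name and the sample disk names are made up for illustration, and it only sorts names into two lists without assigning anything.

```python
def split_disks_by_parity(disk_names):
    """Return (odd, even) lists based on the number after the last dot."""
    odd, even = [], []
    for name in disk_names:
        disk_id = int(name.rsplit(".", 1)[1])
        (odd if disk_id % 2 else even).append(name)
    return odd, even


if __name__ == "__main__":
    # Hypothetical disk names in the same style as the aggr status output.
    first_head, second_head = split_disks_by_parity(
        ["0a.10.0", "0a.10.1", "0a.10.2", "0a.10.3"]
    )
    print("1st head:", first_head)
    print("2nd head:", second_head)
```

Remember to hold back the spare disks each head needs before assigning the rest.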

Install: Protocol Configuration

Configure NFS:

  • Disable auto export of NFS volumes
options nfs.export.auto-update off

Configure iSCSI:

  • Disable iSCSI on unused interfaces/vifs
iscsi interface disable vif1
  • Disable protocols on ethernet ports if required (7.3 and later)
options interface.blocked.cifs e5b
options interface.blocked.nfs e1a,e1b
options interface.blocked.iscsi e5b
options interface.blocked.cifs "" (undoes the previous restriction)
  • Disable WINS on unnecessary interfaces (-wins on ifconfig)

Configure CIFS:

  • Make sure snapshots are visible from CIFS clients
options cifs.show_snapshot on
  • Make sure the previous versions tab is integrated
options cifs.ms_snapshot_mode xp
  • Disable default home share

Install: Add on Software Configuration

Installing SnapDrive

  • Install necessary patches
  • Install .NET
  • Install Host Utilities Kit

Install: Verify Configuration

Post Install: Test Plan

  • Verify data is still accessible after the following hardware failures
    • Shelf
    • Path (fcadmin offline 0a)
    • Switch
    • ESH
    • Controller
  • Verify MPHA
storage show disk -p
  • Perform a cf takeover and giveback and ensure no issues
  • Test AutoSupport and ensure configured users are receiving email
  • Ensure all vif interfaces have a partner interface (cf-config-check.cgi will detect this)
  • ifconfig -a and ensure all necessary interfaces are up, speed/duplex settings are correct


ONTAP 8.0.1 RC1 Available

The NOW site shows ONTAP 8.0.1 RC1 available for download as of today: https://now.netapp.com/NOW/download/software/ontap/8.0.1RC1/

At the time of this post all the links for the release notes show a page not found; hopefully more information will be available shortly.


NetApp Flash Cache and 64 bit Aggregates

I was talking with a local NetApp SE the other day and the subject came up of the Flash Cache (formerly Performance Acceleration Module, or PAM) card and 64 bit aggregates in ONTAP 8.0.  The issue, he told me, is that currently the Flash Cache cards are not compatible with 64 bit aggregates in ONTAP 8; however, they do work with 32 bit aggregates in ONTAP 8 and of course in ONTAP 7G.

I wasn’t able to find anything on the NOW site to confirm or deny this.  He mentioned this only affects the newer PCIe based Flash Cache cards (256GB and 512GB); the older, original PAM card (16GB) does work with 64 bit aggregates.

The only thing I can find on the NOW site that talks about 64bit aggregates with PAM is from TR-3786:

10.6 PERFORMANCE ACCELERATION MODULE

A Performance Acceleration Module (PAM) can optimize the performance of your random read intensive workloads such as file services and messaging. PAM works with 64-bit aggregates as well as 32-bit aggregates and caches data that is coming from volumes located in both types of aggregates. PAM caches data based on data access regardless of the aggregate type.

The data cached in PAM while the system is in operation depends on the workload, and it can be a combination of data from volumes contained in different aggregates. There is no way to let PAM cache data only from a particular aggregate type. As noted in other sections of this document, 64-bit aggregates have a bigger address space and also take more memory for their metadata than 32-bit aggregates. This might reduce the total amount of effective data that can be cached in PAM when used with 64-bit aggregates present in the system.

However, since they refer to it as PAM and not Flash Cache, I’m not sure whether they are talking about the 16GB card that the NetApp SE said would still work, or whether this document is saying the newer card will work too and just hasn’t been updated to reflect the new product name.  Anyone out there able to shed some light on this?  I would love to know the details.


Use ONTAP 8.0 7-Mode Simulator on ESX

2011-12-01 Update: I have a new post on using the latest version of the NetApp ONTAP Simulator (8.1) here; follow those instructions to deploy the newest simulator in your environment.

If you want to use the ONTAP 8.0 7-Mode simulator in ESX, the process is actually pretty simple.  Personally I wish NetApp would just offer the simulator in OVF format, but anyway…

Create a new VM in the vSphere Client and select Custom, assign a name to the VM, a resource group and select the datastore to place the VM files on.  Then you will be at the Virtual Machine Version screen.

Select Virtual Machine Version 7 and click Next

Select Other and choose FreeBSD 64bit from the drop down and click Next

Select 2 Virtual CPUs and click Next

Assign 2 GB RAM and click Next

Choose 4 NICs.  In my case I placed the 1st and 3rd NIC on my VM network and the 2nd and 4th NIC on the Storage Network.  Click Next

Leave the default on SCSI controller and click Next

Select Do Not Create Disk and click Next

Click Finish

Now you need to copy the .vmdk files (including the cf card folder) from the ZIP into the directory where you created the FreeBSD VM.  After you do that, go into the VM properties and add a hard disk, choosing Use an Existing Virtual Disk.  Browse to the VM location.

The first disk you want to add is the larger of the two with ‘cf’ in the name.  After you add this disk, add the other disk as well (I used the default options for IDE controller etc).

That’s all there is to it.  From this point you can follow the manual to set up the simulator.

After you power it on press ctrl-c and then select option 4.


Verify NetApp clusters with cf-config-check

A handy tool for verifying the configuration of both NetApp controllers in a cluster is the HA Configuration Checker.  This tool will check licenses, network configuration and options on each system to make sure there isn’t a problem during a cluster takeover.  Running the script is pretty simple; there is a .cgi and a .exe version.

By default it uses RSH to communicate with the NetApp controllers.  Personally I like to use SSH, which just requires adding -s to the command.  Another thing to be aware of: if you don’t specify the user when using SSH, it will use the username you are currently signed in with on the computer it is being run from.  In my case that user doesn’t exist on the NetApp controllers, so I used the root user.

Pretty helpful, and from this I can see I need to add a FlexClone license and fix my e0 interface or things depending on it may not work in a takeover situation.

NetApp System Manager can also do some basic alerting on issues like these.  When I removed the FlexClone license from one of my filers, it showed an on-screen alert.

Bottom line: Make sure you don’t have mismatched configurations and when a cluster takeover happens you will be much happier.
