Monthly Archives: October 2009

Capacity vs Performance: Thin Provisioning-Reclaiming Free Space

A comment about HDS’s Zero Page Reclaim on one of my previous posts got me thinking about the effectiveness of thin provisioning in general.  In that previous post, I talked about the trade-offs between increased storage utilization through the use of thin-provisioning and the potential performance problems associated with it.

There are intrinsic benefits that come with the use of thin provisioning.  First, new storage can be provisioned for applications without nearly as much planning.  Next, application owners get what they want, while storage admins can show they are utilizing the storage systems effectively.  Also, rather than managing the growth of data in individual applications, storage admins are able to manage the growth of data across the enterprise as a whole.

Thin provisioning can also provide performance benefits.  For example, consider a set of virtual Windows servers running across several LUNs contained in the same RAID group.  Each Windows VM stores its OS files in the first few GB of its respective VMDK file, and each VMDK file is laid out sequentially within its LUN, with some free space at the end.  In essence, we have a whole bunch of OS sections separated by gaps of empty space.  If all of the VMs boot at approximately the same time, the disk heads have to move continuously across the entire disk, increasing disk latency.

Now take the same disks, configured as a thin pool, and create the same LUNs (as thin LUNs) and the same VMs.  Because thin provisioning generally allocates physical disk space only as the application writes data, starting from the beginning of the pool, all of those Windows VMs' OS files will be placed near the beginning of the disks.  This increased data locality reduces IO latency across all of the VMs.  The effect is probably minor, but lower disk latency translates to potentially higher IOPS from the same set of physical disks.  And the only change is the use of thin provisioning.
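
To make the locality argument concrete, here is a toy model of head travel during a simultaneous boot.  The block counts and layout are entirely made up for illustration; real disk geometry, caching, and array striping are far more complicated.

```python
# Toy model: total head travel when servicing one boot-time read from each
# of four VMs, for a "thick" layout (OS blocks spread across the RAID group)
# versus a "thin" layout (OS blocks packed at the start). All numbers are
# hypothetical and purely illustrative.

LUN_SIZE = 100_000   # blocks reserved per thick LUN
OS_REGION = 10_000   # blocks of OS files at the start of each VMDK

def head_travel(os_region_starts):
    """Total head movement for one round-robin pass over the OS regions."""
    travel, position = 0, 0
    for start in os_region_starts:
        travel += abs(start - position)
        position = start
    return travel

thick_layout = [vm * LUN_SIZE for vm in range(4)]    # OS regions far apart
thin_layout = [vm * OS_REGION for vm in range(4)]    # OS regions back-to-back

print("thick layout head travel:", head_travel(thick_layout))
print("thin  layout head travel:", head_travel(thin_layout))
```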

So back to HDS Zero Page Reclaim.  The biggest problem with thin provisioning is that it doesn't stay thin for long.  Windows NTFS, for example, is particularly NOT thin-friendly, since it favors previously untouched disk space for new writes rather than overwriting deleted files.  This behavior eventually causes a thin LUN to grow to its maximum size, even though the actual amount of data stored in the LUN may not change.  And Windows isn't the only one with this problem.  This means that thin provisioning may make provisioning easier, and possibly improve IO latency, but it might not actually save you any money on disk.

This is where HDS's Zero Page Reclaim can help.  Hitachi's Dynamic Provisioning (with ZPR) can scan a LUN for sections where all the bytes are zero and reclaim that space for other thin LUNs.  This is particularly useful for converting thick LUNs into thin LUNs.  But it can only see blocks of zeros, so it won't necessarily see space freed up by deleting files.  Hitachi's own documentation points out that many file systems are not thin-friendly, and ZPR won't help with long-term growth of thin LUNs caused by actively writing and then deleting data.
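
Conceptually, a zero-page reclaim pass is just a scan of the LUN at the pool's page granularity, looking for pages that contain nothing but zeros.  Here is a minimal sketch of that idea in Python; the page size and the image-file path are assumptions for illustration, and a real array does this internally at the pool-extent level rather than by reading a file.

```python
# Minimal sketch of a zero-page scan over a LUN image. The 42 MB page size
# is an assumption (check your array's actual reclaim granularity), and
# "/tmp/lun.img" is a hypothetical file standing in for a LUN.

PAGE_SIZE = 42 * 1024 * 1024

def zero_page_offsets(path):
    """Yield offsets of pages that contain only zero bytes (i.e., reclaimable)."""
    with open(path, "rb") as lun:
        offset = 0
        while True:
            page = lun.read(PAGE_SIZE)
            if not page:
                break
            if page.count(0) == len(page):   # every byte in the page is zero
                yield offset
            offset += PAGE_SIZE

if __name__ == "__main__":
    reclaimable = list(zero_page_offsets("/tmp/lun.img"))
    print(f"{len(reclaimable)} pages ({len(reclaimable) * PAGE_SIZE} bytes) reclaimable")
```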

Although there are ways to script the writing of zeros to free space on a server so that ZPR can reclaim that space, you would need to run such a script on all of your servers, which requires a unique tool for each operating system in your environment.  The script would also have to run periodically, since the file system will grow again afterward.
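
For what it's worth, the zero-fill trick itself is simple.  Below is a rough sketch for a Unix-like host; the mount point is hypothetical, and you would only want to run something like this during a quiet window, since it deliberately fills the file system before deleting the scratch file.

```python
# Rough sketch of zeroing out free space on a file system that lives on a
# thin LUN, so the array's zero-page reclaim can return the space to the
# pool. The mount point is hypothetical; run during a quiet window.

import os

CHUNK = 1024 * 1024  # write zeros 1 MB at a time

def zero_free_space(mount_point):
    scratch = os.path.join(mount_point, "zerofill.tmp")
    zeros = bytes(CHUNK)
    try:
        # buffering=0 so a full file system surfaces immediately as an error
        with open(scratch, "wb", buffering=0) as f:
            while True:
                f.write(zeros)
    except OSError:
        pass  # no space left: the free space is now zero-filled
    finally:
        if os.path.exists(scratch):
            os.remove(scratch)

# zero_free_space("/mnt/thin_lun")   # hypothetical mount point
```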

NetApp's SnapDrive tool for Windows can scan an NTFS file system, detect deleted files, and report the associated blocks back to the Filer so the space can be returned to the aggregate for use by other volumes/LUNs.  The Space Reclamation scan can be run as needed, and I believe it can be scheduled, but it appears to be Windows-only.  Again, this has to be done periodically.

But what if you could solve the problem across most or all of your systems, regardless of operating system, regardless of application, with real-time reclamation?  And what if you could simultaneously solve other problems?  Enter Symantec’s Storage Foundation with Thin-Reclamation API.  Storage Foundation consists of VxFS, VxVM, DMP, and some other tools that together provide dynamic grow/shrink, snapshots, replication, thin-friendly volume usage, and dynamic SAN multipathing across multiple operating systems.  Storage Foundation’s Thin-Reclamation API is to thin-provisioning what OST is to Backup Deduplication.  Storage vendors can now add near-real-time zero page reclaim for customers that are willing to deploy VxFS/VxVM on their servers.  For EMC customers, DMP can replace PowerPath, thereby offsetting the cost.

As far as I know, 3PAR is the first and only storage vendor to write to Symantec’s thin-API, which means they now have the most dynamic, non-disruptive, zero-page-reclaim feature set on the market.  As a storage engineer myself, I have often wondered if VxVM/VxFS could make management of application data storage in our diverse environment easier and more dynamic.  Adding Thin-Reclamation to the mix makes it even more attractive.  I’d like to see more storage vendors follow 3PAR’s lead and write to Symantec’s API.  I’d also like to see Symantec open up both OST and the Thin-Reclamation API for others to use, but I doubt that will happen.

NetApp and EMC: Startup and First Impressions

In the last post, I talked about a project I am involved in right now to deploy NetApp storage alongside EMC for SAN and NAS.  Today, I’m going to talk about my first impressions of the NetApp during deployment and initial configuration.

First Impressions

I'm going to be pretty blunt: I have been working with EMC hardware and software for a while now, and I'm generally happy with the usability of their GUIs.  Over that time, I've used several major revisions of Navisphere Manager and Celerra Manager, and even more minor revisions, and I've never actually found a UI bug.  To be clear, EMC, IBM, NetApp, HDS, and every other vendor have bugs in their software, and they all do what they can to find and fix them quickly, but I just haven't personally seen one in the EMC UIs despite using every feature offered by those systems.  (I have come across bugs in the firmware.)

Contrast that with the first day using the new NetApp, running the latest 7.3.1.1L1 code, where we discovered a UI problem in the first 10 minutes.  When attempting to add disks to an aggregate in FilerView, we could not select FC disk to add.  We could, however, add SATA disk to the FC aggregate.  The only way to get around the issue was to use the CLI via SSH.  As I mentioned in my previous post, our NetApp is actually an IBM nSeries, and IBM claims they perform additional QC before their customers get new NetApp code.

Shortly after that, we found a second UI issue in FilerView.  When creating a new initiator group, FilerView populates the initiator list with the WWNs that have logged in to the filer.  Auto-populating is nice, but the problem is that FilerView was incorrectly parsing the WWNs of the server HBAs and populating the list with NodeWWNs rather than PortWWNs.  We spent several hours trying to figure out why the ESX servers didn't see any LUNs before we realized that the WWNs in the initiator group were incorrect.  Editing the 2nd digit on each one fixed the problem.
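
For anyone hitting the same thing, the fix is mechanical.  Here is a hedged sketch of the conversion we did by hand; it assumes QLogic-style addressing, where the node WWN (20:00:...) and port WWN (21:00:...) differ only in the second hex digit.  Other HBA vendors use different conventions, so verify against the WWNs your fabric actually reports.

```python
# Hypothetical helper mirroring the manual fix described above: turn a
# NodeWWN into a PortWWN by changing the second hex digit from 0 to 1.
# This assumes QLogic-style WWNs; confirm the convention for your HBAs.

def node_to_port_wwn(node_wwn: str) -> str:
    digits = node_wwn.replace(":", "").lower()
    if len(digits) != 16:
        raise ValueError(f"not a 64-bit WWN: {node_wwn}")
    port = digits[0] + "1" + digits[2:]              # edit the 2nd digit
    return ":".join(port[i:i + 2] for i in range(0, 16, 2))

print(node_to_port_wwn("20:00:00:e0:8b:12:34:56"))   # -> 21:00:00:e0:8b:12:34:56
```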

I find it interesting that these issues, which seemed easy to discover, made it through the QC process of two organizations.  ONTap 7.3.2RC1 is available now, but I don’t know if these issues were addressed.

Manageability

As far as FilerView goes, it is generally easy to use once you know how NetApp systems are provisioned.  The biggest drawback in an HA-Filer setup is that you have to open FilerView separately for each Filer and configure each one as a separate storage system.  Two HA-Filer pairs?  Four FilerView windows.  If you include the initial launch page that comes up before you get to the actual FilerView window, you double the number of browser windows open to manage your systems.

NetApp likes to mention that they have unified management for NAS and SAN, where EMC has two separate platforms, each with its own management tools.  On the other hand, EMC treats the two storage processors (SPs) in a Clariion in a much more unified manner, and provisioning is done against the entire Clariion, not per SP.  Further, Navisphere can manage many Clariions in the same UI, and Celerra Manager acts similarly for EMC NAS.  Six of one, half a dozen of the other, some say, except that I generally provision NAS storage and SAN storage at different times, and I'd rather have all of the controllers/filers in the same window than NAS and SAN in the same window.  Just my preference.

I should mention that NetApp recently released System Manager 1.0 as a free download.  This new admin tool does present all of the controllers in one view and may end up being a much better tool than FilerView.  For now, it's missing too many features to be used 100% of the time, and it's Windows-only since it's based on MMC.  Which brings me to my other problem with managing the NetApp: neither FilerView nor System Manager can actually do everything you might need to do, and that means you end up in the CLI, FREQUENTLY.  I'm comfortable with CLIs, and they are extremely powerful for troubleshooting problems and especially for scripting batch changes, but I don't like being forced into the CLI for general administration.  GUI-based management helps prevent potentially crippling typos and can make visualizing your environment easier.  During deployment, we kept going back and forth between FilerView and the CLI to configure different things.  Further, since we were using MultiStore (vFilers) for CIFS shares and disaster recovery, we were stuck in the CLI almost entirely, because System Manager can't even see vFilers, and FilerView can only create them and attach volumes.

Had I not been managing Celerra and Clariion for so long, I probably wouldn't have noticed the above problems.  After several years of configuring CIFS, NFS, iSCSI, Virtual DataMovers, IP interfaces, snapshots, replication, DR failover, etc. on Celerra, as well as literally thousands of LUNs for hundreds of servers on Clariion, I don't recall EVER being forced to use the CLI.  CelerraCLI and NaviCLI are very powerful, I have written many scripts leveraging them, and I'll use the CLI when troubleshooting an issue.  But every single feature I've ever used on the Celerra or Clariion I was able to configure completely, start to finish, from the GUI.  Installing a Celerra from scratch even uses a GUI-based installation wizard.  Comparing Clariion Storage Groups with NetApp initiator groups and LUN maps isn't even fair.  For MS Exchange, I mapped about 50 LUNs to the ESX cluster, which took about 30 minutes in FilerView.  On the Clariion, the same operation is done by simply editing the Storage Group and checking each LUN, taking only a couple of minutes for the entire process.
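
If I had to do those 50 mappings on the filer again, I would script them rather than click through FilerView.  A rough sketch follows; the filer name, igroup, and volume layout are made up, the 7-mode `lun map` syntax is from memory, and it assumes SSH key authentication to the filer, so verify everything against your own ONTap release before using it.

```python
# Rough sketch: batch-map 50 Exchange LUNs to an ESX initiator group over
# SSH instead of clicking through FilerView. Hostname, igroup name, and
# volume/LUN paths are hypothetical; the "lun map" syntax is 7-mode and
# should be verified against your ONTap release.

import subprocess

FILER = "filer1"                  # hypothetical filer hostname
IGROUP = "esx_exchange_cluster"   # hypothetical initiator group

def map_lun(lun_path, lun_id):
    cmd = ["ssh", f"root@{FILER}", "lun", "map", lun_path, IGROUP, str(lun_id)]
    subprocess.run(cmd, check=True)

for lun_id in range(50):
    # hypothetical layout: ten LUNs per volume
    map_lun(f"/vol/exch_vol{lun_id // 10}/lun{lun_id}", lun_id)
```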

Now, all of the above commentary has to do with the management tools and UIs, and to some degree personal preference, and has no bearing on the equipment or underlying functionality.  There are, of course, optional management tools like Operations Manager, Provisioning Manager, and Protection Manager available from NetApp, just as there is ControlCenter from EMC (which, incidentally, can monitor the NetApp) or CommandCentral from Symantec.  Depending on your overall needs, you may want to look at the optional management tools, or FilerView may be perfectly fine.

In the next post, I'll get into more specifics about how the Exchange 2007 CCR cluster turned out in this new environment, along with some notes on making CCR truly redundant.  I've also been working on the NAS side of the project, so I'll post about that sometime soon.

NetApp and EMC: Real world comparisons

I've recently been tasked with a project to increase availability of applications through the use of multiple, disparate storage systems.  This environment has invested heavily in EMC Clariion and Celerra storage systems over the past few years and needed a non-EMC platform on which to build the second half of a redundant storage environment.  For various reasons I won't go into here, we chose IBM nSeries as that second platform.  (Since the IBM system is rebranded NetApp FAS, I will refer to it as a NetApp filer.)  I've been working on implementing the new equipment as well as integrating it into the Business Continuity strategy.

The overall strategy is to continue using the EMC Clariion/Celerra systems for production and disaster recovery replication, and to split applications across the two storage platforms for local redundancy.  The NetApp will also handle disaster recovery replication for some of the applications.  Here's a really simple diagram that might help if the description is confusing:

EMC and NetApp Redundancy

Now this may sound easy, but it is, in fact, NOT straightforward.  This strategy requires close coordination with application owners and careful planning.  As we move forward on this project, I'll talk about the various idiosyncrasies, caveats, and problems we've faced and how we got around them, and I'll also talk a lot about the differences between the Clariion/Celerra and NetApp platforms' features and functionality, application support, and manageability.  These comparisons will include using both systems over Fibre Channel connections as well as for CIFS/NFS NAS, all in conjunction with DR replication and failover.

To start off, I figure we should compare some of the terminology between EMC and NetApp systems.  Some terms don't directly translate, but I matched them up as closely as I could and noted where there is no equivalent.  Below are two tables: one for block storage and the other for NAS storage.

EMC-NetApp Block Storage Terminology

EMC-NetApp NAS Storage Terminology

In the next update, I'll start talking about the deployment itself.  The point of these articles is to discuss the differences, advantages, and disadvantages of each platform so that you can understand how each one might work in your environment.  I do not intend to disparage either platform or vendor.  I will try to be as vendor-agnostic as possible, and I feel I'm in a somewhat unique position, comparing new and recent hardware and firmware from both vendors, in the same production capacities, simultaneously, in the same environment.  I am NOT comparing old ONTap code to new FLARE/DART code or vice versa, nor am I comparing old Clariion CX hardware to new NetApp/IBM hardware, etc.

Stay tuned!

Expanding a Thin LUN on Clariion CX4

Do you have an EMC Clariion CX4 with Virtual Provisioning (thin provisioning)?  Have you tried to expand the host-visible (i.e., maximum) size of a thin LUN and can't figure out how?  Well, you aren't alone.  There is very little information available from EMC one way or the other, but I finally figured out that you actually can't expand a thin LUN yet.  This was a surprise to me, since I had just assumed the capability would be there: thin LUNs are essentially virtual LUNs, so they don't have any direct block mapping to a RAID group that has to be maintained, and even traditional LUNs can be expanded using MetaLUN striping or concatenation.  Nevertheless, the host-visible size of a thin LUN cannot be changed after the LUN has been created.

But there IS a workaround.  It's not perfect, but it's all we have right now.  Using the CLARiiON's built-in LUN Migration feature, you can expand a thin LUN in two steps.

Step 1: Migrate the thin-LUN to a thick-LUN of the same maximum size.

Step 2: Migrate the thick-LUN to a new thin-LUN created with the new, larger size you want.  After the migration, the thin LUN will consume disk space equal to the old thin LUN's maximum size, but it will have a new, larger host-visible maximum size.

Two Steps to expand a thin LUN

This requires that you have a RAID group outside of your thin pools with enough usable free space to hold the temporary thick LUN, so it's not a perfect solution.  You'd think that migrating the old thin LUN directly into a new, larger thin LUN would work, but to use the additional disk space after a LUN migration you have to edit the LUN size, which, again, can't be done on thin LUNs.  I haven't actually tested this; it's based on all of the documentation I could find from EMC on the topic.
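
For anyone who wants to script the two-step dance, here is a rough sketch using naviseccli.  The SP address and LUN numbers are made up, the migrate -start/-list syntax is from memory, and it assumes the migrated LUN keeps its original LUN number after each session completes (which is how Clariion LUN migration presents itself to the host), so double-check everything against your FLARE release and EMC's documentation before running anything like this.

```python
# Rough sketch of driving the two-step thin-LUN expansion with naviseccli.
# SP address and LUN numbers are hypothetical; the migrate -start / -list
# syntax is from memory and must be verified against your FLARE release.

import subprocess
import time

SP = "10.0.0.1"       # hypothetical SP management address
LUN_ID = 100          # the existing thin LUN (keeps this ID through migration)
TEMP_THICK = 101      # temporary thick LUN, same size as the thin LUN's max
NEW_THIN = 102        # new thin LUN created with the larger maximum size

def navi(*args, check=True):
    result = subprocess.run(["naviseccli", "-h", SP, *args],
                            capture_output=True, text=True, check=check)
    return result.stdout

def migrate(source, dest):
    # -o suppresses the confirmation prompt so the call doesn't hang
    navi("migrate", "-start", "-source", str(source), "-dest", str(dest),
         "-rate", "medium", "-o")
    # crude polling: wait until no active session mentions our source LUN
    while str(source) in navi("migrate", "-list", check=False):
        time.sleep(60)

migrate(LUN_ID, TEMP_THICK)   # step 1: thin -> same-size thick
migrate(LUN_ID, NEW_THIN)     # step 2: thick -> larger thin
```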

I’m looking into other methods that might be better, but so far it seems that certain restrictions on SnapView Clones and SANCopy might preclude those from being used.  The ability to expand thin-LUNs will come in a later FLARE release for those that are willing to wait.