Blog Archives

EMC CLARiiON and Celerra Updates – Defining Unified Storage

This past week, during EMC World 2010 in Boston, EMC made several announcements of updates to the Celerra and CLARiiON midrange platforms.  Some of the most impressive were new capabilities coming to CLARiiON FLARE in just a couple of short months.  Major updates to Celerra DART will coincide with the FLARE updates, so if you are already running CLARiiON CX4 hardware, or are evaluating CX4 (or Celerra), you will want to check these new features out.  They will be available to existing CX4 (120, 240, 480, 960) and NS (120, 480, 960) systems as part of a software update.

Here’s a list of key changes in FLARE 30:

  • Unified management for midrange storage platforms including CLARiiON and Celerra today, plus RecoverPoint, Replication Manager and more in the future.  This is a true single pane of glass for monitoring AND managing SAN, NAS, and data protection and it’s built into the platform.  “EMC Unisphere” replaces Navisphere Manager and Celerra Manager and supports multiple storage systems simultaneously in a single window. (Video Demo)
  • Extremely large cache (i.e., FASTCache) – Up to 2TB of additional read/write cache in CLARiiON using SSDs (Video Demo)
  • Block level Fully Automated Storage Tiering (i.e., sub-LUN FAST) – Fully automated assignment of data across multiple disk types
  • Block Level Compression – Compress LUNs in the CLARiiON to reduce disk space requirements
  • VAAI Support – Integrate with vSphere ESX for improved performance

These features are in addition to existing features like:

  • Seamless and non-disruptive mobility of LUNs within a storage array – (via Virtual LUNs)
  • Non-Disruptive Data Migration – (via PowerPath Migration Enabler)
  • VMWare Aware Storage Management – (Navisphere, Unisphere, and vSphere Plugins giving complete visibility and self-service provisioning for VMWare admins (Video Demo) AND Storage Admins)
  • CIFS and NFS Compression – Compress production data on Celerra to reduce disk space requirements including VMs
  • Dynamic SAN path load balancing – (via PowerPath)
  • At-Rest-Encryption – (via PowerPath w/RSA)
  • SSD, FC, and SATA drives in the same system – Balance performance and capacity as needed for your application
  • Local and Remote replication with array level consistency – (SnapView, MirrorView, etc)
  • Hot-swap, Hot-Add, Hot-Upgrade IO Modules – Upgrade connectivity for FC, FCoE, and iSCSI with no downtime
  • Scale to 1.8PB of storage in a single system
  • Simultaneously provide FC, iSCSI, MPFS, NFS, and CIFS access

All together, this is an impressive list of features for a single platform. In fact, while many of EMC’s competitors have similar features, none of them have all of them in the same platform, or leverage them all simultaneously to gain efficiency.  When CLARiiON CX4 and Celerra NS are integrated and managed as a single Unified storage system with EMC Unisphere there is tremendous value as I’ll point out below…

Improve Performance easily…

  • Install a couple of SSD drives into a CLARiiON and enable FASTCache to increase the array’s read/write cache from the industry competitive 4GB-32GB up to 2TB of array based non-volatile Read AND Write cache available to ALL applications including NAS data hosted by the array.
  • Install PowerPath on Windows, Linux, Solaris, AND VMWare ESX hosts to automatically balance IO across all available paths to storage.  PowerPath detects latency and queuing occurring on each path and adjusts automatically, improving performance at the storage array AND for your hosts.  This is a huge benefit in VMWare environments especially.
  • When VMWare releases the updated version of vSphere ESX that supports VAAI, ESX will be able to leverage VAAI support in the CLARiiON to reduce the amount of IO required to do many tasks, improving performance across the environment again.
  • Upgrade from 1GbE iSCSI to 10GbE iSCSI, or from 4Gb FiberChannel to 8Gb FiberChannel, without a screwdriver or downtime.
  • Provide NAS shared file access with block-level performance for any application using EMC’s MPFS protocol.

Improve Efficiency and cost easily…

  • Create a single pool of storage containing some SSD, some FC, and some SATA drives, that automatically monitors and moves portions of data to the appropriate disk type to both improve performance AND decrease cost simultaneously.
  • Non-disruptively compress volumes and/or files with a single click to save 50% of your disk space in many cases.
  • Convert traditional LUNs to more efficient Thin-LUNs non-disruptively using PowerPath Migration Enabler, saving more disk space.

Increase and Manage Capacity easily…

  • Add additional storage non-disruptively with SSD, FC, and SATA drives in any mix up to 1.8PB of raw storage in a single CLARiiON CX4.
  • Using FASTCache, iSCSI, FC, and FCoE connectivity simultaneously does not reduce total capacity of the system.
  • Expanding LUNs, RAID Groups, and Storage Pools is non-disruptive.
  • Migrating LUNs between RAID groups and/or Storage Pools is non-disruptive using built-in CLARiiON LUN Migration, as is migrating data to a different storage array (using PowerPath Migration Enabler)!
  • Balancing workload between storage processors is non-disruptive and at individual LUN granularity.

Protect your data easily…

  • Snapshot, Clone, and Replicate any of the data to anywhere with built-in array tools that can maintain complete data consistency across a single application or multiple applications without installing software.
  • Maintain application consistency for Exchange, SQL, Oracle, SAP, and much more, even within VMWare VMs, while replicating to anywhere with a single pane-of-glass.
  • Encrypt sensitive data seamlessly using PowerPath Encryption w/RSA.

Maintain Flexibility…

  • While you can do all of these things quickly and simply, you still have the flexibility to create traditional RAID sets using RAID 0, 1, 5, 6, and 10 where you need highly predictable performance, or tune read and write cache at the array and LUN level for specific workloads.  Do you want read/write snapshots? How about full copy clones on completely separate disks for workload isolation and failure protection? What about the ability to roll back data to different points in time using snapshots without deleting any other snapshots?  EMC Storage arrays have been able to do this for a long time and that hasn’t changed.

There are few manufacturers aside from EMC that can provide all of these capabilities, let alone provide them within a single platform.  That’s the definition of simple, efficient, Unified Storage in my opinion.

EMC VPLEX enables the private cloud.. But what is a “private cloud”?

Buzzword Much?

If you have seen any of EMC’s marketing for EMC World, or you are attending EMC World in Boston this week, you no doubt noticed a ton of talk about the “Private Cloud”.  There has been a lot more talk from vendors as of late about the “cloud” and “cloud computing” and you may be reminded about how every few years the word “cloud” is shouted out by vendors of all kinds and how inevitably the talk quiets and nothing is really different.  So is it different this time?  I think so.

What is a Cloud?

In the context of IT, there are examples of clouds already.  The Internet and public telephone system are two examples of clouds.  Facebook, Flickr, and Salesforce are examples of clouds as well.  The common theme is that each of these examples provides some sort of service to the end user without requiring the end user to purchase or build any infrastructure to support it.  You can plug a phone into a wall and immediately call nearly anyone in the world.  Cloud is a fancy word (or buzzword) for providing something “as-a-service”.  Salesforce.com is software-as-a-service (SaaS).

So what is the Private Cloud?  

In the context of enterprise datacenters, the focus of EMC’s vision, the Private Cloud is Infrastructure-as-a-service (IaaS) and it enables corporate IT to transition from a necessary expense, to a profit center within the business, providing IT-as-a-Service to the rest of the business.  It decouples infrastructure from applications providing unprecedented levels of scalability, availability, and flexibility at lower cost.

What if…
a.) your corporate applications could run from anywhere, and users had access from anywhere?
b.) you could relocate your applications from anywhere to anywhere else, at any time, without disruption to your users?
c.) you could replace any piece of physical hardware in your infrastructure without impacting your applications?

Sounds too good to be true, right? Maybe not…

This week, EMC announced a completely new product called VPLEX.  VPLEX has the ability to take your existing storage arrays and pool them into a cooperative pool of storage for hosts and applications.  It then allows you to move application data within and across those arrays as needed without disrupting the application or users.  If you are familiar with EMC’s Invista, IBM’s SVC, or Hitachi’s USP-V products you may be thinking that VPLEX is just another storage virtualization product.  But I assure you it’s different.  VPLEX virtualizes storage within the datacenter similar to how the above products can, but VPLEX can ALSO combine storage across multiple datacenters and allow an application to run from any of them or all of them, simultaneously, through the power of Federation.

Active/Active Datacenters

With VPLEX Federation, you can move a virtual machine and all of its data from datacenter A to datacenter B in a matter of minutes without user disruption; or hundreds of VMs, or thousands of VMs.  You can run the same application in both locations, sharing a single dataset.  Armed with EMC VPLEX and VMWare vSphere, you can upgrade, replace, and reconfigure any part of your infrastructure (storage, servers, network, power distribution, etc) without ever having to take your applications offline.  How’s that for availability?

The ability to create a virtual infrastructure from the storage layer through to the server layer and host any application on that infrastructure is the key to providing Infrastructure-as-a-Service, building the Private Cloud, and provisioning IT-as-a-Service within your organization.  Imagine running the IT department as a business within the business and actually showing financial value to the business.

There is a lot more to this concept but I wanted to at least bring some context around “cloud” as well as EMC’s new VPLEX product.  There will be more to come on this topic.

Chuck Hollis wrote about VPLEX as a new Storage Platform today, and VirtualGeek called it a Virtual Machine teleporter in his quite detailed write up of this new technology.  The key is to step back with an open mind and think about how application design and disaster recovery planning could be approached in entirely new ways when the data is no longer confined to a particular physical location.

Much ado about the future.. (of IT)

The 8:50am Alaska Air flight from Seattle to Boston today may as well have been an EMC chartered flight. Full of my current EMC peers, previous coworkers from my past 12 years in IT, as well as other EMC customers; all of us making the pilgrimage to Boston for EMC World 2010. The five and a half hour flight was both a networking opportunity and a reunion at the same time.

Despite the time away from home while my wife and I prepared for some big life changes, I’m excited to attend my 4th EMC World in 5 years, my first as an employee of EMC. This year promises to be extremely exciting as we make a number of huge announcements during the week, some of which have the potential to change the landscape of information storage and management. As an IT professional for over a decade, I’m a techie at heart and this is really exciting stuff. As a new EMC employee, the position and direction of the company validates many of the reasons I chose to take on this new career path, and with this company.

I plan to provide some commentary on the announcements we make during the week, particularly around virtual storage and the concept of “cloud computing”. I’m not a fan of industry buzzwords and “cloud” is one of the worst offenders, but I think it’s important for IT professionals to understand what the vendors really mean when they talk about cloud, and how it affects everyday life in IT.

If you are attending EMC World this year, I hope you feel the excitement, and I hope you start to see the bright future we are all headed for.

Do you have a recovery plan? You should!

In my new role at EMC, I am one of the first people to learn of major problems that my customers experience.  In general, customers seem to call their sales team before technical support when a big problem happens.  In the past week, I’ve been involved in recovery efforts with two different customers, both resulting from complete power outages in their production datacenters.

Both of these customers process millions of dollars through their global customer facing websites.  The smaller customer of the two does not have a disaster recovery site of any kind, while the other (larger) customer does have a recovery site, but it is not designed for 100% operation and is hundreds of miles away.

What became clear through both of these incidents is that having a very clear, very well known recovery plan is critical to the business.  Interestingly, these experiences drove home the point that even if you don’t have a recovery site, aren’t using replication, and otherwise don’t have any way to recover the data offsite, you still need a plan that encompasses what you CAN do.  More often than not, major outages are short lived and you will be recovering in your primary datacenter anyway, so you need to have a pre-determined plan to prevent major issues and shorten the time to recover.

Here are some things to think about when creating a recovery plan:

  • Get the application owners together and build a list of all the applications running in your environment.  Document the purpose of each application and map dependencies that each application has on other applications.
  • Next, involve the server/systems admins and document the server names, database names, IP addresses, and DNS names for each application on the list.
  • Finally, involve the infrastructure teams (storage, network, datacenter) and document the network dependencies (subnets, routers, VPN connections, load balancers, etc).  Document any SAN storage used by the servers/applications.  Also document how each infrastructure component affects others (e.g., the SAN switches must be operational before servers can connect to storage arrays).
  • Work with business leaders to prioritize the applications.  The idea is to understand how much impact each application has on the business both from a productivity perspective as well as direct financial impact.  There may be legal requirements or service level agreements with customers to consider as well.
  • If possible, identify the maximum amount of time each application can be down in the event of a catastrophic event (RTO – Recovery Time Objective) and how much data can be lost without significant impact to the business (RPO – Recovery Point Objective).  These metrics are usually measured in minutes, hours, and days.
  • Document the backup method for each server and application.  How often are backups run?  What is the retention period?  How long does it take to complete backups?   What is the expected time to restore the data?  How long does it take to recall tapes from offsite storage?
  • At this point you have a prioritized list of applications.  Now build a step-by-step recovery plan that lists the exact order in which you must recover systems.  The list should include server names as well as validation points to ensure certain systems are working before moving to the next step.  For example:
    • Step 1: bring up the network switches and routers
    • Step 2: bring up the DNS/DHCP servers
    • Step 3: bring up Active Directory servers
    • Step 4: bring up SAN fabric switches
    • Step 5: bring up SAN storage arrays, verify health of arrays with help from vendor
    • Step 6: …
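If you have captured the dependencies between components as described above, the recovery order can be derived from them rather than maintained by hand.  Below is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later).  The component names and the dependency map are hypothetical examples, not a recommendation for any particular environment, and a real plan would still need the validation points between steps.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the components that must be
# up before it can be started.
dependencies = {
    "network switches and routers": set(),
    "DNS/DHCP servers": {"network switches and routers"},
    "Active Directory servers": {"DNS/DHCP servers"},
    "SAN fabric switches": set(),
    "SAN storage arrays": {"SAN fabric switches"},
    "database servers": {"SAN storage arrays", "Active Directory servers"},
    "public website": {"database servers"},
}


def recovery_order(deps):
    """Return components ordered so that every dependency comes first.

    Raises graphlib.CycleError if the map contains a circular dependency,
    which is itself worth finding during planning.
    """
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step, component in enumerate(recovery_order(dependencies), start=1):
        print(f"Step {step}: bring up {component}")
```

A circular dependency (application A needs B, and B needs A) makes the sort fail, which is a useful signal that the documented dependencies need another look before an outage forces the question.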

I recommend that one of the first steps before starting recovery is to contact your key vendors (storage array vendors at least) to notify them of your outage so they can get support resources ready to troubleshoot any hardware issues you may run into during the recovery.

  • Identify key players needed in a recovery, at least primary and secondary contacts for every application and vendor contacts for hardware/software, facilities, UPS/Generator support teams, etc.
  • Establish a standard communication plan to include at least the following…
    • A method to notify employees of an outage and give instructions
    • A method to notify key players for recovery
    • A mechanism for key players to communicate with each other during the recovery
    • Personal (not corporate/business) contact information for all of the key players

The key thing to remember here is that you cannot rely on any communication tools that are part of your infrastructure.   You must assume your PBX/VOIP system will be down, Email will be down, corporate instant messenger will be down, Sharepoint will be unavailable, etc.

  • If you have a remote recovery site, with or without replication technology, and intend to use the remote site to recover production applications in the event of a large failure, be sure to document the triggers for moving to the recovery site.  As an example, you may want to attempt recovery in the primary site, and then move to the recovery site if recovery at the primary site will take too long.  Be sure to document that time limit and get executive buy-off.  You should not hear “how long do we wait until we move to the DR site?” during an active recovery operation.  That decision needs to be made during the planning exercise.
  • Document the entire plan and store the digital copies in a readily accessible place (file shares, Sharepoint site, etc).  Keep additional copies on USB sticks or CDs stored in a safe place.  Keep even MORE copies in another location outside the primary datacenter facility (e.g., safe deposit box, remote office safe, etc).  Print copies as well and store the printed copies in similar safe places.  Assume that a building may not be accessible due to fire or flood.  I know one customer who issues fingerprint secured USB sticks to every manager.  Each manager must sync their USB stick to a server at least monthly or upper management is notified.
  • Make sure that everyone is aware of the recovery plan, who has access to the plan, where the copies are stored, and what role each of the key players is expected to play during a recovery.

There is far more to think about but hopefully you can get a good start with what I’ve listed above.  If you have a recovery plan already, you should review it regularly and think about anything that needs to be added or modified in the plan.

If you are trying to get approval for a remote recovery site and replication technology and are having trouble getting executive approval, going through this exercise and defining application priority with RPO/RTO for each could give you the ammo you need.  Traditional backup architectures aren’t designed for RPOs under 24 hours, while storage array based replication can get RPOs down into the minutes.  Restoring from tape also takes far longer than restoring from replicated data.
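To make that argument concrete, here is a small sketch that lists which applications have an RPO that a given copy interval cannot meet.  It assumes the simplest possible model: if a copy is taken every N hours, a failure just before the next copy loses about N hours of data.  The application names, RPO values, and the 10-minute replication interval are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    rpo_hours: float  # maximum tolerable data loss agreed with the business


def unmet_by_interval(apps, copy_interval_hours):
    """Names of applications whose RPO is shorter than the copy interval."""
    return [app.name for app in apps if app.rpo_hours < copy_interval_hours]


if __name__ == "__main__":
    # Hypothetical applications and RPOs.
    apps = [
        Application("public website", 0.25),
        Application("email", 4),
        Application("file archive", 48),
    ]
    print("Not covered by nightly backup:", unmet_by_interval(apps, 24))
    print("Not covered by 10-minute replication:", unmet_by_interval(apps, 10 / 60))
```

The list that comes out of the nightly backup case is the list to put in front of the executives.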

Last but not least, keep the plan updated as your environment changes.  Add new application and server details to the plan as part of the implementation process for new applications, or as part of change control procedures for significant changes to the infrastructure.

Changes…

I’ve been absent from posting lately because there have been a lot of changes in my life.  Most notably I made a change in my career and have joined EMC Corporation as a Sr. Technical Consultant.

In this new role, I’ll be helping customers overcome challenges related to storing and managing information.

I’m not sure what my future topics will be other than to say they will still be storage related.  I expect that topics will come from the challenges my customers are facing and how they can be solved with today’s technology.

I promise to be as objective as possible despite my new employer being EMC; the corporate blogging policy is quite reasonable.

As before, the opinions expressed here are my own and not those of my employer or any other person or company.  All company and product names mentioned in this blog are copyrighted by their respective companies.

NetApp and EMC: Exchange 2007 Replication

Exchange Replication

Building on the redundant storage project, we also wanted to replicate Exchange to a remote datacenter for disaster recovery purposes.  We’ve been using EMC CLARiiON MirrorView/A and Replication Manager for various applications up to now and decided we’d use NetApp/SnapMirror for Exchange to leverage the additional hardware as well as a way to evaluate NetApp’s replication functionality vs EMC’s.

On EMC CLARiiON storage, there are a couple of choices for replicating applications like Exchange.
1.) Use MirrorView/Async with Consistency Groups to replicate Exchange databases in a crash-consistent state.
2.) Use EMC Replication Manager with SnapView snapshots and SANCopy/Incremental to update the remote site copy.

Similar to EMC’s Replication Manager, NetApp has SnapManager for various applications, which coordinates snapshots, and replica updates on a NetApp filer.

Whether using EMC RM or NetApp SM, software must be installed on all nodes in the Exchange cluster to quiesce the databases and initiate updates.  The advantage of Consistency Groups with MirrorView is that no software needs to be installed on the host; all work is performed within the storage array.  The advantage of RM and SM/E is that database consistency is verified on each update and the software can coordinate restoring data to the same or alternate servers, which must be done manually if using MirrorView.

NetApp doesn’t support consistent snapshots across multiple volumes so the only option on a Filer is to use SnapManager for Exchange to coordinate snapshots and SnapMirror updates.

Our first attempt configuring SnapManager for Exchange actually failed when we ran into a compatibility issue with SnapDrive.  SnapManager depends on SnapDrive for mapping LUNs between the host and filer, and to communicate with the filer to create snapshots, etc.  We’d discussed our environment with NetApp and IBM ahead of time, specifically that we have Exchange CCR running on VMWare, with FiberChannel LUNs and everyone agreed that SnapDrive supports VMWare, Exchange, Microsoft Clustering, and VMWare Raw Devices.  It turns out that SnapDrive 6 DOES support all of this, but not all at the same time.  Specifically, MSCS clustering is not supported with FC Raw Devices on VMWare.  In comparison, EMC’s Replication Manager has supported this configuration for quite a while.  After further discussion NetApp confirmed that our environment was not supported in the current version of SnapDrive (6.0.2) and that SnapDrive 6.2, which was still in Beta, would resolve the issue.

Fast forward a couple months, SnapDrive 6.2 has been released and it does indeed support our environment so we’ve finally installed and configured SnapDrive and SnapManager.  We’ve dedicated the EMC side of the Exchange environment for the active nodes and the IBM for the passive nodes.  SnapManager snapshots the passive node databases, mounts them to run database verification, then updates the remote mirror using SnapMirror.

While SnapManager does do exactly what we need it to do, my experience with it hasn’t been great so far…  First, SnapManager relies on Windows Task Scheduler to run scheduled jobs, which has been causing issues.  The job will run on its schedule for a day and then stop, after which the task must be edited to make it run again.  This happens in the lab and on both of our production Exchange clusters.  I also found a blog post about this same issue from someone else.

The other issue right now is that database verification takes a long time, due to the slow speed of ESEUTIL itself.  A single update on one node takes about 4 hours (for about 1TB of Exchange data) so we haven’t been able to achieve our goal of a 2-hour replication RPO.  IBM will be onsite next week to review our status and discuss any options.  An update on this will follow once we find a solution to both issues.  In the meantime I will post a comparison of replication tools between EMC and NetApp soon.
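As a rough back-of-the-envelope check on those numbers, the sketch below works out the sustained rate a full pass over the data implies.  It assumes the whole update is dominated by verification, that "about 1TB" means 1024 × 1024 MB, and that the rate is constant; the real job has other phases, so treat the results as order-of-magnitude only.

```python
def required_rate_mb_per_s(data_tb, window_hours):
    """Sustained MB/s needed to process data_tb terabytes within window_hours."""
    data_mb = data_tb * 1024 * 1024
    return data_mb / (window_hours * 3600)


if __name__ == "__main__":
    observed = required_rate_mb_per_s(1, 4)  # what a 4-hour update implies
    target = required_rate_mb_per_s(1, 2)    # what a 2-hour RPO would need
    print(f"Observed: about {observed:.0f} MB/s")
    print(f"Needed for a 2-hour cycle: about {target:.0f} MB/s")
```

Under these assumptions the verification pass would have to run roughly twice as fast, or cover half as much data per cycle, before a 2-hour RPO is even theoretically reachable.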

NetApp and EMC: ESX and Exchange 2007 CCR

The first application we tackled after deploying the NetApp system was Exchange 2007.  We had deployed Exchange 2007 recently, running in CCR clusters on VMWare ESX.  Since each node of a CCR cluster has its own copy of the database, we wanted to put one node from each cluster onto the NetApp, leaving the other nodes on the CLARiiON.  This environment is entirely FiberChannel, with no iSCSI deployed, and as such the Exchange servers are using VMWare Raw Devices for the database and log disks.  This poses a problem that we didn’t discover until later, which I will discuss in a future post about replicating Exchange with NetApp.

Re-Architecting the environment to fit the storage

The first thing we discovered was that neither IBM/NetApp nor EMC would support the same host HBAs zoned to multiple brands of storage.  So we had to split the ESX cluster into two clusters, one on each storage platform.  Luckily the Exchange environment was isolated on its own six-node cluster so it was easy to split everything in half.

Next we learned that due to NetApp’s updated active/active mode with proxy paths in ONTap 7.3, VMWare ESX 3.x randomly selects paths when rescanning HBAs and will pick non-optimized paths to the LUNs.  This still works but is not ideal as it increases IO latency, causing the Filer to send autosupport emails periodically warning of the problem.  Installing the NetApp Host Utilities for ESX onto the ESX hosts themselves allows you to run a script that assigns persistent paths evenly across the HBAs.  The script works as advertised but as far as I can tell you have to run the script each time you add a new LUN to the ESX server.  It would be much better if it were more automated.

Actually, if you are running ESX4.0 the scenario changes since NetApp ONTap 7.3+, Clariion FLARE 26+, and ESX4 all support ALUA making this problem all but disappear and improving fabric resiliency. Unfortunately for us, ESX4 is still a bit new and hasn’t been rolled out into production yet.  NetApp also released tools for vCenter 4.0 that allow you to do the path assignment and other tasks from within vCenter rather than at the command line.  EMC also now has PowerPath available for ESX4.0 which will not only manage paths but load balance across all paths for increased performance and lower latency.

VirtualStorageGuy has blogged already about the NetApp/EMC/vSphere plug-ins and there is even a Powerpoint available.

Finally, during the sales process NetApp pushed their de-duplication features (A-SIS) quite a bit and stressed how much disk space we could save in a VMWare environment.  During deployment we were informed that if your VMs (VMDKs and VMFS) were not properly partition aligned de-duplication wouldn’t work well or at all.  Since this environment has several hundred VMs built over several years by many people, and aligning the system (C:) drive of a Windows VM is difficult, the benefit would be minimal for us.  Luckily NetApp has provided tools that can scan and align VMDKs without having to repartition the disks.  We have not tested this yet.  Partition Alignment is a best practice for ANY SAN storage system so we can’t fault NetApp for this problem; it’s just a fact of life.

But is it REALLY Redundant?

Even with two storage systems, with independent VMWare clusters, each hosting half of the Exchange cluster environment, a problem with either array could still take down an entire Exchange cluster.  This is due to the File Share Witness (FSW) component used in a Majority Node Set (MNS) cluster like Exchange CCR.  The idea behind the FSW in an MNS cluster is to prevent a condition known as Split Brain.  Since an MNS cluster does not have a quorum disk, it relies entirely on network communication between the nodes to determine cluster status and make decisions about which nodes should become active.  In the event that the two nodes lose communication with each other, each node will check for the FSW and if it is still available, it assumes that the other cluster node is down and proceeds to bring cluster resources online (if they weren’t already).  Without the FSW, both nodes would potentially go active and there could be issues with inconsistent data, etc.  This is the split-brain condition.

Typically, each cluster has a single FSW on a separate server (the CAS servers in our case).  With the redundant storage model we moved to, the FSW became a single point of failure.  If we put the FSW on EMC storage with NodeA, and NodeB on the IBM/NetApp storage, a problem with the EMC array could take down both the cluster node AND the FSW at the same time.  The surviving cluster node on the IBM/NetApp array would go down or stay down to prevent split-brain since the FSW was not available.  Moving the FSW to the IBM/NetApp array presents the same problem on the opposite side of the cluster.  Incidentally, we proved this problem in lab testing to be sure.  The solution is to move the FSW off of BOTH arrays, to either a dedicated physical server with internal disk, or a third storage array if you have one.  There was a second EMC array in production so we moved the FSW there.  In the new configuration, a complete outage of any single storage array would not take down the Exchange environment.
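The failure cases above can be checked with a very small model.  The sketch below treats the two nodes and the FSW as three votes and says the cluster stays up only if at least one node survives and the survivors hold a majority (two of three).  This is a simplification I am assuming for illustration: it ignores network partitions and assumes everything that survives an array outage can still talk to everything else.  The array names are placeholders.

```python
NODES = {"NodeA", "NodeB"}


def cluster_survives(placement, failed_array):
    """Return True if the two-node MNS cluster stays up after an array outage.

    placement maps "NodeA", "NodeB" and "FSW" to the array hosting each.
    Each of the three components counts as one vote; two votes are a majority.
    """
    alive = {name for name, array in placement.items() if array != failed_array}
    has_node = bool(alive & NODES)
    return has_node and len(alive) >= 2


if __name__ == "__main__":
    # Hypothetical placements matching the two designs discussed above.
    two_arrays = {"NodeA": "EMC-1", "NodeB": "NetApp", "FSW": "EMC-1"}
    three_arrays = {"NodeA": "EMC-1", "NodeB": "NetApp", "FSW": "EMC-2"}
    for label, placement in (("FSW with NodeA", two_arrays), ("FSW on third array", three_arrays)):
        for array in sorted(set(placement.values())):
            state = "up" if cluster_survives(placement, array) else "DOWN"
            print(f"{label}: lose {array} -> cluster {state}")
```

With the FSW sharing an array with either node, one of the single-array failures always takes the cluster down; with the FSW on a third array, none of them do.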

Crude diagram of the storage redundant Exchange CCR cluster

So far this new 3-way split environment is working fine, and performance on the EMC and NetApp arrays is fine for Exchange.  Using the same number of disks on the NetApp array yields about twice as much usable space as the EMC due to RAID-DP vs RAID-10 but overall performance is similar.  Theoretically that means we could allow for more growth of the Exchange databases but in reality that is not always the case.  My next update will be about Exchange replication using SnapManager and SnapMirror and how that has effectively negated the remaining free space in the NetApp aggregate.
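The "about twice" figure follows from how many disks each RAID type gives up to protection.  Here is a rough sketch of that arithmetic, counting data disks only.  The disk count and RAID group size are assumptions chosen for illustration, and the model ignores hot spares, formatting overhead, and snapshot or file system reserves, all of which reduce the real numbers.

```python
import math


def raid10_data_disks(total_disks):
    """RAID-10 mirrors every disk, so half the disks hold unique data."""
    return total_disks // 2


def raid_dp_data_disks(total_disks, raid_group_size):
    """RAID-DP (dual parity) gives up two parity disks per RAID group."""
    groups = math.ceil(total_disks / raid_group_size)
    return total_disks - 2 * groups


if __name__ == "__main__":
    disks, group_size = 32, 16  # hypothetical configuration
    r10 = raid10_data_disks(disks)
    rdp = raid_dp_data_disks(disks, group_size)
    print(f"RAID-10: {r10} data disks, RAID-DP: {rdp} data disks, ratio {rdp / r10:.2f}x")
```

The ratio approaches 2x as the RAID groups get larger, which is consistent with the "about twice" observation.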

THE ENGINE

So Far…

Category: MGB Project

THE CAR

A Few Years Ago…

Category: MGB Project

Capacity vs Performance: Thin Provisioning-Reclaiming Free Space

A comment about HDS’s Zero Page Reclaim on one of my previous posts got me thinking about the effectiveness of thin provisioning in general.  In that previous post, I talked about the trade-offs between increased storage utilization through the use of thin-provisioning and the potential performance problems associated with it.

There are intrinsic benefits that come with the use of thin provisioning.  First, new storage can be provisioned for applications without nearly as much planning.  Next, application owners get what they want, while storage admins can show they are utilizing the storage systems effectively.  Also, rather than managing the growth of data in individual applications, storage admins are able to manage the growth of data across the enterprise as a whole.

Thin provisioning can also provide performance benefits…  For example, consider a set of virtual Windows servers running across several LUNs contained in the same RAID group.  Each Windows VM stores its OS files in the first few GB of its VMDK file.  Each VMDK file is stored in order in each LUN, with some free space at the end.  In essence, we have a whole bunch of OS sections separated by gaps of no data.  If all VMs were booting at approximately the same time, the disk heads would have to move continuously across the entire disk, increasing disk latency.

Now take the same disks, configured as a thin pool, and create the same LUNs (as thin LUNs) and the same VMs.  Because thin-provisioning in general only writes data to the physical disks as it’s being written by the application, starting from the beginning of the disk, all of those Windows VMs’ OS files will be placed at the beginning of the disks.  This increased data locality will reduce IO latency across all of the VMs.  The effect is probably minor, but reduced disk latency translates to possibly higher IOPS from the same set of physical disks.  And the only change is the use of thin-provisioning.
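To put rough numbers on the locality argument, here is a minimal sketch of the two layouts.  It assumes one VM per LUN, LUNs laid out back to back on the RAID group, and a thin pool that allocates strictly in write order from the start of the disks; real arrays stripe data and allocate in chunks, so treat the figures as illustrative only.  The function names and sizes are made up for this example.

```python
def thick_layout(num_vms, lun_size_gb, os_size_gb):
    """(start, end) offsets in GB of each VM's OS region when every LUN
    is fully preallocated and the LUNs sit back to back."""
    return [(i * lun_size_gb, i * lun_size_gb + os_size_gb)
            for i in range(num_vms)]


def thin_layout(num_vms, os_size_gb):
    """(start, end) offsets in GB when space is allocated only on write,
    so the OS regions are packed together from the start of the pool."""
    return [(i * os_size_gb, (i + 1) * os_size_gb)
            for i in range(num_vms)]


def span_gb(layout):
    """Distance the heads must cover to touch every OS region."""
    return max(end for _, end in layout) - min(start for start, _ in layout)


# Hypothetical numbers: 10 VMs, 100 GB LUNs, 8 GB of OS files each.
thick = thick_layout(num_vms=10, lun_size_gb=100, os_size_gb=8)
thin = thin_layout(num_vms=10, os_size_gb=8)

print("thick span:", span_gb(thick), "GB")  # 908 GB
print("thin span:", span_gb(thin), "GB")    # 80 GB
```

Under these assumptions the boot-time working set shrinks from a 908 GB band to an 80 GB band, which is the data locality effect described above.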

So back to HDS Zero Page Reclaim.  The biggest problem with thin provisioning is that it doesn’t stay thin for long.  Windows NTFS, for example, is particularly NOT thin-friendly since it favors previously untouched disk space for new writes rather than overwriting deleted files.  This activity eventually causes a thin LUN to grow to its maximum size over time, even though the actual amount of data stored in the LUN may not change.  And Windows isn’t the only one with the problem.  This means that thin provisioning may make provisioning easier, or possibly improve IO latency, but it might not actually save you any money on disk.  This is where HDS’s Zero Page Reclaim can help.  Hitachi’s Dynamic Provisioning (with ZPR) can scan a LUN for sections where all the bytes are zero and reclaim that space for other thin LUNs.  This is particularly useful for converting thick LUNs into thin LUNs.  But it can only see blocks of zeros, so it won’t necessarily see space freed up by deleting files.  Hitachi’s own documentation points out that many file systems are not thin-friendly, and ZPR won’t help with long-term growth of thin LUNs caused by actively writing and then deleting data.
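The limitation is easier to see in a toy model.  The sketch below is not Hitachi's implementation; it just scans a byte string standing in for a LUN and reports which fixed-size pages are entirely zero.  The 4 KiB page size and the function name are assumptions chosen for illustration.

```python
def find_zero_pages(data: bytes, page_size: int):
    """Return the indices of full pages in `data` whose bytes are all zero."""
    zero_page = bytes(page_size)  # a page of page_size zero bytes
    reclaimable = []
    for offset in range(0, len(data), page_size):
        page = data[offset:offset + page_size]
        # Only complete, all-zero pages are candidates for reclaim.
        if len(page) == page_size and page == zero_page:
            reclaimable.append(offset // page_size)
    return reclaimable


PAGE = 4096  # illustrative page size only

lun = (
    b"\x01" * PAGE    # page 0: live data
    + bytes(PAGE)     # page 1: never written, all zeros
    + b"\xab" * PAGE  # page 2: blocks of a deleted file, stale bytes remain
    + bytes(PAGE)     # page 3: never written, all zeros
)

print(find_zero_pages(lun, PAGE))  # [1, 3]
```

Page 2 belongs to a file the file system considers deleted, but because its old bytes are still on disk, a zero scan has no way to tell it apart from live data.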

Although there are ways to script the writing of zeros to free space on a server so that ZPR can reclaim that space, you would need to run that script on all of your servers, requiring a unique tool for each operating system in your environment.  The script would also have to run periodically, since the file system will grow again afterward.
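As a rough idea of what such a script might look like, here is a minimal Python sketch that writes zeros into a temporary file until only a reserve of free space remains, then deletes the file.  The function name, the `zerofill.tmp` file name, and the `reserve_bytes` and `max_bytes` parameters are all hypothetical.  It also assumes the file system really writes the zeros to disk (no compression or sparse-file handling) and that briefly filling the volume is acceptable, which it often is not on a production server.

```python
import os
import shutil


def zero_fill_free_space(directory, reserve_bytes,
                         chunk_size=1024 * 1024, max_bytes=None):
    """Write zeros to a temp file in `directory` until only `reserve_bytes`
    remain free (or `max_bytes` have been written), then delete the file.
    Returns the number of bytes written."""
    path = os.path.join(directory, "zerofill.tmp")  # hypothetical file name
    chunk = bytes(chunk_size)
    written = 0
    try:
        with open(path, "wb") as f:
            while True:
                free = shutil.disk_usage(directory).free
                budget = free - reserve_bytes
                if max_bytes is not None:
                    budget = min(budget, max_bytes - written)
                if budget <= 0:
                    break
                n = min(chunk_size, budget)
                f.write(chunk[:n])
                # Push the zeros to disk so the free-space figure stays current.
                f.flush()
                os.fsync(f.fileno())
                written += n
    finally:
        # Always remove the fill file so the space returns to the file system.
        if os.path.exists(path):
            os.remove(path)
    return written


# Example call (not executed here): leave 1 GiB free on the volume holding /data.
# zero_fill_free_space("/data", reserve_bytes=1024 ** 3)
```

Even with something like this in place, it would still need an equivalent on every operating system in the environment and a schedule to rerun it.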

NetApp’s SnapDrive tool for Windows can scan an NTFS file system, detect deleted files, then report the associated blocks back to the Filer to be added back to the aggregate for use by other volumes/LUNs.  The Space Reclamation scan can be run as needed, and I believe it can be scheduled, but it appears to be Windows only.  Like the zeroing script, it has to be run periodically.

But what if you could solve the problem across most or all of your systems, regardless of operating system, regardless of application, with real-time reclamation?  And what if you could simultaneously solve other problems?  Enter Symantec’s Storage Foundation with Thin-Reclamation API.  Storage Foundation consists of VxFS, VxVM, DMP, and some other tools that together provide dynamic grow/shrink, snapshots, replication, thin-friendly volume usage, and dynamic SAN multipathing across multiple operating systems.  Storage Foundation’s Thin-Reclamation API is to thin-provisioning what OST is to Backup Deduplication.  Storage vendors can now add near-real-time zero page reclaim for customers that are willing to deploy VxFS/VxVM on their servers.  For EMC customers, DMP can replace PowerPath, thereby offsetting the cost.

As far as I know, 3PAR is the first and only storage vendor to write to Symantec’s thin-API, which means they now have the most dynamic, non-disruptive, zero-page-reclaim feature set on the market.  As a storage engineer myself, I have often wondered if VxVM/VxFS could make management of application data storage in our diverse environment easier and more dynamic.  Adding Thin-Reclamation to the mix makes it even more attractive.  I’d like to see more storage vendors follow 3PAR’s lead and write to Symantec’s API.  I’d also like to see Symantec open up both OST and the Thin-Reclamation API for others to use, but I doubt that will happen.