Monthly Archives: January 2011

Saving $90,000 per minute with VMAX Federated Live Migration

Posted on by

One of the customers I work with regularly has a set of key MS SQL databases totaling over 100TB in size which process transactions for their customer facing systems. If these databases are down, no transactions can be processed and customers looking to spend money will go elsewhere. The booking rate related to these databases is ~$90,000 per minute, all day, every day.

Today these databases live on Symmetrix DMX storage which has been very reliable and performant. As happens in an IT world, some of these Symmetrix systems are getting long in the tooth and we are working on refreshing several older DMX systems into fewer, newer VMAX systems. Aside from the efficiency and performance benefits of sub-LUN tiering (EMC FASTVP), and the higher scalability of VMAX vs DMX as well as pretty much any other storage array on the market, EMC has added another new feature to VMAX that is particularly advantageous to this particular customer — Federated Live Migration

Tech Refresh and Data Migrations:

Tech Refresh is a fact of life and data migrations, as a result are common place. The challenge with a data migration is finding a workable process and then shoehorning that process into existing SLAs and maintenance windows. In general, data migrations are disruptive in some way or another, and the level of disruption depends on the technology used and the type of migration. EMC has a long list of migration tools that cover our midrange storage systems, NAS systems, as well as high-end Symmetrix arrays. The downside with these and other vendors’ migration tools is that there is some level of downtime required.

There are 3 basic approaches:

  1. Highest Amount of Downtime:
    • Take system down, copy data, reconfigure system, bring system online
  2. Less Downtime:
    • Copy Data, Take system down, copy recent changes, reconfigure system, bring system online (EMC Open Replicator, EMCopy, RoboCopy, Rsync, EMC SANCopy, etc)
  3. Even Less Downtime:
    • Set up new storage to proxy old storage, reconfigure system, serve data while copying in the background. (EMC Celerra CDMS, EMC Symmetrix Open Replicator)
    • (Traditional virtualization systems like Hitachi USP/VSP, IBM SVC, etc are similar to this as well since you must take some level of downtime to get hosts configured through the virtualization layer, after which you can non-disruptively move data).

Common theme in all 3? Downtime!

Non-Disruptive Migrations are becoming more realistic:

A while ago now, EMC added a feature to PowerPath called Migration Enabler which is a way to non-disruptively migrate data between LUNs and/or Storage arrays from the host perspective. PowerPath Migration Enabler could also leverage storage based copy mechanisms and help with cut over. This is very helpful and I have multiple customers who have successfully migrated data with PPME. The challenge has been getting customers to upgrade to newer versions of PowerPath that include the PPME features they need for their specific environment. Software upgrades usually require some level of downtime, at least on a per-host basis, and we run into maintenance windows and SLAs again.

EMC Symmetrix Federated Live Migration (FLM)

For Symmetrix VMAX customers, EMC has added a new capability that could make data migrations easy as pie for many customers. With FLM, a VMAX can migrate data into itself from another storage array, and perform the host cutover, automatically, and non-disruptively. The VMAX does not have to be inserted in front of the other storage array, and there is no downtime required to reconfigure the host before or after the migration.

Federation is the key to non-disruptive migrations:

The way FLM works is really pretty simple. First, the host is connected to the new VMAX storage without removing the connectivity to the old storage. The VMAX is also connected to the old storage array directly (through the fabric in reality). The VMAX system then sets up a replication session for the devices owned by the host and begins copying data. This is all pretty straightforward. The smart stuff happens during the cut over process.

As you probably know, the VMAX is an Active/Active storage system, so all paths from a host to a LUN are active. Clariion, similar to most other midrange systems is an Active/Passive storage system. On Clariion, paths to the storage processor that owns the LUN are active, while paths to the non-owner are passive until needed. PowerPath and many other path management tools support both active/active and active/passive storage systems.

Symmetrix FLM leverages this active/passive support for the cut over. Essentially the old storage system is the “owning SP” before and during the migration. At cut over time, the VMAX essentially becomes the owning SP and it’s paths go active while the old paths go passive. PowerPath follows the “trespass” and the host keeps on chugging. To make this work, VMAX actually spoofs the LUN WWN and host ID, etc from the old array, so the host thinks it’s still talking to the old array. Filesystems and LVMs that signature and track LUNs based on WWN and/or Host ID are unaffected as a result. The cut over process is nothing more than a path change at the HBA level. The data copy and cut over are managed directly from the VMAX by an administrator.

Of course there are limitations… FLM initially supports only EMC Symmetrix DMX arrays as the source and requires PowerPath. From what I gather, this is not a technical limitation however, it’s just what EMC has tested and supports. Other EMC arrays will be supported later and I have no doubt it will support non-EMC arrays as a source as well.

Solving the $5 million problem:

Symmetrix Federated Live Migration became a signature feature in our VMAX discussions with the aforementioned customer because they know that for every minute their database is down, they lose $90,000 in revenue. Even with the fastest of the 3 traditional methods of migration, they would be down for up to an hour while 12 cluster nodes are rezoned to new storage, LUNs masked, drive letters assigned, etc. A reboot alone takes up to 30 minutes. 1 hour of downtime equates to $5,400,000 of lost revenue. By the way, that’s quite a bit more than the cost of the VMAX, and FLM is included at no additional license cost, so FLM just paid for the VMAX, and then some.

EMC’s New VNX Unified Storage Systems

Posted on by

Today, EMC announced the new VNX and VNXe Unified Storage platforms that merge the functionality of, and replaces, EMC’s popular Clariion and Celerra products.   VNX is faster, more scalable, more efficient, more flexible, and easier to manage than the platforms it replaces.

Key differences between CX4/NS and VNX:

  • VNX replaces the 4gb FC-Arbitrated Loop backend busses with 6gb SAS point-to-point switched backend.
    • Fast and Reliable
  • VNX supports both 3.5” and 2.5” SAS drives in EFD (SSD), SAS, and NearLine-SAS varieties.
    • Flexible and Efficient
  • VNX has more cache, more front-end ports, and faster CPUs
    • Fast and Flexible
  • VNX systems can manage larger FASTCache configurations.
    • Fast and Efficient
  • VNX builds on the management simplicity enhancements started in EMC Unisphere on CX4/NS by adding application aware provisioning.
    • Simple and Efficient
  • VNX allows you to start with Block-only or NAS-only and upgrade to Unified later if desired, or start with Unified at deployment.
    • Cost Effective and Flexible
  • VNX will support advanced data services like deduplication in addition to FASTVP, FASTCache, Block QoS, Compression, and other features already available in Clariion and Celerra.
    • Flexible and Efficient

Just as with every manufacturer, newer products take advantage of the latest technologies (faster Intel processors and SAS connectivity in this case,) but that’s only part of the story with VNX.

Earlier, I mentioned Application Aware Provisioning has been added to Unisphere:

Prior to Application Aware Unisphere, if tasked with provisioning storage for Microsoft Exchange (for example), a storage admin would take the mailbox count and size requirements, use best practices and formulas from Microsoft for calculating required IOPS, and then map that data to the storage vendors’ best practices to determine the best disk layout (RAID Type, Size, Speed, quantity, etc).  After all that was done, then the actual provisioning of RAID Groups and/or LUNs would be done.

Now with Application Aware Unisphere, the storage admin simply enters the mailbox count and size requirements into Unisphere and the rest is done automatically.  EMC has embedded the best practices from Microsoft, VMWare, and EMC into Unisphere and created simple wizards for provisioning Hyper-V, VMWare, NAS, and Microsoft Exchange storage using those best practices.

Combine Unisphere’s Application Aware Provisioning with the already included vCenter integration, and support for VMWare VAAI and you have a broad set of integration from the application layer down through to the storage system for optimum performance, simple and efficient provisioning, and unparalleled visibility.  This is especially useful for small to medium sized businesses with small IT departments.

EMC has also simplified licensing of advanced features on VNX.  Rather than licensing individual software products based on the exact features you want, VNX has 5 simple Feature Packs plus a few bundle packs.  The packs are created based on the overall purpose rather than the feature.  ie: Local Protection vs. Snapshots or Clones

  • FAST Suite includes FASTVP, FASTCache, Block QoS, and Unisphere Analyzer
  • Security and Compliance Pack includes File Level Retention for File and Block Encryption
  • Local Protection Pack includes Snapshots for block and file, full copy clones, and RecoverPoint/CDP
  • Remote Protection Pack includes Synchronous and Asynchronous replication for block and file as well as RecoverPoint/CRR for near-CDP remote replication of block and.or file data.
  • Application Protection Pack extends the application integration by adding Replication Manager for application integrated replication and Data Protection Advisor for SLA based replication monitoring and reporting.

You can also get the Total Protection Pack which includes Local Protection, Remote Protection, and Application Protection packs at a discounted cost or the Total Efficiency Pack which includes all five.  That’s it, there are no other software options for VNX/VNXe.  Compression and Deduplication are included in the base unit as well as SANCopy.  You will also find that the cost of these packs is extremely compelling once you talk with your EMC rep or favorite VAR.

So there you have it — powerful, simple and efficient storage, unified management, extensive data protection features, simplified licensing, and class leading functionality (FASTVP, FASTCache, Integrated CDP, Quality of Service for Block, etc) in a single platform.  That’s Unified, That’s EMC VNX.

I didn’t have time to touch on VNXe here but there is even more cool stuff going on there.  You can read more about these products here..

StorageSavvy Blog 2010 in review

Posted on by

WordPress sent me an email with overall stats for 2010 and I thought I’d share a few things I noticed.

First, thank you to all of my readers as well as those who have linked to and otherwise shared my posts with others.  I know that many of my peer bloggers have much higher numbers than I, but I still think 22,000 views is pretty respectable.

For 2010, my most popular post was Resiliency vs Redundancy: Using VPLEX for SQL HA.  The top 5 posts are listed here..

1

Resiliency vs Redundancy: Using VPLEX for SQL HA June 2010

2

EMC CLARiiON and Celerra Updates – Defining Unified Storage May 2010
1 comment

3

NetApp and EMC: Real world comparisons October 2009
10 comments

4

While EMC users benefit from Replication Manager, NetApp users NEED SnapManager June 2010
25 comments

5

NetApp and EMC: Replication Management Tools Comparison June 2010
2 comments

You may notice a theme here.  First, Midrange Storage is HOT, and any comparisons between EMC and it’s competitors seem to get more attention compared to most other topics. Note #3 was written in 2009 and it’s the 3rd most viewed post on my blog in 2010. A secondary theme in these top 5 posts might be disaster recovery as well since most of these posts have DR concepts in the content as well.

Looking at search engine results the it looks like emc flare 30, clariion, and mirrorview network qos requirements were the hottest terms.  The MirrorView one is pretty specific so I may do some blogging on that topic in the future.

With these stats in mind, I’ll keep working to hone my blogging skills through 2011 and sharing as much real-world information as I can, especially as I work with my customers to implement solutions.  One thing I’ll do is try and provide the comparisons people seem to be interested in, but focusing on the advantages of products, while steering clear of negativity as much as possible.

Welcome to 2011!  It’s going to be fun!