Cohesity: Much More than Hyperconverged Secondary Storage

Recently I had the opportunity to attend Cloud Field Day 4 in Silicon Valley. While in attendance, one of the briefings I attended was provided by Cohesity. For those who are not already familiar with Cohesity, it was founded in 2013 by Mohit Aron, who co-founded Nutanix and was previously a lead on the Google File System. So it’s safe to say that he created Cohesity with a solid foundation in storage. The platform was created as a scale-out secondary storage platform, but as I discovered during my time there, the use cases for Cohesity’s platform have grown well beyond a secondary place to store data.

Cohesity spent very little time getting the delegates up to speed on their platform and the SpanFS distributed file system that powers it. That information has been covered in past Field Day events and can be found in archived videos. We spent the majority of our time with Cohesity covering higher-level features and functionality, which I will review in this blog post.

Cloud Adoption Trends

The first “session” of the briefing was delivered by Sai Mukundan and past Field Day delegate Jon Hildebrand.

Sai covered some of the trends that Cohesity sees in customers adopting the cloud, along with specific use cases for the Cohesity Data Platform. The first of these use cases is long-term retention and VM migration.

Because Cohesity supports AWS, Azure, GCS, and any S3-compatible object storage, a customer can choose the cloud storage provider that suits them best as a target for long-term storage of their data. Indexes of customer data enable search across local and cloud instances of data stored by Cohesity. This is especially valuable in helping customers avoid unnecessary egress charges when a file already exists on-premises.

My favorite part of most briefings is, of course, the demos, and Cohesity did not disappoint. During his time in the limelight, Jon showed how he could create a policy that would archive data to multiple public clouds at once. In this case, he created a single policy that archived data to AWS, Azure, and GCP all at the same time. I managed to get a question in during this demo and, in case you are curious, not only can you set a bandwidth limit for each cloud target, but also a global limit to ensure that the aggregate of your cloud archive jobs will not consume an unwanted amount of bandwidth. Jon and Sai also showed that once the data exists in multiple locations, all of them are presented when a restore is initiated.

Migration of VMs is handled in Cohesity by a feature named “CloudSpin.”

CloudSpin

This feature was also showcased in demo form. I won’t describe the demo in detail because you can just watch it at your leisure. I will however mention one thing that struck me during Cohesity’s briefing. The UI is not only slick and responsive, but also well thought out. While watching demos I was impressed by how intuitive everything seemed and how easy I felt navigating the platform would be for someone who was unfamiliar with the interface.

Application Test/Dev

Within the context of VM Migration that was previously mentioned, another potential use case of the Cohesity platform is application mobility for the purposes of testing and development. Again, this functionality was demonstrated rather than just explained.

Again, I won’t spend a lot of time rehashing what took place during the demo. But as the demonstration of the power available to developers unfolded, the panel of mostly infrastructure professionals began discussing the implications of these capabilities and raised concerns about access and cost control. The Cohesity team did a very good job of addressing roles with built-in RBAC capabilities, but it is clear that there is no built-in cost control capability at this point in time. It was pointed out that the extensibility of the platform through its APIs means that customers could implement cost control using a third-party management plane of their choice. This is an indirect answer to the question, though, and I would like to see Cohesity implement these features natively. A customer can decide to build cost controls on top of the Cohesity APIs, leverage a third-party management plane, or simply let developers run wild (a bad idea).

Cloud Native Backup

Within the Cohesity model, cloud-native backups are a three-step process. The image below depicts the scenario specific to AWS, but the process for Azure or GCP workloads is largely the same. First, a snapshot of an EBS volume is taken and placed in an S3 bucket. Second, the snapshot is transformed into an EBS volume. To complete the process, the volume is copied to Cohesity Cloud Edition.

CNBackup
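To make those three steps a bit more concrete, here is a minimal sketch of the first two using the AWS Tools for PowerShell. This only illustrates the underlying AWS primitives rather than Cohesity’s actual implementation, and the volume ID and availability zone are placeholders.

    # Requires the AWS Tools for PowerShell and configured AWS credentials.
    Import-Module AWSPowerShell

    # Step 1: snapshot the EBS volume (AWS stores EBS snapshots in S3 behind the scenes).
    $snapshot = New-EC2Snapshot -VolumeId "vol-0123456789abcdef0" -Description "Cloud-native backup example"

    # Poll until the snapshot completes before using it.
    do {
        Start-Sleep -Seconds 15
        $snapshot = Get-EC2Snapshot -SnapshotId $snapshot.SnapshotId
    } while ("$($snapshot.State)" -ne "completed")

    # Step 2: turn the snapshot back into an EBS volume in the desired availability zone.
    $volume = New-EC2Volume -SnapshotId $snapshot.SnapshotId -AvailabilityZone "us-east-1a"

    # Step 3 is Cohesity's job: the new volume is read and copied into Cohesity Cloud Edition.
    $volume.VolumeId

Cohesity automates all of this behind the scenes; the point is simply that the workflow is built on ordinary EBS snapshot and volume operations.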

Multi-Cloud Mobility

A common first use case for many customers when they initially put data into the cloud is long-term retention. With this in mind, Cohesity seeks to enable customers to store and move data to the cloud provider of their choice. The three big clouds (AWS, Azure, and GCP) are all supported, but a customer could choose to leverage an entirely different service provider as long as they offer NFS or S3-compatible storage.

I expected Cohesity to show off some kind of data movement feature during a demo of this use case, but I was wrong. What I got instead was a demonstration of how Cohesity maintains data consistency even when data that had been archived to one cloud vendor is migrated to another by a third-party tool. This ensures that the cluster maintains access to the data and can continue performing tasks such as incremental backups. It is accomplished by changing the metadata within a Cohesity cluster, and there are multiple ways to execute the task: the GUI, the API, or, in the case of the demo, a CLI tool called icebox_tool.

icebox_tool

Summary

While Cohesity may have started life as a “Hyper-Converged Secondary Storage” platform, the use cases have increased greatly as the platform has matured. While this makes for a very powerful platform that can fit a multitude of customer types, it has led to confusing messaging.

UseCases

Is Cohesity a data archival platform, a backup platform, or a data mobility platform? The answer is “all of the above,” which is fine, but it doesn’t really help deliver a clear message that can be brought to market and keep the product front of mind for customers who are seeking a product to address their needs.

I’m not a marketing genius, so I have no idea what this message would look like. However, Cohesity has been bringing in a lot of top talent lately, and I think they should have no problem clearing up the confusion that exists around their platform. It is clearly a powerful and capable platform, and once customers know all of the use cases and how the product is relevant to them, it is sure to gain popularity.

Disclaimer: Although this is not a sponsored blog post, much of the information used for it was gathered at Cloud Field Day 4. All of my expenses were paid by GestaltIT who coordinate all Tech Field Day events.

Move Your Data into the Cloud NOW with SoftNAS

Recently, I had the opportunity to attend Cloud Field Day 4 in Silicon Valley. While there, one of the companies that presented to the delegates was SoftNAS. For those unfamiliar with SoftNAS, their solution allows organizations to present cloud storage from platforms such as AWS and Azure using familiar protocols such as iSCSI and NFS.

This approach to cloud storage has both benefits and drawbacks. On one hand, SoftNAS allows companies to overcome data inertia easily without refactoring their applications. On the flip side, an application can only be considered cloud native when it is designed to take advantage of the elasticity of services and resources made available by public cloud platforms like AWS S3 and Azure Blob Storage. SoftNAS helps customers accomplish this move with a number of features.

SmartTiers

SmartTiers is meant to leverage multiple cloud storage services and aggregate them into a single target for applications that are not able to utilize cloud storage services natively. With SmartTiers, data can be automatically aged from the quickest and most expensive tier of storage made available to the application down to lower-cost, longer-term storage. Think of it as putting hot data on flash storage, cool data on traditional block storage, and cold data in object storage such as S3.

SmartTiers

UltraFast

UltraFast is the SoftNAS answer to the problem of network links that tend to have unpredictably high latency and packet loss. Using bandwidth consumption scheduling, SoftNAS claims UltraFast will achieve the best possible performance on networks with the aforementioned reliability problems. Performance of UltraFast is monitored through a dashboard and can also be measured on demand with integrated speed tests.

Lift and Shift

Lift and shift is a common term used to describe the process of moving data off a legacy platform and into the cloud without refactoring. It is often seen as an intermediate step toward eventually adopting a cloud-native architecture. SoftNAS helps customers achieve this by moving data to an on-premises appliance that continually syncs with another appliance in the cloud service of their choice. Synchronization of data can be accelerated by UltraFast. When the customer is ready to complete the migration of their application, only a final delta sync is needed, and the most recent version of the data will be present in the cloud platform of their choice.

LiftNShift

FlexFiles

FlexFiles is a SoftNAS feature based on Apache NiFi. It solves the problem of collecting and analyzing data in remote locations. IoT devices can generate extremely large amounts of data that cannot possibly be transferred back to a data center or public cloud over the types of WAN/Internet links available at most remote locations. By placing a SoftNAS appliance in the location where data is to be collected, customers can use FlexFiles to filter data. Once data is captured and filtered locally, only the data deemed necessary is transferred securely to the public or private cloud, where it can be acted upon (transformed, in SoftNAS terms) and processed by an application.

Summary

The first reaction that some may have to SoftNAS is that the product is not cloud native and therefore not worth their time. I would caution against this line of thinking and encourage taking time to consider the use cases of SoftNAS’s solutions. Much of what SoftNAS does enables customers to move large amounts of data without refactoring their applications for the public cloud. This can be extremely valuable for organizations that do not have the time or skills in house to completely rearchitect the applications that their business relies on for critical operations.

Yes, if you are moving your data to the cloud, you would be better off adopting a cloud-native architecture long term. But if you have an immediate need to move your data off site, a solution like SoftNAS will remove the barrier to entry that exists in many organizations and serve as an on-ramp to the cloud.

One More Thing

While they were presenting at Cloud Field Day 4, SoftNAS mentioned that they also have a partnership with Veeam that will allow the platform to be used as a target for synthetic full backups. There was not enough time during the briefing to get a deep dive into this functionality. I have reached out to SoftNAS and hope to get more information soon and follow up with some specifics on the offering.

Disclaimer: Although this is not a sponsored blog post, much of the information used for it was gathered at Cloud Field Day 4. All of my expenses were paid by GestaltIT who coordinate all Tech Field Day events.

From Automation Noob to…Automation Noob with a Plan

Note: This post is being published at the same time as a lightning talk of the same title being delivered at Nutanix .NEXT 2018. It contains links to the various resources mentioned during the talk.

I’ve been working in IT for 15 years and I think my story is very similar to that of many of my peers. I had some programming courses in college, but decided that I was more interested in infrastructure and chose the route of a systems administrator over that of an application developer.

Early in my career most of my tasks were GUI or CLI driven. Although I would occasionally script repetitive tasks, that usually consisted of googling until I found someone else’s script that I could easily change for my purposes. Most of the coding knowledge I gained in college I either forgot, or it quickly became outdated.

Fast forward to the current IT infrastructure landscape, and automation and infrastructure as code are taking over the data center. Not wanting to let my skills languish, I have embarked on improving my skills and embracing the role of an infrastructure developer.

The purpose of this blog post is not to teach the reader how to do anything specific, but to share the methods and tools I’ve found useful as I attempt to grow my skill set. My hope is that someone undergoing the same career change as me will find it useful. I’ve heard many tips and tricks, some of which I have taken to heart as I work toward my goal. I will devote a bit of time to each of these:

  • “Learn a language.”
  • “Pick a configuration management tool.”
  • “Learn to consume and leverage APIs.”
  • “Have a problem to solve.”

I’m going to take these one by one and break down my approach so far.

Learn a Language

When I first heard this I was unsure which language I should learn and where I should start. I actually cheated a bit on this one. I chose two languages: Python and PowerShell. I chose these based on the fact that they are both powerful, widespread, and well suited to most tasks I would want to automate.

PowerShell

To get away from googling other people’s PowerShell scripts and actually create something myself, I wanted to truly understand the language and how it handles objects.
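As a quick illustration of what “handling objects” means in practice, here is the kind of pipeline that made it click for me: every cmdlet passes structured objects rather than text, so you filter and select on properties.

    # Everything between these cmdlets is a .NET object, so filtering works on properties, not parsed text.
    Get-Service |
        Where-Object { $_.Status -eq 'Running' } |
        Sort-Object -Property DisplayName |
        Select-Object -Property Name, DisplayName, Status -First 5

    # Get-Member reveals every property and method available on the objects a cmdlet emits.
    Get-Service | Get-Member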

I’d heard many mentions of the YouTube series “Learn PowerShell in a Month of Lunches” by Don Jones. I made my way through this series and have found it very valuable in understanding PowerShell. There is a companion book by Don and Jeff Hicks available as well. I have not purchased the book myself, but I have heard good things.

Another great way to learn PowerShell or any other number of technologies is Pluralsight. Jeff Hicks, one of the authors of the previously mentioned book, also authored a course named “Automation with PowerShell Scripts.” I am still making my way through this course but like pretty much everything on Pluralsight it is high quality. If you have access to Pluralsight as a resource I highly recommend taking advantage of it.

Python

I was even less familiar with Python than I was with PowerShell before making the decision to enhance my automation skills. Although I understand general programming principles, I needed to learn the syntax from the beginning. Codecademy is a great, free resource for learning the basics of a language, and I made my way through their Learn Python course just to get the basics under my belt.

Codecademy was a great way to start understanding Python syntax, but it left me with a lot of questions about actual use cases and how to start scripting. Enter another “freeish” resource, Packt. I say freeish because Packt gives away a free ebook every day. I check the site daily and have noticed that some books pop up multiple times. Many of these are Python books, and one I have been spending my time on in particular is Learning Python by Fabrizio Romano. My favorite method of learning is to have a Python interpreter and the book open in side-by-side windows on my laptop. Not pictured is the code editor I keep open in the background or on another monitor.

LearningPython

Another resource worth mentioning is Google’s free Python course. I’ve only looked at this briefly and have not spent much time on it yet. It appears to be high quality and easy to follow, and at $0 the price is right!

Pick a Configuration Management Tool

There are many of these out there, and choosing the right one can be difficult. What makes one better than another? You’ve got Puppet, Chef, and Ansible for starters. Fortunately for me, the decision was easy: my employer at the time was already using Puppet, so I just decided to dive in.

The first thing I did was download the Puppet Learning VM. This is a free VM from Puppet that walks you through learning the functionality of Puppet, not just by reading the included lessons (pictured below), but by accessing the VM itself over SSH and actually interacting with Puppet.

LearnPuppet

You can run this VM in a homelab, or even on your old PC or laptop, provided you install something like VMware Workstation or VirtualBox. Learning Puppet takes place in the form of quests. You are given an objective at the end of a module that you must achieve within the Puppet VM, and a progress indicator on the screen informs you when you have completed the various tasks. This is one of the coolest “getting started” guides I have ever come across, and I cannot recommend it highly enough.

As great as the Puppet Learning VM was, I wanted to strike out on my own and really get to know the product. To do this, I used old hardware at work to set up a lab and mirror our production environment as best I could. When things didn’t work, my troubleshooting efforts usually led me to the Puppet User Group on Google.

This is a great resource for anyone who has questions while trying to learn Puppet. I’ll be honest and admit that I’m not very active in this group. I mostly just subscribe to the daily digest, but reading other people’s questions and seeing how they resolved their issues has been very helpful for me.

Where things got really interesting with Puppet was when I broke my vCloud Director lab entirely. By recycling many of the Puppet modules and manifests from my production environment I managed to overwrite the response file of my vCD lab. These response files are specific to each installation of vCD and are not portable. Although this was a lab and I could just start over, I was determined to fix it in order to get a better understanding of vCD. This taught me an important lesson about paying attention to the entirety of the configuration you will be enforcing, regardless of the tool you are using.

Learn to consume and leverage APIs

This one is the newest to me, and I am only getting started. Recently vBrownBag hosted a series called “API Zero to Hero” and I watched them all. I have also downloaded Postman and use it to play with an API every time I want to learn a little more.

When it comes to actually doing something with an API, I am interested in leveraging the API in Netbox, the excellent DCIM tool by Jeremy Stretch of DigitalOcean, to provision and assign things like IP addresses and VLANs and save myself some manual work.
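To give an idea of where I want to end up, here is a minimal sketch of creating an IP address record in Netbox through its REST API using Invoke-RestMethod. The URL, token, and address are placeholders for my environment, and field names can vary between Netbox versions, so treat this as a rough starting point rather than a finished script.

    # Placeholders: point these at your own Netbox instance and API token.
    $netboxUrl = "https://netbox.example.com/api/ipam/ip-addresses/"
    $token     = "0123456789abcdef0123456789abcdef01234567"

    $headers = @{
        "Authorization" = "Token $token"
        "Accept"        = "application/json"
    }

    # Reserve an IP address and note what it is for.
    $body = @{
        address     = "10.10.20.25/24"
        description = "Provisioned via the Netbox API instead of by hand"
    } | ConvertTo-Json

    Invoke-RestMethod -Method Post -Uri $netboxUrl -Headers $headers -Body $body -ContentType "application/json"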

Have a problem to solve

All the learning in the world won’t do you much good if you don’t apply it. Shortly after acquiring a little bit of PowerShell knowledge I felt comfortable enough to write some basic scripts that helped me save some time in the Veeam Console. I delivered a vBrownBag tech talk on this at VMworld 2017 and you can view it in the video below if you are interested.

Long story short: I wanted to quickly change a single setting on dozens of Veeam backup jobs. Rather than click through all of them, I dug through the Veeam PowerShell documentation and tested things out until I came up with seven lines of code that let me toggle storage integration on or off for all backup jobs.
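For anyone curious, the sketch below shows the general shape of that script. It is not my exact seven lines, and the option property path is from memory, so verify the names against the Veeam PowerShell reference (or poke around with Get-Member) for your version of Veeam Backup & Replication before running anything like it.

    # Veeam Backup & Replication PowerShell snap-in (the v9.x-era module was a snap-in).
    Add-PSSnapin VeeamPSSnapin

    # Set to $true to enable storage integration on every job, $false to disable it.
    $enableStorageSnapshots = $false

    foreach ($job in Get-VBRJob) {
        $options = $job.GetOptions()
        # Assumed property path for the "use storage snapshots" setting; confirm with Get-Member.
        $options.SanIntegrationOptions.UseSanSnapshots = $enableStorageSnapshots
        Set-VBRJobOptions -Job $job -Options $options | Out-Null
        Write-Host "Updated $($job.Name)"
    }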

Bonus Tips

Now that I’ve walked through my approach to the various pieces of advice I received over the years, I’m going to leave you with a couple more.

Pick a code editor and use it

We are long past the days of quickly editing .conf files in vi or parsing data in Notepad++. There are numerous code editors available, and I’ve tried a few, like Sublime Text and Atom. The one I’ve settled on as my favorite is Microsoft Visual Studio Code. It’s robust, lightweight, and extensible. What more could you want?

Use version control

Project.ps1, ProjectV2.ps1, and ProjectV2final.ps1 are not acceptable change tracking in an era of infrastructure as code. It’s important to learn and use version control as you create and update your automation projects. The most popular and widely used tool for this is Git; more specifically, GitHub if you want to host your code publicly, or GitLab if you want to host the repositories yourself.

To get started learning Git, once again I would recommend Codecademy if you are completely new. They have a Learn Git course that is a suitable introduction. If you already have the basics handled, then you may want to learn some of the more advanced concepts outlined in the vBrownBag #Commitmas series.

Making sense of it all

At this point you can tell that I have my work cut out for me. I have listed several different skills that I am trying to learn, or plan on learning, and stitch together into a useful skill set. If you find yourself in the same situation, the best encouragement I can give you is to keep at it.

You may not be able to sit and learn everything you need without interruption. I certainly haven’t. But whenever I have some spare time available I go back to the latest course or YouTube series I’m working on and try to make progress. If you are reading this post and find yourself in a similar position, I encourage you to do the same and keep at it!

Slides from the lightning talk related to this post are available here. Links to all the resources from this blog post are embedded in the slides as well.

News from Nutanix .NEXT 2018

Nutanix is holding their annual .NEXT conference in New Orleans this year and has made several new announcements and enhancements to their platform. I will be highlighting some of my favorites.

Flow

Flow is the name of the Nutanix software-defined networking (SDN) offering. As of version 5.6, Nutanix had the capability to perform microsegmentation of workloads. With the release of Flow, Nutanix is extending its SDN capabilities to include network visibility, network automation, and service chaining.

FLOW

Flow is available for customers using the Nutanix Acropolis Hypervisor (AHV). There will be no additional software to install. Many of the SDN enhancements coming to AHV are a result of the recent Nutanix acquisition of Netsil.

FLOWNetsil

If you are in New Orleans for .NEXT this week there are a few sessions/labs that will be a great opportunity to learn more about SDN in Acropolis.

FLOWSessions.png

Era

Era is Nutanix’s first foray into PaaS and is a database management solution. Using Era, Nutanix customers will have the ability to manage their databases natively on their Nutanix clusters.

ERAcdm.png

Era’s Time Machine feature will allow customers to stream their databases to their Nutanix cluster and clone or restore to any point in time, up to the most recent transaction.

ERATimeMachine.png

The first release of Era will support Oracle and PostgreSQL databases.

ERA1.0

Long term, the Era vision aligns well with the overall Nutanix philosophy of giving customers choice. The ability to manage a database of the customer’s choice in the cloud of their choice is the reality that Nutanix is aiming for with Era.

ERAvision.png

Beam

Beam is the result of the Nutanix acquisition of Botmetric. It is targeted at giving customers visibility into and control of costs across multiple clouds. In the first release, Beam will support Nutanix private clouds as well as AWS and Azure, with GCP support coming in a future release.

Nutanix Beam Botmetric

Nutanix Beam NTC

Nutanix Beam Cost Optimization

It’s clear from the announcements by Nutanix at .NEXT 2018 that enabling hybrid and multi-cloud strategies is the goal of Nutanix and their platform: giving customers freedom and choice, and the ability to put the right services in place to enable their business to succeed in the 21st century.

How to Create a DellEMC XC CORE Solution With the Dell Solutions Configurator

Nutanix and DellEMC announced a new way to purchase a Nutanix solution on Dell hardware this week in the form of Dell XC CORE. Customers can purchase hardware that delivers a turnkey Nutanix experience while purchasing the software separately, preventing a license from being tied to a specific appliance.

For Nutanix partner resellers there has been a bit of confusion regarding how we can configure and quote the XC CORE nodes for customers seeking these options. After a little bit of searching I have completed a couple of these configurations for my customers.

It looks like some partners have different versions of the Dell configurator, with some versions allowing XC CORE options to be selected at the time of solution creation. The version I have access to does not, so I had to dig a bit to find where I could configure XC CORE nodes.

New Solution

After creating a new solution, I navigated down to the Dell XC nodes that were previously available to me.

DeviceSelection

By selecting a Dell XC node that was listed in the XC CORE datasheet and expanding the first item, “XC640ENT” in my case, I found a new option: Dell EMC XC640ENT XC CORE.

ConfiguratorXC640ENT

The rest of the process was as familiar as any Nutanix configuration, and I was able to complete the solution according to the sizing I had already worked out.

Proud to be Selected as a Veeam Vanguard 2018

I woke up on the morning of March 6, 2018, to a very pleasant surprise. I had received the email pictured below from Rick Vanover, Director of Product Strategy at Veeam Software, inviting me to be a member of the Veeam Vanguard.

Screenshot_20180307-103741~2.png

Needless to say I immediately thanked Rick for selecting me and accepted his invitation. I had the pleasure of meeting several members of the Veeam Vanguard when attending VeeamON last year. A few of them encouraged me to apply and I want to specifically thank Matt Crape, Rhys Hammond, and Jim Jones for their support.


I gained my VMCE certification while in New Orleans and have since led the majority of Veeam pre-sales discussions on behalf of my employer. I also had the opportunity to deliver a vBrownBag tech talk on the VMTN stage at VMworld last year about my experience automating repetitive tasks in Veeam Backup & Replication using the Veeam PowerShell snap-in. I look forward to continued involvement in the Veeam community and to being an active participant in the Veeam Vanguard program.


Building an On-Demand Cloud Lab

This time last week, I was in Palo Alto, California for the VMUG leader summit, which was great.  Prior to the summit I submitted an entry into a new VMUG program called SPARK.

To participate in SPARK, leaders were asked to submit two PowerPoint slides outlining an idea for a way to provide value back to the VMUG community. Additionally, we were asked to submit a video that would be played back at the summit before all leaders present voted for their favorite submission. The Indianapolis VMUG submission was an on-demand cloud lab built on VMware Cloud on AWS.

It was a tight race, but I’m proud to say that Indianapolis VMUG won the first VMUG SPARK contest and will receive $5,000 in funding to help make this idea a reality.

Here’s the point of the post, though: I can’t do this alone, and in fact I don’t want to. I want to involve the community to make it as great as it can be. I want to hear others’ ideas on how we can create an infrastructure that can be spun up quickly and easily and help other VMUG members learn new skills.

We will be holding our initial planning calls after the new year, at whatever date and time work best for the most participants. If you would like to participate, please reach out to me via Twitter. My DMs are wide open and all are welcome. We can make a great platform for everyone to learn on if we work together!