Hands On With Coho Data’s DataStream Storage Architecture

Ever since attending Storage Field Day 4 and seeing Coho Data’s presentation, I had the desire to see their product in action. I mean, c’mon … it’s NFS attached storage (one of my favorite protocols) for vSphere that uses OpenFlow to redirect RPCs to back end storage nodes called MicroArrays. That’s pretty Star Trek sounding, right? Others are able to influence the data path by use of round robin DNS and the like, but this is one of the few platforms I’m aware of that bakes in software defined networking to solve the storage scale-out problem.

Architecture and Lexicon

The folks over at Coho Data were nice enough to ship a pair of MicroArrays fronted by a single DataStream Switch, which is an Arista 7050T-52, to my lab at work. The idea is that all of the MicroArrays plug into the switch, which acts as the front-end IP namespace for all of the back-end storage nodes. You then plumb the switch into your existing Ethernet network and let traditional networking handle the rest.

Fancy Visio Diagram

Because the switch has 52 ports, of which half (26) can be used by MicroArrays, I can theoretically dual-connect 13 MicroArrays to a single switch. Each MicroArray also needs a management interface, though, so the real number is more like 8. Dual-connecting provides 20 GbE of bandwidth to each MicroArray.
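For anyone who wants to check the port math, here is a minimal Python sketch. It assumes each MicroArray uses two 10 GbE data ports plus one management port on the switch; that per-array port count is my reading of the numbers, not something out of Coho's documentation.

```python
# Port budget for a single DataStream Switch, using the numbers from the text.
TOTAL_PORTS = 52
usable_ports = TOTAL_PORTS // 2   # half are available to MicroArrays -> 26

DATA_PORTS_PER_ARRAY = 2          # dual-connected 10 GbE
MGMT_PORTS_PER_ARRAY = 1          # assumption: one management port per MicroArray

# Counting data ports only, then counting data plus management ports.
data_only_max = usable_ports // DATA_PORTS_PER_ARRAY
with_mgmt_max = usable_ports // (DATA_PORTS_PER_ARRAY + MGMT_PORTS_PER_ARRAY)
bandwidth_gbe = DATA_PORTS_PER_ARRAY * 10

print(data_only_max, with_mgmt_max, bandwidth_gbe)
```

Under those assumptions the two limits come out to 13 and 8 MicroArrays, with 20 GbE per MicroArray.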

The DataStream Switch

Recently, I acquired a second DataStream Switch and introduced redundancy into the stack. Each MicroArray is now cabled to both switches, which provides switch redundancy and also gives me more ports for scaling out.

Here’s what’s inside the chassis, taken from the web interface:

Chassis 1 - Contains 2 MicroArrays

And a screenshot of the DataStream Switch – the other one looks pretty much the same, just different ports are plugged in.

Primary DataStream Switch

Hardware and Software Installation

The hardware is pretty standard stuff to get into the data center: a pair of 1U DataStream Switches followed by any number of 2U chassis, each holding a pair of MicroArrays. Because it’s a lab, all of the gear is in one rack. Both switches are sandwiched together with the 2U chassis sitting below them.

All of the cables come in the box and are color coded – which rocks! The box also includes a handy manual that tells you exactly which colors go where. It reminds me of the old Gateway computer kits that had color coded ports on the back to make it easy for anyone to install. 😉

Coho Data Cabling

Here’s a breakdown of the cables and colors:

  • Orange – Stacking cables between the two DataStream Switches. They become one logical switch.
  • Blue – MicroArray management cable. Each MicroArray plugs into an alternate DataStream Switch. The one in the picture is going to the top switch.
  • Red – MicroArray copper 10 GbE Data Cable to bottom switch.
  • Green – MicroArray copper 10 GbE Data Cable to top switch.
  • Yellow – Management cable for web access to one of my Nexus 2248TP FEX switches used for management.
  • Fiber – Uplinks into my network. I’m using 1 per switch, connected to a Nexus 7010 VDC.
  • Black – MicroArray to MicroArray cable for physical awareness of each other.

Once everything is plugged in you turn on the DataStream Switches and let the Coho software, which lives on those Arista switches, fire up. At that point you can connect into the web front end and go through their setup wizard. The wizard is, and should be, quite boring. It’s just a few questions about IP addresses, NTP servers, the vCenter connection, and that sort of stuff. There are no questions about the storage configuration because this is 2014 and modern storage arrays do that sort of stuff for you.

Network Configuration

I ended up putting the management interface on a management VLAN as an access port.

interface Ethernet121/1/3
  description COHO MGMT
  switchport access vlan 10
  spanning-tree port type edge
  no shutdown

The data interfaces sit on a VLAN set aside for NFS traffic. I’m using a static LAG because that’s what the DataStream supports. There is no need to configure a LAG on the DataStream side; it does it automatically.

interface port-channel26
  description Arista (Coho) LAG
  switchport mode trunk
  switchport trunk allowed vlan 26
  spanning-tree bpduguard disable

interface Ethernet2/17
  description Arista (Coho) Primary
  switchport mode trunk
  switchport trunk allowed vlan 26
  spanning-tree bpduguard disable
  channel-group 26
  no shutdown

interface Ethernet2/18
  description Arista (Coho) Secondary
  switchport mode trunk
  switchport trunk allowed vlan 26
  spanning-tree bpduguard disable
  channel-group 26
  no shutdown
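The two member ports differ only in interface name and description, so the stanzas above could be generated from one template to avoid copy-paste drift. This is a rough Python sketch; the `member_port_config` helper is a hypothetical name of my own, and it just prints text to paste into the switch.

```python
def member_port_config(interface: str, description: str,
                       vlan: int, channel_group: int) -> str:
    """Render one LAG member stanza in the same shape as the config above."""
    lines = [
        f"interface {interface}",
        f"  description {description}",
        "  switchport mode trunk",
        f"  switchport trunk allowed vlan {vlan}",
        "  spanning-tree bpduguard disable",
        f"  channel-group {channel_group}",  # no mode keyword = static LAG
        "  no shutdown",
    ]
    return "\n".join(lines)


# The two uplinks from the post, both on VLAN 26 and port-channel 26.
members = [
    ("Ethernet2/17", "Arista (Coho) Primary"),
    ("Ethernet2/18", "Arista (Coho) Secondary"),
]
for name, desc in members:
    print(member_port_config(name, desc, vlan=26, channel_group=26))
    print()
```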

That’s about it for networking. The data VLAN is available on my ESXi hosts so that they can mount the NFS volume without having to route.

DataStream Upgrades

I’ve been involved with two code updates thus far. The first required plugging a USB drive into the DataStream Switch and rebooting it, at which point it picked up the new code and loaded it. Those days are dead – I can now upload a package file directly onto the DataStream via the web interface.

Updating the DataStream Version

The process was a little wonky on Chrome, but I’m assuming it just wasn’t tested against Chrome. It apparently works fine on IE. The actual upgrade itself went fine, but the splash screen didn’t update – so I refreshed the page. Problem solved. Upgrading both DataStream Switches took under 15 minutes and service was not interrupted. It was a rather underwhelming upgrade process – which is the goal, right?

Activity Stream

The DataStream uses a social media-esque activity stream to report what’s happening. You can filter the events using time, priority, or wildcard search strings. I have thought about seeing if I could make a Twitter account for the DataStream and have it DM me when there are issues, because I hate email. Time to find an API 🙂

DataStream Activity Stream

Mounting Storage

Assuming that your ESXi hosts have a vmkernel interface on the data network, mounting the storage is a snap. I ended up creating a DNS A record for the data IP address and using that as the server address. The mount point is simply “/” because the solution is just one big distributed storage volume.
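If you would rather mount from the command line than click through the vSphere client, the same three inputs (server, mount point, datastore name) map onto `esxcli storage nfs add`. Here is a small Python sketch that just assembles that command for you to run on a host. The DNS name and datastore name are placeholders, not the values from my lab.

```python
import shlex


def nfs_mount_command(server: str, datastore_name: str, share: str = "/") -> str:
    """Build the esxcli command that mounts an NFS export as a datastore."""
    args = [
        "esxcli", "storage", "nfs", "add",
        f"--host={server}",                 # DNS A record for the data IP
        f"--share={share}",                 # the mount point is simply "/"
        f"--volume-name={datastore_name}",  # name shown in vSphere
    ]
    return " ".join(shlex.quote(arg) for arg in args)


# Placeholder names; substitute your own A record and datastore label.
print(nfs_mount_command("coho-data.lab.example", "DataStream"))
```

Run the printed command on each ESXi host that has a vmkernel interface on the data network.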

DataStream Datastore

With only a pair of MicroArrays I get 17.75 TB of space, but supposedly this is actually TiB, making the real value closer to 19.5 TB. There’s also an NFS VAAI plugin on the VMware HCL that can be deployed into the environment to make file cloning something that the array does on your behalf.
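The TiB versus TB difference is easy to sanity check. A quick Python sketch of the conversion, assuming the 17.75 figure really is binary tebibytes:

```python
def tib_to_tb(tib: float) -> float:
    """Convert tebibytes (2**40 bytes) to decimal terabytes (10**12 bytes)."""
    return tib * 2**40 / 10**12


print(round(tib_to_tb(17.75), 2))  # 19.52
```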

Coho Data's VAAI Plugin

Thoughts

The DataStream 1000, which is the official name, has been in the lab for about a year now. It’s low maintenance and has become progressively easier to work with as the team builds more features and functionality into the product. I’ve yet to really bench the software on the latest code, version 1.6, but I plan to do so in the near future, or to see if my colleague Brian Suhr at Datacenter Zombie will crush it with desktop tests. 🙂

On my wish list are some obvious items: replication would be nice, although hypervisor-based replication from folks like Zerto or Veeam B&R certainly makes that less of a pain point these days. PowerShell cmdlets would be welcome, too.