
Nvidia thoughts on composability – tail latency limits CXL

The world of composable systems is divided between PCIe/CXL-supporting suppliers, such as Liqid, and networking suppliers such as Fungible. And then there is Nvidia, which has depended upon its networking for composability – either Ethernet or InfiniBand – but is preparing to support CXL. An interview with Kevin Deierling, its VP for Networking, cleared things up.

His basic concerns with PCIe-based systems are tail latency – the occasional (98th to 99th percentile) high-latency data accesses that can occur and delay application completion time – and link robustness, as we shall see.

Deierling said Nvidia starts from a three-layer stack point of view, with workloads at the top. These interface with the hardware layer through platform elements such as Nvidia Omniverse, AI and HPC. The platforms talk to system software – Nvidia RTX, CUDA-X and PhysX – and to underlying software constructs such as DOCA, Base Command and Fleet Command, which drive hardware elements such as GPUs, CPUs, DPUs (BlueField), NICs (ConnectX), switches and SoCs (systems on chip).

Kevin Deierling

Deierling said: “This is relevant because each of these different workloads on the top, some of them are incredibly compute-intensive, and need a tonne of GPUs. Others are incredibly data-intensive and need, you know, NICs and CPUs to access storage. And so that notion of composability is critical to everything we do. And it’s really fundamental to everything we do so that we can take a set of resources and offer this as being built in the cloud. Therefore, you have a set of resources that needs to run every one of these applications. … Composability is really being driven by this. It’s just all sorts of different workloads at the top that all need really different elements.”

He said: “Our customers will leverage our technology to build a composable datacenter. So we’re not trying to compete with somebody that’s doing something over PCI Express – that’s just not how we view the world.”

Deierling talked about Nvidia’s accelerated Ethernet platform and three embodiments: the BlueField 3 DPU (Data Processing Unit), ConnectX-7 Smart NIC and Spectrum-4 switch.

He said: “Underlying it is this vision of the datacenter as the new unit of computing. And then we can scale the three elements – GPUs, CPUs, and DPUs – independently. And then, of course, storage is another really a fourth element here so that, for storage-intensive applications, we can assign different storage elements to these boxes.”

A datacenter can be equipped with a number of GPUs of different kinds, plus CPUs and DPUs – but not all workloads need all of these, and what they do need varies. By virtualizing the hardware elements and composing them into dynamic, workload-specific configurations, you can increase system utilization and reduce the amount of stranded, temporarily idle resource capacity in systems such as Nvidia’s OVX SuperPod.

BlueField, he said, “can actually go take what was local storage and make it remote, but have it still appear as if it was local. This is a classic composability workload, and others are doing that by literally extending the PCIe interface.”

The trouble with PCIe is that it is unsophisticated and was never designed for networking. “We do it in a much more sophisticated way. The problem with PCIe is that if you disconnect a PCIe switch cable from a storage box, then your system says: ‘Oh, something broke, I’m going to reboot.’ Everyone’s trying to work around that fundamental nature – that PCIe was designed to work in a box, and it doesn’t fail. And if it fails, your whole system has failed.”

“We use networking, and we make the composability interface at the software layer. So when you do … NVMe, you actually say, ‘Hey, I’m a local NVMe drive.’ The reality is we’re not an NVMe drive – the storage is somewhere else. It’s in a box from one of our partners – a Pure or a DDN or NetApp or EMC or just a bunch of flash. We present it as if it was local. We do the same thing with networking. We can have IO interfaces that are really composed and then we can go off and run firewalls and load balancers and all of those different applications.”

In his view, “We’re really taking all of … hardware appliances, and we moved them to this software-defined world and now we’re virtualizing everything which makes it composable in a software-defined hardware system.”

Blocks & Files: I can see that you’re dynamically composing storage attached to BlueField and the GPUs and CPUs, but I can’t see you composing DRAM or storage-class memory. So are you at a disadvantage there?

Deierling: “Our NVLink technology is an even lower-level technology than networking. That starts to address some of these issues … We’re using our NVLink technology to connect GPUs to GPUs. But Jensen [Huang, Nvidia president and CEO] showed the roadmap where that really starts to extend into other things. And those other things are certainly capable of supporting memory devices. Today, we’re actually doing GPU to GPU and the memory between those devices. We can connect across an NVLink interface and share that.”

Blocks & Files: And CXL?

Deierling: “One of the things we’re looking at is CXL. NVLink is very much like CXL. And I think we won’t be disadvantaged. You know, we’ll support CXL when it’s available. We look at CXL as just the next generation of PCIe. Certainly today with PCIe, it’s one of the areas where we connect. And, you know, frankly, it slows us down.”

“When we make networking connections between devices, we’re typically limited by PCIe to go to the host memory. This is precisely, for example, why we have converged cards that we can connect directly from our GPUs: because we have a very, very fast local interconnect, faster than we’re able to get through the host system bus to the CPU.”

He said that CXL will speed things up and “as it starts to provide new semantics, we will absolutely take advantage of those.”

Tail latencies

A PCIe problem is tail latencies, as he explained. “RDMA, the remote direct memory access capability, needs really low latency hardware offload congestion management. [That’s] because one of the challenges here is that if you have memory, you simply can’t live with long tail latencies, where every once in a while, your access takes many, many hundreds of microseconds versus on average, microseconds. 

“This is one of the things that we address with all of our adaptive routing and congestion control capabilities that we built into our networking technologies, the same sort of technologies will need to be built into anything that has the potential to compose memory.”

You may get 99 percent hit rates on local caches (server socket-accessed DRAM or GPU high-bandwidth memory) and only one percent of the time have to go fetch data across the composable interface. “So you figure that you’re in great shape. Well, if your access times is 1000 times longer – if instead of being 50 nanoseconds, it’s 50 microseconds – [then] that one percent of the time … that you have to go fetch from remote memory completely dominates your performance.”
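A quick back-of-the-envelope calculation, using the hypothetical figures in the quote above, shows why the rare remote access dominates (a sketch, not Nvidia’s numbers):

    # 99% of accesses hit local memory at ~50 ns; 1% cross the composable
    # interface at ~50 us (1,000x slower). Average access time:
    local_ns, remote_ns, hit_rate = 50, 50_000, 0.99
    avg_ns = hit_rate * local_ns + (1 - hit_rate) * remote_ns
    print(f"average access time: {avg_ns} ns")  # 549.5 ns, about 11x the local latency
    print(f"time spent in the 1% of remote accesses: {(1 - hit_rate) * remote_ns / avg_ns:.0%}")  # ~91%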

PCIe can’t solve the tail latency problem. “Today we’re doing a tonne of things with RDMA over networks. We partner with others that use PCI Express, but frankly, we’ve never seen that technology be able to deliver the kinds of performance with composability that we’re able to deliver over a network.”

“CXL has some headwinds that it needs to overcome to become something that really supports the composable network. We’ll adopt and embrace that as needed. And when things aren’t there, and they’re not available, then we build our own technologies.” Like NVLink.

Deierling said: “There was a great paper that Google published nine years ago called The Tail at Scale. Even if something happens only one out of 1,000 times, in our datacenters with our workloads fragmented into thousands of microservices, you’ll see it all the time. So that’s what we really think about – not average latencies, but tail latency.”
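The Tail at Scale arithmetic is easy to reproduce: if a request fans out to many services and each one independently hits its 1-in-1,000 slow path, the chance that at least one of them is slow climbs quickly. A sketch with assumed fan-out figures, not Google’s data:

    # Probability that at least one of n parallel service calls hits its
    # 1-in-1,000 (99.9th percentile) slow path.
    p_slow = 0.001
    for n in (1, 100, 1_000, 2_000):
        p_any = 1 - (1 - p_slow) ** n
        print(f"{n:>5} services: {p_any:.1%} of requests see a tail-latency event")
    # 1: 0.1%, 100: 9.5%, 1,000: 63.2%, 2,000: 86.5%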

Blocks & Files: What do you think of CXL big memory pools?

Deierling: “As CXL evolves, and we see a potential that gives us performance. It has to be deterministic performance. It’s that long tail latency that kills you. … If that becomes something that is reliable, cost effective, can scale out and addresses that long tail latency, then absolutely we will take advantage of those connections. You know, it’s not there yet.”

Comment

Nvidia is a member of the CXL Consortium and has, we could say, a watching brief. It does not have a need for large CXL-accessed memory pools because its GPUs have local high-bandwidth memory and are not memory-limited in the same way as an x86 server, with its limited DRAM socket numbers. As CXL memory pooling is not here yet we can’t tell if long tail latency will be a real problem or whether software such as that from MemVerge could sort it out. We are suffering from, you could say, new technology’s long arrival latency.

NetApp adds Porsche race team to client list

Porsche Formula E

NetApp and the TAG Heuer Porsche Formula E race car team are forming a “multi-year partnership” in which “NetApp will provide the sportscar maker with innovative hybrid cloud solutions that will help them continue to write car racing history.”

“NetApp’s hybrid cloud solutions enable TAG Heuer Porsche Formula E Team to access their data trackside to support driver and team performance. This helps them make data-driven decisions in real time.” Our first thought was that this was just typical IT-supplier-getting-sports-team marketing guff. But then we realized it was actually about the mobile edge, ROBO (remote office/branch office) setups, and public cloud data sharing.

The race car is literally a mobile edge device. It generates data from onboard system sensors and sends it wirelessly to the race team’s remote office – its trackside base.

Real-time decisions have to be made at the trackside to help the car win the race and deal with situations such as overtakes. The driver cannot analyze the car’s overall situation in detail, but the trackside people can if they have access to reference information in a database.

Blocks & Files diagram.

In the NetApp-Porsche setup this information is stored in a cloud NAS: Cloud Volumes ONTAP. The trackside base sees part of this in its local edge device: a Windows server running NetApp’s Global File Cache (GFC). This caches “active” data in distributed offices – the trackside base in this example – so it can be used in collaboration situations needing fast local access. It features a global namespace with real-time central file locking.

The software creates a Virtual File Share at the ROBO location that looks and feels like a traditional file share. It presents centrally provisioned file shares in real time, while the data itself is stored centrally on one or more file shares in Cloud Volumes ONTAP.
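As an illustration only – none of this is NetApp code, and the class and method names are invented – the pattern GFC implements looks roughly like a read-through edge cache backed by a central store, with a central lock so only one site modifies a file at a time:

    import threading

    class CentralStore:
        """Stands in for the central Cloud Volumes file store."""
        def __init__(self):
            self._files, self._locks = {}, {}
        def lock(self, path):
            return self._locks.setdefault(path, threading.Lock())
        def read(self, path):
            return self._files.get(path, b"")
        def write(self, path, data):
            self._files[path] = data

    class EdgeCache:
        """Stands in for the trackside edge instance."""
        def __init__(self, central):
            self.central, self.cache = central, {}
        def read(self, path):
            if path not in self.cache:            # miss: fetch across the WAN
                self.cache[path] = self.central.read(path)
            return self.cache[path]               # hit: served locally
        def write(self, path, data):
            with self.central.lock(path):         # central file lock: one writer at a time
                self.central.write(path, data)
                self.cache[path] = data

    central = CentralStore()
    trackside = EdgeCache(central)
    trackside.write("/telemetry/lap1.csv", b"sensor data")
    print(trackside.read("/telemetry/lap1.csv"))  # served from the local cache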

NetApp says clients typically provision less than 1TB of cache storage in a smaller office (fewer than 50 users), while the largest offices of heavy users with large file formats may reserve up to 5TB. Active files remain persistently cached in the GFC edge instance, which saves on data movement across the WAN to Cloud Volumes ONTAP. We don’t envisage any data gravity problems with this setup.

Car-generated and other trackside data is encrypted and uploaded to Cloud Volumes ONTAP by GFC across a WAN link with compression, streaming and delta differencing (sending only changed data). We say Cloud Volumes ONTAP, but the central cloud silo could be Cloud Volumes Service or Azure NetApp Files, meaning AWS, Azure, and GCP are all supported. In Porsche’s case the Azure public cloud is used.
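Delta differencing is typically done by hashing fixed-size blocks and shipping only the blocks whose hashes have changed since the last upload. A minimal sketch of the idea (assumed 4KB blocks; this is not NetApp’s wire protocol):

    import hashlib

    BLOCK = 4096  # assumed block size, for illustration only

    def block_hashes(data: bytes):
        return [hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)]

    def delta(previous: bytes, current: bytes):
        """Return only the (block index, block bytes) pairs that changed."""
        old = block_hashes(previous)
        new = block_hashes(current)
        return [(i, current[i * BLOCK:(i + 1) * BLOCK])
                for i in range(len(new))
                if i >= len(old) or new[i] != old[i]]

    # Only the one changed tail block would cross the WAN here.
    print(len(delta(b"a" * 10_000, b"a" * 9_000 + b"b" * 1_000)))  # 1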

Data can be analyzed in the cloud and used to inform real-time (race time) decisions – such as when to use the Formula E Attack Mode, which unlocks an additional 30 kilowatts of engine power. Attack Mode engagement rules are set by the FIA, the sport’s governing body, an hour before each ePrix. Drivers can use the extra speed this gives them for a few laps when they need an edge.

Data protection (backups and archive) is done in the cloud, relieving the trackside base from that burden. A Cloud Manager product delivers centralized orchestration across this hybrid cloud storage infrastructure and data management services so admin staff can manage, monitor, and automate their data software and hardware consumption.

Staff in Porsche Motorsport HQ at Weissach also access the central cloud NAS data for post-race analysis and their general electric vehicle research and development initiatives.

Setting aside the Porsche and e-car racing glamor, what we have here are transitory ROBO sites set up (at race circuits) with data generated from far edge systems (race cars) and commands sent to them. These ROBO sites link to a cloud datacenter in a global file system which is accessed by a head office site as well. In other words, one ROBO site access period every month or so and head office access all the time. No big deal really. Cloud NAS analytics are used to inform the trackside ROBO strategy but not to operate the actual mobile edge devices – the race cars. Again, not that demanding.

Check out a Global File Cache FAQ here.

SingleStore raises $116m, preps for potential IPO

Database supplier SingleStore has raised $116 million from its third funding round in three years as it looks to continue an aggressive hiring rate and gets ready for, we think, an IPO.

The round was led by Goldman Sachs Asset Management, along with new investor Sanabil Investments and contributions from existing investors Dell Technologies Capital, GV, Hewlett Packard Enterprise (HPE), IBM Ventures, Insight Partners, and others.

SingleStore CEO Raj Verma said: “With this funding, we have raised $278 million over the last 20 months, and our valuation is upwards of $1.3 billion. We are truly at unicorn status.”

The plan is to “invest time and resources to innovate more and even faster” and keep up the pace of staff recruitment.

SingleStoreDB unifies transactional and analytic workloads and can run either in memory or in a hybrid memory-plus-storage (disk, SSD) mode. The company claims its combined transactional and analytical features obviate time-consuming extract, transform and load (ETL) procedures, enabling faster analytical results. Verma claimed in a blog: “Results show that SingleStoreDB delivers a 50 percent lower TCO against the combination of MySQL and Snowflake and a 60 percent lower TCO compared to the combination of PostgreSQL and Redshift.”

Raj Verma, SingleStore

“Our purpose is to unify and simplify modern data. We believe the future is real time, and the future demands a fast, unified and high-reliability database – all aspects in which we are strongly differentiated,” he added.

Investment money is pouring in to analytics database companies, either on-premises or in the cloud. So far this year we have seen NoSQL database DataStax raise $115 million, Dremio raise $160 million for its data lakehouse, data warehouse startup Firebolt raise $100 million, and analytics startup Imply raise $100 million. That’s a total of almost $600 million in six months. 

Heavy hints

Where do we get our IPO ideas from? Heavy hints dropped by SingleStore.

Firstly, there is a new general counsel at SingleStore, Meaghan Nelson, who has had, the company says, “prior roles in private practice taking companies such as MaxPoint Interactive, Etsy, Natera and Veeva through their IPOs.”

Nelson said: “I feel that my deep experience working closely with companies through the IPO process along with my experience in scaling G&A (General and Administrative) orgs will be of great value to SingleStore as we continue to achieve new heights.”

Brad Kinnish, SingleStore

Secondly, SingleStore has hired a new CFO, Brad Kinnish, and this appointment, along with that of Nelson, will “infuse a great depth of experience to the C-suite, making it even more equipped to explore future paths for company growth.”

The previous CFO, Manoj Jain, is now a business advisor to SingleStore and is not leaving the company. Verma told us: “Manoj wanted to step back into an advisory role, allowing for a CFO with extensive IPO experience to be added to the team, and allowing him to take a less time-intensive role so he could increase time with family.”

Verma said about the new exec hires: “I cannot emphasize enough how strategic these new leaders are for SingleStore, and how excited I am to see what we can accomplish together.”

What about timing? Kinnish said: “It’s such an exciting time in the database industry. Major forces such as the rise in cloud and the blending of operational and transactional workloads are causing a third wave of disruption in the way data is managed. SingleStore by design is a leader in the market, and I am confident we will achieve a lot in the coming year.”

ExaGrid’s regular as clockwork growth machine

ExaGrid has recorded another record-breaking revenue and bookings quarter, taking its customer count past 3,400.

This builds on its record first 2022 quarter, with ExaGrid exceeding that figure and reporting well over 20 percent year-on-year revenue growth. The company says its growth is accelerating, and it is hiring over 50 additional inside and field sales staff worldwide.

Bill Andrews, ExaGrid president and CEO, said: “ExaGrid offers the best story to fix backup storage, as well as the most unique ransomware recovery solution in the industry, with the best cost up front and over time, the fastest backup and restore performance, the ability to recover from a natural or man-made site disaster, and the ability to recover data after a ransomware attack, which remains top of mind in today’s world.”

The company supplies appliances with a non-deduplicated network-facing disk-cache Landing Zone where the most recent backups are written for fast backups and for fast restores. Its Adaptive Deduplication technology deduplicates the data into a non-network-facing repository where data is stored for longer-term retention – weeks, months and years. The combination of a non-network-facing tier (virtual air gap) plus delayed deletes with ExaGrid’s Retention Time-Lock feature, and immutable data objects, guards against the backup data being deleted or encrypted. 
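The Retention Time-Lock idea – deletes against the non-network-facing repository are only queued and executed after a lock period, so a ransomware-driven mass delete can be reversed – can be sketched roughly like this (a toy illustration; the class and parameter names are ours, not ExaGrid’s):

    import time

    class RetentionLockedRepository:
        """Toy delayed-delete policy: nothing is purged until the lock period expires."""
        def __init__(self, lock_days=10):
            self.lock_seconds = lock_days * 86_400
            self.objects = {}            # name -> deduplicated backup object
            self.pending_deletes = {}    # name -> time the delete was requested

        def store(self, name, obj):
            self.objects[name] = obj

        def request_delete(self, name):
            self.pending_deletes[name] = time.time()   # queued, not executed

        def cancel_delete(self, name):
            self.pending_deletes.pop(name, None)       # an admin can reverse a suspicious purge

        def purge_expired(self):
            now = time.time()
            for name, requested in list(self.pending_deletes.items()):
                if now - requested >= self.lock_seconds:
                    self.objects.pop(name, None)
                    del self.pending_deletes[name]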

ExaGrid has a scale-out architecture, which maintains a fixed-length backup window and, it says, eliminates disruptive and costly forklift upgrades and product obsolescence.

In the second quarter ExaGrid added 168 new customers, with 43 six- and seven-figure deals, and now has more than 3,400 active upper mid-market to large enterprise customers. It was also cash-positive for the seventh quarter in a row. We’ve tabulated the data for the past seven quarters:

The numbers of new customers and six- and seven-figure deals are holding up well, and ExaGrid is convinced there are more orders out there for the taking. That’s why it’s hiring 50 extra sales people.

We have to ask ourselves why other deduplicating backup appliance vendors with disk-based products aren’t taking note of this almost rampaging progress and doing something about it – like acquiring ExaGrid or emulating its technology. Stuff must be going on behind the scenes and something will surely be revealed in the future.

Non-legacy architecture Rubrik hoping for big win exit

I rocked up to an interview with Rubrik co-founder, chairman and CEO Bipul Sinha, armed with what I thought were challenging questions about Rubrik’s growth, an IPO, his succession and more. I was met with whip-smart rebuttals and got a lesson in VC outcomes. He swatted away my negative suggestions as easily as a Wimbledon tennis champ dismissing a no-hope opponent.

Update: Index Engines CEO comment added as a bootnote at the end of this article. 13 July 2022. Competitive takeout example added as bootnote 2; 18 July 2022.

Rubrik is a data protection, security and management company that started up in 2014 and has amassed more than $550 million in funding. It has more than 2,000 customers and made more than $2.5 billion in cumulative software sales. My first question focused on an IPO.

Blocks & Files: Is an IPO still the preferred exit?

Bipul Sinha.

Bipul Sinha: “We want to build an enduring institution that lasts longer than my professional career. And, in my mind, the long-lasting company is always a public company, because public companies attract better talent which propels the company forward. So we want to be a public company. We are watching the market. We are developing ourselves better and better, and we will be ready when the market is ready.”

When do you think the market will be ready?

Bipul Sinha: “Nobody knows that; it’s a billion dollar question.”

Sinha has started posting short LinkedIn messages that suggest, to me, he’s viewing things from a distance – he’s looking at lessons learned, as if he’s stepping back a little. 

Rubrik is eight years old, with Sinha in his post all that time, and an IPO in the next year or two looks unlikely given the world and US economic situation.

Snowflake had its IPO after eight years and with a new CEO, Frank Slootman. SpringPath was acquired by Cisco five years after it was founded. Tegile was bought by Western Digital eight years after it was started up. Nvidia bought SwiftStack nine years after its founding and Google bought Velostrata seven years after that firm was set in motion. 

The economy is suffering from the COVID pandemic, Chinese problems with consequent supply chain issues, the Ukraine war, and then there’s inflation. That’s a pretty poor hand at the moment – a pretty poor environment in which to go public.

With this background and the reflective LinkedIn posts I wondered if Bipul was thinking about stepping back from the CEO role.

Assuming it takes two years for the current issues to get worked out, and the environment becomes a little more positive for running an IPO, that will be ten years for you to be in charge of Rubrik. Have you thought about succession planning? Would it be conceivable that you might not be running Rubrik when it goes public?

Bipul Sinha: “The thing is that my focus is to continue to build the company. And as long as VC and the board believe that the company has the right leader, I’ll continue to go on.”

It strikes me though that ten years is a long time to be in charge.

Bipul Sinha: “That’s not true. I mean, if you look at founder-led companies, from Fortinet to Oracle, to even other iconic founder-led companies, these companies run for long time with founders. Bill Gates was the CEO of Microsoft for 20 years.

“The thing is that, again, we want to build a Rubrik into a long-lasting, enduring institution. That’s the goal.”

When talking to companies such as Veeam, the backup market seems pretty immune to competitive takeouts. Backup software is very sticky and growth comes in two ways: increasing data volumes from existing customers and protecting new, greenfield workloads. Is this how Rubrik sees the market?

Bipul Sinha: “I will give you one data point: in the last six and a half years we have sold close to $2.5 billion worth of product cumulatively – and our growth rate, as you know, is extremely high. So, if the data protection market is growing at a low single digit or something around that, we will not be growing this fast unless we are replacing. And we are replacing legacy backup and recovery with the Rubrik data security solution.”

Would you see Rubrik acquiring other data protection companies?

Bipul Sinha: “Why would we acquire a legacy solution when we have the best in class solution?”

There’s a company called Index Engines, and it has a technology which Dell sells as CyberSense. It can do full content scanning of backup data sets, and the resulting index is used for detecting possible ransomware attack patterns. So this is new technology. I think Index Engines is a relatively new company. It’s not a legacy company.

Bipul Sinha: “I’m not going to comment on Index Engines per se, but I’ll give you a very interesting perspective. … Index Engines works on a storage interface to scan so it works on a legacy architecture. So it is fundamentally incompatible with our view of the world. Any company that says that, ‘hey, I can take new technology to scan my data’ is definitely selling storage. We are not. … They are all legacy technology. What we have done is built our single software that combines metadata and data together. We have built our own AI engine inside it.

“All the companies that is claiming ‘here is a marketplace; you can take random third-party software to scan your data’ means that they’re selling storage – they’re selling a legacy full-cost solution. Thinking about it: having third-party software running on your backup data will create a supply chain risk. … Our architecture is a zero trust architecture, which is completely different from the legacy.”

What you say to customers is ‘Hello, your technology or your existing data protection data security backup set is not fit for purpose, you should use ours.’ That’s a difficult sell.

Bipul Sinha: “Here’s the thing. If we look at Rubrik’s success, we are able to make that sale and use it.  … We are replacing legacy with a modern data security solution.”

Rubrik is winning in competitive takeouts?

Bipul Sinha: “Yes, we have many case studies showing this. (See bootnote 2.)

“We have to change our thinking around this market. This market was built and conceived 20–30 years ago and legacy backup software writes to a storage target. It’s a legacy architecture, which was built for once-in-a-while disaster, human error, things like that. And now it is cyber disasters. There is a ransomware attack every 11 seconds. This legacy architecture is woefully inadequate, and they are trying to apply band-aid after band-aid – like bringing software to scan your data for cybersecurity. But the issue is that the fundamental architecture is broken. Because as soon as you have backup software separate from the storage, you are at risk.

Blocks & Files diagram.

“Rubrik has a single software that does both. And that’s why it’s a zero trust architecture.”

Rubrik does write to external storage though.

Bipul Sinha: “What we say is that you keep your data for 30 days, 60, 90 days in Rubrik, because that’s the ransomware interval, and you recover faster on premises, or whatever, wherever. And then, once you’ve gone past the 60–90 days interval, then you write out the data into cheap and deeper storage, because that’s not a ransomware risk or vulnerable data anymore.

“When we write to external storage, we write only for archiving purposes – long-term archival. Which is a different business than the nearline backup. So the main difference is the nearline backup, in the legacy architecture it sits here (in external storage). In the Rubrik architecture, the nearline backup sits here, in direct-attached storage. 

“And then it is a scale out, right? It’s not limited to one server. You can go to hundreds of servers.”

VC outcomes

We finished up by talking about VC investment outcomes. Sinha drew a six-box diagram with two rows: one for the right investment and one for a wrong investment.

Blocks & Files diagram with our example suppliers, not Rubrik’s.

There are two types of investment. One is into a singular company with unique technology and no direct competitors with that technology. The other is into a company with a variation on technology shared by other suppliers.  

A big outcome follows an investment in a company with singular technology which grows to dominate its market. Snowflake is a good example. Where a VC invests in a company with no real technology advantage the outcome will be small, because it is effectively shared between several companies. 

Our B&F diagram suggests MayaData, bought by DataCore, was in this category because it was one of many startups providing Kubernetes infrastructure services for cloud-native developers – for example Kasten, Robin.IO, Ondat and Portworx.

Sinha would put Rubrik in the right investment, singular technology box, because no other data protection and security supplier has a unified backup, storage and security architecture. Everyone else uses a legacy architecture with separate storage services accessed across a network, which introduces risk.

That’s his pitch, in a nutshell. He’s sticking to it, and hoping for a “win big outcome” – a great pay day. 

Bootnote 1

Tim Williams, founder, investor and CEO of Index Engines, told us: “Index Engines indexes data in backup formats. There is no technical reason why we couldn’t index Rubrik backup data. We aren’t limited to legacy architectures and neither is Dell. … customers have very high-performance expectations for our analysis, which means we need to qualify every backup target we support. We haven’t qualified Rubrik as a backup target to date because we are 100 percent channel, and our channel hasn’t asked us to. That’s the only reason why we don’t support Rubrik.”

Bootnote 2

Colchester Institute’s IT services manager, Ben Lewis, said: “Being a long-term Veeam customer, we needed a full review to make an informed decision. The review concluded with Rubrik being the top choice due to its overall protection and recovery from cyber breaches. Our choice to move away from Veeam and purchase Rubrik was the best decision we could have made.” In April 2021, Colchester fell victim to a ransomware attack. With Rubrik’s data protection and rapid recovery capability, Colchester was able to recover 100 percent of its data without having to pay the ransom.

HCI player Scale Computing bags $55m funding

Hyperconverged infrastructure appliance vendor Scale Computing has raised $55 million to grow its Internet-edge business.

Scale Computing is virtually the last specialist HCI start-up standing in a heavily consolidated market that is dominated by mainstream players Dell EMC, HPE and Cisco.

The funding round was led by a group of funds managed by Morgan Stanley Expansion Capital and takes Scale’s total funding to $159 million. Scale’s prior injection of capital was a $34.8 million G-round in 2018.

Scale has previously launched a series of products for the edge, such as the cigarette box-sized HCI edge appliance, the HE 150, and has built a growing customer base around its Scale Computing Platform. It is now getting funding to, er, scale that business faster.

Jeff Ready.

CEO and co-founder Jeff Ready said: “Data is moving to the edge twice as fast as it moved to the cloud. Management of the edge is an inverted problem from management of the data center. A typical data center deployment represents hundreds or thousands of servers at one or two locations. On the other hand, a typical edge deployment is a handful of servers each at hundreds or thousands of locations. This requires a completely different approach to deployment and management.”

Scale says the SC//Platform uses autonomous, self-healing technology that enables remote edge management of applications and systems at scale, keeps applications running as errors happen, and uses machine intelligence rather than human administrators. The SC//HyperCore software delivers high-availability remote, on-premises edge computing with disaster recovery.

Pete D. Chung, Managing Director and Head of Morgan Stanley Expansion Capital, said: “The technological advantage of Scale Computing’s edge computing platform solves endemic customer problems through enhanced resiliency, manageability and efficacy of their IT infrastructures. We are thrilled to help the company build upon their success with this funding.” 

Scale will spend the cash on investments in people and R&D, and restructure its debt facilities. It says it wants to expand the capabilities of its edge computing, virtualization, and hyperconverged products.

Catalogic upgrades ransomware, hypervisor protection

Data protection supplier Catalogic has updated its software to provide ransomware pattern matching and to protect Microsoft 365 and more hypervisor workloads.

Update: Catalogic ProLion situation explained and GuardMode scanning information added. 13 July 2022.

Catalogic was a copy data management supplier (its ECX product) and an endpoint and server data protector (its DPX product), but it sold its ECX products to IBM in May 2021 to concentrate fully on data protection. It has CloudCasa software to protect containers and the DPX product to protect more traditional workloads. DPX v4.8 introduces GuardMode for earlier ransomware detection and vPlus to look after Microsoft 365 and various hypervisor environments.

Sathya Sankaran, Catalogic

Sathya Sankaran, Catalogic COO, said: “DPX GuardMode changes a backup team’s cyber resilience posture from reactive to proactive with early detection. [It] notifies backup and storage teams of suspicious activity and pinpoints the extent of damage caused by cyber incidents.”

How does it do this? We’re told GuardMode maintains and updates over 4,000 known ransomware threat patterns, and assesses affected files. It also monitors file shares and file system behavior, locally and over the network, as well as relying on specific binary fingerprints (ransomware patterns). Affected files can be recovered by rolling back to clean versions in DPX’s backup store.
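Pattern-based detection of this kind generally boils down to matching file names seen on monitored shares against a list of known ransomware extensions and ransom-note names. A minimal sketch (the patterns below are a tiny illustrative sample, not Catalogic’s list):

    import fnmatch

    # Illustrative sample only; the real lists run to thousands of entries.
    KNOWN_RANSOMWARE_PATTERNS = ["*.locky", "*.wncry", "*decrypt_instructions*.txt"]

    def flag_suspicious(paths):
        """Return the files whose names match a known ransomware pattern."""
        return [p for p in paths
                if any(fnmatch.fnmatch(p.lower(), pat) for pat in KNOWN_RANSOMWARE_PATTERNS)]

    print(flag_suspicious(["report.docx", "report.docx.locky", "notes.txt"]))
    # ['report.docx.locky']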

This follows on from Dell announcing its CyberSense ransomware detection capability a few days ago.

Catalogic had a relationship with ProLion, dating from 2019, for its CryptoSpike offering to protect NetApp ONTAP environments. Through this it gained access to a Block List that includes thousands of ransomware file endings or names, with daily list updates from a CryptoSpike server. However, this relationship is no longer in place. A Catalogic spokesperson said: “Catalogic started the ProLion relationship around 2019 but it ended as of May 2022 (2 year relationship), although [it is] still selling support renewals. Catalogic ended the relationship in part due to not focusing on NetApp as a key partner anymore (it does not sell ECX anymore to storage vendors), and of course, now with the focus on DPX with GuardMode, and vPlus (Storware 5.0 product) and Catalogic’s CloudCasa.”

Catalogic product set slide from 2019 or so, showing CryptoSpike and two Storware products

Catalogic told us: “For DPX GuardMode, which is Windows-only in its initial release, we use the File Server Resource Manager (FSRM) to do the file-level scanning to find compromised files. FSRM uses a third-party, community-sourced list of filters or signatures that is updated daily.”

The vPlus addition provides data protection for Microsoft 365, and other virtualization platforms such as RHV/oVirt, Acropolis, XenServer, Oracle VM, and KVM. Catalogic CEO Ken Barth said: “We are excited to extend our relationship with Storware and announce DPX vPlus, that adds Microsoft 365 cloud data protection and expands our coverage of hypervisor workloads. DPX vPlus is fully integrated into the DPX vStor backup repository, and it delivers greater workload coverage for an organization’s edge and cloud data.”

vPlus is not Catalogic’s own software. In fact, Catalogic partnered with Polish data protection software company Storware in 2018 after previously reselling its products. It took an equity stake in Storware, entered into an exclusive distribution agreement to offer Storware data protection products in the North American market, and gained an exclusive right to promote Storware products for potential OEM signings with North American companies.

The agreement covered two Storware products: vProtect and KODO. vProtect is an enterprise backup solution for virtual environments that secures virtual machines running on Citrix XenServer, Xen, Nutanix Acropolis, RHEV, oVirt, KVM, KVM for IBMz, Proxmox, and Oracle VM. KODO is data protection software for Windows and macOS systems (desktops and laptops), mobile devices (iOS, Android), and SaaS platforms (Office 365, Box).

Clearly, DPX vPlus is a combination of Storware vProtect and KODO. 

Catalogic is announcing its ransomware detection and M365/multi-hypervisor table stakes in the data protection market casino. That they are table stakes means they should be welcomed by Catalogic’s customers.

TRIE

TRIE – four letters taken from the word reTRIEval. A TRIE is a digital tree structure also known as a Prefix or Radix tree. This is an extremely fast and compact data structure where a node’s key is encoded in the node’s position in the tree. TRIEs were invented by Edward Fredkin at MIT in 1960, and generated renewed interest in the 2000s when employed by Google for its autocomplete feature. This is the mechanism Google uses when a user enters a word into a search box, and Google automatically starts to complete the search string. It requires searching a web-scale index as fast as the user can type. Infinidat adapted the structure for storage virtualization, specifically for providing extremely efficient and high-speed mappings between virtualization address layers. (Definition from Mainline Information Systems.)
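For readers who want to see the structure concretely, here is a minimal prefix-tree sketch with the prefix lookup that drives autocomplete. It is a generic illustration, not Google’s or Infinidat’s implementation:

    class TrieNode:
        def __init__(self):
            self.children = {}    # next character -> child node
            self.is_word = False  # marks the end of a stored key

    class Trie:
        def __init__(self):
            self.root = TrieNode()

        def insert(self, word):
            node = self.root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.is_word = True

        def starts_with(self, prefix):
            """Return every stored key beginning with prefix – the autocomplete case."""
            node = self.root
            for ch in prefix:
                if ch not in node.children:
                    return []
                node = node.children[ch]
            results = []
            def walk(n, suffix):
                if n.is_word:
                    results.append(prefix + suffix)
                for ch, child in n.children.items():
                    walk(child, suffix + ch)
            walk(node, "")
            return results

    t = Trie()
    for word in ("retrieval", "retrieve", "radix", "trie"):
        t.insert(word)
    print(t.starts_with("ret"))  # ['retrieval', 'retrieve']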

Tape is a low-cost and low-carbon storage winner

Measuring tape

Consultant Brad Johns thinks moving archive data from an all-disk setup to a 60:40 tape:disk split could cut its ten-year CO2 emissions by almost 60 percent and its total cost of ownership by almost half.

Update: Critical point of view about tape infrastructure exclusion from tape numbers added. 12 July 2022.

Johns presented his findings at the Fujifilm 12th Annual Global IT Executive Summit in San Diego, June 22–25. He worked out the CO2 emissions and TCO numbers using three scenarios for storing 100PB of data for ten years: all disk storage, 60 percent tape and 40 percent disk, and all tape. The numbers were based on figures provided by Seagate for its 18TB Exos nearline disk drive and by Fujifilm for its LTO-9 tape cartridge with 18TB of raw capacity.

His worked-out CO2 numbers include only the estimated CO2 impact of the storage media. They exclude controllers, networking, tape libraries, tape drives (except energy), cooling and electrical systems. A chart shows the CO2 emission findings:

The 60:40 tape:disk setup puts out 1,134 tons of CO2, whereas an all-disk equivalent emits 2,663 tons – a reduction of roughly 57 percent. An all-tape system outputs 79 tons – 97 percent less than the all-disk configuration.

Supercomputing expert Glenn Lockwood tweeted about the exclusion of controllers, networking, tape libraries, etc: “Disingenuous; tape has a much higher barrier to entry (robotics, drives, etc) than HDD which wasn’t included. Nor was cooling; can’t run tape without climate control (dew point), but you can run disk.”

The Fujifilm and Johns thinking is that CO2 emission data will play an increasingly important part in customers’ IT purchase decisions as ESG (Environmental, Social and Governance) reporting becomes more widespread and organizations will want to demonstrate progress against ESG goals.

The cost findings are quite persuasive as well (the percentage arithmetic is sketched after the list):

  • All disk – $17,707,468
  • 60:40 tape:disk – $9,476,339 (46 percent less than all-disk)
  • All tape – $3,832,956 (78 percent less than all-disk)
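The percentage savings quoted above follow directly from the raw figures; a quick check in code:

    # Derive the quoted savings from the raw figures (tons of CO2; ten-year TCO in $).
    co2 = {"all disk": 2663, "60:40 tape:disk": 1134, "all tape": 79}
    tco = {"all disk": 17_707_468, "60:40 tape:disk": 9_476_339, "all tape": 3_832_956}
    for label, figures in (("CO2", co2), ("TCO", tco)):
        baseline = figures["all disk"]
        for scenario, value in figures.items():
            if scenario != "all disk":
                print(f"{label}: {scenario} is {1 - value / baseline:.0%} lower than all disk")
    # CO2: 57% and 97% lower; TCO: 46% and 78% lower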

Johns calculates that moving 60 percent of the world’s HDD-resident data to tape would save over 72 million tons of CO2e. Much of the data currently on hard disk drives is, he says, cold data and could be more cost-effectively stored on tape where it would emit less CO2 as well.

We could ask if storing all this data is worthwhile in the first place, as it costs so much money and contributes to global warming. Suppliers of storage media and their contracted consultants don’t ask such questions, focusing instead on the merits of their respective media. In this case it’s tape good and disk bad.

A Fujifilm and IBM-sponsored white paper written by Johns states: “Given the focus on sustainability and the large volumes of storage devices required to store the growing quantities of data in the coming years, organizations have an opportunity to reduce their carbon footprint, improve sustainability and reduce expenses by migrating less frequently accessed (cold data) from shared disk drive (HDD) based storage to modern tape storage.”

Note the snarky use of the term “modern” here by Johns in describing tape storage. Is he trying to say that disk drive storage is not modern? That would be a ridiculous stance to take – an 18TB disk drive is every bit as modern as an 18TB tape cartridge.

Zettabyte era brings archiving front and center

Veteran storage analyst Fred Moore told a Fujifilm event audience that we’re entering a zettabyte era of data storage and that archival storage will be the dominant technology by capacity.

Fred Moore.

The four-day Fujifilm 12th Annual Global IT Executive Summit took place in San Diego, June 22–25, and Moore, the president of Horison Information Strategies, presented there. He said that, by 2025, a projected 175ZB of data will be created, with ~11.7ZB actually stored. As an illustration, 1ZB – 1×10²¹ bytes, or a sextillion bytes – would fill 55.36 million LTO-9 (18TB) cartridges or 50 million 20TB HDDs.

Moore said 80 percent of the 11.7ZB would be archived – roughly 9.3ZB – with active archiving becoming a de facto standard. Data moves into archival status when it is 90 to 120 days old and its probability of access sinks below 0.5 percent. Archival data has a general compound annual growth rate between 25 percent and 35 percent, and it is being amassed faster than it can be analyzed.

Moore believes this archival data pressure will prompt the arrival of a new storage tiering model:

His graphic shows a four-tier model with two archive tiers: a front-end active archive using disk and tape, and a back-end using tape, which will have the slowest data access speed. Data access speed, storage cost and total cost of ownership (TCO) all go down as we move down the tiers. 

But a subsequent chart shows an extra, third, archive tier – a deep archive:

His chart shows the 80 percent of archive data divided between the three tiers with the individual percentages adding up to 80 percent. We calculate that the actual data split between the three archive tiers is about 19 percent in the active archive front end, around 75 percent in the standard archive and six percent in the deep archive, where the coldest data of all is stored, for up to 100 years or more.
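Our split works out as follows, assuming the chart’s three archive tiers hold roughly 15, 60 and 5 percent of all stored data – the figures that sum to the 80 percent archive total (the per-tier shares are our reading of the chart, not Moore’s stated numbers):

    # Assumed per-tier shares of all stored data, read off the chart; they sum to 80%.
    tiers = {"active archive": 15, "standard archive": 60, "deep archive": 5}
    total_archive = sum(tiers.values())
    for name, share in tiers.items():
        print(f"{name}: {share / total_archive:.1%} of the archived data")
    # active archive: 18.8%, standard archive: 75.0%, deep archive: 6.2%
    # i.e. roughly the 19, 75 and 6 percent quoted above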

The active archive uses SSDs for instant retrieval and nearline disks for slightly slower access. The standard archive tier will be a tape library. The deep and very long-lasting archive will use tape again or new technology – such as DNA storage, glass/photonics, 3D holography or some other emerging tech.

The standard access format for archive data will be, Moore thinks, S3 object storage, with file formats used for some data types.

Tape, Moore says, uses much less energy and has a much lower carbon footprint than HDDs – around 85 percent lower. Unlike with disk drives, you can add tape capacity (more cartridges) without increasing energy consumption, and tape has a clearer roadmap to higher capacities than disk.

In other words, for standard archives, for capacity, cost and environmental reasons, tape is king. It rules and will do so for a long time.

WANdisco contract wins drive growth sky high

Replicator WANdisco is reporting massive growth – albeit from a relatively low base – in the first half of 2022, following a flurry of contracts wins.

The company’s LiveData Migrator (LDM) software replicates so-called live data, typically from an on-premises source to the public cloud. It’s live data because the source data set is being updated while replication is taking place. WANdisco’s technology manages this so there is no downtime and updates are replicated as well. LiveData Migrator for Azure (LDMA) is a native Azure service that enables users to migrate petabyte-scale Hadoop data and Hive metadata to the Azure cloud with the promise of no loss of data consistency.
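Conceptually, live migration of this kind combines an initial bulk copy with a log of changes made while that copy is running, which is then replayed so the target converges without taking the source offline. A rough sketch of the idea (our illustration, not WANdisco’s implementation):

    # Toy model: the source keeps serving writes during migration; every write made
    # while the bulk copy runs is also captured in change_log and replayed afterwards.
    def live_migrate(snapshot: dict, target: dict, change_log: list):
        for key, value in snapshot.items():   # phase 1: bulk copy of the point-in-time snapshot
            target[key] = value
        for key, value in change_log:         # phase 2: replay the mid-migration updates
            target[key] = value

    source = {"table/a": 1, "table/b": 2}
    snapshot = dict(source)                   # bulk copy works from a snapshot
    source["table/c"] = 3                     # a write lands while the copy is in flight...
    change_log = [("table/c", 3)]             # ...and is captured in the change log
    target = {}
    live_migrate(snapshot, target, change_log)
    print(target == source)                   # True: target converged, no source downtime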

Dave Richards, WANdisco

WANdisco chairman and CEO Dave Richards said: “Following the general availability of our LDM and LDMA products, the first half of this year has been a transformational period.

“Looking ahead, the key task for us is to remain focused on converting our robust pipeline of opportunities. Many of these opportunities are being driven by the explosion in IoT use cases, and we remain confident in our ability to deliver continued strong trading through 2022 and beyond.”

This is being helped by partnerships with Snowflake and Databricks.

The company reported revenues of $27.3 million for the first half of 2022 ended June 30, compared to the year-ago period’s $2.1 million. RPO (Remaining Performance Obligations) at the end of the first half is expected to be approximately $31 million, up significantly from last year’s $3.5 million.

WANdisco had cash of approximately $32.5 million and $13.1 million in trade receivables at the end of the first half.

We have recorded multiple contract wins by WANdisco so far this year:

  • April – Signed a contract worth $213,000 with a PC vendor for the LiveData Migrator (LDM) to migrate a subset of data from the existing Hadoop environment to cloud-native systems that can be run in the public cloud 
  • April – Agreed to a pre-pay license deal with Oracle worth $150,000. This was an initial rollout across a couple of Oracle customers
  • April – Signed a Commit-to-Consume contract worth $720,000 with an existing customer, a top 10 global retailer

Up until now WANdisco has not generally met the promise of its technology, recording revenues that haven’t sparkled. For example, 2021 revenues of $7.3 million were down from 2020’s $10.5 million. At the time Richards said: “With our recent contract wins, unique set of solutions and high visibility of near-term pipeline, we remain confident in our ability to significantly improve results in FY22.” How right he was.

Data protection: a mature market with fabulous growth prospects for Veeam

Veeam CTO Danny Allan

Data protection is a mature market with a tremendously sticky product and wonderful growth prospects.

This was my conclusion after talking with Veeam CTO Danny Allan at a UK VeeamON Tour event. The reasoning starts from the point of view that, unlike many areas of IT, there has been no consolidation of data protection to a few large suppliers – such as has happened in SSDs, DRAM, server CPUs, disk drives, servers, operating systems, hyperconverged systems and so forth.

Danny Allan.

Backup software is a kind of digital superglue. That’s mostly because a backup data set has a proprietary format and it’s impractical to convert from one supplier’s backup format to another – from Veeam to Veritas NetBackup say, or IBM TSM to Commvault. If customers do need to convert a backup dataset, Allan said that, in his experience: “They typically pay a third party to do it.”

The process is generally primitive and time-consuming, in that there is no direct conversion from one supplier’s format to another. Instead, the source backup data has to be rehydrated – restored – and then backed up again using the new supplier’s software. Generally, Allan said, a customer who has changed a backup supplier for a workload will retain the old backup software and its data set and let it age out. At that point it can be deleted and the old backup software discarded.

With this in mind it’s clear that a backup supplier has to foul up pretty badly for a customer to replace it with another supplier to protect the same workload.

The second realization is that there are virtually no greenfield data protection prospects any more. Any business or organization with an IT department has data protection arrangements in place already. The only new business that comes into being is when an existing customer develops new workloads which have no protection arrangements. By default the existing data protection supplier (or its channel partner) gets first crack at protecting the new workload. If the existing backup supplier can’t protect it – either not at all or not well enough – then a competing vendor can get a look in.

With this background of precious few new customers and virtually no competitive take-out opportunities, it’s clear that data protection has mature market characteristics. But it also has high-growth characteristics – because data volumes are growing, and new workloads need protecting.

There’s no need to justify the fact that data growth exists; the signs are everywhere. Businesses are collecting and analyzing more data, both for existing workloads and for new ones like machine learning and genomics, video production and surveillance and so forth.

All backup suppliers will generally charge by the amount of data protected – so if it goes up, so too does the vendor’s revenue. Another massive source of new data to be protected is the adoption of new workloads.

Allan said: “We’re on a decade cadence.” Each decade has its solidifying technology and its emerging technology. Ten years or so ago the solidifying technology was server virtualization and the emerging technology was cloud. Before that virtualization was the emerging technology and physical server workloads represented the solidifying technology.

Now we are in a decade with cloud becoming a solidifying technology and Kubernetes representing the emerging technology. That’s why Veeam bought Kasten two years ago – to acquire and then develop its ability to protect Kubernetes workloads. It will protect them wherever they exist: on-premises and in the public clouds.

Allan said Veeam is growing “more than 20 percent a year” and cited an IDC report putting Veeam in joint first place in the data protection market with Dell Technologies. IDC said Veeam’s first half 2021 revenues were $647.17 million, up 14.8 percent on an annual basis, and giving it a $1.29 billion annual run rate. Dell revenues for the same period were $665.46 million, down 10.9 percent year-on-year, giving it a $1.33 billion run rate. Extrapolate the growth/decline rates to the first half for this year and you arrive at Veeam in first place with Dell second. 
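The extrapolation is straightforward arithmetic on IDC’s reported figures, assuming both vendors’ growth rates hold for another year (our sketch):

    # Project first-half 2022 revenues from IDC's first-half 2021 figures ($m),
    # assuming the same growth/decline rates hold.
    veeam_h1_2021, veeam_growth = 647.17, 0.148
    dell_h1_2021, dell_decline = 665.46, -0.109
    print(f"Veeam H1 2022 estimate: ${veeam_h1_2021 * (1 + veeam_growth):.0f}m")  # ~$743m
    print(f"Dell H1 2022 estimate:  ${dell_h1_2021 * (1 + dell_decline):.0f}m")   # ~$593m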

Allan seemed confident that Veeam would achieve the top spot. If that happens then, we think, Veeam might well look favorably on running an IPO. That would be a coup for fresh CEO Anand Eswaran and likely what he was brought in to achieve.