
Hammerspace using Vcinity tech as remote data delivery and access pump

Large-scale data orchestrator Hammerspace is using Vcinity’s high-speed remote data movement technology to overcome data gravity.

Vcinity’s software moves chunks of data using remote direct memory access (RDMA) inside IP packets, enabling real-time compute on data thousands of miles away. It moves data over WAN connections at more than 90 percent of the link’s sustained bandwidth. Hammerspace pitches its global parallel filesystem as enabling data to be located and accessed inside a global namespace as if it were local.

However, accessing data in large files across continental and intercontinental WAN distances still incurs latency and network transmission time penalties.
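
To put those penalties in rough numbers, here is a back-of-the-envelope sketch; the link speed, file size, and distance are illustrative assumptions, not figures from either vendor:

```python
# Rough WAN transfer-time and latency arithmetic (all inputs are assumptions).
LINK_GBPS = 10        # assumed WAN link speed in Gbit/s
EFFICIENCY = 0.90     # Vcinity claims >90 percent of sustained link bandwidth
FILE_TB = 1           # assumed file size
DISTANCE_KM = 4000    # roughly a continental span

transfer_seconds = (FILE_TB * 8e12) / (LINK_GBPS * 1e9 * EFFICIENCY)
one_way_latency_ms = DISTANCE_KM / 200_000 * 1000  # light in fibre: ~200,000 km/s

print(f"Bulk transfer: ~{transfer_seconds / 60:.0f} minutes")    # ~15 minutes
print(f"One-way latency: ~{one_way_latency_ms:.0f} ms per hop")  # ~20 ms
```

Bulk movement is bandwidth-bound while chatty file access is latency-bound, which is why fast point-to-point movement and remote in-place access are treated as distinct problems.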

Tony Asaro

Talking about testing the combined technologies, Tony Asaro, SVP of strategy and business development at Hammerspace, told B&F:

“Hammerspace performed its data-in-place assimilation, which is normally done locally, over a wide area network (simulated) by leveraging Vcinity. This enables customers and service providers to easily orchestrate third-party file data from remote locations to centralized resources such as GPU farms.”

Vcinity chairman and CEO Harry Carr said in a statement: “Hammerspace immediately understood that the joint solution with Vcinity is a huge win for customers. In an AI/ML world, Hammerspace’s ability to unify, orchestrate, and automate global unstructured data is complemented by Vcinity’s unparalleled performance for moving or accessing data. A global namespace and immediate access to data truly enables the next data cycle.”

Harry Carr

Two years ago we wrote that Vcinity provides two gateway devices, appliances or virtual machines, at either end of a wide area link. There can be from three to eight network links between the two Vcinity devices, with data striped across them. The devices run on Linux and implement IBM’s Spectrum Scale (now Storage Scale) parallel file system. A client system mounts its Vcinity device as an NFS or SMB file system and reads or writes data to/from the remote file share via its inline Vcinity device.
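
As a rough illustration of the striping idea – not Vcinity’s actual implementation, and with the chunk size and link count chosen arbitrarily – data can be cut into chunks and dealt round-robin across the parallel links:

```python
# Toy round-robin striping sketch; chunk size and link count are assumptions.
CHUNK_SIZE = 1 << 20  # 1 MiB

def stripe(data: bytes, num_links: int) -> list[list[bytes]]:
    """Deal fixed-size chunks across num_links parallel WAN connections."""
    queues = [[] for _ in range(num_links)]
    for index, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
        queues[index % num_links].append(data[offset:offset + CHUNK_SIZE])
    return queues

queues = stripe(b"x" * (10 << 20), num_links=4)
print([sum(len(chunk) for chunk in q) for q in queues])  # bytes queued per link
```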

Vcinity says its combination with Hammerspace’s technology enables customers to make their distributed data accessible to compute, regardless of geographic distance. It establishes a continuous data pipeline into the Hammerspace environment. Customers can choose to extend the capabilities of their Hammerspace environment to wherever their data is located, with the option to either move data hyper-fast from point to point (such as into your Hammerspace environment, or across storage locations within it) or instantly access data in place from a remote location (edge, core, or cloud).

Hammerspace’s ability to unify, orchestrate, and automate global unstructured data (with HA and scalability) is complemented by Vcinity’s performance for moving or accessing data.

We understand that Hammerspace users don’t interface directly with Vcinity, as the Vcinity tech runs “under the hood.”

The Hammerspace/Vcinity tech combo is suited for AI/ML workloads, distributed data center setups, supporting remote workforces, and edge compute. Asaro said: “Together, we revolutionize high-performance AI/ML workloads by enabling businesses to orchestrate their distributed data to their centralized GPU resources.”

Pure Storage gets new Cisco FlashStack for AI certification added to Nvidia BasePOD validation

Pure Storage says it has new Cisco FlashStack configs complementing its Nvidia BasePOD reference architecture certification, burnishing its AI storage street cred.

Cisco’s FlashStack for AI is a set of Cisco Validated Designs (CVDs) for AI systems using Cisco UCS server and networking components, along with Nvidia GPUs in the servers and Pure FlashBlade storage arrays.

Nvidia’s DGX BasePOD is a reference architecture used to build AI systems with Nvidia A100 and H100 GPUs, ConnectX network interface cards, QM and SN series switches, and third-party storage. BasePOD is smaller than Nvidia’s SuperPOD, and Pure’s AIRI AI system is built on the BasePOD reference architecture and uses the latest FlashBlade//S array.

A statement from Tony Paikeday, Senior Director, DGX platform at Nvidia, said: “Businesses across industries are leveraging generative AI to enrich customer experiences and transform operational efficiency. With NVIDIA DGX BasePOD certification, Pure Storage can help customers simplify and scale their initiatives, speeding their return on investment with AI-powered insights.”

Integration layers for DGX BasePOD

DGX BasePOD system Architecture with A100s

There is a rush by businesses to integrate AI across their organizations to drive real-time decision making, operational efficiency, and enhanced scalability. They will struggle, Pure says, unless they have the right storage infrastructure in place. Its storage infrastructure can support, it says, both enterprise development of large-scale AI training environments and the deployment of large language models for AI inference.

Pure says it supports over 100 customers across many AI use cases, including self-driving cars, financial services, genomics, gaming, manufacturing, and more. It wants us to know that it long anticipated the rising demand for AI, delivering an efficient, high-performance, container-ready data storage platform to fuel the most advanced enterprise AI initiatives.

We should not, Blocks & Files imagines, forget this when reading about VAST Data and CoreWeave and Lambda Labs, IBM’s AI-focused Storage Scale 6000, or DDN and its AI experiences.

Pure commissioned a report from Wakefield Research, “Drivers of Change: Meeting the Energy and Data Challenges of AI Adoption,” that says AI-using organizations are using more electrical energy for AI than anticipated and this could set back their ESG goals. Pure, of course, can help. Read the report here.

Storage news ticker – 13 November

Data protector Acronis launched its MSP Academy, an educational initiative promising to help managed service providers (MSPs) with business and technological knowledge, skills, and tools. It covers various topics, including starting an MSP, running a successful MSP business, marketing an MSP business, and optimizing the efficiency and productivity of MSP technicians. More information here.

Data connectivity supplier CData has introduced its GPT-powered AI Generator to the CData Connect Cloud. It has a text-to-SQL capability converting everyday language into dynamic SQL queries spanning multiple data sources. CData says it eliminates the need for intricate technical knowledge, simplifying and streamlining data connectivity across an organization.

The Fibre Channel Industry Association (FCIA) will run a live webcast on November 30, 2023, “NVMe over FC Deep Dive in Protocol, Architecture and Use Cases,” in which experts will cover:

  • The architecture of NVMe/FC at the protocol level
  • Building blocks of NVMe, such as the NVMe subsystem, NVMe controllers, and namespaces
  • An overview of the FC-NVMe T11 standards
  • Key advantages of NVMe/FC, including application use cases

Register here.

Brian Householder

CyberSense data integrity scanner supplier Index Engines says Brian Householder has joined its advisory board. Householder led the growth and transformation of Hitachi Data Systems into Hitachi Vantara, holding the positions of CEO, president, and COO over his final seven years with the organization. He was involved with all facets of this multibillion-dollar global business, transitioning it from a hardware-centric vendor to a data software and solutions company. CyberSense uses AI-driven machine learning models and over 200 full-content analytics to detect signs of ransomware corruption, claiming 99.5 percent accuracy.

Gaming product business NOVOMATIC Italia has upgraded its legacy storage infrastructure to Infinidat’s InfiniBox. Installed in mid-2022, the InfiniBox is said to have reduced latency for NOVOMATIC by 68 percent, dropping to 0.32ms from 1ms on its previous Hitachi Vantara storage system. Infinidat also claims it increased the cache hit ratio to 98.8 percent, up from only 50 percent on the previous system. Read a case study here.

Intel getting out of DAOS? The Linux Foundation has launched the DAOS Foundation to advance the governance and development of the Distributed Asynchronous Object Storage (DAOS) project as open source. Founding members are Argonne National Laboratory (ANL), Enakta Labs, Google Cloud, HPE, and Intel. Intel will donate the DAOS source code to the DAOS Foundation and will remain one of the main contributors to the project’s future development. Visit the DAOS Foundation website at https://foundation.daos.io/.

NAS supplier iXsystems announced the TrueNAS F-Series, an all-NVMe storage system that runs TrueNAS Enterprise 23.10 Hyperconverged Storage Software. With 30TB NVMe drives, a single 2U system supports 720TB of highly available storage. Compared to prior models from iX, the F-Series offers significant reductions in power, space, and total cost of ownership (TCO). It supports data-intensive use cases including AI/ML, containerization, content creation, database servers, gaming, and virtualization. Linux-based TrueNAS Enterprise 23.10 offers native container support, Kubernetes integration for containerized applications, and the ability to scale up to 1,200 drives and 25PB+ in a single system. It has a scale-up or scale-out architecture, built on the OpenZFS 2.2 filesystem, with ZFS block cloning (deduplication) for SMB and NFS file copies and ZFS dRAID pool layouts. The TrueNAS M-Series remains the versatile, high-capacity system for hybrid flash and HDD requirements in the portfolio.

Komprise is in the Deloitte Technology Fast 500 ranking of the fastest-growing technology, media, telecommunications, life sciences, fintech, and energy tech companies in North America – for the second consecutive year. According to the rankings, Komprise grew 212 percent during the three-year period from 2019 to 2022; it grew 306 percent in the 2022 edition.

The highest-ranked storage companies on the latest list are:

  • 110 – Wasabi – 1,314 percent growth (number 42 last year, 4,109 percent)
  • 126 – Snowflake – 1,161 percent
  • 148 – Fivetran – 979 percent (99 last year)
  • 228 – Own Company – 625 percent (198 last year)
  • 264 – Netlist – 519 percent
  • 358 – Datadog – 362 percent
  • 383 – ScyllaDB – 333 percent (185 last year)
  • 420 – Confluent – 291 percent
  • 437 – Twist Bioscience – 274 percent (332 last year)
  • 460 – CData Software – 262 percent
  • 462 – Zscaler – 260 percent
  • 487 – Cloudflare – 240 percent (500 last year)
  • 501 – MongoDB – 227 percent
  • 524 – Komprise – 212 percent (420 last year)

VAST Data was the 5th fastest growing US tech supplier in the 2022 list, with 14,985 percent growth. It is not present in the 2023 list, which is odd.

Micron announced 32Gb monolithic die-based 128GB DDR5 RDIMM memory, using its 1β (1-beta) technology and featuring performance of up to 8,000 MT/s to support data center workloads. Micron’s 128GB RDIMMs will ship in platforms capable of 4,800, 5,600, and 6,400 MT/s in 2024, and are designed into future platforms capable of up to 8,000 MT/s.
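
For context, a DDR5 RDIMM presents a 64-bit (8-byte) data path, so the quoted transfer rates convert roughly to per-module peak bandwidth as follows:

```python
# DDR5 bandwidth per module = transfer rate (MT/s) x 8 bytes per transfer.
for mts in (4800, 5600, 6400, 8000):
    print(f"{mts} MT/s -> {mts * 8 / 1000:.1f} GB/s")
# 4800 MT/s -> 38.4 GB/s ... 8000 MT/s -> 64.0 GB/s
```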

… 

Object storage supplier MinIO has joined the STAC Benchmark Council – STAC being the standard in financial services market benchmarking.

Storage array supplier Nexsan has improved its Nexsan Worldwide Partner Program with deal registration, preferential pricing, market development funds, Sales Person Incentive Funds (SPIF) and a storage refresh rebate program. Learn more by emailing sales@nexsan.com.

Korea’s Samsung reported third calendar 2023 quarter revenues of ₩67.4 trillion ($52.6 billion), down 12.3 percent y/y, with a profit of ₩5.84 trillion ($4.56 billion), down 37.8 percent. The memory part of its Device Solutions business (DRAM + NAND) earned ₩10.53 trillion ($8.2 billion) in revenues, down 31 percent. It is going to concentrate on producing high-value DDR5, LPDDR5x, HBM, and HBM3E memory products, amid expectations of a recovery in demand.

Object storage supplier Scality has promoted Eric LeBlanc to VP of worldwide channel sales and GM of ARTESCA.

SIOS Technology, which supplies application high availability (HA) and disaster recovery (DR) software, announced a verified integrated solution with Milestone Systems, a provider of open platform IP video management software (VMS). The integration between the Milestone XProtect platform and SIOS LifeKeeper for Windows is designed to guarantee continuous access to the surveillance system’s control and configuration capabilities, preventing disruptions and enhancing operational efficiency.

Computer array systems builder SoftIron has put out a blog containing its predictions for 2024.

StorMagic has an HPE partnership whereby HPE will include SvSAN as a new way to store backup data using HPE GreenLake for Backup and Recovery and HPE StoreOnce. StorMagic SvSAN is HCI software that transforms any two x86 servers into a highly available shared storage cluster.

Research house TrendForce issued its channel-market SSD report and supplier rankings for 2022:

It said global channel market SSD shipments witnessed a decline, with only 114 million units shipped in 2022—a 10.7 percent decrease from the prior year. The top three SSD shipment leaders of 2022 were Kingston, ADATA, and Lexar, with Kingston and ADATA maintaining solid advantages and experiencing growth in market share over 2021. Lexar’s growth was attributed to an aggressive push for revenue in anticipation of going public. Kimtigo, in 2022, made significant strides in expanding into industrial control and OEM markets, which in turn boosted its shipment volume and market share. Netac maintained its competitive edge in the SSD market alongside securing several government orders in the enterprise SSD sector, keeping its market share and ranking consistent with the previous year.

…

Veeam’s Kasten container backup business unit made three announcements:

  1. Kasten by Veeam celebrated the first anniversary of KubeCampus, an online career development resource for the Kubernetes developer community, with a new partnership with WeAreDevelopers, a community for developers invested in accelerating tech talent.
  2. Kasten by Veeam announced the release of its new Kasten K10 V6.5 platform for Kubernetes during KubeCon + CloudNativeCon North America. The new release introduces trusted container environments, enhanced ransomware protection, and data protection support for large-scale Kubernetes environments.
  3. Kasten by Veeam announced that Kanister, an open-source framework that provides application-level data backup and recovery, has been accepted by the Cloud Native Computing Foundation (CNCF) as a sandbox project, indicating that the project adds value to the CNCF mission and encourages public visibility within the community.

Wasabi Technologies announced partnerships with security providers in Australia and New Zealand, Channel Ten and Visium Networks, to deliver cloud storage supporting the video surveillance needs of organizations in every industry. Wasabi recently introduced Wasabi Surveillance Cloud to enable organizations to offload video surveillance footage from their local storage environment directly to the cloud.

Panasas accelerates bulk data access in latest software release

High-performance computing parallel file storage supplier Panasas now has support for dual-actuator disk drives and has added the S3 protocol to speed dataset access and provide public cloud tiering.

Panasas announced the previous PanFS v9 major release in August 2021. Just over two years later we have PanFS 10, which adds support for Seagate Exos 2X18 dual-actuator disk drives in the ActiveStor filers, Amazon’s S3 object storage protocol, 24TB disk drives, and InfiniBand HDR (200Gbps) connectivity.

Ken Claffey

CEO Ken Claffey said of the release: “We understand that our customers’ data environments demand greater performance and cost-efficiency coupled with storage solutions that are easy to install, configure, and manage at scale.”

The AI and HPC worlds are increasingly overlapping, says the company, adding that PanFS v10 has been developed with that in mind.

Claffey said: “We’re setting a new performance/cost benchmark with the new PanFS 10 capabilities. With support for Seagate’s MACH2 dual-actuator drives, organizations can now benefit from twice the performance capacity without having to evaluate a cost-multiplier of using all-flash storage solutions, which prove costly as storage requirements for emerging workloads expand.”

PanFS 10 stores large files on 3.5-inch disk drives, smaller ones on faster-access SSDs, and very small files (<1.5MB) and metadata on NVMe SSDs.

Seagate’s MACH2 dual-actuator drives offer around twice the bandwidth and IOPS per TB of capacity as single-actuator HDDs. However, the additional actuators take up the space occupied by a disk platter in a 10-platter chassis, meaning they are 9-platter drives with reduced capacity: Exos 2X18 18TB drives instead of 20TB or larger.

An Exos 2X18 drive, functioning logically as two 9TB drives, delivers up to twice the sequential read performance of a single actuator 18TB drive.

PanFS v10 software now supports the S3 protocol as well as Panasas’s proprietary DirectFlow protocol, NFS, and SMB/CIFS v3.1. Panasas’s Director Nodes translate NFS, SMB/CIFS, and S3 into the DirectFlow protocol. The S3 support enables cloud-native applications that use S3 as their primary data access protocol to store and retrieve data in PanFS systems.

Panasas documentation says: “Objects stored in PanFS via the S3 protocol are visible as POSIX files in PanFS via the other data access protocols (DirectFlow, NFS, and SMB/CIFS), and POSIX files stored in PanFS via any other data access protocol are accessible as Objects via the S3 protocol.”
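
A minimal sketch of what that duality looks like from an application, assuming a hypothetical S3 endpoint on a Director Node and the same PanFS realm mounted locally at /panfs; the endpoint, bucket name, and bucket-to-directory mapping are illustrative assumptions:

```python
import boto3

# Hypothetical PanFS S3 endpoint; credentials come from the usual AWS config.
s3 = boto3.client("s3", endpoint_url="https://panfs-director.example.com:9000")
s3.put_object(Bucket="research", Key="runs/run42.csv", Body=b"t,value\n0,1.0\n")

# The same object should now appear as a POSIX file via DirectFlow/NFS/SMB,
# assuming the realm is mounted at /panfs and the bucket maps to a directory.
with open("/panfs/research/runs/run42.csv") as f:
    print(f.read())
```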

A PanMove facility can migrate any number of files between on-prem HPC deployments and AWS, Azure, or Google plus any other S3-supporting cloud service provider.

A 20-page PanFS v10 architectural white paper can be read here. The new release of ActiveStor Ultra and PanFS 10 will be available in Q1 2024.

NetApp doubles down on Microsoft Azure partnership

After announcing a stronger collaboration with Google Cloud in August, NetApp says it has renewed its partnership with Microsoft in a bid to grow public cloud revenues.

NetApp is less successful in the cloud than it might be. Though public cloud revenues of $154 million in its latest quarter are a fraction of its hybrid cloud revenues of $1.28 billion, they are growing, while hybrid cloud has been declining for three quarters in a row.

NetApp spent around $149 million buying various CloudOps startups when Anthony Lye ran the public cloud business up to July 2022. Now it’s not earning much more than that in total public cloud business revenues and its in-house ONTAP products will be responsible for much of that. Actual CloudOps revenues have not been revealed.

The company has introduced lower cost all-flash arrays and a new reseller deal with Fujitsu in a bid for growth. It’s also improving public cloud products and operational services to help that part of its business to grow, and this Microsoft pact is part of that.

There are four software products involved: Azure NetApp Files (ANF), Cloud Volumes ONTAP, the BlueXP data management facility, and the Spot cloud operations cost monitor/manager.

ANF now provides:

  • Datastores for Azure VMware Solution  
  • Application volume group for SAP HANA 
  • Smaller 2 TiB capacity pool size 
  • Large volumes (up to 500 TiB)
  • Customer-managed keys  
  • Availability zone volume placement and cross-zone replication  
  • New regions: South Africa North, Sweden Central, Qatar (Central), Korea South 

NetApp says ANF is fast because of the integration of ONTAP all-flash arrays into the Azure Data Center networking infrastructure. Applications running within the Azure Virtual Network (VNET) can get sub-millisecond latency access to ONTAP data.

SVP and GM for NetApp cloud storage Ronen Schwartz said: “Through our collaboration with Microsoft, we can deliver market-leading cloud storage and cloud infrastructure operations to our customers and partners that nobody else in the industry offers.” Qumulo might have a view on that point.

The Spot offering for CloudOps on Azure now includes integrated support for Azure Kubernetes Service (AKS) environments, and a fully managed, open source service that integrates with Azure NetApp Files, BlueXP and Cloud Volumes ONTAP. Spot for Azure uses AI/ML-driven automation to deliver continuous optimization of all Azure compute with, NetApp says, increased efficiency and vastly reduced operational complexity, helping make Azure compute less onerous to manage.

As an indication of this complexity, Spot for Azure has recently gained:

  • Spot Eco support for Azure Savings Plans and all Azure account types including Pay-as-You-Go, multi-currency account (MCA), enterprise agreement (EA), and cloud solution provider (CSP) 
  • Spot Elastigroup Stateful Node for Azure 
  • Spot Security support for Azure 
  • Spot Ocean for AKS enterprise-grade serverless engine for Azure  
  • Spot Ocean for Apache Spark Azure support 
  • Instaclustr Managed PostgreSQL on Azure NetApp Files 

NetApp’s Haiyan Song, EVP and GM of CloudOps at NetApp, said in a statement: “Our expanded Spot by NetApp portfolio on Microsoft Azure and partnership with Microsoft provides customers with what they need to continuously automate and optimize application infrastructure in the cloud and drive the cloud operation and cost efficiency their businesses demand while delivering the digital experience customers expect.” 

We think Spot may well become a NetApp copilot for Azure CloudOps. BlueXP looks like potential copilot material as well.

Spot by NetApp CloudOps products are available from the Azure Marketplace, Microsoft Azure channel partners, and NetApp, and Azure NetApp Files is available through the Microsoft Azure portal as a native Azure service.

Nutanix veep talks HCI, SAN, and AI storage solutions

Interview: We had the opportunity to ask Nutanix about server SAN progress, hyperconverged infrastructure vendor consolidation, VMware and Dell HCI positioning, and its AI intentions. Lee Caswell, SVP of Product and Solutions Marketing, provided the answers.

Blocks & Files: How is it that HCI (server SAN) has not replaced external SAN and file and object storage?

Lee Caswell, Nutanix

Lee Caswell: All-flash HCI is actively replacing SAN for all workloads at a pace that is consistent with the conservative nature of storage buyers. New HCI technologies are generally additive, not a wholesale SAN replacement, which allows customers to depreciate prior investments while taking advantage of the more agile and lower cost HCI solution.

NAS and object storage replacement is starting to grow quickly now that very high-capacity storage media and capacity-dense nodes allow scale-out systems to be cost-competitive with traditional scale-up proprietary hardware designs. We expect to see customer adoption accelerate as users consolidate storage architectures to control costs, address talent gaps, and leverage common data services across files, blocks, and objects.

Blocks & Files: How would Nutanix position the strengths and weaknesses of external storage vs HCI?

Lee Caswell: External storage will continue to exist just as UNIX systems persisted when x86 servers took over the compute market. For customers with dedicated storage administrators and the resources to separately maintain storage, server, hypervisor, and network configurations, external storage is a proven, albeit complex, entity. HCI, however, will command higher market growth rates powered by a simplified operating model and the scale economics of commodity servers. 

Cloud spending has shown us that most customers simply want to stop managing storage and get on with their primary objective of speeding the development, deployment, and managing of applications and their associated data. HCI offers a flexible scaling model, a software-defined management model suitable for generalists, and a common operating model across legacy and modern applications with full cloud extensibility that will support the next wave of unstructured data growth.  

Blocks & Files: With Cisco giving up on Springpath, does Nutanix think there will be more consolidation in HCI suppliers’ ranks and why or why not?

Lee Caswell: We’re watching the HCI market evolve very quickly from a narrow on-premises modernization play to an expansive hybrid multicloud platform market. Once customers move to a server-based architecture they start to see their control plane aperture open dramatically. There are very few infrastructure vendors that have the appetite for the engineering investment and the dedicated sales attention required to support a common cloud operating model. 

It also requires an elegant scale-out architecture to deliver a hybrid multicloud platform with a consistent operating model from the edge to the core to the public cloud, across legacy and containerized apps, with full data services including database as a service. Nutanix delivers this seamless experience with one architecture while others are either rearchitecting or, in the Cisco case, partnering with Nutanix. 

Blocks & Files: How would Nutanix position its offerings vs VxRail and, separately, PowerFlex?

Lee Caswell: Nutanix Cloud Platform (NCP) uses a single web-scale architecture with integrated snapshots, replication, and DR that are common underpinnings to all our offerings. Portable licensing allows customers flexibility in moving data and apps at will across servers, including servers in AWS and Azure. The user experience evident in our NPS of 92 and high renewal rates shows the simplicity of our offering and the excellence of our support organization. The offering includes a distributed file system and object store now recognized by Gartner as a Visionary player.

VxRail, by contrast, has two incompatible vSAN architectures with node-locked licenses and different data service offerings. Object stores must be licensed separately from partners along with KMS solutions – both of which are included in NCP. Cloud DR systems are powered by a third architecture (from Datrium) which introduces yet another management layer. 

PowerFlex (formerly ScaleIO) is increasingly being positioned by Dell as a vSAN/VxRail alternative. However, PowerFlex is more accurately a software-defined storage (SDS) offering requiring storage specialists since it is managed by LUN and licensed by TB. Finally, since VMware VCF always requires vSAN for the management domain, introducing PowerFlex adds separate patching, maintenance, and licensing headaches. 

Blocks & Files: How would Nutanix position its offerings as being capable of providing primary storage for AI Training?

Lee Caswell: With our recently released Nutanix GPT-in-a-Box, we provide reliable, secure, and simple AI-ready infrastructure to support LLM and GenAI workloads. LLM and use case specific fine-tuning data is stored on Nutanix Files Storage and Object Storage while block storage is used to run the models for training. Nutanix recently submitted verified files and objects MLPerf Storage Benchmark results (link). Of note was our ability to deliver 25Gbps for NFS to power 65 ML accelerators (GPUs).

Blocks & Files: Ditto AI Inferencing?

Lee Caswell: A powerful feature of Nutanix AI solutions is the ability to optimize resources from core to edge. Training and fine-tuning a model is often more resource-intensive than inferencing and would happen in the core. Inference can happen in the core or at the edge, depending on the application’s needs. This flexibility opens the potential for optimization of resources. Many edge scenarios are sensitive to latency and require high-performance storage systems with the ability to function without an active cloud connection.

Backblaze hits $100M ARR, but computer backup stalls

Storage pod

Cloud storage provider Backblaze grew revenues 15 percent year-on-year in its third 2023 quarter but backup storage revenues declined quarter-on-quarter.

Revenues were $25.3 million in the quarter ending September 30, with a loss of $16.1 million, as Backblaze prioritized growth over profitability. This compares to a year-ago loss of $12.8 million on revenues of $22.1 million. Cash, short-term investments, and restricted cash amounted to $35.8 million at quarter end.

Backblaze revenues

Annual recurring revenue (ARR) rose 15 percent year-on-year to $100.9 million. CEO Gleb Budman said: “In Q3, we passed the $100 million in ARR milestone and are on track to achieve adjusted EBITDA profitability in Q4 through continued strong growth and efficient execution.

“Looking ahead, our recent price increase supports continued investments in the platform and positions us for profitable growth while continuing to offer customers a compelling and cost effective storage solution.”

Its computer backup business brought in $13.7 million, up 4 percent year-over-year, while the B2 cloud storage business generated $11.6 million; B2 ARR rose 31 percent year-over-year to $46.8 million. There have been price increases for both B2 cloud storage and computer backup, with the larger increase for computer backup, which may have stymied growth. Customers can pay monthly or have one or two-year contracts.

Backblaze segment revenues
B2 cloud storage revenues are on track to overtake computer backup revenues by the end of 2024

Budman said: “We’ve actually seen some customers wanting to switch from monthly to one and two year to lock in the price point ahead of the pricing.” He doesn’t see the computer backup business growth rate getting to double figures. “We’re going to have a growth rate in the upper single digits for computer backup.”

Backblaze started off selling backup storage in its cloud then branched out into general B2 cloud storage, now the fastest-growing part of its business, representing 46 percent of revenues. Budman said: “Long-term B2 is overtaking as the dominant product quickly.” Backup suppliers like HYCU utilizing B2 cloud storage will contribute to this growth.

The company is offering free egress with its B2 cloud storage, and Budman said: “We saw new customers engage with us directly because high egress fees charged by other providers are a major pain point for them.”

Budman said in the earnings call: “We believe that we’re at an inflection point with higher revenue growth expected in Q4 and 2024 … We’ve made great progress on our financial performance, particularly adjusted EBITDA and cash.”

CFO Frank Patchel added: “Next quarter, we expect to accelerate revenue growth, reduce cash usage by about half sequentially, and reach positive adjusted EBITDA … We expect Q4 adjusted EBITDA margin between a positive 1 percent and a positive 3 percent. Q4 is expected to be our first quarter of positive adjusted EBITDA as a public company.”

He also said: “Cash break-even would be in the first half of 2025.” Profitability, the company will be hoping, should follow after that.

Backblaze said it would use CoreSite colocation facilities in Reston, Virginia, for its expanded service in the eastern US data region.

The outlook for Backblaze’s final 2023 quarter is for revenues of $29.1 million plus/minus $400,000, a 21 percent rise from a year ago at the mid-point. Full 2023 year revenues would then be $101.6 million, a 19 percent annual increase.

GigaOm highlights key players in cloud storage landscape

GigaOm analysts have surveyed the cloud file storage area and found it necessary to split it into two distinct areas for its Radar reports: distributed cloud file storage, with a focus on collaboration, and high-performance cloud file storage. Three suppliers are present in both reports: Hammerspace, Nasuni, and Panzura. CTERA leads the distributed cloud file storage area, while NetApp is ahead in the high-performance cloud file storage report.

A Radar report is GigaOm’s view of the main suppliers in a market area, summarizing their attributes within three concentric rings – entrant (outer), challenger, and leader (inner) – with those set closer to the center judged to be of higher overall value. Vendors are located with reference to their scores on two axes – Maturity to Innovation and Feature Play to Platform Play. An arrow projects each vendor’s offering over the coming 12 to 18 months. The arrows represent three types: a forward mover, a fast mover, and an outperformer.

GigaOm Distributed Cloud File Storage Radar

There are just six suppliers present in the distributed file storage group: two separated outsiders in the challengers’ area, Peer Software and LucidLink, and a group of four close together in the leaders’ ring with a balance between being innovators and generalized platform players.

The front-ranked supplier is CTERA, followed by Nasuni, both classed as outperformers, and then Hammerspace and Panzura. Hammerspace is a data orchestration supplier whereas CTERA, Nasuni, and Panzura are cloud file services suppliers with a collaboration focus.

Peer Software is described by GigaOm’s analysts as providing Global File Services (GFS) software on which customers can abstract distributed file services on top of existing storage infrastructure, and supporting scalability in the cloud. The company has an ambitious roadmap and recently introduced a new analytics engine, but it will take time to execute.

LucidLink’s Filespaces software presents on-premises applications with instant access to large S3 and Azure Blob-stored datasets over long distances, with local storage caching of the current data needed. It also works with on-prem object stores.

Analysts Max Mortillaro and Arjan Timmerman say: “Areas where we found the most significant variances were: data management, analytics, advanced security, and edge deployments. All of these areas are important, but we can’t stress enough the urgency of advanced security measures as a mitigation strategy against elevated persistent ransomware threats. Well-designed and varied edge deployment options are also critical now that most organizations must accommodate a massively distributed workforce.”

The vendors may need to provide data classification, data compliance, and data sovereignty features in the future.

GigaOm Radar for High-Performance Cloud File Storage

There are 14 suppliers who are placed in three groups:

  • IBM (Storage Scale), Zadara, and DDN (EXAScaler) all classed as more mature suppliers
  • Cloud file storage mainstreamers
  • Public cloud group

NetApp is the clear front runner in the leaders’ area, followed by Hammerspace then Qumulo. Weka, Nasuni, and Panzura are challengers poised to enter the leaders’ ring. These six suppliers are more platform than feature-focused, and represent, in our view, a mainstream group.

It is somewhat surprising that Weka, very much focused on high-performance file data access, has a relatively low ranking, being beaten by NetApp, Hammerspace, Qumulo, and Nasuni. The report authors suggest Weka’s data management needs improving: “It is expected that, in the future, Weka will improve some of its capabilities in this area, notably around metadata tagging, querying, and so on.”

IBM, Zadara, and DDN are in a top right quadrant and classed as challengers.

ObjectiveFS, Microsoft, and Amazon are a separate group of challengers, with entrants Google and Oracle set to join them in a public cloud-oriented group that is more focused on adding features than being a platform play.

Timmerman and Mortillaro say: “The high-performance cloud file storage market is interesting. It might seem obvious to many that, thanks to their dominance and massive market share, public cloud providers would provide the most comprehensive solutions. Nothing could be further from the truth.

“The primary concern with these is that, with a few notable exceptions (such as Tier 1 partnerships with vendors such as NetApp), these public cloud solutions typically need additional adjustments and improvements to meet enterprise requirements.”

Hammerspace, Nasuni, and Panzura are all present in GigaOm’s Distributed Cloud File Storage Radar report as well as this high-performance cloud file storage report. No other suppliers have a dual report presence.

You can read the report here courtesy of Hammerspace. It provides excellent and concise descriptions of each vendor’s offerings.

GigaOm Radar reports are accompanied by Key Criteria that describe in more detail the capabilities and metrics used to evaluate vendors in this market. GigaOm subscribers have access to these Key Criteria documents.

Qumulo file system goes native on Azure

Qumulo has added a native Azure implementation of its scale-out file system, complementing its existing edge, on-premises, and public cloud presence.

Its Scale Anywhere concept matches NetApp’s data fabric vision in scope and arguably provides a more scalable and parallel filesystem than NetApp’s ONTAP in the public clouds and Dell’s APEX File Storage for AWS. 

Qumulo CEO Bill Richter said: “Unstructured data is everywhere in the enterprise and growing at an unprecedented rate, a reality that traditional storage solutions have constantly grappled with. Qumulo finally fixes this with a software platform built for hybrid IT that scales to exabytes anywhere unstructured data is generated, stored, consumed, or managed.”

The Scale Anywhere vision in a nutshell is up to exabyte-level file storage on-premises, edge, datacenters, and the public cloud. It’s a vision shared and pioneered by NetApp, Dell, and HPE (GreenLake).

Qumulo Scale Anywhere graphic

Kiran Bhageshpur, Qumulo CTO, said in a briefing: “I look at this as the third era of data. First, there was the NetApp era, which was scale-up domain controllers connected across SAS expanders. They owned the 1990s and the early 2000s. Then there was a scale-out era, which was very much Isilon’s kind of game, and I was there. We really ran the table at that point in time, and it’s the 8,000-pound gorilla for large scale unstructured data today. I believe we are in the third era, which is scale anywhere.

Kiran Bhageshpur

“What’s changed from 15 years ago to now is the fact that data is everywhere. It’s not just datacenters, it is the edge. And the edge is very dispersed. We’ve got customers who are talking about hundreds of terabytes and mining sites, you’ve got national defense workflows with forward operating bases and geosynchronous or other satellite imagery and longtime drone imagery, which needs to be shared all across the edge. And then of course you’ve got the cloud with an absolute plethora of services.”

A file system and data services supplier has to cover all these bases. Bhageshpur told us “file in the cloud has been a joke” because it doesn’t use a cloud-native approach. That was true of the initial April Azure Qumulo implementation. “The version we put out in April was still what I would call a version 1.0 from a cloud architecture point of view as it was a lift and shift … What we are now announcing is a truly cloud-native Enterprise File System in the cloud, starting with Azure.”

Qumulo says the Scale Anywhere Platform has four components:

  • Ability to run on commodity hardware or in the (main) public clouds.
  • A Nexus unified control plane for customers to manage all their Qumulo instances anywhere.
  • Q-GNS global namespace providing a unified edge-to-datacenter-to-public cloud data plane with its data services – multi-protocol access, enterprise security integrations, snapshots, quotas, replications etc. – delivered within this geographically distributed namespace. Qumulo says customers can access remote data as if it were local for all their workflows, from the most performance-sensitive applications to active archives.
  • Azure Native Qumulo (ANQ), described by Qumulo as “the first cloud-native enterprise file system with unlimited exabyte scale, unmatched economics, and performance elasticity.”

Note that the current Qumulo implementations on AWS (Qumulo File Data Platform) and GCP are not included in this component list. They are not cloud-native. We expect that cloud-native versions will arrive in due course. Bhageshpur said: “Our view is scale anywhere. So every single cloud, every single hardware platform.”

ANQ was developed with Microsoft and data is stored in Azure Blob object storage buckets with caching to speed up access. Bhageshpur said: “The caching uses Azure ephemeral instances’ attached NVMe as a read cache, we also use a high performance managed disk as a protected cache.” Qumulo says 99 percent of ANQ I/O operations are served from the cache, improving performance and shielding file workloads from cloud object storage latencies.
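
The layering described here is essentially a read-through cache in front of an object store. A toy sketch of the pattern follows; it is not Qumulo’s code, and the block granularity, LRU eviction policy, and backend are stand-ins:

```python
from collections import OrderedDict

class ReadThroughCache:
    """Toy LRU read cache in front of a slower object-store backend."""

    def __init__(self, backend, capacity_blocks: int):
        self.backend = backend           # e.g. a function fetching Azure Blob blocks
        self.cache = OrderedDict()       # block_id -> bytes, kept in LRU order
        self.capacity = capacity_blocks

    def read(self, block_id: str) -> bytes:
        if block_id in self.cache:           # hit: served from fast local media
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.backend(block_id)        # miss: pay object-store latency once
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:  # evict the least-recently-used block
            self.cache.popitem(last=False)
        return data

cache = ReadThroughCache(backend=lambda block_id: b"...", capacity_blocks=1024)
```

With a large enough cache and a typical working set, most reads never touch the object layer, which is the effect behind the 99 percent figure.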

  • ANQ separates performance from long-term storage retention, allowing each layer to scale independently, elastically, and seamlessly. It lets customers dynamically raise and lower their cluster’s throughput rates on demand, as needed. 
  • ANQ leverages Azure’s 12 nines of data durability and its clustered architecture so files are dependably available.
  • ANQ is almost 80 percent less expensive than the closest alternative (understood to be Azure NetApp Files), and is comparable to the fully burdened on-premises cost of file storage.
  • ANQ data services include quotas, snapshots, multi-protocol access, enterprise security integrations, and real-time data analytics.
  • ANQ can be configured and deployed directly from the Azure service portal in minutes.

Bhageshpur said of cost: “For the first time file in the cloud, rich enterprise file in the cloud is going to be comparable to the cost of object in the cloud.”

ANQ has a cloud-familiar pay-as-you-go model in its pricing structure, where economics improve as data volume grows. 

Monthly pay-as-you-go pricing is $30 per TB per month (for 1 PB and beyond). Pricing starts at $37 per TB per month (baseline commitment of 100TB), with an incremental charge per TB as data grows.
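
Assuming the per-TB rate simply steps down at the published 1PB break point – Qumulo has not detailed how the incremental charge glides between the two rates, so this is only an illustration:

```python
# Illustrative ANQ monthly cost at the two published price points; how the
# rate blends between 100TB and 1PB is an assumption, not Qumulo's price list.
def anq_monthly_cost(tb: float) -> float:
    rate = 30 if tb >= 1000 else 37  # $/TB/month
    return tb * rate

for tb in (100, 500, 1000, 2000):
    print(f"{tb:>5} TB -> ${anq_monthly_cost(tb):,.0f}/month")
# 100 TB -> $3,700/month; 2000 TB -> $60,000/month
```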

Microsoft’s Maneesh Sah, CVP Azure Storage, said: “Combining Microsoft Azure Blob Storage with Azure Native Qumulo, Microsoft and Qumulo deliver familiar on-premises file management capabilities at costs similar to object storage. Azure is the only cloud that offers the balance of performance, cost, and scalability.”

Qumulo is also announcing a consumption-based Qumulo One pricing model.

Bhageshpur said: “With our latest update to Azure Native Qumulo, customers can migrate their file-based AI/ML workloads to the cloud, and run them natively in Azure at scale, without compromising performance, features, or cost.” He also said that Qumulo Scale Anywhere customers won’t need Hammerspace to orchestrate files within the Qumulo namespace as Qumulo includes that kind of functionality.

He thinks Qumulo operates at a level of scale beyond the capabilities of edge-caching cloud-based file data services suppliers such as CTERA, Nasuni, and Panzura, characterized as home directory-style use cases: “My view is that a lot of what Panzura, especially Nasuni, are doing … is going to get subsumed by some variation of OneDrive, Google Drive, or Dropbox … Whereas what we are looking at is hundreds of terabytes to hundreds of petabytes, really large scale data.”

The Q-GNS global namespace enters private preview on November 9 and public preview on February 1. ANQ is generally available.

Generative AI: It’s not just for the big guys

COMMISSIONED: Being stuck in the middle is no fun.

Just ask the titular character Malcolm of the TV series “Malcolm in the Middle” (2000-2006), who struggles to stand out among his four brothers. In the earlier sitcom “The Brady Bunch” (1969-1974), Jan Brady feels overshadowed by big sister Marcia and younger sis Cindy.

Polite (or impolite) society has a name for this phenomenon, which describes the challenges children sandwiched between younger and elder siblings feel within their families: Middle child syndrome.

Reasonable minds differ on the legitimacy of middle child syndrome. Is it real or perceived and does it matter? Even so, it can be hard to compete with siblings – especially brothers or sisters who enjoy the lion’s share of success.

The middle children of the global economy

As it happens, the global economy has its own middle children in the form of small- to medium-sized businesses, which find themselves competing with larger enterprises for talent, capital and other vital resources.

Yet like their larger siblings, SMBs must innovate while fending off hungry rivals. This dynamic can prove particularly challenging as SMBs look to new technologies such as generative AI, which can be resource intensive and expensive to operate.

No organization can afford to overlook the potential value of GenAI for its business. Seventy-six percent of IT leaders said GenAI will be significant to transformative for their organizations, and most expect meaningful results from it within the next 12 months, according to recent Dell research.

Fortunately, SMBs wishing to capitalize on the natural language processing prowess of GenAI can do so – with the right approach: using a small language model (SLM) and a technique called retrieval augmented generation (RAG) to refine results.

You may have noticed I called out an SLM rather than a large language model (LLM), which you are probably more accustomed to reading about. As the qualifiers imply, the difference between the model types is scale.

LLMs predict the next word in a sequence based on the words that have come before it to generate human-like text. Popular LLMs that power GenAI services such as Google Bard and ChatGPT feature hundreds of billions to trillions of parameters. The cost and computational resources to train these models are significant, likely putting bespoke LLMs out of reach for SMBs.

SMBs have another option in building SLMs, which may range from a hundred million to tens of billions of parameters and cost less to train and operate than their larger siblings.

SLMs can also be more easily customized and tailored for certain business use cases than LLMs. Whereas LLMs produce long-form content, including whole software scripts and even books, SLMs are better suited to building applications such as customer service chatbots, personalized marketing content (email newsletters, social media posts), and lead generation and sales scripts.

Even so, whether you’re using an LLM or an SLM, GenAI models require enough computational resources to process the data, as well as data scientists to work with the data, both of which may be hard for SMBs to afford. And sure, organizations may use a pre-trained model but it will be limited by the information it knows, which means its accuracy and applicability will suffer.

RAG fine-tunes models with domain knowledge

Enter RAG, which can add helpful context without having to make big investments, thus democratizing access for SMBs. RAG retrieves relevant information from a knowledge repository, such as a database or documents in real time, augments the user prompt with this content and feeds the prompt into the model to generate new content. This helps the model generate more accurate and relevant responses for the information you wish your model to specialize in.
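
A minimal sketch of that retrieve-augment-generate loop, where embed() and generate() are placeholders for whatever embedding model and SLM/LLM endpoint an organization actually deploys:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: hash words into a bag-of-words vector; a real system would
    # call an embedding model here.
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0
    return vec

def generate(prompt: str) -> str:
    # Placeholder for the language model call (e.g. a locally hosted SLM).
    return f"[model response to {len(prompt)} chars of prompt]"

def rag_answer(question: str, docs: list[str], top_k: int = 3) -> str:
    doc_vecs = [embed(d) for d in docs]  # index the knowledge repository
    q = embed(question)
    # Rank documents by cosine similarity to the question...
    scores = [float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9)
              for v in doc_vecs]
    best = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_k]
    # ...augment the prompt with the retrieved context...
    context = "\n\n".join(docs[i] for i in best)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)  # ...and generate the response

print(rag_answer("What is our refund policy?",
                 ["Refunds are issued within 30 days.", "Shipping takes a week."]))
```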

For example, at Dell we show organizations how to deploy RAG and Meta’s Llama 2 LLM to retrieve domain-specific content from custom PDF datasets. The output was used to show how an organization might theoretically use RAG and an LLM to power a help-desk chatbot.

SMBs can use an SLM with RAG to build a more targeted and less resource-intensive approach to GenAI. Effectively, the combination affords them a very accurate tool that delivers more personalized information on their company’s data – without spending the time and money building and fine-tuning a custom model.

Getting started with RAG may seem daunting to SMBs, but organizations can repurpose a server, a workstation, or even a laptop and get started. They can pick an open source model (such as Llama 2) to begin the process. Dell calls this the GenAI easy button.

That way organizations can bring the AI to their data, keeping control of their sensitive corporate IP while freeing up IT resources as they innovate.

SMBs play an important role in the economy by contributing to innovation. Yet too often they’re relegated to Malcolm or Jan status – the oft-underestimated and neglected middle children of the global economy.

By combining the right approach and technology tools, SMBs can leverage GenAI to accelerate innovation, enabling them to better compete and woo new customers – rather than feeling lost in the middle of the corporate pack.

To learn more, visit dell.com/ai.

Brought to you by Dell Technologies.

Commvault launches multi-vendor cyber resilience partnerships

Handshake

Commvault, which just announced its all-in-one Commvault Cloud software infrastructure, has set up multiple security supplier agreements to provide comprehensive cyber resilience, security, and data intelligence capabilities.

Commvault believes collaboration is key, and is setting up partnerships with cyber security, artificial intelligence, and cloud suppliers to provide joint customers with more ways to detect, protect, and respond to potential threats and attacks, while also improving data visibility and governance. It is working with partners across the security tool chain, including: security information and event management (SIEM); security orchestration, automation, and response (SOAR); network detection and response; vulnerability, threat detection, and assessment; incident management; and data governance and privacy. 

Chief product officer Rajiv Kottomtharayil said: “By integrating with a broad ecosystem of new security and AI partners via our Commvault Cloud platform, we … can collectively and jointly bring faster, smarter, and more connected security insights to organizations around the world.”

Depiction of Commvault anti-malware security partnerships
Blocks and Files depiction of Commvault anti-malware security partnerships

The partner list includes:

  • Avira (part of Gen): AI/ML-driven threat intelligence, prediction, analysis and anti-malware technologies. 
  • CyberArk: Identity security platform.
  • Darktrace: Machine learning-based anomaly detection with integration with HEAL and Commvault. 
  • Databricks: Data lake platform for data and AI. 
  • Entrust: Post-quantum cryptography and data encryption.
  • Microsoft Sentinel: SIEM. 
  • Netskope: Zero trust-based Secure Access Service Edge (SASE) web content filtering. 
  • Palo Alto Networks: Threat intelligence repository leveraging Cortex XSOAR to shorten incident response times. 
  • Trellix: Threat detection and response with Intelligent Virtual Execution (IVX) sandbox to analyze and inspect malware in an isolated environment.

This set of integrations is impressive, but suggests the skill set needed to manage and operate these facilities will be substantial – and more so if more partners’ technologies are used. Admin staff will need to be trained and proficient with each of the suppliers’ technologies.

This suggests, in turn, given Commvault’s Arlie copilot initiative, that an even more capable copilot could be needed to help users understand the overall anti-malware capabilities of their IT environment, and the vulnerability gaps. Such an assistant would need to be trained on all the partners’ systems, be able to answer users’ questions, and generate reports and code for processes to help detect, diagnose, and fix problems in what is a highly complex environment.

If – like Commvault (Arlie), Databricks (Dolly), and Microsoft (OpenAI) – each supplier has its own copilot, they will need to interact with each other to help customers with multi-supplier security systems. Copilot-to-copilot conversations will be a whole new generative AI ball game.

Rubrik runs Ruby copilot on GenAI rails

Rubrik has jumped aboard the generative AI copilot train with Ruby to speed cyber detection, recovery, and resilience.  

Rubrik joins Druva (Dru) and Commvault (Arlie), which also have data protection copilots. A copilot is a generative AI large language model trained on the supplier’s products and services, able to interpret natural language questions and requests. It can respond in natural language and also generally produce coded statements to activate reports and procedures.

Anneka Gupta, Rubrik chief product officer, said: “Think of Ruby as the personification of a security analyst in AI, who is there to hold the customer’s hand to resolve a security incident much faster than they could do before.” 

Rubrik says it uses AI to help customers in three areas:

  • Security expertise with a guided response process that helps users navigate challenging workflows and speeds recovery from cyber incidents via Ruby.
  • Detecting anomalous activity in data across enterprise, cloud, and SaaS applications, to help customers identify malicious activity and determine the scope of a cyberattack or incident via Rubrik’s Data Threat Engine.
  • A support team, which can be more proactive and targeted in alerting customers to potential problems before they impact an organization’s systems, via Rubrik’s Sentry AI platform.

Ruby, like Commvault’s Arlie, uses Microsoft Azure’s OpenAI large language model. It also uses Rubrik’s best practices and the expertise of its field teams and insights from its ransomware recovery efforts. It will generate data risk alerts, and users can ask follow-up questions including whether any sensitive data was impacted, or what needs to be done to restore the environment. Ruby will provide guidance on additional questions to pose, and help customers resolve incidents more quickly.

Rubrik Ruby interaction with user
Ruby interaction with a user

It will help less skilled users deal with cyber threats. Gupta said: “Securing business data must be a company-wide imperative; every employee should be empowered with the tools to quickly respond to incidents. Our goal with Ruby is to bridge any skills gaps and eliminate the roadblocks to cyber response so that all organizations realize business continuity and preserve the integrity of their corporate data.”

Rubrik says Ruby is planned to be available in the coming months to Enterprise Edition subscribers who opt in. Over time, Ruby will expand to help customers recover even faster and more effectively from cyber attacks.

Rubrik Ruby malicious file alert
Ruby malicious file alert

Comment

Rubrik is positioning Ruby as a copilot, a helpful assistant. It is not saying it is an autonomous cybersecurity agent which can be given the responsibility of looking after an organization. That responsibility is the user’s, with Ruby as a tool.

It will be interesting to see how such a copilot performs during an actual cyber incident: helping recognize it, disarm the malware, and recover data to a clean state. A detailed case study of such an incident will be fascinating to read.

We think that having an AI-driven copilot is going to become table stakes for data protection and security vendors.

Bootnote

Rubrik’s Ruby has nothing to do with the server-side web application framework Ruby on Rails, nor with the open source Ruby programming language.