An unnamed AI-enabled data analytics startup has surfaced with a Rubrik co-founder as its CEO along with two other Rubrik alumni.
Soham Mazumdar co-founded data protection-focused Rubrik in 2014 along with CEO Bipul Sinha, VP Engineering Arvind Jain, and CTO Arvind Nithrakashyap. Rubrik has since grown enormously, notching up 2,000-plus customers, $550 million in funding, and half a billion in annual recurring revenue.
In a LinkedIn post, Mazumdar says he is: “Excited to share that I have embarked on a new startup journey with incredible partners in crime, Sharvanath Pathak and Kapil Chhabra. We are taking on some critical problems at the intersection of AI and Data Analytics. Currently heads down building the product and hope to share updates before long.”
Mazumdar’s post adds: “We are looking for founding engineers to join us on this journey! We are hiring across the stack, frontend, backend & NLP. Please hit us up if you are interested in learning more.”
From left: Soham Mazumdar, Sharvanath Pathak and Kapil Chhabra
Rubrik’s former engineering boss Arvind Jain left in 2019 to start up Glean as its CEO. Glean’s business is to provide AI-powered workplace search. Its software uses generative AI to build a virtual knowledge base that is aware of, and works with, an organization’s data governance environment.
It is a small world. Pathak was a founding engineer at Jain’s Glean and, before that, a founding engineer at Rubrik from August 2014 to December 2016. Chhabra was Senior Director of Product Management at Rubrik, leaving in May this year.
The intersection of AI and data analytics is a booming area of development, with large language model (LLM) machine learning being applied as an intelligent interface to dataset analysis routines. Databricks and its Dolly LLM chatbot are an obvious example of the supercharged development underway.
The Dolly developers say that any organization using Dolly “can create, own, and customize powerful LLMs that can talk to people, without paying for API access or sharing data with third parties.”
We suspect Mazumdar’s startup will be active in the generative AI dataset analytics space and will try to go beyond Glean-type search, offering something other than a framework for chatbots to organize analytic runs on and across datasets.
CloudCasa says it’s one of the Kubernetes applications available in Microsoft’s new transactable Kubernetes Apps offering in the Azure Marketplace. CloudCasa already provides native integration with Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS) and Google Kubernetes Engine (GKE). Customers can now pay for CloudCasa through the Azure Marketplace. They also get enhanced security via Microsoft’s automated malware and vulnerability scanning, we’re told.
…
Cloudian’s HyperStore software is available on the HPE GreenLake Marketplace, giving customers the option to get HyperStore delivered as an on-premises service. Cloudian and HPE say the GreenLake combination delivers data sovereignty and control via a pay-as-you-go financial model. With Cloudian HyperStore, customers can transform any compute platform – including servers, virtual machines, or containers – into a pool of S3 API-compatible object storage.
…
Cobalt Iron announced Compass NAS Protector, a new set of features in its Compass enterprise SaaS backup platform. Compass NAS Protector speeds up backups, simplifies management of NAS data, and improves and consolidates backup operations, we’re told. It has technology for identifying new, changed, and deleted NAS files compared with the Compass inventory. Proprietary scanning and identification processes optimize parallelism in NAS share/export scanning. Enhanced data-movement processing enables high-speed, parallel backup and archiving. Cobalt Iron says it ensures NAS device independence to consistently protect physical, virtual, and cloud NAS share/export resources. Compass NAS Protector is automatically available to existing and new Compass users at no additional cost.
Ranga Rajagopalan.
…
SVP of Products Ranga Rajagopalan is leaving Commvault and heading to Druva.
…
Cloud database supplier Couchbase has enhanced the in-memory performance of its Capella Database-as-a-Service offering. It will be accessible via the popular developer platform Netlify and features a new Visual Studio Code extension, with the aim of making it easier for developers and development teams to build modern applications on Capella, streamline their workflows, and increase productivity. Couchbase is also extending its enterprise deployability – adding over 10 new supported regions across three major CSPs – and introducing new features, like support for time series data, allowing customers to move more applications to Capella with a lower total cost of ownership (TCO).
…
Crucial, a Micron brand, is sampling the world’s fastest retail SSD: the T700 M.2 format drive. It uses Micron 232-layer TLC NAND, comes in 1, 2 and 4TB capacities, has a PCIe 5.0 interface and a Phison PS5026-E26 controller, and delivers 12,400 MBps sequential read and 11,800 MBps sequential write bandwidth. It can deliver up to 1.5 million/2 million random read/write IOPS. The T700 is available with and without a passive heatsink and supports the Microsoft DirectStorage API, with a direct SSD-to-GPU data path bypassing the host CPU. More here.
…
Cassandra NoSQL database supplier DataStax has a partnership with AI startup ThirdAI that will try to make large language models (LLMs) and other artificial intelligence (AI) technologies accessible to any organization, regardless of whether its data resides on-premises or in the cloud, running on massively scalable databases such as DataStax’s Astra DB. The partnership marks the launch of the DataStax AI Partner Program, designed to connect enterprises with AI startups and speed the development and deployment of AI applications for customers.
…
Search company Elastic announced a three-year global Strategic Collaboration Agreement (SCA) with Amazon Web Services (AWS) and achievement of the AWS Security Competency designation. The SCA includes:
Accelerating integrated go-to-market activities across sales and marketing, including marketing campaigns, guides and workshops, events and sponsorships, and advertising placements
Technology integrations and commercial incentives to streamline the migration of on-premises workloads to Elastic Cloud on AWS
Global expansion of best practices from the Americas to accelerate worldwide growth
…
Tom Coughlin writes in Forbes that David Flynn, founder and CEO of Hammerspace, spoke at the 2023 IEEE MSST conference about building an NFS file system into PCIe-based SSDs. He said that with flash performance at parity with Ethernet and special-purpose processors becoming more commonplace, the time is right for making smarter SSDs.
…
Hitachi Vantara has introduced Hitachi Data Reliability Engineering (DRE), a suite of consulting services assisting organizations in improving the quality and consistency of business-critical data, available through Hitachi Application Reliability Centers. With a secure, self-service approach, DRE allows organizations to embed quality data into applications, we’re told.
…
Huawei says its OceanStor Pacific Scale-Out Storage is suited to Numerical Weather Prediction (NWP) applications at petabyte scale. The NWP flow involves collecting meteorological observation data, pre-processing it, performing model computing, and post-processing the results. Huawei says OceanStor Pacific improves storage access performance for the Weather Research and Forecasting (WRF) model: data private clients (DPC) combined with MPI-I/O can distribute workloads to every I/O node, improving forecast performance by almost 10 times within a 1km area.
Task efficiency also improves significantly in large-scale cluster environments: computing time decreases as computing nodes are added. Overall time decreases by a third in an environment with 64 nodes and 40 cores per node, thanks to the I/O performance improvement. The system uses metadata search functions to speed up data search and management.
Craig Bumpus.
…
Azure data protector Keepit has appointed Craig Bumpus as Chief Revenue Officer (CRO). Bumpus will be responsible for building Keepit’s go-to-market and associated strategies and helping to set direction with CEO Frederik Schouboe. Prior to joining Keepit, he served as CRO at Qumulo and at UiPath, where he played a part in growing the company from $25 million to over $400 million.
…
Kioxia is sampling a second generation UFS 4.0 embedded flash device for smartphones, digital cameras and the like. UFS 4.0 incorporates MIPI M-PHY 5.0 and UniPro 2.0, supports theoretical interface speeds of up to 23.2 gigabits per second (Gbps) per lane or 46.4 Gbps per device, and is backward compatible with UFS 3.1. Kioxia’s little UFS 4.0 device improves on its previous generation by 18 percent in sequential write, 30 percent in random write, and 13 percent in random read performance.
…
Kioxia has a new Exceria Plus G3 Series consumer SSD in development. It comes in single-sided M.2 2280 format with a PCIe Gen 4 interface and up to 2TB of TLC NAND capacity. Provisional sequential read bandwidth is 5,000 MBps. The new drive is approximately 70 percent more power-efficient at maximum sequential read speed than the previous-generation Exceria Plus G2 Series.
…
Kioxia is opening two new R&D facilities — the Flagship Building at the Yokohama Technology Campus and the Shin-Koyasu Technology Front — to bolster the company’s research and development of flash memory and solid-state drives (SSDs). Going forward, other R&D functions in Kanagawa Prefecture will be relocated to these new R&D hubs to improve research efficiency.
…
Research house TrendForce says Micron will be fabricating its leading edge 1-gamma DRAM (~10nm) at the company’s Hiroshima, Japan fab, but that doesn’t mean it’s abandoning Taiwan. Micron will be developing the 1-gamma process in Taiwan and production will start in CY ’25 in both Taiwan and the Hiroshima facility.
…
Startup Ocient announced v22 of its Ocient Hyperscale Data Warehouse (OHDW) product, adding real-time analytics features, query performance enhancements, and support for additional business intelligence (BI) tools. It has improved performance for loading, streaming, and extract, load, transform (ELT) workloads, eliminating the need for standalone tools like Spark and Informatica, Ocient says. V22 marks general availability (GA) of HyperLogLog (HLL) sketches in Ocient’s suite of real-time analytics capabilities, so customers can create rollups of data using approximations of aggregated metrics and accelerate query processing with minimal loss of accuracy. More information here.
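For readers unfamiliar with HLL: the technique counts distinct values approximately in fixed memory, and two sketches merge into a rollup without rescanning raw rows. Here is our minimal, illustrative Python version of the general algorithm (not Ocient’s implementation, and with small-cardinality bias corrections omitted):

```python
import hashlib

class HLL:
    """Minimal HyperLogLog sketch: approximate distinct counts in fixed memory."""

    def __init__(self, p: int = 14):
        self.p = p                 # 2^p registers; more registers, less error
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item) -> None:
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                     # top p bits pick a register
        w = h & ((1 << (64 - self.p)) - 1)           # remaining bits
        rank = (64 - self.p) - w.bit_length() + 1    # leftmost 1-bit position
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HLL") -> None:
        """Rollup: the union of two sketches is the register-wise maximum."""
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        return alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)

# Keep an hourly sketch per metric, then merge 24 of them for a daily
# distinct-count rollup without touching the raw rows again.
```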
…
Kevin Delane.
Enterprise SaaS app data protector OwnBackup has hired Kevin Delane as its CRO. Delane was formerly CRO at Cohesity, leaving in March this year. Cohesity and OwnBackup are partnering so that Cohesity can supply OwnBackup’s SaaS protection of Salesforce, Microsoft Dynamics and ServiceNow to its customers. The two are working on a joint offer to consolidate the protection of SaaS applications on a single system, streamlining IT operations, reducing attack vectors for cybercriminals to exploit, and providing business continuity to counter data disruption, we’re told.
…
SK hynix says it has completed development of the industry’s most advanced 1bnm DDR5 DRAM, the fifth generation of its 10nm-class process technology. The company and Intel are running a joint evaluation of 1bnm, with validation via the Intel Data Center Certified memory program, for DDR5 products targeted at Intel Xeon Scalable platforms. The DDR5 products provided to Intel run at the world’s fastest speed of 6.4Gbps (gigabits per second), we’re told. This represents a 33 percent improvement in data processing speed compared with test-run products in the early days of DDR5 development. The development means new DRAM products offering both high performance and improved performance per watt.
…
WANdisco’s crunch general meeting at which it will seek new shareholder funds needed for its survival is scheduled for June 6. Non-exec director William Grant Dollens is resigning from WANdisco’s board. He was appointed in 2016 and a company statement says: “having served a six-year term as originally envisaged at the time of his appointment, now is an appropriate time to step down as the Company embarks upon the next phase of development.”
The statement adds: “Recently Mr Dollens has been instrumental in securing the appointment of interim Chair Ken Lever, interim CEO Stephen Kelly, and interim CFO Ijoma Maluza.” Chairman Ken Lever said: “I look forward to continuing a constructive dialogue with him as the Board continues to focus on advancing the fundraising and other workstreams that are designed to lift the current suspension of WANdisco’s shares and position the Company for long term success.”
Just five more years and spinning will be over. That was the opinion of Pure Storage several weeks ago when it said it believed SSDs would kill off the hard disk drive business by the end of 2028. But Jerome Lecat, CEO of object storage software supplier Scality, scoffed at this idea when we asked him about it. Paul Speciale, Scality’s Chief Marketing Officer, has now answered B&F’s follow-up questions.
Blocks & Files: Some all-flash-array suppliers suggest SSDs will be lower cost overall, in TCO terms, than HDDs by 2028 because of NAND price declines, electricity price levels and electricity availability. Do you think that QLC SSD acquisition cost (and maybe PLC later), with data reduction built in or not, will equal and then drop below HDD acquisition cost?
Paul Speciale
Paul Speciale: Since we are partnered with nearly all industry-leading server platform providers, we get a unique overview of the true costs of storage media across the HDD and SSD spectrum. When comparing today’s high-density Large Form Factor (LFF) HDDs to current QLC-based flash SSDs, vendor data shows that HDD capacity pricing is between 5 and 7 times lower on a $/GB basis today. In addition, these vendors project that HDDs will retain a 4-5x cost advantage into 2028.
While data reduction techniques can overcome these cost differences for certain types of application data, they cannot do so for all data (compressed video is the usual example). Most enterprise data protection (backup) applications already perform data compression & deduplication on backup data. We also have customers in Life Sciences where genomics data is pre-compressed, so there is no advantage to storing it on an all-flash system beyond random file IO performance for a relatively small amount (10 percent) of hot data (the remaining 90 percent is tiered to RING scale-out object storage with HDDs for data storage capacity).
These workload and data variances make it challenging for flash-based systems to truly overcome a 5x cost disadvantage in all cases. The reality today is therefore (a worked calculation follows the list below):
The promise of cost parity for flash SSDs with equivalent HDDs has not yet arrived.
High-density HDD will retain a ~5x cost advantage over high-density SSDs for many more years.
Flash vendors will not be able to eliminate this media price disadvantage through data reduction techniques for backup and other unstructured data workloads.
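To make the arithmetic concrete, here is our back-of-envelope calculation in Python. The $/GB figures are illustrative assumptions consistent with the roughly 5x gap Speciale describes, not vendor quotes:

```python
# Effective $/GB after data reduction. Illustrative prices only (assumed):
# HDD at $0.015/GB and QLC flash at $0.075/GB, i.e. a 5x raw-media gap.

HDD_PRICE = 0.015   # $/GB, assumption
SSD_PRICE = 0.075   # $/GB, assumption

def effective_price(raw_price_per_gb: float, reduction_ratio: float) -> float:
    """Price per logical GB stored, given a data reduction ratio (3.0 = 3:1)."""
    return raw_price_per_gb / reduction_ratio

for ratio in (1.0, 2.0, 3.0, 5.0):
    flash = effective_price(SSD_PRICE, ratio)
    print(f"{ratio:.0f}:1 reduction -> flash ${flash:.4f}/GB vs HDD ${HDD_PRICE:.4f}/GB")

# Flash only reaches HDD parity at ~5:1 reduction, which already-compressed
# data (video, genomics, deduplicated backups) cannot deliver.
```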
Blocks & Files: It’s taken as a truism that SSDs use less electricity than HDDs. Why would you think this is or is not the situation in a storage system?
Paul Speciale: For all types of drives, average power consumption will depend on workload characteristics.
From flash vendor spec sheets we see that high-density QLC flash drives consume 5W (idle), 15W (read) to 20W (write) per drive (see Micron’s data sheet and Kioxia’s data sheet).
For high-density HDDs, power consumption ranges between 7W (idle) and 9.4W (max) per drive (see Seagate’s data sheet).
In summary, we don’t believe that there is a sufficient difference in terms of power consumption for this to be the key buying criteria for customers.
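For a sense of scale, here is our quick watts-per-terabyte calculation using the spec-sheet wattages above. The drive capacities (a 30.72TB QLC SSD and a 22TB nearline HDD) are our assumptions, not figures Speciale supplied:

```python
# Watts per TB for assumed high-density drives, using the per-drive
# wattages quoted above from vendor spec sheets.

SSD_TB, HDD_TB = 30.72, 22.0          # assumed drive capacities

ssd_watts = {"idle": 5.0, "read": 15.0, "write": 20.0}
hdd_watts = {"idle": 7.0, "max": 9.4}

for state, watts in ssd_watts.items():
    print(f"SSD {state:>5}: {watts / SSD_TB:.2f} W/TB")
for state, watts in hdd_watts.items():
    print(f"HDD {state:>5}: {watts / HDD_TB:.2f} W/TB")

# Idle: SSD ~0.16 W/TB vs HDD ~0.32 W/TB; active: SSD ~0.49-0.65 W/TB vs
# HDD ~0.43 W/TB. Close enough that workload mix, not media type,
# dominates the power bill - which is Speciale's point.
```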
Blocks & Files: If SSDs and HDDs use the same or approximately the same electricity in watts-per-TB terms, then what cost factors other than acquisition cost can enterprises use to differentiate between SSD and HDD storage?
Paul Speciale: The buying criteria will come down to use-cases: QLC flash is warranted for more latency-sensitive, read-intensive workloads, whereas HDDs are now proven over the long term and optimal for most other petabyte scale unstructured data workloads.
Blocks & Files: How would you compare the working life of SSDs and HDDs? Is one longer than the other? Do they have the same enterprise refresh period?
Paul Speciale: All vendors across these drive types offer five-year warranties on their drives. In addition, flash vendors place a recommended maximum Drive Writes Per Day (DWPD) rating on their drives, which impacts the workloads for which they are suitable, so customers need to be careful not to exceed these ratings.
Blocks & Files: How would you compare the rack density of shelves of SSDs versus shelves of disk drives? Do you think that shelves of HDDs in a rack need more or less cooling than shelves filled with the same capacity of SSDs?
Paul Speciale: We are doing additional analysis on density and cooling characteristics and will follow up when this concludes.
Blocks & Files: How do you think enterprises should calculate the cost of HDD storage vs the cost of SSD storage?
Paul Speciale: As stated previously, the key considerations are related to the use cases/workload characteristics. This will determine whether or not the higher cost of flash media will deliver meaningfully better performance for the application (and again, flash delivers benefits mainly for latency-sensitive, read-intensive workloads).
Scality, as a software-defined storage provider, supports all types of drives including HDD and high-density flash, and our value proposition doesn’t depend on any specific drive. We therefore really don’t have a horse in the race between flash and HDD. Ultimately what matters is customers and the problems they are trying to solve. Today, the long-proven nature of HDDs in terms of density, durability, and price/performance makes them the optimal solution for most multi-petabyte object storage use cases. This is especially true for backup applications, which are write-heavy workloads that do not benefit from the lower latency of flash media.
It was a tough Q1 of fiscal 2024 for Dell as revenues fell by a fifth on the back of shrinking PC and infrastructure sales.
Turnover for the quarter ended May 5 was $20.92 billion, beating Dell’s guidance, and it recorded a net profit of $578 million, down 46 percent on a year ago.
The Client Solutions Group – PCs – pulled in $12 billion, down 23 percent, representing the fifth straight quarter of declines. The Infrastructure Solutions Group – servers, networking and storage – raked in $7.6 billion, down 18 percent, to end five successive growth quarters. Servers and networking accounted for $3.8 billion of that, down 24 percent. Storage revenues were 11 percent down, also at $3.8 billion.
Chuck Whitten
Co-COO Chuck Whitten said: “We executed well against a challenging economic backdrop. We maintained pricing discipline, reduced operating expenses, and our supply chain continued to perform well after normalizing ahead of competitors.”
Financial summary
Gross margin: 24 percent of net revenue
Operating expenses: $3.949 billion, down 7 percent
Operating cash flow: $1.8 billion
Cash and investments: $9.2 billion
This is Dell’s seasonally low quarter for storage, and it saw soft storage demand, although the PowerStore and PowerFlex products did well. We note that storage’s 11 percent revenue fall was less than half the 24 percent server and networking revenue drop, and also below the 23 percent CSG segment fall. Storage did proportionately much better inside Dell.
In contrast, HPE storage revenues declined 3 percent in its latest quarter, NetApp’s shrank 6 percent, and Pure’s 5 percent. With a wider storage portfolio and many more customers, Dell looks more exposed to the overall macroeconomic situation than these competitors.
We should also remind ourselves that Dell says its external storage revenue market share is larger than these other top players combined.
The overall infrastructure environment is challenging for Dell, with Whitten saying: “Customers aren’t outright cancelling digital projects, but they are prioritizing spend and they continue to constrain investments in infrastructure hardware after the burst of pandemic spending the last couple of years.” Servers are challenged in large bids with enterprise customers, while storage demand downturns typically lag server demand: “And after seeing slowdowns in small and medium business in Q4, we saw a slowdown in larger customers in Q1 … we’re clearly seeing a lull in the storage market.”
Dell expects any storage demand improvement to occur after server demand rises.
Generative AI hype
In the earnings call Whitten said: “We are seeing a lot of demand for AI-optimized infrastructure, that’s obviously a very good thing for our business.” Customers are exploring AI in their datacenters and at the edge. Demand for Dell’s XE9680 8-way GPU AI server “has been very good … but we’re also seeing demand across our portfolio.”
Jeff Clarke
Dell sees the pricing environment becoming more aggressive and claims it won’t take deals that lower its profit margins. It is focusing, co-COO Jeff Clarke said, on long-term market share gains in “the advantaged opportunities for us, commercial PCs, workstations, high-end consumer PCs and gaming, and the opportunities that exist in our broad storage portfolio and server business … AI is going to drive demand for our business.”
He cautioned that it was early in the demand cycle, saying “AI-optimized servers are still a very small part of our overall server mix,” and it will take time to influence the overall server business.
Clarke added: “the buzz is on generative AI and large language models. And it’s an incredible opportunity … What customers are trying to do is to figure out how to use their data with their business context to get better business outcomes and greater insight to their customers and to their business … We think the more specific opportunity is around domain-specific and process-specific generative AI, where customers can use their own data.”
Hence the Project Helix announcement: “An opportunity to take enterprise AI at scale, make it easy to deploy, easy to design, easy to put and install, and use pre-trained models, tuned models and be able to drive inference out in the data center and at the edge.”
Whitten said the AI business represents incremental growth and margin dollars. “The ASPs of our AI-optimized servers are a multiple of our normal server average, and we see lots of opportunities to provide services around this infrastructure.”
Outlook
CFO Tom Sweet said: “We expect Q2 revenue to be in the range of $20.2 billion and $21.2 billion, or between down 3 percent and up 1 percent sequentially, with a midpoint of $20.7 billion.” That would be a 21.7 percent year-on-year fall at the midpoint. Within that: “We expect CSG Revenue to be roughly flat sequentially and ISG down in the low single digits sequentially.”
The full fiscal 2024 outlook is for revenues to be down between 12 percent and 18 percent, down 15 percent at the midpoint, which implies a return to sequential growth in the second half of the year. Dell said that while there is near-term uncertainty, it has a strong conviction in the growth of its TAM over the long term. Short-term pain should lead, it hopes, to long-term gain.
Qumulo has introduced non-disruptive upgrades, adjustable protection policies enabling capacity reclamation from large clusters, and HPE Alletra 4110 support.
The scale-out, parallel file system supplier says its Transparent Platform Refresh (TPR) feature enables old to new appliance upgrades with no end-user disruption and no need for a migration exercise. The Adaptive Data Protection (ADP) feature can enable customers to get back hundreds of terabytes of capacity as their cluster grows, and so save cash.
Ryan Farris, Qumulo’s VP of product, provided the announcement statement: “With more platform options and capabilities to boost efficiency and simplify refreshes at scale, our customers can ensure their end-users and workloads are fully supported even in a turbulent macroeconomic environment.”
Aaron Oshita.
Qumulo senior product marketing manager Aaron Oshita blogged about TPR, writing: “Platform refreshes with Qumulo are now as simple as: set a plan, add new nodes, remove old nodes, and you’re done! Your data effortlessly ‘glides’ over to your new appliances.”
ADP builds on Qumulo’s existing erasure coding by letting users adjust its settings. Larger clusters can stripe data across more nodes and, as a consequence, need proportionally less overhead space set aside.
A six-node cluster could lose two nodes to erasure coding parity overhead, whereas a 16-node cluster might still only need two. Oshita writes: “The cluster would still be protected against multiple concurrent disk or node failures, and the customer would have effectively expanded their cluster’s usable capacity by 21 percent.”
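The arithmetic behind that claim is simple erasure-coding bookkeeping: usable capacity is the data-to-total ratio of the stripe. This sketch uses assumed stripe widths for illustration, not Qumulo’s actual configuration:

```python
def usable_fraction(stripe_width: int, parity: int = 2) -> float:
    """Fraction of raw capacity left for data in a data+parity stripe."""
    return (stripe_width - parity) / stripe_width

narrow = usable_fraction(6)   # e.g. 4 data + 2 parity on a small cluster -> 0.667
wide = usable_fraction(10)    # e.g. 8 data + 2 parity on a bigger cluster -> 0.800

# Widening the stripe while holding parity at two reclaims usable space:
print(f"Usable capacity gain: {wide / narrow - 1:.0%}")  # ~20%, near the quoted 21%
```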
Qumulo diagram.
HPE’s Alletra 4000 is the rebranded Apollo 4000 server line. The two latest 4000s – the 4110 and 4120 – both have Gen 4 Xeon (Sapphire Rapids) processors and PCIe 5 connectivity. Qumulo says its supported Alletra 4110 is highly performant and well suited to AI, machine learning, and other workloads needing a powerful server.
It is the first Qumulo-certified product with Enterprise and Data Center Standard Form Factor (EDSFF) NVMe flash drives. These, it claims, have significant space and thermal efficiency advantages over M.2 (gumstick) and U.2 (2.5-inch) format SSDs.
Kasten’s founders have moved from backing up containers to protecting Microsoft 365, with their latest startup Alcion and its AI-enhanced backup-as-a-service based on open source Corso software.
The pitch is that SaaS apps need backing up and protecting against malware. The largest and most popular such app is the Microsoft 365 suite. API access can be used to integrate external systems, and AI can be used to enhance ransomware attack detection and response. An open source tool base can engender a community around Alcion, and so grow users faster than otherwise.
Niraj Tolia
CEO and co-founder Niraj Tolia blogs: “Our mission is to protect all the world’s data against both malicious threats and accidents, and we are building a company to accomplish this, with community and intelligence as the foundational building blocks.”
Alcion is an early-stage startup and its M365 coverage is limited. It protects Exchange, OneDrive, and SharePoint files within Document Libraries; support for SharePoint Pages and Lists, and for Teams, is coming soon.
Backups are stored in an S3-style vault of the user’s choice – meaning AWS, Backblaze and GCP, with Azure Blob support coming soon.
The biz says it has an efficient and simple-to-use GUI, and time from onboarding to first backup is in the single-digit minutes area. Backups can be scheduled manually or invoked automatically – for instance Microsoft Defender threat signals can invoke proactive backups – with multiple daily backups that adjust to user activity. This feature involves predictive analytics and AI.
Alcion says it has a malware elimination feature. Discovered malware is skipped to create clean backups, which can be used to recover from ransomware attacks. Ransomware attacks can be detected as they happen – though no details are supplied as to how this is done. The software also has delayed and cancelable deletion of backups to thwart cyber attacks – this prevents backups from being easily targeted by hackers or malicious insiders.
The functional features include a resilient, high-throughput, fault-tolerant data mover optimized for Microsoft 365 API limitations. Microsoft’s Graph API enables external programs to access Microsoft’s cloud service resources and was initially released as the Office 365 Unified API.
Alcion says its backup engine is purpose-built to work with the Graph API’s unpredictable latencies and error rates to deliver fast and reliable backups. It will recover from backup failures in case of transient outages or upstream throttling of the Microsoft APIs.
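Alcion hasn’t published the data mover’s internals, but coping with Graph API throttling generally means honoring HTTP 429 responses and their Retry-After headers. A minimal sketch of that pattern, assuming the standard requests library (our illustration, not Alcion’s code):

```python
import time
import requests

def graph_get(url: str, token: str, max_retries: int = 5) -> requests.Response:
    """GET a Microsoft Graph resource, backing off on throttling (429)
    and transient server errors (5xx). Illustrative only."""
    headers = {"Authorization": f"Bearer {token}"}
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code == 429:
            # Graph says how long to wait when it throttles a caller
            delay = int(resp.headers.get("Retry-After", 2 ** attempt))
            time.sleep(delay)
        elif resp.status_code >= 500:
            time.sleep(2 ** attempt)   # exponential backoff on outages
        else:
            return resp
    raise RuntimeError(f"Gave up on {url} after {max_retries} attempts")

# e.g. graph_get("https://graph.microsoft.com/v1.0/me/drive/root/children", token)
```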
Backups are based on incremental changes for speed, but are always stored as full backups for fast restores. Alcion provides end-to-end encryption, tenant isolation, per-tenant encryption keys, and incident reporting.
Other companies protecting Microsoft 365 include Asigra, Cohesity, Commvault’s Metallic, Druva and HYCU – but not OwnBackup, which protects Microsoft Dynamics 365.
Background
Alcion was started in 2022 by CEO Niraj Tolia and VP Engineering Vaibhav Kamra. They previously founded containerized app protection startup Kasten, which was acquired by Veeam in October 2020 for $150 million in stock and cash. Alcion is a distributed company with its headquarters in Santa Clara, CA. It has taken in $8 million in seed funding.
NetApp’s latest quarterly revenues have fallen, in line with HPE and Pure Storage.
Revenues in NetApp’s Q4 of its fiscal 2023, ended April 28, 2023, were $1.58 billion, down 6 percent year-on-year, and profit declined 5.4 percent to $245 million. Full FY2023 revenues came in at $6.36 billion, a mere 0.6 percent more than FY2022, but with profits up 35.5 percent to $1.27 billion.
George Kurian, NetApp’s CEO, said: “Our sharpened focus and disciplined execution yielded solid Q4 results in a dynamic environment. … We are entering FY24 with substantial new innovations and a more focused operating model to better address the areas of priority spending.”
Hybrid cloud revenues in Q4 were 8 percent lower at $1.43 billion, but public cloud revenues grew 25.8 percent to $151 million; too small an amount to make much difference to revenues as a whole. Public cloud ARR (annual run rate) rose 23 percent to $620 million. The all-flash array run rate went down 4 percent to $3.1 billion.
Financial Summary:
Free cash flow: $196 million vs $343 million last year
Operating cash flow: $235 million vs $411 million a year ago
Cash, equivalents & investments: $3.1 billion
EPS: $1.13 vs $1.14 a year ago
Kurian’s prepared remarks mentioned “ongoing macroeconomic challenges,” a “slow demand environment,” and “headwinds from large enterprises weighed on our product and AFA revenue.” He said NetApp beat its guidance for the quarter and is confident about future growth opportunities, just not for the rest of the year though, judging by its outlook.
Things get worse. The next quarter’s outlook is for revenues of circa $1.4 billion, a 12 percent drop on a year ago. The full FY2024 guidance is for a low-to-mid single digit revenue decline, say 2 to 5 percent lower.
NetApp has introduced new products recently: the all-flash ASA A-Series SAN array and the QLC SSD-based C-Series, lower cost than its other all-flash ONTAP arrays. The outlook doesn’t suggest these will lift next quarter’s revenue though.
The company is stuck in a short-term no-growth situation, with its public cloud operations not contributing significantly to its fortunes, and sales of its on-premises arrays and data fabric-based hybrid private-public cloud bridges just not expanding. It is managing its business tightly, keeping costs under control and delivering profits.
It’s not seeing, for example, any significant boost from the current generative AI hype which has driven Nvidia’s stock price and market prospects sky high. This is despite Kurian claiming: “In Q4, we demonstrated industry-leading performance in the GPU Direct benchmark, proof of our ability to enable customers to use the full power of GPU technology for AI.”
Kurian sees the public cloud as the key market, saying: “We believe that our first party storage services, branded and sold by our cloud partners represent our biggest opportunity.”
The sales force has been aligned to this, he added: “We have not wavered in our conviction that Public Cloud services has the potential to be a multibillion-dollar ARR business for us. While the shift to cloud is experiencing an industry-wide slowdown, the long-term trend in favor of cloud is unchanged.”
But the outlook is the outlook, and the outlook is for no growth. Contrast this with Pure’s 5 percent growth guidance for its next quarter. HPE’s next quarter storage revenue guidance is flat – no growth and no decline.
Pure Storage revenues declined in its first quarter of fiscal 2024 as enterprise and cloud customers slowed buying activity due to the uncertain economic situation.
The $598 million brought in for the quarter ended May 7 was down 5 percent on a year ago, with a loss of $67.4 million compared to an $11.5 million loss the prior year. It was Pure’s first decline after eight consecutive growth quarters, in keeping with quarterly results from HPE storage, down 3 percent, and NetApp, down 6 percent. The previous year’s Q1 revenues were boosted by a $60 million Meta deal; ignoring that, there would have been 5 percent revenue growth this year. Pure gained 276 new customers in the quarter, the lowest number for 10 quarters.
Charles Giancarlo
CEO and chairman Charlie Giancarlo said: “We are the clear leader in data storage, now delivering a portfolio that can address the vast majority of storage needs for all enterprises. The superior economics, performance, and operational and environmental efficiencies of Pure’s product portfolio over both hard disk and SSD-based, all-flash competitive offerings are now undeniable.”
This “clear leader” statement is one that other suppliers, with larger storage sales, would find difficult to reconcile with their reality. HPE, for example, just reported $1 billion in storage sales and NetApp recorded $1.6 billion. In March Dell reported $5 billion in storage revenues, 8.4x higher than Pure’s $598 million. Pure looks to be fifth in the storage revenue rankings, behind Dell, HPE, Huawei and NetApp – an odd sort of leadership in data storage.
Financial summary
Gross margin: 70.1 percent
Operating cash flow: $173.2 million
Free cash flow: $121.8 million
Total cash, equivalents & securities: $1.2 billion
Pure said it had a record pipeline for its latest FlashBlade//E product, its QLC flash-using wannabe nearline disk array killer. Giancarlo re-emphasized this aspect of the product, saying: “As I have stated in the past, the days of hard disks are coming to an end – we predict that there will be no new hard disks sold in 5 years.”
He said: “FlashBlade//E is the second in a series of products that can compete for the secondary tier, and soon lower tiers, of the storage market entirely dominated today by hard disks.” That “lower tiers” remark is a nod, we understand, to even higher capacity flash drives that Pure will announce in the next few months.
In his view: “With the introduction of our //E product line, Pure can now compete for customers’ entire storage estate, enabling Pure to become their complete storage partner for the first time.” We’d gently remind him that some potential customers, such as the three top cloud service providers and other large enterprises, use tape storage for archival reasons, and Pure has no offering in the tape market. Charlie Giancarlo is not one to undersell the company he both leads and chairs.
The Evergreen//One Pure-as-a-Service subscription business doubled revenue year-on-year. That drove subscription services revenue up 28 percent to $280.3 million. Subscription ARR rose 29 percent to $1.2 billion. Pure closed a near $10 million deal for its Cloud Block Store in the quarter.
Pure claimed it is the chosen vendor for AI environments across a broad range of industries, notably media and entertainment, pharma, healthcare, aerospace, transportation, and financial services. Giancarlo said: “We expect our leading role in AI to continue to expand, but we are equally excited that the requirements for big data will drive even more use of high performance flash for traditional bulk data.” He is confident Pure will increase its storage market share, outgrowing its competitors.
Revenue guidance for the next quarter is $680 million, which would be 5 percent higher than the year-ago Q2. Full 2024 guidance is for 5-9 percent growth over 2023’s $2.8 billion.
How do storage suppliers find a way to be relevant in the ChatGPT world? Generative AI using large language models (LLM) has received an enormous boost with ChatGPT and its successors. IT industry execs including Nvidia’s Jensen Huang, Alphabet’s Sundar Pichai and Microsoft’s Satya Nadella are all saying this is an epoch-defining moment in AI development and tech.
Several storage suppliers are directly involved in delivering storage capacity to the GPU servers used in generative AI processing, including Dell, DDN, HPE, IBM, NetApp, Panasas, Pure Storage, VAST Data and Weka. Others, such as Databricks, Snowflake and their ilk, are fielding datasets for use by the models. Building vector databases? Think Pinecone.
Every storage supplier faces the same problem here. How do you define and promote your product/service relevance in this generative AI tidal wave?
They may all see their markets affected in some way by AI, and could have to add AI interfaces for their customers or find ways to help their customers use AI resources. Data lifecycle and governance manager Komprise is already walking this path, as we found out in an interview with president and COO Krishna Subramanian.
Blocks & Files: Is there a role for data management/governance in this new world of generative AI?
Krishna Subramanian
Krishna Subramanian: Yes there is, and we are calling this SPLOG: Security, Privacy, Lineage, Ownership, and Governance.
If you have sensitive or PII (personally identifying information) data and are using AI on it, how do you protect that proprietary and protected data? You don’t know if you are generating results from another company’s PII, and your own company’s PII and IP data could get leaked. There are few controls here for keeping your data in your own domain and preventing leakage, so it doesn’t get used by a general LLM.
Data lineage is another way to describe data transparency: if an LLM gets data from general sources, how do you know if it’s verified, gathered with consent, or whether it contains bias or inaccuracies? There is no requirement to share source data now, and it is very difficult to track data sources.
There is a data ownership angle. Who does the IP belong to if you use a derivative work, and who’s liable if something goes wrong? This requires legal coordination and likely regulation.
From a data governance viewpoint, you need to ask how you know who did what with which data, so you can ensure compliance and investigate any issues that may arise with your data in an LLM. You need a framework for this internally.
Blocks & Files: How can Komprise help?
Krishna Subramanian: Our software can get you a solid understanding of your own data wherever it resides. Where’s your sensitive PII and customer data? Can you monitor usage to make sure these data types aren’t inadvertently fed into an AI tool and to prevent a security or privacy breach?
You can tag data from derivative works by the owner, or by the individual or department that commissioned the project, to help with compliance and tracking.
Our software can recognize when unintentional data leakage occurs and alert when sensitive or corporate data is shared with LLMs.
Blocks & Files: Could you use a generative AI to monitor another generative AI’s use of data?
Krishna Subramanian: As you know, generative AI is basically good at things like natural language processing that require deep pattern recognition and the ability to “generate” new content based on prior learnings. So, could a generative AI be used to look for patterns in another AI’s output to try and police things? The answer is yes, and we already see examples of this in tools that are now being used to spot if students are cheating using ChatGPT.
However, this assumes that you can see recognizable patterns in what generative AI creates – this is certainly true right now, but as generative AI advances, I think this will become harder and harder to do. In summary, I would say yes, AI can be used to spot other AI’s use of data as long as you can characterize what you are trying to police in some recognizable patterns (e.g. use of PII or detection of certain proprietary IP patterns).
The power and the danger with generative AI lies in the fact that it is not deterministic – just as you cannot always control a toddler’s behavior, you cannot deterministically predict the output of generative AI, making it hard to police.
Blocks & Files: Could Komprise add a generative AI interface to its capabilities?
Krishna Subramanian: Komprise can add a conversational element to our interface, powered by generative AI, that responds to prompts and uses as its underlying logic an analytics-driven data management framework, one leveraging both generative AI and more deterministic machine learning and analytics-driven predictive automation techniques to address data governance, security, and data management.
So, generative AI adds a natural-language interface to the analytics-driven data management that Komprise already offers, but in such a way that the logic is still deterministic and verifiable.
Blocks & Files: Could a Komprise user, for example, ask a Komprise generative AI: “What PII data do I have in my data estate?” Or “Is any not protected?” And “Apply protection policy X to all unprotected PII data.”
Krishna Subramanian: Komprise already answers these questions, except that you would phrase these currently as queries in Komprise as opposed to natural language prompts.
Komprise analyzes all the data it is pointed at, creates a Global File Index and classifies the data so you can create such queries as well as take actions using Komprise Deep Analytics Actions by setting policies. Adding a generative AI chatbot will create the ability to interface with natural language to the functionality we already provide.
In addition, a key role we can provide through data management is to help organizations ensure the security and privacy of their data as their employees interface with AI applications – by answering questions like, “Am I exposing PII data to an AI application”, or flagging when employees may be inadvertently sharing corporate data and creating data leakage with an AI application. This goes beyond Komprise using AI in its own interface to Komprise helping organizations leverage generative AI safely while protecting their data security and privacy.
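As an illustration of the architecture Subramanian describes (ours, not Komprise’s; every name below is hypothetical), the generative layer only translates language into the same structured, auditable queries the deterministic engine already runs:

```python
# Hypothetical sketch: an LLM front-end translating a natural-language prompt
# into a structured query against a deterministic file index. None of these
# functions are Komprise APIs; they illustrate the architecture described above.

def translate_prompt(prompt: str) -> dict:
    """Would call an LLM to map language onto a fixed, auditable query schema."""
    # e.g. "What unprotected PII data do I have?" ->
    return {"classification": "PII", "protected": False}

def run_query(index: list[dict], query: dict) -> list[dict]:
    """Deterministic and verifiable: a plain filter over the file index."""
    return [f for f in index if all(f.get(k) == v for k, v in query.items())]

index = [
    {"path": "/hr/payroll.csv", "classification": "PII", "protected": False},
    {"path": "/eng/design.doc", "classification": "IP", "protected": True},
]
print(run_query(index, translate_prompt("What unprotected PII data do I have?")))
```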
Blocks & Files: Could a Komprise generative AI be made proactive? For example, ask a Komprise generative AI “What unprotected PII data do I have in my data estate?” The response is: “New data set Y contains a list of all unprotected PII data. Shall I protect it?” What do you think?
Krishna Subramanian: Absolutely – “adaptive automation” is the broader category you are referring to, and this has always been our goal. Generative AI is simply one tool in achieving this, but more broadly, how can we continually add to our analytics to proactively learn more from data, can we anticipate what could happen and protect against it, can we be smarter in how we automate – these are all the areas where we see data management evolving.
Storj is a decentralized – but not Web3 – storage company with the sales pitch being that it can provide fast and reliable enterprise-class public cloud storage services and undercut Amazon S3 pricing.
Web3 or dStorage, like Protocol Labs, Djuno and Zus, is characterized by its use of providers’ spare capacity, blockchain, cryptocurrency, and a wish to tear down the walls of Web 2.0 storage, exemplified by AWS and Azure. It may even want to replace everyday, state-backed fiat currencies like the dollar with blockchain-backed cryptocurrencies.
Storj supplies its DCS cloud storage from multiple small datacenters and operators, utilizing their spare or stranded capacity. COO John Gleeson told us: “We have this network of 24,000 endpoints all over the world.” In more than 100 countries, in fact, and the endpoint total has grown from 15,000 at the end of last year.
Incoming files are encrypted, divided (sharded) into 64MB segments, erasure-coded, and the fragments written across multiple distributed providers in a continuous single namespace. Blockchain technology is not used.
When a file is accessed (read), it needs reassembling, which is done using the erasure-coded fragments nearest to the requesting system. The full set of fragments is not read, meaning that not all the endpoints storing the file need to supply their portion of it. An edge-hosted Storj intermediary agent gets enough erasure-coded fragments, a minimum of 29 of each segment’s 80 pieces, to completely reassemble and decrypt the file or object and present it to the requesting system.
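As a sketch of that selection logic (our illustration, not Storj’s client code): since any 29 of a segment’s 80 pieces suffice to reconstruct it, the client can race the storage nodes and keep the fastest arrivals:

```python
PIECES_PER_SEGMENT = 80   # erasure-coded pieces written per 64MB segment
PIECES_NEEDED = 29        # any 29 pieces are enough to rebuild a segment

def fetch_segment(nodes: list[dict]) -> bytes:
    """Rebuild one segment from the fastest-responding nodes.

    `nodes` holds {'latency_ms': float, 'piece': bytes} entries, one per
    stored piece. A real client downloads in parallel and cancels
    stragglers; sorting by measured latency approximates that race.
    """
    fastest = sorted(nodes, key=lambda n: n["latency_ms"])[:PIECES_NEEDED]
    return erasure_decode([n["piece"] for n in fastest])

def erasure_decode(pieces: list[bytes]) -> bytes:
    # Placeholder: a real client runs Reed-Solomon decoding here.
    raise NotImplementedError("Reed-Solomon decode omitted from this sketch")
```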
Erasure-code reassembly is fast relative to network transfer, so fetching only the nearest fragments saves on network transmission time, particularly as Storj capacity providers can be distributed globally. Storj claims it can pull content delivery network-class performance from its distributed S3-compatible store.
The company states: “Compared to S3 single-region storage, Storj has more consistent throughput because it pulls from many nodes simultaneously, drastically increasing the probability of uncongested paths. S3 performance is affected by intermittent internet slowdowns. This is especially true when the S3 origin is far away from the download location.”
You can view a Storj webinar covering its performance advantage over AWS S3 here and an Edinburgh University study of Storj performance video here.
Naturally, the erasure coding also protects against disk, SSD, and provider failures. Storj provides 99.97 percent availability and 11 nines of durability, and not one file has been lost in more than three years, we’re told. Its services are backed by SLAs, and it can also provide geofencing to restrict a stored file’s location to a particular geographic region where data sovereignty concerns exist. The company can also regionalize its network to provide business continuity.
Because Storj does not have to build or lease its own datacenters and infrastructure, its costs are dramatically lower than those of a tier 1 or tier 2 cloud service provider; up to 80 percent lower than AWS S3 is the claim. This also makes the Storj public cloud less environmentally damaging than other clouds, as it uses already-built capacity, not new capacity.
Storj says it provides affordable, reliable and performant decentralized storage without the blockchain, external cryptocurrency and change-the-world evangelical disadvantages of Web 3.0 storage. It’s an offering for business and public sector users with SLAs that use distributed spare capacity – in roughly the same way that Airbnb uses spare housing capacity and Uber spare vehicle capacity.
HPE grew revenues 4 percent in Q2 of its fiscal 2023 – albeit missing previous guidance – with stronger sales of edge networking and HPC helping to offset tumbling server shipments that dragged down total storage turnover.
Group revenues were $6.973 billion in the quarter ending April 30 with a profit of $418 million, up 67 percent annually, helped by previous efforts during the pandemic to trim operating expenses.
Antonio Neri
CEO and President Antonio Neri trilled: “Building on a great start to the fiscal year, HPE grew revenue, increased the contribution of recurring revenue through the HPE GreenLake edge-to-cloud platform, and delivered exceptional profitability to generate a strong second quarter performance.”
EVP and CFO Tarek Robbiati said: “These results demonstrate that our strategy to pivot our portfolio to higher-growth, higher-margin areas is working – and that we are operating with discipline.”
The company saw longer sales cycles due to the general economic situation but said AI-related sales were a big bright spot.
HPE segment revenue history. Intelligent Edge has overtaken Corp. Investments & Other, then Financial Services, and now Storage
The largest business segment for HPE is servers, which brought in $2.761 billion, down 8 percent year-on-year and the second consecutive quarter of declines. Wells Fargo analyst Aaron Rakers pointed out that profitability increased for the Compute division as average server prices grew in single-digit percentages on the back of richer configurations sold.
The Intelligent Edge (Aruba) division reported revenues of $1.3 billion, up 50 percent year-on-year. This eclipsed storage divisional revenues of $1.043 billion, which declined 3 percent annually. Storage was, however, a tale of two halves, as external storage (Alletra) sales grew by triple digits – for the fourth consecutive quarter – but not quite enough to counter the fall in storage based on HPE’s servers, which declined at a greater rate than Alletra grew. All-flash array sales grew 20 percent annually.
Financial services recorded $858 million in revenues, up 4 percent, with HPC and AI reporting revenues of $840 million, 18 percent higher than a year ago. Tail runner Corporate Investments and Other had $296 million in revenues, down 9 percent.
GreenLake, HPE’s cloud-like subscription business, has now captured more than $10 billion in total contract value to date and the Annual Revenue Run-rate grew 35 percent year-on-year in HPE’s Q2 to $1.1 billion.
Financial Summary
Gross margin: record 36 percent, up 3.6 points Y/Y
Operating cash flow: $889 million
Free cash flow: $288 million
Diluted EPS: $0.32, up 68 percent Y/Y
Share repurchases and dividends in quarter: $261 million
Neri said during the earnings call that HPE saw “some decline in the health of macroeconomic conditions, causing unevenness in customer demand, particularly in general purpose computing.”
He added: “European, Asian and mid-sized company deals are holding up better than expected while large enterprise businesses and customers in certain sectors, such as financial services and manufacturing in North America have been more conservative.”
Sales cycles are being stretched as some customers become more hesitant to commit to new projects. Neri mentioned “some sluggishness from large accounts.”
The AI opportunity
The CEO said: “Our shift to a higher-margin portfolio mix led by the Intelligent Edge segment, and the strong demand for our AI offering, further strengthen the investment opportunity for our shareholders.”
Robbiati amplified this: “The emergence of large language models and generative AI has prompted many inquiries from our customer base during the past quarter, which are turning into pipeline and orders. In the past few months, we have multiple large enterprise AI wins totaling more than $800 million and counting. This includes large AI as a service deals under the GreenLake model.”
HPE intends to invest organically and inorganically in AI.
Neri said what HPE had experienced in AI was “simply amazing, breathtaking. In some cases, I consider AI a massive inflection point no different than web 1.0 or mobile in different decades. But obviously the potential to disrupt every industry, to advance many of the challenges we all face every day through data insights, is just astonishing.”
He reckons HPE is well-positioned with a hybrid AI market opportunity because it can support AI inference at the edge as well as in the data center, in supercomputing and in the cloud.
Neri highlighted HPE’s second exascale system, Aurora, for Argonne National Laboratory: “Aurora will develop a series of generative AI models to run on the HPE Cray EX supercomputer.”
We can expect to hear more at the HPE Discover event in Las Vegas next month.
HPE is uprating its profitability and EPS guidance for the full year on the back of the latest financial numbers.
China
HPE expects to see more money coming in from China via its H3C joint venture with China’s Unisplendour International Technology (UNIS). HPE owns 49 percent of H3C Technologies, which sells HPE kit in China. It has entered a Put Share Agreement to sell its H3C shares for $3.5 billion in cash to UNIS. This transaction could take place in the next 6 to 12 months, depending on regulatory approval, but may take longer.
HPE has negotiated a new Strategic Sales Agreement with H3C covering the sales of HPE gear in China, and said it “is firmly committed to serving customers and continuing to do business in China through both direct sales and our partner H3C.”
Outlook
The revenue forecast for Q3 is flat: $6.7 billion to $7.2 billion; $6.95 billion at the mid-point and the same as a year ago. This will be the first full quarter of GreenLake file services, the HPE-VAST Data deal, and we’ll see if that delivers a boost to HPE’s storage sales. If it does then the odds of HPE providing a VAST-based GreenLake object services offering should increase.
CloudCasa this month claimed that open source Velero backup was far more prevalent in the Kubernetes data protection market than Portworx, Trilio or Kasten by Veeam. We note that many of these vendors do not make their figures publicly available but what do they say of CloudCasa’s claims?
CloudCasa supports Velero, and COO Sathya Sankaran claimed Velero had a “two order of magnitude advantage over any commercial product that is out there today.” He based this on Velero’s 100 million-plus Docker pull stats and certain assumptions. Each user can make multiple pulls, but Sankaran said he assumes a mean of 100 pulls per user, which he claims “implies at least a million clusters are downloading.”
He presented an estimated user chart based on one percent of Velero downloads equating to 500,000 clusters deployed. We can’t vouch for its veracity as none of the other vendors were happy to make their numbers public.
Pure Storage and Portworx: The numbers are wrong
We asked a Pure Storage spokesperson about Portworx’s view of the Kubernetes app backup market and whether Pure agreed with the chart’s projection of its user base.
He said: “The Kubernetes app data protection market is growing at a fast CAGR year on year and Portworx, with its offerings of Portworx Backup and Portworx DR, has the most comprehensive enterprise offerings in this space. We cover a wide swathe of use cases from multi-site, multi-cloud backup and recovery to zero RPO metro DR solutions. While we don’t publicly share customer base details, the projections of customer base shared in the chart via mail is off by several orders of magnitude for Portworx.
“We are seeing large enterprises adopt Portworx to protect mission-critical applications in production, at scale. Our customers are regularly deploying hundreds of Kubernetes clusters and 1000s of namespaces and protecting all of them via a single pane of glass with Portworx Backup. An example customer story is here.
“Portworx is squarely focused on enterprise customers adopting Kubernetes for their tier-1 to tier-N workloads. We also cater to the SMB market via our SaaS offerings.
“Velero’s Docker pull numbers, on the other hand, are not a reflection of their traction with actual enterprise customers or even the market. These pulls are mostly from individual developers doing limited testing of a few Kubernetes applications and second tier backup vendors pulling it down as part of their application builds.
“A consistent trend we see with enterprises is that once their developer teams have tested with Velero, they then look for enterprise class Kubernetes data protection offerings that can give them multi-cluster management, fine granular RBAC, enterprise scale, automated application protection, best in class performance, multi-cloud support and inbuilt application hooks. Portworx Backup excels at providing these capabilities to customers and is therefore seeing accelerated enterprise customer adoption.”
Mere wrapping and insignificance
“Offerings that are merely wrapped around Velero, like CloudCasa, have suffered from the same limitations of Velero, do not offer enterprise class data protection, business continuity and disaster recovery and therefore have negligible enterprise customer adoption.”
Will Pure’s Portworx adopt Velero? The spokesperson said: “While the engineering effort to support Velero in our offerings is insignificant, the customer value through such an integration is even more insignificant.
“Velero is really a consumer-class tool, best utilized in limited non-production class use cases. A large majority of our customer base have had experience with Velero, as part of their early Kubernetes journey and understand very well that it is a limited tool best used in some non-production testing.
“Almost all of our install base graduated from their Velero usage to Portworx when considering moving tier-1 applications onto Kubernetes. Some of our large customers thoroughly tested Velero for their enterprise production requirements and gave overwhelming feedback on its limitations, which we captured in this blog. All of these large customers are now expanding their footprint on Portworx as we enable them to move more workloads to Kubernetes.”
Will Pure offer a SaaS version of the Portworx K8s app data protection product?
It already does: “Portworx Backup is available as both a self managed offering and a SaaS offering. The Portworx Backup SaaS offering is a fully managed service that has been available for customers since June 2022. We also announced the industry’s most compelling Free Forever tier on Portworx Backup SaaS in October 2022 at Kubecon Detroit. Apart from the thousands of users who have signed up for Portworx Free Trial offers over the years, we have seen tremendous monthly traction in terms of user signups. More details on Portworx Backup SaaS are in this blog and a second one.”
Pure has announced SaaS-based offerings for all of the Portworx Platform, and launched the first Data Services on Kubernetes SaaS offering via its Portworx Data Services launch last year.
Trilio
David Safaii, Trilio’s executive chairman, told us: “Per the chart provided on ‘Estimated Users’, speaking for Trilio (only), the presented figures are incorrect. Though we will not share exact numbers we can comfortably say that Trilio’s customer base is well north of the number presented in your chart. As relates to users, Trilio’s offering spans departments in organizations creating 000’s of users. For example, there is an engagement at this moment that empowers IT, Network and Engineering Operational teams across an entire Telco with Trilio’s Intelligent Recovery capabilities.
“This has typically been the rule for us and not the exception as organizations look to our enterprise-class platform for scale, broad capabilities/features, support, and vision – which the upper end of the market is happy to pay for and rely on. Coupled with strong ties and deep collaboration with industry leading Distributions such as Red Hat, Trilio eliminates risk and instills confidence in Day 2 Operations.
“So, while there are 20,000 Velero users depicted in the chart, IMHO, it represents a number of individuals that have downloaded the tool. It does not present the story of organizations leveraging it to protect entire production environments. Yes, Docker is an excellent example of this.
“Please keep in mind the number of downloaded free tools by individuals in the wild that were never used or orphaned after they simply did not do what was required – maybe kicking the tires. While there is no way of tracking Velero usage in production states, the good news (about the download figure) is that it paints a story of a rapidly growing Kubernetes market. Developers and other individuals are all learning and building. Naturally, in the short term, they will want to leverage free when possible.
“Where do we go tomorrow with Velero? We will not comment on our product plans but Trilio will co-exist. There is no denying that Trilio’s technology will continue to lead from the front.”
Kasten by Veeam
A Veeam spokesperson told us regarding Kasten by Veeam and Velero: “Veeam has … nothing to add to the story.”