
Storage News Ticker – November 3

Open-source player Airbyte, which provides source-to-destination data connector tech, has a “Powered by Airbyte” version that enables software developers to embed over 100 integrations into their applications. Airbyte’s library of connectors enables data movement and synchronization between various sources and destinations. Pricing for Powered by Airbyte is based primarily on the number of customers syncing data through Airbyte.

Kubernetes app protector CloudCasa by Catalogic says its CloudCasa backup-as-a-service platform is now available as a self-hosted option for enterprises and service providers. This new deployment option has all the features of the SaaS version and offers full control and enhanced sovereignty for enterprises with air-gapped clusters. It allows organizations to provide self-service backup and recovery to their developers with single sign-on authentication and granular role-based access control. The product will make its public debut at KubeCon + CloudNativeCon North America 2023, November 6-9 in Chicago. Also being introduced are single sign-on integration for enterprise SaaS customers and enhanced Velero onboarding and support services.

AWS Bedrock is a fully managed, serverless service that connects to a choice of foundation models from AI companies. With its general availability, Cloudera is releasing its latest applied ML prototype (AMP) built in Cloudera Machine Learning: a CML Text Summarization AMP built using Amazon Bedrock. Via this AMP, customers can use foundation models available in Amazon Bedrock for text summarization of data managed both in Cloudera Public Cloud on AWS and Cloudera Private Cloud on-premises. Cloudera customers can fine-tune the Amazon Bedrock model using their own labelled data to create an accurate customized model for a specific problem.
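Cloudera hasn’t published the AMP’s code here, but the underlying Bedrock call is simple enough to sketch. The following minimal Python example is our illustration, not Cloudera’s AMP: it invokes a Bedrock-hosted Claude v2 model for summarization via boto3, with the region and prompt being placeholder assumptions.

```python
import boto3
import json

# Bedrock runtime client; the region is a placeholder assumption
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def summarize(text: str) -> str:
    # Request body follows the Claude v2 text-completion format on Bedrock
    body = json.dumps({
        "prompt": f"\n\nHuman: Summarize the following text:\n{text}\n\nAssistant:",
        "max_tokens_to_sample": 300,
    })
    resp = bedrock.invoke_model(modelId="anthropic.claude-v2", body=body)
    return json.loads(resp["body"].read())["completion"]

print(summarize("Storage news: Cloudera releases a text summarization AMP..."))
```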

DataStax, which supplies a scale-out, cloud-native Cassandra NoSQL database and an Astra DB Database-as-a-Service (DBaaS), this week announced the launch of RAGStack, designed to simplify implementation of retrieval augmented generation (RAG) applications built with LangChain. It says that with RAGStack companies benefit from a pre-selected set of the best open-source software for implementing generative AI applications, providing developers with a ready-made solution for RAG that uses the LangChain ecosystem including LangServe, LangChain Templates and LangSmith, along with Apache Cassandra and the DataStax Astra DB vector database. This removes the hassle of having to assemble a bespoke solution and provides developers with a simplified, comprehensive generative AI stack.

Dell Technologies and Meta are enabling Meta’s Llama 2 AI large language model to run on Dell Generative AI systems on premises. Dell intends to be the preferred on-premises infrastructure provider for customers deploying Llama 2 with their own IT. Llama 2 models can run in Dell-equipped data centers or edge sites. The Dell Validated Design for Generative AI with Meta’s Llama 2 provides pre-tested and proven Dell infrastructure, software and services, with fully documented deployment and configuration guidance. A blog can tell you more.

Europe’s largest forklift and warehouse truck supplier KION Group is using Dremio’s lakehouse software, running on Azure, to get analytic insights from previously siloed data sets. KION Group employees can now interrogate more than 200 million data records and receive responses to queries in just three seconds, compared to the 30 minutes the task would have taken previously. Dremio’s self-service analytics means that non-technical users, such as product owners or service leads, can run complex data queries without the assistance of IT professionals.

HPE has picked up a £225 million contract to build Isambard-AI, the UK’s fastest supercomputer, at the University of Bristol. Isambard-AI will use next-generation liquid-cooled HPE Cray EX supercomputers and 5,448 Nvidia GH200 Grace Hopper superchips. It is predicted to deliver more than 200 petaFLOPS (200 quadrillion calculations per second) of conventional compute and over 21 exaFLOPS of AI performance, meaning more than 21 quintillion AI-optimised floating point operations per second.

Phase one of the system, available in March 2024, will utilise Isambard 3, a TOP500-class supercomputer service for AI and high-performance computing (HPC), which is due to be installed at the National Composites Centre (NCC), based at the Bristol and Bath Science Park, at the start of 2024. Isambard-AI will use the latest HPE Slingshot 11 interconnect and nearly 25 PB of ClusterStor E1000 Lustre storage optimised for AI workflows. It will be hosted in a self-cooled, self-contained datacenter, using the HPE Performance Optimized Data Center (POD), also situated at the NCC. It will connect with a new supercomputer cluster at the University of Cambridge, called Dawn (see below), which is being developed to offer additional capacity as part of the new national AI Research Resource (AIRR).

Cloud object storage and data protection supplier IDrive is launching a free cloud data migration tool to enable users to transfer data from other object storage providers directly to IDrive e2 with no minimum data limits and no additional fees. Utilizing encryption protocols, IDrive e2 guarantees end-to-end protection for all transferred data. It says the cloud storage provides ransomware protection, object lock, bucket versioning, and centralized access. Data is stored in data centers with physical and biometric security, and all transfers are secured with the TLS protocol. Pricing starts from $20/TB/year with no fees for egress, and there is a $4/TB/month pay-as-you-go option.

Intel and Dell will contribute to the Dawn supercomputer at the Cambridge Open Zettascale Lab for AI modelling work. Dawn will support some of the UK’s largest-ever workloads across both academic research and industrial domains. Importantly, it is the UK’s first step on the road to developing a future exascale system. It will be rolled out in two phases, with phase 1 operating now and phase 2 completing in 2024. The liquid-cooled server hardware consists of Dell PowerEdge XE9640 2RU servers with dual gen 4 Xeon processors and four Intel Data Center GPU Max accelerators, running OpenStack software. There will be more than 1,000 Intel GPUs, indicating at least 256 XE9640s. We have no details on the Dawn storage system.

Mahindra Racing, active in the Formula E championship, selected Kinetica as its real-time database for advanced analytics, citing its speed at processing location-enriched time-series data from sensors covering engine, aerodynamics, and tire performance. The faster the team can process this data, run complex models, and identify critical issues, the quicker it can fine-tune its cars’ setups to try to win more races.

Reuters reports today that China’s Commerce Minister, Wang Wentao, has told Micron CEO and President Sanjay Mehrotra that it would welcome the company extending its footprint in China, indicating a thaw in the previously frigid relations. Wang said: “We welcome Micron Technology to continue to take root in the Chinese market and achieve better development under the premise of complying with Chinese laws and regulations.” China said it will optimize the environment for foreign investment and provide service guarantees for foreign enterprises. On October 9, Wang Wentao told a bi-partisan US delegation led by Senate Majority Leader Chuck Schumer that China is ready to work together with the US to adhere to the principle of mutual respect, peaceful co-existence and win-win cooperation, foster a business environment for Chinese and US business communities, and promote bilateral trade and investment. Micron would appear to be a beneficiary of that. Extending its footprint could mean Micron building or extending a memory fab in China.

Xinnor soups up Lustre for HPC and AI with xiRAID

Software RAID house Xinnor has developed xiSTORE, a storage software stack for the HPC and AI market that combines its xiRAID software with the Lustre parallel filesystem on off-the-shelf hardware.

Xinnor’s xiRAID software is among the fastest software RAID products available. It supports NVMe, SAS, and SATA drives, and works with local or remote block devices over any transport: PCIe, NVMe-oF or SPDK target, over Fibre Channel or InfiniBand. The software presents a local block device to the system and has a declustered RAID feature for HDDs. This places many spare zones over all drives in the array and restores the data of a failed drive to these zones, making drive rebuilds faster. Lustre (Linux + cluster) is an open source parallel filesystem used in HPC and supercomputing. It is the filesystem software used in DDN’s ExaScaler and HPE Cray ClusterStor arrays.
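To make the declustered RAID idea concrete, here is a minimal Python sketch – our illustration of the general technique, not Xinnor’s actual layout algorithm – showing how a failed drive’s zones can be rebuilt in parallel into spare zones spread across every surviving drive:

```python
def rebuild_targets(failed_drive: int, num_drives: int, num_zones: int) -> dict[int, int]:
    """Map each zone of the failed drive to a spare zone on a surviving drive."""
    survivors = [d for d in range(num_drives) if d != failed_drive]
    # Round-robin across survivors: every drive absorbs a share of the rebuild,
    # so reconstruction bandwidth scales with the number of drives in the array.
    return {zone: survivors[zone % len(survivors)] for zone in range(num_zones)}

# Drive 3 of an 8-drive array fails; its 16 zones are rebuilt onto 7 survivors.
print(rebuild_targets(failed_drive=3, num_drives=8, num_zones=16))
```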

Xinnor says xiSTORE performance is optimized for HPC and AI workloads, and supports both HDD and NVMe SSD configurations. It integrates with Lustre clustered filesystems, offers virtual machine management, and has no hardware lock-in. A declustered RAID approach is used for HDD, which delivers drive rebuild times up to 2.6 times faster than ZFS and up to ten times faster than conventional RAID systems, we’re told.

The xiSTORE software supports multiple RAID configurations such as RAID 5, 6, 7.3, N+M/nested/declustered, and is composed of dual-controller building blocks deployed in an HA cluster to eliminate single points of failure. It supports a silent data corruption protection mechanism that scans and fixes errors in the background, with insignificant performance loss.

Users can add more JBODs to one node (scale-up) or enhance both capacity and performance by incorporating additional building blocks (scale-out). Here is an architecture diagram:

Xinnor architecture diagram

It can deliver over 800 GBps sequential read and write from 384 disk drives, a claimed 83 percent efficiency compared to raw device performance, and more than 60 GBps write and >80 GBps read from 16 NVMe SSDs, a claimed 94 percent efficiency. A node block diagram shows the controller components:

Xinnor node block diagram

The RAID engine specifically supports RAID 0, 1, 10, 5, 6, 7.3, 50, 60, and 70. The declustered RAID feature supports dRAID1, dRAID5 (4D+1P, 8D+1P), dRAID6 (8D+2P, 16D+2P), and dRAID7.3 (16D+3P).

The minimal node config is two servers with either AMD EPYC or Intel Nehalem (or newer) CPUs, each with:

  • 1x SAS 9500-16e HBA per NUMA node
  • 1x InfiniBand 200 Gbps adapter per NUMA node
  • 1x SAS JBOD with 84 SAS HDDs + 4 SAS SSDs

The use of off-the-shelf hardware means that the system should be more affordable than commercially available Lustre systems, and not be limited by hardware lock-in.

We referred to an xiSTORE microsite for the information in this article. Xinnor will present xiSTORE at booth 389 at the upcoming Supercomputing Conference (SC23) in Denver, CO, November 12-17. Xinnor also told us about a coming November 29 webinar at which it will showcase two recommended reference architectures: one based on HDDs to address the workloads of traditional HPC applications, and another based on NVMe SSDs to address the new challenges created by AI workloads. Both can be combined in large installations that require both high capacity and performance.

Backblaze blitzes cloud storage speeds with ‘shard stash’ cache

Cloud and backup storage provider Backblaze is speeding up small file uploads by using a fast SSD ingest cache it calls a shard stash.

The business stores ingested files on disk drives, which write data more slowly than SSDs. The company is now writing incoming files simultaneously to disk drives and SSDs, with the SSD-held data stored only until the HDDs have received all the data, at which point the SSD copies are deleted. The result is small file upload speeds as much as 30 percent faster than AWS S3, Backblaze claims.

Gleb Budman, Backblaze CEO, claimed: “Backblaze’s pioneering approach delivers cloud storage at 1/5th the price versus legacy vendors, and our latest innovation maintains those savings while delivering 10-30 percent faster performance versus AWS depending on the file size.”

Details are explained in a Backblaze blog, which says: “Prior to this work, when a customer uploaded a file to Backblaze B2, the data was written to multiple hard disk drives (HDDs). Those operations had to be completed before returning a response to the client.

“Now, we write the incoming data to the same HDDs and also, simultaneously, to a pool of solid state drives (SSDs) we call a ‘shard stash,’ waiting only for the HDD writes to make it to the filesystems’ in-memory caches and the SSD writes to complete before returning a response. Once the writes to HDD are complete, we free up the space from the SSDs so it can be reused.”

SSD vs HDD speeds can be shown by comparing stats from a Seagate Exos 16 TB disk drive and a Micron 7450 Max 3.2 TB SSD:

Backblaze SSD vs HDD speed comparison
Since HDD platters rotate at a constant rate, 7,200 RPM in this case, they can transfer more blocks per revolution at the outer edge of the disk than close to the middle – hence the two figures for the X16’s transfer rate

The SSD is more than 20 times faster at sustained data transfer, more than 2,200 times faster at reading data, and nearly 900 times faster for writes than the HDD, we’re told.

The blog says: “Let’s consider a real-world operation, say, writing 64 KB of data. Assuming the HDD can write that data to sequential disk sectors, it will spin for an average of 4.2 ms, then spend 0.25 ms writing the data to the disk, for a total of 4.5 ms. The SSD, in contrast, can write the data to any location instantaneously, taking just 27µs (0.027 ms) to do so. This (somewhat theoretical) 167x speed advantage is the basis for the performance improvement.”
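The arithmetic is easy to check. A minimal Python sanity check using the blog’s quoted figures:

```python
# Average rotational latency is half a revolution at 7,200 RPM
rpm = 7200
avg_rotational_ms = (60_000 / rpm) / 2           # ~4.17 ms; the blog rounds to 4.2
hdd_write_ms = 0.25                              # 64 KB to sequential sectors
hdd_total_ms = avg_rotational_ms + hdd_write_ms  # ~4.4 ms; the blog rounds to 4.5
ssd_total_ms = 0.027                             # 27 microseconds

print(f"{hdd_total_ms / ssd_total_ms:.0f}x")     # ~164x; with rounding, the blog's 167x
```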

Previously, when a client application uploaded a file to the Backblaze B2 Storage Cloud, a “coordinator pod” split the file into 16 data shards, creating four additional parity shards, and then wrote the resulting 20 shards to 20 different HDDs, each in a different pod.

Now, “upon receiving a file of 1 MB or less, the coordinator splits it into shards as before, then simultaneously sends the shards to a set of 20 Pods and a separate pool of servers, each populated with 10 of the Micron SSDs described above – a ‘shard stash.’ The shard stash servers easily win the ‘flush the data to disk’ race and return their status to the coordinator in just a few milliseconds. Meanwhile, each HDD Pod writes its shard to the filesystem, queues up a task to flush the shard data to the disk, and returns an acknowledgement to the coordinator.”

“Once the coordinator has received replies establishing that at least 19 of the 20 Pods have written their shards to the filesystem, and at least 19 of the 20 shards have been flushed to the SSDs, it returns its response to the client … If power was to fail at this point, the data has already been safely written to solid state storage.” Then the SSD shard copies can be purged.
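A minimal sketch of that coordinator logic – reconstructed from the blog’s description rather than taken from Backblaze’s code – would acknowledge the client once at least 19 of the 20 writes in each pool have landed:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

QUORUM = 19  # of the 20 writes in each pool

def upload(shards, hdd_pods, ssd_stash):
    """Return True once >= QUORUM HDD-pod writes and >= QUORUM stash writes land."""
    pool = ThreadPoolExecutor(max_workers=2 * len(shards))
    hdd_futs = {pool.submit(pod.write, s) for s, pod in zip(shards, hdd_pods)}
    ssd_futs = {pool.submit(ssd_stash.write, s) for s in shards}
    hdd_acks = ssd_acks = 0
    ok = False
    for fut in as_completed(hdd_futs | ssd_futs):
        if not fut.result():        # a failed write simply doesn't count
            continue
        hdd_acks += fut in hdd_futs
        ssd_acks += fut in ssd_futs
        if hdd_acks >= QUORUM and ssd_acks >= QUORUM:
            ok = True               # safe to respond; SSD copies are purged later
            break
    pool.shutdown(wait=False)       # straggler writes finish in the background
    return ok
```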

Backblaze tested the speed increase. “Over a 12-day period following the shard stash deployment … the average time to upload a 256 KB file was 118 ms, while a 1 MB file clocked in at 137 ms … For comparison, we ran the same test against Amazon S3’s US East (Northern Virginia) region, aka us-east-1, from the same machine in New Jersey. On average, uploading a 256 KB file to S3 took 157 ms, with a 1 MB file taking 153 ms.”

In summary: “We benchmarked the new, improved Backblaze B2 as 30 percent faster than S3 for 256 KB files and 10 percent faster than S3 for 1 MB files.”

Veeam backups were even faster, we’re told: “These low-level tests were confirmed when we timed Veeam Backup & Replication software backing up 1TB of virtual machines with 256k block sizes. Backing the server up to Amazon S3 took three hours and 12 minutes; we measured the same backup to Backblaze B2 at just two hours and 15 minutes, 40 percent faster than S3.”

The increased speeds benefit all Backblaze B2 Cloud Storage customers, especially those who rely on 1 MB or less small file uploads. Files of 1 MB or less make up about 70 percent of all uploads to B2 Cloud Storage and are common for backup and archive workflows. Many data protection software providers split data into smaller, fixed-size blocks for upload to cloud storage, meaning users can expect to see significantly faster upload speeds for smaller files without any change to durability, availability, or pricing. 

The shard stash approach has been fully rolled out to Backblaze’s global data regions. All Backblaze B2 customers will, it promised, enjoy faster uploads and downloads, no matter their storage workload. Additional B2 Cloud Storage download performance enhancements are planned over the coming months.

We have asked AWS to comment.

Nebulon’s Medusa2 DPU casts snake eyes at enterprise server fleet customers

Nebulon has ported its niche server-management card functionality to an OEM Nvidia BlueField-3-based data processing unit (DPU), becoming a fully fledged DPU supplier that offloads security, networking, storage, and management functions from server host CPUs.

In its first iteration, Medusa1, Nebulon’s SPU (Storage Processing Unit) is a PCIe 3-attached, cloud-managed server card, sporting an Arm processor and software that provides infrastructure management for the host server – looking after its security with a secure enclave and virtualizing locally attached storage drives into a SAN. The SPU is managed by Nebulon ON, a SaaS offering with four aspects: smartEdge to manage edge sites as a fleet; smartIaaS; smartCore to turn a VMware infrastructure into a hyperscale private cloud; and smartDefense to protect against ransomware. Medusa2 goes much further – adding general security, networking, and storage functions.

Nebulon CEO Siamak Nazari explained in a statement: “Enterprises have been demanding a hybrid cloud operating model that more closely resembles the hyperscalers, but have been faced with subpar options at best. With our new Medusa2 SPU, we help our customers get one step closer to achieving this goal, and can deliver it in a secure, unified package.”

The claim is that, with Medusa2, this is the first time enterprises and service providers can unify enterprise data services, cyber and network services, and server lights-out management integration, all on a single DPU. It is said to eliminate the server CPU overhead and cyber security risk associated with hyperconverged infrastructure (HCI) software, effectively freeing up more processor cores and equating to a 25 percent reduction in software licensing costs, datacenter space and power consumption.

The Medusa2 card is based on BlueField-3 technology but is not an off-the-shelf BlueField card. It is a system-on-chip (SoC) device with 400 Gbit/sec bandwidth, 48 GB of DDR5 memory, and 8 lanes of PCIe 5 connectivity. COO Craig Nunes told us: “We can connect into 16 internal NVMe, SAS or SATA drives.”

The card also has a hardware root-of-trust feature. Nunes said: “There’s actually a chip on the card that stores certificates and keys so that we can make sure that only trusted, authenticated individuals can get access.”

Storage, network, and cyber services, along with server management, are completely offloaded from the host server. The host system does not require any additional drivers or software agents to be installed, as the Medusa2 card is host OS and application-agnostic. The vSphere hypervisor is not running on the card.

The Nebulon ON cloud-based control plane provides fleet-wide monitoring, global firmware updates, zero-trust authentication and an invisible quorum witness for 2-node high-availability systems.

Nunes said: “We support all the VMware integrations that you’re familiar with: Windows Server and Hyper-V integrations around clustering. We run a CSI driver for Kubernetes, for your Linux environments, doing OpenShift.” This means, he said, that it is a plug-and-play card for enterprises.

Competition

The whole point of using DPUs is to get extra server application performance by offloading low-level security, networking, and storage functions to a DPU card. The cost of the card has to be weighed against the extra application run time performance. If 75 servers with DPUs fitted can do the work of 100 with no DPUs, then a customer could retire 25 redundant servers, saving on power and software costs – particularly with core-based software licensing. Or they could have 25 percent more application processing, which could generate more business. Nebulon claims that fitting its Medusa2 DPU to a server enables it to support up to 33 percent more application workloads.
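A toy model shows how that arithmetic plays out. Every figure below is a placeholder assumption, not vendor pricing:

```python
def fleet_cost(servers: int, server_cost: float, dpu_cost: float,
               per_core_license: float, cores_per_server: int) -> float:
    # Total fleet cost: hardware plus core-based software licensing
    return servers * (server_cost + dpu_cost + per_core_license * cores_per_server)

baseline = fleet_cost(100, 15_000, 0, 1_000, 32)      # 100 servers, no DPUs
with_dpus = fleet_cost(75, 15_000, 2_500, 1_000, 32)  # 75 DPU-equipped servers
print(f"baseline ${baseline:,.0f} vs DPU fleet ${with_dpus:,.0f}")
# baseline $4,700,000 vs DPU fleet $3,712,500 under these assumptions
```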

DPU benefits become more visible as server fleet numbers grow – which is why the CSP hyperscalers, with tens of thousands of servers, are prominent DPU users. Other hyperscalers will be prominent prospects for DPU sellers, and there are effectively just three merchant sellers courting them: AMD, Intel, and Nebulon.

Intel’s Mount Evans IPU (Infrastructure Processing Unit, which is Intel’s preferred term for a DPU) has 16 Arm cores, like BlueField-3. This is shipping to Google Cloud and other customers.

Intel also has a Xeon D-based Oak Springs Canyon IPU with a PCIe 4 interconnect.

AMD bought the Pensando DPU startup for $1.9 billion in August 2022 and is selling DSC-200 Pensando-based technology for software-defined networking, and general DPU use. 

Microsoft bought DPU startup Fungible in January this year, indicating that its Azure cloud will be using Fungible DPU tech in its hyperscale datacenters. With AWS having its in-house Nitro DPU technology and Google Cloud using Intel DPU technology, that means the three big CSPs are pretty much closed off from Nebulon.

Nebulon has existing relationships with Dell, HPE, Lenovo and Supermicro through which it supplies its Medusa1 SPUs to customers – “dozens of customers, large and small,” according to Nunes. It will be hoping to use these channels for its Medusa2 card, and possibly rely on its lights-out server management technology to make the difference between itself, AMD and Intel. It also has relationships with global distributor TD SYNNEX and Unicom Engineering for system integration activities.

Graid GPU-powered RAID card outperforms Xinnor software

Graid reckons its GPU-powered RAID card has delivered better performance than Xinnor software RAID running in the same server, a claim that is being contested.

RAID requires a set of parity calculations on data to be written to or read from storage drives. The parity bits are added to stripes of the data as they are written across drives and enable the data to be reconstructed if a storage drive is lost. There are various RAID schemes delivering different levels of protection, with RAID 5 providing protection against drive failure in configurations with three or more drives. 
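The core of RAID 5 is XOR parity: the parity block is the XOR of the data blocks in a stripe, so any single lost block can be rebuilt from the survivors. A minimal Python illustration:

```python
from functools import reduce

def parity(blocks: list[bytes]) -> bytes:
    # XOR the blocks together byte by byte
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

stripe = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]  # data blocks on three drives
p = parity(stripe)                                # parity block on a fourth drive

# Drive 2 fails: XOR the surviving data blocks with the parity to rebuild it
rebuilt = parity([stripe[0], stripe[2], p])
assert rebuilt == stripe[1]
```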

Software RAID, such as Xinnor’s xiRAID, uses the host CPU to carry out the striping and parity calculations. Hardware RAID cards have dedicated processing to do it instead, offloading the host CPU. Graid says its SupremeRAID cards use multiple GPU cores to parallelize the parity calculations and perform them faster than ASIC or FPGA-powered cards.

It has run a fio synthetic benchmark, audited by Dennis Martin Consulting, on a Supermicro SYS-220U-TNR server featuring dual 2 GHz, 32-core Xeon Gold CPUs and 22 x Intel P5510 3.84 TB NVMe SSDs on a PCIe gen 4 interconnect. The benchmark tested RAID 5 random and sequential read and write IO, measuring IOPS, CPU utilization, and a CPU effectiveness measure, which relates the amount of IO performed to the amount of CPU consumed, expressed as IO per 1 percent of CPU.
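As we read it, the effectiveness metric is simply IO divided by CPU utilization. A trivial sketch with placeholder numbers:

```python
def cpu_effectiveness(iops: float, cpu_util_percent: float) -> float:
    # IO performed per 1 percent of CPU consumed
    return iops / cpu_util_percent

print(cpu_effectiveness(iops=1_000_000, cpu_util_percent=25.0))  # 40,000.0
```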

The benchmark runs compared Xinnor’s xiRAID v4.0.1 and Graid’s SupremeRAID v1.5, which was generally faster and used less CPU in all the test scenarios.

Xinnor said in November last year that its software RAID is so efficient that it can outperform RAID cards, including SupremeRAID.

At the time, Sergey Platonov, Xinnor VP for Product Strategy, told us: “While Graid is indeed very fast, it comes with a slot tax, taking up PCIe lanes that are usually at a premium in NVMe-heavy setups.”

We asked Xinnor to comment on this latest Graid benchmark, and a spokesperson said: “First of all, we respect Graid as any other competitor. Their approach is different than ours, as we achieve high performance without requiring extra hardware, while Graid relies on a GPU. We believe there can be applicability for both approaches, depending on the application and system limitations.

“This said, on the report, we identified several inaccuracies, demonstrating poor knowledge of our technology. Concerning the performance and CPU load results, we have different numbers from independent tests run by several customers and from our own internal testing. This is not a big surprise, as setting up a proper benchmark is not simple.

“That’s why we released on our blog page a guide to explain step-by-step how to run dependable and repeatable benchmarks. We welcome customers to use this guide to run their own test and validate which solution fits them best.”

Download the Graid benchmark report here.

YMTC first to ship 232-layer QLC NAND despite sanctions

China’s YMTC is reportedly shipping a 232-layer 3D NAND chip in QLC (4 bits/cell) format ahead of all other NAND suppliers in spite of US technology sanctions.

Canadian chip analyst TechInsights discovered the YMTC die in a 1 TB ZhiTai Ti600 SSD. ZhiTai is a consumer SSD brand of YMTC. TechInsights, headquartered in Ottawa, uses reverse engineering and scanning tech to understand new technologies.

The ZhiTai Ti600 was introduced in June and is built on the Xtacking 3.0 architecture. It has up to 2 TB of QLC flash in an M.2 format, an NVMe 2.0 and PCIe gen 4 interconnect, and provides up to 7 GBps read speed and 6 GBps write speed, with an 800 TBW endurance over a five-year warranty. It is priced at $750 on the HKTV mall website.

YMTC ZhiTai Ti600 2TB SSD
ZhiTai Ti600 2TB SSD

Xtacking 3.0 is a YMTC NAND architecture in which the 3D NAND die is bonded to a separately fabricated CMOS peripheral circuit logic chip. YMTC revealed an X3-9070 TLC (3 bits/cell) NAND chip using 232-layer NAND at the August 2022 Flash Memory Summit.

US technology export restrictions are supposed to stop US suppliers shipping technology to China that would enable companies like YMTC to build 3D NAND with 128 or more layers. They appear to have failed.

TechInsights has revealed a scanned image of the YMTC 232-layer QLC die:

YMTC 232-layer die
This die has four planes, according to the annotation

It says this is the first QLC 3D NAND die with more than 200 active word lines that it has seen. The die’s bit density is 19.8 Gb/mm2, the highest density it has seen in a commercially available SSD.

Kioxia/WD (218L), Samsung (236L), SK hynix (238L), and Micron (232L) are all working on their own 200-plus layer technologies. The Kioxia/WD effort uses separately fabricated control logic and NAND dies bonded together, like YMTC’s Xtacking technology.

YMTC ZhiTai Ti600 2TB SSD box

A Reddit thread revealed a late 2022 TechInsights slide deck on YMTC’s 232L technology. This analyzed a HikSemi CC700 2TB SSD launched in October 2022 that uses TLC 232-layer YMTC NAND. It says there are eight separate NAND dies in the product and an annotated image shows its different size from the QLC die:

This die has six planes, according to the annotation

Now YMTC has upped the cell bit count to 4. TechInsights said: “Like the innovation revealed by TechInsights in the Huawei Mate 60 Pro’s HiSilicon Kirin 9000s processor (which used SMIC 7nm (N+2) process), evidence is mounting that China’s momentum to overcome trade restrictions and build its own domestic semiconductor supply chain is more successful than expected.”

If TechInsights is right, YMTC is still a force to be reckoned with in 3D NAND fabrication. Research house TrendForce suggested last December that YMTC could exit the NAND market because of the US restrictions. That seems quite unlikely now, and if YMTC can build enough QLC 232-layer chips, US NAND suppliers could have difficulties in the Chinese market.

The Financial Times reports that YMTC is seeking billions of dollars in extra funding because of the cost involved in circumventing US technology export restrictions.

Snowflake unleashes Snowday announcement blizzard


Snowflake has announced improved support for external data, app development, generative AI models, data governance, and cost management directly in its data warehouse cloud at the Snowday 2023 gabfest.

The data warehouse darling is building an all-in-one data cloud service that customers can use as an abstraction layer over public cloud services, data lakes, and lakehouses, with app development, cost management, and governance capabilities. Many of the new items are due in the near future, but the intent that users need never leave the Snowflake environment is clear.

“The rise of generative AI has made organizations’ most valuable asset, their data, even more indispensable. [Snowflake] is making it easier for developers to put that data to work so they can build powerful end-to-end machine learning models and full-stack apps natively in the Data Cloud,” said Prasanna Krishnan, Snowflake senior director of product management.

Christian Kleinerman, SVP of Product, said: “Snowflake is making it easier for users to put all of their data to work, without data silos or trade-offs, so they can create powerful AI models and apps that transform their industries.”

The service/function names to look out for are Iceberg, Snowpark, Horizon, Cortex, Snowflake Native App Framework, Snowflake Notebooks, Powered by Snowflake Funding, and Cost Management Interface. Grab a favorite beverage and we’ll start with Iceberg.

Iceberg

Snowflake has a coming public preview of its work to integrate external Apache Iceberg format tables with its SQL-based data warehouse, and allow access to Iceberg data from other data engines. It says Iceberg tables enable support of data architectures such as data lakes, data lakehouse, and data mesh, as well as the data warehouse.

Iceberg Tables are a single table type that brings the management and performance of Snowflake to data stored externally in an open format. They make it easier and cheaper to onboard data without requiring upfront ingestion. Iceberg Tables can be configured to use either Snowflake or an external service like AWS Glue as the table’s catalog to track metadata, with a one-line SQL command to convert to Snowflake in a metadata-only operation. Apache Spark can use Snowflake’s Iceberg Catalog SDK to read Iceberg Tables without requiring any Snowflake compute resources.
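For Spark users, that read path follows Iceberg’s standard catalog plumbing. A hedged PySpark sketch: the catalog implementation class is the Snowflake catalog published in the Apache Iceberg project, while the catalog name, account URI, and table identifiers below are placeholder assumptions, and the Iceberg Spark runtime and Snowflake catalog jars must be on the classpath.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Register an Iceberg catalog backed by Snowflake's catalog implementation
    .config("spark.sql.catalog.snow", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.snow.catalog-impl",
            "org.apache.iceberg.snowflake.SnowflakeCatalog")
    .config("spark.sql.catalog.snow.uri",
            "jdbc:snowflake://<account>.snowflakecomputing.com")  # placeholder
    .getOrCreate()
)

# Reads go straight to the Iceberg table's files; no Snowflake compute is used
spark.table("snow.my_db.my_schema.my_iceberg_table").show()
```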

Snowflake says it has expanded support for semi-structured data with the ability to infer the schema of JSON and CSV files (generally available soon) in a data lake. It’s also adding support for table schema evolution (generally available soon).

Snowpark

Snowpark is a facility for the deployment and processing of non-SQL code. More than 35 percent of Snowflake’s customers use Snowpark on a weekly basis (as of September 2023), and it is also being employed by Python developers for complex ML model development and deployment. New functionality includes: 

  • Snowpark Container Services (public preview soon in select AWS regions): This enables developers to deploy, manage, and scale containerized workloads within Snowflake’s fully managed infrastructure. Developers can run any component of their application – whether ML training, a ReactJS front end, a large language model, or an API – without needing to move data or manage complex container-based infrastructure. Snowpark Container Services provides an integrated image registry, elastic compute infrastructure, and RBAC-enabled, fully managed Kubernetes-based clusters with Snowflake’s networking and security controls.
  • Snowflake Notebooks (private preview): A development interface that offers an interactive, cell-based programming environment for Python and SQL users to explore, process, and experiment with data in Snowpark. Notebooks allow developers to write and execute code, train and deploy models using Snowpark ML, visualize results with Streamlit chart elements, and more within Snowflake’s unified platform. 
  • Snowpark ML Modeling API (general availability soon): This enables developers and data scientists to scale out feature engineering and simplify model training for faster and better model development. They can implement popular AI and ML frameworks natively on data in Snowflake without having to create stored procedures. 
  • Snowpark Model Registry (public preview soon): This builds on a native Snowflake model entity and enables the scalable deployment and management of models in Snowflake, including expanded support for deep learning models and open source large language models (LLMs) from Hugging Face. It also provides developers with an integrated Snowflake Feature Store (private preview) that creates, stores, manages, and serves ML features for model training and inference. 

For use cases involving files like PDF documents, images, videos, and audio files, you can also now use Snowpark for Python and Scala (generally available) to dynamically process any type of file.

Snowflake Native App Framework

The Snowflake Native App Framework (general availability soon on AWS, public preview soon on Azure) provides customers with the building blocks for app development, including distribution, operation, and monetization within Snowflake’s platform. That is, they can write their own apps to process Snowflake data and sell them to other customers through the Snowflake Marketplace.

Such developers can use Snowflake’s new Database Change Management (private preview soon) to code declaratively and templatize their work to manage Snowflake objects across multiple environments. The Database Change Management features serve as a single source of truth for object creation across various environments, using the common “configuration as code” pattern in DevOps to automatically provision and update Snowflake objects.

Snowflake also announced the private preview of the Snowflake Native SDK for Connectors, which provides core library support, templates and documentation.

With Snowpark Container Services as part of Snowflake Native Apps (integration in private preview), developers can bring in existing containerized workloads, for an accelerated development cycle, or write app code in the language of their choice and package it as a container.

Horizon

With more data types and tables supported, Snowflake is expanding its Horizon governance capability, which looks after compliance, security, privacy, interoperability, and access capabilities in its Data Cloud. Horizon is getting:

  • Additional Authorizations and Certifications: Snowflake recently achieved compliance for the UK’s Cyber Essentials Plus (CE+), the FBI’s Criminal Justice Information Services (CJIS) Security Policy, the IRS’s Publication 1075 Tax Information Security Guidelines, and assessments by the Korea Financial Security Institute (K-FSI), as well as StateRAMP High and US Department of Defense Impact Level 4 (DoD IL4) Provisional Authorization on AWS GovCloud. 
  • Data Quality Monitoring (private preview): This is used by customers to measure and record data quality metrics for reporting, alerting, and debugging. Snowflake is unveiling both out-of-the-box and custom metric capabilities for users. 
  • Data Lineage UI (private preview): The Data Lineage UI gives customers a bird’s eye visualization of the upstream and downstream lineage of objects. Customers can see how downstream objects may be impacted by modifications that happen upstream.
  • Differential Privacy Policies (in development): Customers can protect sensitive data by ensuring that the output of any one query does not contain information that can be used to draw conclusions about any individual record in the underlying data set.
  • Enhanced Classification of Data: Custom Classifiers (private preview), international classification (generally available), and Snowflake’s new UI-based classification workflow (public preview) allow users to define what sensitive data means to their organization and identify it across their data estate.
  • Trust Center (private preview soon): This centralizes cross-cloud security and compliance monitoring to reduce security monitoring costs, resulting in lower total cost of ownership (TCO) and the prevention of account risk escalations. Customers will be able to discover security and compliance risks based on industry best practices, with recommendations to resolve and prevent violations.

Cortex

Snowflake says Cortex (available in private preview) is an intelligent, fully managed service that offers access to AI models, LLMs, and vector search functionality to enable organizations to quickly analyze data and build AI applications. It gives users access to a growing set of serverless functions that enable inference on generative LLMs such as Meta’s Llama 2 model, task-specific models to accelerate analytics, and advanced vector search functionality.

Snowflake Cortex is also the underlying service that enables LLM-powered experiences that have a full-fledged user interface. These include Document AI (in private preview), Snowflake Copilot (in private preview), and Universal Search (in private preview).

Cortex includes a set of general-purpose functions that use both open source and proprietary LLMs to help with prompt engineering and support a broad range of use cases. Initial models include:

  • Complete (in private preview): Users can pass a prompt and select the LLM they want to use. For the private preview, users will be able to choose between the three model sizes (7B, 13B, and 70B) of Llama 2.
  • Text to SQL (in private preview): This generates SQL from natural language using the same Snowflake LLM that powers the Snowflake Copilot experience.

These functions include vector embedding and semantic search functionality so users can contextualize the model responses with their data to create customized apps in minutes. This includes:

  • Embed Text (in private preview soon): Transforms a given text input to vector embeddings using a user selected embedding model. 
  • Vector Distance (in private preview soon): To calculate distance between vectors, developers will have different functions to choose from: cosine similarity – vector_cosine_distance(), L2 norm – vector_l2_distance(), and inner product – vector_inner_product(). See the sketch after this list.
  • Native Vector Data Type (in private preview soon): To enable these functions to run against your data, vector is now a natively supported data type in Snowflake in addition to all the other natively supported data types.  
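For reference, here is a minimal NumPy sketch of the math behind the three named distance functions; the function bodies are our illustration of the standard definitions, not Snowflake’s implementation:

```python
import numpy as np

def vector_cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine distance = 1 - cosine similarity
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def vector_l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Euclidean (L2 norm) distance between two embeddings
    return float(np.linalg.norm(a - b))

def vector_inner_product(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product; for unit-normalized vectors, larger means more similar
    return float(np.dot(a, b))

q = np.array([0.1, 0.9, 0.4])  # query embedding (toy values)
d = np.array([0.2, 0.8, 0.5])  # document embedding (toy values)
print(vector_cosine_distance(q, d), vector_l2_distance(q, d), vector_inner_product(q, d))
```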
Snowflake Notebook Llama 2

Streamlit in Snowflake (public preview): With Streamlit, developer teams can accelerate the creation of LLM apps, with the ability to develop interfaces in just a few lines of Python code and no front-end experience required, we’re told. These apps can then be deployed and shared across an organization via unique URLs that use existing role-based access controls in Snowflake, and can be generated with just a single click. Learn more here.

Snowflake GenAI/LLM capabilities

Powered by Snowflake Funding

The Powered by Snowflake Funding Program intends to invest up to $100 million toward the next generation of startups building Snowflake Native Apps. It features venture capital firms Altimeter, Amplify Partners, Anthos, Coatue, ICONIQ Growth, IVP, Madrona, Menlo Ventures, and Redpoint Ventures. AWS is helping by providing $1 million in free Snowflake credits on AWS over four years to startups building Snowflake Native Apps.

Snowflake says apps that are Powered by Snowflake benefit from the speed, scale, and performance of Snowflake’s platform for accelerated time to market, improved operational efficiency, and a more seamless customer experience. With the Snowflake Native App Framework, developers can build an app, then market, monetize, and distribute it to customers across the Data Cloud ecosystem via Snowflake Marketplace, all from within Snowflake’s platform.

Stefan Williams, VP Corporate Development and Snowflake Ventures, said in a statement: “A new way to deploy enterprise applications is emerging as companies look to bring their apps and application code closer to their data.” In other words, startups can build apps directly in the Snowflake data cloud and compete for VC funding and AWS credits.

Williams added: “Innovative enhancements in AI enabled through Snowpark, the Snowflake Native App Framework, and Snowflake’s … data privacy, security, and governance make it easier than ever for startups to build, deploy, and monetize enterprise apps. With our venture capital partners and AWS, the Powered by Snowflake Funding Program will accelerate this new era of software development.”

Potential developers can register their interest here.

Cost Management Interface

Customer admins will be able to manage and optimize their Snowflake spend with this interface, getting visibility into account-level usage and spend metrics. The Cost Management Interface in Snowsight has an Account overview that provides a look at account-level consumption and spend, including dollars and credits spent over a specified time period, average daily spend, top warehouses by cost, and the most expensive queries. The trend in the “credits per 100 jobs” graphic below further highlights how the effective value of a Snowflake credit changes over time.

They will be able to optimize their resource allocation on Snowflake through recommendations (private preview soon).  

Snowflake has also introduced Budgets in public preview on AWS. Budgets allow admins to set spend controls at an account level or drill down to more granular levels for a fixed calendar month that resets on the first day of the month. Both Budgets and Resource Monitors will be accessible via the Cost Management Interface, allowing admins to control Snowflake spend from one place for a better user experience.

The Snowflake data warehouse has coming support for ASOF joins (in private preview soon), which will enable data analysts to write simpler queries that combine time series data. It’s improving support for advanced analytics by increasing the file size limit for loading large objects (up to 128 MB in size). This will be in private preview soon.

Snowflake cost management

Snowflake is flooding customers with news about wider access to data, better app development facilities, and generative AI model support. The company doesn’t want to lose customers to Databricks, Dremio, or any other upstart in mass data organizing, analyzing, and model training. When CEO Frank Slootman says Amp It Up, he really does mean what he says.

To find out more, explore a plethora of Snowflake blogs here.

IBM unveils Storage Scale System 6000 for AI workloads

IBM has launched a new Storage Scale appliance with twice the throughput of the current high-end ESS 3500.

Denis Kennelly, IBM
Denis Kennelly

Storage Scale is IBM’s established parallel file system software, previously called GPFS and popular in HPC and allied scenarios. The ESS appliances are integrated scale-up/scale-out – to 1,000 appliances – storage systems with Storage Scale installed. The current ESS 3500 is built from 24-slot 2U nodes with dual active:active controllers, NVMe SSDs or IBM’s proprietary Flash Core Modules (FCMs), and 100 Gbit Ethernet or 200 Gbit HDR InfiniBand ports with a maximum 126 GBps throughput per node. The rapid increase in AI workloads has prompted IBM to boost the box, as it were.

Denis Kennelly, IBM Storage general manager, said in a statement: “IBM Storage Scale System 6000 … brings together data from core, edge, and cloud into a single platform with optimized performance for GPU workloads.

“The potential of today’s new era of AI can only be fully realized, in my opinion, if organizations have a strategy to unify data from multiple sources in near real-time without creating numerous copies of data and going through constant iterations of data ingest.”

He said he sees Storage Scale and the 6000 as the top end of an information supply chain that collects data from multiple sources, consolidates workloads, and pumps data out to GPU servers for AI training and inferencing.

The SSS (Storage Scale System) 6000 datasheet helped provide the information in this table:

IBM Storage Scale specs

IBM has increased the chassis size from the 24-slot 2RU format used in the prior ESS appliances to a 48-slot 4RU design. It has also doubled the raw NAND drive maximum capacity to 30 TB and will later add inline compressing FCMs with either 38 TB or 76 TB effective capacity at a 2:1 compression ratio. The 6000 is based on the PCIe gen 5 bus, twice as fast as the ESS 3500’s PCIe gen 4 interconnect.

It also supports 400 Gb InfiniBand vs the 3500’s 200 Gb InfiniBand links, and 200 GbitE vs the 3500’s 100 GbitE. Put the drive capacity and link speed increases together with newer CPUs and more controller DRAM, and the maximum per-chassis data transfer rate shoots up from ESS 3500’s 126 GBps to 256 GBps. 

IBM says the 6000 has an NVMe turbo tier for extra small file transfer speed. As is the case with ESS 3500, Storage Scale in the 6000 supports Nvidia’s GPUDirect CPU-bypassing storage protocol for fast data delivery to Nvidia GPUs.

The 6000 is 2.5x faster than market-leading competitors, IBM claims without naming them. Existing GPUDirect-support suppliers include Dell (PowerScale), DDN, Huawei, NetApp, Pure Storage, and WekaIO.

IBM GPUDirect performance

We think that GPUDirect testing should show the SSS 6000 is the fastest GPUDirect data supplying system of all on a per-node basis but still not as fast as Huawei’s A310 on a per RU basis – an estimated 64 GBps read bandwidth vs the A310’s 80 GBps.

IBM GPUDirect performance
Existing GPUDirect performance per rack unit, before SSS 6000 launch

The 6000 will get new NVMe FCMs based on QLC (4bits/cell) NAND in the first half of 2024. These will have 70 percent lower cost, we’re told, and use 53 percent less energy per TB than the current top capacity 15.36 TB flash drives for the ESS 3500. This will enable the 6000 to support 2.5x the amount of data in the same floor space as the ESS 3500. This is based on a configuration using the 38 TB FCM drives with up to 2:1 inline compression coming in 1H 2024 with 900 TB/rack unit of floor space vs the previous generation 2RU Scale System 3500 using 30 TB flash drives with 360 TB/rack unit of floor space.

As well as putting out its new box, IBM said that the Storage Scale 6000 can be a collection point for data from multiple sources, a global data platform, as a diagram indicates:

This includes collecting data from external arrays such as Dell PowerScale, NetApp, and Pure Storage, and the cloud, and also dispersing data to tape systems or the cloud for longer-term storage.

For more information, read David Wohlford’s blog and check out an SSS 6000 datasheet here. Wohlford is a worldwide Senior Product Marketing Manager for AI and Cloud Scale in IBM Storage.

Commvault dives into cyber-resilience

Data protection biz Commvault is morphing into a cyber-resilience supplier following a second consecutive growth quarter helped by its Metallic SaaS product and 500 additional customers.

Revenue for Commvault’s calendar Q3, ended September 30, was up 7 percent year-on-year to $202 million, beating guidance. Net profit of $13 million was almost three times higher than a year ago. William Blair analyst Jason Ader told his subscribers: “Commvault reported a solid top and bottom-line beat in its fiscal second quarter and provided a generally upbeat outlook for the remainder of the year.”

President and CEO Sanjay Mirchandani said in a statement: “Our Q2 total revenue growth accelerated, driven by our hyper-growth SaaS platform, and we delivered robust operating margin leverage. Next week, at Commvault SHIFT, we’ll unveil our cyber resilience platform, combining our leading data protection capabilities with comprehensive new security and AI-powered innovations that are critical for customers in an era of escalating cyberattacks.”

Commvault revenues
The Q2 revenue line (green) is showing a steady acceleration, and Commvault has returned to sequential growth after 3 quarters of declining or flat sales

Financial summary

  • Total annual recurring revenue (ARR): $711 million, up 18 percent annually
  • Subscription revenue: $97.8 million, an increase of 25 percent
  • Perpetual license sales: $14.39 million; down 27.5 percent
  • Subscription ARR: $530 million, up 32 percent
  • Metallic ARR: $131 million, up 77 percent
  • Customer support revenues: $77 million, down 1 percent
  • Operating cash flow: $40.3 million
  • Free cash flow: $40.1 million

Ader noted: “The good news is that within the customer support revenue line, the mix continues to shift toward term and away from perpetual, with management expecting the crossover point to come next fiscal year. The result will be that customer support will be less of a headwind to revenue growth in future periods.”

The subscription customer count is now 7,800, more than 50 percent of Commvault’s customer base – up 500 sequentially and 2,200 annually. More than 60 percent of Metallic customers are new to Commvault and only about 25 percent of Metallic users are customers for Commvault’s enterprise protection products, we’re told.

US revenues grew 3.5 percent year-over-year while international revenues rose 12.3 percent, with the EMEA region seeing more and smaller deals than in the States.

Sanjay Mirchandani, Commvault
Sanjay Mirchandani

Mirchandani said total ARR is “the primary metric we use to measure underlying growth.” He prepared the ground for a shift into cyber-resilience, claiming: “We are going to introduce a radically new approach that empowers customers to stand up to today’s non-stop and escalating cyber threats. We are bringing together what we’re known for – best in class data protection – and combining it with exceptional data security, recovery, and AI-driven data intelligence.”

This will be helped, we’re told, by new ecosystem partnerships. Commvault set up a Cyber Resilience Council yesterday, chaired by Melissa Hathaway. She’ll speak at the SHIFT event on November 9.

The earnings call revealed that Commvault is seeing increased caution in large deals, however.

The outlook for the next quarter is $208 million +/- $2 million, meaning a 6.6 percent annual increase at the mid-point. Commvault has upgraded its full fiscal 2024 revenue expectation from an $805-815 million range to $812-822 million.

Storage news ticker – 31 October

Dave Kushner

Cohesity has appointed Dave Kushner as its VP for Federal Sales. He was VP Federal Sales at Enveil and an SVP of Sales at ViON Corp before that. He replaces Kevin Davis who has exited Cohesity to become President, US Public Sector, at Open Text. 

Data protector Commvault has set up a Cyber Resilience Council, chaired by Melissa Hathaway, who is also a strategic advisor to Commvault. We’re told she is the President of Hathaway Global Strategies and brings a wealth of experience to Commvault across policy, technology, and the boardroom, gained from serving in the Barack Obama and George W. Bush presidential administrations and working with NATO and the World Bank. The council will advise on emerging security trends and cyber threats, as well as highlight best practices in cyber resilience.

Cohesity is playing the high-level council game as well, with its CEO Advisory Council. Blocks & Files has a special one-time only offer to be a strategic council member for any storage company willing to make a regular small financial transfer into its post-work fund.

Dell has released PowerStoreOS v3.6 with three new features: enhanced cyber resiliency with Metro Witness; NVMe/TCP for vVols to boost VMware performance; and data-in-place upgrades for first generation appliances. The release of PowerStoreOS 3.6 is accompanied by the receipt of an ENERGY STAR certification for the 1200 model.

Data orchestrator Hammerspace has published an eBook entitled “Unstructured Data Orchestration for Dummies.” It has been jointly written by John Carucci, an entertainment producer at Associated Press, Hammerspace content marketing manager Beth Mayer, and global marketing head Molly Presley. Get it here.

Hitachi Vantara (HV) has sponsored some research, like so many other suppliers looking for ways to burnish their image amongst prospective clients. The commissioned Forrester study, “Embracing ITaaS For Adaptability and Growth“, surveyed 213 IT leaders across North America and Europe to assess the IT as a Service (ITaaS) market with a focus on subscription- and consumption-based models. 56 percent of businesses reported a significant impact on revenue due to technology downtime. 50 percent of organizations face a high total cost of ownership (TCO) or technical debt associated with critical applications. 45 percent of businesses have difficulties navigating complex cloud landscapes. 55 percent of enterprises are struggling to derive meaningful insights from their data. Download the report here.

We asked Nasuni if its customers have simultaneous SMB, NFS, and S3 access to the same data set in a global filesystem. The answer is yes. It is possible to use all of those interfaces to get simultaneous access in the Nasuni global file system.

Although Kioxia and Western Digital have closed their merger talks, consultancy and research house TrendForce believes mergers and acquisitions in the SSD industry are an inevitable trend. It says global NAND flash demand growth has declined from approximately 30 percent before 2020 to around 20 percent in recent years. Furthermore, TrendForce’s data reveals that in 2023, all NAND flash suppliers have experienced their most significant operating losses since 2014. Given these challenges, some NAND flash suppliers may be compelled to explore strategies to sustain their competitiveness in a changing landscape.

We asked VAST Data if it supports HPE Cray’s Slingshot interconnect. We’re told: “The short answer to your question is no, but this amongst other things, are all on the roadmap.” VAST Data’s Phil Manaz, global alliance lead for HPE, tells us: “Both VAST and HPE are committed to delivering significant value to our joint customers around their AI initiatives with HPE GreenLake for File. These first six months of this strategic partnership have been incredibly promising, and there’s so much more to come. We’re actively co-engineering and driving deep hardware and software innovation to integrate VAST across a number of HPE technologies in order to support a multitude of AI and HPC workloads. VAST and HPE are aggressively executing against our roadmap to deliver these – and you’ll see updates and announcements as the partnership and offering continue to mature and develop beyond year one.”

Western Digital sees shoots of growth after HDD winter

Following another low revenue quarter at Seagate, Western Digital disk sales were also depressed but flash bucked the downcycle trend, and WD thinks the disk market is bottoming out.

Western Digital revenue for Q1 of its fiscal 2024 ended September 29 was $2.75 billion, down 26 percent year-on-year, with a net loss of $685 million compared to a $27 million net profit a year earlier.

CEO David Goeckeler said: “Western Digital’s fiscal first quarter results exceeded our expectations [with] sequential margin improvement across flash and HDD businesses. Our Consumer and Client end markets continue to perform well and we now expect our Cloud end market to grow going forward.”

Looking at the revenues by media type, disk revenues were $1.19 billion, down 30.8 percent annually, while flash revenues came in at $1.56 billion, up 9.6 percent.

Cloud segment revenues have fallen so much over five consecutive quarters that they are now below Client device sales and risk sliding below Consumer device sales:

Western Digital segment splits

Financial summary

  • Gross margin: 3.6 percent vs 26.3 percent a year ago
  • Free cash flow: -$544 million
  • Cash & Cash equivalents: $2.03 billion

Cloud revenues were depressed due to lower nearline hard drive shipments to datacenter customers than last quarter and declines in both disk and SSD datacenter shipments year-over-year.

Disk business

WD shipped 10.4 million drives, 29.3 percent down year-over-year, and exabyte shipments declined 5 percent from last quarter. Cloud market buyers bought 5.3 million units, down 38.4 percent year-over-year; client buyers 2.6 million, down 23.5 percent annually; and consumer buyers 2.5 million drives, just 7.4 percent down. The Chinese market was also subdued.

The average selling price per drive was $112 vs $125 a year ago, and higher than in each of the preceding three quarters.

The 26TB UltraSMR drive accounted for nearly half of Western Digital nearline exabyte shipments. WD says it’s on track with qualification of its 28TB UltraSMR drive and has a road map into the 40TB capacity range, without transitioning to HAMR technology.

Goeckeler said: “HAMR is in development. It’s going well, and we’ll be able to fold that into our road map at the appropriate time. But for now, we’ve got a great road map … We’re leading the adoption of SMR into the cloud data center. And we have many more generations to go on our current road map, and then we’ll move to HAMR at the appropriate time when it’s mature and we can build it at scale.”

A comparison of Seagate and Western Digital HDD revenues over the past few quarters shows WD is a near-permanent second-placed player:

Seagate vs Western Digital

In the earnings call, Goeckeler said: “Industry analysts estimate the HDD addressable market to grow at approximately 12 percent compound annual growth rate to $25 billion over the next three years, with cloud representing over 90 percent of the total addressable market.” This is quite different from the Pure Storage view that no new disk drives will be sold after 2028.
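Taken at face value, that forecast implies a current HDD total addressable market of roughly $18 billion. A quick back-of-the-envelope check of the quoted figures:

```python
# Sanity check of the quoted forecast: a $25 billion HDD TAM reached in
# three years at ~12 percent CAGR implies today's market size.
target, cagr, years = 25e9, 0.12, 3
implied_base = target / (1 + cagr) ** years
print(f"Implied current HDD TAM: ${implied_base / 1e9:.1f}B")  # ~$17.8B
```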

Flash business

WD says it had record flash bit shipments, up 49 percent annually, and is ramping a set of client SSD products using 172-layer QLC (4 bits/cell) NAND for shipment in calendar 2024, with a 232-layer NAND product set to follow.

Goeckeler said: “Industry analysts forecast the flash market to grow at approximately 15 percent compounded annual growth rate over the next three years to $89 billion in calendar year 2025. We believe content increases in the consumer and client end markets, as well as explosive growth of data created in the cloud by emerging applications such as generative AI, virtual reality, and autonomous driving, are driving a faster growth in flash versus HDD.”

He pointed to the strength of the SanDisk consumer flash brand and WD’s gaming market presence (Black brand SSDs), but did not mention enterprise and cloud (hyperscaler) flash drives, a weak area for WD flash.

An analyst asked: “What is the team doing to improve its competitiveness in its enterprise and cloud SSD portfolio?”

Goeckeler responded: “Well, we like the portfolio we have. We qualified our NVMe-based enterprise SSD at multiple cloud providers. And unfortunately, we qualified right into a significant downturn in cloud consumption of enterprise SSDs. So, as that starts to come back over the next several quarters and as we go through ’24, we expect our position to improve as those vendors start consuming again. I mean, the reality is there’s just not a lot of buying in that market going on right now.”

He said the enterprise SSD market is relatively depressed and “we had qualifications at multiple hyperscalers. Those products are still active. We’re migrating them forward to future nodes, and we expect those to ramp as that market recovers.”

He noted that flash pricing is beginning to rise, a signal that the flash recession may be ending, supported by consumer and client SSD sales.

The flash business results were overshadowed by WD’s declaration that it is set to spin off its flash business next year.

Outlook

Goeckeler said: “We are now emerging from a historic storage cyclical downturn … As we progress through fiscal year ’24, we see an improving market environment in both businesses,” meaning HDD and NAND/SSD. Specifically for disk, he said: “We think this past quarter was the bottom … And we see improving demand as we move throughout the fiscal year on a quarter-over-quarter basis.”

The outlook for WD’s Q2 is revenues of $2.95 billion +/- $100 million, a 5.1 percent fall from a year ago at the mid-point – a much lower revenue decline rate than we have become used to over the past few quarters. Things are looking up.

Oxide on-prem cloud computer reinvents the server rack

Startup Oxide has delivered a rack-level system providing cloud-style computing on premises as its first commercial product.

Oxide was founded in September 2019 by datacenter heavyweights CTO Bryan Cantrill, CPO Jessie Frazelle, and CEO Steve Tuck. It has had three funding rounds to date: a $20 million seed round in 2019; a $30 million A-round in September 2022; and a $44 million continuation A-round this month that coincides with its first product launch. All the rounds were led by Eclipse Ventures.

Cantrill was CTO at Joyent and a distinguished engineer at Sun Microsystems/Oracle before that. Frazelle, who left in July 2022 to co-found KittyCAD, was a software engineer at Docker, Mesosphere, Google, Microsoft, and GitHub. Tuck was president and COO at software and services company Joyent, where he was also SVP of worldwide sales, and held sales roles at Dell before that.

Oxide’s purpose is to build what it calls a commercial cloud computer: a cloud-scale, cloud-style, rack-level, on-premises computing system.

Oxide rack

The Oxide cloud computer combines compute, storage, and networking elements as sleds in a plug-and-play rack. It includes the open source software needed to build, run, and operate a cloud-like infrastructure.

This software includes the Propolis hypervisor, the Nexus control plane, the Crucible distributed block storage system, IAM (identity and access management), and the OPTE (Oxide Packet Transformation Engine) self-service network fabric. The software is anchored to a hardware root of trust.

The distributed block storage is based on OpenZFS and has configurable capacity and IOPS per volume. Volume size can be scaled on demand. It has redundancy for high availability and can integrate with external storage across a network link. Crucible provides instantaneous, point-in-time virtual snapshots for recovery and off-rack backup.

Oxide says OpenZFS checksums and scrubs all data for early failure detection. Virtual disks constantly validate the integrity of user data, correcting failures as they are discovered. There is automated rebalancing of data to preserve redundancy in the event of drive or sled removal.
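To make the read-validate-repair idea concrete, here is a purely illustrative Python sketch (not Oxide’s code; the data structures and function are invented for the example) of a block read that verifies a stored checksum and repairs from a replica on mismatch:

```python
import hashlib

# Illustrative toy model of checksum-validated reads with repair from a
# replica, in the spirit of the OpenZFS-style integrity checks described
# above. The dict layout and function are invented for this example.
def read_block(primary: dict, replica: dict, block_id: int) -> bytes:
    data = primary["blocks"][block_id]
    expected = primary["checksums"][block_id]
    if hashlib.sha256(data).hexdigest() != expected:
        # Corruption detected: fetch the replica's copy, verify it, and
        # rewrite the primary copy so redundancy is restored.
        data = replica["blocks"][block_id]
        assert hashlib.sha256(data).hexdigest() == expected, "both copies bad"
        primary["blocks"][block_id] = data
    return data
```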

Pools of resources are available through APIs, a CLI, or a web-based UI.
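As a flavor of what API-driven provisioning can look like, here is a hedged Python sketch using the requests library against a hypothetical rack endpoint. The URL, paths, field names, and auth scheme are assumptions for illustration, not Oxide’s documented API:

```python
import requests

# Hypothetical example: provision a VM instance through a rack-level REST
# API. Endpoint, payload fields, and auth are invented for illustration;
# consult the vendor's actual API documentation for real calls.
RACK_API = "https://oxide-rack.example.internal/v1"
TOKEN = "operator-issued-api-token"

resp = requests.post(
    f"{RACK_API}/instances",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "web-01",
        "ncpus": 4,
        "memory_gib": 16,
        "disk_gib": 100,  # volume carved from the distributed block storage
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

The same provisioning could equally be driven from the CLI or the web UI, since all three front the same control plane.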

Each sled contains an AMD processor, DRAM, and NVMe SSD storage. Each sled is slid into place and needs no wiring, we’re told. There can be 16, 24, or 32 sleds in a delivered rack. A compute sled has an AMD Milan EPYC 64-core CPU, 16 x DDR4 DIMM slots providing 512 GiB or 1 TiB of memory, and up to 10 x U.2 NVMe 3.2 TB (2.91 TiB) SSDs. That gives a maximum of 32 TB of raw storage per sled, or 1,024 TB across a fully populated 32-sled rack. There is a 100 GbE link to the rack’s network switch.
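The raw capacity arithmetic from those figures works out as follows:

```python
# Quick check of the capacity figures quoted above.
drives_per_sled, drive_tb = 10, 3.2
sleds = 32
per_sled_tb = drives_per_sled * drive_tb   # 32 TB of raw flash per sled
rack_tb = sleds * per_sled_tb              # 1,024 TB per fully populated rack
print(per_sled_tb, rack_tb)
```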

There are two network switches. Each has an Intel Tofino 2 processor with 6.4 Tbps throughput and 232 x 40/100/200 GBASE QSFP-28 uplink ports and 32 x 100 GBASE-KR4 backplane ports.

The rack switch has components for three networks: an up-to-12 Tbps programmable Ethernet ASIC, a secondary GigE switch ASIC, and an FPGA driving a proprietary low-level protocol for board control of other systems in the rack. It is connected via a PCIe link to a compute node for management.

Oxide compute sled. The green items are the NVMe drives

The rear of the rack has a DC busbar and a cabled backplane with blindmated networking. This means self-aligning connectors slide or snap into position as a sled is installed in the rack. There are no power or network cables to plug or unplug.

A blog by Cantrill says: 

  • Cloud computing is the future of all computing infrastructure.
  • The computer that runs the cloud should be able to be purchased and not merely rented.
  • Building a cloud computer necessitates a rack-level approach – and the co-design of both hardware and software.

In his view, “the rental-only model for the cloud is not sustainable.” A cloud computer has to be rack-scale and “one must break out of the shackles of the 1U or 2U server, and really think about the rack as the unit of design.”

That helps explain the blindmating. “This is a domain in which we have leapfrogged the hyperscalers, who (for their own legacy reasons) don’t do it this way,” he says.

Oxide claims its rack is up to 35 percent more energy efficient than traditional server racks.

The Oxide rack ships with everything installed and can be set up in around four hours. Customers can use Kubernetes or cloud software tools like Terraform to deploy and configure workloads.

Oxide compute sled NVMe drive

Customers are said to include a US federal agency, the Idaho National Laboratory, and a financial services business. Several Fortune 1000 companies are reportedly interested.

In effect, Oxide wants to replace Dell, HPE, and Supermicro on-premises racks with its own hyperconverged infrastructure rack with built-in public cloud-style facilities. The Oxide rack uses less power and is far easier to own, operate, and run, from both a hardware and a software standpoint, than traditional server and HCI racks, Oxide says.

Download a specification sheet here.