
Hammerspace boosts GPU server performance with latest update

Data orchestrator Hammerspace has discovered how to add a GPU server’s local NVMe flash drives as a front end to external GPUDirect-accessed datasets, providing microsecond-level storage read and checkpoint write access to accelerate AI training workloads.

As an example, a Supermicro SYS-521GE-TNRT GPU server has up to 16 NVMe drive bays, which could be filled with 16 x 30 TB SSDs totaling 480 TB or even 16 x 61.44 TB drives amounting to 983 TB of capacity. Hammerspace says access to these is faster than to networked external storage, even if it is accessed over RDMA with GPUDirect. By incorporating these drives into its Global Data Environment as a Tier 0 in front of Tier 1 external storage, they can be used to send data to GPUs faster than from the external storage and also to write checkpoint data in less time than it takes to send that data to external storage.

We understand that checkpoints can run hourly and take five to ten minutes, during which time the GPUs are idling. Hammerspace Tier 0 drops the time from, say, 200 seconds to a couple of seconds.

David Flynn, Hammerspace

David Flynn, founder and CEO of Hammerspace, stated: “Tier 0 represents a monumental leap in GPU computing, empowering organizations to harness the full potential of their existing infrastructure. By unlocking stranded NVMe storage, we are not just enhancing performance – we’re redefining the possibilities of data orchestration in high-performance computing.” 

Hammerspace points out that, although Nvidia-supplied GPU servers typically include local NVMe storage, this capacity is largely unused for GPU workloads because it is siloed and doesn’t have built-in reliability and availability features. With Tier 0, Hammerspace claims it unlocks this “extremely high-performance local NVMe capacity in GPU servers.”

It is providing this Tier 0 functionality in v5.1 of its Global Data Platform software. It points out that using a GPU server’s local storage in this way reduces the capacity needed in external storage, thereby reducing cost, external rack space take-up, cooling, and electricity draw. This can save “millions in storage costs.” A figure of $40 million in savings was cited for a 1,000 GPU server installation.

Hammerspace Global Data Platform

The company has also developed a software addition to speed local storage access and contributed it to the latest Linux kernel 6.12 release. It says this Local-IO patch to standard Linux enables I/O to bypass the NFS server and network stack within the kernel, reducing latency for I/O that is local to the server. 

This development “allows use of the full performance of the direct-attached NVMe which multiple devices in aggregate can scale to 100+GB/s of bandwidth and tens of millions of IOPS while maintaining mere microseconds of latency, making Tier 0 the fastest, most efficient storage solution on the market to transform GPU computing infrastructure.”
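To see why this matters for checkpointing, here is a back-of-the-envelope sketch. The 100 GB/s aggregate figure comes from the claim above; the checkpoint size and the external-storage rate are our own illustrative assumptions, picked to reproduce the roughly 200-seconds-to-a-couple-of-seconds improvement quoted earlier.

```python
# Rough checkpoint-write timing. The 100 GB/s aggregate local NVMe figure
# comes from the article; the checkpoint size and the effective rate to
# external storage are illustrative assumptions, not Hammerspace numbers.

TB = 1e12  # bytes

checkpoint_bytes = 0.5 * TB   # hypothetical checkpoint for a large model
local_nvme_bw    = 100e9      # aggregate local NVMe bandwidth (article figure)
external_bw      = 2.5e9      # assumed effective rate to external storage

print(f"Tier 0 local write:  {checkpoint_bytes / local_nvme_bw:6.1f} s")  # ~5 s
print(f"External write:      {checkpoint_bytes / external_bw:6.1f} s")    # ~200 s
```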

Hammerspace told us: “LocalIO is way more powerful than GPUDirect. It allows the Linux OS to auto recognize it’s connecting to itself and handle the IO request zero copy – by pointer to the memory buffer.”

Altogether, Hammerspace claims this makes it possible to “unlock the full potential of local NVMe storage by making it part of a global file system that spans all storage from any vendor. Files and objects stored on local GPU server storage can now be shared with other clients and orchestrated within the Hammerspace Global Data Platform. Data can be automatically and intelligently tiered between Tier 0, Tier 1, Tier 2, and even archival storage, while remaining visible and accessible to users.”

Hammerspace says it can work its local GPU server storage and Local-IO magic in the cloud as well as on-premises. Nor is this just for GPU computing: the same approach can run in an x86 virtual machine (VM) farm to feed VMs with data faster.

v5.1 of Hammerspace’s software also includes:

  • A more modern and dynamic user interface.
  • A highly performant and scalable S3 object interface allowing users to consolidate, access, and orchestrate file and object data on a single data platform. 
  • Performance improvements for metadata, data mobility, data-in-place assimilation, and cloud-bursting.
  • New Hammerspace data policies (called Objectives) and refinements to existing Objectives, making it easier to automate data movement and data lifecycle management.

We understand that a second hyperscaler customer after Meta has adopted Hammerspace and prospects in the US government lab market are looking good.

Quantum faces revenue drop but anticipates turnaround with operational overhaul

Quantum reported subdued results for its second FY 2025 quarter but said operational improvements, a product portfolio refresh, and go-to-market enhancements would return the company to growth.

Revenues in the quarter ended September 30 were $70.5 million vs $75.7 million a year ago, down 6.9 percent, with a GAAP loss of $13.5 million compared to a $3.3 million loss a year ago. The revenue fall was largely due to lower primary storage sales, predominantly all-flash systems, and to non-recurring project spend covering restructuring, getting its SEC filings back on track, and new product introductions.

Jamie Lerner, Quantum

Quantum chairman and CEO Jamie Lerner said: “Sales bookings and customer win rates for the quarter were consistent with our overall business expectations as we continued to transform the company. However, operational headwinds with the supply chain continued this quarter, resulted in exiting the quarter with higher than anticipated backlog.”

This was approximately $14 million, above the typical $8 million to $10 million run rate. 

Lerner said in the earnings call: “We’ve been rotating our portfolio more to high-speed all-flash offers. And as you’ve been watching, high-speed all-flash systems, particularly those from Supermicro, have long lead times. So we’ve been finding that the … SSDs and high-speed servers that use SSDs just have longer lead times. And so we used to have about two to three week lead time on that type of server. Now it can be up to ten weeks.” 

Lerner said the Quantum ship was turning around: “Evidence of our transformation can be seen in the progress of gross margin improving 490 basis points sequentially to above 41 percent, as well as non-GAAP operating expenses being reduced by more than 8 percent year-over-year. These actions contributed to our achievement of breakeven adjusted EBITDA for the quarter.”

Financial summary for Q2 FY 2025:

  • Gross margin 41.5 percent, up 490 basis points
  • ARR: $146 million vs $145 million last quarter
  • Subscription ARR: $19.6 million, up 28 percent year-over-year and 5 percent sequentially
  • Cash, cash equivalents, and restricted cash at quarter end: $17 million vs $25.8 million at Sep 30, 2023.

Cash, cash equivalents, and restricted cash were $25.9 million last quarter vs $26.2 million the year before that. Interest expense increased to $6.1 million from $3.9 million a year ago, and total debt rose to $133 million from the year-ago $109.4 million.

The chart shows that there was a rising revenue trend starting in FY 2021’s first quarter, which ended ten quarters later in Q3 FY 2023. Then revenues plunged for three quarters in a row and have now stabilized in the $70 million to $71 million area for three quarters. The missing Q3 FY 2024 revenue number will probably be in the $72 million to $75 million area.

Quantum delayed filing its fiscal 2024 SEC report due to an accounting problem with its standalone pricing methods for sales in the period. Last quarter it provided Q1 FY 2024 results and this quarter the Q2 results have been revealed. We still await Q3 FY 2024 results and expect these in three months’ time, when the firm reports its Q3 FY 2025 results.

Quantum revenues

The chart above shows that Quantum’s new normal for quarterly revenues is around $71 million and it has been cutting costs to try to regain profitability, saving almost $40 million since FY 2023. 

CFO Ken Gianella said in the earnings call: “Over the last several years, the company has had significant cash spend on onetime consulting, a new ERP, updated infrastructure, new product introductions, and restructuring expenses. We are pleased to announce that we are substantially complete with these efforts.”

Lerner said restructuring and operational improvements are “improving our free cash flow, which is expected to be positive in the back half of fiscal year 2025 and driving fiscal 2026 to be cash flow positive for the first time in five years.” 

A chart below, showing Quantum’s revenues sequentially since FY 2016, reveals a five-year downward trend, which Lerner then reversed for nine quarters, until that turnaround was itself undone when the public cloud hyperscalers stopped buying Quantum’s tape libraries in FY 2023.

Quantum revenues

However, the tape library business is showing signs of growth, with Lerner saying there was “a multimillion-dollar purchase order in-house from one of the world’s leading cloud platforms” for Quantum’s Scalar i7 RAPTOR library. Gianella added: “We’re super excited about the i7 coming out and being a category killer to get those win rates back up.”

The company thinks growth prospects are picking up, with Lerner saying: “Our business strategy remains focused on high-priority growth initiatives, particularly around Myriad and ActiveScale as we are seeing demonstrated proof points of our ability to significantly expand within our target verticals. In Q2 2025, we achieved significant pipeline growth for Myriad and ActiveScale.”

The DXi T-series 1RU all-flash target backup appliances did well in the quarter, with Lerner saying: “We’ve had multiple strategic wins against the competition based on the DXi T-Series fast recovery times in the face of a cyberattack due to its leading data reduction and recovery rates.

“While our efforts are still short of the intended results, we are seeing positive proof points through our new product introductions, including Myriad traction, combined with driving a more operationally efficient business.”

Outlook

Gianella said: “While we are exceeding our expectations on product mix, gross margin, and cost improvements, we need to continue to focus on improving our overall revenue execution. We see improvements in second-half of FY 2025 continuing into FY 2026.”

Next quarter’s revenue outlook is $72.0 million +/- $2.0 million. This will be a sequential rise at the midpoint but, as we don’t yet know the year-ago Q3 revenue number, we don’t know if this will be a revenue rise or fall compared to a year ago. Gianella said that the outlook “reflects management’s view of ongoing operational headwinds including transition to a new manufacturing partner during the quarter … We’re consolidating our manufacturing operations into one new location.”

Lerner said Quantum is “evolving our sales model to focus dedicated sales resources on select product lines” as a way of growing revenue. He added: “We have completed the heavy lift on the operational model. We have fully refreshed our product portfolio and we are now actively engaged in re-energizing our go-to-market approach. All of these combined create positive momentum in the coming quarters and beyond.”

Its full-year outlook is $280 million +/- $5 million, a 10 percent fall on fiscal 2024’s $311.6 million. This implies Quantum’s fourth FY 2025 quarter will bring in around $66 million, a 7.7 percent fall annually.

Delisting

Having avoided a previous Nasdaq delisting threat, triggered when its shares traded below a minimum $1 value, by implementing a reverse stock split, Quantum was told by Nasdaq on October 4 that it again faced delisting. This time the reason was that its minimum market value, based on its publicly traded shares, had been below the required $15 million for 30 consecutive days. It has 180 days to regain compliance by keeping its traded share market value above $15 million for ten consecutive days.

Infinidat offers RAG for GenAI riches

Storage array supplier Infinidat has devised a RAG workflow deployment architecture so its customers can run generative AI inferencing workloads on its InfiniBox on-premises and InfuzeOS public cloud environments.

The HDD-based InfiniBox and all-flash InfiniBox SSA are high-end, on-premises, enterprise storage arrays with in-memory caching. They use the InfuzeOS control software. InfuzeOS Cloud Edition runs in the AWS and Azure clouds to provide an InfiniBox environment there. Generative AI (GenAI) uses large language models (LLMs) trained on general-purpose datasets, typically in massive GPU cluster farms, with the trained LLMs and smaller models (SLMs) used to infer responses (inferencing) to user requests without needing massive GPU clusters for processing.

However, their generalized training is not good enough to produce accurate responses for specific data environments, such as a business’s production, sales, or marketing situation, unless their responses use retrieval-augmented generation (RAG) drawing on an organization’s proprietary data. This data needs transforming into vectors, dense mathematical representations of pieces of data, which are stored in vector databases and made available to LLMs and SLMs using RAG inside a customer’s own environment.

Infinidat has added a RAG workflow capability to its offering, both on-premises and in the cloud, so that Infinidat-stored datasets can be used in GenAI inferencing applications. 

Infinidat RAG architecture

Infinidat CMO Eric Herzog stated: “Infinidat will play a critical role in RAG deployments, leveraging data on InfiniBox enterprise storage solutions, which are perfectly suited for retrieval-based AI workloads.

Eric Herzog, Infinidat

“Vector databases that are central to obtaining the information to increase the accuracy of GenAI models run extremely well in Infinidat’s storage environment. Our customers can deploy RAG on their existing storage infrastructure, taking advantage of the InfiniBox system’s high performance, industry-leading low latency, and unique Neural Cache technology, enabling delivery of rapid and highly accurate responses for GenAI workloads.”

Vector databases are offered by a number of suppliers, such as Oracle, PostgreSQL, MongoDB, and DataStax Enterprise, and these databases can run on Infinidat’s arrays.

Infinidat’s RAG workflow architecture runs on a Kubernetes cluster. This is the foundation for running the RAG pipeline, enabling high availability, scalability, and resource efficiency. Infinidat says that, using Terraform on AWS, it simplifies setting up a RAG system to a single command that runs the entire automation. Meanwhile, the same core code running between InfiniBox on-premises and InfuzeOS Cloud Edition “makes replication a breeze. Within ten minutes, a fully functioning RAG system is ready to work with your data on InfuzeOS Cloud Edition.”

Infinidat says: “When a user poses a question (e.g. ChatGPT), their query is converted into an embedding that lives within the same space as the pre-existing embeddings in the vector database. With similarity search, vector databases quickly identify the nearest vectors to the query to respond. Again, the ultra-low latency of InfiniBox enables rapid responses for GenAI workloads.”
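To make that similarity-search step concrete, here is a minimal sketch in Python. The embed() function is a placeholder for a real embedding model, and the in-memory numpy array stands in for an actual vector database; none of this is Infinidat code.

```python
# Minimal sketch of the vector similarity-search step in a RAG pipeline.
# embed() is a placeholder for a real embedding model; the numpy array
# stands in for a vector database such as those named above.

import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: a real system calls an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)          # e.g. a 384-dimension embedding
    return v / np.linalg.norm(v)          # unit-normalize for cosine scoring

documents = ["Q3 sales report", "HR onboarding policy", "InfiniBox runbook"]
doc_vectors = np.stack([embed(d) for d in documents])

query = "how do I provision an InfiniBox volume?"
q = embed(query)

# On unit vectors, cosine similarity reduces to a dot product.
scores = doc_vectors @ q
best = int(np.argmax(scores))
print(f"Nearest document: {documents[best]!r} (score {scores[best]:.3f})")
```

The retrieved documents are then passed to the LLM alongside the user’s question, which is what grounds the model’s answer in the organization’s own data.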

Marc Staimer, president of Dragon Slayer Consulting, commented: “With RAG inferencing being part of almost every enterprise AI project, the opportunity for Infinidat to expand its impact in the enterprise market with its highly targeted RAG reference architecture is significant.” 

NetApp has also added RAG facilities to its storage arrays, including integrating Nvidia’s NIM and NeMo Retriever microservices, which Infinidat is not doing. NetApp and Dell are also supporting AI training workloads.

If AI inferencing becomes a widely used application, all storage system software – for block, file, and object storage – will have to adapt to support such workloads.

Read more in a Bill Basinas-authored Infinidat blog, “Infinidat: A Perfect Fit for Retrieval-augmented Generation (RAG): Making AI Models More Accurate”. Basinas is Infinidat’s Senior Director for Product Marketing.

MinIO releases AIStor with GPUDirect-like S3 over RDMA

Open source supplier MinIO has evolved its Enterprise Object Store software to develop the faster and more scalable AIStor for GenAI workloads, including training.

GenAI training requires shipping data at high speed for GPU processing, including to large-scale GPU server farms, and also writing checkpoint data at high speed to reduce GPU processor idle time. MinIO has introduced a new S3 API called promptObject, support for S3 over RDMA, an AIHub private Hugging Face-compatible repository, and an updated global console with a new Kubernetes operator, extending its AI training and inferencing support.

AB Periasamy, MinIO

AB Periasamy, co-founder and CEO at MinIO, said: “The launch of AIStor is an important milestone for MinIO. Our object store is the standard in the private cloud and the features we have built into AIStor reflect the needs of our most demanding and ambitious customers.”

He thinks that “it is not enough to just protect and store data in the age of AI, storage companies like ours must facilitate an understanding of the data that resides on our software. AIStor is the realization of this vision and serves both our IT audience and our developer community.”

The promptObject API enables users, MinIO says, to “talk” to unstructured objects in the same way one would engage a large language model (LLM), moving the storage world from a PUT and GET paradigm to a PUT and PROMPT paradigm. Applications can use promptObject through function calling with additional logic, and this can be combined with chained functions, with multiple objects addressed at the same time.

For example, when querying a stored MRI scan, one can ask “where is the abnormality?” or “which region shows the most inflammation?” and promptObject will show it. MinIO reckons the applications are almost infinite when considering this extension. MinIO says: “This means that application developers can exponentially expand the capabilities of their applications without requiring domain-specific knowledge of RAG models or vector databases. This will dramatically simplify AI application development while simultaneously making it more powerful.”
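MinIO has not published promptObject’s exact call signature in this announcement, so the sketch below is hypothetical: the endpoint route, payload shape, and credentials are placeholders that only illustrate the PUT-and-PROMPT flow described above.

```python
# Hypothetical illustration of the PUT-and-PROMPT pattern. The endpoint
# path, query parameter, and payload below are assumptions for the sake
# of the example, not AIStor's published API.

import requests

AISTOR = "https://aistor.example.com"    # assumed endpoint
BUCKET, OBJECT = "scans", "patient-042-mri.dcm"

# PUT: the object is stored via the usual S3 upload path (elided here).

# PROMPT: ask a question of the stored object directly.
resp = requests.post(
    f"{AISTOR}/{BUCKET}/{OBJECT}?prompt",            # hypothetical route
    json={"prompt": "Where is the abnormality?"},
    headers={"Authorization": "Bearer <token>"},     # placeholder credentials
    timeout=30,
)
print(resp.json())   # e.g. a region description for the MRI scan
```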

Having S3 over Remote Direct Memory Access (RDMA) available enables RDMA’s low-latency, high-throughput capabilities, using customers’ 400GbE, 800GbE, and beyond Ethernet links. RDMA over Converged Ethernet (RoCE) brings RDMA’s benefits to Ethernet, supporting low-latency, high-throughput data transfer on a familiar, scalable infrastructure.

S3 over RDMA provides the performance gains required to keep the GPU compute layer fully utilized, while reducing storage server CPU utilization and avoiding the latency-adding data copy into that server’s DRAM. This is equivalent CPU-bypass functionality to that used by Nvidia’s GPUDirect file-access protocol. It means object storage can now, in principle, feed data to GPU servers on the same terms as GPUDirect-supporting file systems, such as products from DDN, IBM (Storage Scale), NetApp, Pure Storage, VAST Data, WEKA, and others.

GPUDirect storage access diagram

MinIO says RDMA tackles TCP/IP’s limitations in high-speed networking through:

  • Direct Memory Access: RDMA bypasses the kernel and CPU, reducing latency by allowing memory-to-memory data transfers.
  • Zero-Copy Data Transfer: Data moves directly from one application’s memory to another’s without intermediate buffering, improving efficiency.
  • CPU Offloading: RDMA offloads network processing to the NIC, freeing CPU resources.
  • Efficient Flow Control: RDMA’s NIC-based flow control is faster and uses fewer CPU cycles than TCP’s congestion control, allowing for more stable high-speed performance.

We understand Nvidia is working with several object storage partners to make this facility available. Scality recently accelerated its object storage with the RING XP offering, but this was done, we understand, without using S3 over RDMA.

The private Hugging Face API-compatible AIHub repository is for storing AI models and datasets directly in AIStor. This enables customers to create their own data and model repositories on the private cloud or in air-gapped environments without changing a line of code. It eliminates the risk of developers leaking sensitive data sets or models.

The redesigned Global Console user interface for MinIO has a new Kubernetes operator that simplifies the management of large-scale data infrastructure with hundreds of servers and tens of thousands of drives. It also provides capabilities for Identity and Access Management (IAM), Information Lifecycle Management (ILM), load balancing, firewall, security, caching, and orchestration, accessed through a single pane of glass.

Rajdeep Sengupta, Director of Systems Engineering, AMD, commented: “We have deployed the MinIO offering to host our big data platform for structured, unstructured, and multimodal datasets. Our collaboration with MinIO optimizes AIStor to fully leverage our advanced enterprise compute technologies and address the growing demands of data center infrastructure.”

Altogether this is an important step forward for object storage and will make vast object-stored datasets directly available for AI training and inference. Read more about AIStor here.

Solidigm, Phison both break 100 TB SSD capacity barrier

Solidigm has announced a 122 TB version of its D5-P5336 SSD the same day that Phison announced a Pascari D205V SSD with the exact same capacity.

Update: Phison drive capacity corrected to 122.88TB.

The Phison drive is faster as it uses the PCIe 5 bus, with a 32 GT/s per-lane rate, twice as fast as the PCIe gen 4 link used by Solidigm’s SSD.

Greg Matson, Solidigm

Greg Matson, SVP of Strategic Planning and Marketing at Solidigm, said: “This massive capacity SSD is a game-changer – using far fewer watts per terabyte and freeing up valuable energy for other datacenter and edge power priorities.” 

Solidigm says it has shipped more than 100 EB (exabytes) of QLC-based product since 2018, and this new drive “is designed by the pioneers of QLC, so can be deployed with confidence.” 

The performance details are: 930,000 random read IOPS, 25,000 random write IOPS, 7.4 GBps sequential read and 3.2 GBps sequential write bandwidth. We understand that, like the existing D5-P5336 drives, the new one uses 192-layer 3D NAND in QLC (4bits/cell) format.

Solidigm D5-P5336

The highest-capacity SSDs from competing suppliers Samsung, Micron, and Western Digital are at the 60 TB level. With AI training and inference workloads needing fast access to larger and larger datasets, and GPUs taking an increasing share of datacenter electricity supply, having higher-capacity SSDs that customers can plug into existing PCIe gen 4 servers should be a benefit.

Solidigm says its new drive means you can store up to 4 PB of data per rack unit, extending the storage density enormously compared to disk drives, which max out around 35 TB in their 3.5-inch drive bays. The new drive also “improves power density at the edge with 3.4x more terabytes per watt versus 30 TB TLC.” 
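The 4 PB per rack unit claim implies roughly 32 to 33 drives per rack unit, consistent with E1.L slot counts in 1U servers; the drive count below is our inference, not a Solidigm-stated layout.

```python
# How many 122.88 TB drives does 4 PB per rack unit imply? The ~32-drive
# answer is our inference from the quoted figures, not a stated layout.

drive_tb = 122.88
claimed_pb_per_ru = 4.0

drives_per_ru = claimed_pb_per_ru * 1000 / drive_tb
print(f"{drives_per_ru:.1f} drives per rack unit")   # ~32.6
```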

Dell’s Travis Vigil, SVP for ISG Product Management, said in a statement: “Dell Technologies believes that higher density provides the path to maximizing storage energy efficiency while minimizing datacenter footprint. As we strive towards density in our own solutions, we look forward to continued storage innovations like Solidigm’s new 122 TB D5-P5336 solid-state drive.”

The U.2 format D5-P5336 is sampling with customers now and an E1.L format version should be sampling in January 2025.

Phison’s Pascari D205V

Phison revealed its Pascari brand 64 TB D200V with a PCIe gen 5 interface in October. Now it has a D205V available for pre-order with a 122.88 TB capacity. The details are:

Phison Pascari D205V

  • PCIe 5.0 x4 (single port)/PCIe 5.0 2x2 (dual port), NVMe 2.0, ISE, TCG Opal supported, NVMe-MI supported
  • Power Loss Protection (PLP), 128 namespaces, DWPD: 0.3, MTBF: 2.5 million hours
  • Sequential performance up to:
    • Read: 14,600 MBps
    • Write: 3,200 MBps
  • Random performance up to:
    • Read: 3 million IOPS (4KB blocks)
    • Write: 35K IOPS (16KB blocks)

How does this compare to the Solidigm 122 TB drive? Excellently, as Phison’s SSD is much faster at random read and write IOPS and also sequential read bandwidth. Basically it leaves Solidigm in the dust, apart from sequential write speed where Solidigm is equal at 3.2 GBps, despite its slower PCIe gen 4 bus. It raises the question of how the Solidigm drive would perform with a PCIe gen 5 interface.

Michael Wu, Phison

Michael Wu, GM and President, Phison US, banged the AI drum, stating: “With the acceleration in AI training and data-intensive workloads, there has been a tangible shift to a future-forward focus on storage as a critical component in capturing necessary volume to support data quality and integrity … Customers can essentially push past previous infrastructure barriers to continue to scale as the market demands.”

The Pascari D205V is available for pre-order now for expected shipping in early Q2 2025 in the U.2 and E3.L form factors.

Phison will be showing its aiDAPTIV+ Pro Suite at SC24. This is said to offer the first commercial end-to-end AI experience, processing “data from ingest to inference, providing a comprehensive software solution for training LLMs on-premises with customers’ domain specific data. Pro Suite delivers ease of use in domain training, while supporting data privacy, control, and affordability provided by aiDAPTIV+.”

Phison is also showing a PS7161 device, saying it’s the first PCIe gen 6 redriver available now for pre-order with expected shipping in Q1 2025.

Bootnote

Samsung talked about a 128 TB flash drive in August 2017 made with QLC flash. This was just talk at the time and it hasn’t actually appeared yet. Nimbus Data introduced a 100 TB ExaDrive SSD six years ago in May 2018. This had a 3.5-inch enclosure, a SATA interface, and was made from 2bits/cell MLC NAND. It was relatively slow, with 100,000 random read and write IOPS, and did not make much of an impression in the general SSD market. We envisage Micron, Samsung, Western Digital, and Kioxia will all produce 100 TB-plus SSDs in the next few months.

Nutanix provides cloud-native AI stack

Nutanix has built a cloud-native Nutanix Enterprise AI (NAI) software stack that can support the deployment of generative AI (GenAI) apps “in minutes, not days or weeks” by helping customers deploy, run, and scale inference endpoints for large language models.

NAI can run on-premises, at the edge, or in datacenters, and in the three main public clouds’ Kubernetes offerings – AWS EKS, Azure AKS, and Google GKE – as well as other Kubernetes run-time environments. This multi-cloud operating software can run LLMs with Nvidia NIM-optimized inference microservices as well as open source foundation models from Hugging Face. The LLMs operate atop the NAI platform and can access NAI-stored data.

Thomas Cornely, Nutanix

Thomas Cornely, SVP for Product Management at Nutanix, stated: “With Nutanix Enterprise AI, we’re helping our customers simply and securely run GenAI applications on-premises or in public clouds. Nutanix Enterprise AI can run on any Kubernetes platform and allows their AI applications to run in their secure location, with a predictable cost model.”

NAI is a component of Nutanix GPT-in-a-Box 2.0, which also includes Nutanix Cloud Infrastructure, Nutanix Kubernetes Platform, and Nutanix Unified Storage, plus services to support customer configuration and sizing needs for on-premises training and inferencing.

Nutanix sees AI training – particularly large-scale generalized LLM training – taking place in specialized mass GPU server facilities. New GenAI apps are often built in the public cloud, with fine-tuning of models using private data occurring on-premises. Inferencing is deployed closest to the business logic, which could be at the edge, in datacenters, or in the public cloud. NAI supports these inferencing workloads and app locations.

Nutanix says NAI has a transparent and predictable pricing model based on infrastructure resources. This is in contrast to most cloud services that come with complex metering and unpredictable usage-based pricing.

Nutanix Enterprise AI deployment scenarios

The usability and security angles have roles here. NAI offers an intuitive dashboard for troubleshooting, observability, and utilization of resources used for LLMs, as well as role-based access controls (RBAC) to ensure LLM accessibility is controllable and understood. Organizations requiring hardened security will also be able to deploy in air-gapped or dark-site environments. 

Nutanix diagram

Nutanix suggests NAI can be used for enhancing customer experience with GenAI through improved analysis of customer feedback and documents. It can accelerate code and content creation by leveraging copilots and intelligent document processing as well as fine-tuning models on domain-specific data. It can also strengthen security – including leveraging AI models for fraud detection, threat detection, alert enrichment, and automatic policy creation.

We see NAI presented as an LLM fine-tuning and inference alternative to offerings from Microsoft, Red Hat, and VMware that should appeal to Nutanix’s 25,000-plus customers. Coincidentally, Red Hat has just announced updates to its enterprise offerings including OpenShift AI, OpenShift, and Developer Hub:

  • Developer Hub – Includes tools for leveraging AI to build smarter applications, including new software templates and expanded catalog options.
  • OpenShift AI 2.15 – Offers enhanced flexibility, optimization, and tracking, empowering businesses to accelerate AI/ML innovation and maintain secure operations at scale across cloud and edge environments.
  • OpenShift 4.17 – Streamlines application development and integrates new security features, helping businesses to tackle complex challenges.

On top of that, Red Hat has signed a definitive agreement to acquire Neural Magic – a pioneer in software and algorithms that accelerate GenAI inference workloads.

SUSE also presented SUSE AI, described as “a secure, trusted platform to deploy and run GenAI applications,” at KubeCon North America.

NAI and GPT-in-a-Box 2.0 are currently available to customers. More information here.

Hitachi jumps into QLC flash array pool and adds object storage to VSP One

Hitachi Vantara has added a low-cost all-flash array and object storage to its Virtual Storage Platform One (VSP One) portfolio.

The VSP One portfolio is described as a unified hybrid cloud product suite. It includes the VSP One SDS Block, VSP One Block appliance, and VSP One File offerings for on-premises use, and VSP One SDS Cloud (cloud-native SVOS) for the AWS cloud. Hitachi V has now extended VSP One with this new block product and a new object storage appliance.

Octavian Tanase, Hitachi Vantara

The company’s chief product officer, Octavian Tanase, stated: “Enterprises today are navigating an incredibly complex data landscape, with hybrid and multi-cloud environments and the growing influence of GenAI transforming how they operate.” 

The QLC flash array provides “high-density, cost-effective storage ideal for large-capacity needs” and “features public cloud replication providing disaster recovery and higher data availability.” It uses dual-ported Samsung 30 TB QLC SSDs “offering the best combination of performance and availability.” Hitachi V blogger Michael Hay says the VSP One Block array’s operating software has “SVOS improvements ensuring longer media life” for these drives, but there are no details.

There is also “additional telemetry, covering performance and wear data, that can be used to analyze individual arrays and our fleet with QLC, learning with our customers over time.” The wear rate telemetry is monitored by Hitachi Remote Ops and visible in the Clear Sight management facility.

We’re not given array details such as the number of drives per chassis, raw capacity, network connectivity, etc. But we are told there is a 4:1 data reduction guarantee plus a 100 percent data availability guarantee.

A VSP One Block datasheet says customers can choose either TLC or QLC drives, meaning VSP One Block QLC is basically a SKU variation rather than a separate product. The basic chassis takes up 2RU and can hold up to 3.6 PB of effective storage, implying 900 TB raw at a 4:1 data reduction ratio and 30 x 30 TB SSDs, which seems high for a 2RU chassis with controllers inside it. A system can scale out to 65 appliances in total, and individual appliances can have up to two NVMe drive expansion shelves. That would explain the 30-drive number, with the drives spread across a base chassis and a pair of expansion shelves.
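Those implied figures are internally consistent, as this quick sanity check of the datasheet arithmetic shows:

```python
# Quick check of the capacity arithmetic implied by the datasheet figures.
drives_per_system = 30      # base chassis plus two expansion shelves
drive_capacity_tb = 30
reduction_ratio   = 4       # the guaranteed 4:1 data reduction

raw_tb       = drives_per_system * drive_capacity_tb   # 900 TB
effective_tb = raw_tb * reduction_ratio                # 3,600 TB = 3.6 PB

print(f"raw: {raw_tb} TB, effective: {effective_tb / 1000} PB")
```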

VSP One Block Appliance chassis.

There is 256 GBps of available Fibre Channel bandwidth. A VSP One Block model can have Compression Accelerator Modules (CAMs) to offload data reduction processing workloads from controller processors. And patented Dynamic Carbon Reduction (DCR) technology optimizes power consumption by switching controller CPUs into low-power ECO mode during periods of low activity.

There is even less information available about VSP One Object than VSP One Block. We’re told the multi-node S3-compliant object storage product has been “engineered for scalability and provides a robust solution for managing massive volumes of unstructured data driven by AI workloads,” particularly in the media, healthcare, and finance markets. 

As we understand things, it’s basically updated Hitachi Content Platform (HCP) software – as a Learn more link on its web page takes you to an HCP webpage. We have no idea what storage media is used and have asked Hitachi V questions about this and the VSP One Block QLC appliance.

There’s a little more information about the Virtual Storage Platform One suite of products here but it’s mostly marketing.

DeepTempo unveils AI-powered app for detecting security incidents

AI infosec startup DeepTempo uses deep learning on log data to find evidence of cyber security incidents, and has launched its Tempo app to do this natively inside Snowflake.

It has just emerged from stealth and its founding CEO is Evan Powell. Tempo uses a log language model, LogLM, that can identify incidents and work with admin staff on fixing them.

DeepTempo Snowflake diagram

The Tempo app’s agentless LogLM detects anomalies in network traffic and provides additional context such as similar attack patterns from the MITRE ATT&CK matrix, potentially impacted entities, and other information needed by security operations teams for triage and response. DeepTempo claims customers get faster detection of attack indicators – including new and evolving threats – and can optimize security spend by running the DeepTempo software on their existing security data lakes.  

Powell stated: “Attackers are using AI and collaboration to surpass defenders in innovation. Our mission at DeepTempo is to return the initiative to the defenders. By making available our AI-driven security solution as a Snowflake Native App, we are able to leverage Snowflake’s high availability and disaster recovery along with their security reviews and controls. Our Tempo software is available with immediate availability to the thousands of Snowflake customers.”

A Tempo blog explains: “Built and pre-trained with the assistance of a major global financial institution, Tempo has demonstrated a unique blend of accuracy and practicality, with false positive and false negative rates lower than one percent after adaptation to a new user’s domain. Tempo has been initially optimized to work with Netflow data and DeepTempo is recruiting users with similar logs such as VPC Flow logs as design partners.

“Tempo can identify subtle deviations from normal behavior, including longer-duration attacks that might slip past traditional signature-based systems. This capability is particularly valuable in the face of innovative attackers, as Tempo doesn’t need to keep track of specific attack patterns. Instead, it simply recognizes when activities deviate from the norm, triggering detection for any threat that emerges.”
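DeepTempo has not published how LogLM scores deviations, so the toy sketch below only illustrates the general deviation-from-the-norm idea with a z-score test on per-minute flow counts; it is not DeepTempo’s model.

```python
# Toy deviation-from-the-norm detector on flow counts. A generic z-score
# test, purely illustrative of signatureless detection; not DeepTempo's
# LogLM, which applies deep learning to raw logs.

import statistics

baseline = [1210, 1180, 1253, 1197, 1224, 1241, 1189, 1230]  # flows/min, normal
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(flows_per_min: float, threshold: float = 4.0) -> bool:
    """Flag activity that deviates strongly from the learned baseline."""
    return abs(flows_per_min - mean) / stdev > threshold

print(is_anomalous(1215))   # False: within normal variation
print(is_anomalous(9800))   # True: e.g. scanning or mass exfiltration
```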

Tempo is claimed to save money “by enabling organizations to keep more of their logs within Snowflake and use their SIEMs primarily for incident response rather than log storage … In one case study involving a large financial institution, projected savings reached several million dollars, representing up to 45 percent of their existing SIEM spending. These savings stem from the ability to use Snowflake as the system of record instead of pushing NetFlow and VPC flow logs into a separate SIEM.”

DeepTempo actually uses software technology from another startup, Skidaway, co-founded by CEO Evan Powell and CTO Brennan Lodge. Lodge has a cyber security, data management, and financial services background, gained from working at JP Morgan Chase, the Federal Reserve Bank of New York, Bloomberg, and Goldman Sachs.

Powell has been the founding CEO for several acquired startups: Clarus Systems, DDN-acquired Nexenta, Brocade-bought StackStorm, and Kubernetes-focused and DataCore-acquired MayaData.

Skidaway has developed the deep learning Log Language Model (LogLM) software to detect cyber security incidents by analyzing and filtering raw log data that is used by DeepTempo in its Tempo app. The LogLM software can run on-premises in any Kubernetes-based workload management system, or in a data lake, like the Tempo Snowflake native app, for example, and can scale to handle petabytes of log data.

DeepTempo graphic

This incident detection is done without sending the raw data to SIEM (Security Information and Event Management) systems. 

LogLM is a generalized log analysis tool to detect anomalies and support troubleshooting. Skidaway claims existing log analysis tools are task-specific and need task-specific data sets with specialized log label pairs for each task. LogLM has an instruction-based framework that can interpret and respond to user instructions by generalizing across multiple log analysis tasks.

Eric Zietlow, the DevRel leader and platform lead at Skidaway, spent time at Powell’s MayaData startup. 

DeepTempo’s Tempo is available in preview mode and is the first native app for cyber security in the Snowflake Marketplace. Find out more about using Tempo inside Snowflake here.

Bootnote

A software engineering paper hosted on Cornell University’s arXiv, “LogLM: From Task-based to Instruction-based Automated Log Analysis,” discusses automatic log analysis and how to transform log-label pairs from multiple tasks and domains into a unified format of instruction-response pairs. The abstract reads: “Experimentally, LogLM outperforms existing approaches across five log analysis capabilities, and exhibits strong generalization abilities on complex instructions and unseen tasks.”

VDURA enhances VDP performance and scalability for AI

VDURA has upgraded its VDURA Data Platform (VDP) storage operating system to provide more performance, scalability, and simpler management – with a twin focus on AI and HPC users.

Ken Claffey, VDURA

VDURA was previously known as Panasas, and its flagship product was PanFS (parallel file system software). It claims VDP is a significant modernization of previous releases, with a move to a fully parallel, microservices-based architecture, a new flash-optimized metadata engine, and an enhanced object storage layer.

CEO Ken Claffey stated: “Our latest release simplifies data management while delivering exceptional performance and reliability. By enabling enterprises to scale seamlessly, VDURA accelerates AI initiatives and helps businesses tackle complexity to achieve transformative results.”

The VDP software introduces VeLO (Velocity Layered Operations) and a VPOD (Virtualized Protected Object Device) concept. VDP is deployed as discrete microservices, simplifying deployment across thousands of nodes and helping ensure linear performance scalability with an infinitely expandable global namespace.

VDURA Data Platform layers

It has intelligent data orchestration capabilities that enable optimal data placement across discrete tiers of storage, all sharing a unified data and control plane within a single namespace. 

VeLO is a key-value store used within VDP’s Director layer for handling small files and metadata operations, often found in AI workloads. VDP supports an infinitely scalable number of VeLO instances in the same Global Namespace. VeLO is optimized for flash and delivers up to two million IOPS per instance.

VPODs are discrete, virtualized, protected storage units, which provide the foundation of data storage in hybrid nodes. VPOD instances are infinitely scalable, operate within a unified global namespace, and provide high performance and flexible scalability. Data is safeguarded through erasure coding across multiple VPODs in a VDP cluster, with an optional, additional layer of erasure coding within each VPOD for enhanced protection. This multi-layered approach achieves up to 11 nines of durability with minimal overhead. Data reduction services further optimize efficiency, reducing costs and total ownership expenses. 
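VDURA does not disclose its erasure coding parameters, so the toy sketch below uses single XOR parity to stand in for a real erasure code; the shard counts and layout are our own illustration of the layered idea, not VDP internals.

```python
# Toy sketch of layered protection: parity across VPODs, plus optional
# parity within each VPOD. Real VDP uses proper erasure codes with
# unpublished parameters; single XOR parity here is only illustrative.

def xor_parity(shards: list[bytes]) -> bytes:
    out = bytearray(len(shards[0]))
    for s in shards:
        for i, b in enumerate(s):
            out[i] ^= b
    return bytes(out)

# Layer 1: stripe a block across three VPODs plus one cross-VPOD parity shard.
vpod_shards = [b"AAAA", b"BBBB", b"CCCC"]
cross_parity = xor_parity(vpod_shards)

# Lose any one VPOD shard and it is recoverable from the survivors + parity.
recovered = xor_parity([vpod_shards[1], vpod_shards[2], cross_parity])
assert recovered == vpod_shards[0]

# Layer 2 (optional, per the article): each VPOD can add internal parity
# across its own devices in the same way, compounding durability.
```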

VDURA has also introduced a V5000 Certified Platform – a hardware architecture combining modular storage nodes that can be configured for maximum throughput, capacity, or IOPS, depending on the user’s requirements. 

VDURA V5000 diagram

It includes:

  • 1RU Director Nodes powered by the latest AMD EPYC 9005 processors, Nvidia ConnectX-7, Broadcom 200Gbit/sec Ethernet, SAS 4 adapters, and Phison PCIe NVMe SSDs, optimized for high IOPS and metadata operations through VeLO. 
  • Hybrid Storage Nodes incorporating the same 1RU server used with the Director Node and 4RU JBODs running VPODs for cost-effective bulk storage with high performance and reliability. 

VDURA tells us that VDP supports a flexible ratio of HDD to flash content, meaning customers can balance capacity and throughput while maintaining a consistent global namespace, and can optimize VDP configurations for cost, performance, and durability needs. We asked VDURA a few questions about this VDP release to find out more.

Will the VDURA Data Platform (VDP) support GPUDirect?

Yes, VDURA is planning to support GPUDirect Storage (GDS), as well as RDMA and RoCE (v2), in the summer of 2025. Ultra Ethernet Transport support will follow afterwards.

Do the V5000 Director Nodes support SAS?

The V5000 Director Nodes are all NVMe-based and do not use SAS. The Storage Nodes do support SAS 4 adapters to manage high-density, high-performance JBODs within the platform, enabling scalable and cost-effective bulk storage with optimized performance.

How do V5000 Director Nodes communicate with VPODs in hybrid nodes?

V5000 Director Nodes communicate with VPODs over a high-speed network, either 100/200/400 Gbit Ethernet or InfiniBand NDR. SAS is not used as a network between Director and Hybrid Nodes.

How does a parallel file system “talk” to a VPOD object storage backend? Do the Director Nodes do file-to-object mapping?

The VDURA Data Platform uses a unified namespace in which Director Nodes handle metadata and small files via VeLO and larger data through VPODs. The Director Nodes manage file-to-object mapping, allowing seamless integration between the parallel file system and object storage.

What Phison SSDs are used in the V5000 Director Nodes?

The V5000 Director Nodes are equipped with Phison Pascari X200 PCIe NVMe SSDs with various capacity options.

Is VDP with VeLO and VPODs a disaggregated architecture?

Yes, VDP with VeLO and VPOD instances operates as part of a disaggregated, composable architecture.

  • The software instances are disaggregated from the physical hardware via the distributed microservices architecture.
  • The metadata and data are disaggregated to different logical and physical domains within the same global namespace.
  • The hardware platforms are themselves disaggregated and composable, with compute-intensive Director Nodes housing the VeLO instances and Storage Nodes housing the VPOD instances, with a number of Director Nodes and Storage Nodes supported per cluster/namespace, all connected via a high-speed network.

Will VDP run in the public clouds, considering it is microservices-based and supports storage classes?

Yes, we’re providing early access to the cloud edition of VDP to some of our customers now. General availability is set for the first half of 2025, enabling VDURA to seamlessly support both on-prem and public cloud instances of VDP.

Can VDURA offer thoughts on how VDP compares and contrasts to (1) Storage Scale, (2) DDN’s Lustre, and (3) WEKA’s data platform?

VDURA’s platform combines high-performance parallel file system capabilities with object storage in a single namespace.

IBM Storage Scale (aka GPFS) has a similar design point origin to Lustre as a “scratch” file system. IBM has developed its data protection capability over time, bolting additional layers onto an ever-deeper, more complex stack. However, it does not offer the same level of protection as VDP, and the biggest difference we hear from customers with experience of both is that VDP offers far greater ease of use and reliability.

DDN’s Lustre is a “scratch” file system. It is not designed to protect your data, nor is it designed for the reliability or ease of use required by enterprise customers. At the most basic level, it requires a third-party/additional software stack to provide data protection and handle hardware failures. In DDN’s case, they rely on their old RAID stack to provide this data protection. This discrete RAID layer is itself based on the legacy HA pair controller architecture. There are lots of problems with this approach at scale, and that is why the availability and durability of these types of systems degrade as the number of storage nodes grows.

This is in stark contrast to VDP, which has advanced data protection built into the file system itself. Indeed, we protect data at multiple layers in the stack, which means that as the number of nodes in the cluster grows, our availability and durability only increase. This single integrated stack approach – and its self-healing capabilities – culminates in the superior ease of use and reliability that our customers enjoy.

Compared to WEKA, VDURA offers better scalability and integration for hybrid storage, leveraging both flash and HDD, which allows more flexible tiering with significant cost benefits and ease of use. If we compare the V5000 with the latest WekaPOD announcement, we can deliver the same or even better performance for a much lower price.

Micron launches world’s fastest 60 TB PCIe 5 SSD

Micron’s 6550 ION SSD matches the 61.44 TB capacity of Samsung, Solidigm, and WD’s competing drives, but promises higher performance as it uses the PCIe gen 5 bus instead of PCIe gen 4.

The 6550 ION is aimed at networked AI data lakes, ingest, data preparation and checkpointing, file and object storage, public cloud storage, analytic databases, and content delivery.

Micron says the drive is the industry’s first E3.S and PCIe 5 60 TB SSD. It has OCP 2.5 support with active state power management (ASPM). This allows the drive to idle at 4 watts in the L1 state versus 5 watts in the L0 state, improving energy efficiency by up to 20 percent when idling.

Alvaro Toledo, Micron

Alvaro Toledo, Micron’s Data Center Storage Group VP and GM, said in a statement: “Featuring a first-to-market 60 TB capacity in an E3.S form factor and up to 20 percent better energy efficiency than competitive drives, the Micron 6550 ION is a game-changer for high-capacity storage solutions to address the insatiable capacity and power demands of AI workloads.”

He said it “achieves a remarkable 12 GBps while using just 20 watts of power, setting a new standard in datacenter performance and energy efficiency.”

The 6550 ION comes in E3.S, E1.L, and U.2 form factors. It is an all-Micron drive using in-house DRAM, NAND, controller, and firmware. From the security point of view, it supports SPDM 1.2 for attestation, SHA-512 for secure signature generation, and is TAA-compliant and FIPS 140-3 L2 certifiable.

The drive’s endurance is one full drive write a day for the five-year warranty period. It’s built with what Micron calls its G8 NAND, with 232 layers. In June this year, Micron announced 276-layer 3D NAND, saying it was G9 NAND.

Micron 6550 ION SSDs

The 6550 ION follows on from the existing 6500 ION, a 30.72 TB PCIe gen 4 SSD with an NVMe link. That drive was announced in May 2023, built with 232-layer TLC NAND, and marketed as having TLC performance at QLC pricing. It was produced in U.3 and E1.L form factors, and the 6550 comes in these plus the E3.S form factor. The 6550 is built with Micron’s G8 3D NAND in TLC format, and claimed to be “one to three NAND generations ahead of competing 60 TB SSDs,” meaning drives from Samsung, Solidigm, and Western Digital, which all use the PCIe gen 4 bus.

We think Micron has renumbered its 3D NAND generations and missed out a generation as well. B&F has asked Micron to clarify its 3D NAND generation numbering and layer counts.

Micron-supplied 3D NAND generation table.

For reference, Samsung’s QLC BM1743 uses its seventh-generation 176-layer 3D NAND. Solidigm’s QLC D5-P5336 has 192-layer NAND, and Western Digital’s TLC DC SN655 drive uses 112-layer NAND with a PCIe gen 4 interface.

The actual 6550 performance numbers are 12.5 GBps for sequential read/write bandwidth, 1.6 million random read IOPS, and 70,000 random write IOPS. This is faster than the competing drives named above:

  • Samsung BM1743 – 7.2/2 GBps sequential read/write and 1.6 million/110K random read/write IOPS
  • Solidigm D5-P5336 – 7/3.3 GBps sequential read/write and 1.005 million/43K random read/write IOPS
  • WD DC SN655 – 6.8/3.7 GBps sequential read/write and 1.1 million/125K random read/write IOPS

As the 6550 uses the faster PCIe gen 5 bus, the sequential speed advantage is hardly surprising. Its random read IOPS merely match Samsung’s BM1743, though, and its random write IOPS are lower.

Micron claimed that “as the world’s first E3.S 60TB SSD, the 6550 offers best-in-class density, reducing rack storage needs by up to 67 percent.” It can “store over 1.2 petabytes per rack unit” and by “using a 1U high-density server, such as the HPE ProLiant DL360 Gen11 that can accommodate 20 E3.S drives per rack unit, operators can load servers in a single rack with 44.2 petabytes.”
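The quoted density figures hang together arithmetically; note that the 36-server rack count in the sketch below is our inference from Micron’s numbers, not a stated configuration.

```python
# Checking the density claims: 20 E3.S drives per rack unit at 61.44 TB
# each, and the 44.2 PB rack figure. The 36-server count is inferred
# from the quoted numbers (44.2 / 1.2288 ≈ 36), not Micron-stated.

drive_tb = 61.44
per_ru_pb = 20 * drive_tb / 1000         # ~1.23 PB per rack unit

servers_per_rack = 36                    # inferred
rack_pb = servers_per_rack * per_ru_pb   # ~44.2 PB

print(f"{per_ru_pb:.2f} PB/RU, {rack_pb:.1f} PB per rack")
```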

The 6550 ION can be fully written in 3.4 hours, while competing drives take up to 150 percent longer to fill. Micron says the 6550 ION delivers:

  • 179 percent faster sequential reads and 179 percent higher read bandwidth per watt
  • 150 percent faster sequential writes and 213 percent higher write bandwidth per watt
  • 80 percent faster random reads and 99 percent higher read IOPS per watt

Specifically with AI workloads, it has:

  • 147 percent higher performance for Nvidia Magnum IO GPUDirect Storage (GDS) and 104 percent better energy efficiency
  • 30 percent higher 4KB transfer performance for deep learning IO Unet3D testing and 20 percent better energy efficiency
  • 151 percent improvement in completion times for AI model checkpointing while competitors consume 209 percent more energy

With large AI training jobs involving models with billions of parameters, checkpointing is sometimes run as often as hourly, causing GPUs to be idle for minutes or even tens of minutes while the checkpoint is written to SSD storage. Faster checkpoints pay off with faster training job times and fewer idle GPU hours.

45Drives adds ransomware protection and encryption to Ceph

Open source storage systems supplier 45Drives has developed SnapShield ransomware protection and CephArmor encryption for Ceph.

Ceph, whose development Red Hat backs, is open source object, block, and file storage with three copies of data kept for reliability. 45Drives is a Protocase subsidiary with offices in North Carolina and Nova Scotia. It supplies the Storinator storage server, Stornado all-flash server, Proxinator virtualization server, Destroyinator drive wiping, enterprise drives, plus other products.

SnapShield uses real-time behavioral analysis and functions as a “ransomware-activated fuse,” snapshotting files every five minutes. When it detects ransomware, it disconnects the compromised client from the server and snapshots existing files. The theory is that the attack is stopped soon after it starts and damage is limited.

45Drives president Doug Milburn claimed: “It catches the attack within a few tens of files.”

SnapShield maintains detailed logs of the malicious activity, produces a list of damaged files, and offers a restore function to “quickly repair” any affected files.

45Drives diagram

There is no requirement for client-side agents and it has minimal impact on system performance. It’s compatible with 45Drives storage systems running ZFS or Ceph with Windows file sharing.

45Drives says Ceph lacks native block and file encryption, so it has partnered with the University of New Brunswick (UNB) Faculty of Computer Science to develop CephArmor object-level encryption to fill the gap. This is an enhancement to Ceph’s Reliable Autonomic Distributed Object Store (RADOS) layer and encrypts data at object level before it is stored. Evaluations on Storinator hardware have shown that the added CephArmor security layer maintains Ceph’s performance levels.

We’ll have to wait, though, as the CephArmor project is expected to be ready for implementation by the end of 2025.

Check out a SnapShield video here.

Apache Cassandra survey highlights growing adoption for AI workloads

Apache Cassandra is an established open source, NoSQL database designed to handle large amounts of data across many commodity servers. So what applications is it now supporting?

The annual Cassandra Community survey has landed, revealing Cassandra’s evolving usage. Among respondents, 41 percent said Cassandra was their organization’s primary database, with more than 50 percent of their enterprise data going through it. Over a third (34 percent) said 10 to 50 percent of their enterprise data was handled by Cassandra.

“Scalability” was cited by 78 percent of respondents as a reason for using the database, while 73 percent pointed to “performance.”

Cassandra Community Survey November 2024 chart

Among multiple use cases at organizations, 47 percent use the database for time series data, and 34 percent use it for event logging. In addition, 31 percent use the platform for data aggregation.

Other significant uses include online retail/e-commerce, user activity tracking, user profile management, fraud detection, and backup and archiving.
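Time series tops the use case list because a well-chosen partition key turns “latest readings for sensor X” into a single-partition, ordered slice. A minimal sketch of that pattern using the DataStax cassandra-driver follows; the keyspace, table, and contact point are illustrative, not from the survey.

```python
# Minimal Cassandra time-series pattern via the DataStax cassandra-driver.
# Keyspace, table, and contact point are illustrative examples.

from datetime import datetime, timezone
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute(
    "CREATE KEYSPACE IF NOT EXISTS metrics WITH replication = "
    "{'class': 'SimpleStrategy', 'replication_factor': 1}"
)

# One partition per sensor; rows clustered newest-first by timestamp, so
# "latest N readings" is a cheap single-partition slice.
session.execute("""
    CREATE TABLE IF NOT EXISTS metrics.sensor_readings (
        sensor_id  text,
        reading_ts timestamp,
        value      double,
        PRIMARY KEY ((sensor_id), reading_ts)
    ) WITH CLUSTERING ORDER BY (reading_ts DESC)
""")

session.execute(
    "INSERT INTO metrics.sensor_readings (sensor_id, reading_ts, value) "
    "VALUES (%s, %s, %s)",
    ("pump-7", datetime.now(timezone.utc), 42.5),
)
```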

Looking ahead, 43 percent said they planned to use Cassandra for AI workloads, and 38.5 percent planned to use it for machine learning workloads. Currently, 36 percent of users said they were already “experimenting” with the database to run at least one generative AI app.

In terms of data volumes, 30 percent currently run over 100 TB on Cassandra, and 27 percent handle 10 to 100 TB on it. Just under a quarter (23 percent) put 1 to 10 TB through it.

The survey found that 35 percent of Cassandra workloads were already in the cloud, and 25 percent of organizations pledged to put 10 to 50 percent of their workloads into the cloud over the next 12 months. Eight percent said they would be moving at least half of their workloads into the cloud in the next year.

Some 37 percent of Cassandra users had been using the platform for five to ten years, and nearly a fifth (18 percent) had used the database for upwards of ten years.