Cohesity has unveiled a deeper Google Cloud partnership for large language model (LLM) access, additions to its own AI technologies to improve LLM execution, and an expanded Data Security Alliance.
The three-pronged announcement came at Cohesity’s three-day Catalyst virtual conference. The Google Cloud Platform (GCP) element is focused on using generative AI to investigate an organization’s entire data contents better by integrating Cohesity’s Data Cloud offering with GCP’s Vertex AI.
Sanjay Poonen
Cohesity CEO and president Sanjay Poonen said: “To apply generative AI transformatively, businesses need to be able to easily get rapid insights from their data utilizing cutting-edge and leading AI/ML models.”
Google, which announced its PaLM 2 LLM at Google I/O earlier this month, can help supply them.
Google Cloud CEO Thomas Kurian said: “Vertex AI is one of the best platforms for building, deploying, managing and scaling ML models – and we’re excited that Cohesity is joining our growing open ecosystem to help more customers get value from their data via AI. Cohesity’s excellent data security and management capabilities, combined with Google Cloud’s powerful generative AI and analytics capabilities, will help customers get exceptional insights into their backup and archived data.”
GCP’s Vertex AI Workbench is a set of services for creating AI/ML models. It’s a platform or framework for data scientists to deploy machine learning models, ingest data, and analyze the results through a dashboard. Users with different AI/ML experience levels have access to a common toolset across data analytics, data science, and machine learning.
Cohesity and Google reckon that their joint customers will be able to get “human readable” insights into the data they’re securing and managing on Cohesity’s platform. Poonen claimed Cohesity “provides phenomenal search via our built-in indexing capabilities,” and has robust security protocols to keep the data private. Chatbots will provide a simpler way to search through Cohesity content.
Cohesity Turing and RAG
Turing is Cohesity’s own set of non-generative AI/ML capabilities and technologies that are integrated into its Data Cloud. They include:
Ransomware anomaly detection: Uses modeling and data entropy detection to “see” anomalies in data ingested, which can provide early warnings of a hidden threat.
Threat intelligence: Provides curated and managed threat feeds used in conjunction with machine learning models to detect threats.
Data classification: Helps ensure that organizations can identify their most sensitive data and its location.
Predictive capacity planning: Forecasting capacity utilization based on previous capacity utilization and including a what-if simulator.
Cohesity is adding Retrieval Augmented Generation (RAG) AI model workflows to Turing, which it says can help customers get deeper insights and discovery from data or find content in petabytes of data faster. It has filed a patent application in this area. A Cohesity blog says RAG enables LLMs to generate more knowledgeable, diverse, and relevant responses and is a more efficient approach to fine-tuning such models.
Cohesity is not developing its own LLM. Instead it wants to make LLMs more efficient when looking into Cohesity datasets. Think faster, better, super-charged search. Poonen has talked to Microsoft about this. A video helps set the RAG scene:
The blog says: “The retrieval-augmented response generation platform under development by Cohesity accepts a user or machine driven input – such as a question, or a query. That input is then tokenized with some keywords extracted that are used to filter the petabytes of an enterprise’s backup data to filter down to a smaller subset of data. It then selects representations from within those documents or objects that are most relevant to the user or machine query. That result is packaged, along with the original query, to the Language Model (such as GPT4) to provide a context-aware answer. This innovative approach ensures that the generated responses are not only knowledgeable but also diverse and relevant to the enterprise’s domain-specific content.”
Specifically: “By using RAG on top of an enterprise’s own dataset, a customer will not need to perform costly fine-tuning or initial training to teach the Language Models ‘what’ to say. … Leveraging RAG always provides the most recent and relevant context to any query.”
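The retrieval flow the blog describes – extract keywords from a query, filter a large corpus down to the most relevant content, then package that context with the original query for the LLM – can be sketched in a few lines. This is a hypothetical toy illustration, not Cohesity’s implementation: the keyword-overlap scoring and all function names here are invented stand-ins for real tokenizers, vector indexes, and model calls.

```python
def tokenize(text):
    """Crude lowercase keyword extraction (stand-in for a real tokenizer)."""
    return set(text.lower().split())

def retrieve(query, documents, top_k=2):
    """Rank documents by keyword overlap with the query; keep the top_k."""
    q_tokens = tokenize(query)
    scored = sorted(
        documents,
        key=lambda doc: len(q_tokens & tokenize(doc)),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, context_docs):
    """Package the retrieved context with the original query for the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Backup snapshots are retained for 30 days by policy.",
    "The marketing team owns the campaign asset bucket.",
    "Ransomware anomalies trigger an alert within minutes.",
]
query = "How long are backup snapshots retained?"
top_docs = retrieve(query, docs)
prompt = build_prompt(query, top_docs)
```

A production RAG pipeline would swap the keyword overlap for embedding similarity against a vector index, but the shape – retrieve, then generate with context – is the same.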
Cohesity RAG should be available in the near future, but no specific date was given.
Cohesity’s expanded Data Security Alliance includes updated integrations:
ServiceNow: ServiceNow Security Operations provides closed-loop detection and response for ransomware attacks via SOAR (Security Orchestration, Automation, and Response) integration, creating workflows through its IT service management (ITSM) offering. Customers can assess and address threats based on the potential impact to the business.
Tenable: The updated Tenable integration provides improved scalability so that snapshots can be scanned rapidly and improved vulnerability scanning used proactively, as part of cyber resilience best practices. Tenable powers Cohesity’s CyberScan capability.
Kioxia has replaced its gumstick client SSD with a new model with twice the capacity and double the random write speed of its predecessor.
The BG5 with its 1TB maximum capacity and PCIe gen 4 interface has been replaced by the BG6 with the same interface but a 2TB capacity. Like the BG5 it is an M.2 format drive in either 2230 or single-sided 2280 format and uses TLC (3 bits per cell) flash with a host-managed DRAM buffer. The BG5 used 112-layer 3D NAND, Kioxia’s BiCS 5 generation, but the new BG6 uses 162-layer BiCS 6 flash.
Neville Ichhaporia, Senior Veep and GM of Kioxia America’s SSD business unit, said: “Our new Kioxia BG6 SSDs deliver increased performance and density in a small footprint, making them well-suited to today’s ‘work and play from anywhere’ lifestyle.”
The BG6 has across the board performance improvements compared to the BG5, both random and sequential:
The random write IOPS have doubled from the BG5’s 450,000 to the BG6’s 900,000. Other speeds have increased as well, but not by as much. The 2TB version goes faster than the 1TB version, and we can expect the 256GB and 512GB drives to be slower again when they become available.
Kioxia chart
Kioxia has not supplied any endurance numbers for the BG6. We asked for them and a spokesperson said Kioxia “will release the full specification in July”. It says the new drive has:
Support for the latest TCG Pyrite and Opal standards.
Power Loss Notification signal support to protect data against forced shutdowns.
Sideband signal (PERST#, CLKREQ# and PLN#) support for both 1.8V and 3.3V.
Supports platform FW recovery feature.
Support for the NVMe technology 1.4c feature set and basic management command over System Management Bus (SMBus), enabling tighter thermal management.
Having the drive use a DRAM buffer in the host makes it cheaper to manufacture. The BG6’s 256GB and 512GB capacities are not available yet, as they are still under development. All in all, the BG6 looks a straightforward improvement on the BG5, thanks to the extra layers in the NAND and, no doubt, updated firmware in the controller.
Kioxia hasn’t announced pricing information yet but will have the BG6 series hardware on display at Dell Technologies World in Las Vegas this week. The series starts sampling in the second half of 2023, when OEM customers can give it the once-over.
Bootnote. This writer’s MacBook Air with its 256GB flash drive looks positively anaemic compared to a 2TB M.2 SSD. Having 2TB available would be great. And if Kioxia populated the other side of its M.2 2280 variant we could envisage a 4TB drive. That’s serious storage for a notebook.
Data protector Acronis announced general availability of Acronis Advanced Security + Endpoint Detection & Response (EDR) for Acronis Cyber Protect Cloud, with new capabilities such as AI-based attack analysis. EDR is designed for MSPs.
…
Open source data integration platform supplier Airbyte has launched a no-code connector builder that makes it possible to create new connectors for data integrations. The builder enables non-engineers such as data analysts to create an extract, load, transform (ELT) connector within five minutes – a process that traditionally could take more than a week.
…
Microsoft is publicly previewing Azure Container Storage. It provides a consistent experience across different types of storage offerings, including Managed options (backed by Azure Elastic SAN), Azure Disks, and ephemeral disk on container services. You can create and manage block storage volumes for production-scale stateful container applications and run them on Kubernetes. It’s optimized to enhance the performance of stateful workloads on Azure Kubernetes Service (AKS) clusters by accelerating the deployment of stateful containers with persistent volumes and improving quality with reduced pod failover time through fast attach/detach. Details in a blog.
…
CData announced that CData Sync is available on SAP Store and allows CData to deliver data integration to the data sources and SAP databases that organizations use.
Brantley Coile
…
Brantley Coile, the founder and CEO of Etherdrive SAN company Coraid, has shaved off his beard. He says he’s reinventing Coraid, now The Brantley Coile Company, as well.
…
External data integrator Crux has announced its Crux External Data Platform (“EDP”) SaaS offering to automate the onboarding of any external dataset directly from vendors into a customer’s store. The self-service cloud platform allows data teams to onboard and transform external data for analytics use up to 10 times faster than traditional manual methods. External data from governments, non-profits, and commercial data vendors is a critical business resource in many sectors such as finance, supply chain, retail, healthcare, and insurance. Crux has partnerships with over 265 leading data providers including MSCI, Moody’s, S&P, SIX, FactSet, and Morningstar.
…
File migrator and manager Datadobi has a blog about not forgetting stale data in WORM storage. It will need deleting eventually. Read the blog here.
…
data.world has announced the introduction of the data.world Data Catalog Platform with generative AI-powered capabilities for improving data discovery. data.world is the industry’s most popular data catalog with more than two million users, including enterprise customers with tens of thousands of active users.
…
Flipside has said its Flipside Shares offering is available on Snowflake Marketplace and provides joint customers with access to modeled and curated blockchain data sets, without the hassle of managing nodes, complex data pipelines, or costly data storage. Flipside provides access to the greatest number of blockchains and protocols in Web3, including Ethereum, Solana, Flow, Near, Axelar, and more than a dozen others.
…
IBM has acquired Israeli company Polar Security whose agentless product allows customers to discover, continuously monitor and secure cloud and software-as-a-service (SaaS) application data, and addresses a shadow data problem. Polar Security is a pioneer of data security posture management (DSPM) – an emerging cybersecurity segment that reveals where sensitive data is stored, who has access to it, how it’s used, and identifies vulnerabilities with the underlying security posture, including with policies, configurations, or data usage.
…
GPU-powered RAID card startup GRAID has signed up Trenton Systems as a partner. US-based Trenton Systems is a designer and manufacturer of ruggedized, cybersecure, made-in-USA computer systems for defense, aerospace, test and measurement, industrial automation, and other major industries.
…
Data manager and lifecycle organizer Komprise has new governance and self-service capabilities that simplify departmental use of its Deep Analytics – a query-based way to find and tag file and object data across hybrid cloud storage silos. It’s providing share-based access for groups, a new directory explorer, and exclusion query filters in file index search. Komprise says its latest release makes it dramatically easier for teams to find and manage their own data, while simplifying governance for IT.
…
Micron plans to install a 1-gamma (1γ) DRAM manufacturing line at its fab in Hiroshima, Japan, according to a Nikkei Asia report. This is part of an up to $3.6 billion (¥500 billion) investment program in Japan and will involve extreme ultraviolet (EUV) lithography, which will also be used in its Taiwan DRAM fab.
…
N-able CEO John Pagliuca has signed the CEO Action for Diversity & Inclusion Pledge, reinforcing company support and commitment towards its Diversity, Equality, and Belonging philosophy. CEO Action for Diversity & Inclusion was founded in 2017 and is the largest CEO-driven business commitment to advance diversity and inclusion in the workplace with more than 2,400 CEOs having pledged to create more inclusive cultures.
…
Pure Storage says Virgin Media O2, one of the UK’s largest entertainment and telecommunications operators, is a customer, using its portfolio – including FlashArray//X and Evergreen//Forever – to support its 47 million connections.
…
Security-focused data protector Rubrik has added user intelligence capabilities that utilize time series data recorded over consistent intervals in Rubrik Security Cloud to proactively mitigate cyber risks before they can be exploited. Customers will have visibility of the types of sensitive data they have, which users have access to the data, how that access has changed over time, and whether that access may pose any risk to their business.
…
Samsung has announced development of the industry’s first 128-gigabyte DRAM to support Compute Express Link (CXL) 2.0. It worked with Intel and its Xeon CPU to do so. The new CXL DRAM supports a PCIe 5.0 interface (x8 lanes) and provides bandwidth of up to 35GBps. CXL 2.0 supports memory pooling – a memory management technique that binds multiple CXL memory blocks on a server platform into a pool and enables hosts to dynamically allocate memory from the pool as needed. Samsung will mass produce the product later this year.
…
SingleStore has launched SingleStore Kai for MongoDB, an API that turbocharges (100-1,000x) real-time analytics on JSON and vector-based similarity searches for MongoDB-based AI applications, without the need for any query changes or data transformations. The new API is MongoDB wire protocol compatible, and enables developers to power interactive applications with analytics with SingleStoreDB using the same MongoDB commands. It is available at no extra cost and is open for public preview as part of the SingleStoreDB Cloud offering. SingleStore is also introducing replication (in private preview) that can replicate MongoDB collections into SingleStoreDB.
…
The Storage Management Initiative’s SNIA Swordfish v1.2.5 is out for public review. This new bundle provides a unified approach to manage storage and servers in hyperscale and cloud infrastructure environments, making it easier for IT administrators to integrate scalable solutions into their datacenters. This new version provides:
Expanded support for profile and mapping in the Swordfish NVMe Model Overview and Mapping Guide
New use cases and section to the Swordfish Scalable Storage Management API User’s Guide
Functionality enhancements supporting both traditional and NVMe/NVMe-oF storage
…
The Information reports that Snowflake wants to acquire search engine startup Neeva to help its customers search documents stored in Snowflake’s data warehouse. Neeva has added a large language model front end and has a vector database. Neeva seems to have shopped itself to Databricks as well.
…
StorMagic says German manufacturers Witholz GmbH and WST Präzisionstechnik have deployed StorMagic SvSAN to simplify their IT environments and reduce hardware requirements, resulting in maximum operational efficiency, high availability of data and 100 percent uptime at a lower cost.
…
Veritas has updated its Veritas Partner Force program for FY 2024, with improved rewards for cloud-based deals, a simplified transaction process, and new training and accreditation programs. The program will also support Veritas in delivering growth on the Veritas Alta secure cloud data management platform, continuing to modernize routes to market by improving the resources available to two-tier channel and managed service providers.
…
Veza says its Veza Authorization Platform is available on the Snowflake Data Cloud. Joint customers can manage access permissions and secure their sensitive data at scale. Veza’s Authorization Platform provides companies with visibility into access permissions across all enterprise systems, enabling customers to achieve least privilege for all identities, human and non-human, including service accounts.
…
Scale-out, parallel filesystem supplier WEKA has unveiled v4.2 of its Data Platform with advanced data reduction and a container storage interface (CSI) plug-in for stateful containerized workloads that can lower storage and operational costs. It also offers significant performance improvements in the cloud (6x over alternatives in Azure), providing the scale and application data protection needed to support thousands of containers for cloud-native artificial intelligence (AI) and machine learning (ML). The advanced block-variable differential compression combined with cluster-wide data deduplication delivers data reduction at scale for an estimated cost saving of up to 6x for AI/ML training models, 3-8x for exploratory data analysis, and up to 2x for bioinformatic or large-scale media and entertainment workloads like visual effects.
…
HPE’s Zerto business unit has announced a real-time encryption detection mechanism and air-gapped recovery vault features as part of Zerto 10, which includes monitoring for encryption-based anomalies. This capability monitors and reports on encryption as data streams in and can detect anomalous activity within minutes to alert users of suspicious activity. It can provide early warning of a potential ransomware attack – unlike backups, which can be up to a day old – and help pinpoint when an attack was initiated, so data can be recovered to a point seconds before it began. The Cyber Resilience Vault provides a final layer of protection, allowing clean copy recovery from an air-gapped setup if a replication target is also breached.
…
Zerto also announced the launch of Zerto 10 for Azure, delivering disaster recovery and mobility. It delivers a new replication architecture for scale-out efficiency and native protection of Azure Virtual Machines with support for multi-disk consistency for VMs in Azure. It’s available in the Azure Marketplace.
DDN has unveiled an upgraded AI400 X2 ExaScaler array for AI and machine learning storage workloads that uses QLC SSDs and adds a compression facility for higher capacity.
ExaScaler is DDN’s Lustre-based scale-out and parallel file system software. QLC SSDs use a 4bits/cell format, enabling a die to hold more data than a TLC (3bits/cell) arrangement at the cost of slower IO speed and shorter endurance. DDN says its new compression feature has been optimized for HPC and AI workloads.
Kurt Kuckein, DDN’s marketing VP, told us: “Over the last four or five years, we’ve seen this uptake in enterprise customer interest around DDN system, specifically driven by these AI algorithms. And that has really taken off this year with the broad interest in generative AI, ChatGPT and others have really driven interest in our AI solutions, especially in conjunction with the Nvidia SuperPOD systems.”
Senior Veep of Products James Coomer tells us that DDN has about 48 AI400X2 arrays supporting Nvidia’s largest SuperPODs: “All the other SuperPODs in the world, the vast majority of them, are running just multiples of the same unit.”
Adding QLC flash and compression to the AI400 X2 array “delivers both the best performance, as well as really good flash capacity for customers.” This system can provide 10x more data than competing systems and use a fraction of their electrical energy, he claimed. It uses 60TB QLC drives, enabling 1.45PB capacity in a 2RU x 24-slot chassis, doubling capacity per watt compared to the 30TB SSDs available from other suppliers.
The AI400X2 QLC uses a standard AI400X2 controller (storage compute node), in a 2RU chassis. It has 732TB of TLC SSD storage and a multi-core, real-time RAID engine and controller combo that can pump out 3.5 million IOPS and 95GBps. This can have two, four or five SP2420 QLC SSD expansion trays added to it, linked across NVMe/oF and Ethernet. Each tray holds up to 2.9PB of raw QLC capacity. That’s 2.3PB usable which, after compression, becomes 4.7PB effective.
The maximum effective QLC capacity is 11.7PB in a fully configured system. DDN claims the new array provides an up to 80 percent cost saving versus a comparable-capacity TLC array, and enables apps to run faster than they would on an NFS array. The added client-side compression costs CPU cycles but, because the resulting dataset is smaller, overall read and write performance is about the same.
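As a back-of-envelope check on the per-tray figures quoted above – 2.9PB raw, 2.3PB usable, 4.7PB effective – the implied ratios work out as follows. This is an illustrative sketch assuming the effective number reflects compression on the usable capacity, not vendor-supplied math:

```python
# Per-tray capacity figures quoted for the SP2420 QLC expansion tray.
RAW_PER_TRAY_PB = 2.9       # raw QLC flash
USABLE_PER_TRAY_PB = 2.3    # after RAID/formatting overhead
EFFECTIVE_PER_TRAY_PB = 4.7 # usable capacity after compression

# Implied compression ratio: roughly 2:1.
compression_ratio = EFFECTIVE_PER_TRAY_PB / USABLE_PER_TRAY_PB

# Implied raw-to-usable overhead: roughly 21 percent.
raw_to_usable_loss = 1 - USABLE_PER_TRAY_PB / RAW_PER_TRAY_PB

print(f"compression ≈ {compression_ratio:.2f}:1, overhead ≈ {raw_to_usable_loss:.0%}")
```

Actual compression ratios will of course vary with the dataset; DDN’s effective-capacity figure is an estimate, not a guarantee.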
DDN says its QLC version of the AI400X2 has a better price per flash TB than its existing TLC version which delivers better IOPS, up to 70 million, and throughput per rack. A hybrid TLC flash/disk system offers an even lower price per TB. It says it can meet datacenter AI storage budgets at three levels: either optimized for sheer performance, for price/performance, or for lower costs.
It is generally thought that the larger the model dataset used for training machine learning models, the better the result. That would encourage use of DDN’s AI400X2 QLC array. DDN also sees possibilities for it in other application areas, such as realistic 3D and immersive universes in gaming, protein and molecule creations for drug discovery, and autonomous driving.
DDN says its AI400X2 QLC system design does not need the internal switches and networks used by scale-out NAS systems, which are based around a controller chassis talking to flash JBOFs through a switched network. That helps lower its rackspace occupancy, cost and management complexity.
Coomer said: “Today’s QLC scale-out NAS systems offer low cost and high capacity, but they are extremely inefficient with IOPS, throughput and latencies, making them unusable for high-performance environments such as AI, machine learning, and real-time applications.”
Given that VAST Data has just announced SuperPOD certification for its scale-out NAS system, and says that parallel file systems are complex compared to NAS, the two will be competing for the same customers. Customers new to AI model training who currently use NAS rather than a parallel file system could go with VAST in preference to DDN, while existing parallel file system users may find that the AI400X2 QLC slides more easily into their workflows than a NAS-based system.
The 60TB drives make a huge capacity increase possible over 30TB drives and we know Solidigm has 60TB QLC SSDs coming. Kioxia and other NAND fabricators/SSD suppliers are bound to follow suit but maybe not Micron – it’s plugging away at higher capacity TLC drives built with 232-layer technology.
DDN will ship its AI400X2 QLC systems in the June-August period.
There is quite a lot of tech terminology that’s specific to the storage industry, and if you’re like us, you might even have jogged your memory with prior articles on the Blocks & Files site to discover what terms like B-Tree, LSM-Tree, or yobibyte mean when you come across them.
But that’s not the easiest way to get to the bottom of the problem in a hurry when all you want is a quick and concise description of what a tech term means. It’s not a new problem. Businesses like Gartner, HPE, IBM and Kioxia each have glossary mini-sites to fix the same problem – generally using an index page with links to individual entries which explain what a term means.
And who better to create a storage news glossary mini-site than Blocks & Files? We’re obsessed with the tech. So we’ve cooked up an index listing all the storage tech terms we could initially think of, just over 300 so far. Here’s a glimpse of what that looks like:
Each term listed is URL-linked to an explainer definition and, where we think it’s needed, a mini-article. There is a link to the glossary mini-site on the Blocks & Files homepage to provide a single point of entry:
We tried to balance the economy of a mere definition with a bit of an explanation where it seemed to be a good idea. The SR-IOV entry, for example, says SR-IOV stands for Single Root I/O Virtualization and then provides a short explanation of what this means as well.
Getting the balance right here is difficult. Our glossary is not an encyclopedia, nor does it try to emulate Wikipedia, nor be as comprehensive as HPE. It’s intended to be a reasonably quick mini-reference. Do contact us if you think something can be improved, is wrong or missing. There’s a contact webpage you can use or you can message me on Twitter at @Chris_Mellor or drop me a line on LinkedIn where I can be found as Chris Mellor.
So what is a yobibyte? It’s 1,024 zebibytes with a link to a Decimal and Binary Prefix entry to explain what that is. Exabytes, exbibytes and other binary and decimal prefixes are all covered there.
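The binary prefixes follow a simple rule – each step up is a factor of 2^10 – so the whole ladder from kibibyte to yobibyte fits in a few lines of Python. An illustrative sketch, with the prefix names per the IEC convention:

```python
# IEC binary prefixes: each step is 2**10 (1,024x) bigger than the last.
BINARY_PREFIXES = ["KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]

def binary_prefix_bytes(prefix):
    """Bytes represented by one unit of the given binary prefix."""
    exponent = 10 * (BINARY_PREFIXES.index(prefix) + 1)
    return 2 ** exponent

# One yobibyte is 1,024 zebibytes, as the glossary entry says.
assert binary_prefix_bytes("YiB") == 1024 * binary_prefix_bytes("ZiB")
```

The decimal (SI) prefixes – kilobyte through yottabyte – use powers of 10 instead, which is why a marketed 2TB drive holds slightly less than 2TiB.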
Open source Velero has recorded more than 100 million Docker pulls, making it among the most popular Kubernetes app backup tools, says CloudCasa, which supports it.
CloudCasa is the Kubernetes backup business of Catalogic that’s heading for a spin-out. Its cloud-native product integrates with Kubernetes engines on AWS, Azure, and GCP, and can see all the K8s clusters running through these engines. Velero provides snapshot-based backup for Kubernetes stateful containers and can run in a cloud provider or on-premises. CloudCasa supports Velero and provides a paid-for support package.
COO Sathya Sankaran told us the Velero stats imply “at least a million clusters are downloading.”
He says he had a conversation with a VMware product manager at KubeCon who told him that “they estimate about one-third of all Kubernetes clusters have been touched by Velero and at some point have had Velero installed and running … it’s a very substantial market presence.”
Sankaran added: “This is already a community ecosystem, driven very strongly by what the rest of the community thinks is good or bad.”
We have asked business rivals Veeam, Pure, and Trilio what they think.
Sankaran says CloudCasa for Velero is the only Kubernetes backup-as-a-service offering with integration across multiple public clouds and portability between them. It offers a swathe of extra features over the base Velero provision (think Red Hat and Linux).
Sankaran says: “Velero wants to become the Kubernetes backup standard. The commercial backup products are pre-packaged… Velero wants to be a plug-in engine, useable by new storage products as well as the historic incumbents.”
The exec’s hope is that CloudCasa can overtake rivals Kasten, Portworx, and Trilio by riding what he sees as a wave of Velero adoption, particularly in the enterprise, by offering them a multi-cluster and anti-lock-in offering. K8s app protection is different from traditional backup, says Sankaran, who claims layering it on to legacy backup is the wrong approach.
Whether it’s wrong will be decided by the market, by whether enterprises agree they need special (Velero-based) protection for their K8s apps or such protection provided by their incumbent data protection supplier.
VAST Data says its all-QLC flash file storage has been certified as an Nvidia SuperPOD data store.
Nvidia’s SuperPOD houses 20 to 140 DGX A100 AI-focused GPU servers and uses its InfiniBand HDR (200Gbps) interconnect. The DGX A100 features eight A100 Tensor Core GPUs, 640GB of GPU memory, and dual AMD Rome 7742 CPUs in a 6RU box. It also supports BlueField-2 DPUs to accelerate IO. The box provides up to 5 petaFLOPS of AI performance, meaning 100 petaFLOPS in a SuperPOD with 20 of them.
18-rack SuperPOD
VAST CEO and co-founder Renen Hallak said: “VAST’s alliance and growing momentum with Nvidia to help customers solve their greatest AI challenges takes another big step forward today … The VAST data platform brings to market a turnkey AI datacenter solution that is enabling the future of AI.”
The VAST pitch is that its Universal Storage system brings to market the first enterprise network attached storage (NAS) system approved to support the Nvidia DGX SuperPOD.
VAST Data co-founder and CMO Jeff Denworth told us: “For years customers have not had an enterprise option for these large systems, since the AI system vendors need to adhere to a very limited set of offerings. Many were burned by other NFS platforms in the past.”
A VAST statement said: “AI and HPC workloads are no longer just for academia and research, but these are permeating every industry and the enterprise players that own and manage their own proprietary AI technologies are going to be differentiated going forward. Historically, customers building out their supercomputing infrastructure have had to make a choice around performance, capabilities, scale and simplicity.”
The company reckons its storage system provides all four attributes, and says: “We have already sold multiple SuperPODs with more in the pipeline so the market is validating/recognizing this as well.”
The Nvidia-VAST relationship dates back to 2016, VAST says, with original development of its disaggregated, shared-everything (DASE) architecture. VAST supports Nvidia’s GPUDirect storage access protocol and also its BlueField DPUs. VAST’s Ceres data enclosure includes four BlueField DPUs.
DDN has a combined SuperPOD and Lustre-based A3I storage system. Previously, NetApp has certified its E-Series hardware running ThinkParQ’s BeeGFS parallel file system with Nvidia’s SuperPOD. Neither of these are enterprise NAS systems.
Back in 2020, NetApp provided a reference architecture twinning ONTAP AI with Nvidia’s DGX A100 systems for AI and machine learning workloads. ONTAP is an enterprise NAS operating system as well as a block and object access system. Surely it must be possible to get an all-flash ONTAP system certified as a SuperPOD data store – unless ONTAP’s scalability limit of 24 clustered NAS nodes (12 HA pairs), which means a 702.7PB maximum effective capacity with the high-end A900, proves to be a blocking restriction.
Dell is partnering with startup NeuroBlade to put SQL query-accelerating PCIe cards in select PowerEdge servers to speed up high throughput data analytics.
Update. Note added about NeuroBlade branding and CMO Priya Doty’s exit. 20 Sep 2023.
NeuroBlade has won Dell as a go-to-market partner for its hardware acceleration units that speed SQL queries. These G200 SPUs – SQL Processing Units – contain two NeuroBlade processors and work with query engines such as Presto, Trino, Spark and Dremio. No changes are needed to a host system’s data, queries or code as the SPU operates transparently as far as the application code is concerned.
Elad Sity, CEO and co-founder of NeuroBlade, said: “The work we have done enables organizations to keep up with their exponential data growth, while taking their analytics performance to new levels, and creating a priceless competitive advantage for them. This success couldn’t have been achieved without our engineering team, who have been collaborating with companies like Dell Technologies to unlock this new standard for data analytics.”
A couple of years ago NeuroBlade had developed its Xiphos rack enclosure, a Hardware Enhanced Query System (HEQS), and a compute-in-storage appliance containing four so-called Intense Memory Processing Units (IMPUs). These are formed from a multi-1,000-core XRAM processor, DRAM and an x86 controller. There are up to 32 x NVMe SSDs in the chassis and these can be Kioxia SCM-class FL6 drives for the fastest response.
NeuroBlade SPU card
The IMPU has been developed into what’s now called the SPU and NeuroBlade says it delivers consistently high throughput regardless of query complexity for applications involving business intelligence, data warehouses, data lakes, ETL, and others. As a dedicated SQL query processor, it can replace several servers previously doing the job and cut compute, software and power costs by a claimed 3 to 5x.
HEQS chassis
The HEQS image above shows eight SPUs fitted inside the chassis, which can hold up to 100TB of data on its NVMe SSDs; six chassis can be clustered to provide a 600TB resource. The Xiphos brand has lapsed, with HEQS effectively replacing it.
Customers can buy SPU cards to put into servers, such as Dell’s PowerEdge, or complete HEQS chassis.
Although founded in Tel-Aviv, NeuroBlade has a US HQ in Palo Alto. It will be present at Dell Technologies World in Las Vegas, May 22-25, booth 1222, with displayed products and staff to talk about them.
NeuroBlade branding
NeuroBlade tells us: “The SQL Processing Unit of NeuroBlade is its own product, NeuroBlade SPU. The company has migrated away from the Xiphos and XRAM brand names. The SPU can be integrated into NeuroBlade’s HEQS (hardware enhanced query system), which is a rack-mounted server, or directly into the customer’s own data center.”
NeuroBlade CMO
We saw that CMO Priya Doty left NeuroBlade about four months after Mordi Blaunstein became VP Tech Business Development & Marketing. We asked NeuroBlade why she left. The reply was: “NeuroBlade has focused its marketing strategy towards a more technical approach, emphasizing the product and its benefits. This, in turn, led to a restructuring of the marketing team. The company is now forming a new team in California led by Mordi Blaunstein, who has extensive experience in startups and enterprise organizations.”
Scality has released second generation ARTESCA object storage software with a 250 percent capacity increase, hardened security and Veeam v12 integration.
ARTESCA is Scality’s cloud-native S3-compatible object storage software, developed to complement its enterprise-class RING object and file storage product as a lightweight alternative to commercially supported MinIO, which has a $4,000/year entry point. Deployment options include VMware OVA (Open Virtual Appliance) format or a complete software appliance with a bundled and hardened Linux OS. It’s Scality’s fastest-growing product line.
Scality CMO Paul Speciale said: “ARTESCA makes data storage simple and secure for CISOs and their teams. It’s both affordable and easy to deploy in any environment, no strings attached. … ARTESCA 2.0 delivers the full package that today’s organizations are looking for — enterprise-grade security, simplicity and maximum performance at a price that won’t give CFOs heartburn.”
While RING software scales out to 100PB or more, ARTESCA is aimed at the terabytes-to-5PB range, where simple deployment and operations are needed. Like RING, it can serve as both a Veeam performance and capacity object storage tier in a core datacenter and as an edge backup target for Veeam in smaller datacenters. Scality suggested RING is also suited to an archive tier role.
ARTESCA 2 upgrades include security hardening for better malware protection. A new hardened Linux option precludes OS access, reduces exposure to critical vulnerabilities, and limits a wide range of potential malicious attacks, we’re told. There is multi-factor authentication, lockdown of unused network ports, S3 object lock, auto-configuration of firewall rules, and asynchronous replication for virtual air-gapped offsite storage.
Access from Veeam is controlled by identity and access management policies. Specific Veeam v12 support includes Direct to Object and Smart Object Storage API, enabling added ransomware protection, data immutability and operational efficiencies. There is a simplified installer, shorter backup windows and restore times.
ARTESCA 2.0 will be available early June 2023. For new customers ARTESCA is free for 90 days starting early in the third quarter, with unlimited capacity. Software subscriptions start at less than $4,000/year for 5TB usable capacity with 24 x 7 support.
Background
Object storage software supplier Scality is 14 years old, has raised $172 million in funding, and reckons it will be profitable inside 12 months. It has 2EB of capacity under management for customers, encompassing some 5 trillion objects spread across 182,669 disk drives.
A conversation with CEO and co-founder Jerome Lecat provided an insight into its thinking about the object storage market and where it is going, such as towards an all-flash future.
Lecat said Scality is fully funded and should be profitable in a year.
Scality supports all-flash hardware configurations but sees little demand for them. Lecat said he doesn’t foresee the object storage market going all-flash on energy-saving grounds: “I disagree that disk drives necessarily use more electricity than SSDs.”
This was a response to Pure’s Shawn Rosemarin predicting that HDD sales will stop after 2028 through the combination of lower NAND prices, high electricity costs, and limited electricity availability. The overall effect, he argues, is that the TCO of flash arrays will be so much lower than disk as to prompt a mass migration from spinning rust to electrically charged flash cells.
Lecat said disk drives lower their spin speed these days if they detect inactivity, and so save power.
He doesn’t see customer demand for all-flash object systems and, all in all: “I don’t think the all-flash object market is there.”
Kioxia has announced that its CD7 datacenter SSD series supports HPE servers and arrays, using 3D NAND two generations behind its latest technology but with the latest PCIe gen 5 interconnect.
PCIe 5 operates at 32Gbps lane bandwidth, four times faster than PCIe 3’s 8Gbps and double PCIe 4’s 16Gbps.
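As a quick sanity check on those per-lane figures, the generational doubling can be sketched as below. This is raw signaling rate only; real-world throughput is lower once encoding and protocol overhead are accounted for.

```python
# Raw per-lane bandwidth by PCIe generation, in Gbps.
# The lane rate doubles each generation: PCIe 3 -> 4 -> 5 is 8 -> 16 -> 32 Gbps.
LANE_GBPS = {3: 8, 4: 16, 5: 32}

def link_bandwidth_gbps(gen: int, lanes: int) -> int:
    """Raw link bandwidth for a PCIe link of the given generation and width."""
    return LANE_GBPS[gen] * lanes

# A typical enterprise SSD uses a x4 link:
print(link_bandwidth_gbps(5, 4))  # 128 Gbps raw, i.e. ~16 GB/s before overhead

# PCIe 5 is four times PCIe 3 per lane:
print(link_bandwidth_gbps(5, 1) / link_bandwidth_gbps(3, 1))  # 4.0
```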
The CD7 was originally unveiled in November 2021 as a PCIe 5 NVMe SSD using 96-layer BiCS4 technology in TLC (3 bits/cell) format and the E3.S drive standard. The drive was then sample shipping to potential OEMs. Kioxia is now transitioning NAND production in its joint venture fabs with Western Digital to BiCS6 162-layer chips.
Neville Ichhaporia, Kioxia America’s SVP and GM of its SSD business unit, said: “EDSFF and PCIe 5.0 technologies are transforming the way storage is deployed, and our CD7 Series SSDs are the first to deliver these technologies on HPE’s next-generation systems.”
The CD7 series drive is listed as a read-intensive drive with random read/write IOPS of up to 1,050,000/180,000 and sequential read and write bandwidth of 6.45GBps and 5.6GBps respectively. It has a five-year warranty, a 2.5 million hours MTBF rating and can sustain 1 drive write per day.
Other PCIe 5 SSDs vary in speed and storage. For example, Samsung’s PM1743 is available in E3.S format with up to 15.36TB capacity from its 128-layer NAND. It puts out 2,500,000/250,000 random read/write IOPS, 13GBps sequential read bandwidth and 6.6GBps sequential write bandwidth.
Kioxia says its CD7 SSDs support ProLiant Gen11 servers, Alletra 4000 storage servers (rebranded Apollo servers), and Synergy 480 Gen11 Compute Modules, which all have PCIe gen 5 capability and E3.S storage bays.
E3.S enables denser, more efficient deployments in the same rack unit compared to 2.5-inch drives, with better cooling and thermal characteristics, we’re told. The format can, Kioxia says, raise capacities by 1.5-2x, although the CD7 only supports 1.92TB, 3.84TB, and 7.68TB.
The CD7 is said to be suited for customers and applications such as hyperscalers, IoT and big data analytics, OLTP, transactional and relational databases, streaming media and content delivery networks as well as virtualized environments. A higher 3D NAND layer count version is surely on Kioxia’s roadmap.
Merger talks
Reuters reports that merger talks between Kioxia and Western Digital have sped up, with a deal structure being developed. It cites unnamed sources. WD is under pressure from activist investor Elliott Management to split its disk drive and SSD businesses into separate companies, and then merge the SSD unit with Kioxia.
According to the newswire’s sources, under the fresh deal Kioxia would be the largest shareholder in the combined Kioxia-WD business with a 43 percent stake, WD would hold 37 percent, and the rest would be owned by existing shareholders of the two companies.
Kioxia was bought out of Toshiba by a Bain Capital-led consortium in 2017. A WD-Kioxia merger could provide a financial exit for that consortium. Toshiba owns 40.6 percent of Kioxia and Elliott Management has a Toshiba investment plus a board position.
Solidigm is touting a PCIe gen 4 QLC flash SSD offering TLC-class read performance and has appointed a pair of co-CEOs.
QLC or 4bits/cell NAND provides less expensive SSD capacity than TLC (3 bits/cell) NAND but has generally lower performance and a shorter working life. Solidigm is making a big deal about its new optimized QLC drive, which it says can cost-effectively replace both a TLC flash and a hybrid disk/SSD setup in a 7PB object storage array.
Greg Matson, VP of Solidigm’s datacenter group, played the sustainability card: “Datacenters need to store and analyze massive amounts of data with cost-effective and sustainable solutions. Solidigm’s D5-P5430 drives are ideal for this purpose, delivering high density, reduced TCO, and ‘just right’ performance for mainstream and read-intensive workloads.”
Solidigm says the D5-P5430 is a drop-in replacement for TLC NAND-based PCIe gen 4 SSDs. It is claimed to reduce TCO by up to 27 percent for a typical object storage solution, with a 1.5x increase in storage density and 18 percent lower energy cost. And it can deliver up to 14 percent higher lifetime writes than competing TLC SSDs.
From left: D5-P5430 in U.2 (15mm), E1.S (9.5mm) and E3.S (7.5mm) formats
The P5430 uses 192-layer 3D NAND with QLC cells and comes in three formats: U.2, E1.S, and E3.S. Capacities are 3.84TB, 7.68TB, 15.36TB, and 30.72TB, with the physically smaller E1.S model limited to a 15.36TB maximum. The drive delivers up to 971,000/120,000 random read/write IOPS, with sequential read and write bandwidth of up to 7GBps and 3GBps respectively.
Solidigm says the new SSD has read performance optimized for both mainstream workloads, such as email/unified communications, decision support systems, fast object storage, and read-intensive workloads like content delivery networks, data lakes/pipelines, and video-on-demand. These have an 80 percent or higher read IO component.
How does the performance stack up versus competing drives? Solidigm has a table showing this:
The comparison ratings are normalized to Micron’s 7450 Pro and they do look good. The P5430’s endurance is limited, though, with Solidigm providing two values dependent upon workload type – random to 0.58 DWPD, and sequential up to 1.83 DWPD. It is a read-optimized drive after all.
Solidigm wants us to know that it has up to 90 percent IOPS consistency and ~6 percent variability over the drive’s life, and it supports massive petabytes written (PBW) totals of up to 32PB. Kioxia’s CD6-R goes up to 28PBW.
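The DWPD and PBW ratings are two views of the same endurance budget: lifetime writes are roughly capacity × drive-writes-per-day × warranty days. A back-of-envelope check, assuming the 30.72TB model and the five-year warranty typical of this drive class:

```python
def lifetime_writes_pb(capacity_tb: float, dwpd: float, warranty_years: float = 5) -> float:
    """Approximate total petabytes written: capacity x drive writes per day x days."""
    return capacity_tb * dwpd * warranty_years * 365 / 1000  # TB -> PB

# 30.72TB drive at the random-workload rating of 0.58 DWPD over five years:
print(round(lifetime_writes_pb(30.72, 0.58), 1))  # ~32.5 PB, close to the quoted 32PBW
```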
It says a 7PB object storage system using 1,667 x 18TB disk drives with 152 TLC NAND cache drives will cost $395,944/year to run. A 7PB alternative using 480 x 30.72TB P5430s will cost $242,863/year – 39 percent less.
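The 39 percent figure follows directly from the two annual cost numbers. A simple check (our arithmetic, not Solidigm’s full TCO model):

```python
def saving_pct(baseline: float, alternative: float) -> float:
    """Percentage saving of the alternative versus the baseline annual cost."""
    return (1 - alternative / baseline) * 100

# Hybrid HDD/TLC-cache system vs all-P5430 system, annual running cost:
print(round(saving_pct(395_944, 242_863)))  # ~39 percent
```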
Solidigm ran the same comparison against an all-TLC SSD 7PB object storage array and says its kit costs $257,791/year, 27 percent less than the TLC system’s $334,593/year. The TLC NAND system uses 15.36TB drives while Solidigm’s P5430-based box uses its 30.72TB drives, giving it a smaller rack footprint.
The D5-P5430 SSDs are available now, but the maximum capacity 30.72TB versions will not arrive until later this year.
The CEOs
Solidigm’s board has appointed two co-CEOs. The original CEO, Intel veteran Rob Crooke, left abruptly in November last year, and SK hynix co-CEO Noh-Jung Kwak was put in place as interim CEO. Now two execs, SK hynix president Kevin Noh and David Dixon, ex-VP and GM for Data Center at Solidigm, are sharing the role.
David Dixon and Kevin Noh
Noh was previously chief business officer for Solidigm, joining in January this year. He has a 20-year history as an SK Telecom and SK hynix exec. Dixon was a near-28-year Intel vet before moving to Solidigm when that rebranded Intel NAND business was sold to SK hynix.
Bootnote
Here are the calculations Solidigm supplied for its comparison between a hybrid HDD/SSD and all-P5430 7PB object storage array and all-TLC array:
Micron has built a TLC SSD with 232-layer tech that’s faster and more efficient than Solidigm’s lower cost QLC drives. It’s also launched a fast SLC SSD for caching.
SLC (1 bit/cell) NAND is the fastest flash with the longest endurance. TLC (3bits/cell) makes for higher-capacity drives using lower cost NAND but with slower speed and shorter working life. QLC (4 bits/cell) is lower cost NAND again but natively has slower speeds and less endurance. 3D NAND has layers of cells stacked in a die – the more layers the more capacity in the die and, generally, the lower the manufacturing cost. Solidigm and other NAND suppliers are shipping flash with less than 200 layers while Micron has jumped to 232 layers.
Alvaro Toledo
Alvaro Toledo, Micron VP and GM for datacenter storage, told us: “Very clearly, we’re going after QLC drives in the market like the [Solidigm] P5316. And what we can say is this drive will match that on price, but beats it on value by a mile and a half. We have 56 percent better power efficiency, at the same time giving you 62 percent more random reads.”
The power efficiency claim is based on the P5316 providing 32,000 IOPS/watt versus Micron’s 6500 ION delivering 50,000 IOPS/watt.
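Those two IOPS/watt numbers do yield the quoted efficiency gain:

```python
def efficiency_gain_pct(new_iops_per_watt: float, old_iops_per_watt: float) -> float:
    """Relative power-efficiency improvement of one drive over another, as a percentage."""
    return (new_iops_per_watt / old_iops_per_watt - 1) * 100

# 6500 ION (50,000 IOPS/watt) vs P5316 (32,000 IOPS/watt):
print(round(efficiency_gain_pct(50_000, 32_000)))  # 56 percent better
```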
Micron provides a set of performance comparison charts versus the P5316:
Solidigm coincidentally launched an updated QLC SSD, the P5430, today. Micron will have to rerun its tests and redraw its charts.
We have crafted a table showing the main speeds and feeds of the two Solidigm drives and the 6500 ION – all PCIe 4 drives – for a quick comparison:
The 6500 ION has a single 30.72TB capacity point and beats both Solidigm QLC drives with its up to 1 million/200K random read/write IOPS performance, loses out on sequential read bandwidth of 6.8GB/sec vs Solidigm’s 7GB/sec, and regains top spot with a 5GB/sec sequential write speed, soundly beating Solidigm.
Toledo points out that the 6500 ION supports 4K writes with no indirection unit, whereas Solidigm’s P5316 “drive writes in 64K chunks.” This, Toledo claims, incurs more read-modify-write cycles as the 64K is mapped to the drive’s 4K pages.
Micron 6500 ION
Its capacity and cost mean, Toledo said, that “by lowering the barriers of entry, we see that the consolidation play will make a lot more sense right now. You can store one petabyte per rack unit, which gives you up to 35 petabytes per rack.”
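The rack-density claim is straightforward arithmetic if we assume roughly 32 x 30.72TB E1.S drives per rack unit and about 35 storage rack units per rack. Both figures are our assumptions for illustration, not Micron’s stated configuration:

```python
DRIVE_TB = 30.72          # 6500 ION capacity point
DRIVES_PER_RU = 32        # assumed E1.S slots in a 1U server (hypothetical)
STORAGE_RU_PER_RACK = 35  # assumed usable storage rack units (hypothetical)

pb_per_ru = DRIVE_TB * DRIVES_PER_RU / 1000   # TB -> PB
pb_per_rack = pb_per_ru * STORAGE_RU_PER_RACK

print(round(pb_per_ru, 2), round(pb_per_rack, 1))  # ~0.98 PB/RU, ~34.4 PB/rack
```

Under these assumptions the result lands close to the quoted one petabyte per rack unit and 35PB per rack.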
Toledo says the 6500 ION is much better in terms of price/performance than QLC drives: “Just the amount of value that we’re creating here is gigantic.” You can use it to feed AI processors operating on data lakes with 9x better power efficiency than disk drives in his view. And it’s better than QLC drives too: “This ION drive is just absolutely the sweet spot, the Goldilocks spot in the middle, where for about 1.2, 1.3 watts per terabyte, you can get all the high capacity that you need fast enough to feed that [AI] beast with a very low power utilization.”
The XTR is positioned as an affordable single-port caching SSD. Micron compares it to Intel’s discontinued Optane P5800X series storage-class memory (SCM) drive, saying it has up to 44 percent lower power consumption, 20 percent more usable capacity, and up to 35 percent of the P5800X’s endurance at 20 percent of the cost.
Micron XTR.
It suggests using the XTR as a caching drive paired with its 6500 ION, claiming this provides query performance identical to an Optane SSD cache. Toledo said: “We are addressing the storage-class memory workload that requires high endurance; this is not a low latency drive.”
Kioxia also has a drive it positions as an Optane-type device, the FL6, and Micron’s XTR doesn’t fare that well against it in random IO but does better in sequential reads, as a table shows:
Toledo says the FL6 is going after lower latency workloads than the XTR, but: “If you need to strive for endurance, the XTR can go toe to toe with a storage-class memory solution.”
Micron says the XTR has good security ratings – better than the Optane P5800X products, such as FIPS 140-3 L2 certification at the ASIC level – and provides up to 35 random DWPD endurance (60 for sequential workloads).