Decentralized storage network Storj says it has “made access to AI compute a bit easier” by partnering with CUDOS.
CUDOS is a DePIN (decentralized physical infrastructure network) providing compute for AI and Web3 applications and workloads. It has access to hard-to-get Nvidia H100 GPUs and the latest liquid-cooled H200 GPUs, both built for AI workloads.
The alliance brings multi-petabyte capacity to joint customers, using Storj’s S3-compatible storage solutions within the CUDOS network, which combines cloud and blockchain technology.
Cudo, the parent company of CUDOS, is a cloud partner of Nvidia and is now working with Valdi, a recently acquired division of Storj, to bring high-end compute solutions to market.
Storj, with its decentralized storage, lets customers use underutilized storage capacity in datacenters across more than 100 countries, while Valdi allows customers to use available GPU compute cycles in datacenters worldwide.
Storj’s distributed platform promises enhanced security through client-side encryption and data sharding, we’re told. Files are encrypted, split into fragments, and distributed globally across tens of thousands of storage nodes in over 100 countries. This “delivers privacy, durability, availability, security, and superior edge performance,” Storj maintains, promising to keep data safe from unauthorized access and data loss.
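Because Storj’s storage is S3-compatible, existing S3 tooling can simply be pointed at a Storj gateway rather than AWS. A minimal sketch in Python using boto3; the gateway endpoint and credentials below are placeholders for illustration, not values from the announcement:

```python
import boto3

# Point a standard S3 client at an S3-compatible endpoint. The gateway URL
# and credentials are placeholders; Storj issues S3-style access keys from
# its console.
s3 = boto3.client(
    "s3",
    endpoint_url="https://gateway.storjshare.io",  # assumed hosted gateway endpoint
    aws_access_key_id="ACCESS_KEY_PLACEHOLDER",
    aws_secret_access_key="SECRET_KEY_PLACEHOLDER",
)

# Ordinary S3 calls work unchanged against the S3-compatible backend.
s3.create_bucket(Bucket="ai-training-data")
s3.put_object(Bucket="ai-training-data", Key="batches/0001.json", Body=b"{}")
print(s3.list_objects_v2(Bucket="ai-training-data", Prefix="batches/")["KeyCount"])
```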
Matt Hawkins
CUDOS provides computational power to enterprises to support large language models (LLMs), image and video recognition, and speech synthesis. By integrating Storj’s storage and Valdi’s compute capabilities, CUDOS says it is “enhancing” its offering for businesses and developers needing “scalable, secure, and cost-effective” compute and storage services.
“Partnering with Storj allows us to offer an unparalleled blend of compute and storage capabilities, enhancing our ability to deliver high-performance AI applications to the most demanding enterprise organizations,” said Matt Hawkins, CEO of CUDOS.
Ben Golub
Ben Golub, CEO at Storj, added: “Our distributed storage solutions ideally complement CUDOS’s compute infrastructure. Together, we’re creating a powerful platform that sets a new standard for scalable services that are a cost-effective and high-performance alternative to hyperscalers like Amazon.”
Bootnote
Decentralized storage provider Cubbit aims to add decentralized compute as well.
Open source ETL provider Airbyte says its PyAirbyte Python library, introduced in late February, has helped more than 10,000 AI and data engineers sync over 6 billion records of data. Users have completed more than 221,000 PyAirbyte sync jobs, or over 10,000 syncs per week. PyAirbyte now boasts an average of 25,000 monthly downloads, according to metrics from the popular PyPI Python package repository.
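For context, a PyAirbyte sync job is a few lines of Python. The sketch below follows the library’s published quickstart pattern; the source-faker connector and its config are illustrative stand-ins for a real database or SaaS source:

```python
# Minimal PyAirbyte sketch (pip install airbyte); connector and config are illustrative.
import airbyte as ab

source = ab.get_source(
    "source-faker",
    config={"count": 1_000},   # connector-specific settings
    install_if_missing=True,   # fetch the connector on first use
)
source.check()                 # validate connectivity and config
source.select_all_streams()    # sync every stream the connector offers

result = source.read()         # records land in the default local cache
for name, records in result.streams.items():
    print(f"Stream {name}: {len(list(records))} records")
```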
…
Data migrator Cirrus has a MigrateOps feature in its Cirrus Data Cloud. This simplifies data mobility by automating the migration process. It says: “With storage integrations available for all leading storage arrays, hypervisor environments, and public clouds, it automates hundreds of manual steps in target storage preparation. This allows you to accelerate your migration and eliminate the risk of human errors, creating a fast, consistent, and secure process. … a one-click migration is at your fingertips. Whether you’re migrating to the cloud, moving onto a new hypervisor, or adopting a new hybrid cloud architecture, you can migrate, transform, and manage your block storage in a safe, consistent, trackable, scalable, and repeatable process with just a click.” Read a blog about MigrateOps here.
…
Data protector Commvault announced SHIFT 2024. It said this annual in-person and virtual event, combined with a 40-city roadshow, is the industry’s must-attend forum for CISOs, CIOs, data security, cloud, and data protection enthusiasts who want to learn about the latest developments, innovations, and partnerships in cyber resilience. SHIFT 2024 kicks off with an in-person event in London on October 8. On October 9-10, people from around the world can attend “exciting virtual simulcasts”, tailored to local time zones. To register for one of these simulcasts, click here. Following this initial launch, Commvault will embark on a global roadshow, with localized events in Australia, Canada, India, Italy, Spain, the United States, and beyond. To check out the roadshow schedule, click here.
…
MRAM developer Everspin Technologies announced a strategic award of $14.55 million over 2.5 years by the US government to develop a long-term plan to provide stable and continuous manufacturing services for aerospace and defense segments. Under the award, Everspin will provide a plan to mitigate risks to its MRAM manufacturing supply chain. Everspin’s MRAM manufacturing line in Chandler, Arizona, will continue to support both current and future Department of Defense strategic and commercial space system requirements.
…
Exascend has announced a 15.36 TB SATA interface SSD family in 2.5-inch format. The SA4 automotive SSD, S14 and SV4 industrial drives, and SE4 enterprise product all have a 240 GB to 15.36 TB capacity range and a 6 Gbps SATA interface, and use 176-layer TLC NAND, except for the SV4, which relies on 112-layer TLC and has a restricted 480 GB to 3.84 TB capacity range.
…
Data orchestrator Hammerspace has hired Mike Riley as its Field CTO. He comes from a 6.5-year stint at Cohesity, where he was a senior principal engineer and Field CTO. Before that he spent a total of 18.5 years at NetApp, culminating in a director of strategy and technology role for worldwide sales. He writes on LinkedIn: “At Hammerspace my role as Field CTO centers on driving awareness in the value of a true global data platform. The Holy Grail for business: provide true unified access to all of their unstructured data assets regardless of where they’re stored. Unified access across multi-vendor & cloud environments to feed the growing need for AI, Data Analytics and Security. I don’t care where my data is stored but, when I want it, I want it as fast as humanly possible. When I’m not using it, I want it stored as economically as possible.”
…
Kingston announced the DC2000B, a PCIe 4.0 NVMe M.2 SSD using 112-layer 3D TLC NAND and optimized for use as an internal boot drive in datacenter servers. It includes onboard hardware-based power loss protection (PLP) and an integrated aluminum heatsink. The DC2000B is available in 240, 480, and 960 GB capacities, has 0.4 drive writes per day endurance, and is backed by a limited five-year warranty and free technical support.
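As a rough guide to what 0.4 DWPD implies in total writes, assuming (as is common) that the rating applies over the five-year warranty term; these are derived figures, not Kingston’s official TBW specifications:

```python
# Back-of-envelope conversion of a DWPD endurance rating to total terabytes
# written, assuming the rating applies over the five-year warranty period.
def dwpd_to_tbw(capacity_gb: float, dwpd: float, years: float = 5.0) -> float:
    return capacity_gb * dwpd * 365 * years / 1000  # TB written

for cap in (240, 480, 960):
    print(f"{cap} GB at 0.4 DWPD over 5 years ≈ {dwpd_to_tbw(cap, 0.4):.0f} TBW")
# 240 GB ≈ 175 TBW, 480 GB ≈ 350 TBW, 960 GB ≈ 701 TBW
```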
…
Reltio announced Reltio Integration for Collibra, a prebuilt integration with Collibra’s Data Intelligence Platform. It says that due to fragmented data sources and numerous transformations, data teams and consumers have long struggled to find, understand, and trust data assets. Collibra claims it has addressed these challenges, but without a native integration, Reltio customers often resort to costly, time-consuming custom integration projects that require ongoing maintenance. With the integration, customers can now find and consume data, we’re told by the company. Users also gain greater visibility of their data assets thanks to Collibra’s data lineage capabilities. The Reltio Integration for Collibra is designed for customers using Reltio Customer 360 Data Product and/or Reltio Multidomain MDM. It’s available through Reltio and is listed on the Collibra Marketplace, where Reltio is a Collibra Silver Technology Partner.
…
In other Reltio news, the company has announced James Redfern, former CFO at Payscale, as its chief financial officer, responsible for overseeing global accounting, finance, tax, and IT. He succeeds Gordon Brooks, who is retiring.
…
Wedbush analyst Matt Bryson, citing TheElec, says: “Samsung will begin taping out HBM4 later this year, with products expected in early 2025. The new parts will be based on the 1c process. While Samsung appears to be targeting the Blackwell Ultra refresh as a point at which it could potentially gain meaningful traction at NVDA with its HBM products, we see this tapeout as particularly significant for Samsung as we believe the 1c process node represents an important transition given Samsung’s broader recent struggles with DRAM process and production (which we believe in part have impacted the company’s HBM efforts).”
…
Data warehouser Snowflake reported Q2 2024 revenues of $868.8 million, up 29 percent year-over-year and beating expectations, with a net loss of $316.8 million, worse than the year-ago net loss of $26.9 million. Sridhar Ramaswamy, CEO of Snowflake, said: “Snowflake delivered another strong quarter, surpassing the high end of our Q2 product revenue guidance and, as a result, we’re raising our product revenue guidance for the year. Product revenue was up 30 percent year-over-year at $829 million, while remaining performance obligations were $5.2 billion, up 48 percent year-over-year.” The company has lifted its full-year product revenue forecast, thinking more customers will use its services as they adopt GenAI. William Blair analyst Jason Ader says Snowflake’s guidance suggests solid consumption for the rest of the year, somewhat offset by expected headwinds in the second half for Snowflake’s storage business (11 percent of revenue) from customer adoption of external Iceberg tables and lower cost storage options within Snowflake (i.e. tiered storage). Ramaswamy emphasized that new product delivery continues to be one of the highest priorities for Snowflake, with the goal of expanding the purview, relevance, and audience of the Snowflake platform beyond core data warehousing and business analysts.
…
Structured data lifecycle manager Syniti announced the promotion of Alyssa Sliney to SVP of EMEA Delivery. In her new role, Alyssa will be responsible for delivery quality, year-over-year revenue growth, employee engagement, and meeting and exceeding customer expectations. She will also continue to lead Syniti’s overall governance practice.
…
TrendForce, courtesy of Storage Newsletter, published calendar Q2 2024 SSD supplier market shares by units and by exabytes:
We made a combined supplier units and exabytes chart to illustrate how Solidigm and Samsung sell a lot of capacity relative to their unit counts:
If we list supplier unit shares by supplier combinations then Western Digital and Kioxia, linked by their joint NAND manufacturing venture, lead:
WDC/Kioxia: 28 percent
Samsung: 26.4 percent
SK hynix/Solidigm: 17.6 percent
Micron: 12.7 percent
Kingston: 7.7 percent
SSSTC: 1.5 percent
Others: 7.2 percent
Charting supplier exabyte shares by supplier combinations shows SK hynix and subsidiary Solidigm in second place behind Samsung and ahead of WD/Kioxia by quite a margin:
Samsung: 31.1 percent
SK hynix/Solidigm: 26.2 percent
WDC/Kioxia: 20.7 percent
Micron: 13.0 percent
Kingston: 3.7 percent
SSSTC: 0.6 percent
Others: 4.6 percent
Solidigm’s early jump into QLC has really paid off in exabytes shipped.
…
Tom’s Hardware reports Western Digital has released an 8 TB version of its Black SN850P NVMe SSD licensed for the PlayStation 5 console, doubling the drive’s previous 4 TB maximum capacity. The 8 TB product’s $999.99 price is almost 3x the $349.99 price of the 4 TB version; the extra 4 TB costs a heck of a lot more, at $650, than the $349.99 charged for the first 4 TB.
VergeIO has launched an open source infrastructure-as-code (IaC) software tool, Terraform Provider, allowing customers to set up its virtual datacenter infrastructure without getting deep into the weeds of code writing.
Terraform is a HashiCorp IaC tool that codifies cloud APIs into declarative configuration files. It can be used to automate IT infrastructure provisioning of servers, databases, firewall policies, and other IT resources. VergeIO provides virtualized datacenters through VergeOS, a single hyperconverged software platform that integrates hypervisor, storage, and networking components, replaces the traditional IT stack, and is designed to run on commodity hardware.
Jason Yaeger, VergeIO SVP of Engineering, said in a statement: “Our Terraform Provider reflects VergeIO’s vision of a simpler, more accessible IT future. By bringing VergeOS together with Terraform, we are improving the experience for our current users and lowering the barrier to entry for new customers, making it easier than ever to adopt and benefit from our technology.”
Jason Yaeger.
Terraform’s high-level declarative configuration language is used to define and provision datacenter infrastructure, automating infrastructure deployment and management; the VergeIO provider exposes VergeOS resources through it. Terraform is open source and cloud-agnostic, and its use helps ensure infrastructure consistency, reduces errors, and simplifies workflows. VergeIO says users can now implement its virtualized datacenters without needing direct access to VergeIO’s internal code base, making deploying and managing VergeIO infrastructure easier.
VergeIO states that its Terraform Provider currently supports a range of infrastructure resources, including virtual machines, drives, networks, and users, as well as data sources for clusters, groups, and more. It aims to continually enhance and expand the provider’s capabilities.
The provider supports incorporating VergeIO resources into Terraform workflows, enabling users to continue using familiar tools and services. Integrating VergeOS with Terraform allows customers to track changes to their infrastructure over time.
However, don’t be misled into thinking you no longer need to code VergeIO infrastructure. You still do, albeit at a higher level. Here is an example of Terraform being used to set up and populate an AWS S3 bucket:
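A representative snippet using the standard AWS provider resources looks like this; the bucket name and object content are placeholders:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# Create a bucket, then seed it with a single object.
resource "aws_s3_bucket" "example" {
  bucket = "example-demo-bucket-0001"      # placeholder name
}

resource "aws_s3_object" "seed" {
  bucket  = aws_s3_bucket.example.id
  key     = "data/seed.json"
  content = jsonencode({ seeded = true })  # inline content instead of a local file
}
```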
And here’s a VergeIO example of adding a NIC to a Virtual Machine:
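The shape of that configuration is sketched below; the resource and attribute names are assumptions for illustration rather than the provider’s documented schema, so check the VergeIO listing on the Terraform Registry for the real names:

```hcl
# Illustrative sketch only: vergeio_vm and vergeio_nic and their attributes are
# assumed names, not taken from VergeIO's published provider documentation.
resource "vergeio_vm" "app_server" {
  name  = "app-server-01"
  cores = 4
  ram   = 8192               # MB
}

resource "vergeio_nic" "app_server_nic" {
  vm_id   = vergeio_vm.app_server.id
  network = "internal-net"   # existing VergeOS virtual network
}
```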
Customers can view the Terraform Provider code, submit update requests, and contribute to the provider through GitHub. Approved pull requests and tagged new versions will be automatically pushed to the Terraform Registry, ensuring that users can always access the latest features and updates.
US Nutanix users can get a single-tenant cloud just for their workloads from LightEdge, eliminating noisy-neighbor resource contention problems.
LightEdge is a US-based cloud service and colocation datacenter provider with 12 Tier III datacenter sites and upwards of 1,300 customers. In April, it acquired St. Louis-based Connectria, which provides multi-cloud infrastructure (IBM, AWS, and Azure) and managed hosting services, gaining six more datacenters and more than 400 additional customers. This was LightEdge’s fourth acquisition since it was itself bought by private equity and real estate fund management business GI Partners in 2021.
LightEdge now has 18 datacenters across 11 US regional markets.
Jim Masterson
The LightEdge Nutanix Dedicated Cloud provides dedicated resources and more security than a multi-tenant offering. It’s built on Nutanix’s hyperconverged infrastructure software with its AHV hypervisor and offers per-node scaling.
LightEdge CEO Jim Masterson said: “We’re excited to make Nutanix Dedicated Cloud available to our customers. It gives our customers another choice in single-tenant cloud and is an excellent solution for businesses needing dedicated resources, high performance, scalability, and disaster recovery capabilities built-in.”
Suggested use cases include:
High-performance applications needing dedicated and optimized resources.
Storage-intensive workloads requiring predictable utilization and capacity management.
Workloads needing regulatory compliance specifying known datacenter locations, etc.
Secure disaster recovery with quick restoration.
Infrastructure modernization without refactoring legacy applications.
The company also offers single-tenant VMware clouds as it is technology-agnostic. France’s OVH also provides single-tenant VMware and Nutanix services using its OVHcloud Hosted Private Cloud. The physical infrastructure is deployed and managed by OVHcloud in chosen datacenters, with the customer operating virtual machines, applications and data.
Dgtl Infra has a 2024 list of the top 250 datacenter operators globally. AWS, Azure, and Google are ranked first, second, and third, followed by Meta and Equinix. OVH is number 33 and LightEdge is ranked at 94.
LightEdge says it has a less than 1 percent customer churn rate, 24/7 access to a support team, and a 100 percent guaranteed uptime SLA. Read a brief LightEdge Nutanix brochure here.
DDN is being certified by Nvidia to provide the storage component of a reference architecture to be used by cloud service providers (CSPs) to build Nvidia GPU-powered AI factory infrastructure.
This CSP reference architecture (RA) is a blueprint for building datacenters that can supply generative AI and large language model (LLM) services similar to the big three public clouds and hyperscalers. The Nvidia Cloud Partner scheme covers organizations that offer hosted software and hardware services in a cloud or managed services model to customers using Nvidia products.
Jyothi Swaroop
DDN’s SVP and CMO, Jyothi Swaroop, said in a statement: “This reference architecture, developed in collaboration with Nvidia, gives cloud service providers the same blueprint to a scalable AI system as those already in production in the largest AI datacenters worldwide, including Nvidia’s Selene and Eos supercomputers.
“With this fully validated architecture, cloud service providers can ensure their end-to-end infrastructure is optimized for high-performance AI workloads and can be deployed quickly using sustainable and scalable building blocks, which not only offer incredible cost savings but also a much faster time to market.”
Marc Hamilton, Nvidia VP for Solutions Architecture and Engineering, writes in a blog: “LLM training involves many GPU servers working together, communicating constantly among themselves and with storage systems. This translates to east-west and north-south traffic in datacenters, which requires high-performance networks for fast and efficient communication.”
Marc Hamilton
Nvidia’s cloud partner RA includes:
Nvidia GPU servers from Nvidia and its manufacturing partners, based on Hopper and Blackwell GPUs.
Storage offerings from certified partners, including those validated for DGX SuperPOD and DGX Cloud.
Quantum-2 InfiniBand and Spectrum-X Ethernet east-west networking.
BlueField-3 DPUs for north-south networking and storage acceleration, elastic GPU computing, and zero-trust security.
In/out-of-band management tools and services for provisioning, monitoring, and managing AI datacenter infrastructure.
Nvidia AI Enterprise software, including:
Base Command Manager Essentials, which helps cloud providers provision and manage their servers.
NeMo framework to train and fine-tune generative AI models.
NIM microservices designed to accelerate deployment of generative AI across enterprises.
Riva, for speech services.
RAPIDS Accelerator to speed up Apache Spark workloads.
The DDN storage component uses its Lustre-based A³I (Accelerated, Any-Scale AI) system based on the AI400X2T all-flash Turbo appliances and DDN’s Insight manager. It is, DDN says, a fully validated and optimized high-performance AI storage system for CSPs running services on Nvidia’s HGX H100 8-GPU servers.
DDN AI400X2T
Each AI400X2T appliance delivers over 110 GBps and 3 million IOPS directly to HGX H100 systems.
An A³I multi-rail feature enables the grouping of multiple network interfaces on an HGX system to achieve faster aggregate data transfer capabilities without any switch configuration such as channel groups or bonding. It balances traffic dynamically across all the interfaces, and actively monitors link health for rapid failure detection and automatic recovery.
A Hot Nodes software enhancement enables the use of NVMe drives in an HGX system as a local cache for read-only operations. It improves the performance of apps where a dataset is accessed multiple times during a workflow. This is typical with deep learning (DL) training, where the same input data set or portions of the same input data set are accessed repeatedly over multiple training iterations.
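Conceptually, Hot Nodes is a read-through cache: the first epoch reads a training shard over the network, and subsequent epochs hit the GPU server’s local NVMe. A generic illustration of the pattern in Python (not DDN’s implementation; paths are placeholders):

```python
import shutil
from pathlib import Path

REMOTE = Path("/mnt/lustre/datasets")   # shared parallel file system mount (placeholder)
LOCAL = Path("/nvme_cache/datasets")    # GPU server's local NVMe drives (placeholder)

def cached_open(relative_path: str, mode: str = "rb"):
    """Serve a file from local NVMe if present; otherwise copy it in first."""
    local_file = LOCAL / relative_path
    if not local_file.exists():
        local_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(REMOTE / relative_path, local_file)  # first read pays the network cost
    return open(local_file, mode)                            # repeat reads hit local flash

# Training loops re-read the same shards every epoch; only epoch one goes over the network.
with cached_open("imagenet/shard-0001.tar") as f:
    batch = f.read()
```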
The A³I has a shared parallel architecture with redundancy and automatic failover capability, and enables and accelerates end-to-end data pipelines for DL workflows of all scales running on HGX systems. DDN says significant acceleration can be achieved by executing an AI application across multiple HGX systems simultaneously and engaging parallel training efforts of candidate neural network variants.
The DDN RA specifies that “for NCP (Nvidia Cloud Partner) deployments, DDN uses two distinct appliance configurations to deploy a shared data platform. The AI400X2T-OSS appliance provides data storage through four OSS (Object Storage Server) and eight OST (Object Storage Target) appliances and is available in 120, 250 and 500 TB useable capacity options. The AI400X2-MDS appliance provides metadata storage through four MDSs (Metadata Server) and four MDT (Metadata Target) appliances. Each appliance provides 9.2 billion inodes. Both appliance configurations must be used jointly to provide a file system and must be connected to HGX H100 systems through RDMA over Converged Ethernet (RoCE) using ConnectX-7 HCAs. Each appliance provides eight interfaces, two per OSS/MDS, to connect to the storage fabric.”
The RA contains configuration diagrams for hooking up DDN storage to 127, 255, 1,023, and 2,047 HGX H100 systems; this is big iron infrastructure by any standards.
DDN RA diagram showing AI400X2T storage for 127 HGX H100 GPU servers
DDN says a single AI400X2T delivers 47 GBps read and 43 GBps write bandwidth to a single HGX H100 GPU server. Yesterday we noted Western Digital’s OpenFlex Data24 all-flash disaggregated NVMe drive storage delivers a claimed 54.56 GBps read and 52.6 GBps write bandwidth to a GPU server using GPUDirect. However, WD is not an Nvidia Cloud Partner.
We recently covered MinIO’s DataPOD, which delivers stored objects to GPU servers “with a distributed MinIO setup delivering 46.54 GBps average read throughput (GET) and 34.4 GBps write throughput (PUT) with an eight-node cluster. A 32-node cluster delivered 349 GBps read and 177.6 GBps write throughput.” You scale up the cluster size to reach a desired bandwidth level. Like WD, MinIO is not an Nvidia Cloud Partner.
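For a rough like-for-like view, the MinIO cluster figures can be normalized to per-node numbers; the DDN and Western Digital figures above are already quoted per appliance. A quick calculation from the published aggregates:

```python
# Normalize MinIO's quoted cluster throughput to per-node figures.
clusters = [
    {"nodes": 8, "read_GBps": 46.54, "write_GBps": 34.4},
    {"nodes": 32, "read_GBps": 349.0, "write_GBps": 177.6},
]
for c in clusters:
    print(f'{c["nodes"]} nodes: {c["read_GBps"] / c["nodes"]:.2f} GBps read, '
          f'{c["write_GBps"] / c["nodes"]:.2f} GBps write per node')
# 8 nodes: 5.82 read / 4.30 write per node; 32 nodes: 10.91 read / 5.55 write per node
```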
DDN says its cloud reference architecture addresses the needs of service providers by ensuring maximum performance of their GPUs, reduced deployment times and guidelines to handle future expansion requirements. Check out DDN’s CSP RA here.
Fortanix has added File System Encryption to its Data Security Manager (DSM) product.
DSM is part of Fortanix Armor, a platform for consolidated data security services built on Confidential Computing, securing data while it is being processed, using hardware-based Trusted Execution Environments (TEEs) like Intel SGX (Software Guard Extensions). Its Runtime Encryption extends TEE protection by ensuring data and applications are secure even when they are actively being used in memory. Fortanix also offers a self-defending key management service to manage encryption keys, secrets, tokens, and certificates securely across different environments.
Fortanix unified data security platform graphic
DSM secures sensitive data across public, hybrid, multi-cloud, and private cloud environments. It has a Cohesity integration, in which Cohesity encrypts its backup data while Fortanix manages the keys, creating a separation of duties. Fortanix’s DSM has also been integrated with Cloudian’s HyperStore (v7.5.1). DSM gained Confidential Data Search in June last year, which provides scalable searches in encrypted databases holding sensitive data, without compromising data security or privacy regulations. Or so the pitch goes. Fortanix also has a partnership with Snowflake to make DSM SaaS available to Snowflake customers.
Fortanix DSM screenshot
Anuj Jaiswal
Anuj Jaiswal, Fortanix VP of products, stated: “As data security becomes increasingly complex, offering organizations the ability to manage encryption across all levels through a unified platform creates huge value. The addition of Fortanix File System Encryption to our already robust Data Security Manager offering gives enterprises a one-stop shop for all of their encryption and data security needs.”
Fortanix File System Encryption (FSE) operates at the OS layer rather than the kernel layer, eliminating issues related to kernel dependencies. Enterprises can automate deployments using tools like Rundeck. FSE enables:
Levelling up of data security: Users can set up and manage agents to encrypt individual file systems mounted on host machines, and can scale agent deployments, which are based on the Open Policy Agent specification, with DSM delivered as SaaS.
Full control of access policies: Granular policy-based decryption ensures that only authorized users and processes can access plaintext data.
Efficient management of encryption keys: Centralize lifecycle management of all encryption keys while storing them in natively integrated FIPS 140-2 Level 3 HSMs, available as SaaS or on-premises. Prevent involuntary or malicious key deletion with quorum approvals.
This latest DSM development means that Fortanix’s unified data security platform now supports encryption across all layers, including application, database, storage, and file system.
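File-level encryption with externally managed keys typically follows an envelope pattern: the file is encrypted with a data key, and only the data key is handed to the central key manager, which applies access policy before releasing it. A generic Python illustration of that pattern using the cryptography package (not Fortanix’s actual API):

```python
# Generic envelope-encryption pattern for file data; illustrative only.
from cryptography.fernet import Fernet

def encrypt_file(plain_path: str, enc_path: str) -> bytes:
    data_key = Fernet.generate_key()                     # per-file data encryption key
    with open(plain_path, "rb") as src, open(enc_path, "wb") as dst:
        dst.write(Fernet(data_key).encrypt(src.read()))
    return data_key  # in practice, wrapped and stored in a central key manager (KMS/HSM)

def decrypt_file(enc_path: str, data_key: bytes) -> bytes:
    # A policy check against the key manager would gate the release of data_key here.
    with open(enc_path, "rb") as src:
        return Fernet(data_key).decrypt(src.read())
```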
Western Digital claims its OpenFlex Data24 system loaded with NVMe SSDs and an NVMe over Fabrics (NVMeoF) adapter does read and write IO across an Nvidia GPUDirect link faster than NetApp’s ONTAP or BeeGFS arrays.
The OpenFlex Data24 is a 2U x 24-drive slot enclosure introduced in 2020. The upgraded 3200 series was launched in August last year. It featured dual-port SSDs and a RapidFlex fabric bridge, supporting NVMeoF RDMA across RoCE and TCP for improved performance. Western Digital has published an Nvidia GPUDirect Storage technical brief document benchmarking GPUDirect and the OpenFlex Data24 system.
As GPUDirect bypasses a server’s host CPU and DRAM, it enables direct read and write operations to and from the server’s NVMe SSDs, as well as to and from the Data24’s SSDs via its fabric bridge. The Data24 is just another disaggregated storage system in this sense.
Typically, Nvidia GPUs are fed data either from parallel file systems, or from NVMeoF-supporting arrays such as those from NetApp, Pure Storage, and VAST Data. Note that VAST’s array architecture provides parallel file system performance.
The Western Digital tech brief provides detailed configuration data to show that the Data24 is a valid GPUDirect data source.
Its benchmarked GPUDirect IO performance is 54.56 GBps read bandwidth and 52.6 GBps write bandwidth. B&F has tracked GPUDirect storage performance from various suppliers and can compare the OpenFlex Data24 performance on a per-node basis with them:
B&F chart using public supplier GPUDirect performance numbers
Readers can see in B&F‘s chart that there are two groups of higher-performing systems, above 62.5 GBps bandwidth. To the left there are DDN and IBM with their parallel file system software – Lustre and Storage Scale respectively. On the right is WekaPOD, with its parallel Data Platform file system, and three PEAK:AIO results. These come from PCIe gen 5-supporting servers and PEAK:AIO’s rewritten NAS software.
The Western Digital OpenFlex scores are unusual in that there is near equality between the read and write numbers. They beat the NetApp ONTAP and BeeGFS numbers, and Pure Storage’s write bandwidth but not its read bandwidth. They also beat VAST Data and WEKA write bandwidth, but lag well behind WekaPOD reads and slightly behind VAST Data reads.
Western Digital noted that the “disaggregation of NVMe to NVMe-oF only adds ∼10μs when compared to in-server NVMe drives.”
Its tech brief concludes that “while somewhat dependent on compute, storage capacity and performance requirements, the consumer ultimately has choice over which GPU servers, GPU RNICs, network components and SSD model to incorporate; all within a clearly defined cost model.”
COMMISSIONED: Organizations must consider many things before deploying generative AI (GenAI) services, from choosing models and tech stacks to selecting relevant use cases.
Yet before most organizations begin to tackle these tasks, they must solve perhaps their biggest challenge of all: their data management problem. After all, managing data remains one of the main barriers to creating value from GenAI.
Seventy percent of top-performing organizations said they have experienced difficulties integrating data into AI models, according to recent McKinsey research. These organizations experience issues with data quality, defining processes for data governance and having sufficient training data, McKinsey said. This can increase risks for organizations pursuing GenAI initiatives.
Getting your data house in order is table stakes for fostering AI capabilities while protecting corporate IP. But where do you start? And what data management and governance options are available?
Prioritize data quality and governance
Boosting data quality is a logical starting point. Large organizations are awash in data that could be useful for GenAI models and their resulting applications. However, the quality of that data is often too poor to use without some corrections. Data, often siloed across different business functions, can include wrong, outdated, or duplicative records.
This is par for the course in many organizations that have generated enterprise data over the years. However, using such disorganized data can wreak havoc on models, leading to bad outcomes, hallucinations and risk to corporate reputation. Remember, this is your organization’s IP, so you need to protect it.
How you massage your data to get the right outcomes will vary based on your business requirements. However, many organizations opt to collect, clean, preprocess, label and organize their data prior to leveraging it for training models.
Data governance is a critical factor for protecting corporate IP as you build GenAI models and applications. You’ll institute guidelines addressing AI usage within the organization and determine approved AI tools and usage policies.
Key to this is articulating a formal training policy to educate employees on how to use GenAI services ethically and responsibly, as well as the risks associated with inputting sensitive content into restricted GenAI systems.
Ultimately, however, a critical component of a good governance strategy is keeping a human in the loop at all times. After all, isn’t it about time your humans and machines learned to work together?
Synthetic data gives you secure options
Cleaning and governing your data will be good enough for many organizations dabbling in GenAI technologies. However, others may need to take a more prescribed approach when it comes to protecting their corporate IP.
For example, some GenAI use cases may be tough to execute as the data can be hard to obtain. And many organizations can’t afford to use their actual data, which may include personally identifiable data. This is particularly true in regulated markets, such as financial services, healthcare and life sciences bound to stringent data protection rules.
As a result, some organizations have turned to GenAI to generate synthetic data, which mimics real-world patterns without exposing sensitive personal information. This can help you test with realistic data and see potentially desirable outcomes.
It isn’t perfect; after all, the data is made up. But it may serve as a reasonable proxy for achieving your outcomes.
The unstructured data challenge
GenAI services produce unstructured data, such as PDFs, audio and video files, complementing the structured data stored in databases. Too many organizations let raw data flow into their lakes without cataloguing and tagging it, which can degrade data quality.
Organizations typically wrangle the data with disparate tools and approaches, which challenges their ability to scale their initiatives.
To streamline their efforts, more organizations are turning to a data lakehouse, which is designed to work with structured and unstructured data. The data lakehouse abstracts the complexity of managing storage systems and surfaces the right data where, when and how it’s needed.
Dell offers the Dell Data Lakehouse, which affords your engineers self-service access to query their data and achieve outcomes they desire. The solution uses compute, storage and software in a single platform that supports open file and table formats and integrates with the ecosystem of AI and ML tools.
Your data is your differentiator and the Dell Data Lakehouse respects that by baking in governance to help you maintain control of your data and adhere to data sovereignty requirements.
The Dell Data Lakehouse is part of the Dell AI Factory, a fungible approach to running your data on premises and at the edge using AI-enabled infrastructure with support from an open ecosystem of partners. The Dell AI Factory also includes professional services and use cases to help organizations accelerate their AI journeys.
While organizations prefer their GenAI solutions to be plug-and-play, the reality is you’ve got to grab your shovel and come to work ready to dig through your data, prepare it to work with your models and protect it. Is your organization up to the task?
Lenovo has built a clusterable AI Data Lake system with AMD servers running Cloudian’s HyperStore object storage.
The hardware is Lenovo’s SR635 V3 all-flash server with a single-socket, 48-core AMD fourth-gen EPYC 9454P CPU. A six-node test system fitted with 8 x 7.68 TB NVMe SSDs for data and 2 x 3.84 TB metadata SSDs per node delivered 28.7 GBps reads and 18.4 GBps writes. This was 74 percent more power-efficient than an equivalent disk drive-based system, according to Cloudian testing.
Michael Tso
Cloudian CEO and co-founder Michael Tso stated: “Lenovo’s industry-leading servers with AMD EPYC processors perfectly complement Cloudian’s high-performance data platform software. Together, they deliver the limitlessly scalable, performant, and efficient foundation that AI and data analytics workloads require.” He suggested that the combined Lenovo-Cloudian system suited AI, machine learning, and HPC workloads.
Lenovo executive director and GM for storage Stuart McRae said: “This partnership enables us to offer our customers a cutting-edge, scalable, and secure platform that will help them accelerate their AI initiatives and drive innovation.”
HyperStore software from Cloudian is S3-compatible, scalable to exabyte levels, and has Object Lock immutability to protect against ransomware. There are more than 800 enterprise-scale deployments of Cloudian’s HyperStore software.
Stuart McRae
MinIO has recently announced a DataPOD reference architecture for feeding data from its object storage software to Nvidia GPU servers. It quoted 46.54 GBps read and 34.4 GBps write bandwidth from an eight-node storage server system, with 24 SSDs per node. On a per-node basis, that equates to 5.82 GBps reads and 4.3 GBps writes; faster than the Lenovo-Cloudian system’s 4.78 GBps reads and 3.1 GBps writes.
However, on a per-drive basis, the Lenovo-Cloudian system delivered 0.478 GBps reads and 0.31 GBps writes, whereas the MinIO DataPOD RA system provided 0.243 GBps reads and 0.179 GBps writes. This comparison is simplistic and lacks detailed specs of the systems’ CPUs, core counts, memory, PCIe structures, and networking ports. Costs are also not considered. Detailed research is necessary for concrete conclusions.
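The per-node and per-drive figures quoted above follow directly from the published aggregates and drive counts; the Lenovo-Cloudian per-drive numbers include the two metadata SSDs in each node:

```python
# Reproduce the per-node and per-drive figures from the quoted aggregates.
systems = {
    # name: (read GBps, write GBps, nodes, drives per node incl. metadata SSDs)
    "Lenovo-Cloudian": (28.7, 18.4, 6, 10),   # 8 data + 2 metadata SSDs per node
    "MinIO DataPOD":   (46.54, 34.4, 8, 24),
}
for name, (rd, wr, nodes, drives) in systems.items():
    print(f"{name}: {rd / nodes:.2f}/{wr / nodes:.2f} GBps per node, "
          f"{rd / (nodes * drives):.3f}/{wr / (nodes * drives):.3f} GBps per drive (read/write)")
# Lenovo-Cloudian: 4.78/3.07 per node, 0.478/0.307 per drive
# MinIO DataPOD: 5.82/4.30 per node, 0.242/0.179 per drive
```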
Lenovo SR635 V3 server
Both Cloudian and Lenovo say power efficiency is going to become increasingly important, citing a Morgan Stanley report which says power consumption for generative AI is forecast to increase at an annual average of 70 percent through 2027, meaning that “by 2027, generative AI could use as much energy as Spain needed to power itself in 2022.”
The Morgan Stanley analysts believe that generative AI power demands can be met with sustainable sources, and “massive demand for power can also advance sustainable energy technology across sectors.”
The combined Lenovo/AMD/Cloudian AI Data Lake system is available now from Lenovo and from authorized resellers.
The Own Company is providing continuous data protection for Salesforce users.
This business used to be called OwnBackup and renamed itself last year. It was founded in 2012 and initially backed up customer data for the Salesforce SaaS app, later expanding to include Sage Business Cloud Financials, Veeva (life sciences), nCino (financial data), Microsoft Dynamics 365 CRM and Power platform, and ServiceNow. The majority of its revenue comes from Salesforce customers. There are some 20 other Salesforce backup suppliers, offering interval-based backup, and Own Company is trying to distance itself from them by offering “Continuous Data Protection.”
Own interval-based backup diagram
Continuous Data Protection (CDP) from Own pushes data changes to a backup as they happen, allowing businesses to capture changes in their data in near real time. Own claims no other Salesforce data protection supplier has this capability.
Adrian Kunzle
Own CTO Adrian Kunzle stated: “This innovative approach to Continuous Data Protection will provide our Salesforce customers with an unparalleled advantage for capturing every change to their data.”
Salesforce documentation says: “Change Data Capture publishes change events, which represent changes to Salesforce records. Changes include creation of a new record, updates to an existing record, deletion of a record, and undeletion of a record.”
Own’s turnkey CDP uses Salesforce Change Data Capture events to continuously capture all changes to production data as they happen. It provides a full historical record of how data changes over time.
A Salesforce customer can create retroactive Point-in-Time snapshots down to the minute. Own says this “mitigates the data loss that occurs when using traditional interval-based backups,” and can better meet Recovery Point Objective (RPO) and Recovery Time Objective (RTO) goals.
It can also enable “more precise data analysis and AI innovations by leveraging time-phased data sets, such as customer health or sales pipeline trends.”
Own continuous data protection diagram
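Conceptually, a change-event-driven backup keeps an append-only log of every create, update, delete, and undelete, and can replay that log up to any chosen minute to rebuild a retroactive snapshot. A simplified sketch of the idea (illustrative, not Own’s implementation):

```python
from datetime import datetime

change_log = []  # append-only stream of captured change events

def capture(event: dict) -> None:
    """Append a change event as it is published (e.g. by Change Data Capture)."""
    change_log.append(event)

def snapshot_at(point_in_time: datetime) -> dict:
    """Replay events up to a chosen time to rebuild record state."""
    records = {}
    for ev in change_log:
        if ev["timestamp"] > point_in_time:
            break
        if ev["change_type"] in ("CREATE", "UPDATE", "UNDELETE"):
            records.setdefault(ev["record_id"], {}).update(ev["fields"])
        elif ev["change_type"] == "DELETE":
            records.pop(ev["record_id"], None)
    return records

capture({"timestamp": datetime(2024, 8, 28, 9, 0), "record_id": "001A",
         "change_type": "CREATE", "fields": {"Name": "Acme", "Stage": "Prospect"}})
capture({"timestamp": datetime(2024, 8, 28, 9, 7), "record_id": "001A",
         "change_type": "UPDATE", "fields": {"Stage": "Closed Won"}})
print(snapshot_at(datetime(2024, 8, 28, 9, 5)))  # {'001A': {'Name': 'Acme', 'Stage': 'Prospect'}}
```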
We asked Own how its CDP compared to other CDP offerings and were told: “While Continuous Data Protection is an industry term, Own is the first company to bring Continuous Data Protection to SaaS applications by leveraging platform events.
“This unique approach to data protection enables customers to minimize data loss to near zero, accelerate data recovery time, removes previous limits to scalability, and ensures high-fidelity data for AI and machine learning. Other vendors (Rubrik, Cohesity, Veeam, Zerto) have CDP for other types of environments such as VMware.”
The Continuous Data Protection product for Salesforce is generally available from today.
Rubrik has extended its SaaS app data protection to Salesforce Core platform data.
The company already protects SaaS applications such as Microsoft 365, with Alcion, and added support for Atlassian’s Jira Cloud offering in December last year. At that time it said it wanted to support Salesforce, ServiceNow, Google Workspace, Dynamics 365, and more SaaS apps in the future.
Anneka Gupta
Rubrik chief product officer Anneka Gupta writes in a blog: “The new offering is powered by the robust data security capabilities of Rubrik Security Cloud, a unified platform that consolidates data protection across SaaS, cloud, and on-premises environments.”
Unlike specialist SaaS data protectors such as The Own Company or public cloud data protectors including Clumio, Rubrik is offering a single data protection and cyber-resilience product covering on-premises applications and data, public cloud-based apps and their data, and SaaS app data. It believes customers with distributed application environments will benefit from dealing with a single data protection supplier, with features like air-gapping and zero-trust architecture, rather than having several if not many different data protection silos.
It quotes a Gartner finding that, by 2028, 75 percent of enterprises will prioritize backup of SaaS applications as a critical requirement, compared with 15 percent this year.
Many backup suppliers are moving into SaaS app protection, or have already done so, including Asigra, Commvault, Druva, HYCU, Keepit, The Own Company, Veeam, and Veritas. Developments are coming thick and fast. The Own Company has announced Continuous Data Protection for Salesforce today.
In its pitch, Rubrik said Salesforce Data Protection offers:
Rapid and precise recoveries of lost records, including related records, to the right point in time
Customer help in maintaining data integrity by preventing the restoration of corrupted data
Intuitive user experience for Salesforce admins configuring and managing backups
Unified and comprehensive data protection management across Salesforce, other SaaS platforms, on-premises, and multi-cloud environments
Rubrik’s Salesforce Data Protection is now generally available on the AppExchange, and will be on display at Dreamforce 2024, taking place September 17-19 in San Francisco. Interested parties can register for the “Secure and Simplify: Salesforce Data Protection with Rubrik” webinar taking place September 4 at 9:00 am PST.
Western Digital is spinning off its NAND and SSD business, which could be valued similarly to Solidigm, says Wedbush analyst Matt Bryson.
Disk drive and SSD manufacturer Western Digital is splitting into two distinct businesses, one making hard disk drives (HDDs) and the other making NAND flash and SSDs. Current WD CEO David Goeckeler will run the NAND and SSD business while EVP for Global Operations Irving Tan will become CEO of the HDD business.
Bryson calculates that the NAND and SSD side of Western Digital (based on acquiring SanDisk nine years ago for $19 billion) could be valued between $10 billion and $23 billion as a standalone business.
Matt Bryson.
He told subscribers: “The net of our analysis remains that the value of the HDD portion of Western Digital comprises nearly the entirety of the total company valuation (at just over $20 billion) and that SanDisk is effectively valued at close to nothing.
“Our sum of the parts valuation suggests that the market is simply attaching no value to WDC’s NAND business.”
We can see this if we compare Western Digital and Seagate annual revenues and market capitalization. Seagate, which is overwhelmingly an HDD business, earned $6.6 billion in revenues in its last fiscal year. Its current market capitalization is $21.4 billion. Western Digital revenues for its latest fiscal year were $13 billion, making it twice as big as Seagate, while its market capitalization of $21.9 billion values it as almost equal to Seagate. Thus, effectively, half of Western Digital’s business is valued at nothing compared to Seagate.
In his note, Bryson attempts to estimate the standalone worth of the NAND/SSD part of Western Digital, considering four financial issues and then making assumptions. The issues are:
Today Western Digital has debt. How will it be apportioned between the two new companies?
Will the split be accompanied by a stock sale to raise capital?
How will Western Digital’s current operating expenses be divided between the two new companies?
How will these two new businesses be taxed?
Also, there are no means of breaking out the expenses of Western Digital’s NAND business.
He makes a series of assumptions about these issues. The Western Digital HDD business will have operating expenses equivalent to those of Seagate. They might be lower, since Western Digital won’t carry the costs of Seagate’s Lyve and systems business operations.
As the Western Digital HDD operation generates most of the company’s cash, it should take on “the lion’s share of the debt.” It has a greater ability to pay the debt, “more modest future capital requirements,” and could refinance “the debt load given the former conditions.”
He also reckons that there will be “a somewhat lighter tax burden for the HDD entity” versus the NAND/SSD business based on historical precedent.
Given these issues and assumptions, he estimates the HDD business is worth between $19 billion and $24 billion using 8x and 10x sales net income multiples. The NAND/SSD operation is worth between $10 billion and $22 billion, depending upon gross margin estimates and sales net income multiples.
In summary, Bryson said: “We believe that once split, the separate SanDisk and WD entities should be worth at least $30 billion, and arguably will eventually command a valuation of $40 billion-plus if we are correct and NAND and HDDs are indeed entering into a prolonged up cycle with the full benefits manifesting in 2025.”
Bryson pointed out that “SanDisk’s lowest market cap (in the three years prior to WD’s proposed acquisition) was around $10 billion.” SK hynix paid a similar amount for Solidigm.