
Micron launches faster U.3 datacenter SSD

Micron has launched its 9400 datacenter NVMe SSD, a PCIe gen 4 drive with strong performance and a 30.72TB maximum capacity.

Update: 7450 and 9400 positioning comment from Micron added. 10 Jan 2023.

Just over nine months after launching its 7450 datacenter SSD using PCIe gen 4 and 176-layer 3D NAND, Micron has used the same TLC NAND and interface to build the 9400 with double the maximum capacity, 60 percent more random read IOPS, slightly more sequential read and significantly more sequential write bandwidth.

A statement from Micron’s datacenter storage VP and GM, Alvaro Toledo, said: “High performance, capacity and low latency are critical features for enterprises seeking to maximize their investments in AI/ML and supercomputing systems.” 

Micron 9400

Micron says the 9400’s latency is 69μs read and 10μs write. For reference, the 7450’s latency is as low as 80μs read and 15μs write, with the 9400 being more responsive.

Micron told us: “The 9400 is aligned to the NVMe performance market which is currently focused on U.3, especially for high capacities over 30TB like the 9400 SSD. Our 7450 SSD is a mainstream data center SSD and offers one of the industry’s broadest form factor options to address diverse use cases for all major platform functions including boot and main data storage.”

The 9400 comes in the U.3 2.5-inch format, whereas the preceding 7450 comes in U.3, E1.S and M.2 formats. Like the 7450, the 9400 comes in PRO and MAX variants for read-centric and mixed read/write workloads respectively. Both the 7450 and 9400 PRO models are rated at 1 drive write per day (DWPD), with the MAX variants enjoying a 3 DWPD rating.

Here’s a table comparing the 7450 U.3 variant and the 9400’s characteristics:

Micron SSD performance

The 9400 delivers up to 1.6 million random read IOPS compared to the 7450’s 1 million, and up to 600,000 random write IOPS against the 7450’s 410,000 (MAX variants). There has been a 25 percent sequential write speed boost, from the 7450’s 5.6GBps to the 9400’s 7GBps, with a smaller sequential read speed increase from 6.8GBps to 7GBps.

Since the 9400 and 7450 use the same NAND, we understand the speed increases come from greater parallelism inside the controller and drive.

Looking into the speed picture

How does the 9400 compare to other PCIe 4 SSDs? A 7GBps sequential read speed is good but not outstanding, while a 7GBps sequential write speed is very good.

For example, Inspur’s Enterprise NVMe SSD reaches 7GBps sequential read and Samsung’s AIC format PM1733 and PM1735 achieve 8GBps. A Phison-Seagate X1 SSD in U.3 format attains 7.4GBps, but that drive is not actually shipping. Solidigm’s D7-P5520 and P5620 drives, both in U.2 format, deliver up to 7.1GBps read bandwidth. The 9400 is not top-rank in this general company, but it is top if only U.3 format drives are considered.

Sequential write speeds of 7GBps or above are rare indeed. The only drives that exceed this are the Phison-Seagate X1 SSD, which puts out 7.2GBps when writing, and Liqid’s LQD4500, with its 16 lanes and AIC format pumping out up to 16GBps, although that drive has a composable systems focus. Micron’s 9400 is top in shipping U.3 format drive terms.

In the IOPS field, the 9400’s 1.6 million random read IOPS are exceeded only by Liqid’s LQD4500 (4 million, with the same AIC composable systems proviso), the Phison-Seagate X1’s 1.75 million, and the U.2-format SmartIOPS Data Engine T2’s 1.7 million. The 9400 is again the fastest shipping U.3 drive.

The 9400 MAX’s 600,000 random write IOPS are exceeded by several other drives: Western Digital’s SN770 and SN850, Solidigm’s P44 Pro, SK hynix’s Platinum P41, Seagate’s FireCuda 520 and 530, Samsung’s 980 Pro and 990 Pro, and others, all in M.2 gumstick card format. No other U.3 or U.2 NVMe PCIe gen 4 SSD beats the 9400, though. It reigns supreme in U.3 random write IOPS terms.

Micron declares that the 9400 is the world’s fastest PCIe 4 datacenter U.3 drive shipping, and that’s certainly true.

It has provided numbers showing faster performance than competing SSDs with RocksDB (up to 25 percent higher performance), Aerospike Database (up to 2.1 times higher peak performance), and multi-tenant cloud architecture workloads (double the overall performance), but hasn’t identified the competing products.

Power efficiency and rack space

Micron says that, at the 7.68TB capacity level, the 9400 SSD delivers 94,118 4K random read IOPS/watt versus 53,100 IOPS/watt for the prior generation Micron 9300 NVMe SSD. But this is a 64-layer, PCIe 3, 3D NAND drive from 2018. It would be surprising if the 9400 wasn’t more thrifty with its electrical power.

Toledo said: “Thanks to its industry-leading 30TB capacity and stunning performance with over 1 million IOPS in mixed workloads, the Micron 9400 SSD packs larger datasets into each server and accelerates machine learning training, which equips users to squeeze more out of their GPUs.” 

Micron points out that a standard two-rack-unit, 24-drive server loaded with 30.72TB 9400 SSDs provides 737TB per server. By doubling capacity per SSD, Micron is enabling enterprises to store the same amount of data in half as many servers. 

WEKA co-founder and CEO Liran Zvibel provided a nice quote for Micron: “High-performance, high-capacity storage like the Micron 9400 SSD provides the critical underlying technology to accelerate access to data and time to insights that drive tremendous business value.” 

Wally Liaw, co-founder and SVP of business development at Supermicro, was on-page as well: “The Micron 9400 SSD delivers an immense storage volume of over 30TB into every rack while simultaneously supporting optimized workloads and faster system throughput for advanced applications.”

Download a 9400 product brief here.

VMware and Nutanix still dominating HCI market

IDC has resumed publishing hyperconverged system supplier revenue and market share numbers after a three-quarter gap, and not much has changed between the end of 2020 and the latest numbers, which cover the third quarter of 2021. As before, VMware leads in revenue and market share ($982.3 million, 41.5 percent), followed by Nutanix ($581.2 million, 24.6 percent), with Huawei, Cisco and HPE distantly behind.

Hyperconverged infrastructure (HCI) systems, also known as servers with storage, combine compute, networking and storage hardware with hypervisor software, providing a scale-out system in which customers grow capacity by adding HCI nodes. The hypervisor software aggregates the storage across the individual nodes into a virtual SAN (also called a server SAN).

Here is our table of the IDC numbers:

Vendors dip in and out of the table as IDC only publishes numbers for the top five vendors in any quarter. You have to buy its reports to get the full picture.

The numbers are more than a year old and we have no IDC visibility into what’s happened since the third 2021 quarter. We have charted both the revenue and market share numbers to show the trends: 

Virtually nothing has changed over the gap since the fourth 2020 quarter. VMware still rules the roost with vSAN. While the overall HCI market grew at 14.3 percent in the year to 3Q 2021, VMware vSAN revenues rose 17.7 percent, with Nutanix having a 13.4 percent increase. Only Huawei beat VMware, with its 21.6 percent rise.

Cisco HCI sales declined 14 percent in the year to 3Q 2021 and there’s little reason to think they may have grown since then. HPE’s sales fell 7.7 percent in the same period and its market share went down from 4.5 percent in the second 2020 quarter to 3.5 percent in the third 2021 quarter.

Other suppliers accounted for 21.3 percent of the market in the third 2021 quarter, almost as much as Nutanix.

The takeaway here is that this is a stable and mature market, with VMware’s vSAN solidly in the lead at just under the $1 billion/quarter revenue mark, Nutanix locked into second place at just over the $500 million level, and three also-rans around the $100 million/quarter point. Expect little change as time goes by.

Bootnote.

IDC tells us it replaced its hyperconverged tracker with cloud infrastructure spending reports around this time (late 2021). That’s why there have been no hyperconverged infrastructure spending reports since then.

Why is there a market for S3 backups and data lake protection?

Cloud-based data lakes store data in cloud object storage and that data needs protecting – that’s the whole pitch made by SaaS data protection suppliers such as Commvault, Druva, HYCU and Clumio. Data lake suppliers such as Databricks say they look after data protection, but that protection does not extend down to the S3 or Azure Blob data storage layer. AWS protects the S3 infrastructure and AWS Backup provides S3 object data protection, but these backup vendors argue it can be limited in scale.

Basic refresher

Broadly speaking, data can be stored as blocks, files or objects. Data stored in blocks is located on a storage medium at a location relative to the medium’s start address. The medium, such as an SSD, disk drive, tape or optical disk, is organized into a sequential set of blocks running from the start block to the end block. For example, a piece of data, such as a database record, could start at block 10,052 and run for 50 blocks, ending at block 10,101.

The device can be virtualized and referred to as a LUN (Logical Unit Number), which could be larger or smaller than a physical device. Both file and object data are ultimately stored in blocks on storage media, with an internal and hidden logical-to-physical mapping process translating file and object addresses to device block addresses.
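
To make the block-addressing arithmetic above concrete, here is a minimal Python sketch. The 512-byte block size is an assumption (a common logical sector size), not something stated above:

```python
BLOCK_SIZE = 512          # bytes per logical block (assumed sector size)
start_block = 10_052      # where the database record begins
block_count = 50          # how many blocks it occupies

end_block = start_block + block_count - 1   # 10,101: last block of the record
byte_offset = start_block * BLOCK_SIZE      # where the record starts on the medium
byte_length = block_count * BLOCK_SIZE      # how many bytes it occupies

print(f"Record spans blocks {start_block}-{end_block}: "
      f"{byte_length} bytes starting at byte offset {byte_offset}")
```

The same arithmetic applies whether the target is a physical drive or a LUN; the logical-to-physical mapping beneath it is hidden from the application.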

A file is stored inside a hierarchical file:folder address space with a root folder forming the start point and capable of holding both files and sub-folders in a tree-like formation. Files are located by their position in the file:folder structure.

An object is stored in a flat address space and located by a unique key rather than by a hierarchical path.

S3, Amazon Web Services’ Simple Storage Service, is an object storage service for storing data objects on storage media. It currently stores about 280 trillion objects, according to Michael Kwon, Clumio’s marketing VP.

An object is a set of data with an identity. It is not a file with a filesystem folder-based address nor is it a set of storage drive blocks with a location address relative to the drive’s start address, as in a SAN with its LUNs (virtual drives).

S3 objects are stored in virtual repositories called buckets, which can store any amount of data; they are not fixed in size. Buckets are created in regions of the AWS infrastructure, and an object is addressed by a two-part combination: the name of its bucket followed by a user-assigned key.

S3 buckets have access controls which define which users can access them and what can be done to the bucket’s contents. There are eight different classes of S3 storage: Standard, Standard-Infrequent Access (Standard-IA), One Zone-Infrequent Access (One Zone-IA), Intelligent-Tiering, S3 on Outposts, Glacier Instant Retrieval, Glacier Flexible Retrieval, and Glacier Deep Archive.

S3 objects can be between 1 byte and 5TB in size, with 5TB being the maximum S3 object upload size. Data objects larger than 5TB are split into smaller chunks and sent to S3 in a multi-part upload operation.
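
To illustrate how a multi-part upload works from the client side, here is a minimal boto3 sketch; the bucket, key, local file name and 100MiB part size are all hypothetical choices for illustration:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-bucket", "backups/large-dataset.bin"  # hypothetical names
part_size = 100 * 1024 * 1024  # parts must be 5MiB-5GiB, except the last one

# Start the multi-part upload and send the file in chunks.
upload = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []
with open("large-dataset.bin", "rb") as f:          # hypothetical local file
    part_number = 1
    while chunk := f.read(part_size):
        resp = s3.upload_part(Bucket=bucket, Key=key, PartNumber=part_number,
                              UploadId=upload["UploadId"], Body=chunk)
        parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
        part_number += 1

# Tell S3 to stitch the parts back together into a single object.
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload["UploadId"],
                             MultipartUpload={"Parts": parts})
```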

Since S3 buckets can store anything, they can store files. An object can have a user-assigned key prefix that represents its access path. For example, Myfile.txt stored in a bucket called BucketName could be given the key Project/Document/Myfile.txt, with Project/Document/ as its prefix.

There is a flat S3 address space, unlike a hierarchical file:folder system, but S3 does use the concept of a folder to group and display objects within a bucket. The “/” character in an object’s key separates folder names. So, in the key Project/Document/Myfile.txt, Project appears as a folder and Document as a sub-folder, with both classed as folders.

All this means that data stored in an S3 bucket can look as if it is stored and addressed in a file:folder system, but it is not. An object’s key is its full path within the bucket, in this case Project/Document/Myfile.txt. AWS uses the / character to delineate virtual Project and Document folders, but they are not used as part of the object’s location process, only for grouping and display purposes.
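
A short boto3 sketch shows this grouping behaviour: listing with a Delimiter makes S3 return the virtual folders as CommonPrefixes, while the keys themselves stay flat. The bucket and key names are the hypothetical ones used above:

```python
import boto3

s3 = boto3.client("s3")

# Ask for everything "under" the virtual Project/ folder, one level deep.
resp = s3.list_objects_v2(Bucket="BucketName", Prefix="Project/", Delimiter="/")

for obj in resp.get("Contents", []):
    print("object key:", obj["Key"])              # objects directly under Project/
for pre in resp.get("CommonPrefixes", []):
    print("virtual sub-folder:", pre["Prefix"])   # e.g. Project/Document/
```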

The S3 folder construct is just used to group objects together but not to address them in a hierarchical fashion. We are now at the point where we understand that S3 buckets are effectively bottomless. How is data in them protected?

S3 data protection

An organization supplying services based on data stored in S3 buckets could have a data protection issue, say some vendors. For example, data lakehouse supplier Databricks has its Delta Lake open source storage layer running in AWS. This runs on top of Apache Spark and uses versioned Parquet files to store data in AWS cloud object storage, meaning S3.

AWS protects the S3 storage infrastructure, ensuring it is highly durable. It states: “S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive redundantly store objects on multiple devices across a minimum of three Availability Zones in an AWS Region.”

Amazon S3 also protects customers’ data using versioning, and customers can also use S3 Object Lock and S3 Replication to protect their data. But these don’t necessarily protect against accidental or deliberate data loss, software errors and malicious actors.
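
As an illustration of those built-in protections, versioning and a default Object Lock retention rule can be switched on with two S3 API calls. This is a minimal boto3 sketch with a hypothetical bucket name; note that Object Lock can only be configured on a bucket that was created with Object Lock enabled:

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-data-lake-bucket"  # hypothetical

# Versioning keeps prior versions of overwritten or deleted objects recoverable.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# A default retention rule stops object versions being deleted for 30 days.
s3.put_object_lock_configuration(
    Bucket=bucket,
    ObjectLockConfiguration={
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 30}},
    },
)
```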

Amazon has made AWS Backup for S3 generally available to “create periodic and continuous backups of your S3 bucket contents, including object data, object tags, access control lists (ACLs), and other user-defined metadata.”

But Chadd Kenney, VP of Product at Clumio, said AWS Backup is limited to backing up 3 billion objects. “You can see it on their site that it’s a 3 billion object limit,” he said. In his view this is not enough: “You need to get to exabytes scale, and you need to get to billions and billions of objects.”

S3 data lake protection

Databricks VP for field engineering Toby Balfre told us: “Databricks uses Delta Lake, which is an open format storage layer that delivers reliability, security and performance for data lakes.  Protection against accidental deletion is achieved through both granular access controls and point-in-time recovery capabilities.”

This protects Delta Lake data – but it does not protect the underlying data in S3.

Clumio’s Ari Paul, director of product and solutions marketing, said: “If you have a pipeline feeding data into a Snowflake, external table or something, what happens if the source data disappears? Our approach is that we protect data lakes from the infrastructure level.”

Data lakes and lakehouses can hold petabytes of data in the underlying cloud object storage, and that data cannot be protected by AWS Backup once it scales beyond the 3 billion object limit.

Clumio’s S3 backup runs at scale and uses two protection technologies. First, the S3 Inventory feature lists every object that has changed in the last 24 hours and is used as a delta for a backup run. Second, Amazon EventBridge captures all object changes in a 15-minute window, and Clumio takes a micro, or incremental, backup of them.

Kenney said: “We do an incremental backup of those object changes. And then every 24 hours, we will validate that all those object changes were correct. So we never miss objects. It’s kind of a dual validation system.” This provides a 15-minute RPO compared to Inventory’s 24-hour RPO.
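
Clumio has not published its implementation, but the AWS plumbing it describes can be sketched in two steps: turn on EventBridge notifications for the bucket, then create a rule matching object-level change events for a backup service to consume. A hedged boto3 example with hypothetical bucket and rule names:

```python
import json
import boto3

bucket = "example-data-lake-bucket"  # hypothetical

# 1. Publish every object-level change in the bucket to EventBridge.
boto3.client("s3").put_bucket_notification_configuration(
    Bucket=bucket,
    NotificationConfiguration={"EventBridgeConfiguration": {}},
)

# 2. Match creations and deletions in that bucket; a backup service would
#    attach its own target (a queue, Lambda function, etc.) to this rule.
boto3.client("events").put_rule(
    Name="s3-object-change-feed",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created", "Object Deleted"],
        "detail": {"bucket": {"name": [bucket]}},
    }),
)
```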

Clumio mentions a customer who found AWS Backup inadequate at its scale. Kenney told us: “AWS Backup was their [original] solution, but they had 18 billion objects in a single bucket.” 

Clumio currently claims to be able to back up 30 billion S3 objects and, Kenney tells us: “We’re now at 50 billion internally.” 

The takeaway here is that end users with data lake-scale S3 storage, meaning more than 3 billion objects per bucket, may need to look at third-party data protection services.

Only 50% of Israel’s startup unicorns are really unicorns: report

Half of Israel’s 2022 tech startup unicorns are no longer worth a billion dollars or more. That’s the conclusion of Viola, an Israeli private equity company, in its 2022 end-of-year report, which says an investment bubble has burst.

Viola’s report mentions storage unicorn suppliers, and adds that 21 new Israeli unicorns were minted in 2022:

Viola’s tech unicorns list in Israel, 2022

We can recognize four storage-related suppliers in the Infrastructure section at the top of the list: RedisLabs, VAST Data, Infinidat and OwnBackup.

Viola says there has been a “dramatic valuation correction” in 2022 and a decline in VC startup funding in Israel, particularly in rounds greater than $99 million. There were $8.3 billion worth of rounds in the second half of 2021, $4.5 billion in the first 2022 half, and a forecast of just $1.1 billion in the second 2022 half. This funding activity has been “negatively impacted by the public market’s performance.” This has occurred with a shift from a post-COVID high to a looming recession with companies missing top line targets and lowering their spending. There was less startup capital invested, fewer deals, longer inter-round periods and lower valuations, Viola says.

The report states: “We believe only 50 percent justify their unicorn status based on our estimation of their current revenue and offering.” As for new funding, it adds: “Late-stage companies will increasingly utilize debt” because the cost of VC financing is rising. Debt avoids significant shareholder dilution and prolongs the startup’s runway to profitability without setting a valuation price.

Could the four Israeli storage startup unicorns mentioned above no longer enjoy unicorn status? We think it unlikely because their 2022 growth and funding status was good:

  • Infinidat – a high-end array supplier with somewhat opaque funding details since its $95 million C-round in 2017. There was, Crunchbase says, a $45 million debt raise in 2018, followed by an unspecified venture round in 2019 and a D-round in 2020. This featured Goldman Sachs, TPG Growth, Claridge Israel, Moshe Yanai and ION Crossover Partners, but the amount has not been revealed, leaving us with a public total of $370 million raised. This may understate the true position as we think something around $40 million to $80 million was raised in the D-round.
  • OwnBackup – supplies Backup-as-a-Service to Salesforce, Microsoft Dynamics and ServiceNow customers. It raised $167.5 million in January 2021, and $240 million in August 2021, taking its total raised to $507 million-plus with a $3 billion-plus valuation.
  • RedisLabs – supports and sponsors the open source NoSQL Redis (Remote Dictionary Server) key-value database. It raised $60 million in 2019, $100 million in 2020 and $110 million in 2021, with $347 million raised in total.
  • VAST Data – scale-out high-end filer supplier which raised $40 million in 2019, $100 million in 2020 and $83 million in 2022, with $263 million in total funding. It no longer carries hardware on its own books, but it remains a hardware and software design house.

Our thinking is that these four suppliers have retained their unicorn status and could grow more in 2023.

Pure Storage on its predictions for 2023

B&F discussed predictions for 2023 with Ajay Singh, Pure Storage‘s Chief Product Officer, and challenged some of these, including tiering management, the economics of HDDs and more.

Pure Prediction – 1: 2023 will sound the last death knell of spinning-disk storage with the era of tiering, complexity, and forced customer compromises finally coming to an end. 

Blocks & Files: Why? What forced customer compromises? The hyperscalers use tiering so what’s wrong with it?

Ajay Singh

Ajay Singh: The benefits of flash for data storage are undeniable – it has until now been largely a question of cost, and of which customer workloads justify the additional cost of flash. To bridge that gap, customers have had to make tradeoffs, or manage tiering between flash and disk to save costs.

With the long-term decline of NAND costs, the additional benefits of QLC and advances like Pure Storage’s DirectFlash technology, we are within arm’s reach of crossing that cost-chasm and saving customers from having to make tradeoffs.

Blocks & Files: Also, Pure’s Evergreen//One has four block and two unified file and object tiers in its catalog. The block tiers have different performance levels. For example:

  • Capacity Tier – lower commitments, with a minimum of 200 TiB, and, for tier 2 workloads, a minimum entry point decreased by one third. Other tiers retain their minimum commitment of 50 TiB.
  • Faster Performance Tier to accelerate hybrid and multi-cloud environments.
  • Even faster Premium Tier to support specialised tier 1 workloads such as containers and test and dev applications.
  • The fastest Ultra Tier designed for in-memory databases.

So if Pure itself offers tiering, what’s wrong with it?

Ajay Singh: There’s always going to be a place for higher and lower performance workloads – but with our performance tiers, it’s an issue of optimizing and balancing compute performance for storage efficiency and capacity, not trading off disk vs. flash because of cost. With our Evergreen//One performance tiers, we are giving customers choice on how much compute processing to deploy to drive needed performance vs. how much to optimize for capacity, scale and power efficiency.

Pure Prediction – 2: The economics that long led companies to maintain lower-cost but slower spinning-disk storage no longer hold as NAND cost per bit continues to approach that of disk.

Blocks & Files: But flash is still more expensive than disk and there is no evidence that NAND cost/bit will equal disk cost/bit. Where are the numbers in cost/bit and TCO to justify your assertion?

Ajay Singh: Without getting into exact cost $’s (which we don’t disclose), the main drivers to close the gap on finished system costs between Pure’s arrays and a disk-based array mainly come down to: more effective data reduction, much greater density (without sacrificing performance), lower power, cooling and space costs, and much longer service lifetimes.

The parallelism of flash, combined with Purity operating environment software which was designed to exploit it, allows us to drive significantly more data reduction (dedupe and compression) than disk-based systems.  

Our DirectFlash technology allows us to ship much denser systems without sacrificing performance in the way that large HDDs or SSDs would, which results in being able to provide several times more raw storage behind the same amount of compute that other systems can.  

And then with the operating savings we give customers in lower power, cooling, and space costs, as well as reliability and service lifetimes – we can make flash very compelling compared to HDD-based systems on a TCO-basis.

Pure Prediction – 3: The workloads that dominate companies’ IT and strategic agendas are increasingly based on modern machine generated unstructured data which is incompatible with the spinning disk. 

Blocks & Files: Why is machine-generated data incompatible with spinning disk? It needs to be stored and disk storage is cheaper than flash storage. What is the problem?

Ajay Singh: Unstructured data tends to be produced and accessed in much more unpredictable and highly concurrent ways – mostly as a result of the applications interacting with the data. Structured data tends to be beholden to a single application, which drives one type of access pattern and moderate parallelism.  

Some forms of unstructured data, particularly those involved in analytics / data pipelines or technical computing, tend to be fanned in and out of larger scale-out applications / compute tiers, which results in highly parallel and less predictable data access patterns – something that is much more challenging for mechanical disk-based systems to manage.

Pure Prediction – 4: The pandemic has forced organizations to look at the human touch points associated with forklift upgrades, painful upgrades, and unplanned outages – and the need to eliminate them. Flash is better enabled by software and significantly more reliable. 

Blocks & Files: Prove it. High-availability disk drive arrays can have non-painful, non-forklift upgrades and can be free from outages. Such things are not inherently media-centric. Why is flash better enabled by software than disk and what is it better enabled for? What numbers are you using to justify the statement that flash is significantly more reliable than disk?

Ajay Singh: Pure’s flash is significantly more reliable than disk.  We look at industry data on annualized failure rates (AFR) of HDDs and SSDs, and compared to our own fleet reliability data (DirectFlash modules that Pure has shipped, is supporting, and is monitoring through phone home telemetry), our flash modules are significantly more reliable as measured by our own annualized return rates.  

This advantage compared to HDDs, is largely from avoiding environmental or mechanical failures, and compared to SSDs is largely from far simpler firmware in the drives (due to DirectFlash software), and much better media endurance due to Purity’s flash awareness and optimizing for P/E cycles.

Pure Prediction – 5: Truly elastic “as-a-service consumption” is delivering the agility that organizations need as they evolve to distributed cloud architectures. Flash is more agile and efficient.

Blocks & Files: Are you saying that disk storage cannot be used in an elastic “as-a-service consumption” way? Why not? The hyperscalers use disk storage in an as-a-service way so they find it agile and efficient. What numbers are you using to justify this claim? Which hyperscalers have abandoned disk storage?

Ajay Singh: This is more a comment on the as-a-Service experience built on enterprise arrays – hyperscalers have built their services on significantly different infrastructure (SW and HW) than the typical enterprise has or needs.  

Within the context of enterprise-class systems, Pure’s flash-based systems are able to serve a wide range of workloads, over a wide range of performance and capacity points, with a single set of HW building blocks, a largely shared SW codebase, and 2 core architectures.  This is contrasted against many other systems needing to be more specifically configured/tuned for different workload types, making it much more difficult to deliver a truly seamless “as-a-Service” experience based on it.

Enfabrica intros hyper-distributed data moving ASIC

Startup Enfabrica is developing a technology it claims will be “able to scale across every form of distributed system – across cloud, edge, enterprise, 5G/6G, and automotive infrastructure – and be adaptable to however these workloads evolve over time.”

Rochan Sankar

Enfabrica was started in September 2019 by CEO Rochan Sankar and chief development officer Shrijeet Mukherjee. Sankar was previously a senior director of product management and marketing at Broadcom. Mukherjee has an engineering background in Silicon Graphics, Cisco, Cumulus Networks (acquired by Nvidia) and Google, where he was involved in networking platforms and architecture.

The company has had an initial $50 million funding round from Sutter Hill Ventures.

The initial product is a server fabric interconnect ASIC, a massively distributed IO chip. It will be part of an integrated hardware and software stack to interconnect compute, storage and networking to get applications executing in servers faster. This sounds like a Data Processing Unit (DPU) or SmartNIC device.

Blocks & Files diagram illustrating Enfabrica’s hyper-distributed server fabric scheme

The envisaged workload is to interconnect hyperscalers’ scaled-out and hyper-distributed servers running AI and machine learning applications, and the scale of this workload is increasing at a high rate. In an October 2022 blog, Sankar said: “AI/ML computational problem sizes will likely continue to grow 8 to 275-fold every 24 months.”

Another blog by Sankar and Alan Weckel, principal analyst at 650 Group, says: “An architectural shift is needed because the growth in large-scale AI and accelerated compute clusters – to serve evolving workloads as part of, for instance, AdSense, iCloud, YouTube, TikTok, Autopilot, Metaverse and OpenAI – creates scaling problems for multiple layers of interconnect in the datacenter.”

Shrijeet Mukherjee

The need is to get data into the servers faster so that overall runtimes decrease. In Sankar’s view: “The choke point occurs at the movement of enormous amounts of data and metadata between distributed computing elements. This issue applies equally to compute nodes, memory capacity that must be accessed, movement between compute and storage and even movement between various forms of compute, whether it is CPUs, GPUs, purpose-built accelerators, field-programmable gate arrays, and the like.”

Enfabrica claims: “The key to scaling distributed compute performance and capacity rests in the ability to expand, optimize and scale the performance and capacity of the I/O that exists within servers, within every data center compute rack and across compute clusters.” It’s talking about developing a new distributed computing fabric across many, many nodes in a warehouse-scale computing system.

It says: “Innovations and technology transitions in interconnect silicon will drive substantially higher system I/O bandwidth per unit of power and per unit of cost to enable volume and generational scaling of AI and accelerated computing clusters.

“These technology transitions apply to multiple levels of data center interconnect: network, server I/O and chip-to-chip; multiple layers of the interconnect stack: physical, logical and protocol; and multiple resource types being interconnected: CPUs, GPUs, DRAM, ASIC and FPGA accelerators and flash storage.”

Its view is that “further disaggregation and refactoring of interconnect silicon will deliver greater performance and efficiencies at scale, in turn creating new system and product categories adopted by the market.”

This disaggregation refers to the “functions of network interface controllers (NICs), data processing units (DPUs), network switches, hostbus switches and I/O controllers and fabrics for memory, storage, GPUs and the like.”

Standardization with CXL, Ethernet and PCIe is needed if new technology is to be successfully adopted. Sankar and Weckel say: “Enfabrica is building interconnect silicon, software and systems designed to these same characteristics to provide best-in-class foundational fabrics – from chiplet, to server, to rack, to cluster scale computing.”

Enfabrica wants its SFA universal data moving ASIC and fabric to replace SmartNICs and DPUs by providing a much faster interconnect fabric than any SmartNIC/DPU combination.

Comment

This startup, somewhat like the Microsoft-acquired Fungible, wants to replace existing interconnect schemes in warehouse-scale, hyper-distributed datacenters. The number of potential datacenter customers will be relatively small and Enfabrica will have to develop its product in conjunction with potential customers to be certain its technology will match market needs and deliver the benefits it promises. We think it is some years away from general availability.

Storage news ticker – January 5

Steve Duplessie

The co-founding analyst at ESG, Steve Duplessie, has retired, becoming a professional golfer at the Ibis Golf and Country Club in Palm Beach, Florida. He has been a fixture in the storage analyst space for more than two decades since starting up the Enterprise Strategy Group in January 1999. Before that he was an EMC sales rep, then started Invincible Technologies Corp in 1993 to sell storage and allied hardware and software. ESG became a prominent, if not the leading, storage analysis and research consultancy, producing highly respected validation reports for suppliers’ products and becoming a strategic advisor to leading storage suppliers. ESG was bought by TechTarget a year ago.

Folio Photonics is demonstrating its Folio optical disk technology at CES 2023. Folio disks are multi-layered and can hold between 500GB and 1TB of data with 4 to 8 layers.  Theoretically it can be developed to 2TB to 4TB per disk using 32 layers, with multi-disk cartridges holding even more data. The premise is to have tape-class storage density with optical disk random access speed.

HPE is selling its HPC Holdings interest in H3C Technologies to China’s Unisplendour, via its Hong Kong-based Unisplendour International Technology subsidiary, according to a December 30 SEC 8-K filing. HPE and its Izar Holding Co subsidiary own 49 percent of H3C, which has the exclusive right to sell HPE products in China. H3C was formed by HPE and Tsinghua Holdings subsidiary Unisplendour in 2015 to enable HPE to sell products in China. The deal could net HPE between $3.5 billion and $4.0 billion, which could be used for stock buybacks and/or acquisitions. The actual amount will be based on a per-share price of 15 times H3C’s post-tax profit for the year ending April 30, 2022, divided by the total number of H3C shares outstanding. HPE should reveal the price at the end of the month.

Stephen Bates

Huawei has appointed Stephen Bates, formerly CTO at computational storage supplier Eideticom, as its Technical VP and Chief Architect Emerging Storage Systems. He will lead a Canadian team researching all aspects of emerging storage systems from media (NAND, QLC, PLC, PM and other memory tech) through protocols and standards (NVMe, CXL, UCIe etc.) to system software (OSes, file systems, hypervisors and containers) and applications (databases, analytics, AI etc.).

SSD controller supplier Phison is demonstrating the company’s latest PS5026-E26 PCIe Gen 5 SSD at CES 2023. Its Gen 5 X Series SSD controller doubles the bandwidth and improves latency by 30 percent compared to its PCIe gen 4 predecessor, surpassing 14GBps sequential and 3.2 million IOPS random performance. It is also previewing its latest Gen5 X Series SSD enterprise controllers, which can provide twice the performance per watt in comparison to the previous X1 generation. It’s also showing its PS7201 Retimer and PX7101 Redriver supporting Gen 5 PCIe.

Phison SSD storage

App development and infrastructure software supplier Progress plans to buy MarkLogic, which supplies a unified multi-model NoSQL data platform encompassing data and semantic metadata. The price is expected to be $355 million and should complete in the next few months. When closed, the acquisition is expected to add more than $100 million in revenue and strong cash flows to Progress.

Seagate is partnering with OSNEXUS to sell Lyve Cloud hosted storage (Exos Corvault) operated by OSNEXUS’s QuantaStor scale-up/scale-out unified file, block and object storage software. Lyve Cloud is a managed storage service, hosted in Equinix colocation centers, that uses Exos Corvault disk drive enclosures and Exos AP 2U24 chassis for flash storage and compute. It also provides Zadara-based services. Seagate has already partnered with MinIO for object storage software for Lyve Cloud and Hammerspace for its Global Data Environment. Download a solution brief here.

….

Verge.io, which supplies virtual datacenters, says it is hiring a first-rate, well known, experienced industry veteran as its CMO next week.

Live data replicator WANdisco has joined the Scalable Open Architecture for Embedded Edge (SOAFEE) organization, which wants to build an open, standardized cloud-native architecture for automotive innovation and design. WANdisco will support the development of software that enables an array of automotive applications in software-defined vehicles (SDV). SOAFEE includes enterprise companies from across the automotive, semiconductor and cloud industries, and is governed by AWS, Bosch and Arm, among others.

Kioxia-WD merger negotiations reportedly restart

Bloomberg reports Kioxia and Western Digital are discussing a merger in renewed talks that could create a force in the NAND industry to rival Samsung.

Kioxia and Western Digital operate a joint-venture producing NAND chips at foundries in Japan, and previously discussed a merger last year.

Kioxia was formed by Toshiba selling 60 percent of its NAND foundry and SSD business to a Bain-led consortium for $18 billion in 2017. This was a way of recouping losses incurred by Toshiba’s Westinghouse nuclear power station building business, losses severe enough to make a Toshiba bankruptcy possible.

Kioxia, which inherited Toshiba’s share in the Western Digital JV,  has since prospered, building new fabs through the JV, and considered an IPO, which would provide Toshiba with cash, but the plan was put on hold in October last year due to a NAND oversupply depressing prices and hence revenues.

The JV arrangement is that Kioxia owns the foundries and that it and WD each buy half of the fab’s output at a small markup on the production cost. Kioxia has its own NAND production capacity outside the JV and Wells Fargo analyst Aaron Rakers estimates this is equivalent to about 20 percent of the JV’s output.

David Goeckeler

Western Digital has two main business units: hard disk drives (HDDs) and SSDs, the latter built using the JV’s NAND chips. Activist investor Elliott Management has a stake in WD and is campaigning for a separation of the two businesses, claiming that the valuation of the two separate parts would be higher than WD’s present valuation in stock price terms. WD’s board and CEO David Goeckeler are presently engaged in a strategic review of the business as a result of Elliott Management’s engagement.

Were Kioxia and WD to merge, the two would have a NAND market share larger than that of current market leader Samsung.

TrendForce NAND revenue market share numbers

Research house TrendForce estimated Samsung’s NAND market revenue share at 31.4 percent in the second half of 2022, with Kioxia at 20.6 percent and WD at 12.6 percent. Combine the two and we have 33.2 percent. SK hynix would be third with 18.5 percent and Micron fourth with 12.3 percent, possibly prompting an alignment between those two to gain manufacturing economies of scale.

A merged Kioxia-WD NAND business would most probably have a higher valuation than WD’s current NAND business unit, and possibly even higher than WD’s total current valuation of $10.5 billion. Such an outcome would please Elliott Management. Spinning off a separate WD-Kioxia NAND business could provide Elliott and other WD shareholders with an increased valuation of their combined stock in WD’s HDD business and the enlarged NAND/SSD business. WD’s shares rose 5.2 percent to $33.05 on the NYSE when news of the merger talks was published by Bloomberg. They rose another 8 percent in after-hours trading.

Analysts such as Rakers think that WD would need outside financial help to complete a merger with Kioxia. An obvious partner here could be Bain and its Kioxia ownership consortium, which could see shares in a combined Kioxia-WD entity being worth more than their shares in Kioxia alone.

WD’s HDD business is currently operating in a depressed and oversupplied market and the company reported an overall 26 percent revenue decline in its third calendar 2022 quarter. It is expecting a 38 percent year-on-year revenue decline at the mid-point of its fourth quarter guidance, down to $3 billion.

Although the NAND market, like the HDD market, is also over-supplied, both are expected to recover as the world’s demand for storing and accessing data continues unabated.

Bootnote

Both Kioxia and Western Digital declined to comment on Bloomberg’s merger talks report.

Opinion: Online disk archives are just wrong

A question: what’s the difference between nearline disk storage and an active archive system only using disk drives? The answer is none.

The Cambridge Dictionary defines the word archive thus: “A computer file used to store electronic information or documents that you no longer need to use regularly.”

In that case it no longer needs to be stored on disk drives offering continuous access.

Active Archive Alliance

The Active Archive Alliance (AAA) organization definition of an active archive says: “Active archives enable reliable, online and cost-effective access to data throughout its life and are compatible with flash, disk, tape, or cloud as well as file, block or object storage systems. They help move data to the appropriate storage tiers to minimize cost while maintaining ease of user accessibility… Creating an active archive is a way to offload Tier 1 storage and free up valuable space on expensive primary storage and still store all of an organization’s data online.”

In other words, an active archive covers non-primary data, meaning secondary (nearline) and tertiary (offline) data with no mention of online media being restricted to caching. The AAA is saying you can have an online archive.

Its version of the four-tier storage model omits the media types from all the tiers, and contains a deep archive sub-class:

Active Archive Alliance 4-tier storage model as shown in the Storage Newsletter

This opens the door to online archives, such as products from disk drive maker and AAA member/sponsor Seagate.

Seagate Enterprise Archive Systems

Seagate describes enterprise data archives as “storage systems or platforms for storing organizational data that are rarely used or accessed, but are nevertheless important. This may include financial records, internal communications, blueprints, designs, memos, meeting notes, customer information, and other files that the organization may need later.”

The “early enterprise data archives were mostly paper records kept in designated storage units… More recently, organizations are moving their data archives to cloud-based solutions. Cloud-based solutions make data archives more accessible and reduce the associated costs.” 

Cloud-based solutions include on-premises object storage disk-based systems using Cloudian, Scality or other object storage software and Seagate Exos disk drive enclosures, or Seagate’s Lyve Cloud managed disk array service.

There is no concept here of the disks caching data in front of a library of offline tape or optical disk cartridges. Analyst Fred Moore of Horison Information Strategies has a different view.

Horison view of archive

He explains what an archive means in a “Building the Archive of the Future” paper sponsored by Quantum. Unlike backup, which is making copies of data so that the copy can be restored if the original is lost or damaged, an archive is a version of the original data from which parts can be retrieved, not restored.

This definition, with the restore vs retrieval keystone, is the one used by W Curtis Preston in his Modern Data Protection book published in 2021.

Modern Data Protection by W Curtis Preston talks about archive storage

The moving of data to archival storage frees up capacity on the primary storage location and takes advantage of cheaper and higher-capacity long-term storage with slower access times, such as tape or optical disk. Moore says there are two kinds of archive; an active archive composed from offline tape and online disk drives, and a longer-term or deep archive composed just from offline storage.

An archive is also defined by its use of specific software: object storage software that scales out and geo-spreads unstructured and object data to manage and protect archival storage. It includes smart data movers, data classification and metadata capabilities.

Moore says: “A commonly stated objective for many data center managers today is that ‘if data isn’t used, it shouldn’t consume energy.‘” This clearly places tape as the greenest storage solution available. He suggests: “Between 60 and 80 percent of all data is archival and much of it is stored in the wrong place, on HDDs and totals 4.5-6ZB of stored archival data by 2025 making archive the largest classification category.” Note that thought: “Stored in the wrong place… on HDDs.”

Fred Moore’s four-tier storage diagram

His point is clear: disk storage is the wrong medium for an archive. What role then does disk play in the active archive tier? Moore says: “An active archive implementation provides faster access to archival data by using HDDs or SSDs as a cache front-end for a robotic tape library. The larger the archive becomes, the more benefit an active archive provides.”

In Moore’s view, online media, disk or NAND, is a cache in front of a tape library, not a storage archive tier in its own right. That’s quite different from the Active Archive Alliance viewpoint.

Online archives and nearline storage

The AAA’s active archive definition is confusing as it includes both online and offline media. For Moore, an archive is inherently offline.

An archive in the traditional sense should not be built on storage systems whose media is constantly in motion, as spinning disk is: it uses too much electricity, and archive data access needs generally don’t require continuously available access. An archive should be based on offline media only, with a front-end online cache for active archives.

To my mind there needs to be a strong distinction between offline and online archive media because the energy consumption and access characteristics are so different. Letting online disk into the same category of system as offline media is like letting a carbon-emitting fox into an environmentally green hen house. Calling disk-based storage systems active archives is a misnomer. They should be regarded as nearline object storage systems.

Some Active Archive Alliance members appear to agree. In an August 2022 blog, IBM’s Shawn Brume, Tape Evangelist and Strategist, said: “In a study conducted by IBM in 2022 that utilized publicly available data, a comparison of large-scale digital data storage deployments demonstrated that a large scale 10 petabyte Open Compute Project (OCP) Bryce Canyon HDD storage had 5.1 times greater CO2e impact than a comparable enterprise tape storage solution.”

Brume blog graphic. Tape is far more environmentally friendly than disk

“This was based on a ten-year data retention lifecycle using modern storage methodologies. The energy consumption of HDD over the life cycle along with the need to refresh the entire environment at Year 5 drives a significant portion of CO2 emissions. While the embedded carbon footprint is 93 percent lower with tape infrastructure compared to the HDD infrastructure.”

Brume goes on to include the AAA’s four-class tiered storage diagram in his blog, which distinguishes between active archives and archives, which have the deep archive sub-class.

Seagate and spin-down

You could theoretically have a disk-based archive system if it used spin-down disks. This was tried by Copan with its MAID (Massive Array of Idle Disks) design back in the 2002-2009 period, and revisited by SGI in 2010. It’s not been successful, though.

Disk drive manufacturer Seagate actually produces spin-down disk systems. Its Lyve Mobile array is a “portable, rackable solution [that] easily integrates into any data management workflow. Get versatile, high-capacity and high-performance data transfers. With industry-standard AES 256-bit hardware encryption and key management in a rugged, lockable transport case.” The disk drives are not spinning when the transport case is being transported.

In theory, then, it could develop a spin-down Exos or Corvault disk enclosure, and its attempts to present itself as lowering the lifetime carbon emissions of its products would then have stronger substance.

Linbit pushes DRBD-based software as SAN replacement

Savvy Linux users can provide SAN services to physical, virtual and containerized applications without buying a SAN product by using LINBIT open source software based on DRBD, which is included in the Linux kernel.

LINBIT provides support for DRBD in a similar way to how Red Hat supports its Linux distro, and has four downloadable products on its website: a DRBD Linux kernel driver, a DRBD Windows driver, the LINSTOR cluster-wide volume manager and, in tech preview form, LINBIT VSAN for VMware. The core product is DRBD, the Distributed Replicated Block Device for Linux.

Philipp Reisner

Unassuming LINBIT founder and CEO Philipp Reisner describes the Vienna-based company, with its €4.5 million annual revenues, as “a small money machine.” 

He was talking to an IT Press Tour in Lisbon, describing his products and their place in the Linux world.

The company is privately owned and not VC-funded. Reisner started up the company in the early Linux open-source days. The company has 35 to 40 full time employees in Europe, mostly in Vienna, with 30 other employees in the US. It works with SIOS in the Japanese market.

Its software is based on Linux kernel technology which Reisner devised as part of his 2001 diploma thesis at Vienna’s Technical University. That has become DRBD and it has been included in the Linux kernel since version 2.6.33 (2009), been deployed on all major Linux distributions and is hardware and software agnostic.

The basic idea seems simple now, looking back 21 years later. Everything happens in the kernel data path, reducing the number of context switches and minimizing block IO latency.

A primary compute node with local storage issues a block write to that storage. A copy of the write is sent (replicated) to a connected secondary node, written there as well, and an ack sent back to the primary; data safely replicated, job done.

LINBIT supports DRBD primary and secondary consistency groups with replication between them. There can be up to 32 replicas, and each may be asynchronous or synchronous. All participating machines – there can be from three to thousands of participating nodes – have full replicas.

Linbit DRBD diagram

LINBIT has developed DRBD into software-defined storage (SDS), including persistent volumes for containers, and says DRBD can be used for transaction processing with Oracle, PostgreSQL and MariaDB, virtualization with OpenStack, OpenNebula and others, and for analytics processing in data warehouses and other read-intensive workloads. LINBIT SDS customers include Intel, Porsche, Cisco, Barkman, IBM, Sphinx, Kapsch and BDO Bank.

It recently added support for persistent memory and nonvolatile DIMM metadata, and improved fine-grain locking for parallel workloads. More performance optimizations are on its roadmap, as is a production release of WinDRBD – a port to Windows of the Linux DRBD code for highly available replicated disk drives. The V1 non-production release took place in the first 2021 quarter.

Other roadmap items are support for the public cloud with storage drivers for AWS EBS, Azure disks and Google Persistent Disks.

The world of LINBIT is a universe away from the aggressive code-wrangling characteristic of Silicon Valley startups. It is a tribute to single-minded concentration on a core Linux technology, and the complete opposite of VC-funded, money-focused growth aggression.

Object storage: Coldago moves Quantum up and Cloudian down in the rankings

Research firm Coldago has released its 2022 object storage map, moving Quantum into the Leaders area from last year’s Challengers ranking, and giving Cloudian a lower Execution and Capabilities score than in 2021.

The Coldago Research Map 2022 for Object Storage lists and ranks 15 vendors playing in the object storage market segment. They are, in alphabetic order: Cloudian, Cohesity, Commvault, DataCore, Dell, Hitachi Vantara, IBM, MinIO, NetApp, Pure Storage, Quantum, Scality, Spectra Logic, VAST Data and WEKA. There are seven leaders: Cloudian, Hitachi Vantara, IBM, MinIO, NetApp, Pure Storage and Quantum, in alphabetic order. 

DDN, Huawei and Fujitsu, which all appeared in Coldago’s 2021 object storage map, disappeared from the 2022 supplier rankings while WEKA is a new entrant.

We asked Coldago analyst Philippe Nicolas questions about this year’s map.

Blocks & Files:  Could you describe the basic Map concept please?

Philippe Nicolas

Philippe Nicolas: A Coldago Map has essentially 2 dimensions: one – Vision and Strategy, and two – Execution and Capabilities. A leader is recognized by its position to the right but also to the top. Coldago doesn’t only compare products … but analyzes companies playing in a domain. This includes company profile, business and strategy, products, solutions and technologies. I recognized that the product part occupied a large piece and our features matrix uses more than 50 criteria or attributes.

Blocks & Files: What are your data sources?

Philippe Nicolas: We have collected lots of data during this period from end-user, partner and vendor meetings, watching company and product announcements, case studies, integrations and partnerships. This information could be positive, like the number of deployments, large configurations and application integrations, or negative, like product un-installations, data integrity issues and sales behaviors. Of course negative info could have serious impacts. All these data are then digested, segmented, grouped and normalized to make comparison and ranking easier.

Blocks & Files: Talk us through some of the vendor highlights.

Philippe Nicolas: Pure Storage and VAST Data had an excellent year in many aspects: product releases and features, deployments, revenues and market penetration. MinIO and Cloudian are leaders like last year, even if their positions changed a little based on their 12 months’ operations.

Scality is still a serious challenger and its position reflects its 12 months’ activity. Relatively, Scality has vertically a better position than Cloudian, but it belongs in the challenger partition.

For MinIO, the company continues to confirm their ubiquitous presence with tons of integrations, large adoption and it has moved up vertically. Again, Cloudian is a leader which is not the case for DataCore and Cohesity. DataCore delivered a better job than Caringo – which it acquired in January 2021. 

Cohesity had clearly some positive aspects on the execution side with strong channels, and a comprehensive product with multiple deployment flavors. WEKA is added this year, entering for the first time, as it provides an S3 interface like all others, but it belongs in the specialists’ section. We’ll carefully monitor WEKA’s S3 activity in 2023.

Blocks & Files: Any comments about the other suppliers?

Philippe Nicolas: The historical players, many of them, missed some opportunities and clearly lost some visibility except Hitachi Vantara or MinIO. Also, Quantum has made an interesting move with a product established for years and several developments; it now belongs in the leader’s section, thanks to an improved activity in many aspects during the 12 months. NetApp and IBM, Red Hat included, also occupied a leader’s position.

Blocks & Files: What has changed in the object storage market since last year?

Philippe Nicolas: What is important to realize is the shift in the market for a few years. Being internally an object storage with the few characteristics we know like flat address space… is no longer so important. In other words, the Map is not about object storage purists but more about the market reality. On the users’ side, they care about a S3 interface with a high degree of compatibility with the Amazon API but less on other aspects. They’re looking for ease of deployment, maintenance and support, scalability, capacity and a few other core features of course, especially in the data protection area.

Storage news ticker – 3 January 2023

Storage news

Acronis‘ latest cyberthreats and trends report for the second half of 2022 has found that phishing and the use of MFA (Multi-Factor Authentication) fatigue attacks are on the rise. The report provides an in-depth analysis of the security landscape including ransomware threats (number 1), phishing, malicious websites, software vulnerabilities and a security forecast for 2023. Threats from phishing and malicious emails have increased by 60%, and the average cost of a data breach is expected to reach $5 million in 2023. Download a copy of the full Acronis End-of-Year Cyberthreats Report 2022 here.

AWS has announced Amazon S3 Block Public Access that works both at the account and bucket level, and enables an AWS customer to block public access to existing and future buckets. This feature can be accessed from the S3 Console, the CLI, S3 APIs, and from within CloudFormation templates. More info is here.
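As an illustration only (not code from AWS's announcement), here is a minimal boto3 sketch of enabling all four public-access blocks for a single bucket and, via the S3 Control API, for an entire account; the bucket name and account ID are placeholders.

import boto3

# Block public access on one existing bucket (bucket name is a placeholder).
s3 = boto3.client("s3")
s3.put_public_access_block(
    Bucket="example-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

# The same setting can be applied account-wide through the S3 Control API
# (the account ID below is a placeholder).
s3control = boto3.client("s3control")
s3control.put_public_access_block(
    AccountId="123456789012",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

Once these blocks are in place, attempts to attach public ACLs or public bucket policies are rejected regardless of the caller's IAM permissions.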

Druva has replaced Code42 at Alector, a San Francisco startup aiming to slow the progression of neurodegenerative diseases like Alzheimer’s and Parkinson’s. Druva’s case study claims Code42 wasn’t meeting Alector’s demands and plagued the IT team with tickets. Alector also suffered from data sprawl, with data residing across a variety of environments, including the data center and Microsoft 365. This made it difficult to track the status and location of workloads and their security, we’re told. Here is the case study.

Molly Presley, Hammerspace’s SVP of Marketing, predicts 2023 will be the year distributed organizations can realize the value and insights of unstructured data faster and more efficiently. Unstructured data will – she reckons – be able to have the same unified access, management, utilization, and rationalization as structured data thanks to unified data management with a unifying metadata control plane and automated data orchestration. Trends for the year include:

  • IT supply chain challenges will compel new approaches to data management.
  • To access sufficient compute resources, organizations need the ability to automate Burst to the Cloud.
  • Access to software engineering talent must be possible from anywhere in the world.
  • Edge will no longer be used only for data capture but also for data use. 
  • The use of software-defined and open-source technologies will intensify.
  • Metadata will be recognized as the holy grail of data orchestration, utilization, and management.
  • A shift away from hardware-centric infrastructures toward data-driven architectures.
  • Data Architects will be the upcoming King of the IT Jungle. 
  • True storage performance that spans across all storage tiers.

… 

IBM has produced Spectrum Scale Container Native version 5.1.6.0. Spectrum Scale in containers allows the cluster file system to be deployed in a Red Hat OpenShift cluster. Using a remotely mounted file system, it provides a persistent data store that applications access through a CSI driver using Persistent Volumes (PVs). The project contains a Golang-based operator to run and manage the deployment of an IBM Spectrum Scale container native cluster.

Containerised Spectrum Scale.
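
For illustration, a minimal sketch of how an application might request storage from such a cluster through a Persistent Volume Claim, using the Kubernetes Python client; the storage class name is a hypothetical placeholder, not taken from IBM's documentation.

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod

# Request a shared, CSI-provisioned volume; "spectrum-scale-fs" is a
# hypothetical storage class name used purely for illustration.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="scale-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],  # cluster file systems allow shared access
        storage_class_name="spectrum-scale-fs",
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)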

High-end array supplier Infinidat has hired Dave Nicholson as a new Americas Field CTO to replace the retiring Ken Steinhardt. Nicholson’s experience includes being a member of the Wikibon/Silicon Angle/theCube storage and IT analyst firm, GM for Cloud Business Development at Virustream, VP & CTO of the Cloud Business Group at Oracle and Chief Strategist for the Emerging Technology Products Division at EMC. He has roughly 25 years of experience in enterprise storage.

Unified data platform supplier MarkLogic’s latest release, MarkLogic 11, includes:

  • Geospatial analysis – a more flexible model for indexing and querying geospatial data, scalable export of large geospatial result sets, and interoperability with GIS tools; includes support for OpenGIS and GeoSPARQL
  • Queries at scale – improved support for large analytic, reporting, and/or export queries with external sort and joins
  • Unified Optic API for reads and writes – write and update documents with the Optic API without having to write server-side code
  • BI analysis – use GraphQL to expose multi-model data to BI tooling through an industry-standard query language
  • Docker and Kubernetes support – deploy MarkLogic clusters in cloud-neutral, containerized environments following best practices

Effective January 17, Micron has promoted Mark Montierth to corporate VP and GM of its Mobile Business Unit. He is currently VP and GM of the high-bandwidth and graphics memory product lines in Micron’s Compute and Networking Business Unit. Raj Talluri is still listed on LinkedIn as SVP and GM of Micron’s Mobile BU, but he is leaving to pursue another opportunity, we’re told.

Amazon FSx for NetApp ONTAP now has FedRAMP Moderate authorization in US East (N. Virginia), US East (Ohio), US West (N. California), and US West (Oregon), and FedRAMP High authorization in AWS GovCloud (US) Regions. Additionally, Amazon FSx for NetApp ONTAP is now authorized for Department of Defense Cloud Computing Security Requirements Guide Impact Levels 2, 4, and 5 (DoD SRG IL2, IL4, and IL5) in the AWS GovCloud (US) Regions. NetApp said that, with this announcement, agencies at all levels of government can move data workloads to the AWS Cloud.

Germany’s Diakonie in Südwestfalen GmbH, which stores medical data from surgical robots, X-ray PACS and mammography screening, has countered rising storage HW (all-disk) costs by introducing automated archiving on tape with PoINT Storage Manager. This has a two-tier storage architecture with primary (disk) and archive (tape) storage tiers and automated transfer from the former to the latter. German-language case study here.

Diakonie in Südwestfalen PoINT system diagram.

Digital insurance provider Allianz Direct is using Rockset’s cloud-native, Kafka-based technology to deliver real-time pricing. Its pricing algorithm incorporates over 800 factors, and adapting these rating factors to pricing models would previously have taken weeks. Rockset’s schemaless ingest and fast SQL queries allow Allianz Direct to introduce new risk factors into its models to increase pricing accuracy in one to two days, we’re told. Rockset’s native connector for Confluent Cloud enables Allianz Direct to index any new streaming data with an end-to-end data latency of two seconds, Rockset said. Allianz Direct also uses Rockset to power real-time analytics for customer views and fraud management.

Taiwan’s Digitimes reports that Samsung has raised its NAND prices by as much as 10%. This follows Apple terminating a NAND purchase deal with YMTC and YMTC getting placed on the US Entity list.

Samsung announced the development of its 16-Gbit DDR5 DRAM built using the industry’s first 12nm-class process technology, and the completion of product evaluation for compatibility with AMD. The new DRAM features the industry’s highest die density, Samsung said, which enables a 20% gain in wafer productivity. Its speed is up to 7.2Gbit/sec and it consumes up to 23% less power than the previous Samsung DRAM. Mass production is set to begin in 2023. The new DRAM features a high-κ material that increases cell capacitance, and proprietary design technology that improves critical circuit characteristics. It is built using multi-layer extreme ultraviolet (EUV) lithography.

Samsung 12nm DDR5 DRAM chips.

SK hynix is showcasing its new PS1010 SSD at CES in January. The product was first unveiled at the October 2022 OCP Summit. It is an E3.S format product, uses 176-layer 3D NAND and has a PCIe gen 5 interface, making it, SK hynix claimed, 130% faster at reading, 49% faster at writing and 75% better in performance per watt than the previous generation. The company is also showing CXL memory, GDDR6-AiM memory and HBM3 memory.

The 2020-era PE8010 uses 96-layer TLC NAND with a PCIe gen 4 interface and a 1TB to 8TB capacity range. It delivered random reads of 1,100,000 IOPS, random writes of 320,000 IOPS, sequential reads of 6,500MB/sec and sequential writes of 3,700MB/sec.

Research house Trendfocus has produced a native tape capacity ship table covering 2017 to 2021, with a forecast out to 2027. We charted the numbers, reported by the Storage Newsletter, to show the annual exabyte shipments and year-on-year percentage changes.

The 2021 percent change peak was due to the late arrival of the 18TB (raw) LTO-9 format in 2020. The 2022 to 2027 capacity ship CAGR is said to be 21 percent. Trendfocus sees an economic recovery in 2024 lifting the capacity ship growth rate. Over 80 percent of the shipments will be LTO-format tapes, with IBM 3592 format following. Tape still reigns supreme for archive data storage.
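
To put that growth rate in perspective, here is a quick back-of-the-envelope compounding calculation; the 2022 base figure is a made-up placeholder, not a Trendfocus number.

# Compound a 21 percent CAGR from a placeholder 2022 base of 100EB shipped.
base_eb = 100.0
cagr = 0.21
for year in range(2022, 2028):
    shipped = base_eb * (1 + cagr) ** (year - 2022)
    print(f"{year}: {shipped:.1f} EB")
# Five years of 21 percent annual growth multiplies shipments by about 2.6x.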

TrendForce projects that the client SSD attach rate for notebook computers will reach 92% in 2022 and around 96% in 2023. The demand surge related to the pandemic is subsiding, and recent headwinds in the global economy have slowed sales in the wider consumer electronics market. As such, client SSDs are going to experience a significant demand slowdown, which, in turn, will constrain demand bit growth. TrendForce projects that, for the period from 2022 to 2025, the year-on-year growth rate of NAND Flash demand bits will remain below 30%. Eventually, enterprise SSDs will take over from client SSDs as the major driver of demand bit growth in the global NAND Flash market.

DataOps observability platform startup Unravel Data has confirmed that David Blayney has joined as Regional VP, Europe, the Middle East, and Africa (EMEA). Unravel raised a $50 million Series D round of funding in September led by Third Point Ventures, with participation from Bridge Bank and existing investors that include Menlo Ventures, Point72 Ventures, GGV Capital, and Harmony Capital, bringing the total amount of funding raised to $107 million.

WANdisco has announced a string of contract wins. It has signed an initial agreement worth $12.7m with a European-based global automotive manufacturer, under which IoT data in the client’s data center will be migrated to the cloud. This is a one-off migration. WANdisco has also signed a commit-to-consume agreement worth $31m with a second tier-1 global telco and IoT app supplier. Half of the $31m will be paid in advance following the commencement of the project. WANdisco now expects FY22 revenues will be significantly ahead of market expectations, at no less than $19m. Bookings for FY22 are expected to be in excess of $116m.