Analysis: With Dell’s PowerScale scale-out filer being included in the company’s generative AI announcement this week, we looked at how it and other storage systems compare when using Nvidia’s GPUDirect protocol to serve data to and from Nvidia’s GPUs.
Update. Pure Storage FlashBlade GDS readiness clarified. 1 Aug 2023.
GPUDirect Storage (GDS) is a CPU-bypass protocol for NVMe storage that enables storage systems to send data to SuperPod and other Nvidia GPU servers as fast as possible. For reads, traditional storage has a server’s CPU copying data from its storage resource – DAS, SAN or NAS – into a memory buffer, and then writing it out to the network interface device for onward transmission. The data bounces, as it were, from the storage to a memory buffer before going to its destination. Incoming data (writes) follows the same route in reverse. GDS cuts out this bounce-buffer stage and sends data directly from storage to the destination GPU system’s memory.
It is supported by Dell, DDN, IBM, NetApp (ONTAP and E-Series BeeGFS), Pure Storage, VAST Data and WekaIO. Pure told us: “Pure Storage’s FlashBlade hardware portfolio is GPU Direct Storage (GDS) ready, with software enhancements delivering complete GDS support to be available in the near term, further strengthening Pure’s collaboration with NVIDIA and enhancing the AIRI//S solution.”
GPUDirect is not the only method of sending data between NVIDIA GPUs and storage systems but it is reckoned to be the fastest.
Dell yesterday announced a Validated Design for Generative AI with Nvidia, featuring compute, storage and networking. The storage products included are PowerScale, ECS, and ObjectScale, with Dell’s PowerEdge servers providing the compute.
PowerScale is scale-out filer storage, the rebranded Isilon, while ECS is object storage and ObjectScale is containerized object storage based on ECS. The PowerScale F600 was noted as being GPUDirect-compatible in 2021: “PowerScale OneFS with NFSoRDMA is fully compatible and supported by NVIDIA GDS (GPUDirect Storage).” The ECS and ObjectScale systems do not have public GPUDirect bandwidth numbers.
We collated all the public sequential read and write storage node bandwidth numbers that we could find for Dell, DDN, IBM, NetApp, Pure and VAST Data systems when serving data to and from Nvidia GPU servers using GPUDirect.
We tried to find WekaIO numbers and did find a 113.1GB/sec sequential read bandwidth result for its scale-out and parallel filesystem. But it was an aggregate result from a cluster of servers using Weka’s file system. The number of servers was not revealed, nor their physical size, and so we couldn’t obtain a per-node bandwidth number. Nor could we find any write bandwidth numbers, and our table of results, which we charted below, does not include Weka numbers. It doesn’t have a Pure FlashArray//C read bandwidth number either as we could not locate one.
The overall read bandwidth per node results showed DDN in first place, VAST in second place, IBM ESS3200 third, NetApp E series in fourth place, ONTAP in fifth place and Dell in last place.
The write bandwidth results were different: DDN first, IBM second, NetApp E series third, VAST fourth, Pure fifth, NetApp ONTAP sixth and Dell seventh.
We should point out that the VAST storage nodes need compute nodes to go with them and the node size row at the bottom of the chart is only for the VAST Ceres storage node.
Our actual numbers are reproduced here:
The PowerScale numbers were derived from public Dell PowerScale F600 GiBps numbers for a 48-node GPUDirect system. We derived the per-node numbers from them.
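The per-node derivation is simple division plus a GiB-to-GB conversion; a minimal sketch using the public 48-node 56.4GiBps aggregate figure cited in the update further down:

```python
# Per-node bandwidth derived from Dell's public 48-node PowerScale F600
# GPUDirect figure: 56.4 GiBps aggregate across 48 nodes.
GIB = 2**30   # bytes in a GiB (binary)
GB = 10**9    # bytes in a GB (decimal)

aggregate_gibps = 56.4
nodes = 48

per_node_gibps = aggregate_gibps / nodes          # 1.175 GiBps/node
per_node_gbps = per_node_gibps * GIB / GB         # ~1.26 GBps/node

print(f"{per_node_gibps:.3f} GiBps/node = {per_node_gbps:.2f} GBps/node")
```

The GiB/GB distinction matters here: quoting 1.175 GiBps as 1.26 GBps is the same number in different units, not a faster result.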
Dell’s co-COO, Chuck Whitten, has resigned just two years after joining Dell, following board-level discussions.
Chuck Whitten
Anthony Charles “Chuck” Whitten joined Dell in August 2021 after spending 23 years and two months at Bain, the business consultancy group that was working with Dell to help shape its strategy and growth initiatives.
When Whitten joined Dell to work alongside its other COO, Jeff Clarke, Michael Dell said: “As our top advisor, Chuck has been an integral part of the team for a long time working across strategy, transformation and operations. I couldn’t be happier to have him as co-COO to capture big growth opportunities across our portfolio as the world becomes more digital and data-driven.”
Now he’s gone, as a post on LinkedIn said: “Sharing the announcement that went out to Dell team members today announcing that I will be stepping down as Co-COO of Dell Technologies.”
No reason was given. All he said was: “It has been a privilege working alongside and learning from Michael Dell and Jeff Clarke the last 14 years, and the company has never been better positioned for the current moment in technology. I will forever be cheering on Dell Technologies – the future is bright!”
Whitten shared an image of an internal Dell communication showing a message from Michael Dell and his own response:
The joint Dell-Whitten decision came “after discussions with Chuck and the board of directors about the leadership profile the company needs in its next chapter.” Both say great times lie ahead for Dell. Currently times are not so great as Dell’s results have been blighted by poor PC sales.
Whitten joined in Dell’s Q2 fy2022; that quarter and the next four were revenue growth quarters for Dell, with an all-time high of $28.4 billion in Q3 fy2022.
Starting in the third fiscal 2023 quarter, however, PC sales fell off a cliff as makers faced a post-COVID slowdown. Since then Dell’s revenues have been declining for three quarters in a row:
The Q4 fy2023 revenue decline was mostly due to lower PC and server sales, both limited by supply chain difficulties. Revenues fell by 20 percent Y/Y in its latest quarter (Q1 fy2024), with PC sales down 23 percent to $12 billion in a fifth successive down quarter for PC shipments. The smaller ISG business unit, selling servers and storage, also slumped 18 percent to $7.6 billion. Dell said in February it would lay off more than 6,500 staff.
The company is expecting the current Q2 fy2024 quarter to be poor as well, forecasting a 12 to 18 percent Y/Y decline.
Right now, though, Dell claims it is seeing a monster Generative AI-led growth opportunity. It’s adding GPUs to its workstations and servers in an edge AI effort. Its PowerScale F600 all-flash scale-out filer supports Nvidia’s GPUDirect protocol for sending data to GPU servers. There’s nothing for Whitten to do here.
Whitten won’t leave empty-handed though. We understand from SEC filings he was given a $5 million sweetener bonus to join Dell, and he has $19 million in stock options.
Today is SysAdmin Day. The concept was created by Ted Kekatos, a system administrator, in 1999. He wanted to establish a special day to recognize the hard work and dedication of system administrators who keep computer systems running smoothly.
…
File sharer and collaborator Box announced a new plugin for Microsoft’s next-generation AI workplace tool, Microsoft 365 Copilot. Box also announced updates to its integrations with Microsoft 365 that it says provide joint customers with enhanced features for sharing, editing, and collaborating within Microsoft Teams as well as Office products including Word, Excel, and PowerPoint. Box recently announced Box AI to combine foundational AI models with content stored in Box. The Copilot tool with this new integration will allow users to synthesize and summarize all their shared Box documents in Teams to draw insights from collaborators. They may also ask questions of shared content to derive milestones in an existing project, with the aim of limiting the need to scroll through Chats and Channels in order to get up to speed more quickly on conversations and new learnings.
…
Data protector and manager Cohesity released yet another ransomware scare report. Over 90 percent of respondents believe the threat of ransomware to their industry has increased in 2023. Nearly 3 in 4 (74 percent) companies will pay a ransom to recover data and restore business processes. Two-thirds of respondents (67 percent) lack full confidence that their company could recover their data and critical business processes in the event of a system-wide cyberattack. Nearly half of the respondents (45 percent) confirmed their company had been the victim of a ransomware attack in the prior six months. Meanwhile, over 95 percent of the respondents revealed they would need over 24 hours to recover data and business processes if a cyberattack occurred, 71 percent said it would take more than four days, while 41 percent said over a week would be required. …
Data protector Commvault has joined the AWS ISV Workload Migration Program and attained multiple AWS Service Ready Program designations. Through the AWS ISV Workload Migration Program, customers can accelerate their migration to AWS through promotional credits and enhanced technical guidance and support. Combined with Commvault’s automated scalable and repeatable workflows, the result is pitched as a simplified, reduced-cost modernization to AWS.
…
Anandtech reports Micron’s Crucial business unit has announced two new portable SSDs: the USB 3.2 Gen 2 X9 Pro, and the USB 3.2 Gen 2×2 X10 Pro. The X9 Pro comes with 1, 2, or 4TB capacities and provides up to 1,050 MBps read and 1,050 MBps write bandwidth.
The X10 Pro has the same capacities but IO is faster, with up to 2,100 MBps read and 2,000 MBps write speed.
Both use Micron 176-layer 3D NAND. A 1TB X9 Pro costs around $80, 4TB $290, a 1TB X10 Pro costs $120 and a 4TB X10 Pro will set you back $290.
…
DataCore’s Perifery division has appointed Jonathan Morgan as SVP of product and technology, based in Wales. He is responsible for strategic product decisions across three development groups/technologies (Object Matrix, Swarm and OpenEBS). The role includes go-to-market alliance strategy and evangelising the product strategy. Morgan was Object Matrix CEO from 2003 to 2023, when DataCore acquired Object Matrix. Morgan will report to Perifery General Manager Abhijit Dey.
…
Data migrator/manager Datadobi has a blog saying “businesses not only grapple with the management of vast amounts of data but also face the looming threat of illegal data concealed within their digital repositories. … Illegal data encompasses a broad spectrum of content or files that contravene laws, regulations, and/or company policy. It includes materials such as pirated software, confidential information obtained through unlawful means, and content that promotes or facilitates illegal activities; as well as content that is simply not acceptable or useful on the corporate network such as holiday videos and cat pics.”
Datadobi’s file metadata indexer and searcher StorageMAP, the blog says, “can play a crucial role in assisting organizations with mitigating the risk of illegal data.” But the word ‘illegal’ means not allowed by law, so how can StorageMAP detect illegal data in the strict meaning of the word? We asked, and Datadobi’s Steve Leeper, VP Product Marketing, said: “Users of StorageMAP can find unwanted/illegal data by executing searches for specific characteristics such as file extensions that are either not allowed per corporate policy or meet criteria for further inspection.”
…
Prowess Consulting used Vdbench, an industry-standard storage benchmarking tool, to simulate a dataset with a 2:1 compression ratio and a 2:1 deduplication ratio, and check out a couple of arrays with 4:1 data reduction ratio (DRR) guarantees. It said Dell’s PowerStore 1200T demonstrated a DRR of 4.8:1, whereas a Vendor A storage platform, also with a 4:1 storage reduction guarantee, only achieved a 2.8:1 ratio. The PowerStore 1200T exceeded the 4:1 guarantee, while the Vendor A platform did not even meet it. Vendor A is unknown.
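Compression and deduplication ratios combine multiplicatively, which is why a 2:1/2:1 Vdbench dataset lines up with a 4:1 reduction guarantee. A minimal sketch of that arithmetic (exceeding 4:1, as PowerStore did, implies additional reduction techniques such as zero or pattern elimination):

```python
# Combined data reduction ratio, assuming compression and dedupe
# act independently so their ratios multiply.
def combined_drr(compression: float, dedup: float) -> float:
    return compression * dedup

print(combined_drr(2.0, 2.0))  # 4.0 -> the baseline a 2:1/2:1 dataset offers
```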
…
According to a BanklessTimes.com data analysis, web3 decentralized storage supplier Filecoin’s Q2 2023 protocol revenue witnessed a 67 percent increase, reaching $11.5 million. There was a 64 percent quarter-on-quarter increase in active storage deals and a 60 percent rise in large datasets clients in the same quarter. This marks the second consecutive quarter of revenue growth for the platform since its low of $5.7 million in Q4 of 2022.
…
Amber Huffman
The Flash Memory Summit announced this year’s recipient of its Lifetime Achievement Award: Amber Huffman, who is being recognized for her achievements in bringing flash storage into the mainstream of the data storage industry by founding and driving the standards – ONFI, NVMe, AHCI, OCP and UCIe – that have made flash memory a mainstream storage medium for virtually all computing applications. She’s currently a principal engineer at Google Cloud. Before that she was a 25-year vet at Intel, finishing up as an Intel Fellow and VP, and CTO of its IP Engineering Group, leaving in late 2021 to join Google.
…
Data orchestrator Hammerspace says its revenues are growing at nearly 300 percent Y/Y. Customers include three of the largest global telcos, private space exploration, a hyperscale large language model pipeline, online game development, federal government, VFX and entertainment companies, retailers, and more. It is targeting mid-size and large enterprises and hyperscalers as potential AI use-case customers, saying its high-performance data orchestration software can feed AI pipelines with more data faster in what is a new AI-influenced data cycle.
…
Hitachi Vantara has launched its Hitachi Unified Compute Platform (UCP) for Azure Stack HCI, starting with a single Hitachi server and scaling up to 16 nodes. It supports multiple validated configurations, including a Hitachi Virtual Storage Platform (VSP) one. Customers can purchase UCP for Azure Stack HCI with the Hitachi EverFlex infrastructure-as-a-service subscription model.
…
IBM Storage Scale v5.1.8.1 has just been released. It has a tech preview of Storage Scale (GPFS, and then Spectrum Scale, as it was) on the Google Cloud Platform (GCP) through cloudkit, an interactive CLI-based multi-cloud provisioner. Cloudkit can deploy an IBM Storage Scale cluster on GCP, including provisioning of all required cloud infrastructure, installation, and initial configuration of IBM Storage Scale. More info here.
…
High-end array supplier Infinidat has appointed Richard Connolly as regional director for the UKI and the DACH (Germany, Austria and Switzerland) regions. He was the Director of Global & Strategic Accounts at Palo Alto Networks and before that a global sales director at Hitachi Vantara.
…
Fabless 3D X-DRAM semiconductor startup NEO Semiconductor’s CEO, Andy Hsu, will deliver a keynote address titled “New Architectures which will Drive Future 3D NAND and 3D DRAM Solutions” on August 9 at 11:40 a.m. PT at the Flash Memory Summit in San Jose. A new AI application for 3D X-DRAM called “Local Computing” will be disclosed, which NEO claims drastically increases AI chip performance. He’ll mention other new memory structures derived from 3D X-DRAM for 3D NOR flash memory, 3D Ferroelectric RAM (FeRAM), 3D Resistive RAM (RRAM), 3D Magnetoresistive RAM (MRAM), and 3D Phase Change Memory (PCM). These new memory structures are claimed to bring technological breakthroughs for migrating these memory cells from 2D into 3D.
…
HCI vendor Nutanix has added Mark Templeton to its board of directors. He was most recently CEO of DigitalOcean, and prior to that spent 20 years at Citrix Systems (a Nutanix Partner), including 14 years as President and CEO. Templeton currently serves on the boards of Arista Networks and Health Catalyst, as well as several private company boards.
…
Samsung posted consolidated revenues of KRW 60.01 trillion for its Q2, 2023 quarter with profits of KRW 0.67 trillion. The Memory Business – DS division – posted KRW 14.73 trillion in consolidated revenue and KRW 4.36 trillion in operating losses for the second quarter. It saw results improve from the previous quarter as its focus on High Bandwidth Memory (HBM) and DDR5 products in anticipation of robust demand for AI applications led to higher-than-guided DRAM shipments. The division will focus on sales of high-value-added products such as DDR5, LPDDR5x and HBM amid expectations of a recovery in demand. It will continue to strengthen mid- to long-term competitiveness by increasing investments in infrastructure, R&D and packaging technology, while also enhancing the completeness of the Gate-All-Around (GAA) process.
…
Data analytics software supplier SQream says its petabyte-scale GPU-based SQL analytics database, SQream DB, is available on the Samsung Cloud Platform (SCP), which was launched by Samsung SDS in 2021. SQream became available on the Alibaba cloud in 2018, and it also runs on AWS, Azure, GCP and the Oracle Cloud Infrastructure.
…
Synology announced the amount of data backed up in its cloud storage and backup platform, Synology C2, has surpassed 200 PB, an increase of 47 percent over the same period last year. Additionally, over the last 12 months, Synology said it saw faster adoption among customers with larger datasets (over 50TB) compared to the previous year. The cloud platform offers two services for organizations seeking cloud backup: C2 Storage for backing up Synology servers, and C2 Backup for protecting edge and cloud data.
…
Teledyne LeCroy has released SVF/Enduro v5.3 software with support for testing Flexible Data Placement (FDP), which enables SSD suppliers to verify performance and compliance for their NVMe storage devices. FDP creates a whole new command set that moves the responsibility of data placement away from the drive controller to the host application.
…
Replicator WANdisco, just readmitted to the AIM market, has appointed Xenia Walters and Chris Baker to its board as non-executive directors in a board-level shake-up. Interim CEO Stephen Kelly and interim CFO Ijoma Maluza become permanent, and both join the board as well. Current board member Karl Monaghan will step down ahead of the next AGM in August.
Analysis: DDN says storage for generative AI and other AI work needs a balance of read and write speed, claiming its own storage provides the best balance because of its write speed superiority over competing systems.
The company supplies its Exascaler AI400X2 array with Magnum I/O GPUDirect certification for use with Nvidia’s DGX SuperPod AI processing system. It uses 60TB QLC (4bits/cell) SSDs and has a compression facility to boost effective capacity. Some 48 AI400X2 arrays are in use with Nvidia’s largest SuperPODs according to DDN, which said it shipped more AI storage in the first quarter of this year than in all of 2022.
SVP for Products at DDN, James Coomer, has written a blog, Exascale? Let’s Talk, in which he says that an AI storage system has to support all stages of the AI workload cycle. “That means ingest, preparation, deep learning, checkpointing, post-processing, etc, etc, and needs the full spectrum of IO patterns to be served well.”
AI processing read-write mix
What IO patterns? Coomer cites an OSTI.GOV whitepaper, Characterizing Machine Learning I/O Workloads on Leadership Scale HPC Systems, which studied “the darshan logs of more than 23,000 HPC ML I/O jobs over a time period of one year running on Summit – the second-fastest supercomputer in the world”. Darshan is an HPC IO characterization tool. Summit’s storage was the GPFS (Storage Scale) parallel filesystem.
The whitepaper says: “It has been typically seen that ML workloads have small read and write access patterns,” and “Most ML jobs are perceived to be read-intensive with a lot of small reads while a few ML jobs also perform small writes.”
But, “from our study, we observe that ML workloads generate a large number of small file reads and writes.”
The paper says ~99% of the read and write calls “are less than 10MB calls. … ML workloads from all science domains generate a large number of small file reads and writes.”
A chart shows their finding with GPFS-using workloads:
There is a roughly even balance between read and write IO. Coomer’s blog has a chart showing the balance between read and write IO calls with small calls (less than 1MB) dominating the scene:
The paper’s authors say: “The temporal trend of ML workloads shows that there is an exponential increase in the I/O activity from ML workloads which is indicative of the future which will be dominated by ML. Therefore better storage solutions need to be designed that can handle the diverse I/O patterns from future HPC ML I/O workloads.”
Armed with this finding of balanced, predominantly small reads and writes, Coomer looks at the IO capabilities of different QLC flash systems using NFS compared to the DDN AI400X2 storage.
He contrasts a part-rack DDN AI400X2 system that can provide 800GBps of write bandwidth with a competing but un-named system needing 20 racks to do the same.
Coomer’s graphic showing DDN (right) and competing supplier’s QLC/NFS system (left) needed to meet his 800GBps write target. The white rectangles represent empty rack space
Coomer told us: “The servers and storage are all there (embedded inside the AI400NVX2) and no switches are needed for the back end. We plug directly into the customer IB or Ethernet network. The DDN write performance number is a measured one by a customer not just datasheet.”
What competing system?
We looked at various QLC flash/NFS systems to try and find out more. For reference, an AI400X2 system can deliver 90 GBps of read bandwidth and 65 GBps write bandwidth from its 2RU chassis. Twelve of them hit 1.08 TBps read and 780 GBps write. Thirteen would reach 1.17 TBps read and 845 GBps write bandwidth from 26 RU of rack space.
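The cluster numbers follow from linear scaling of the per-chassis figures; a quick sketch of that arithmetic, assuming aggregate bandwidth scales cleanly with unit count:

```python
# Aggregating DDN AI400X2 building blocks: each 2RU unit is rated at
# 90 GBps read and 65 GBps write (figures from the article).
READ_PER_UNIT = 90    # GBps
WRITE_PER_UNIT = 65   # GBps
RU_PER_UNIT = 2       # rack units per chassis

def cluster(units: int) -> tuple[int, int, int]:
    """Return (read GBps, write GBps, rack units) for a cluster of units."""
    return (units * READ_PER_UNIT, units * WRITE_PER_UNIT, units * RU_PER_UNIT)

print(cluster(12))  # (1080, 780, 24): 1.08 TBps read, 780 GBps write in 24 RU
print(cluster(13))  # (1170, 845, 26): 1.17 TBps read, 845 GBps write in 26 RU
```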
It’s necessary to include the servers and switches in the alternative systems to make a direct comparison with the DDN system.
VAST Data Lightspeed storage nodes provide 50 GBps from a 44RU configuration. We would need 16 of these to achieve the 800 GBps write speed, plus the associated compute nodes and switches – roughly 20 racks, depending upon the storage node-compute node balance.
A newer VAST Data Ceres system provides 680 GBps from 14 racks, 48.6 GBps per rack, meaning 17 racks would be needed to reach the 800 GBps write speed level:
The slide captioning at the bottom mentions 680 GBps write bandwidth.
A Pure Storage FlashArray//C60 has up to 8 GBps throughput from its 6RU chassis. If we assume that is all write bandwidth, that works out to 1.3 GBps per RU, and we would need 14 or so racks to reach the 800 GBps write speed.
Update. Forty-eight Dell PowerScale F600s demonstrated 56.4GiBps sending data to 48 Nvidia GPUs, or 1.175GiBps per node (1.26GBps). A node is 1RU in size and, therefore, to achieve 800GBps write bandwidth there would need to be 635 nodes; about 15 fully populated racks.
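The rack-count estimates above can be reproduced with a small calculation. This is a sketch under our assumptions – a 42RU usable rack, linear bandwidth scaling, and no allowance for compute nodes or switches – using the per-block figures quoted in this piece:

```python
import math

RACK_RU = 42   # assumed usable rack units per rack
TARGET = 800   # GBps write bandwidth target

# (write GBps per building block, RU per block), per the figures above.
systems = {
    "VAST Ceres":      (48.6, 42),   # ~48.6 GBps per rack of Ceres nodes
    "Pure //C60":      (8.0, 6),     # if all 8 GBps throughput were writes
    "PowerScale F600": (1.26, 1),    # 1.26 GBps per 1RU node
}

for name, (gbps, ru) in systems.items():
    blocks = math.ceil(TARGET / gbps)      # building blocks needed
    racks = blocks * ru / RACK_RU          # racks those blocks occupy
    print(f"{name}: {blocks} blocks, ~{racks:.1f} racks")
```

This yields 17 racks for Ceres, about 14.3 for the FlashArray//C60 assumption, and 635 F600 nodes (about 15 racks), matching the estimates in the text.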
Scale-out NFS inefficiency
In Coomer’s view scale-out NFS system architectures are inefficient because they are complex:
DDN’s Exascaler AI400X2 system does away with the server-interconnect-buffer complexity because its client nodes know where the data is located:
It can provide, Coomer argues, the balanced small read and write IO performance that ML workloads, like generative AI, need, according to the OSTI.GOV research. It can do this from significantly less rackspace than alternative QLC flash/NFS systems. And that means less power and cooling is needed in AI data centers using the DDN kit.
File collaborator Egnyte has added a generative AI chatbot to its platform to summarize docs and create audio/video transcripts.
Generative AI uses large language models (LLMs) to receive human input in ordinary language, query and analyze source content files, and answer queries in ordinary language. Examples such as ChatGPT have created a huge wave of interest given their ability to understand normal speech and respond quickly and in an authoritative – but not always accurate – way to a vast range of queries and requests, such as writing software code and finding objects in photographs.
Vineet Jain
CEO and co-founder Vineet Jain said: “While very much in vogue right now, Egnyte has been using large language models for close to a decade. The outputs of these models were historically focused on a relatively narrow set of IT security, privacy, and compliance applications.”
These existing embedded Egnyte ML models are used by customers to classify and protect sensitive data, comply with privacy regulations such as GDPR and CCPA, and detect anomalous usage patterns that may be indicative of a data breach or insider threat.
The latest LLMs used in generative AI are much more open-ended than this and have a wide focus, as Jain reflected: “With recent advances in AI and compute, we’re now able to unleash content intelligence for every user on our platform.”
Everyday Egnyte users, rather than data scientists and SQL-skilled folks, will be able to use a chat-based interface to ask Egnyte’s AI to answer questions and perform tasks related to the files to which they have been granted access. This will be quite different from the usual tick-a-box-and-get-a-report dashboard type interface. The AI can perform tasks such as:
Generate summaries of large complex documents
Create text-based transcripts of audio and video files
Find photos within an image library containing a particular object
Egnyte chief strategy officer David Spitz said: “Generative AI is unlocking the vast troves of data and insights previously buried in people’s documents and media files while freeing up knowledge workers from countless low-value tasks.”
Egnyte’s AI uses private instances of various AI models to ensure the source data and AI-generated responses adhere to each customer’s security and compliance policies. It is being offered in a limited way to select customers, and its wider roll-out and general availability will be announced at a later date.
Egnyte is a late-stage startup, founded in 2008, which has received $137.5 million in funding, with the last round in 2018. It provides file-based collaboration and governance facilities to its enterprise customers, competing with CTERA, Nasuni and Panzura. More than 17,000 customers use its software.
Micron is sampling a 24GB high bandwidth memory product made from a stack of eight smaller chips, just months after SK hynix’s 24GB, 12-high stack. The company also said it has a 36GB, 12-high HBM3 stack coming.
High Bandwidth Memory (HBM) is a way of combining multiple DRAM dies in a single chip and connecting this to a host CPU via an interposer. This enables a server to have more DRAM and higher bandwidth than can be obtained from the traditional but limited-in-number socket interfaces which hook up DRAM-carrying DIMMs to the processor. In effect a DIMM is replaced by a much larger HBM DRAM stack. GPU systems use HBM in preference to DIMMs because of their higher capacity and bandwidth.
Micron’s Praveen Vaidyanathan, VP and GM of its Compute Products Group, said: “Micron’s HBM3 Gen2 technology was developed with a focus on unleashing superior AI and high-performance computing solutions for our customers and the industry.”
Micron’s new HBM gen 2 chip
The 24GB Micron product, built using its 1β (1-beta) DRAM process node, delivers more than 1.2TBps of bandwidth, using a pin speed faster than 9.2Gbps, which it claims is a 50 percent improvement over current HBM3 products. An SK hynix HBM3 chip delivers up to 819GBps, which supports Micron’s 50 percent improvement claim.
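The bandwidth figure follows from the HBM interface geometry: per-stack bandwidth is pin speed times interface width. A sketch, assuming the standard 1024-bit HBM stack interface:

```python
# HBM per-stack bandwidth = pin speed (Gbit/s) * interface width (bits) / 8.
PIN_GBPS = 9.2      # quoted pin speed, Gbit/s per pin
WIDTH_BITS = 1024   # bits per HBM stack interface

bandwidth_gbps = PIN_GBPS * WIDTH_BITS / 8
print(f"{bandwidth_gbps:.1f} GBps per stack")   # ~1177.6 GBps; faster pins push it past 1.2 TBps

# Cross-check the "50 percent faster" claim against the 819 GBps HBM3 figure:
print(f"{bandwidth_gbps / 819:.2f}x")           # ~1.44x
```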
Micron claims that, when using its new HBM3 product, the training time for large language models will be reduced by more than 30 percent, resulting in lower TCO. Also the product will benefit ML inference operations as it “unlocks a significant increase in queries per day, enabling trained models to be used more efficiently.“
All-in-all the sampling product “addresses increasing demands in the world of generative AI for multimodal, multitrillion-parameter AI models.”
Micron says its new HBM3 product – its second gen of HBM3 – is 2.5x better on a performance per watt rating than previous HBM3 products. It claims this has come about because of a doubling of the through-silicon vias (TSVs or connecting holes) over competitive HBM3 offerings, thermal impedance reduction through a five-times increase in metal density, and an energy-efficient data path design.
8 Micron HBM3 gen 2 chips on SoC board with host
This should save datacenter electricity costs, with Micron claiming that, for an installation of 10 million GPUs, every five watts of power savings per HBM cube is estimated to save operational expenses of up to $550 million over five years. We’re talking about massive datacenters here.
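Micron’s $550 million figure can be roughly reproduced with a back-of-envelope calculation. The assumptions here are ours, not Micron’s: one HBM cube per GPU, 24x7 operation over five years, and an all-in electricity cost of about $0.25/kWh once cooling and distribution overheads are included:

```python
# Back-of-envelope check on the claimed five-year opex saving.
WATTS_SAVED_PER_CUBE = 5            # per-cube power saving, from the claim
GPUS = 10_000_000                   # installation size, from the claim
HOURS_5Y = 5 * 365 * 24             # hours in five years of 24x7 operation
COST_PER_KWH = 0.25                 # assumed all-in $/kWh (our assumption)

kwh_saved = WATTS_SAVED_PER_CUBE * GPUS * HOURS_5Y / 1000
savings = kwh_saved * COST_PER_KWH
print(f"${savings / 1e6:.0f}M over five years")   # ~$548M, close to the claimed $550M
```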
Nvidia VP of Hyperscale and HPC Computing, Ian Buck, said: “At the core of generative AI is accelerated computing, which benefits from HBM high bandwidth with energy efficiency. We have a long history of collaborating with Micron across a wide range of products and are eager to be working with them on HBM3 Gen2 to supercharge AI innovation.”
Semiconductor foundry operator TSMC has received samples of Micron’s HBM3 Gen2 memory and is working with Micron on evaluation and testing.
Micron’s 36GB, 12-high stack, HBM3 gen 2 product will start sampling in the first calendar quarter of 2024.
The disk drive market is depressed, with Seagate reporting its lowest annual revenues in 17 years as its largest buyers, the hyperscale cloud providers, bought far fewer nearline drives – and next quarter’s outlook isn’t much better.
Revenues in the quarter ended June 30 were $1.6 billion, 39 percent lower Y/Y, with a loss of $92 million. That looks poor compared to the year-ago profit of $276 million but is much better than last quarter’s loss of $433 million. Full fy2023 revenues came in at $7.38 billion, which compares badly with the year-ago $11.6 billion – a 37 percent drop. There was a full-year loss of $529 million, contrasting sharply with the prior year’s $1.65 billion profit – Seagate’s first annual loss in 13 years or more.
CEO Dave Mosley mentioned “a profound downturn in demand” and his results statement said: “Our fourth quarter and fiscal 2023 performance reflected the uneven pace of economic recovery in China, cloud inventory digestion, and cautious enterprise spending amid the uncertain macroeconomic environment.” Seagate expects the three adverse factors to persist in the next two quarters.
Dramatic loss reduction from Q3 fy2023 to Q4 fy 2023
“Through our actions, Seagate is now leaner, our balance sheet healthier, and our product roadmap even stronger, positioning the company to weather the near-term business environment, deliver financial leverage, and capture attractive long-term opportunities for mass capacity storage.”
The actions include lowering its cost structure and reducing output with a build-to-order approach. It’s also increasing prices. With not so much cash coming in, Mosley talked about reducing debt in the earnings call: “We’ve enhanced balance sheet flexibility taking out nearly $800 million of debt funded largely by monetizing non-manufacturing facility assets.” That means sale and leaseback of office facilities. Mosley said Seagate has a “focus on returning to profitability.”
The “product roadmap” statement alludes to the existing 20TB PMR technology being extended into the mid-to upper 20TB area, meaning 24/25TB with conventional drives and beyond with SMR tech, while the replacement HAMR technology is on track to begin its volume ramp in early cy2024 with Seagate preparing qualification with a broader number of customers beyond hyperscale buyers.
The company has paused stock repurchases for the remainder of the fiscal year, but it is still paying a dividend.
Financial summary for quarter
Gross margin: 19 percent vs 28.9 percent a year ago
EPS: -$0.44 vs $1.27 last year
Operating cash flow: $218 million
Free cash flow: $168 million
Cash dividend: $0.70/share
Cash and cash equivalents: $786 million
Debt: $5.5 billion
There was a large drop in disk drive capacity shipped, from last quarter’s 119EB and the year-ago 155EB to 91EB; a 23.5 percent Q/Q fall. Average drive capacity fell from last quarter’s 8.2TB to 6.4TB, as cloud providers, Seagate’s largest buyers by far, bought fewer mass capacity nearline drives.
Seagate’s HDD capacity shipped has been on a downward trend for six quarters.
CFO Gianluca Romano said: “The average capacity is down because of mix. … in the June quarter, we shipped less to cloud and more to VIA, and of course the capacity is much higher in the cloud business.” (VIA is Seagate’s video and image applications segment, which covers the surveillance market.)
Oddly there was an increase in 10K rpm mission-critical drives, a market area prone to SSD cannibalization.
This is Seagate’s lowest annual revenue number since 2005’s $7.6 billion, and its first annual loss in many years, such is the savage nature of the disk drive market’s revenue fall.
The fy2023 annual revenue number was Seagate’s lowest for 17 years, such is the depth of the HDD recession
Nearline drive sales to cloud buyers are the main focus for Seagate, with generative AI boosting them. Mosley said: “In addition to the ongoing migration of workloads to the cloud, which we believe is far from over, gen AI is expected to be a catalyst for data creation, underpinning future demand for mass capacity storage.”
He emphasized disk cost advantage over SSDs: “While flash will continue to feed high-performance compute engines, mass capacity HDDs will remain the most cost-efficient storage media to house the enormous volumes of data being generated and used for predictive analytics.”
No suggestion there that SSDs will kill new disk drive sales in the next 5 years as Pure Storage is predicting. Quite the reverse, with Mosley saying: “Even in today’s unsustainably low NAND pricing environment, HDDs are still roughly five times more cost-efficient than comparable flash solutions on a per bit basis, and we do not project that gap to close in the next decade.” Take that Pure.
But Mosley is ignoring the five-year TCO arguments for flash drive superiority over disk put forward by both Pure and VAST Data. If hyperscale cloud providers come to believe the TCO argument and start using SSDs for their nearline storage needs, then the HDD manufacturers will have a severe problem. That’s the core HDD vs SSD market battleground.
Seagate revenues for the next quarter are forecast to be $1.55 billion +/- $150 million; a 34 percent Y/Y drop at the mid-point. We’ll see if Seagate’s tight controls can eke a profit out of that.
MemVerge reckons customers could cut their cloud app carbon emissions by using its Memory Machine Cloud software with per-workload emissions tracking.
Its Memory Machine Cloud software enables users to submit jobs to the AWS cloud in a serverless way and have them run in EC2 Spot instances, with sizes dynamically modified as needed to economize on cloud resources. The Memory Machine Cloud Essentials edition is free and includes a WaveWatcher calculator to track the carbon emissions of cloud apps so organizations can right-size for a lower carbon footprint.
MemVerge COO Jon Jiang said: “For companies focused on reaching carbon neutrality, real-time measurement of carbon emissions for each workload is an essential capability. The telemetry works hand-in-hand with real-time right sizing that has the potential to drive the emissions of datacenter servers down 10 percent industry wide.”
Memory Machine Cloud’s latest release, v2.3, adds the ability to put an app to sleep and then restart it where you left off. MemVerge says that data scientists were previously faced with the choice of leaving their cloud instance running all night or shutting down and restarting apps built with popular integrated development environments (IDEs) like RStudio and Jupyter. Now they can sleep apps and use WaveRider from within an IDE.
The Memory Machine software now includes IDE Sleep and Wakeup widgets for popular IDEs such as RStudio that allow data scientists to put their apps to sleep or use WaveRider continuous right sizing from within RStudio.
A Memory Machine Cloud WaveWatcher service builds on its ability to track CPU, memory, network, storage IO usage by application, to calculate carbon emissions. The data can then be used by the WaveRider service to automatically right size resources and reduce CO2 footprint.
MemVerge has defined a three-step blueprint to lower cloud app carbon emissions. Step one is to migrate apps from your datacenters to the public cloud, AWS in this case. That’s because, according to GoClimate.com, the kg CO2/year/server figure for cloud servers is approximately half that of on-prem servers.
Step two is to track cloud app carbon emissions by using tools like the Memory Machine Cloud WaveWatcher service. This profiles app resource usage and identifies opportunities for right-sizing optimization, such as enabling automatic runtime workload moves to smaller or larger instances based on their real-time resource needs.
Step three is to do this right-sizing continuously. It will cost, though, as you have to upgrade from the free Memory Machine Cloud Essentials to Memory Machine Cloud Pro to get WaveRider continuous right-sizing. This is charged on a pay-as-you-go basis at 25 percent of the savings.
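The arithmetic behind the three-step blueprint is straightforward: profile a workload's real utilization, convert allocated capacity into an emissions estimate, and shrink the instance to observed demand plus headroom. Here is a minimal, hypothetical sketch of that calculation; the class and function names, per-vCPU power figure, and grid carbon intensity are all illustrative assumptions, not MemVerge's actual WaveWatcher or WaveRider API.

```python
import math
from dataclasses import dataclass

GRID_KG_CO2_PER_KWH = 0.4  # assumed grid carbon intensity

@dataclass
class WorkloadSample:
    vcpus_allocated: int
    avg_cpu_util: float    # observed utilization over the window, 0.0-1.0
    watts_per_vcpu: float  # assumed draw per allocated vCPU
    hours: float

def emissions_kg(s: WorkloadSample) -> float:
    """Estimate emissions for the window from allocated (not used) capacity."""
    kwh = s.vcpus_allocated * s.watts_per_vcpu * s.hours / 1000
    return kwh * GRID_KG_CO2_PER_KWH

def right_size(s: WorkloadSample, headroom: float = 1.2) -> int:
    """Suggest a vCPU count sized to observed demand plus headroom."""
    return max(1, math.ceil(s.vcpus_allocated * s.avg_cpu_util * headroom))

# A 16-vCPU instance running at 20 percent utilization for a day
sample = WorkloadSample(vcpus_allocated=16, avg_cpu_util=0.2,
                        watts_per_vcpu=4.0, hours=24.0)
print(f"emissions at current size: {emissions_kg(sample):.3f} kg CO2")
print(f"suggested vCPU count:      {right_size(sample)}")
```

In this toy case the 16-vCPU instance shrinks to 4 vCPUs, cutting the allocated-capacity emissions estimate by 75 percent; the continuous version of step three simply re-runs this decision as utilization changes.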
Comment
This looks interesting but is limited to AWS. We think this limitation may be removed by MemVerge in the future. Also there are no examples of carbon emissions saved by named applications, which is a pity. The chart above does show some generic savings but there are no details. No doubt the MemVergers will produce them soon enough.
Korean memory fabber SK hynix reckons the DRAM market has reached the bottom of its trough and revenues will rise from here on, but the NAND market has not yet bottomed out.
Revenues from its DRAM and NAND operations in the second 2023 quarter were 7.306 trillion won ($5.7 billion) with a loss of 2.988 trillion won (-$2.3 billion) compared to year-ago revenues of 13.8 trillion won ($10.8 billion), meaning a 47 percent drop, and a profit of 2.9 trillion won ($2.2 billion). Both DRAM and NAND product sales increased in the quarter, with a higher DRAM average selling price (ASP) contributing most to the revenue growth.
A company statement said: “Amid an expansion in [the] generative artificial intelligence (AI) market, which has largely been centered on ChatGPT, demand for AI server memory has increased rapidly. As a result, sales of premium products such as HBM3 and DDR5 increased, leading to a 44 percent sequential increase in revenue for the second quarter.”
A revenue history chart shows that there has been a substantial Q/Q rise in revenues from last quarter’s 5.1 trillion won ($4.0 billion). The company thinks that the demand for AI memory will stay robust, and also that DRAM prices are improving due to decreased production by memory companies in general.
Wells Fargo analyst Aaron Rakers said: “Hynix sees long-term AI server growth in mid-30 percent CAGR range looking forward; total server demand to grow in high-single digit percent CAGR range.”
Kim Woohyun, VP and CFO at SK hynix, said: “Having passed the trough in the first quarter, the memory semiconductor market is seen to have entered the recovery phase. SK hynix will strive to prop up earnings through its technological competitiveness in high-end products.”
It says the price for general DRAM products such as DDR4 continued to decline on sluggish demand for both PCs and smartphones. Rakers said: “Smartphone demand is expected to improve in 2H2023 via new product cycle and content growth via price elasticity.”
SK hynix’s overall ASP for DRAM rose due to increased high-end DRAM sales for AI servers. It intends to expand sales of high-end DRAM products, including HBM3, DDR5 and LPDDR5 memory products, as well as 176-layer SSDs, to help accelerate its earnings improvement.
It will expand HBM3 and DDR5 output in response to AI server demand. But NAND inventory levels are still too high and the company will cut its NAND output by 5 to 10 percent to help reduce them, and so bring NAND production and demand into a better balance.
SK hynix will also work to raise the quality and yields of its gen 5 10nm DRAM process technology, the 1b-nanometer class, to help raise production. It will do the same with its 238-layer NAND this year but only expand the scale of mass production once the NAND industry upturn is visible.
Separately, according to a South China Morning Post report, SK hynix has rebuffed suggestions that it will sell its Dalian 3D NAND fab in China, which it acquired from Intel along with Solidigm. Dalian output is needed for Solidigm SSDs. SK hynix also notified the state of California that it has laid off 98 people from the Sacramento County branch of its NAND Product Solutions Corp, aka Solidigm, likely as a way of cutting costs.
Hewlett Packard Enterprise says its Zerto data protection can offer real-time detection of ransomware encryption across diverse data types, within the VMware environment, and can recover data to the point just before an attack.
Update. Zerto ransomware detector only works in VMware environment. 31 July 2023.
The newly released Zerto 10 employs a Shannon entropy detector, the details of which are elaborated in a blog by HPE Senior Distinguished Technologist, Dimitris Krekoukias. He writes: “A modern ransomware detection mechanism needs to be able to deal both with legitimate host-level activity, plus modern ransomware, dynamically, without relying on fixed thresholds and with no assumptions regarding data types.”
HPE’s Zerto provides continuous data protection and disaster recovery. A launch video features Deepak Verma, VP of Product at Zerto, explaining: “What we’re embedding in Zerto 10 is a new detector, a set of algorithms that detect encryption at the block level. So, regardless of the operating system that’s being used, being able to detect if a singular block of data is encrypted. We’re doing this in batches.”
A Zerto 10 launch video screenshot shows Deepak Verma talking about real-time ransomware encryption detection
Zerto’s Virtual Replication Appliance (VRA), when running in a host server, duplicates write data headed for storage and replicates it to a target HPE/Zerto system. The copied data is analyzed outside of the production data stream to avoid production delays. An inline ransomware detector is embedded at the VRA replication target level to analyze the copied data blocks.
However, Krekoukias states that identifying ransomware-induced encryption can be challenging due to legitimate encryption of data by IT systems and the variety of data types involved. Traditional ransomware detection methods involve searching for high Shannon entropy levels in a set of data blocks, with unexpected shifts indicating abnormal activity. This method, though, has limitations due to the fixed thresholds it relies upon.
Zerto replication target diagram
The fixed thresholds, Krekoukias explains, can be problematic as they differ depending on data types, data compression, and tricks like encoding data using base64. This results in encrypted data appearing to have less entropy and could hinder the effectiveness of fixed threshold detection systems.
By contrast, he claims, Zerto 10’s detector produces more accurate results because of its dynamic data type awareness. Krekoukias writes: “To further increase accuracy, the solution trains itself. That training is automatically done per stream, which further enhances accuracy.”
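The fixed-threshold problem Krekoukias describes is easy to demonstrate. The short sketch below is our own toy illustration, not Zerto's algorithm: it computes Shannon entropy per byte and shows that random bytes (a stand-in for ciphertext) sit near the 8 bits/byte maximum, while the same random bytes encoded as base64 are capped at 6 bits/byte because the base64 alphabet only has 64 symbols, so they can slip under a fixed high-entropy threshold.

```python
import base64
import math
import os

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, ranging 0.0 to 8.0."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)

# English-like text: well under the maximum
text = b"the quick brown fox jumps over the lazy dog " * 100
# Random bytes stand in for ciphertext: close to 8 bits/byte
random_block = os.urandom(65536)
# Base64-encoding the "ciphertext" caps entropy near 6 bits/byte
encoded = base64.b64encode(random_block)

print(f"text:   {shannon_entropy(text):.2f} bits/byte")
print(f"random: {shannon_entropy(random_block):.2f} bits/byte")
print(f"base64: {shannon_entropy(encoded):.2f} bits/byte")
```

A detector with a fixed "alert above 7.0 bits/byte" rule would miss the base64-wrapped ciphertext entirely, which is why a per-stream, self-training approach of the kind Zerto describes aims to learn what is normal for each data type rather than apply one global cutoff.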
Once ransomware encryption is detected, the encrypting process can be halted and the affected data can be recovered up until the point the ransomware attack was detected. Krekoukias says: “This can enable things like identifying what servers/files started first to be encrypted, and then rolling back to the last known write operation before ransomware started encrypting – which allows businesses to recover and quarantine the best possible way, with the least amount of risk and disruption.”
He told us: “It works in VMware environments.”
Bootnote
IBM has also utilized Shannon Entropy in its Storage Virtualize ransomware detection facility.
Humans are often the weakest link in a corporate network and the entry point for ransomware attackers, a fact that underscores the necessity for persistent vigilance against this growing cyber threat.
A stark reminder of this came today from cloud storage provider Backblaze in its 2023 Complete Guide to Ransomware report. The latest edition of this annual report states:
“This year’s most important update has been the rise of generative AI for increasingly sophisticated, automated phishing attempts… Text generated by models like ChatGPT help cybercriminals create very personalized messages that are more likely to have the desired effect of getting a target to click a malicious link or download a malicious payload.”
Traditionally, phishing messages were relatively easy to identify, often featuring spelling errors, grammatical mistakes, and awkwardly constructed sentences. With the aid of software like ChatGPT, however, “criminals can enter a prompt to quickly receive error-free, well-written, and convincing copy that can be immediately used to target victims.”
The report provides a comprehensive review of ransomware’s prevalence, attack vectors, the sequence of events during an attack, and the necessary steps for responding to an attack. It stops short of suggesting foolproof methods to prevent attacks from breaching IT infrastructure, as no solution that is 100 percent reliable exists.
The best measures involve proactive steps: ensuring robust IT user education and having an efficient recovery system in place to restore encrypted files from uncorrupted backups. If sensitive data has been exfiltrated, however, the options are limited.
B&F diagram
Malware attack vectors can be either human or machine-mediated, each designed to deliver malware into IT systems, encrypting files or copying them for transmission to attacker HQ. Victims sometimes pay a ransom, often in cryptocurrency, to decrypt their files or prevent widespread distribution.
The caveat, of course, is that dealing with cybercriminals doesn’t guarantee a successful outcome even after paying a ransom. Backblaze’s report warns: “Paying the ransom only encourages attackers to strike other businesses or individuals like you. Paying the ransom not only fosters a criminal environment but also leads to civil penalties – and you might not even get your data back.”
Coveware chart from Backblaze report
Victims should report attacks to the appropriate authorities. If data has been encrypted, it’s recommended to recover it using clean backups. Increased vigilance in protecting sensitive data and transparent communication with potentially impacted parties is advised.
Preventing malware infiltration into systems requires a comprehensive approach. Backblaze recommends several protective measures such as to “restrict write permissions on file servers as much as possible.” In addition to this, we would suggest restricting file movement permissions to prevent sensitive data from leaving the boundaries of your IT system.
Modern IT security strategies increasingly adopt a zero-trust model – no person, device, or service requesting data access is automatically trusted, but must be validated every time, even for repeat requests. This principle should also extend to file egress destination devices, given the risk of trusted services unintentionally transferring sensitive files to an unsecured location, as in the case of Fortra or MOVEit.
The 2023 Complete Guide to Ransomware report is accessible on Backblaze’s website through its blog page.
Scandal struck data replicator WANdisco was readmitted to the AIM stock market yesterday after shareholders approved proposals to raise fresh capital via a new share issue.
The company’s stock was suspended from AIM in early March when WANdisco claimed that a single senior sales rep had fabricated sales in calendar 2022: the reported $24 million revenue for the year turned out to be just $9.7 million worth of genuine turnover.
A wholesale restructuring of the board and the C-suite followed, with co-founder, chairman and CEO Dave Richards and CFO Erik Miller both leaving. WANdisco was forecast to run out of cash earlier this month and shareholders were asked to approve a capital infusion through issuing new shares. This they did on July 24 and the company raised $30 million at £0.50/share.
Some 99.97 percent of shareholders approved this resolution at a General Meeting yesterday. Chairman Ken Lever, who was installed earlier this year, said in a statement:
“We are absolutely delighted that the resolution to increase the authorised share capital has received such overwhelming support, which will enable the company to conclude the fundraise. The board and the executive management can now concentrate on driving the business forward to achieve growth in value for shareholders and all stakeholders.”
The shares are now trading at 46.74 GBX (£0.4674), compared to their 1,310 GBX (£13.10) value immediately before the suspension. Lever’s exec team have a mountain to climb.
Share price chart from Google Finance
Management has hatched a turnaround plan and this could involve a name change at some point.
Just last month, WANdisco filed its 2022 annual report, which revealed a going concern warning and showed that for every dollar of revenue generated during the trading year, it lost roughly $3: turnover was $9.7 million and the net loss came in at $29.7 million.
Anthony Miller, co-founder and managing partner at TechMarketView, said the overwhelming shareholder support WANdisco received at its AGM was a “show of confidence worthy of an autocratic regime”.
“As a result, WANdisco’s shares were re-listed on AIM this morning at 50p. They last traded at £13.10p,” he added, “WANdisco shares are now no more than casino chips. Place your bets and see if you can beat the banker.”