Toshiba says it has increased the transfer rate of its S300 Pro surveillance disk drive by doubling the cache capacity.
The 3.5-inch form factor S300 Pro uses conventional magnetic recording and comes in 4, 6, 8, and 10 TB capacities. It sits alongside the shingled magnetic recording (SMR) S300 drive in Toshiba’s surveillance drive line-up. Both are air-filled drives, with the S300 offered at 6, 8, and 10 TB capacity points. The 2020-era S300 Pro had a 256 MB cache and a maximum sustained transfer rate of 248 MBps. The new edition has a 512 MiB (537 MB) cache and its maximum transfer rate is 13.3 percent faster at 281 MBps.
The workload rate has increased as well, from the 2020 drive’s 180 TB/year to the new drive’s 300 TB/year. It supports 600,000 load/unload cycles, has a 1.2 million-hour MTBF rating, and a three-year warranty. As before, it has a 6 Gbps SATA interface and spins at 7,200 rpm.
The new drive’s weight varies with its capacity: 10 TB – 755 g, 8 TB – 730 g, 6 TB – 710 g, 4 TB – 690 g. We understand this is because each 2 TB increment in capacity adds a platter and read/write head assembly to the drive. That implies the 4 TB drive has 2 x 2 TB platters and the 10 TB version 5 x 2 TB platters.
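That arithmetic can be spelled out; a small sketch assuming exactly 2 TB per platter and one head per platter surface (both inferences from the weights, not published Toshiba specs):

```python
# Inference from the article's per-capacity weights, not published internals:
# assume each platter holds 2 TB and carries two heads (one per surface).
PLATTER_TB = 2

def platters(capacity_tb: int) -> int:
    """Number of platters implied by a 2 TB-per-platter design."""
    return capacity_tb // PLATTER_TB

def heads(capacity_tb: int) -> int:
    """Two read/write heads per platter, one per surface."""
    return 2 * platters(capacity_tb)

for cap_tb in (4, 6, 8, 10):
    print(f"{cap_tb} TB: {platters(cap_tb)} platters, {heads(cap_tb)} heads")
```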
The drive supports up to 64 video cameras and can be used in cabinets with up to 24 bays. Toshiba said in a statement that users “get the capacity to record and playback video’d events in real-time and high resolution, and with object identification and face recognition.”
Toshiba competitor Seagate launched a 10-platter, 20 TB SkyHawk video surveillance drive in 2022. Its current video surveillance drive webpage, though, lists only 1, 2, 3, 4, 6, and 8 TB SkyHawk drives.
Western Digital’s Purple brand drives are its video surveillance drives, with a 1 TB to 14 TB capacity range. The 14 TB version has a 64 MB cache and a transfer rate of up to 110 MBps, far slower than Toshiba’s claim for the S300 Pro.
WD also has a Purple Pro Smart Video Surveillance product, with an 8 TB to 24 TB capacity range, a 256 MB cache, and a transfer rate of up to 267 MBps – again slower than Toshiba’s S300 Pro.
Seagate and WD’s larger capacity video surveillance drives will have helium-filled enclosures and be more costly to manufacture than air-filled drives.
VAST Data has named its first ever chief financial officer in what could be a sign that it is edging closer to an IPO.
It seems strange that VAST has never had a CFO before, despite having raised hundreds of millions of dollars in funding, but it has now selected and hired a very experienced bean counter.
Appointee Amy Shapero was previously the CFO at e-commerce giant Shopify. “As the company prepares for its next stage of growth, this strategic appointment will enable VAST to scale faster, better serve customers and further expand VAST’s global business,” says a VAST statement.
Amy Shapero.
Shapero is said to have played a major role in raising Shopify’s annual revenue from $700 million to nearly $6 billion in less than five years, before leaving the company in 2022. Her previous work spans private and public technology companies in commerce, financial services, marketing, and information services.
According to her LinkedIn profile, she was an “advisor/mentor” to pre-IPO companies and founders from November 2022 until joining VAST this month.
Jeff Hoffmeister is the current CFO at Shopify after joining in October 2022. Shapero left Shopify after the company’s third quarter 2022 earnings announcement on October 27, 2022. Her exit at the time was part of a broader C-suite reshuffle at Shopify. On her departure, Shopify CEO Tobi Lütke had said: “I want to thank Amy for her significant contributions to our company. Over the past five years, Amy has been an important partner in helping to advance our strategy.”
Given her pre-IPO advisory work, VAST seemingly wants to tap that expertise following its past funding rounds. VAST, founded in 2016, raised $40 million in a 2018 A-round, $40 million in a 2019 B-round, $100 million in a 2020 C-round, $83 million in a 2021 D-round, and $118 million in an E-round in 2023. That’s an impressive total of $381 million raised.
As a private company, VAST does not publicly disclose its sales figures, but at Blocks & Files we believe the firm’s sales growth is very solid, and that it may well be profitable or near-profitable. Such a state is, of course, an ideal pre-IPO state.
In December last year, VAST said it had tripled revenue year-on-year, and passed the $200 million annual recurring revenue point at the end of September, as sales responded to a surge in generative AI workloads. It was then valued at $9.1 billion, and had had a positive cash flow for several years, with a gross margin of nearly 90 percent. VAST Co-founder Jeff Denworth said the company would consider an IPO filing at the right time. It had not hired banks for a public listing, but a 2024 IPO could be “on the cards”, depending on the market conditions, according to Denworth.
Blocks & Files has asked whether VAST is now nearer to declaring an IPO plan with the appointment of Shapero. We will update this story when we get a response.
Renen Hallak.
“Amy’s extensive finance, strategy, and operating experience with disruptive, mission-driven, high-growth companies like ours will prove invaluable as VAST continues to scale and expand at a rapid pace,” said Renen Hallak, CEO and co-founder of VAST Data, in a company statement.
Shapero will oversee all financial operations, including budgeting, forecasting, financial reporting, and investor relations. In addition, she will play a “key role” in shaping the company’s strategic direction, added VAST, ensuring alignment between financial goals and business objectives by fostering a “dynamic finance team” to support customers, partners and product development.
“From my first conversations with Renen and the VAST leadership team, it was immediately clear to me that this is an exceptional company, with brilliant leadership and an incredible opportunity in front of us as AI’s impact grows,” said Shapero in another company statement. “I’ve always been data-driven, and throughout my career I’ve helped companies to harness data for insights to improve their customer experience, innovate to build new products, and use economies of scale to create new value.”
VAST’s unified AI data platform features an all-flash-based storage architecture, a “next-gen” database to organize all structured and unstructured data across a global namespace, and containerized Data Engine services running on connected DPU, CPU or GPU servers to power AI. The Data Engine provides the functional underpinnings for AI applications. VAST has moved on from its all-flash storage base and now has a multi-layered, AI-focussed software stack that extends beyond its storage arrays to run on servers.
Micron Technology has produced PCIe Gen6 datacenter SSD technology, as part of a portfolio of memory and storage products to support demand for AI.
The PCIe (Peripheral Component Interconnect Express) bus connects a host computer’s CPU to peripheral device controllers for devices such as SSDs and graphics cards. The technology was showcased at this week’s FMS: The Future of Memory and Storage conference, in Santa Clara, California. At the conference, the company delivered a keynote focussing on how Micron’s products are impacting AI system architectures, while enabling “faster” and “more power-efficient” solutions to manage large datasets.
“AI and other data-intensive workloads are driving the need for higher performance storage in the datacenter,” said Alvaro Toledo, vice president and general manager of Micron’s datacenter storage group. “Our development of the industry’s first PCIe Gen6 SSD for ecosystem enablement is designed to meet these growing future demands, providing unprecedented speed for our customers’ highest-throughput workloads.”
Full duplex per lane speeds.
PCIe 6 is twice as fast as PCIe 5, with a 16 GBps full-duplex per-lane speed. The specification can be accessed here. Current PCIe 5 SSDs, like Micron’s 9550, deliver up to 14 GBps sequential read bandwidth across four lanes. With its PCIe Gen6 datacenter SSD technology, the firm says it is delivering sequential read bandwidths of over 26 GBps to partners. Host computer motherboards and SSDs supporting PCIe 6 could appear from next year onwards.
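As a rough sanity check, per-lane transfer rates double with each PCIe generation; the 16 GBps per-lane figure counts both directions of the full-duplex Gen6 link, so a x4 drive’s one-way ceiling is about 32 GBps. A sketch with idealized numbers, ignoring protocol overhead:

```python
# Rough per-direction PCIe bandwidth per generation. Idealized line-rate
# figures: ~1 byte per transfer after encoding overhead (128b/130b for
# Gen4/5, FLIT mode for Gen6); real SSDs land below these ceilings.
GT_PER_LANE = {4: 16, 5: 32, 6: 64}    # giga-transfers per second per lane

def lane_gbps(gen: int) -> float:
    return GT_PER_LANE[gen] / 8        # ~GB/s per lane, one direction

def x4_gbps(gen: int) -> float:
    return 4 * lane_gbps(gen)          # typical SSD link width

print(f"Gen5 x4 ceiling: {x4_gbps(5):.0f} GBps")  # Micron's 9550 reads up to 14 GBps
print(f"Gen6 x4 ceiling: {x4_gbps(6):.0f} GBps")  # the Gen6 demo delivers 26+ GBps
```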
Micron reckons it is kickstarting a Gen6 PCIe ecosystem.
In June, Micron said its revenues rose 81.5 percent year-on-year in the third quarter ended May 30, 2024, as demand for AI server memory rocketed. Sales were worth $6.8 billion, compared to $3.8 billion a year earlier, when the memory market seemed to be bottoming out.
Gartner’s latest Enterprise Backup and Recovery Software Solutions Magic Quadrant kicks Acronis out of the supplier list and has Veeam strengthening its position in the Leaders quadrant with the highest Ability to Execute ranking.
A comparison with last year’s edition of this MQ shows that, in the Leaders quadrant, Dell falls back in Completeness of Vision while Veeam improves on that axis to join a close-packed, arrowhead-like leading group alongside Commvault, Rubrik, Cohesity, and Veritas. (Magic Quadrant details and features are explained in a bootnote below.)
There are no Challenger suppliers listed. Arcserve was the sole challenger in 2023 but it has been demoted in Ability to Execute terms and is now a niche player. In the Niche Players quadrant Acronis exits the MQ completely, because, the report says: “its focus on prioritizing MSPs and edge/endpoint device workloads resulted in its inability to meet the inclusion criteria.” This leaves Arcserve, Unitrends, Microsoft and OpenText in the Niche Players’ box. As before, Druva, HYCU and IBM are the Visionaries, with IBM suffering a worse Completeness of Vision rating compared to 2023.
The Gartner report writers make some strategic planning assumptions:
By 2028, 75 percent of enterprises will use a common solution for backup and recovery of data residing on-premises and in cloud infrastructure, compared with 20 percent in 2024.
By 2028, 75 percent of enterprises will prioritize backup of SaaS applications as a critical requirement, compared with 15 percent in 2024.
By 2028, 90 percent of enterprise backup and recovery products will include embedded technology to detect and identify cyberthreats, compared with fewer than 45 percent in 2024.
By 2028, 75 percent of large enterprises will adopt backup as a service (BaaS), alongside on-premises tools, to back up cloud and on-premises workloads, compared with 15 percent in 2024.
By 2028, 75 percent of enterprise backup and recovery products will integrate generative AI (GenAI) to improve management and support operations, compared with fewer than 5 percent in 2024.
This is more or less an instruction to suppliers to put these features, if missing, on their development roadmaps.
A big question for us is: which of the three Visionaries will make the jump into the Leaders box?
The report has detailed strengths and cautions notes on each supplier, and you can download a copy of this MQ from HYCU’s website (registration required).
Bootnote
The “Magic Quadrant” is a 2D space defined by axes labelled “Ability To Execute” and “Completeness of Vision”, and split into four boxes: “Niche Players” and “Visionaries” at the bottom, “Challengers” and “Leaders” at the top. The best-placed vendors are in the top-right Leaders box, balancing execution ability with completeness of vision. The nearer they are to the top-right corner of that box, the better.
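The quadrant geometry reduces to a pair of threshold tests; a toy classifier (Gartner publishes no numeric scores, so the 0–100 scale and midpoint here are invented for illustration):

```python
# Toy illustration of the quadrant geometry: map a vendor's two scores
# (arbitrary 0-100 scale; Gartner does not publish numbers) to box names.
def quadrant(ability_to_execute: float, completeness_of_vision: float,
             midpoint: float = 50.0) -> str:
    high_exec = ability_to_execute >= midpoint      # vertical axis
    high_vision = completeness_of_vision >= midpoint  # horizontal axis
    if high_exec and high_vision:
        return "Leaders"
    if high_exec:
        return "Challengers"
    if high_vision:
        return "Visionaries"
    return "Niche Players"

print(quadrant(80, 85))  # top right: Leaders
print(quadrant(30, 70))  # bottom right: Visionaries
```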
Cloud storage provider Backblaze pulled in $31.3 million revenues during calendar Q2, up 27 percent YoY, and reported a $10.3 million net loss versus the $14.3 million net loss a year earlier. It expects revenues of $32.6 million, +/- $2 million, next quarter.
Backblaze’s revenue growth rate is accelerating.
B2 Cloud Storage revenue was $15.4 million, an increase of 43 percent YoY, while Computer Backup revenue was $15.9 million, an increase of 15 percent YoY.
William Blair analyst Jason Ader writes to subscribers, telling them: “The revenue mix continues to shift toward B2 Cloud, with B2 Cloud revenue growing 43% year-over-year in the quarter, representing 49% of total revenue. With strong growth expected to continue throughout 2024, we expect B2 Cloud will surpass 50% of revenue in the second half. Backblaze continues to position B2 as a low-cost cloud storage provider and noted that it has seen no impact from broader macro challenges. Meanwhile, Computer Backup continues to surprise to the upside, with last year’s price increase resulting in lower churn than expected in the first half of this year.”
…
Marc Suidan.
Backblaze has hired Marc Suidan as its CFO, replacing the retiring Frank Patchel. Co-founder, chairman and CEO Gleb Budman said: “I would like to thank Frank for all of his contributions to Backblaze. He’s been an integral part of our company’s success, especially leading us through our successful IPO and for years after. We greatly appreciate his years of service and wish him well in retirement.” Suidan’s background includes leading a publicly held company as CFO, and advising and leading companies of all sizes in the technology and media industries, including numerous storage and software as a service (SaaS) cloud companies. CRO Jason Wakeam was hired in the quarter as well.
…
Backup and data security supplier Commvault has expanded its cyber and data security ecosystem through strategic integrations with an array of security partners: Acante (data access governance), (Data Security Posture Management), Google Cloud (threat Intelligence), Splunk (threat detection and response), and Wiz (cloud security). These new integrations are available immediately through Commvault and its partners. For detailed product specifications, configuration guides, and additional resources, visit Commvault’s Partner page.
…
Datalake service supplier Cribl announced an agreement with managed security services provider Vijilan Security. Zac Kilpatrick, VP of Global Go-to-Market Partners at Cribl, said: “By combining Cribl’s vendor agnostic data management solutions with Vijilan’s managed extended detection and response, joint customers are equipped to take complete control over their enterprise data to ensure the most secure digital environments.”
…
FalconStor has reported calendar Q2, 2024, revenues of just $2.4 million, flat YoY, with a $30,963 net loss, better than the year-ago net loss of $456,785. CEO Todd Brooks said in a statement: “In Q2, we continued to effectively manage operating expenses and bolster our cash position, while we once again grew hybrid cloud ARR run-rate by over 100% compared to Q2 2023. Our growth is fueled by the expansion of FalconStor’s data protection and migration technology across the IBM global ecosystem, spanning on-premises, cloud, and MSP segments of the IBM Power customer base.” FalconStor obtained certification of its StorSafe and StorGuard integration with IBM Storage Ceph, IBM’s on-premises AI data lake solution.
…
FMS 2024 Best of Show awards:
SSD category – Most Innovative Technology – KIOXIA’s RAID Offload data protection technology to offload RAID parity compute
Most Innovative Business Application – Graid Technology SupremeRAID, Solidigm SSDs, CheetahRAID Raptor edge servers, and Tuxera Fusion File Share
Most Innovative Memory Technology – Industry Standards Category – SNIA for the EDSFF Specification
Most Innovative Hyperscaler Implementation – Hammerspace for its support of Meta’s AI Research SuperCluster
Most Innovative Hyperscaler Implementation – All-Flash and Hybrid Storage Array Category Winner – Infinidat’s InfiniBox G4 Family
Most Innovative Artificial Intelligence (AI) Application – Unique Products – Neo Semiconductor 3D X-AI memory chip technology
Most Innovative Artificial Intelligence (AI) Application – Phison’s aiDAPTIV+ technology
…
Data management supplier Komprise has published “The Komprise 2024 State of Unstructured Data Management” report, which examines the challenges and opportunities with unstructured data in the enterprise. The report summarizes responses of 300 global enterprise IT leaders (director and above) at US firms with more than 1,000 employees. The survey was conducted by a third party in June 2024. Most (70 percent) organizations are still experimenting with new AI technologies as “preparing for AI” remains a top data storage and data management priority. Yet cost optimization is an even higher priority this year, and they are trying to fit AI into existing IT budgets. Only 30 percent say they will increase their IT budgets to support AI projects. Get a copy of the report here (registration required).
…
Lam Research has a document discussing its cryogenic etching approach to 1,000-layer 3D NAND. Download it here.
…
Mechanical Orchard, a startup that migrates mainframe apps to the public cloud, has raised $50 million in a Series B round led by GV, formerly Google Ventures. It previously raised $24 million in an A-round in February this year. It must have demonstrated fast product development progress to get a B-round just six months later.
Mechanical Orchard team.
…
MSP backup service provider N-able reported $119.4 million revenues in calendar Q2 of 2024, up 12.6 percent, with a $9.5 million profit, more than double the year-ago $4.5 million. CFO Tim O’Brien said: “Our second quarter performance marks our seventh consecutive quarter operating north of the Rule of 45 on a constant currency revenue growth and adjusted EBITDA basis.”
William Blair’s Ader said in a statement: “Outperformance in the quarter was driven by consistent demand for N-able’s backup and security suites (Cove and EDR/MDR were standouts) and enhanced focus on long-term contracts (drove some of the upside due to higher upfront revenue recognition). In addition, the company posted ACV bookings growth of 20% year-over-year (in our view, the best leading indicator), reflecting still strong secular trends around IT outsourcing for both SMBs and enterprises amid a maturing MSP market.”
…
A comparison of Backblaze and N-able quarterly revenue growth rates shows parallel curves. Both businesses are growing consistently and steadily, with N-able’s MSP channel bringing in a lot more revenue.
…
NEO Semiconductor announced the development of its 3D X-AI chip technology, targeted to replace the current DRAM chips inside high bandwidth memory (HBM) to solve data bus bottlenecks by enabling AI processing in 3D DRAM. 3D X-AI can reduce the amount of data transferred between HBM and GPUs during AI workloads. NEO says this is set to revolutionize the performance, power consumption, and cost of AI Chips for AI applications like generative AI.
…
Nimbus Data launched its ExaDrive EN line of Ethernet-native SSDs supporting NVMe-oF and NFS protocols. ExaDrive EN is based on an ARM SoC that provides processing power for functions including native NFS and NVMe-oF/TCP protocol support, AES-256 inline encryption, and full data checksums. ExaDrive EN adheres to the SNIA Native NVMe-oF Drive Specification v1.1 to ensure compatibility with EBOF (Ethernet Bunch of Flash) and Ethernet switch-based enclosures. ExaDrive EN will be initially available in 16 TB capacity using TLC flash with higher capacities expected in 2025.
Nimbus ExaDrive EN drives (top left), Nimbus FlashRack (top right) and Nimbus SSPs (bottom).
…
Nimbus Data unveiled HALO Atmosphere storage software, encompassing block, file, and object storage. Its Flexspaces feature enables all data types and protocols to share one logical pool, including block (NVMe-oF, iSCSI, Fibre Channel, SRP), file (NFS, SMB), and object (S3- compliant) storage. HALO supports any workload type (mission-critical enterprise, extreme performance, maximum efficiency, data mobility) on one platform. HALO Atmosphere is available today with Nimbus Data’s all-flash systems. HALO will be available on the public cloud in Q4 2024.
…
Nimbus Data also announced its new FlashRack all-flash storage systems powered by its HALO software. Nimbus says that with Federation, hundreds of FlashRacks can be centrally managed, simplifying administration at scale. FlashRack features Nimbus Data’s patented Parallel Memory Architecture (PMA), a stateless design that writes data to enterprise-grade flash memory in a single operation. We’re told that a single FlashRack cabinet offers up to 100 PB of effective capacity, 3 TBps of throughput, and 200 million IOPS, all while drawing 18 kW of power.
FlashRack uses industry-standard 16 TB, 32 TB, and 64 TB NVMe SSDs, as well as a new option – SSPs, or Solid State Packs – which combine multiple SSDs into one hot-pluggable and encrypted storage module of up to 512 TB. Using Flexspaces, SSPs can be mirrored, then split and unmounted. Customers can purchase FlashRack without any capacity and then add qualified SSDs from major vendors.
A 2RU FlashRack Turbo has up to 1.5 PB raw capacity with 24 x 64 TB SSDs or 3 x 512 TB SSPs and needs 900 W typical power. A 2RU FlashRack Ultra has up to 768 TB raw capacity with 24 x 32 TB SSDs or 3 x 256 TB SSPs, needing 700 W typical power. Find out more here.
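As a quick consistency check on those configurations (the SSDs-per-pack figure is our inference, not a Nimbus-published number):

```python
# Sanity-checking the published FlashRack configurations. Raw TB arithmetic
# only; "effective capacity" claims additionally assume data reduction.
def raw_tb(n_devices: int, device_tb: int) -> int:
    return n_devices * device_tb

# FlashRack Turbo: 24 x 64 TB SSDs, or 3 x 512 TB SSPs
assert raw_tb(24, 64) == 1536          # ~1.5 PB, as stated
assert raw_tb(3, 512) == 1536          # the SSP option matches
assert 512 // 64 == 8                  # implies 8 SSDs per 512 TB SSP

# FlashRack Ultra: 24 x 32 TB SSDs, or 3 x 256 TB SSPs
assert raw_tb(24, 32) == 768
assert raw_tb(3, 256) == 768
print("SSD and SSP configurations are self-consistent")
```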
…
Nimbus Data unwrapped BatArray, a fusion of the company’s FlashRack all-flash systems with Tesla’s Cybertruck, creating the world’s first mobile flash storage data center. BatArray uses six FlashRack Turbo systems, each storing 1.5 PB, to house 9 PB of raw all-flash storage. After considering redundancy and 3:1 data reduction, 25 PB of effective capacity is possible. Cybertruck provides a 240V 40A power circuit in the truck bed. It’s possible to run the whole infrastructure from this single circuit. With its 123 kWh battery, Cybertruck can power the entire storage system for up to 24 hours entirely from its EV battery.
Nimbus Data stand at FMS showing BatArray Cybertruck on the right.
With its patented Parallel Memory Architecture, BatArray delivers up to 360 GBps of ingress performance, or approximately 3 terabits per second. Nimbus Data claims this data rate is three times faster than the massive AWS Snowmobile, Amazon’s original data transfer vehicle based on a 45-foot long semi-trailer truck. All data is automatically encrypted in hardware using AES-256 with KMIP support. Egress speed is even faster, reaching up to 600 GBps, or nearly 5 terabits per second.
At maximum performance, a user can fill BatArray to capacity in about 7 hours, still leaving more than 200 miles of range in the Cybertruck battery. With 400 Gigabit Ethernet FR4 fiber cabling and transceivers, this transmission rate is possible over 2 km, allowing for some distance between BatArray and the source or destination connection points. BatArray supports industry standard NFS, SMB, S3, and NVMe-oF protocols.
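Those headline numbers roughly check out; a sketch using decimal units, with the power draw assumed to be the six FlashRack Turbos alone (networking gear excluded, which is our simplification):

```python
# Checking BatArray's claims: ~7 h to fill 9 PB at 360 GB/s ingress, and
# ~24 h of battery runtime. Decimal units; real rates and battery behavior
# will vary, and power here excludes switches and cooling.
RAW_PB = 9
INGRESS_GBPS = 360          # GB/s ingress
BATTERY_KWH = 123           # Cybertruck battery
TURBO_POWER_KW = 0.9        # 900 W typical per FlashRack Turbo
N_SYSTEMS = 6

fill_hours = RAW_PB * 1e6 / INGRESS_GBPS / 3600   # PB -> GB, seconds -> hours
print(f"fill time: {fill_hours:.1f} h")           # ~6.9 h, matching "about 7 hours"

storage_kw = N_SYSTEMS * TURBO_POWER_KW
runtime_hours = BATTERY_KWH / storage_kw
print(f"battery runtime: {runtime_hours:.0f} h")  # ~23 h, close to the 24 h claim
```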
…
NVM Express, Inc. today released three new specifications and eight updated specifications. The three new specifications are the NVMe Boot specification, the Subsystem Local Memory command set and the Computational Programs command set. The updated specifications are the NVMe 2.1 Base specification, Command Set specifications (NVM Command Set, ZNS Command Set, Key Value Command Set), Transport specifications (PCIe Transport, Fibre Channel Transport, RDMA Transport and TCP Transport) and the NVMe Management Interface specification. The NVM Express specifications and the new feature specifications are available for download on the NVM Express website.
…
Semiconductor designer Rambus says it has advanced data center server performance with the industry-first Gen 4 DDR5 Register Clock Driver (RCD). This technology boosts the data rate to 7200 MT/s, setting a new benchmark for performance. It enables a 50 percent increase in memory bandwidth over today’s 4800 MT/s DDR5 module solutions. You can read more about it here.
…
Data protector Rubrik has a tech integration and partnership deal with Mandiant, part of Google Cloud, aiming to expedite customers’ threat detection and path to cyber recovery. Mandiant Threat Intelligence is now integrated directly in the Rubrik Security Cloud. Breaking intrusions, active campaigns, and evolving threats detected by Mandiant Threat Intelligence are now integrated into Rubrik’s Threat Monitoring capability, providing threat intelligence to Rubrik Enterprise Edition customers. Rubrik’s Threat Hunting and Threat Monitoring capabilities are used to identify a safe recovery point by automatically applying Mandiant Threat Intelligence’s thousands of knowledge points against every Rubrik backup. Rubrik Clean Room Recovery allows customers to recover and store data in a clean Google Cloud environment or multi-cloud environments. Rubrik and Mandiant can bring together their respective Ransomware Response and Incident Response teams to provide victims with additional investigative and recovery support. Read more about all this in a Rubrik blog.
…
Silicon Motion announced its SM2508, claimed to be the most power-efficient PCIe Gen5 NVMe 2.0 client SSD controller for AI PCs and gaming consoles. It’s the world’s first PCIe Gen5 client SSD controller using TSMC’s 6 nm EUV process, offering a 50 percent reduction in power consumption compared to competitive offerings built on a 12 nm process. With less than 7 W power consumption for the entire SSD, we’re told it delivers 1.7x better power efficiency than PCIe Gen4 SSDs and up to 70 percent better than current competitive PCIe Gen5 offerings on the market.
Silicon Motion SM2508.
…
South Korean memory, NAND and SSD manufacturer SK hynix will receive up to $450 million in funding and access to $50 million in loans as part of the US CHIPS and Science Act for its investment to build a production base for semiconductor packaging in Indiana. It plans to seek from the U.S. Department of the Treasury a tax benefit equivalent of up to 25 percent of the qualified capital expenditures through the Investment Tax Credit program. This follows SK hynix’s announcement in April that it intends to invest $3.87 billion to build a production base for advanced packaging in Indiana, creating an expected 1,000 jobs.
…
Western Digital launched two new automotive flash products – the Western Digital AT EN610 NVMe SSD and iNAND AT EU75 – for next-generation, high-performance, centralized computing (HPCC), advanced driver-assistance systems (ADAS), and other autonomous driving systems, at FMS 2024. It also launched the RapidFlex Interposer, which converts PCIe SSD signals to Ethernet so PCIe eSSDs can be deployed in either an Ethernet-switched or a PCIe-switched EBOF NVMe-oF architecture. It unveiled the world’s first 8TB SD card, the SanDisk SDUC UHS-1, and a 16TB external SSD at FMS 2024 as well as a new 64TB eSSD for storage-intensive applications. WD previewed two PCIe 5.0 x 4 lane M.2 2280 NVMe SSDs; one performance focussed and the other a DRAM-less mainstream drive. Both used BiCS8 218-layer NAND.
…
Winbond Electronics unveiled the W25N01KW, a 1Gb 1.8V QspiNAND flash device. It’s designed to meet the increasing demands of wearable and battery-operated IoT devices with low standby power, small-form-factor package, and continuous read for fast boot and instant-on support, achieving up to 52 MBps in both Continuous Read and Sequential Read modes. It’s available in compact WSON8 (8mm x 6mm) and WSON8 (6mm x 5mm) packages.
ExaGrid has updated its Tiered Backup Storage system with extra support for Veeam workloads.
The company’s appliances ingest backup data to a disk cache landing zone, with post-ingest deduplication to a repository tier providing efficient capacity usage.
ExaGrid’s systems include a non-network-facing tier to create a data security air gap, and data object immutability protection against ransomware and other malicious attacks.
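The post-ingest deduplication idea behind the repository tier is straightforward to sketch; a toy fixed-size-block model (ExaGrid’s production algorithm is adaptive and considerably more sophisticated):

```python
import hashlib

# Toy block-level deduplication: identical blocks are stored once, and each
# backup object is kept as a "recipe" of block digests.
BLOCK = 4096

class DedupStore:
    def __init__(self):
        self.blocks = {}    # digest -> block bytes (stored once)
        self.objects = {}   # backup name -> ordered list of digests

    def ingest(self, name: str, data: bytes) -> None:
        recipe = []
        for i in range(0, len(data), BLOCK):
            chunk = data[i:i + BLOCK]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # duplicates stored once
            recipe.append(digest)
        self.objects[name] = recipe

    def restore(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.objects[name])

store = DedupStore()
backup = b"A" * 8192 + b"B" * 4096     # three logical blocks, two unique
store.ingest("monday.vbk", backup)
assert store.restore("monday.vbk") == backup
print(f"{len(store.blocks)} unique blocks stored for 3 logical blocks")
```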
The newly launched version 7.0.0 platform supports Veeam writing to ExaGrid Tiered Backup Storage as an object store target using the S3 protocol, as well as supporting Veeam Backup for Microsoft 365 directly to ExaGrid. Veeam added the ability to write backups directly to object storage appliances in February 2023.
ExaGrid has achieved “Veeam Ready-Object” status with immutability, and “Veeam SOSAPI (Smart Object Storage API)” certification, to verify the new features. Veeam states that SOSAPI interacts with object storage via API requests: it sends requests to an S3-compatible object storage repository, like ExaGrid, and receives the necessary information in a set of XML files. These files contain details on the backup target system, object storage repository capacity and correct storage usage, object storage capabilities, and the state of backup processing.
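For illustration, the capacity-reporting side of that XML exchange might be consumed like this; the element names below are assumptions for the sketch, not Veeam’s published schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical SOSAPI-style capacity report: a small XML file the backup
# software reads to learn how full the object store is. Element names are
# invented for this sketch; consult Veeam's SOSAPI docs for the real schema.
capacity_xml = """<CapacityInfo>
  <Capacity>109951162777600</Capacity>
  <Available>54975581388800</Available>
  <Used>54975581388800</Used>
</CapacityInfo>"""

root = ET.fromstring(capacity_xml)
capacity = int(root.findtext("Capacity"))
available = int(root.findtext("Available"))
print(f"{available / capacity:.0%} of {capacity / 1e12:.0f} TB free")
```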
Bill Andrews.
“We continue to update our Tiered Backup Storage solution, as we know that data is not truly protected by backups if the backup solution itself is vulnerable to threat actors,” said Bill Andrews, president and CEO of ExaGrid. “In addition to the S3 Object Locking for Veeam in Version 7.0.0, ExaGrid also provides Retention Time-Lock with a non-network-facing tier, delayed delete policy, and immutable data objects – it is double security.”
ExaGrid’s Landing Zone is designed to support fast backups and restores, and “instant” VM recoveries, and its Repository Tier offers the “lowest cost” for long-term retention, claims the supplier.
The Marlborough, Massachusetts-headquartered firm says around half of its sales are now generated outside the US. Last month, the privately-owned business said it added 137 new customers in its latest quarter, claiming it added 64 six- and seven-figure deals as part of that, to break its own sales records. It wasn’t obliged to reveal the actual sales figures.
Contextual AI is using WEKA’s Data Platform parallel filesystem software to speed AI training runs as it develops counter-hallucinatory retrieval augmented generation (RAG) 2.0 software.
Contextual AI’s CEO and co-founder Douwe Kiela was part of the team that pioneered RAG at Facebook AI Research (FAIR) in 2020, by augmenting a language model with a retriever to access data from external sources (e.g. Wikipedia, Google, internal company documents).
A typical RAG system uses a frozen off-the-shelf model for embeddings, a vector database for retrieval, and a black-box language model for generation, stitched together through prompting or an orchestration framework. It can be unreliable, producing misleading, inaccurate, and false responses (hallucinations).
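In miniature, such a frozen pipeline looks like the following sketch, where the embedder, vector store, and generator are all crude stand-ins rather than real components:

```python
import math
from collections import Counter

# Toy "frozen RAG" pipeline: an off-the-shelf embedder, a vector index, and
# a black-box generator, glued together by prompting. Every component here
# is a stand-in, not any specific product.
DOCS = [
    "The S300 Pro is a surveillance hard drive from Toshiba.",
    "WEKA sells parallel filesystem software for AI workloads.",
]

def embed(text: str) -> Counter:
    """Stand-in embedder: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str) -> str:
    """Vector-store stand-in: nearest document by cosine similarity."""
    return max(DOCS, key=lambda d: cosine(embed(query), embed(d)))

def generate(prompt: str) -> str:
    """Black-box LLM stand-in: just echoes the prompt it was given."""
    return prompt

context = retrieve("what does WEKA sell?")
answer = generate(f"Answer using this context: {context}")
print(answer)
```

Because each stage is frozen, a retrieval mistake propagates straight into the answer with nothing to correct it, which is the failure mode RAG 2.0 targets.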
Kiela and his team are developing RAG 2.0 to address what Contextual says are the inherent challenges of the original RAG design. RAG 2.0 optimizes the language model and retriever end-to-end as a single system, we’re told. It pretrains, fine-tunes, and aligns all components as one integrated system using reinforcement learning from human feedback (RLHF), back-propagating through both the language model and the retriever to maximize performance.
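The joint-training argument can be made concrete with the marginal-likelihood objective from the original RAG formulation: p(answer | query) sums p(doc | query) × p(answer | query, doc) over retrieved documents, so the loss gradient reaches the retriever’s scores, a signal a frozen pipeline never receives. A sketch with made-up numbers, not Contextual’s code:

```python
import math

# Marginal likelihood of the correct answer, summed over retrieved docs.
# Because the retriever's softmax scores sit inside this sum, the loss is
# differentiable with respect to them.
retriever_scores = [2.0, 0.1]       # learnable logits, one per document
p_answer_given_doc = [0.9, 0.2]     # generator likelihood of the right answer

def marginal(scores):
    z = [math.exp(s) for s in scores]
    p_doc = [v / sum(z) for v in z]          # softmax over documents
    return sum(pd * pa for pd, pa in zip(p_doc, p_answer_given_doc))

loss = -math.log(marginal(retriever_scores))

# Finite-difference gradient: nudging a retriever score changes the loss.
eps = 1e-6
bumped = [retriever_scores[0] + eps, retriever_scores[1]]
grad_score0 = (-math.log(marginal(bumped)) - loss) / eps
print(f"d(loss)/d(score_0) = {grad_score0:.4f}")  # negative: raise score_0
```

The negative gradient says the training signal would push the retriever to rank the more useful document higher, which is exactly what end-to-end back-propagation exploits.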
Contextual founders CEO Douwe Kiela (left) and CTO Amanpreet Singh (right)
Contextual claims: “Using RAG 2.0, we’ve created our first set of Contextual Language Models (CLMs), which achieve state-of-the-art performance on a wide variety of industry benchmarks. CLMs outperform strong RAG baselines based on GPT-4 and the best open-source models by a large margin, according to our research and our customers.”
It produced test results by comparing its Contextual Language Models (CLMs) with frozen RAG systems across a variety of axes:
Open domain question answering: Contextualizes the canonical Natural Questions (NQ) and TriviaQA datasets to test each model’s ability to correctly retrieve relevant knowledge and accurately generate an answer. It also evaluates models on the HotpotQA (HPQA) dataset in the single-step retrieval setting. All datasets use the exact match (EM) metric.
Faithfulness: HaluEvalQA and TruthfulQA are used to measure each model’s ability to remain grounded in retrieved evidence and avoid hallucinations.
Freshness: It measures the ability of each RAG system to generalize to changing world knowledge using a web search index and showing accuracy on the recent FreshQA benchmark.
Each of these axes is important for building production-grade RAG systems, Contextual says. CLMs significantly improve performance over a variety of strong frozen RAG systems built using GPT-4 or state-of-the-art open source models like Mixtral.
Contextual builds its large language model (LLM) on Google Cloud, with a training environment consisting of A3 VMs featuring NVIDIA H100 Tensor Core GPUs, and runs it there. It originally used Google’s Filestore, but this wasn’t fast enough, nor did it scale to the extent required, it said.
Its AI training code is written in Python and reads a large number of tiny files, which made loading data from Google Filestore extremely slow. Long model checkpointing times also meant training would stop for up to five minutes while each checkpoint was being written. Contextual needed a file store that could move data from storage to GPU compute faster, with quicker metadata handling, training-run checkpointing, and data preprocessing.
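The small-file problem is easy to reproduce at toy scale: each tiny file costs a per-file open, which on a networked file store is a metadata round trip, while a single packed archive (the tar/WebDataset-style approach many data loaders take) pays that cost once. A sketch with invented paths and sizes:

```python
import os
import tarfile
import tempfile

tmp = tempfile.mkdtemp()
data_dir = os.path.join(tmp, "shards")
os.makedirs(data_dir)

# Write 1,000 tiny "training sample" files; on a networked file store each
# one would later cost a metadata round trip to open.
for i in range(1000):
    with open(os.path.join(data_dir, f"sample_{i:04d}.txt"), "w") as f:
        f.write(f"sample {i}")

# Pack them into one archive, as tar/WebDataset-style loaders do.
archive = os.path.join(tmp, "shards.tar")
with tarfile.open(archive, "w") as tar:
    tar.add(data_dir, arcname="shards")

# Reading the archive touches one file handle instead of 1,000.
with tarfile.open(archive) as tar:
    members = [m for m in tar.getmembers() if m.isfile()]
print(len(members))  # → 1000
```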
With its consultancy partner Accenture, Contextual looked at alternative filesystems from DDN (Lustre), IBM (Storage Scale) and WEKA (Matrix – now the Data Platform), also checking out local SSDs on the GPU servers, comparing them in a proof-of-concept environment in Google’s Cloud.
We’re told WEKA outperformed Google Filestore, with 212 percent higher aggregate read bandwidth, 497 percent higher aggregate write bandwidth, 212 percent higher aggregate read IOPS, 282 percent higher aggregate write IOPS, and 70 percent lower average latency, while cutting model checkpoint times by 4x. None of the other contenders came close.
Contextual uses WEKA to manage its 100TB AI training data sets. The WEKA software runs on a 10-node cluster of GCE C2-std-16 VMs, providing a high-performance data layer built on NVMe devices attached to each VM, for a total of 50 TB of flash capacity. The single WEKA namespace extends to an additional 50 TB of Google Object Storage, providing a data lakelet to retain training data sets and the final production models.
Overall Contextual cloud storage costs have dropped by 38 percent per TB, it says, adding that its developers are more productive. Contextual, which was founded in 2023, raised $80 million earlier this month in a Series A round.
There’s more detail about RAG 2.0 and CLMs in a Contextual AI blog which we’ve referenced for this article.
Kioxia America is showcasing a prototype broadband SSD with an optical interface for next-generation data centers, at this week’s Future of Memory and Storage (FMC) conference in Santa Clara, California.
In the demo a board-level component interfaces to a PCIe cable and then carries the PCIe signal across the optical link to the SSD. By replacing the electrical wiring interface with an optical one, the technology is said to allow greater physical distance between compute and storage devices: 40 meters (circa 131 feet) at present, with 100 meters on Kioxia’s roadmap. Along with slimming down the wiring, the new option is also expected to deliver “high flexibility to data center system designs and applications,” according to Kioxia. Energy efficiency and “high signal quality” are promised as well.
By adopting an optical interface, it becomes possible to aggregate the individual components that make up systems, such as SSDs and CPUs. “This furthers the evolution of a disaggregated computing system that can efficiently utilize resources according to a specific workload,” says Kioxia. SSDs could be located in cooler environments than hot server rack areas.
The optical interface may also enhance high-performance computing (HPC) environments.
Kioxia foresees using an optical link for future PCIe gen 5, gen 6, 7 and 8 SSDs. It also envisages using optical switching to extend the host-SSD distance further and enable disaggregation.
The showcased technology at FMC is the result of the Japanese Next Generation Green Data Center Technology Development Project, which is funded by that country’s New Energy and Industrial Technology Development Organization (NEDO).
As part of the project, new technologies are being developed with the goal of achieving more than 40 percent in energy savings when compared to current datacenters in operation.
Kioxia America is a subsidiary of Kioxia Corporation, a worldwide supplier of flash memory and solid-state drives. Kioxia is rumored to be preparing an IPO to recapitalize itself after paying off debts, but has not confirmed this.
Last month, Kioxia said it was sample shipping 2 terabit NAND chips, the highest capacity NAND chips currently available. Pure Storage, for instance, was said to be queueing up to buy them for its own storage product portfolio. The new shipped chips have a QLC (4bits/cell) design and use Kioxia’s BiCS 8 218-layer flash node architecture.
SK hynix has worked with Los Alamos National Laboratory (LANL) on developing an object-based computational storage (OCS) system and is showing it at FMS 2024.
It involves having object data stored in Parquet files on NVMe SSDs. When LANL wants to do a large scale simulation run, the data on the SSDs is preprocessed by OCS to reduce the dataset sent to the analysis servers.
The context for this is that, in HPC, analysis of physics simulation data requires large amounts of data held in the storage nodes to be fed to the compute nodes for processing. This requires network bandwidth and enough memory in the compute nodes to hold the processing data set.
But, SK hynix says, “the actual data required for analytics is only a small part of the total data.” Computational storage can be used to reduce the data transfer amount by selecting only the data needed for the processing data set, it says, pre-processing it so to speak.
The intent is to cut the time needed for analysis of physics simulation data; a more than 6.5x speed-up was demonstrated when SK hynix showed its prototype OCS system, developed with LANL, at the Supercomputing 2023 event in Denver, CO. The company claimed OCS “can perform analytics independently without help from compute nodes,” saying this highlighted “its potential as the future of computational storage in HPC.”
SK hynix data awareness slide from OCS KV-CSD presentation
OCS is said to be data-aware. Block-based storage knows nothing about its data contents apart from Logical Block Addresses and ranges, whereas an object storage system stores metadata that can include data content identifiers such as ID, key, name, etc. Local processing on key-value computational storage drives (KV-CSD) can then use this indexing metadata to select the required data items.
The KV-CSD is a hardware-accelerated key-value store and can be based on existing key-value stores such as RocksDB and LevelDB. It is described in a downloadable SK hynix, LANL, and NVIDIA research paper: “KV-CSD: A Hardware-Accelerated Key-Value Store for Data-Intensive Applications.” This says the KV-CSD consists of an NVMe SSD and “a System-on-a-Chip (SoC) that implements an ordered key-value store atop the SSD.” The SoC has 4 x ARM Cortex A53 CPU cores and 8GB of DDR4 RAM, and runs Ubuntu; it implements an LSM-tree based KV store.
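The ordered key-value semantics the paper describes can be sketched in a few lines. Keeping keys sorted, as an LSM-tree does, is what makes range scans cheap, and range scans are what let a data-aware drive return only the records a query needs. This in-memory toy only illustrates the interface the device offloads, not its implementation:

```python
import bisect

class OrderedKV:
    """Toy ordered key-value store (RocksDB/LevelDB-style interface)."""

    def __init__(self):
        self._keys = []   # kept sorted, as an LSM-tree keeps its levels
        self._vals = {}

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def get(self, key):
        return self._vals.get(key)

    def scan(self, lo, hi):
        # Range query over [lo, hi): the operation that lets processing
        # near the storage return only the required data items.
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_left(self._keys, hi)
        return [(k, self._vals[k]) for k in self._keys[i:j]]

db = OrderedKV()
for ts, reading in [(103, "c"), (101, "a"), (102, "b"), (250, "z")]:
    db.put(ts, reading)

print(db.scan(101, 200))  # → [(101, 'a'), (102, 'b'), (103, 'c')]
```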
The claimed result is that: “Through offloaded processing, KV-CSD streamlines data insertion, reduces host-device data movement for both background data reorganization and query processing, and shows up to 10.6x lower write times and up to 7.4x faster queries compared to the current state-of-the-art software key-value stores on a real scientific dataset.”
The paper notes that: “by directly implementing key-value storage management in device, KV-CSD provides opportunities to leverage low-level storage interfaces, such as Zoned Namespace, to optimize performance whereas a software key-value store must rely on the underlying filesystem and the operating system to adopt these optimizations accordingly.” In fact the KV-CSD has a 15TB NVMe zoned namespace SSD using PCIe gen 3.
The OCS project involves the analytics software stack using the Apache analytics ecosystem, including Substrait and Arrow. Substrait provides a standard and open representation of analytic query plans enabling parts of the query to be pushed down from an S3-based storage server to OCS computational storage, specifically an Object based Computational Storage Array (OCSA) used as a backend storage.
There, indexing techniques are used to filter the stored Parquet file dataset and select only the data needed for the query. Apache Arrow software has a language-independent columnar memory format for flat and hierarchical data with a common transfer format. It can be used to transfer such query results, the reduced size data set, back up the stack, using much less network bandwidth, to analytics servers that don’t need as much memory as before.
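A toy comparison shows why pushdown saves bandwidth: without it the host pulls the whole table and filters locally; with it the filter and column projection run next to the storage, and only the reduced result crosses the network. The table, filter, and byte accounting below are invented for illustration, not LANL’s data:

```python
import json

# Pretend this Parquet-like table lives on the computational storage device.
table = [{"cell": i, "temp": i % 500, "pressure": i % 7} for i in range(10_000)]

def bytes_of(rows):
    # Crude proxy for the on-the-wire size of a result set.
    return len(json.dumps(rows).encode())

# Without pushdown: the host pulls everything, then filters.
shipped_all = bytes_of(table)

# With pushdown: the device filters rows and projects columns,
# then ships only the reduced result up to the analytics servers.
reduced = [{"cell": r["cell"], "temp": r["temp"]}
           for r in table if r["temp"] > 495]
shipped_reduced = bytes_of(reduced)

print(len(reduced))                    # → 80
print(shipped_reduced < shipped_all)   # → True
```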
SK hynix and LANL said: “Orders of magnitude of data movement can be saved by pushing such indexing capabilities closer to the storage devices.”
A demonstration of OCS at FMS 2024 uses the Paraview/VTK (Visualization Toolkit) with Substrait sending a portion of the analytics query plan to the OCS system.
Overall client and OCS system diagram
Sungsoo Ryu, head of memory systems research at SK hynix, stated: “This novel approach to data processing minimizes redundant data transfers between analytics applications and storage, and lightens the storage software stack. This accelerates the performance of data-intensive applications such as big data analytics, artificial intelligence and more. SK hynix is striving to develop an analytics ecosystem in collaboration with industry partners.”
Gary Grider, High Performance Computing division leader at Los Alamos, said: “Our large-scale indexing efforts, fueled by industry-standard ecosystems like the Apache columnar analytics, are showing great results.”
Read the OCS research paper for more details of the KV-CSD. Access an SK hynix OCS presentation here.
Blocks & Files interviewed Denise Natali, the new VP for Americas sales at Datadobi, over email about the firm’s views on unstructured data management. Datadobi provides a StorageMAP facility to locate, identify and manage unstructured data, building on its core migration technology.
We wanted to gain a better understanding of Datadobi’s position on data tiering, orchestration, supplying a product vs a service, and Gen AI needs. We outlined the context in a basic diagram we sent over:
Datadobi reciprocated with its own view of the unstructured data management market:
It also sent a note explaining the terms:
That provides the background context for Natali’s answers to our questions.
Blocks & Files: What is Datadobi’s strategy for growing sales in this market?
Denise Natali
Denise Natali: Priority number one for Datadobi is to stay close to its customers and partners with whom we have such close and trusted relationships. We spend the majority of our time with customers working to understand their needs – we are dedicated to understanding their unique “why?” Only then do we help guide them towards the next steps in the unstructured data management journey – and share how StorageMAP can help them solve specific unstructured data management challenges and achieve desired outcomes.
In addition, we are dedicated to not only staying on top of, but in front of, evolving market demands/trends. So in addition to the time we spend with our customers and partners, there are folks in our organization that spend a tremendous amount of time with industry experts (such as leading industry analysts) and by reading and keeping up to date with business and technology journals (such as yours).
Drilling down a bit … StorageMAP is a versatile solution – and the market needs to know this. But, at the same time, we need to be able to demonstrate our strengths in particular areas – our ability to help our customers and partners overcome challenges and achieve their goals. For instance, one such area of focus is hybrid cloud data services. From a sales standpoint, we need to be able to prove that StorageMAP is hands-down the most robust and comprehensive solution here – a truly vendor-neutral solution, with unmatched unstructured data insight, organization, and mobility capabilities and the only solution capable of scaling to the requirements of large enterprises.
And last but certainly not least, it is critical to our sales process that we demonstrate that StorageMAP enables its customers to maintain data ownership and control – addressing customer concerns about metadata security and compliance, whether the data is managed on-premises, remotely, or in the cloud.
Blocks & Files: Does Datadobi have a data orchestration strategy?
Denise Natali: For sure. Our goal is to always remain at the forefront of unstructured data management technology. Customers and partners want increased automation and policy-driven data management capabilities – so data orchestration is an integral part of our near-term roadmap.
As you likely know, today customers are seeking solutions with data orchestration capabilities for a number of reasons – such as improved data management, enhanced data quality and consistency, increased operational efficiency, better decision making, scalability, compliance and security, and cost savings.
Blocks & Files: Do customers want unstructured data management as a service or as a software product they buy?
Denise Natali: As with any other solution, enterprises are looking at a number of options for delivering unstructured data management. But before we get into that, it is important to note that unstructured data management as a market is still nascent. Right now, we find that our conversations with customers are really focused on their specific and immediate needs and not necessarily a well-defined need either. We take a consultative approach to help them to explore what it is they are trying to achieve, and help them to make that a success. As our CEO likes to say, we deliver an outcome not a software product. That is what our second-to-none reputation is built upon.
Ultimately, enterprises will want an unstructured data management solution to play well with their existing infrastructure. The last thing most of them are looking for is yet another standalone point software [product] that they have to manage. Integration with their ecosystems, whether managed by them or an external party will be key.
But what is essential to all our enterprise customers, whichever path they choose, is that they maintain ownership and control over their data and metadata to make sure it remains secure and compliant with their internal policies and regulatory requirements.
At the end of the day, maintaining ownership and control is exactly why our customers prefer an on-prem solution. Many aaS offerings create security headaches for customers as they lose control of their data. At the behest of our customers, we are exploring various consumption models which are different than aaS models – the choices are not just “purchase/subscribe” vs “aaS”.
Blocks & Files: Does an unstructured data manager have to support all data silos, both on-premises and in the public cloud?
Denise Natali: Absolutely. An unstructured data manager must support all data silos – both on-premises and in the cloud. Here’s why:
1. Hybrid Cloud Strategies – Many organizations adopt hybrid cloud strategies, maintaining a mix of on-premises and cloud-based data storage to optimize performance, cost, and security. An effective unstructured data manager must seamlessly manage data across these diverse environments.
2. Data Mobility – As organizations grow and evolve, the need to constantly move data between on-premises systems and various cloud platforms increases. Supporting all data silos makes sure that the data owners and consumers (whether people or applications such as Gen AI) can easily and quickly get the data where they need it. Liberating data from the confines of specific hardware is key to a successful hybrid-cloud strategy.
3. Unified Management – To streamline operations and reduce complexity, organizations prefer a single pane of glass for managing their unstructured data. A unified data manager that supports all silos provides centralized control and visibility, enhancing operational efficiency. This doesn’t mean that an unstructured data manager (likely in IT) will need to do all the work, but a single point of coordination between the data custodians, data owners, and data consumers, will be vital.
4. Data Lifecycle Management – Different regulatory requirements may apply to data stored on-premises versus in the cloud. A comprehensive data manager can help enforce consistent compliance and other policies across all storage locations through the implementation and monitoring of policies created by the data owners.
5. Optimized Storage Utilization – Organizations can optimize storage costs and performance by strategically placing unstructured data based on usage patterns and access requirements. Supporting all data silos allows for intelligent data tiering and lifecycle management.
6. Scalability and Flexibility – Businesses need the flexibility to scale their storage solutions as needed. An unstructured data manager that supports both on-premises and cloud environments can easily adapt to changing storage demands as they evolve.
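The intelligent tiering mentioned in point 5 ultimately reduces to policies of roughly this shape, placing data on a tier according to access recency. The tier names and age thresholds below are illustrative assumptions, not StorageMAP’s actual rules:

```python
from datetime import datetime, timedelta

def pick_tier(last_access, now=None):
    # Illustrative age-based placement policy: hotter data stays on faster,
    # costlier storage; colder data moves to cheaper tiers.
    now = now or datetime.now()
    age = now - last_access
    if age < timedelta(days=30):
        return "performance-nvme"   # hot: fast on-prem flash
    if age < timedelta(days=365):
        return "capacity-object"    # warm: cheaper object storage
    return "archive-cloud"          # cold: archival cloud tier

now = datetime(2024, 8, 1)
print(pick_tier(datetime(2024, 7, 20), now))  # → performance-nvme
print(pick_tier(datetime(2023, 12, 1), now))  # → capacity-object
print(pick_tier(datetime(2021, 1, 1), now))   # → archive-cloud
```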
Blocks & Files: How does an unstructured data manager support customers adopting Gen AI technology, for training and for inferencing?
Denise Natali: The ideal unstructured data manager can support customers adopting Generative AI technology for training and inferencing through several key functions. It is important to note here that when Gen AI applications (and many other kinds of applications) claim to process “unstructured data” what they mean is they can deal with small amounts of unstructured data. Ten terabytes of unstructured data is likely considered to be fairly large for most applications.
What StorageMAP can help with is identifying the right 10TB from the multiple petabytes and billions of files that enterprises have at their disposal, making sure that only high-quality and pertinent data is used for AI model training.
As a side note, applications that perform other tasks – such as looking for PII data within a file – have the same limitation. StorageMAP does not replace these applications, but can make them far more effective by helping to get the right data to them rather than the current approach of “best guess.”
****
By adding data orchestration functionality, Datadobi will be competing with Arcitecta and Hammerspace, and will be able to upsell into its existing customer base.
Solidigm has launched a pair of datacenter SSDs using the PCIe gen 5 interface.
Greg Matson
The SK hynix subsidiary’s D7 PS1010 and PS1030 follow on from its earlier PCIe gen 4 D7-P5520 and P5620 which were built from 144-layer 3D NAND in TLC format. The new drives use 176-layer 3D NAND, still in TLC format, and are much faster, thanks in part to their PCIe 5 bus, twice the speed of the PCIe 4 bus.
Greg Matson, Solidigm’s SVP of Strategic Planning and Marketing, said in a statement: “The Solidigm D7-PS1010 and D7-PS1030 SSDs were meticulously engineered to meet the increasingly demanding IO requirements across a range of workloads such as general-purpose servers, OLTP, server-based storage, decision support systems and AI/ML.
“In a world where every watt counts, these drives are PCIe 5.0 done right, not only delivering industry-leading four-corner performance, but also up to 70 percent better energy efficiency compared to similar drives by other manufacturers.”
Here’s a performance comparison table for the new drives and the older P5520 and P5620:
We can see that the stated random read IOPS have almost tripled while the random write IOPS have nearly doubled (PS1010) or more than doubled (PS1030). The sequential read bandwidth has also slightly more than doubled, with the write speed also more than doubling. These drives’ latency has improved as well, with read latency lessening by 20 percent and write latency decreasing 28 percent.
The capacity ranges are pretty similar to the older drives’, with the read-intensive, 1 drive write per day (DWPD) PS1010 coming in 1.92, 3.84, 7.68, and 15.36 TB variants. The mixed-use, 3 DWPD PS1030 ranges from 1.6 TB through 3.2 and 6.4 to 12.8 TB. Both come in either a 2.5-inch U.2 case or the newer E3.S 15mm enclosure.
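As a quick guide to what those DWPD ratings imply, total endurance in terabytes written works out as DWPD × capacity × warranty days. The five-year warranty assumed below is typical for datacenter SSDs but is not stated here:

```python
def tbw(dwpd, capacity_tb, warranty_years=5):
    # Terabytes written over the warranty period:
    # drive writes per day x capacity x number of warranty days.
    return dwpd * capacity_tb * warranty_years * 365

# The 15.36 TB drive at 1 DWPD vs the 12.8 TB drive at 3 DWPD:
print(round(tbw(1, 15.36)))  # → 28032
print(round(tbw(3, 12.8)))   # → 70080
```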
Their main characteristics are:
Solidigm chart.
What Solidigm calls standard endurance refers to the lower-endurance model of the two, the PS1010, which it positions as a mainstream drive, with the PS1030 being write-centric; both have performance heavily skewed in favor of reads over writes.
Solidigm’s product brief has a detailed PS1010 performance comparison against stated values for competing suppliers Kioxia, Micron and Samsung:
Solidigm chart.
Solidigm suggests these new SSDs can be used for HPC, general-purpose servers, OLAP, and cloud computing services. It makes a big thing of their suitability for AI pipeline work: functioning as an NVMe data cache drive in cloud-located GPU servers, and as an all-flash tier front-ending a disk-based object storage tier. For on-prem use it has the GPU server using them as an NVMe cache drive in front of an all-QLC SSD object tier.
D7-PS1010 PS1030 case.
Ace Stryker, Director of Market Development at Solidigm, claimed in a statement: “As AI workloads continue to surge, storage performance becomes critical. The Solidigm D7-PS1010 and D7-PS1030 are a game-changer for AI-driven enterprises, capable of outperforming competitors at critical stages of the AI pipeline.”
Energy efficiency is claimed to be 70 percent better than Samsung’s PM1743.
Get a product briefing doc for the two new SSDs here.
Acronis is now owned by private equity after EQT bought a majority stake.
The Singapore-based backup software business is massive, protecting over 750,000 businesses across 150 countries through more than 20,000 service providers that offer Acronis Cyber Protect services. EQT is Europe’s largest private equity business. The cost of the stake was not revealed. Acronis founders, management, and existing investors will remain significant minority shareholders.
Ezequiel Steiner
Acronis CEO Ezequiel Steiner put out a statement: “We are thrilled that EQT shares our vision for growth and supports our strategic expansion. With EQT as strong partner, we will continue Acronis’ expansion strategy and continue to deliver the very best service to Acronis’ partners and customers.”
Johannes Reichel, Partner and Co-Head of Technology within EQT’s Private Equity advisory team, said: “Acronis is a strongly positioned cybersecurity and data protection software platform with a clear value proposition to Managed Service Providers. EQT has followed the company’s journey for many years and continues to be impressed by its performance and innovative strength. We are very excited to partner with Acronis, the management team and existing investors on its next phase of growth.”
The roots of Acronis go back to 1997 when SWsoft was founded by Russian entrepreneur Serguei Beloussov (Serg Bell) as a privately held server automation and virtualization company. Web hosting and OS partition virtualization business Parallels was started up in 1999 by Ilya Zubarev and Serguei Beloussov. It developed virtualization technology for the Mac enabling MacOS to run Windows in a parallel partition.
Beloussov, Zubarev, and Stanislav Protasov then co-founded backup and disaster recovery provider Acronis in 2003 as a Parallels spin-off. Its TrueImage product dates from then and is sold by OEMs as a PC backup, recovery, migration and DR facility.
SWsoft bought Parallels in 2004. This had such a strong brand image that SWsoft changed its name to Parallels in 2008. Corel acquired Parallels in 2018. Virtuozzo, the sole remnant of SWsoft, is owned by a group including Serg Bell.
Serguei Beloussov
Acronis made its headquarters in Singapore in 2008 and then moved to Schaffhausen in Switzerland in 2014 to improve its effectiveness as a global business. Serg Bell was CEO and board chairman from 2013 to 2021. He is now Chief Research Officer and an executive board member.
Ezequiel Steiner became Acronis CEO in October 2023, taking over from Patrick Pulvermueller, who was CEO from 2021. Pulvermueller remained a board member and became a CEO advisor.
Acronis developed additional security and cloud-based offerings with, for example, automation facilities for MSPs.
Blackrock and others invested $250 million in Acronis in 2022, valuing the company at $3.5 billion. A 51 percent majority stake at that valuation would cost $1.785 billion. We suspect Acronis is now valued at more than $3.5 billion; according to Reuters’ sources the valuation could be $4 billion.
Fellow backup and security company Veeam, also founded by Russian entrepreneurs, was bought for around $5 billion by private equity in 2020.
Serg Bell told us: “Today’s announcement is great progress. It has always been important for Stanislav and myself – the founders – to find a partner that aligns perfectly with Acronis’s culture and vision. A partner that is committed to accelerating the deployment of advanced, state of the art cyber protection and operations solutions across the world, while maintaining the highest standards of quality and partner service. With the amount and intensity of cyber threats constantly growing, we are confident that Acronis is uniquely placed to be the best platform for Service Providers to profitably protect and operate their customers’ information technology infrastructure.
“As we celebrate this significant milestone for Acronis, we are also looking forward to devoting more time to advancing the fields of science, research and education with the team at Constructor Group which I founded in 2019. We are changing the delivery and accessibility of best science, research and education through Constructor Tech Platform. The Constructor Tech team is leveraging the tidal wave of generative AI and Metaverse to enable scientists, researchers, teachers, students and academic administrators to accelerate the technological breakthroughs that will help solve the world’s most pressing challenges. We are on the way to creating a world-renowned center of excellence for research and innovation at Constructor University graduating founders, CEOs and the C-suite leaders of tomorrow, and Constructor Capital is funding and growing its deeptech, software and ed/science tech portfolio.”
The EQT-Acronis transaction is pending customary regulatory approvals and is anticipated to close in the first or second quarter of 2025.