
Backup and security one and the same for CISOs

Interview The separate IT environments of backup and security are merging to become a single operational area for Chief Information Security Officers (CISOs). This is the view put forward by Simon Chappell, CEO of UK-based Assured Data Protection (ADP), whom B&F interviewed about the state of data protection. ADP has been involved in the field since 2016 and has a relationship with Rubrik; in fact, it is Rubrik’s largest global partner.

Blocks & Files: With security having an ever stronger influence on data protection and security officers wanting to control, reduce and manage their attack surface, do you think there will be a trend for organizations to reduce their number of backup suppliers and concentrate on ones who have a cybersecurity posture and can help CISOs in their work?

Simon Chappell of backup vendor Assured Data Protection

Simon Chappell: For over two decades there has been a division of focus (and budget) between security and data protection. At Assured we are seeing this division change, and the responsibility for data security is increasingly seen as a single risk that needs to be mitigated.

Essentially, there are three ways to stop attackers. You either prevent them from entering in the first place, or you detect and eliminate them if they get through the defenses. But you’ve got to have a robust recovery strategy in place if the first two fail. So it’s becoming more likely that CISOs who understand all three are required will opt for service providers who can deliver each layer.

I never envisaged having to deal with CISOs when we started out, but these days we can’t get away from them. We seem to be speaking to them all the time. It’s not surprising really, especially when you consider how much pressure they’re under to secure their organizations. So it makes sense they would want to have DR (Disaster Recovery) and backup as part of their remit. You must appreciate, though, that DR and security professionals are from different sides of the tracks; we’re still developing a better understanding of each other and how we can work together.

For example, we’ve always dealt with the CTO or other IT professionals, but right now we’re working on a deployment where we’re only dealing with the CISO; it’s been quite a sea change.

Blocks & Files: Going further, do you think a need to have data protection suppliers that can cover all of an organization’s data silos and contribute to a CISO’s concerns could trigger a consolidation in the backup industry?

Simon Chappell: Potentially, yes. As previously mentioned, DR and backup address many of the CISO’s concerns, and they’re looking to work with suppliers that meet all their requirements. From an industry perspective we’re seeing aspects of cybersecurity encroaching onto the DR space and vice versa. As a result, we’re now involved in broader discussions about a more holistic approach to cybersecurity and data protection – and how we fit into that. It’s great to be part of the conversation, but it’s new territory for everyone.

However, it’s given us the chance to refine our proposition to cover all aspects of a customer’s workloads, whether they’re on-prem or in the cloud. But ultimately, I don’t envisage any major consolidation in the data protection world. There’s more likely to be consolidation in the “managed detection and response” sector.

Blocks & Files: How would you advise organizations to protect their data at edge computing sites with limited or no IT staff and, possibly, limited network connectivity bandwidth?

Simon Chappell: The great advantage of a fully managed service is that no IT staff are required, and reporting can be shared with whichever operational team members require it. We increasingly find that network connectivity is less of an issue than it used to be. There seems to be a good correlation between data sizes at edge sites and bandwidth. Assured have some well-rehearsed workarounds where the data sizes and bandwidth available are out of sync.

Blocks & Files: Some suppliers in the data protection industry have suffered data breaches, such as Rubrik via a Fortra zero-day attack, Exagrid in June 2021, and Kaseya in July 2021. How damaging are such attacks and can data protection suppliers absolutely prevent themselves getting attacked?

Simon Chappell: The world is suffering a tsunami of cyber-attacks, and no one is immune to the threat including data protection providers. Continued threats only underline the requirement for strong data security practices, irrespective of whether the data is mission-critical production data or in a development environment.

Blocks & Files: Has Assured Data Protection ever been (a) attacked and (b) had its systems penetrated by hackers? How do you prevent such penetrations of your IT systems?

Simon Chappell: We are acutely aware of the continued threat, and we practice what we preach when it comes to data security.

Blocks & Files: Do you think that ransomware and similar attacks will force a move away from daily backups or half-daily backups towards more continuous data protection so as to reduce the time window in which potential damage can be wreaked?

Simon Chappell: Not necessarily, but risk vectors have changed considerably in the last few years. Businesses need to be more diligent these days because of the persistent threat posed by breaches and ransomware.

Life felt a lot simpler when all you had to do at the end of the day was change a tape. Nowadays it seems that even a daily backup is no longer sufficient, and it’s better to have real-time monitoring capabilities in place, especially for mission-critical workloads. The days of staging ad hoc disaster recovery drills are over, it would seem.

We haven’t stood still, however, and we’ve responded to this by developing a continuous recovery testing model for our customers using our own proprietary software platform.

Blocks & Files: The Canadian standards institute is adopting an IT standard that says don’t make data copies for new applications. How do you think this applies to backups, which make copies? My thinking is that, if the standard is applied absolutely, then all backups of new applications are forbidden – which seems crazy.

Simon Chappell: The way backup software operates is to create a different file or snapshot within the backup environment. This will be encrypted and secure, and ideally should be held in immutable storage. In this sense it isn’t an exact replica of the original file.
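Chappell’s point, that a backup object is a transformed and protected copy rather than an exact replica, can be illustrated in miniature. The record format below is purely illustrative (a real backup product would encrypt the payload and write it to immutable storage; here the payload is only compressed and wrapped in metadata):

```python
import hashlib
import json
import zlib

def make_snapshot_record(name: str, data: bytes) -> bytes:
    """Toy illustration: wrap file content in a snapshot record.

    The stored object is deliberately not a byte-for-byte replica of
    the original file, yet the original remains fully recoverable.
    """
    record = {
        "name": name,
        "sha256": hashlib.sha256(data).hexdigest(),  # integrity check
        "payload_hex": zlib.compress(data).hex(),    # transformed payload
    }
    return json.dumps(record).encode()

original = b"quarterly payroll figures" * 100
snapshot = make_snapshot_record("payroll.csv", original)

# The snapshot is a different object from the original file...
assert snapshot != original
# ...but the original content is still fully recoverable from it.
restored = zlib.decompress(bytes.fromhex(json.loads(snapshot)["payload_hex"]))
assert restored == original
```

In this sense a backup is a derived, protected object, which is the distinction Chappell draws against a plain data copy.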

Blocks & Files: Oh that’s a clever take. Moving on, HYCU and Asigra are asking third parties to create API-level connectors between SaaS applications and the HYCU and Asigra backup software. What do you think of this? Should Rubrik do the same?

Simon Chappell: Rubrik is ideally positioned to expand SaaS coverage given the focus on the cloud native Rubrik Security Cloud platform. Assured’s customer base is happily served by the existing Rubrik feature set and we don’t currently see any major gaps in protection scope.

Blocks & Files: In five years’ time how will data protection processes have changed and why?

Simon Chappell: The fundamentals won’t change and will probably be very similar to today. I’ve been in this business for over 20 years, and if I’m honest five years doesn’t feel like a long time to me. The DR and backup world may think it’s going to change radically in that time, but I doubt it will change that much.

One thing that will change, though, are customer expectations. They will expect a quick response following an incident, they won’t want to be inconvenienced for long. They will assume that speed and ease of recovery will be standard across their on-prem and cloud installations. As far as they’re concerned, complex and large systems should be recoverable to any chosen point in time in a short time frame. Automation is going to play a huge part in delivering on that expectation, which is something that we’ve invested heavily in to improve our own automation platforms.

Storage news ticker – April 27

Active Archive Alliance storage report

The Active Archive Alliance has announced its 2023 State of the Industry Report: Effective Data Management Through Active Archives. It’s not yet available on the alliance’s website but we have a downloadable copy. It’s basically marketing SSD, disk, optical and tape storage as archive hardware operated by active archive software for file placement. Separately, Swiss Vault with its Vault File System has joined the Active Archive Alliance.

Cigent has announced an SSD with built-in ransomware protection. But the drive must be installed as a boot drive in a Windows endpoint system, limiting its usefulness. It has an onboard processor running ML models to check for ransomware access patterns detectable in telemetry data from the drive controller. It operates outside the main SSD data path. The Secure SSD+ drives, as covered by sister site The Register, will have capacities of 480GB, 960GB and 1920GB. Linux support is coming soon.

CloudSoda, which supplies an accelerated data movement and storage cost analytics application, announced it has won a NAB Show Product of the Year award. Its storage-agnostic and ecosystem-aware application is built to help companies gain insights into unstructured data by intelligently moving data across tiers and from edge to cloud, optimizing data placement and performance. Its functionality eliminates hidden storage costs, we’re told, making data management more efficient and affordable.

SaaS-based data protector Cobalt Iron has been awarded US Patent 11632374 concerning techniques to be implemented in its Compass SaaS backup offering. It will apply ML techniques to device behaviors and health status so that infrastructures and business security controls will become more intelligent over time by learning from past operational outcomes, the documents say. Cobalt Iron’s newly patented ML techniques continually improve authentication controls over time by learning from the results of past controls. The technology, Cobalt Iron says, automatically adjusts authorization controls based on conditions, events, project status, access activities etc. and makes the entire IT infrastructure more secure and intelligent.

Startup Fluree has raised $10 million in A-round funding. It has a graph ledger database and data pipeline toolset for trusted, secure data management in the Web3 market. Its graph database technology allows developers to store and manage data in a decentralized format for interoperable data exchange across distributed parties. Fluree will use the new capital to expand its Web3 data platforms and enterprise offerings, enabling digital trust for new applications in verifiable credentials, enterprise blockchain and decentralized data management. The company has raised a total of $16.5 million to date. It closed a $6.5 million seed round in 2019.

GRAID is banging the performance drum with a document comparing its RAID5 speed with PCIe gen 5 NVMe SSDs vs alternative hardware RAID cards and also software RAID technology. It says SupremeRAID uses more of the available SSD performance, shown as dark blue bars in the chart below. Up to 97 percent of total SSD performance is wasted when using alternative hardware or software RAID5, we’re told.

GRAID storage performance

InfluxData, supplier of the time series analytics database InfluxDB, announced version 3.0, rebuilt on Apache Arrow as a columnar database, with significant gains in performance, high-volume ingestion and compression, real-time querying, and unlimited scale. It also announced InfluxDB 3.0 Clustered and InfluxDB 3.0 Edge to give developers next-gen time series capabilities in a self-managed database. InfluxDB supports metrics, events and traces time series data. Customers can upgrade to InfluxDB 3.0 from 1.X and 2.X and run existing workloads faster and at lower cost with minimal changes.

Kaseya is sponsoring the Miami Heat basketball sports team as its IT solutions partner, and its stadium has been renamed the Kaseya Center. There was much assistance from the Miami-Dade County administration after the team’s previous sponsorship deal with the now bankrupt FTX collapsed. Kaseya has its headquarters in Miami.

Storage company Kaseya sponsors Miami Heat

Kioxia employees received the Award for Science and Technology, part of the Commendation for Science and Technology from Japan’s Ministry of Education, Culture, Sports, Science and Technology, for their invention of a high-density 3D NAND memory device and its production lines.

Komprise has a blog about its new filesystem analytics reporting capabilities, now available from the Reports Tab in Komprise’s Intelligent Data Management software. Some of these require a Deep Analytics subscription. They cover showback reports for data tiering and migrations, orphaned data and more.

Memory module producer Netlist has won a long-running multiple LRDIMM patent infringement lawsuit against Samsung, which has been ordered to pay Netlist $303 million by a jury in a Texas court. The case started in 2021. The case is Netlist Inc v. Samsung Electronics Co, US District Court for the Eastern District of Texas, No. 2:21-cv-00463. Netlist has previously won a similar case against SK hynix, netting $40 million and a cross-licensing deal in 2021. We imagine it will hope for a licensing deal offer from Samsung and not an appeal against the judgement. Netlist is also suing Google and Micron for similar reasons.

Scale-out filer software supplier Qumulo has announced special effects company MARZ (Monsters Aliens Robots Zombies) is using its File Data Platform for its production-ready Vanity AI system. This enables visual effects (VFX) teams to deliver large volumes of high-end 2D aging, de-aging, cosmetic, wig, and prosthetic fixes, we’re told. The technology is 300 times faster than traditional VFX pipelines with no capacity constraints and a lower cost, according to Qumulo. Instead of taking anywhere from one to five days for a VFX artist to complete a shot, Vanity AI can be used to complete feature-film shots in as little as three minutes. Vanity AI uses Qumulo’s software when running on-premises.  It will launch as a publicly available, web-based service later this year. Qumulo software is available on AWS, Azure and GCP.

Samsung has been hit by the memory slump. Calendar Q1 revenue for the company was ₩63.75 trillion ($56.1 billion) compared to ₩77.78 trillion ($68.4 billion) a year ago, with net profit of ₩1.57 trillion ($1.38 billion), well down on last year’s ₩11.32 trillion ($9.96 billion). The memory (DRAM+NAND) business saw revenues of ₩8.92 trillion ($7.85 billion), down 56 percent on a year ago. Samsung experienced price declines in its memory business as customers, facing macro-economic concerns, used up their inventories rather than buy new DRAM and NAND chips. In response it’s cutting production in legacy product areas and tuning business mixes. Samsung moved its DRAM business mix up to higher density (more profitable) chips but didn’t ship as much capacity as it hoped (low bit growth). It shipped more NAND bits than it expected by also moving the NAND product mix to high density product. It hopes high-end mobiles, new high-core CPU servers and the need for AI hardware will drive a DRAM demand recovery. Samsung will focus on its V7 (176-layer) and V8 (238-layer) technologies in the NAND market.

A Seagate ESG report said it has extended the life of over one million hard disk drives and solid state drives through its refurbishment program in fiscal 2022, preventing 540 metric tons of e-waste from going to landfill. The business is over halfway toward its 2030 goal of powering 100 percent of its global footprint with renewable energy. It has launched an UltraTouch consumer external HDD that is manufactured from 30 percent post-consumer recycled materials by weight and features 100 percent recyclable packaging.

Like Samsung, SK hynix had a dreadful first 2023 quarter with weaker memory chip (NAND and DRAM) demand and lower prices. Revenues of ₩5.1 trillion ($3.9 billion) were 58 percent down on a year ago, and it made a loss of ₩2.6 trillion ($1.9 billion) compared to the year-ago ₩1.2 trillion ($912 million) profit.

SK hynix storage results

DRAM revenues were ₩2.95 trillion ($2.24 billion), down 66 percent year-on-year, while NAND revenues were ₩1.679 trillion ($1.275 billion), down 61 percent annually. The company said: “We expect revenues to rebound in the second quarter after bottoming out in the first, driven by a gradual increase in sales volume.” It forecasts an improvement in market conditions from the second half of 2023 after memory inventory levels at customers declined throughout the first quarter. SK hynix will invest for mass production readiness of 1b nanometer DRAM (the fifth generation of 10nm-class technology) and 238-layer NAND to support a quick business turnaround once market conditions improve.

Betsy Doughty has resigned as VP Corporate Marketing at SpectraLogic after nearly 18 years in the position. We are not yet aware of where she will move to next.

Veeam has been ranked #1 global provider for Data Replication & Protection software in IDC’s latest Semiannual Software Tracker for the second half of 2022. Veeam, with a year-over-year growth rate of 8.1 percent, grew faster than the other top five vendors and the overall market average. Its revenue grew 8.4 percent sequentially over the first half of 2022. 

Zadara has validated a combined offering of its zStorage and the Kasten K10 data management platform providing integrated data protection with backup, disaster recovery (DR) and application mobility for Kubernetes environments.

Pure Storage updates FlashArray Purity OS

Pure Storage has added integrated block, file and virtual machine management facilities to its FlashArray Purity OS, saying this enables its array to run all three workloads simultaneously with a common storage pool, unlimited filesystem size and unified policy management.

The File Services for FlashArray software is generally available from today and includes VM-aware storage that provides per-VM granularity for management and protection. Users get unified LUN (block), share (file) and VM storage services using Purity’s global flash pool and the FlashArray all-flash hardware.

Shawn Hansen, VP and GM for FlashArray, said: “The legacy unified storage market has been held back by the inflexibility and cost of decades-old architectures. It’s time for a new way: We’re excited to introduce the first truly unified block and file storage platform, built from the ground up for modern simplicity and the ability to evolve with customers.” 

The legacy architectures remark is a reference to systems such as NetApp’s ONTAP and offerings from Dell and HPE, which have split product functionality or separate products for file and block services. HPE’s Alletra Storage MP, for example, has separate block and file storage software environments. DDN’s acquired Tintri business was the first to offer VM-aware storage services with its VMstore arrays.

Peter Skovrup, Pure’s VP Product Management for FlashArray, told B&F: “Tintri had some of the components – we take it a little bit further.”

Pure first obtained file services support for FlashArray with its Compuverde acquisition in 2019. It integrated this software into its Purity v6 release in mid-2020:

Purity v6 with integrated Compuverde file services

Now it has built on that with integrated file, block and VM services. Specifically, customers receive:

  • Global storage pools for storage admins to use as needed, across block and file, with non-disruptive expansion on the fly and unlimited file system sizes, 
  • Unified policy Management for block and file storage services, 
  • VM-Aware Storage capabilities with per-VM granularity for statistics, snapshots, quotas, and policies,
  • Common use case support, including VMware and NFS data stores, user directories and profiles, content repositories, data protection, and backup. NFS v3, 4.0 and 4.1 are supported.
Peter Skovrup, Pure Storage

Skovrup said: “VMware is two-thirds blocks and VVOLS and one-third NFS datastores.”

Pure arranged for the ESG research outfit to cast its eye over this new functionality, and Practice Director and Principal Economic Validation Analyst Aviv Kaufmann said: “Through customer interviews, product walkthroughs and based on our broad view of the storage industry, we have found that Unified all-flash block and file storage from Pure Storage reduces complexity to manage block and file workloads by 62 percent, with a 58 percent lower total cost of ownership.”

Pure’s pitch is now that its FlashArray systems can run block, file and VM-level workloads on the same hardware/software base with all three environments having the benefit of its non-disruptive upgrades, global deduplication, scalability, clustering and protection facilities.

We should remember that Pure can support Kubernetes containerized workloads with its Portworx functionality as well.

Pure Storage CTO Alex McMullan told us that Pure is building systems for “colossal scale” (his terminology). The 300TB Direct Flash Modules planned for 2026 are part of this idea. B&F thinks that Pure could be telling its customers that they won’t have to go to the public cloud to run applications against massive data sets. They can do it on-premises, and its existing arrays will be upgradable to the colossal-scale ones coming; colossal in storage density, not in physical size.

Pure is building a universal storage system, one supporting block, file, VM and container workloads, scalable to the exabyte level and extending out to the public cloud. It wants to lead the way to very large and highly integrated storage systems and not be outflanked by other suppliers’ best-of-breed and workload-specific products. 

BMC buys virtual tape library biz Model9

BMC has bought Model9, the Israeli mainframe VTL and data export startup, for an undisclosed price.

Houston-based BMC started out in 1980 as a mainframe software company and developed into a mainframe and hybrid cloud software and services supplier. It was bought by a private investor group for $6.9 billion in 2013, and then sold to KKR in 2018 for $8.3 billion including debt.

In the intervening years, BMC has itself acquired RSM Partners, Compuware, Alderstone, ComAround and StreamWeaver. The purchase prices were not revealed. 

John McKenny

John McKenny, SVP and GM of Intelligent Z Optimization and Transformation at BMC, said in a statement: “Together BMC and Model9 support innovation by reimagining mainframe data management. With mainframe cloud data management, organizations get all of the benefits of the cloud including flexibility, scalability, and price, while delivering the advantages of on-premises mainframe computing for large-scale, business-critical applications. Together, we will extend cloud benefits to our mainframe customers as they accelerate their journey to become an Autonomous Digital Enterprise.”

BMC says storing data on mainframes is expensive and limiting, and reckons better data management, including moving appropriate parts of the data to public cloud object storage, can help eliminate expensive secondary storage hardware and associated software license costs.

To date, the public funding total for Model9 has reached $13.5 million.

Model9, started up in 2016, moves mainframe data to other systems, initially replacing mainframe tape storage with disk-based Virtual Tape Libraries, and then sending it with the S3 protocol to AWS and S3-supporting on-premises object storage systems. The software includes APIs for sharing the data with analytics services and other apps. The customer list includes global financial institutions, government agencies and retail companies, we’re told.
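The data path Model9 operates, moving sequential mainframe datasets into S3 object storage, can be pictured as mapping a dataset onto a series of keyed objects. This is a toy sketch only: the key layout (dataset name plus a zero-padded part number) and chunk size are illustrative assumptions, not Model9’s actual on-the-wire format.

```python
def dataset_to_objects(dataset: str, data: bytes, chunk: int = 8):
    """Split a sequential dataset into S3-style (key, bytes) objects.

    Illustrative sketch: the "backups/<dataset>/part-NNNNN" key scheme
    is a made-up convention for demonstration purposes.
    """
    return [
        (f"backups/{dataset}/part-{i:05d}", data[off:off + chunk])
        for i, off in enumerate(range(0, len(data), chunk))
    ]

objs = dataset_to_objects("PROD.PAYROLL.G0001V00", b"EBCDIC-ish payload bytes")
# Concatenating the parts in key order recovers the original byte stream.
assert b"".join(part for _, part in objs) == b"EBCDIC-ish payload bytes"
```

Each (key, bytes) pair would then be written with an S3 PUT to AWS or an on-premises S3-compatible object store, which is the protocol-level compatibility the article describes.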

Model9 support for AWS and Snowflake

Model9 announced a deal with AWS in December 2022 to help replatform and convert mainframe applications so they can run on AWS. It also included Model9’s Cloud Data Platform for Mainframe being used as a RESTful API-accessible repository for applications and AWS services that need to use mainframe data. Model9 has three product lines: Manager, to move and store backup/archive data in the cloud; Shield, to cyber-protect copies of mainframe data; and Gravity, to move mainframe data to the cloud and there transform it and load it into cloud data warehouses and AI/ML pipelines.

In January Model9 announced a partnership with Hitachi Vantara to feed mainframe data to its HCP object store – the S3 support mattered here – and VSP 5000 storage systems and make it available to apps running in the hybrid cloud.

We understand that BMC saw this Model9-AWS relationship and recognized a good fit with its own mainframe-cloud activities.

BMC said the acquisition will provide customers with the capability to store and share mainframe data across the hybrid IT landscape, including public and private clouds. The sales pitch is that customers can manage and operate secondary storage in a consistent way and migrate mainframe data to the cloud reliably, efficiently, and securely.

Gil Peleg

Gil Peleg, Model9 founder and CEO, said: “The combination of Model9’s Cloud Data Management for Mainframe solutions with the BMC AMI product portfolio will enable customers to modernize more quickly and safely while fully leveraging their investment in existing trusted infrastructure.” 

BMC’s AMI is its Automated Mainframe Intelligence portfolio of products. The deal, BMC’s sixth acquisition in three years, is expected to close in the first half of 2023, subject to customary closing conditions.

Rubrik, Zscaler double down on data protection

Rubrik is automating sensitive data file detection and classification and working with Zscaler to stop such files being exported outside an organization’s IT boundaries. Rubrik has also doubled its ransomware recovery warranty to $10 million.

Data loss prevention (DLP) is intended to stop an organization’s private data being leaked by detecting copying onto unauthorized devices, such as a USB stick, or network transmission to end-points outside the organization’s network. Once outside an organization’s control, such files can be used to extort cash from the file owner.

Frank Dickson, Group Vice President, Security & Trust, at IDC, contextualized this: “The reputation of Data Loss Prevention has not been favorable as past implementations were often highly manual, management was painful, and the burden of data classification was often pushed onto the end user. The Rubrik and Zscaler integration addresses a critical need through automation by allowing organizations to easily implement protections on critical data while minimizing the management burden on data security professionals.”

DLP checks have to know what to look for and then have to be implemented. Zscaler is a cloud security company with tools to detect known file exfiltration, such as Exact Data Match (EDM) for specific data items and Indexed Document Matching (IDM) file fingerprinting. But its software has to know what files to look for. Step forward Rubrik.

The big issue is that an organization can have billions of files in its global, distributed and multi-cloud data estate. How does it know which ones contain sensitive information and must not be exported beyond its firewalls? The task of scanning, detecting and classifying sensitive files needs automating. The need is clear, but how? You could simply scan every file periodically, but that takes time and effort, requires an indexing system, and can interfere with production data processing.

Rubrik is basically a backup software supplier and knows about scanning files. It provides a Sensitive Data Monitoring & Management facility which works separately from production systems to discover and classify sensitive data. Once known, stronger data protection policies can be applied to these files, such as telling Zscaler not to let them be exported and ensuring protection in immutable backup repositories.
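The scan-and-classify step can be pictured as pattern matching over file contents, with each match attaching a sensitivity label that downstream policy (such as a Zscaler export block) can act on. The patterns and labels below are our own illustrations, not Rubrik’s actual Sensitive Data Monitoring rules:

```python
import re

# Illustrative sensitivity patterns; a real product ships far richer,
# validated detectors (checksums, proximity rules, ML classifiers).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify(text: str) -> set[str]:
    """Return the set of sensitive-data labels found in a document."""
    return {label for label, rx in PATTERNS.items() if rx.search(text)}

doc = "Contact jane.doe@example.com, SSN 123-45-6789."
labels = classify(doc)
assert labels == {"email", "ssn"}
```

Files that come back with a non-empty label set are the ones that get stronger protection policies, such as an export block and placement in immutable backup storage.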

The two companies present their partnership under a zero trust umbrella and say it helps to protect against ransomware, which it certainly does as far as file theft and subsequent extortion is concerned.

Jay Chaudhry, Zscaler CEO, said: “Combining Zscaler’s and Rubrik’s leadership and expertise in zero trust data security allows our joint customers to reap the benefits of protecting their most sensitive and important data with ease.”

Rubrik is demonstrating its Zscaler integration at booth 235 in the RSA Conference 2023, which is taking place this week at San Francisco’s Moscone South Expo Hall. Zscaler also has a presence in booth 2051.

Separately, Rubrik, recently itself attacked by malware, has doubled its ransomware recovery warranty from $5 million to $10 million. It is presenting this, in part, as a response to the National Cybersecurity Strategy unveiled by the White House on March 2. The strategy calls for the industry to “rebalance” to a shared responsibility for effective cyber defense. So Rubrik says it’s stepping forward to do that.

Co-founder and CEO Bipul Sinha said: “It is important for us to expand our Ransomware Recovery Warranty to deepen trust, and to further show our customers that we stand together with them in the fight against cybercrime. We were first with our warranty, and we welcome the increase of shared accountability and responsibility in this new era of cybersecurity.” Sweetening a warranty payout as a way of selling more product is a neat marketing move.

Weebit Nano raises $40M to break ReRAM resistance

Startup Weebit Nano has raised $40 million to boost sales of its ReRAM non-volatile memory.

We call it a startup although it IPO’d on the Australian stock exchange in 2016, just one year after being founded to develop ReRAM technology invented and patented by Rice University’s Professor James Tour, an unusual route for an Israeli startup. It merged with Radar Iron in Australia and so gained an Australian Securities Exchange presence. It has raised money since by share placements rather than VC funding rounds, as with this latest $40 million, gained from selling 12 million shares to international institutional investors and existing shareholders.

Coby Hanoch, Weebit Nano

A statement from CEO Coby Hanoch said: “Our first ReRAM product is now available to customers through SkyWater Technology, and we are in advanced discussions with many leading fabs and integrated device manufacturers. Funds raised, combined with our strong balance sheet of approximately US$31 million cash at the end of December 2022, ensure we are well placed to transfer and qualify our ReRAM technology in Tier-1 fabs and foundries to capitalize on the growing global need for better performing memory technology.”

Resistive RAM (ReRAM) is a storage-class memory (SCM), storing binary values as one of two measured resistance states based on the presence or absence of oxygen filaments in a nano-porous SiOx (silicon oxide) material. The technology has DRAM-class speed (100ns writes) and endurance (~1 million write cycles and 10-year retention).

Weebit Nano has developed the Rice research into usable technology in the embedded computing market. It aims to extend it to the discrete memory market by adding a selector to its ReRAM cells with help from the French CEA/Leti research institution.

Weebit Nano SkyWater image

Having achieved initial availability from SkyWater in a 130nm process, Weebit Nano says it will use the $40 million to build on that rollout and continue product development. Hanoch said: “Our ReRAM has already demonstrated it is able to scale to smaller geometries for advanced applications, and has significant competitive advantages over other existing and emerging memory technologies.”

Intel’s Optane withdrawal, after several years of trying to establish an SCM market, educated potential customers and semiconductor foundries then left them with no 3D XPoint product future, leaving a gap. Weebit Nano wants to poke its head through that gap and tell them that ReRAM could be just what they need.

Panmnesia boosts recommendation models with CXL

Korean startup Panmnesia says it has a way of running recommendation models 5x faster by feeding data from external memory pools to GPUs via CXL caching rather than host CPU-controlled memory transfers.

Its TrainingCXL technology was developed by computing researchers at the Korea Advanced Institute of Science & Technology (KAIST), Daejeon. Panmnesia, which means “remember everything,” was started to commercialize it.

Panmnesia’s CEO is KAIST associate professor Myoungsoo Jung, and he said in a Seoul Finance interview: “ChatGPT, which has recently become an issue, requires more data as technology develops, and more memory to store it.” High-bandwidth memory (HBM) will not solve this issue: “Even if memory performance is increased through HBM, connection technology that can gather each memory module in one [virtual] place is essential to increase storage capacity.”

What Panmnesia has developed is DirectCXL, a hardware and software way of implementing CXL memory pooling and switching technology specifically for recommendation engines but benefiting other large-scale machine learning applications.

Panmnesia memory expander module (top) and chassis loaded with switch and expander modules (below)

Recommendation engine models are machine learning processes used by Facebook, Amazon, YouTube and ecommerce sites to suggest products to users based on what they are currently buying or renting. Compute Express Link (CXL) is a technology for interconnecting devices with memory across a PCIe gen 5 or gen 6 bus so that they can share data in physically separate but coherent memory areas. Panmnesia accelerates data access in distributed CXL memory pools by using the CXL caching protocol, cxl.cache, and reduces model run time by doing some processing in its CXL memory controllers.

The technology’s background is described in two complex academic IEEE papers. “Failure Tolerant Training With Persistent Memory Disaggregation Over CXL” uses Optane PMEM for the external memory. “Memory Pooling with CXL,” meanwhile, which is behind a paywall, uses DRAM.

To understand what’s going on – how Panmnesia achieves its >5x speed up – we need to visit the basics of the hardware scheme and its CXL components, and get to grips with a few recommendation engine technology concepts.

Panmnesia TrainingCXL graphic

Hardware and CXL

Panmnesia’s technology has a CXL v2 switch and CXL memory controller providing access to shared external CXL memory both to a CPU and to GPUs, each with their own local memory capacity. This shared coherent memory pool is referred to as a Host Physical Address (HPA) space and is larger than the individual memories of the GPUs and the CPU, enabling the data for large recommendation models to be stored in memory instead of being fetched piecemeal from much slower SSD storage. The controller also has a PIM (Processing-in-Memory) capability plus an additional near-data processor, oddly called a DPU. These speed up recommendation engine work by preprocessing certain kinds of data, reducing its size and processing complexity before sending it to the GPUs.

The recommendation engine model relies on data items called embedded vectors, with vectors being a set of numeric values in arbitrary dimensions – tens, hundreds or even more of them – that describe a complex data item such as a word, phrase, paragraph, object or image or video. Vector databases are used in AI and ML applications such as semantic search, chatbots, cybersecurity threat detection, product search and recommendations.

Panmnesia CXL switch-based scheme

A machine learning model takes an input vector describing an item and looks for similar items in a database, making parallel and repetitive calculations to arrive at its decision. Crudely speaking, the larger the database and the more comprehensive the vectors, the likelier you are to be recommended a follow-on Bourne movie while checking out the first one, rather than Good Will Hunting.
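That similarity lookup can be sketched in a few lines of Python. The vectors and titles below are invented; production systems use far higher-dimensional embeddings and approximate nearest-neighbor search rather than a brute-force scan.

```python
import math

# Minimal sketch of the similarity lookup a recommendation model performs:
# items are embedding vectors, and the vectors closest to the query item
# are recommended. Titles and 3-dimensional vectors are invented.

catalog = {
    "Bourne 2": [0.90, 0.80, 0.10],
    "Bourne 3": [0.88, 0.82, 0.12],
    "Good Will Hunting": [0.10, 0.20, 0.90],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def recommend(query_vec, k=1):
    """Return the k catalog titles most similar to the query vector."""
    ranked = sorted(catalog, key=lambda t: cosine(query_vec, catalog[t]),
                    reverse=True)
    return ranked[:k]

# A viewer watching the first Bourne movie (a nearby vector) is steered
# toward another Bourne film rather than Good Will Hunting.
bourne_1 = [0.91, 0.79, 0.11]
print(recommend(bourne_1, k=2))
```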

Training recommendation engines can take days, which is why a 5x speedup is desirable. The embedded vectors are loaded into the CXL controller’s DRAM. They are then, using the CXL.cache protocol, made available to the GPUs with the model being run and intermediate values fed back to the CXL controller’s memory the same way. Using the CXL.cache protocol means that the host CPU does not have to move the data from one HBM memory area to another, saving time.

Having the PIM and DPU elements in the controller preprocess the embedded vectors, reducing them in size before sending them out to the GPUs for the next training cycle, means that work can be done in parallel with the GPUs which, of course, are also operating in parallel. It all goes to save time and avoid buying more GPUs to get more GPU memory.

As enterprises adopt ChatGPT-style chatbots to operate on their own private data sets, those chatbots will need training, and this represents a large emerging market for Panmnesia. A TrainingCXL video describes the technology in its PMEM incarnation.

Panmnesia’s founders have patented some of their technology, with applications for more patents under way. All in all the technology looks worth following as Panmnesia’s academic founders establish their company and set up business processes and technology documentation. Check out its website – it has lots of content.

HPE: Alletra is our storage brand to rule them all

HPE says it is trying to simplify its portfolio branding by moving from a multitude of product and service tags to just six.

Jim Jackson

This will take place over the next three years and was confirmed by EVP and CMO Jim Jackson in a blog. Jackson is a 22-year HP/HPE veteran, and so will know about all the sometimes bewildering array – no pun intended – of product names and services that confront customers.

“At Hewlett Packard Enterprise (HPE) we are driving toward one customer experience, delivered through one integrated platform, leading with the Hewlett Packard Enterprise and HPE GreenLake edge-to-cloud platform brands.”

The branding structure is being simplified from today’s 29 product and service brands to two computing brands and one brand each for storage, networking, software and services. 

HPE branding scheme

The chosen brands include ProLiant and Cray in the compute sector, Alletra storage, Aruba networking, Ezmeral for software and just HPE for services. Existing brands outside these six will be renamed.

“To minimize disruption as we transform, we will implement these changes through the natural course of business over the next three or so years. When we introduce new products, or change the value proposition of existing products, we will apply the new brand architecture and naming strategy. When existing products reach end of life, we will retire that brand name.”

In the Storage arena, HPE currently has the Alletra brand, the SimpliVity HCI product and StoreOnce, a range of deduplicating backup target appliances that come with the InfoSight cloud-based management tool. SimpliVity and StoreOnce are not market leaders in their respective – and mature – product areas. Jackson’s post signals that these brand names now have a maximum three-year life span. 

The OEM’d Hitachi Vantara VSP 5000, sold as the high-end XP8 array and a PowerMax competitor, is not a focus for HPE, judging by its lack of HPE website prominence, although a search for the term will present options. The second generation XP8 was announced by HPE in October 2021. Hitachi Vantara has enabled its VSP software to run in the public clouds. HPE may make a decision about the XP8’s future when a gen 3 VSP 5000 comes along.

The Alletra brand encompasses the Alletra 9000, 6000, 5000, 4000, and dHCI. These products came to the Alletra brand from the Primera (9000), Nimble (6000, 5000, dHCI) and Apollo data server (4000) products.

There are a variety of branded GreenLake storage services: HPE GreenLake for File Storage, Block Storage, Alletra Storage, HCI, Disaster Recovery, and Backup and Recovery. The File Storage and Block Storage services actually run on Alletra Storage MP hardware.

Jackson chose not to add Alletra prefixes to the SimpliVity, StoreOnce and InfoSight brand names. We think this indicates they may continue under their current naming until they each reach end of life.

HPE, in a sense, is following Dell, which went through its own storage brand simplification following its EMC acquisition by applying a Power branding scheme to the many Dell and EMC storage products (VMAX, Data Domain, Isilon, etc.), resulting in PowerMax, PowerProtect, PowerScale, etc. IBM has also imposed a unifying overall brand name on its storage software products, moving from the Spectrum brand to IBM Storage.

Jackson says: “By offering fewer brands, combined with descriptive naming, we will make it easier for customers to understand which products and services are right for them, know what is designed to go together, and what their natural upgrade path is.”

SingleStore embraces ChatGPT

SingleStore is demoing ChatGPT as a way of looking for data in its unified real-time transactional and analytic database, SingleStoreDB.

The company is running a webinar on April 25 to show how the chatbot can be used on an organization’s private data. ChatGPT is a generative AI application built on GPT-3.5 and GPT-4, Large Language Models created by OpenAI. It has astonished many with its ability to respond to complex natural language queries, search public data sets, and build intelligent and comprehensive replies, even writing valid computer code. However, although its replies are often right, they can also be wrong.

SingleStore’s webinar notes say: “We believe that the next iteration of ChatGPT will be the ability to use it against data that is not publicly available (internal docs, wikis, code, meeting notes, etc). SingleStore is the ideal database for this given its abilities to store vector data, perform semantic searches and pull data from various sources without extensive ETL.” 

SingleStore CMO Madhukar Kumar (left) and Product Management Director Arnaud Comet

The hour-long webinar is being presented by SingleStore CMO Madhukar Kumar and Product Management Director Arnaud Comet. Attendees will hear how to build a ChatGPT app on their own data stored in SingleStoreDB, with JavaScript and SQL examples. The SQL queries will use vector functions, with their results fed to ChatGPT.
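As a hedged sketch of that pattern, the query builder below matches an embedding of a user’s question against stored document vectors so the top hits can be handed to ChatGPT as context. DOT_PRODUCT and JSON_ARRAY_PACK are SingleStoreDB vector functions, but the table name, columns and overall query shape here are our assumptions, not the webinar’s actual code.

```python
# Sketch of a semantic-search SQL query against SingleStoreDB. The
# internal_docs table and its embedding/body columns are hypothetical;
# DOT_PRODUCT and JSON_ARRAY_PACK are real SingleStoreDB vector functions.

def build_semantic_search_sql(query_embedding: list[float], top_k: int = 5) -> str:
    """Build a SQL string ranking stored vectors by dot-product similarity."""
    vec = ",".join(f"{v:.6f}" for v in query_embedding)
    return (
        "SELECT doc_id, body, "
        f"DOT_PRODUCT(embedding, JSON_ARRAY_PACK('[{vec}]')) AS score "
        "FROM internal_docs "
        f"ORDER BY score DESC LIMIT {top_k};"
    )

# The query embedding would come from an embedding model at runtime.
sql = build_semantic_search_sql([0.12, 0.34, 0.56])
print(sql)
```

The rows this returns would then be concatenated into the ChatGPT prompt as grounding context.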

A previous SingleStore webinar showed how applications using SingleStore integrated with MindsDB could use machine learning capabilities including a GPT-3-based facility. The presenters described how attendees could deploy a data-centric ML system on the AWS cloud with existing SQL skills.

MindsDB says it brings machine learning into databases via AI Tables; machine learning models stored as virtual tables inside a database. These tables work on database data to make predictions.

It looks as if SingleStore is adding AI Tables functionality into SingleStoreDB. Earlier this month lakehouse shipper Databricks updated its open source Dolly ChatGPT-like large language model to make its facilities available for business applications without needing massive GPU resources or costly APIs.

Comment

Two substantial analytics database suppliers are rushing out ChatGPT facilities so that non-skilled SQL users can query their organization’s database/data lakes in real time using natural language and get detailed answers that should be right – in that the chatbot’s universe is drawn entirely from the organization’s data set.

The idea is that executives and managers will be able to query data lakes without needing any help from SQL experts or data scientists, and the SQL geeks and data scientists themselves may be able to use ChatGPT facilities to become even better at their jobs. Overall it looks as if there will be far more analysis of data sets with ChatGPT used as the skillful concierge by execs and managers, and a great tool builder by SQL experts and data scientists.

Hungry hungry GPUs: SK hynix feeds 12-layer HBM chip to AI chatbot market

SK Hynix Icheon campus

SK hynix has developed a high-performance, high-capacity, high bandwidth memory chip composed of 12 DRAM chip layers bonded together to provide 24GB of DRAM. It is looking to sell these into compute-intensive markets such as the AI chatbot industry.

High bandwidth memory (HBM) was devised to get around the x86 CPU socket scheme for hooking up DRAM to CPUs. A socketed CPU has a limited number of memory channels, eight for gen 3 Ice Lake Xeons, providing about 200GB/sec of memory bandwidth. HBM, with a more direct, interposer-based processor connection, has higher bandwidth, up to 819GB/sec with SK hynix’s HBM gen 3 scheme. SK hynix started mass production of an 8-layer, 16GB HBM gen 3 chip in June last year. Now it has moved on.
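A back-of-envelope check of those bandwidth figures, assuming DDR4-3200 at 25.6GB/sec per channel (our assumption; only the roughly 200GB/sec total is given above):

```python
# Rough arithmetic comparing socketed-DRAM and HBM3 bandwidth.
# The per-channel figure assumes DDR4-3200; it is an assumption,
# not something stated by SK hynix.

ddr_channel_gbps = 25.6   # GB/s per DDR4-3200 channel (assumed)
cpu_channels = 8          # per gen 3 Ice Lake Xeon

cpu_bandwidth = ddr_channel_gbps * cpu_channels  # total socketed bandwidth
hbm3_bandwidth = 819                             # GB/s per HBM3 stack

# HBM3 works out to roughly four times the socketed total.
print(cpu_bandwidth, hbm3_bandwidth / cpu_bandwidth)
```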

A spokesperson said: “The company succeeded in developing the 24GB package product that increased the memory capacity by 50 percent from the previous product. … We will be able to supply the new products to the market from the second half of the year, in line with growing demand for premium memory products driven by the AI-powered chatbot industry.”

There have been five HBM generations so far, with JEDEC standards defining them:

B&F Table

SK hynix fabbed the basic DRAM chips and used TSV (Through Silicon Via) interconnect technology, which provides thousands of holes through the chip for electrode connections. This enabled chip thickness to be reduced by 40 percent compared with its non-TSV predecessor. That, in turn, meant 12 chips could be stacked within the same height as its prior 8-layer, 16GB product: 50 percent more capacity in the same product dimensions.
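The stacking claim checks out arithmetically: if TSV processing thins each die by 40 percent, 12 thinned dies occupy less height than eight original ones. Die thickness is normalized to one unit below, since real dimensions are not disclosed.

```python
# Sanity check on the 12-layer stack height claim. Die thickness is
# normalized to 1.0; only the 40 percent reduction is from the article.

original_die = 1.0
thinned_die = original_die * (1 - 0.40)  # 40 percent thinner with TSV

old_stack_height = 8 * original_die      # prior 8-layer, 16GB product
new_stack_height = 12 * thinned_die      # new 12-layer, 24GB product

# The 12-layer stack fits within the 8-layer product's height.
print(new_stack_height, old_stack_height)
assert new_stack_height <= old_stack_height
```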

The 12 DRAM layers were affixed together using Mass Reflow Molded Underfill (MR-MUF) technology, with which, SK hynix says, multiple chips are stacked on a base substrate and bonded through reflow, simultaneously filling the gaps between the chips, and between the chips and the substrate, with a mold material.

The reflow bonding involves two soldered parts pressed together and heated so that the solder melts and flows, and then cools to form electrical bonds.

Potential customers have evaluation samples of these 12-layer 24GB HBM chips, the highest capacity HBM3 chips available. Sang Hoo Hong, head of Package & Test at SK hynix, said: “The company plans to complete mass production preparation for the new product within the first half of the year.”

The intention and hope is that there will be a surge in enterprise adoption of large language model AI technology, like ChatGPT, which will boost GPU sales. GPUs need lots of memory and use HBM technology for that rather than slower x86 CPU socket methods.

Step forward SK hynix with fast, high-capacity HBM3 chips ready and waiting to keep those bots chatting.

Hey Presto! IBM buys VC-backed SaaS player Ahana

IBM said yesterday it was interested in mergers and acquisitions but we didn’t expect it to move this fast – it has just bought Ahana, which supplies the Presto in-memory, distributed SQL data lake query engine as a service in its Ahana Cloud.

Presto is a Facebook (now Meta)-originated open source project providing data lake analytics through a distributed SQL query engine. It is popular and has thousands of users, including Intel, Uber and Twitter. Facebook started the project in 2012 and contributed it to the Linux Foundation in 2019, which formed the Presto Foundation to host it.

The acquisition news came via a Big Blue blog by Vikram Murali, VP Hybrid Data Management, who wrote: “We’re thrilled to share that IBM has acquired Ahana, the venture-backed SaaS for Presto startup company, and we want to write more about our belief in Open Source and why IBM and Ahana are joining forces for the benefit of Presto.”

Steven Mih

Ahana was started up in 2020 in San Mateo by CEO Steven Mih, CTO David Simmen, chief product officer Dipti Borkar, and principal software engineers Vivek Bharathan, Ashish Tadose and George Wang. With $7.8 million in seed funding they wanted to make Presto easier to use and better integrated with other parts of the data lake analytics ecosystem.

Their development efforts were rewarded with a $20 million A-round fund raise the next year, a $7.2 million extended A-round last year, and ultimately a total of $32 million raised, according to VentureBeat. The acquisition price has not been revealed but speculation is in the $100 million+ area, which – if correct – would make this a classic Silicon Valley startup success story.

The Ahana Cloud runs on AWS and makes it simpler and easier to run Presto queries across huge data lakes.

Ahana graphic

At the time of the A-round, the Presto open source project had enjoyed massive growth that year, with hundreds of thousands of pulls of Ahana’s hosted Docker Sandbox Container for Presto, more than 1,000 members in global Presto meetups, and 10 companies signed on to the Presto Foundation.

Now, Murali writes: “The project has 14.6K Github stars, saw a 110 percent growth of members in the community over the past year, and an engagement rate of close to 50 percent across all of the Presto community channels.” He talks about Uber’s use of Presto: “The scale of Presto at Uber is just as impressive. With 20 clusters that process 100M+ queries each day, Uber depends on Presto to run ad-hoc analytics for its 7K weekly active users.“

Ahana graphic

IBM’s open-source credentials are stronger than ever, dating from its original support for Linux in the late ’90s, working with Linux, Apache and Eclipse, investing $1 billion from 2001 onwards, and buying Red Hat in a landmark deal in 2018 for $34 billion. Murali also noted that IBM was “a founding member of the Cloud Native Computing Foundation (CNCF), which fostered the growth of Kubernetes.” 

IBM is now a member of the Presto Foundation and inherits Ahana’s role in the Presto Foundation Outreach Committee plus Ahana’s four Presto project committers and two Technical Steering Committee members. 

Murali closes his post by saying: “We believe we’re entering an exciting next chapter of Presto, and we look forward to sharing more with the community as we move forward.”

There is nothing about the IBM acquisition on Ahana’s website yet.

Bootnote

The four Presto creators at Facebook – Dain Sundstrom, Martin Traverso, David Phillips and Eric Hwang – left Facebook in 2018, a year before Presto was donated to the Linux Foundation. They named their code fork PrestoSQL, which rebranded to Trino, and set up Starburst to sell Trino connectors and support.

There is a Facebook document about Presto’s history here.

Seagate HAMRs way to top of nearline capacity

Seagate HAMR technology

Seagate has a 30TB+ HAMR disk drive coming this quarter for its CORVAULT arrays, which will leapfrog the capacity of Western Digital’s 22TB and 26TB drives and leave Toshiba stumbling behind at the 20TB level.

Update: Is CORVAULT a JBOD or an RBOD? See bootnote at end of article. 27 April 2023.

The HAMR news was disclosed during an earnings call with financial analysts to discuss Seagate’s weak second quarter results. HAMR (Heat-Assisted Magnetic Recording) requires a read-write head to fire a laser pulse at a drive platter’s bit area to enable the otherwise hard-to-write, high-coercivity recording material to receive a magnetism change. Seagate says this will enable a step-change advance in areal density over conventional perpendicular magnetic recording (CMR or PMR) technology.

CEO Dave Mosley said: “We expect to recognize initial revenue from 30-plus terabyte platforms this quarter as part of our CORVAULT system solutions.”

Seagate CORVAULT

The Exos CORVAULT product is a self-healing JBOD (Just a Bunch Of Disks – see note below) chassis first announced two years ago. It holds 106 SAS-connected disk drives, meaning 2PB of capacity with 20TB drives. Seagate launched its first 22TB drive earlier this month for NAS, direct-attach and RAID disk drive environments. It’s not been announced as available for CORVAULT. Why bother if a 30TB+ drive is coming?

What Seagate is not doing is announcing that its 30TB+ HAMR drive will be available to its OEM partners, such as the storage array suppliers, which is curious. It has shipped samples to only one cloud service provider, with Mosley saying Seagate “achieved the key milestone last week of shipping initial qualification units to a cloud launch partner.”

Neither is it announcing the actual capacity. Instead Mosley talked about “increasing capacities from three to four to five terabytes per disk or more” over time, with “disks” meaning platters.

CFO Gianluca Romano backed this up, answering a question during the call and mentioning: “In the future, 3 terabyte per disk or 3.5 terabyte per disk and 4 terabytes per disk, those are very interesting propositions for our customers.”

A 10 x 3.5TB/platter drive would be 35TB, but Seagate keeps saying 30TB+ and not 30TB or 35TB. We envisage it could add an extra platter and have an 11 x 3TB/platter drive with 33TB. Both Toshiba and Western Digital have discussed 11-platter drives in the past. This would tally with longer than normal qualification times by OEMs and CSPs as having both an extra platter and new recording technology would require more checking for reliability, power draw, and so forth.
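The arithmetic behind that speculation, plus the CORVAULT chassis and rack sums used in this piece, is simple enough to check. The drive configurations are guesses, as noted; Seagate has confirmed none of them.

```python
# Capacity arithmetic for the speculated HAMR drive configurations and
# the CORVAULT rack sums. Platter counts and per-platter capacities are
# speculation from the article, not Seagate-confirmed figures.

def drive_tb(platters: int, tb_per_platter: float) -> float:
    """Drive capacity as platter count times per-platter capacity."""
    return platters * tb_per_platter

assert drive_tb(10, 3.5) == 35.0  # 10 x 3.5TB platters
assert drive_tb(11, 3.0) == 33.0  # 11 x 3TB platters

# Rack sums with a hypothetical 33TB drive: 106 drives per CORVAULT
# chassis, 10 chassis per rack.
chassis_pb = 106 * 33 / 1000   # roughly 3.5PB per chassis
rack_pb = 10 * chassis_pb      # roughly 35PB per rack
print(round(chassis_pb, 2), round(rack_pb, 1))
```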

Competition

Let’s settle on a 33TB HAMR drive, with 10 x 4RU CORVAULT chassis providing a 35PB rack, and ask where this leaves competitors Western Digital and Toshiba. In a hole, frankly. Our view of the three HDD suppliers’ capacity roadmaps is in this table:

Seagate, Toshiba and WD roadmaps

The current quarter is highlighted in yellow and we can see what the three suppliers have in their product ranges. Western Digital has been in the lead with 22TB CMR and 26TB SMR drives. But Seagate will leapfrog both and be able to tell customers they can store more data in less rackspace for less money with its HAMR tech. 

Western Digital discussed its HDD roadmap with analysts a year ago, but this was a directional thing with no dates, apart from HAMR drives appearing in 2026.

Western Digital HDD roadmap in 2022

We are convinced that WD will announce something like a 26TB CMR drive, using its OptiNAND flash-enhanced technology, along with a 30TB SMR (shingled magnetic recording) drive in the next couple of quarters. Otherwise it will lag Seagate and may lose nearline drive market share.

Of course it might pull its 50TB archive drive concept out of its back pocket and try to overtake Seagate, but that would have to get over a large initial qualification hurdle and overcome customer concerns about being locked in to a single supplier.

Toshiba is in a bad place. It has already dropped behind the schedule outlined in its 2022 roadmap and faces falling even further behind as it is still at the 20TB CMR drive level:

B&F diagram showing Toshiba’s HDD roadmap in 2022

The nearline disk drive market is the HDD market these days. There’s little or no future in 2.5-inch, 10K mission-critical disk drives, and the notebook and desktop drive markets are being taken over by SSDs. Toshiba will need to keep up with nearline drive capacity demands.

TL;DR

Seagate could be about to become the disk drive market technology and capacity leader if its HAMR tech delivers the cost and reliability goods, leaving WD in second place and Toshiba in danger of lagging behind.

Bootnote

Seagate took exception to CORVAULT being described as a JBOD, even a self-healing JBOD, with a spokesperson saying: “CORVAULT is [an] RBOD (reliable bunch of disks) not JBOD. However Seagate marketing doesn’t call it [an] RBOD, they prefer to call it “self-healing”.

In our view CORVAULT is a JBOD with embedded intelligence (a VelosCT ASIC) giving it a self-healing capability. It is not a block storage array in the Dell, HPE, NetApp, etc. sense. It is a kind of RAID array, as a CORVAULT data sheet says it supports “Seagate ADAPT erasure coding -or- RAID 5, 6” for data protection.

So RBOD I think it shall be, for want of anything better.