
What a mesh. How to tell a lakehouse from a data lake

Ever get confused about the differences between a data warehouse, data lake, lakehouse, and data mesh? Bruno Rodrigues Lopes, a senior solutions architect at Bradesco, posted a handy set of definitions and differences on LinkedIn which provide a good way of viewing these different data constructs.

Let’s set an anchor with the difference between a database and a data warehouse. A database stores real-time, current records of structured data. A data warehouse, such as Teradata, introduced in the late 1970s, stores older records in a structured data store with a fixed schema for historical data analysis. Data is typically loaded into a data warehouse through an extract, transform, and load (ETL) process applied to databases. Business intelligence users query the data, using structured query language (SQL) for example, and the warehouse code is designed for fast query processing.
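To make the ETL step concrete, here is a minimal sketch in Python of extracting current records from an operational database, transforming them, and loading them into a warehouse-style history table. It uses SQLite purely for illustration, and the table and column names are hypothetical rather than taken from any particular product.

```python
# Minimal ETL sketch: extract current orders from an operational database,
# transform them, and load them into a warehouse-style fact table.
# All table and column names here are hypothetical illustrations.
import sqlite3
from datetime import date

src = sqlite3.connect(":memory:")   # stands in for the operational database
dst = sqlite3.connect(":memory:")   # stands in for the data warehouse

src.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, status TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1250, "shipped"), (2, 499, "cancelled"), (3, 8000, "shipped")])

dst.execute("CREATE TABLE fact_orders (order_id INTEGER, amount_usd REAL, load_date TEXT)")

# Extract: pull the current records from the source database.
rows = src.execute("SELECT id, amount_cents, status FROM orders").fetchall()

# Transform: filter and reshape them for historical analysis.
cleaned = [(oid, cents / 100.0, date.today().isoformat())
           for oid, cents, status in rows if status == "shipped"]

# Load: append the transformed rows to the warehouse fact table.
dst.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", cleaned)
dst.commit()

print(dst.execute("SELECT * FROM fact_orders").fetchall())
```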

Warehouse vs data lake vs lakehouse vs mesh
Diagram by Brij Kishore Pandey

A data lake holds a large amount of structured, semi-structured, and unstructured data. Reading the data generally means filtering it through a structure defined at read time, and it can involve an ETL process to move the read data into a data warehouse. An S3 bucket or other object store can be regarded as a data lake, with its contents used by data scientists and for machine learning.
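The defining trait is schema-on-read: the lake simply stores raw files, and structure is only imposed when the data is read. A minimal PySpark sketch of that idea, assuming PySpark is installed and using a hypothetical S3 path, might look like this:

```python
# Schema-on-read sketch: the lake holds raw JSON files; the structure below
# exists only in this read step, not in the stored files themselves.
# Assumes PySpark is installed; the S3 path is hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("lake-read").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("value", DoubleType()),
])

events = (spark.read
          .schema(schema)                                 # structure defined at read time
          .json("s3a://example-data-lake/raw/events/"))   # hypothetical bucket and prefix

# The filtered, structured result could now be fed into a warehouse via ETL.
events.filter(events.value > 0).show()
```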

As we might expect from its name, a data lakehouse aims to combine the attributes of a data warehouse and a data lake, and it is used both for business intelligence, like a data warehouse, and for machine learning workloads, like a data lake. Databricks’ Delta Lake is an example. It has its own internal ETL functions, referencing internal metadata, to feed its data warehouse component.
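As a rough illustration of the lakehouse pattern, the sketch below writes a small table in Delta format and queries it back through Spark. It assumes a Spark session configured with the delta-spark package; the storage path and column names are invented for the example.

```python
# Lakehouse sketch: write a Delta table and query it back with Spark.
# Assumes the delta-spark package is on the classpath; names are illustrative.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("lakehouse-demo")
         .config("spark.sql.extensions",
                 "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

df = spark.createDataFrame(
    [("2023-09-22", "EMEA", 120.0), ("2023-09-22", "APAC", 95.5)],
    ["day", "region", "revenue"])

# Delta keeps a transaction log and metadata alongside the data files, which is
# what lets BI-style queries and ML jobs share the same store.
df.write.format("delta").mode("append").save("/tmp/lakehouse/sales")

(spark.read.format("delta").load("/tmp/lakehouse/sales")
      .groupBy("region").sum("revenue").show())
```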

A data mesh is a way of organizing the use of a data warehouse or lakehouse. When these are first deployed, a centralized data team typically handles requests from management and line of business (LOB) owners who need analysis routines run in the warehouse or lakehouse, as well as maintaining the warehouse/lakehouse itself. As the analysis workload rises, this central team can become a bottleneck, preventing timely creation of analytic routines. Its members often don’t understand LOB constraints and preferences, adding more delay.

If the LOB’s domain-specific people can write their own queries, bypassing the central team, then their query routines get written and run faster. This distribution of query production to LOB teams, which operate on a decentralized, self-service basis, is the essence of the data mesh approach. They design and operate their own data pipelines, using the data warehouse and lakehouse as a data infrastructure platform.

The data mesh approach is supported by suppliers such as Oracle, Snowflake, Teradata, and others.

There we have it – four data container constructs parceled up, ready to be stored in your mental landscape and deployed when needed.

Storage news ticker – September 22

Cloud storage provider Backblaze has released Computer Backup 9.0 in early access, which it says eliminates stress over restoring data. The version removes the 500GB size limit on restores and comes with a dedicated restore app for macOS and Windows clients. Computer Backup offers free egress and is priced at a fixed cost based on employee count, with no limit on the number of computers per organization.

New research commissioned by data warehouse company ClickHouse shows it is speedier and more cost-effective than Snowflake for real-time analytics. ClickHouse claims it improves data loading and query performance by up to 2x and compression by 38 percent, even with Snowflake optimizations applied, and reckons it can achieve this with a 15x reduction in cost. See ClickHouse’s blog (part one and part two).

Data protector CrashPlan has launched CrashPlan for MSPs, a dedicated program for IT Managed Service Providers and Managed Security Service Providers. They can now provide their customers with CrashPlan’s endpoint backup and recovery SaaS offering. CrashPlan for MSPs’ cloud-based SaaS deployment requires zero on-premises hardware and comes with out-of-the-box system defaults to get customers up and running. Direct billing to the MSP on a monthly, per-user basis gives the option to add or remove customers without having to commit to a cap, and perform customer chargebacks based on consumption.

Salesforce and Databricks have expanded a strategic partnership that delivers zero-ETL (Extract, Transform, Load) data sharing in Salesforce Data Cloud and lets customers bring their own Databricks AI models into the Salesforce platform.

Storage benchmark SPC-1

China’s Huarui Expon Technologies, the SPC’s newest member, published its first SPC-1 result and set a new performance record. A cluster of 32 ExponTech WDS V3 all-flash nodes demonstrated 27,201,325 SPC-1 IOPS with an SPC-1 IOPS Response Time of 0.217ms. ExponTech WDS V3 is a fully self-developed software-defined, high-performance distributed enterprise-level block storage platform designed for large-scale core data processing applications. ExponTech WDS V3 adopts a decentralized distributed system architecture, which it says enables smooth horizontal expansion while maintaining linear scalability of performance and capacity with the number of nodes. The platform supports NVMe SSD, SATA SSD, and SCM drives as primary storage media and is compatible with both 25G/100G RDMA RoCEv2 and traditional 10G TCP/IP networking technologies.


ibi, which supplies scalable data and analytics software, has announced Open Data Hub for Mainframe, which brings mainframe data directly into web-based business analytics. Business users can integrate and manipulate disparate data in real time and import rich information into their preferred analytics tools. They can avoid the costs of large ETL projects and eliminate stale data by accessing the data in place on the mainframe. Data scientists gain access to previously inaccessible mainframe data, enabling them to create custom queries without SQL and work with deeper, more complex datasets in real time for comprehensive analytics. Developers benefit from an intuitive GUI that generates data queries without custom SQL while providing seamless access to multiple databases, boosting productivity and software quality.

IBM has announced a Cyber Recovery Guarantee for its FlashSystem storage products. It guarantees to recover a user’s immutable snapshot data in 60 seconds or less, enabling businesses to persist through a ransomware attack when protected by IBM FlashSystem with Safeguarded Copy. If recovery takes longer than 60 seconds or the data is unrecoverable, IBM Expert Lab will try to help the client get back on their feet. Find out more here.

Kioxia America has donated a command set specification to the Linux Foundation’s vendor-neutral Software-Enabled Flash Project. Software-Enabled Flash technology gives storage developers control over their data placement, latency outcomes, and workload isolation requirements. Through its open API and SDKs, hyperscale environments may optimize their own flash protocols, such as flexible data placement (FDP) or zoned namespace (ZNS), while accelerating adoption of new flash technologies. Kioxia has developed working samples of hardware modules for hyperscalers, storage developers, and application developers. The Linux Foundation’s Software-Enabled Flash Project offers several levels of membership and participation.

Lightbits Labs has announced Lightbits v3.4.1 with a transition to full userspace operation, completely eliminating Linux kernel dependencies for the first time. It has ported Lightbits kernel code to userspace and optimized it. This release extends compatibility to new distributions, including RHEL 9.2, Alma Linux 9.2, and Rocky Linux 9.2, and we’re told makes upgrades simpler and faster. The release adds various observability improvements to Lightbits and to its monitoring stack. Lightbits has now made its debut on the Azure marketplace and is offering an early preview of its new Azure Managed App Marketplace offering. To use it, click “subscribe” on the marketplace.

Pacific Northwest National Laboratory is collaborating with Microsoft and Micron to make computational chemistry broadly available to applied researchers and industrial users. The project, known as TEC4 (Transferring Exascale Computational Chemistry to Cloud Computing Environment and Emerging Hardware Technologies), is part of a broad effort announced by the Department of Energy to quicken the transfer of technology from fundamental research to innovation that can be scaled into products and capabilities. Instead of using a centralized supercomputer, the team will use Microsoft’s Azure Quantum Elements, which features simulation workflows augmented by artificial intelligence, and incorporate Micron’s CXL memory expansion modules.

The Presto Foundation has announced two new contributions to Presto, the open source SQL query engine for data analytics and data lakehouse. IBM is donating its AWS Lake Formation integration for Presto, and Uber is donating its Redis-based historical statistics provider. AWS Lake Formation is a service intended to make it easy to set up a secure data lake in a matter of days, providing the governance layer for AWS S3. With the AWS Lake Formation and Presto integration, data platform teams will be able to integrate Presto natively with AWS Glue, AWS Lake Formation, and AWS S3 while providing granular security for data.

Storage supplier Scality has achieved a Silver rating from EcoVadis, the world’s largest provider of business sustainability ratings. EcoVadis has become the global standard by rating more than 100,000 companies globally. Scality’s score jumped into the top 25 percent of all organizations ranked by EcoVadis for sustainability improvements, and was based on benchmarks that evaluate participating organizations on 21 sustainability criteria metrics across four core pillars: Environment, Labor and Human Rights, Ethics, and Sustainable Procurement.

SoftIron has been authorized by the Common Vulnerability and Exposures (CVE) Program as a CVE Numbering Authority. The CVE Program seeks to provide a common framework to identify, define, and catalog publicly disclosed vulnerabilities. As each vulnerability is detected, reported, and assessed, a CVE ID is assigned and a CVE Record is created. This ensures information technology and cybersecurity professionals can identify and coordinate efforts to prioritize and address vulnerabilities and protect systems against attack.

Starburst, which supplies data lake analytics software, has announced new capabilities for its data lake analytics platform including Dell storage support for ECS, ObjectScale, and Ceph in Starburst Enterprise. On-premises connectivity in Starburst Galaxy extends beyond the “cloud data source only” world and allows enterprises moving to the cloud to still access their on-prem datasets. There is integration of Starburst Galaxy and Databricks Unity Catalog to provide an additional metastore alongside AWS Glue, Hive HMS, and Galaxy Metastore so customers have access to and can blend modern data sources without migration or reconfiguration. The on-prem connectivity is probably the most significant addition here as it’s highly differentiated from other solutions on the market that are more cloud-centric. The idea is to give options to those at early stages of migration or those who must keep data on-prem for regulatory/data localization reasons so they can still use that data.

Synology has attained Veeam Ready – Repository certification for its enterprise-oriented storage units. These units also support Active Backup for Business.

Reuters reports that a $14 billion tender offer from private equity firm Japan Industrial Partners (JIP) for Toshiba has ended in success and will enable the troubled conglomerate to go private. The JIP-led consortium saw 78.65 percent of Toshiba shares tendered, giving the group a majority of more than two-thirds which would be enough to squeeze out remaining shareholders. Toshiba owns about 40 percent of SSD manufacturer Kioxia, which is involved in merger talks with its NAND foundry joint venture partner Western Digital.

Marianne Budnik, CMO at storage company VAST Data
Marianne Budnik

VAST Data has appointed its first CMO, Marianne Budnik, who will lead all marketing functions including product marketing, field marketing, customer advocacy, communications and public relations, content, creative, and brand strategy. She will work closely with VAST Data co-founder Jeff Denworth, who will maintain his responsibilities leading the company’s product and commercial strategies. Budnik comes to VAST from marketing exec roles at CrowdStrike, CyberArk, and CA Technologies, along with nearly a decade in marketing leadership roles at EMC Corporation. VAST has also appointed Stacey Cast as VP of global operations, responsible for VAST’s operational processes covering product quality and delivery. She joins from being SVP of Business Operations at H2O.ai, and previously held similar roles at SentinelOne, Cohesity, Nimble Storage, and NetApp.

Veeam commissioned Censuswide to conduct a survey of UK business leaders to understand how the rising threat of cyber-attacks is affecting their companies. Some 43 percent indicate that ransomware is a bigger concern than all other critical macroeconomic and business challenges, including the economic crisis, skills shortages, political uncertainty, and Brexit. Of the 100 directors of UK companies surveyed, a fifth considered dissolving their business in the year after an attack and 77 percent reduced staff numbers. While turnover, customer retention, and productivity were also hit, these aren’t the only negative consequences as the survey also uncovered widespread psychological effects on respondents related to ransomware attacks.

Weka says its Data Platform’s Converged Mode is the first scale-out storage to run on deep learning instances for users running workloads in the cloud. It uses ephemeral local storage and memory in cloud AI instances to deliver cost savings and performance improvements for large-scale generative AI projects compared to traditional data architectures. Weka developed this in collaboration with its customer Stability AI, an open source generative AI company, and it is intended to enhance Stability AI’s ability to train multiple AI models in the cloud, including its popular Stable Diffusion model, and extend efficiency, cost, and sustainability benefits to its customers.

Zadara was placed in the “leader” position in GigaOm’s Storage-as-a-Service (STaaS) Sonar Report, beating competitors including HPE, NetApp, IBM, Pure Storage, and others. It achieved a rating of “exceptional” in categories relating to cost, expansion, ease of use, data plane and protocol support, and multi-tenancy. For more information, see the full report.

Komprise wants to help monitor sprawling data estates

Storage Insights from Komprise provides file and object capacity and usage information across multiple suppliers, locations, and clouds.

Update: Datadobi response to Komprise’s competitive characterisations added. 3 Oct 2023.

It’s a single interface designed to let IT admins spot storage usage and consumption trends across their hybrid cloud estate, drill down, and execute plans and actions in one place to drive a better return on data storage deployments. It uses, we’re told, Komprise’s Transparent Move Technology to execute data movement plans and migrate data without disrupting users or applications or obstructing data access.

Storage Insights is included in the v5.0 Komprise Intelligent Data Management release.

Kumar Goswami, Komprise
Kumar Goswami

Kumar Goswami, co-founder and CEO of Komprise, said: “Enterprise storage is becoming more distributed across on-premises, multi-cloud and edge environments, and often across multiple vendor systems. This latest release gives customers an easier, faster way to proactively manage and deliver data services across this complex hybrid IT environment while optimizing their data storage investments.”

In March, Komprise announced an Analysis offering available as-a-service with a set of pre-built reports and interactive analysis. It could look at all file and object storage – including NetApp, Dell, HPE, Qumulo, Nutanix, Pure Storage, Windows Server, Azure, AWS, and Google – to see a unified analysis showing how data is being used, how fast it’s growing, who is using it, and what data is hot and cold. There was no ability to move files or objects with this service and a separate purchase was needed to get that functionality.

Storage Insights has that data movement capability, we’re told. Komprise bills Storage Insights as a new console that will be included in all editions of Komprise, including Komprise Analysis. Komprise already provides visibility across heterogeneous storage, and Storage Insights adds storage metrics to further simplify customers’ data management. It gives admins the ability to drill down into file shares and object stores across locations and sites, and look at metrics by department, division, or business unit, such as:

  • Which shares have the greatest amount of cold data?
  • Which shares have the highest recent growth in new data?
  • Which shares have the highest recent growth overall?
  • Which file servers have the least free space available?
  • Which shares have tiered the most data?

They can then:

  • Tier cold data transparently from the shares that have the highest amount of cold data to cheaper storage
  • Identify cloud migration opportunities such as moving least modified shares or copying project data to data lakes
  • Identify potential security threats and ransomware attacks on data stores with anomalous activity such as high volume of modifications
  • Set alert thresholds to see unusual activity or other data requiring fast actions such as storage nearing capacity

Users can customize and filter reports to understand the current state of storage assets across sites. They can see details on capacity and the percentage of modified or new data, and can filter by shares, status, data transfer roles, and more. Admins can sort file shares and view them by largest, most cold data, highest recent modified data, least free space, most and least data archived or tiered by Komprise, and more. They can also look into specific file servers/services, such as NetApp, Dell EMC Isilon (PowerScale), Pure Storage, AWS, Azure, and Windows, to check what’s going on and keep their unstructured data estate healthy.

Storage Insights screen grab.

Krishna Subramanian, Komprise president and COO, told us: “Storage Insights is a management console to quickly understand both data usage and storage consumption as well as where you add new data stores for analysis and data management activities. We created this because several of our customers were frustrated that each storage vendor tends to report free space or storage consumption differently and they had to search across different places for this information. By adding it to our consoles, customers do not have to look in multiple places and they get one consistent definition and view of their storage metrics as well.”

Storage Insights appears similar to Datadobi’s StorageMap offering. This has an environmental dimension that Storage Insights does not. Datadobi says it helps users meet their CO2 reduction targets through carbon accounting of unstructured data storage in their hybrid clouds.

Subramanian said: “Migration vendors like DataDobi have realized the value of visibility and are now adding another product for analytics but historically this approach has lacked traction because it is a separate product, not built in as a cohesive integration, and does not address managing data at scale.” (See update note below.)

Other competitors include Data Dynamics with its StorageX file lifecycle management software, and Hammerspace with its unstructured data orchestration offering.

Komprise’s v5.0 Intelligent Data management software release also includes new pre-built reports, including Potential Duplicates, Orphaned Data, Showback, Users, and Migrations reports as well as other platform updates. Storage Insights is also available with Komprise Analysis and Komprise Elastic Data Migration. More information here.

Update. A Datadobi source said: “The premise of Subramanian’s statement is untrue. Datadobi offers one and only one solution, StorageMAP. It is, in fact, Komprise that offers multiple “products,” as can be witnessed on their website.”  

“We should note that we do, however, agree with Subramanian’s statement that “this approach has lacked traction because it is a separate product, not built in as a cohesive integration, and does not address managing data at scale.” But, it is important to point out that this is true of Komprise’s multiple products, not Datadobi’s one and only StorageMAP. Only StorageMAP offers the ability to assess, organize, and act from a single platform in order to gain optimum value from data, while reducing risk, cost, and carbon footprint.”

Kioxia reportedly refinancing loan for possible WD flash merger

Kioxia is looking to refinance a ¥2 trillion ($14 billion) loan arrangement pursuant to its potential merger with Western Digital’s NAND and SSD business unit, at least according to sources who spoke to Bloomberg.

Under pressure from activist investor Elliott Management, Western Digital is examining the possible split of its hard disk drive and NAND/SSD business and the latter’s subsequent merger with Kioxia. Western Digital and Kioxia have a joint venture to operate NAND fabs in Japan, with each taking approximately half the chip output to build SSDs (WD) or sell on raw chips (Kioxia).

The merger discussions have been ongoing since January. The background can be referenced here. Kioxia would own 49.5 percent of the combined business, with Western Digital having 50.5 percent.

Kioxia is 56.24 percent owned by a Bain Capital-led private equity consortium and 40.64 percent owned by Toshiba, its original parent. Publicly owned Toshiba is going through a $14 billion buyout process to take it private, and Kioxia’s ability to negotiate with WD may be affected by Toshiba concerns and delayed decisions. Western Digital is also experiencing a downturn in its revenues from a severely depressed disk drive market.

The Japan Times, citing anonymous people close to the situation, reports that Kioxia’s bankers, including Sumitomo Mitsui Financial Group, Mizuho Financial Group, and Mitsubishi UFJ Financial Group, intend to file commitment letters. These banks will provide equal shares of a ¥1.3 trillion ($9.1 billion) amount. The Development Bank of Japan will provide ¥300 billion ($2 billion) and the remaining ¥400 billion ($2.8 billion) will be loan commitments, making up the ¥2 trillion total.

Some of the loan money will fund dividends payable to the existing Kioxia shareholders.

A combined Kioxia/Western Digital NAND business would have a 34 percent revenue share of the NAND market, according to TrendForce numbers from November 2022, more than current market leader Samsung’s 31 percent share. The merger makes market sense in this regard. The merged business would have its stock traded on Nasdaq and also pursue a Tokyo exchange listing.

Neither Bain, the identified banks, Kioxia nor Western Digital responded to Japan Times requests for comments. Kioxia declined to comment to Reuters as did the three named banks.

Allocating AI and other pieces of your workload placement puzzle

COMMISSIONED: Allocating application workloads to locations that deliver the best performance with the highest efficiency is a daunting task. Enterprise IT leaders know this all too well.

As applications become more distributed across multiple clouds and on-premises systems, they generate more data, which makes them both more costly to operate and harder to move as data gravity grows.

Accordingly, applications that fuel enterprise systems must be closer to the data, which means organizations must move compute capabilities closer to where that data is generated. This helps applications such as AI, which are fueled by large quantities of data.

To make this happen, organizations are building out infrastructure that supports data needs both within and outside the organization – from datacenters and colos to public clouds and the edge. Competent IT departments cultivate such multicloud estates to run hundreds or even thousands of applications.

You know what else numbers in the hundreds to thousands of components? Jigsaw puzzles.

Workload Placement and… Jigsaw Puzzles?

Exactly how is placing workloads akin to putting together a jigsaw puzzle? So glad you asked. Both require careful planning and execution. With a jigsaw puzzle – say, one of those 1,000-plus piece beasts – it helps to first figure out how the pieces fit together, then assemble them in the right order.

The same is true for placing application workloads in a multicloud environment. You need to carefully plan which applications will go where – internally, externally, or both – based on performance, scalability, latency, security, costs and other factors.

Putting the wrong application in the wrong place could have major performance and financial ramifications. Here are four workload types and considerations for locating each, according to findings from IDC research sponsored by Dell Technologies.

AI – The placement of AI workloads is one of the hottest topics du jour, given the rapid rise of generative AI technologies. AI workloads comprise two main components – inferencing and training. IT departments can run AI algorithm development and training, which are performance intensive, on premises, IDC says. And the data is trending that way, as 55 percent of IT decision makers Dell surveyed cited performance as the main reason for running GenAI workloads on premises. Conversely, less intensive inferencing tasks can be run in a distributed fashion at edge locations, in public cloud environments or on premises.

HPC – High-performance computing (HPC) applications also comprise two major components – modeling and simulation. And like AI workloads, HPC model development can be performance intensive, so it may make sense to run such workloads on premises where there is lower risk of latency. Less intensive simulation can run reliably across public clouds, on premises, and edge locations.

One caveat for performance-heavy workloads that IT leaders should consider: Specialized hardware such as GPUs and other accelerators is expensive. As a result, many organizations may elect to run AI and HPC workloads in resource-rich public clouds. However, running such workloads in production can cause costs to soar, especially as the data grows and the attending gravity increases. Moreover, repatriating an AI or HPC workload whose data grew 100x while running in a public cloud is harsh on your IT budget. Data egress fees may make this prohibitive.

Cyber Recovery – Organizations today prioritize data protection and recovery, thanks to threats from malicious actors and natural disasters alike. Keeping a valid copy of data outside of production systems enables organizations to recover data lost or corrupted due to an adverse event. Public cloud services generally satisfy organizations’ data protection needs, but transferring data out becomes costly thanks to high data egress fees, IDC says. One option is hosting the recovery environment adjacent to the cloud service – for example, in a colocation facility that has a dedicated private network to the public cloud service. This eliminates egress costs while ensuring speedy recovery.

Application Development – IT leaders know the public cloud has proven well suited for application development and testing, as it lends itself to the developer ethos of rapidly building and refining apps that accommodate the business. However, private clouds may prove a better option for organizations building software intended to deliver a competitive advantage, IDC argues. This affords developers greater control over their corporate intellectual property, but with the agility of a public cloud.

The Bottom Line

As an IT leader, you must assess the best place for an application based on several factors. App requirements will vary, so analyze the total expected ROI of your workloads placements before you place them.

Also consider: Workload placement is not a one-and-done activity. Repatriating workloads from various clouds or other environments to better meet the business needs is always an option.

Our Dell Technologies APEX portfolio of solutions accounts for the various workload placement requirements and challenges your organization may encounter as you build out your multicloud estate. Dell APEX’s subscription consumption model helps you procure more computing and storage as needed – so you can reduce your capital outlay.

It’s true: The stakes for assembling a jigsaw puzzle aren’t the same as allocating workloads in a complex IT environment. Yet completing both can provide a strong feeling of accomplishment. How will you build your multicloud estate?

Learn more about how Dell APEX can help you allocate workloads across your multicloud estate.

Brought to you by Dell Technologies.

Solidigm SSD looks to outpace rivals on random writes

Solidigm has announced a storage-class memory (SCM) datacenter write-caching SSD, taking direct aim at Micron’s XTR drive and Kioxia’s FL6 drive and claiming faster random write speeds.

The D7-P5810 is a 144-layer 3D NAND drive in 1bit/cell (SLC) format with 800GB capacity in a U.2 (2.5-inch) form factor and a PCIe gen 4×4 NVMe interface. A 1.6TB version is slated to arrive in the first half of next year. Solidigm positions the drive as a fast write cache sitting in front of slower mass capacity QLC (4bits/cell) drives such as its own 61.4TB D5-P5336. The D7-P5810 can also be used in high-performance computing (HPC) applications.

Greg Matson, Solidigm VP of Strategic Planning and Marketing, said: “Solidigm has now further expanded its industry-leading endurance swim lane coverage [with] a new ultra-fast datacenter SSD with compelling specifications to serve customers’ very high write-intensive needs.”

Solidigm says the D7-P5810 can be used with QLC drives in a cloud storage acceleration layer (CSAL) deployment. CSAL is open source software that uses SCM as a write cache to shape large, small, random, and sequential write workloads into large sequential writes, which can then be written to QLC SSDs to improve their endurance.
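The write-shaping idea can be illustrated with a toy sketch: small random writes are staged on the fast SLC/SCM layer and flushed to the QLC backing store as one large sequential segment. This is purely conceptual and not CSAL’s actual code; the segment size and the append_segment call are invented for the example.

```python
# Toy illustration of write shaping (not CSAL's actual implementation):
# small random writes are buffered on a fast cache device and flushed to the
# QLC backing media as large sequential segments, improving QLC endurance.
SEGMENT_BYTES = 64 * 1024 * 1024  # hypothetical flush segment size

class WriteShaper:
    def __init__(self, backing_store):
        self.backing = backing_store      # e.g. a QLC SSD pool (hypothetical object)
        self.buffer = []                  # staged writes held on the SCM cache
        self.buffered_bytes = 0

    def write(self, offset, data):
        # Stage the write on the fast cache layer instead of hitting QLC directly.
        self.buffer.append((offset, data))
        self.buffered_bytes += len(data)
        if self.buffered_bytes >= SEGMENT_BYTES:
            self.flush()

    def flush(self):
        # Order the staged writes and emit them as one large sequential segment.
        segment = b"".join(data for _, data in sorted(self.buffer))
        self.backing.append_segment(segment)   # append_segment is a stand-in call
        self.buffer.clear()
        self.buffered_bytes = 0
```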

Solidigm claims the D7-P5810 delivers caching, high-performance computing (HPC), metadata logging, and journaling for write-intensive workloads, with nearly 2x better 4K random write IOPS performance than Micron’s 960GB XTR NVMe SSD, also launched as an SLC caching drive.

We’ve tabulated the basic performance data for 3D NAND SLC SCM caching drives provided by Kioxia, Micron, and Solidigm to see how they stack up:

Solidigm vs Micron, Kioxia

Solidigm’s D7-P5810 certainly does have better random write IOPS than Micron’s XTR in its 960GB guise, also beating the 1.92TB variant’s 350,000 IOPS, and it exceeds the 400,000 IOPS of Kioxia’s FL6.

Solidigm’s new drive appears – from the numbers it provided – to have the worst sequential write performance of the three, though.

It says the D7-P5810’s endurance is 50 drive writes per day (DWPD) for random writes and 65 DWPD for sequential ones. Micron’s 960GB XTR does 35 DWPD with random writes and 60 with sequential ones – both less than Solidigm’s.
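As a back-of-the-envelope check on what 50 DWPD means in written volume, here is the arithmetic, assuming a five-year warranty period – typical for enterprise SSDs, but not confirmed in the figures quoted above:

```python
# Rough endurance arithmetic for the 800GB D7-P5810 at its random-write rating.
# The five-year period is an assumption, not a quoted spec.
capacity_tb = 0.8          # 800GB drive
dwpd_random = 50           # drive writes per day, random workload
years = 5

total_tb_written = capacity_tb * dwpd_random * 365 * years
print(f"~{total_tb_written:,.0f} TB written over {years} years")  # ~73,000 TB (73PB)
```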

The D7-P5810 provides 53µs read latency and 15µs write latency, with Micron’s 960GB XTR delivering 60µs read latency and the same 15µs write latency. Kioxia’s FL6 has 29µs read latency and 8µs write latency, being the fastest of the three in latency terms.

The Solidigm and Micron drives have the same active and idle state power draws – 12W and 5W respectively. The 800GB version of Kioxia’s FL6 is rated at 14W in the active state and 5W in what Kioxia calls its ready state, meaning idle as we understand it, making it the poorest of the three in power consumption terms.

Check out more D7-P5810 info here. The D7-P5810 will be on display at the Solidigm/SK hynix booth (A8) at the Open Compute Summit (OCP) in San Jose, CA, October 17-19, 2023. 

HYCU CEO wrote the book on SaaS data apocalypse

We face a looming SaaS data apocalypse, according to HYCU co-founder and CEO Simon Taylor, and he’s written a book about it.

SaaS data protector HYCU announced its R-Cloud scheme in February. This involves SaaS app vendors using HYCU APIs to add backup services to their apps. HYCU aims to have 100 SaaS app connectors ready within months and is targeting 500 in a few years.

Taylor’s book is like a 183-page piece of sales collateral for R-Cloud, a weighty solution brief-type document arguing that users are facing malware-caused SaaS app data losses. Since there are so many SaaS apps in use – 17,000 in the USA and 23,000 worldwide, he says – we could see critical services shut down through such attacks.

His text argues that there is a shared responsibility model between SaaS app suppliers and their customers. The vendors look after the security of their infrastructure and the customers look after the security of their data – or should, because Taylor claims they mostly don’t.

He writes that HYCU ”research led us to a shocking discovery. Among the 23,000 SaaS vendors globally, only five of them were backed up and recoverable via the world’s leading data protection vendors. Not five percent, but five. Just five vendors.”

He asks: “What risks did this massive gap pose to organizations relying on these SaaS vendors? As we delved deeper into these issues, we began to realize the magnitude of what we were up against – a potential SaaS data apocalypse. And we knew it was imperative to shed light on this impending issue, lest we find ourselves unprepared for a disaster of an unprecedented scale.”

Taylor leads a company with big ambitions for SaaS data protection sales, which would encourage him to be energetic in his claims about the risks of not protecting SaaS data. He may be referring to an accumulation of myriad smaller SaaS app data losses rather than a single massive one, though, or both.

Taylor says: “The looming threats to SaaS data are not theoretical. They are real, palpable, and have the potential to inflict catastrophic damage.” The threats include negligent or malicious employees, human error, accounts left active after employees have quit, and malware penetrating third-party apps integrated with SaaS apps. He wants us to be scared and informed about the risks. “In the end, the value of data is incalculable. It’s the lifeblood of modern businesses, governments, and institutions. Protecting it should be one of our highest priorities. The SaaS data apocalypse is not an inevitability. It’s a potential future that we can, and should, strive to prevent. By understanding the scope of the challenge, acknowledging our shared responsibility, and taking decisive steps to protect our data, we can navigate towards a safer, more secure digital future.”

Some of the book’s flashier quotes:

  • The SaaS data apocalypse is “the shadow cast by the brilliant light of technology, a specter looming ever larger with each passing day.”
  • “The sheer scale of data handled by these [SaaS] services and the widespread reliance on them have painted a bulls-eye on their backs. It’s not a question of ‘if’ but ‘when’ a serious breach will occur, with potentially catastrophic consequences.”
  • “It’s a ticking time bomb, and the clock is rapidly winding down.”
  • “As the CEO of HYCU, a company deeply involved in data protection, I’m privy to the darker underbelly of our digital world. I’ve seen firsthand how the promise of technology can be subverted, turning a tool for progress into a weapon of disruption. It’s a sobering reality, but it’s one we must face head-on.”
  • “Imagine a world, much like our own, where data breaches extend beyond the business realm, piercing through the very fabric of our daily lives, to impact those most defenseless – our children, our elderly, our sick, our needy. Consider the vulnerability of those for whom a SaaS data breach would carry life-altering consequences.”
  • “Our defenseless, our vulnerable, those we have a duty to protect, are at risk. This is the heart-wrenching, terrifying reality we confront as we peer into the abyss of a full-blown SaaS data apocalypse.”

HYCU is ready to save us from these risks, though. There’s more of this in the book, available for $11.39 on Amazon, although HYCU may be sending it out to its customers and prospects. The message is: “Get your SaaS app data protected.” That’s the only way, Taylor says, to avert the SaaS data apocalypse. It will also help improve HYCU’s business prospects.

Gartner primary storage MQ shows consistency at the top

The Gartner primary storage Magic Quadrant has consolidated, with eight Leaders – the same as last year – just one Challenger, and one Niche Player, as three suppliers leave the MQ.

Out of the ten suppliers, eight are Leaders in this year’s MQ – no change in their number from last year or the year before. As a reminder, the Magic Quadrant is a 2D space defined by axes labelled “Ability To Execute” and “Completeness of Vision”, split into four squares tagged “Visionaries” and “Niche Players” at the bottom, and “Challengers” and “Leaders” at the top. The best-placed vendors are in the top-right Leaders box, with a balance between execution ability and vision completeness. The nearer they are to the top-right corner of that box, the better.

Gartner primary storage MQs for 2022 and 2023
Gartner primary storage MQs for 2022 and 2023. Inspur, present in 2022, is now called IEIT Systems.

There are slight positioning changes in the Leaders’ box. Pure still leads ahead of a tightly packed group composed of NetApp, Dell, HPE, Huawei, and IBM. Infinidat and Hitachi Vantara are both rated slightly lower on completeness of vision than in 2022.

Lenovo and Inspur, who were Challengers last year, have both exited the MQ, and DDN (Tintri), also a Challenger last year, has moved down to the Niche Players box. Gartner’s analysts note three cautions for Tintri:

  • Tintri lags the primary storage market leaders in supporting as-a-service consumption plans with SLA guarantees, due to its lack of a product and set of offerings for an on-premises hybrid platform for centralized IT operations.
  • Tintri lags market competitors in not offering or supporting ransomware detection capabilities, only recovery from integrated backup support.
  • Tintri’s TCE1000 SDS cloud engine lags leading competitive offerings in the breadth of public cloud options it supports.

IEIT Systems, referred to as Inspur last year, is the sole Challenger now. It is a China-based supplier with, Gartner says, “a broad portfolio of primary storage products, including the NVMe all-flash array HF Series and a new SDS AS13000 ICFS product to address entry-level, midrange and high-end market requirements.” Gartner points out that it lacks a competitive software-defined storage offering for AWS, Azure, and GCP, and lags industry leaders with its workload placement AIOps capabilities.

Last year Fujitsu was a Niche player. It has exited this year’s MQ altogether. Zadara was a Visionary in 2022 but there are now no Visionaries at all.

Zadara supplies a fully managed compute (EC2 compatible), storage (block, file, object), and network system hosted in a customer’s datacenter, a global point-of-presence (POP) network, or in the AWS, Azure, GCP, and Oracle public clouds. It hired a new CEO, Yoram Novick, in April with the previous CEO, co-founder Nelson Nahum, becoming CTO and chairman of the board.

Gartner says: “Fujitsu was unable to meet inclusion criteria, due to its dependency on a third-party storage controller operating system. Lenovo was unable to meet inclusion criteria, due to its dependency on a third-party storage controller operating system. Zadara was unable to meet the minimum revenue inclusion criteria.” The minimum revenue criterion is “$100 million in recognized primary storage billings and or bookings revenue (using GAAP) over the last four quarters as of 31 March 2023, excluding support revenue; or else have generated over $100 million in total annual recurring revenue contract value as of 31 March 2023.”

Storage news ticker – September 20


Data protector Acronis has announced Acronis Cyber Protect Home Office (formerly Acronis True Image) with a suite of features that integrate secure backup and AI-based security, making it, Acronis says, the “ultimate must-have solution” for individuals, families, home office users, and small businesses. It includes AI capabilities to proactively identify and neutralize potential threats, providing an added layer of security against cyber-attacks, including automated recovery from ransomware attacks. Users can enable two-factor authentication (2FA) to maximize security. It also includes backup and cloning, remote management, and a mobile app.

Airbyte, creator of the fastest-growing open source data integration platform, says its community has built more than 1,500 connectors in just three months using its no-code connector builder. It says that over the last six months the Airbyte engineering team has made significant progress on performance – improving speed by up to 10 times for the popular PostgreSQL and MySQL source connectors.

An Arcserve survey has revealed that healthcare was the most targeted industry by ransomware last year. Some 45 percent of healthcare respondents suffered a ransomware attack in the past 12 months. Two out of three paid the ransom and 82 percent of healthcare IT departments do not have an updated disaster recovery plan.

Sarv Saravanan

Data protector Commvault has appointed Sarv Saravanan as its first chief customer officer. He comes from Microsoft where he led its Global Delivery Center, which engages with the company’s biggest customers and their strategic partners to accelerate cloud transformations. Saravanan said: “In an industry facing threats that are more autonomous than ever before, customers are looking for unparalleled cyber resiliency know-how and an aggressive roadmap that harnesses the power of AI with the ease of SaaS. By continuing to redefine data protection, Commvault will widen its competitive advantage while furthering its customers’ advantages.” 

Edge-to-cloud file services supplier CTERA has published some ransomware stats saying 85 percent of organizations suffered at least one ransomware attack in the past 12 months. Over 93 percent of ransomware attacks explicitly target backups. After an attack, the average time to recovery stands at 3.4 weeks. This means businesses typically face a downtime of 136 hours at an astonishing cost of $300k per hour. Alarmingly, one in four organizations pay the ransom and never recover their data. CTERA says it and Scality offer unbreakable ransomware protection, data immutability, and operational efficiency. 

Cosmetics house L’Oréal is using Databricks to power its global Beauty Tech Data Platform as part of a multi-cloud strategy to improve customer experience across the globe. The Databricks Lakehouse will unify data across all of L’Oréal’s cloud data platforms. The Lakehouse will provide a complete view of the consumer’s data, from inquiry to purchase, from shipment to care, and from the online to offline experience. Etienne Bertin, Group CIO, stated: “L’Oréal operates in 150 countries, selling over 7 billion cosmetic products to more than 1.2 billion consumers every year, so having a data architecture that is unified, open, cloud-agnostic, interoperable, secure and scalable, is integral to our success. Leveraging the Databricks Lakehouse is enriching our global Beauty Tech Data Platform, and we are excited to see the partnership evolve in the years ahead.”

Real-time AI database supplier DataStax has announced a new JSON API for Astra DB database-as-a-service built on the open source Apache Cassandra. Available via the open source data API gateway, Stargate, the JSON API lets JavaScript developers use Astra DB as a vector database for their large language model, AI assistant, and real-time generative AI projects. It provides a way to work with Astra DB as a document database and has compatibility with Mongoose, the most popular open source object data modeling library for MongoDB. This makes it simple for JavaScript developers to build generative AI applications with vector search using MongoDB. 

Cloud file services collaboration supplier Egnyte has strengthened its partnership with Microsoft. Customers now get real-time document collaboration and sharing features through Microsoft 365 and a Microsoft Teams integration. Customers can share and upload files directly within Teams or directly into Egnyte. Customers can co-edit documents in real time through an enhancement to the existing Egnyte integration into Microsoft 365. Co-editing enables users on the same Egnyte account to collaborate in real-time on Microsoft Word, Excel, or PowerPoint documents, whether they are in Office desktop or web. Customers now have the ability to default to Egnyte as their file storage location for all of their files uploaded and shared in the Teams app.

Decentralized storage provider Impossible Cloud has been certified by Veritas as a target for Backup Exec. Impossible Cloud has also become an Elite Partner in the Veritas Technology Ecosystem program. “This is a major step in the growth and development of Impossible Cloud, and in the delivery of cloud services to global corporations seeking to leverage the many efficiencies, cost-savings and security benefits of decentralized cloud infrastructure,” said Kai Wawrzinek, CEO and co-founder. “Veritas is a global leader, and Backup Exec – which is an ideal fit with Impossible Cloud’s storage solutions – is trusted by more than 45,000 businesses worldwide. We look forward to working with Veritas as we help drive the web3 B2B revolution to transform cloud storage.”

Kinetica has announced a native LLM that allows users to perform ad-hoc data analysis on real-time, structured data at speed using natural language. Unlike with public LLMs, no external API call is required and data never leaves the customer’s environment. This announcement follows Kinetica’s integration of its analytic database with OpenAI.

File data management supplier Komprise has revealed the results of its 2023 State of Unstructured Data Management survey. It finds that IT and business leaders are largely allowing employee use of generative AI but the majority (66 percent) are most concerned about the data governance risks from AI, including privacy, security and the lack of data source transparency in vendor solutions. Preparing for AI is the leading data storage priority in 2023 followed by cloud cost optimization.

  • A plurality (40 percent) will pursue a multi-pronged approach to manage AI risk, encompassing storage, data management, and security tools
  • Organizations managing more than 10PB of data grew from 27 to 32 percent this year, a 19 percent increase
  • Half of organizations are managing 5PB or more of data, similar to 2022
  • Nearly three-quarters (73 percent) are spending 30 percent or more of IT budget on data storage and protection, measurably higher than 67 percent in 2022
  • The top unstructured data management challenge is moving data without disrupting users and applications (47 percent) followed closely by preparing for AI and cloud services (46 percent)
  • Most (85 percent) say that non-IT users should have a role in managing their own data and 62 percent already have attained some level of user self-service for unstructured data management
  • Monitoring and alerting for capacity issues and anomalies led the pack for important future unstructured data management capabilities (44 percent)

Microchip Technology has an analog memory technology, the memBrain neuromorphic memory system, based on its SuperFlash technology and optimized to perform vector matrix multiplication (VMM) for neural networks. It uses an analog in-memory compute approach, enhancing AI inference at the edge. As current neural net models may require 50 million or more synapses (weights) for processing, it becomes challenging to have enough bandwidth for an off-chip DRAM, creating a bottleneck for neural net computing and an increase in overall compute power. In contrast, the memBrain solution stores synaptic weights in the on-chip floating gate – offering significant improvements in system latency. When compared to traditional digital DSP and SRAM/DRAM based approaches, it delivers 10 to 20 times lower power and significantly reduced overall BOM.
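For reference, the operation memBrain accelerates in analog is the ordinary vector-matrix multiply at the heart of neural network inference. A plain digital version in NumPy, with arbitrary layer sizes, looks like this:

```python
# Digital reference for the vector-matrix multiply (VMM) that memBrain performs
# in analog: one inference step is an input vector times a weight matrix,
# followed by a nonlinearity. Sizes here are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random(256)            # activation vector from the previous layer
weights = rng.random((256, 128))    # synaptic weights (held on-chip in memBrain)

layer_output = np.maximum(inputs @ weights, 0.0)   # VMM plus ReLU
print(layer_output.shape)           # (128,)
```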

Microchip is partnering with Intelligent Hardware Korea (IHWK) to develop an analog compute platform to accelerate edge AI/ML inferencing. Using Microchip’s memBrain nonvolatile in-memory compute technology and working with universities, IHWK is creating a SoC processor for neurotechnology devices.

A researcher from SafeBreach was able to leverage Microsoft’s OneDrive as ransomware. The DoubleDrive ransomware attack primarily targets personal OneDrive accounts and takes advantage of OneDrive’s synchronization behavior. The attack demonstrates a potential security weakness where files outside the OneDrive sync folder are still vulnerable, even when stored on a cloud service. ThreatLocker’s Cybersecurity Research team has been able to replicate the attack on the latest version of OneDrive. OneDrive client versions 23.061.0319.0003, 23.101.0514.0001, and later are vulnerable. There is a proof of concept with updated info on DoubleDrive ransomware here.

Massive scale analytics database storage supplier Ocient has released its annual Beyond Big Data Report showcasing how companies will treat their growing data. Data Quality (47 percent) was the top analytics priority of 2023 for IT Leaders, closely followed by streamlining AI and increasing flexibility. Key insights from the report include that 80 percent say their organizations have started advancing with LLMs or generative AI technologies; 90 percent of IT and Data Leaders are planning to remove or replace existing big data and analytics technologies in the next 6 to 12 months; 46 percent cite lack of proper IT, data engineering, DBA, and other technical talent/staff remains as a top-three challenge to business transformation; 34 percent of IT leaders say their data analytics roadmap for the next 12 to 18 months will include a hybrid cloud and an on-premises strategy. The full report is available for review here.

Oracle is adding semantic search capabilities using AI vectors to Oracle Database 23c. There will be an AI Vector Search feature set, with a vector data type, vector indexes, and vector search SQL operators. This will enable Oracle’s database to store the semantic content of documents, images, and other unstructured data as vectors, and use these to run fast similarity queries. A new AI vector similarity search allows search to combine semantic and business data, producing highly accurate answers quickly and securely. The integrated vector database features will augment generative AI and, Oracle claims, dramatically increase developer productivity.
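To show the kind of query such a feature enables, here is a conceptual sketch in plain Python rather than Oracle’s SQL syntax: document embeddings are stored as vectors, and the closest match to a query vector is returned by cosine similarity. The embedding values are invented for illustration.

```python
# Conceptual vector similarity search (not Oracle syntax): rank stored document
# vectors by cosine similarity to a query vector and return the best match.
import numpy as np

doc_vectors = np.array([
    [0.1, 0.8, 0.3],   # embedding of document 1 (illustrative values)
    [0.9, 0.1, 0.2],   # embedding of document 2
    [0.2, 0.7, 0.4],   # embedding of document 3
])
query = np.array([0.15, 0.75, 0.35])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [cosine(query, v) for v in doc_vectors]
best = int(np.argmax(scores))
print(f"closest document: {best + 1}, similarity {scores[best]:.3f}")
```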

RAID Incorporated today announced a scale-out and scale-up software defined product, Grid Storage Manager. It enables enterprises to create scalable, flexible, and cost-effective storage infrastructures. With support for all major file, block, and object protocols including iSCSI/FC, NFS/SMB, and S3, Grid Storage Manager storage grids may be configured to address the needs of complex workflows which span sites and datacenters. It is available as a pre-configured appliance that combines hardware and software. It is integrated with enterprise-grade open storage technologies, and supports a wide range of deployment scenarios, including on-premises, hybrid cloud, and multi-cloud environments. More info here.

Western Digital introduced SanDisk Professional brand products at IBC2023. The SanDisk Professional G-DRIVE PROJECT has a Thunderbolt 3 interface, compatible with USB 3.2 Gen 2, and a 7,200rpm Ultrastar disk drive with capacities up to 22TB and 260MBps read/write speed. Available now starting at MSRP $369.99 for 6TB in the United States with a five-year limited warranty. The SanDisk Professional G-RAID MIRROR has up to 44TB of max capacity (22TB in default RAID 1) on two 7200RPM Ultrastar drives. Change to JBOD or RAID 0 by flipping a switch. Pre-order the device now from the Western Digital Store. MSRP in the United States starts at $659.99 for 12TB with a five-year limited warranty. The SanDisk PRO-CINEMA CFexpress Type B card has minimum sustained write speeds of 1,400MBps and lets users capture cinema-quality 8K video without dropping frames. Pre-order the device now from the Western Digital Store. MSRP in the United States starts at $399.99 for 320GB with a limited lifetime warranty.

Interconnect technology supplier XConn Technologies said it will demonstrate a complete Compute Express Link (CXL) 2.0 ecosystem, end to end, at Intel Innovation, September 19-20, at the San Jose Convention Center, booth 229. It will showcase the CXL 2.0 specification in action, from host to device, scaling up to 15TB to support “Just a Bunch of Memory” (JBOM) applications needed by HPC and AI environments. Its Apollo switch, which supports CXL 2.0, interoperates with the Samsung DRAM Memory Expander supporting CXL, the Micron CZ120 memory expansion module, the Memory eXpander Controller (MXC) for CXL from Montage Technology, and the high-speed CPU interconnect (CMM) for CXL from Smart Modular Technologies. The Apollo switch is the industry’s first and only hybrid CXL 2.0 and PCIe Gen 5 interconnect solution. On a single 256-lane SoC, the XConn switch offers the industry’s lowest port-to-port latency and lowest power consumption per port in a single chip at a low total cost of ownership.

Infinidat adds AFA inside hybrid InfiniBox

Infinidat has doubled its all-flash SSA II array’s capacity, added entry-level models that can scale up, and enabled the SSD cache in its hybrid flash/disk InfiniBox array to function as an embedded all-flash array.

Infinidat’s hybrid InfiniBox array is primarily disk-based and has a flash cache with Neural Cache software that loads read data into memory for the lowest possible latency access – down to 35 microseconds. SSA Express software turns the hybrid InfiniBox flash cache into an internal all-flash array (AFA). The vendor also supplies an F4308T SSA II all-flash array, introduced in April 2022 with up to 2.624PB of effective capacity. The F4316T variant has doubled this to 6.64PB of effective capacity in a 42U rack.

CMO Eric Herzog said: “SSA Express software reimagines the InfiniBox hybrid platform, ensuring that critical applications and workloads have rapid, low-latency response rates through direct access to the flash layer, while simultaneously reducing costs and simplifying storage management.”

Infinidat Infinibox array with SSA Express
InfiniBox array with SSA Express

In effect, with the SSA Express software, applications can be pinned into the flash cache. Admins can select specific datasets, applications, and workloads to reside in the SSD layer of the InfiniBox hybrid with near 100 percent read cache hit rate. This eliminates the need for enterprises to buy a separate AFA to support smaller apps and workloads that require high performance and low latency. These apps and workloads can be consolidated into the InfiniBox.

The SSA Express software is part of the InfuzeOS v7.3 release and comes with no additional charge. It provides up to 320TB of usable all-flash capacity and 95 percent of all existing InfiniBox systems are supported. Additional flash capacity can be purchased if needed, and installed non-disruptively. Note that data efficiency services are not supported with the SSA Express software.

SSA Express supports volumes and file systems. Capacity starts at 17TB and goes up to 320TB in this software release. Read more about the software in an Infinidat blog and briefing document.

The entry level to the InfiniBox SSA has been lowered with models populated to 60 and 80 percent of capacity. As storage needs grow, these models can scale up to 100 percent capacity – the 60 percent model upgrading to 80 and then 100 percent.

Scott Sinclair, ESG practice director for Cloud, Infrastructure and DevOps, was impressed, saying: “It’s a brilliant move that opens up the InfiniBox hybrid to a wider range of enterprise applications and workloads.”

Larger SSA II array

The larger-capacity F4316T SSA II array accompanies the prior F4308T, and the capacity options are tabulated in an Infinidat blog by Tim Dales:

Infinidat array capacities – B&F table using Infinidat blog data

Capacity upgrades are non-disruptive. The larger capacity comes with no increase in electrical power requirements, meaning up to 50 percent less power is needed per TB and per unit of floor space. Dales says Infinidat SSAs use less electricity than competing arrays, making them more environmentally friendly, and a calculator is available here for you to test how non-Infinidat arrays compare.
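As a back-of-the-envelope illustration of the watts-per-TB metric behind that claim, here is a short sketch using assumed figures, not Infinidat’s published numbers: if a rack’s power draw stays flat while its effective capacity doubles, the power needed per TB halves.

```python
def watts_per_tb(power_watts: float, effective_capacity_tb: float) -> float:
    """Simple power-efficiency metric used when comparing arrays."""
    return power_watts / effective_capacity_tb

# Assumed figures for illustration only - not vendor-supplied numbers.
old_w_per_tb = watts_per_tb(power_watts=8000, effective_capacity_tb=2600)  # ~3.1 W/TB
new_w_per_tb = watts_per_tb(power_watts=8000, effective_capacity_tb=5200)  # ~1.5 W/TB

saving = 1 - new_w_per_tb / old_w_per_tb
print(f"Power per TB reduced by {saving:.0%}")  # 50% when capacity doubles at flat power
```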

SSA Express software will be available in Q4 2023. The larger InfiniBox SSA II all-flash array is now shipping.

Comment

The transformation of a hybrid array’s flash cache into an embedded or virtual all-flash array is one of those moves that, with hindsight, seems obvious. In fact, it has been done before. Hyperconverged system supplier Nutanix has effectively been doing this for years with a pinning feature called flash mode. According to Nutanix, “flash mode for VM allows admins to set the storage tier preference to SSD for a virtual machine or volume group … By default, you can use up to 25 percent of the cluster-wide SSD tier as flash mode space for VMs or VGs.”

HPE-acquired Nimble Storage provided a pinning feature. An HPE Infosight webpage states: “Volume pinning allows you to keep active blocks of a volume in the cache, as well as writing them to disk. This provides a 100 percent cache hit rate for specific volumes (for example, volumes dedicated to critical applications), and delivers the response times of an all-flash storage system.”

An Infinidat spokesperson said: “While Infinidat is leveraging pinning capability, the unique difference from past approaches is Infinidat’s patented Neural Cache software that loads read data into memory data for the lowest possible system and application latency.”

William Blair downgrades NetApp, but storage firm remains ‘confident’


NetApp has been downgraded by a financial analyst, who cited below-par cloud business growth and falling on-premises storage product sales.

The company sells ONTAP file and block all-flash and hybrid arrays, StorageGRID object storage, and BlueXP cloud operations services (Spot by NetApp and others), and has its ONTAP software OEM’ed by Lenovo and by the big three public cloud suppliers as first-party managed services: Azure NetApp Files (ANF), Amazon FSx for NetApp ONTAP, and Cloud Volumes Service for Google Cloud, which all “took years to build and integrate and are unique in the storage industry.”

Customers can also use a self-managed instance of ONTAP in the big three clouds.

William Blair analyst Jason Ader said he saw problems both in the core business of on-premises hardware and software and in NetApp’s cloud business. He said he believed NetApp had neglected to keep its on-prem offerings competitive: “We worry that NetApp lacks competitive differentiation in the space, especially compared to all-flash specialists like Pure Storage and upstarts like VAST Data and Qumulo.”

Ader writes: “Over the last four quarters, the company’s product sales in aggregate are down 15 percent year-over-year, implying material loss in market share. … We believe NetApp’s … slowdown is due mainly to an over-rotation of investment to the cloud business, which has left the company flat-footed and poorly positioned in the core storage systems market.”

B&F chart.

Recent new products have filled gaps in NetApp’s portfolio: the SAN-specific ASA A-Series, the lower-cost, capacity-optimized AFF C-Series, and the entry-level AFF A150 all-flash array.

Ader tells subscribers: “Growth in NetApp’s cloud business has hit a wall in recent quarters (cloud annual recurring revenue (ARR) flat sequentially and up only 6 percent year-over-year in the most recent quarter).” Public cloud revenue of $154 million in the most recent quarter made up 8 percent of total NetApp revenue, broadly in line with its showing last year, when it made up $132 million of the $1.59 billion in net revenue in Q1 fiscal 2023, an 8.3 percent slice.

B&F chart.

Ader identifies what he sees as four cloud business issues. The first was execution following the acquisition of seven CloudOps companies while the cloud business was led by Anthony Lye: Spot, CloudHawk, CloudJumper, Data Mechanics, CloudCheckr, Fylamynt, and Instaclustr. Lye left in July 2022, with Ader opining that the Cloud BU lacked direction and that the acquisitions weren’t well integrated into NetApp’s overall business. VDI business CloudJumper was shut down earlier this year.

He says: “With enterprise customers intensely focused on cloud cost optimization over the last year, Spot should have been an ideal solution, but growth has fallen short of expectations likely due to a misaligned go-to-market strategy.”

Ader says he believes there were unrealistic cloud total addressable market ideas: “Beyond established ANF use cases like SAP Migration and EDA chip design, we believe the opportunity could be more limited than we originally thought.” Also, the ANF “service is extremely expensive and … Microsoft itself has been one of NetApp’s largest customers,” he claims, adding: “The traction for Amazon FSx for NetApp has been underwhelming.” 

He reckons there has been conflict in NetApp’s sales organization between on-premises and cloud sales: “A core NetApp sales rep has little incentive to push an existing hardware customer to migrate to the cloud, where the rep is likely to lose a major upgrade sale.”

Because of these issues, Ader says, he is downgrading NetApp’s stock from an Outperform to a Market Perform rating.

NetApp response

A NetApp spokesperson gave us a statement which we reproduce verbatim:

“NetApp has taken a number of steps to address challenges and drive growth. Our strategic initiatives, including new product launches and GTM (Go-to-market) optimizations, demonstrate our determination to navigate the evolving market successfully. We remain confident in our ability to leverage our competitive advantages and drive growth across our diverse product portfolio.”

Public cloud storage: 

“Overall, we remain confident in our cloud storage services and their potential to drive growth for NetApp. We are committed to delivering value to our customers and shareholders and believe that our strategic actions will position us for success in the evolving cloud landscape. Some of those actions include: 

  • At the beginning of this year, we strategically aligned our cloud sales specialists with our hyperscaler partners’ go-to-market structures, which we believe will further enhance the performance of our first-party services.  
  • Our first-party cloud storage services, those integrated natively into the public clouds, set us apart in the market. It’s worth noting that all three major cloud providers have chosen to integrate our technology into their offerings, demonstrating the value that ONTAP brings to customers. This endorsement by major cloud players reflects the strength and relevance of our product portfolio in today’s dynamic technology landscape. 
  • As workloads migrate from on-prem to the cloud, we have the opportunity to displace legacy on-prem competitors as data that resides on their on-prem systems move to NetApp-based cloud services. 
  • We’re pleased to report that we are successfully attracting new customers who are adopting our cloud storage solutions, including Azure NetApp Files, FSx for NetApp ONTAP, and Google Cloud NetApp Volumes. These services cater to a wide range of use cases, from enterprise workloads like SAP to cloud-native and AI workloads.”

All-flash arrays:


“We have initiated a series of strategic actions aimed at enhancing our Flash portfolio. These actions include:

  • The introduction of innovative new products such as the AFF A150 entry A-series product, the AFF C-series family of capacity flash products, and our new All-Flash San Array (ASA) family. Additionally, we’ve made crucial adjustments in our Go-To-Market (GTM) approach to reaccelerate the growth of our All-Flash Array (AFA) business. 
  • Notably, the positive impact of these changes is already evident in the growth of our sales pipeline. This early success reinforces our belief in the effectiveness of our strategy and the resilience of our market positioning. 
  • NetApp continues to provide all-flash systems at every price point and configuration based on a customer’s performance and budget needs.
    • We have unified data storage, where any workload, any data, anywhere (on-prem, MSP/Hosting, Public Cloud) can be serviced; interconnected and managed by BlueXP. 
    • We have the leading multi-protocol systems (e.g., File/Block/Object with full support for NVMe/FC and NVMe/TCP) for customers who want industry-leading consolidated storage. 
    • Dedicated purpose systems (e.g., ASA for block storage, StorageGRID for object storage) for customers with specific needs.”

  

“Only NetApp can provide this full, comprehensive portfolio of solutions at any performance point, price point and/or any environment, all managed by ONTAP and BlueXP.”

Veeam pumps money into SaaS backup startup Alcion

After buying Niraj Tolia and Vaibhav Kamra’s container backup startup Kasten, Veeam has now put cash into Alcion, their SaaS app backup startup.

Alcion was founded in 2022 by CEO Tolia and VP of Engineering Kamra, and took in $8 million in seed funding in May this year. Now, after building a product to back up and protect Microsoft 365 against malware, based on the open source Corso software, it has raised $21 million in an A-round led by Veeam.

A statement by Tolia said: “Since exiting stealth in May, we’ve witnessed a 700 percent increase in organizations using the platform, propelling Alcion to a petabyte of managed logical backups. This round of funding positions us to rapidly scale our mission of protecting all the world’s data against both malicious threats and accidents. Alcion customers and the rapidly growing open-source Corso community will benefit from greater functionality, ease of use, and heightened ransomware detection.”

Alcion’s Microsoft 365-protecting product is now generally available and includes multi-layered protection using AI-driven threat detection, intelligent backup scheduling, encryption, and delete protection to combat ransomware and other malware. The AI-driven ransomware protection models constantly learn from user behavior and operate at a per-user or per-resource level to detect targeted attacks.
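Alcion has not published the detail of its models, but per-user behavioral detection of this kind is often built on simple baselining: learn each user’s normal rate of change per backup run, then flag a run that deviates sharply from that user’s own history. The sketch below is a generic illustration under those assumptions, not Alcion’s algorithm; the function and thresholds are invented for the example.

```python
import statistics

def is_anomalous(change_counts: list[int], latest: int, z_threshold: float = 3.0) -> bool:
    """Flag a backup run whose per-user change count deviates sharply from that
    user's own history. Generic baselining sketch - not Alcion's algorithm."""
    if len(change_counts) < 5:                        # not enough history to judge yet
        return False
    mean = statistics.mean(change_counts)
    stdev = statistics.pstdev(change_counts) or 1.0   # avoid divide-by-zero on flat history
    z_score = (latest - mean) / stdev
    return z_score > z_threshold                      # e.g. a sudden mass-encryption event

# Per-user history of files changed per backup run (assumed data)
history = [40, 35, 52, 47, 38, 44]
print(is_anomalous(history, latest=45))    # False - normal activity
print(is_anomalous(history, latest=5000))  # True - looks like ransomware churn
```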

The product features:

  • Deepened focus on data security: heightened ransomware detection, clarity in malware protection, actionable insights, immutable backups, and malware alerts and notifications in-app and via email.
  • SOC 2 Type II certification: demonstrating rigorous controls and procedures for customer data security, integrity, and confidentiality.
  • Proactive approach: rather than notifying admins after exposure, new features include a Compliance Score for quick evaluation of configuration and backup health, in addition to targeted Data Protection Insights and Recommendations based on data activity in the tenant.
  • Increased integration: with Teams, Shared Mailboxes, export of backups to local files, restore options, and notifications.
  • Listing on the Microsoft Marketplace.
  • Worldwide availability: a new Australian region is now generally available alongside the existing US and EU regions, with more expansion to come.

Looking ahead, Alcion will build features for its managed service provider (MSP) customers and extend data protection to additional SaaS services beyond Microsoft 365.

Alcion dashboard

The funding round had participation from all prior Alcion investors including Lip-Bu Tan, chairman of Walden International and Intel board member; Debanjan Saha, CEO of DataRobot; Abhinav Asthana, CEO and founder of Postman; and Amarjit Gill, serial entrepreneur and investor at Nepenthe Capital.

Comment

The SaaS app backup space is being targeted by many suppliers, such as Asigra, BackupLABS, Clumio, Cohesity, Commvault with Metallic, Druva, HYCU, and OwnBackup. There are major tier 1 SaaS app services such as Microsoft 365, Salesforce, and ServiceNow, plus the big three cloud providers’ own services, a large number of tier 2 apps such as Atlassian’s Jira, and hundreds if not thousands of tier 3 SaaS apps.

A common problem for SaaS app backup protectors is building connectors between their backup and protection software engines and the SaaS apps. HYCU wants the SaaS app suppliers to build their own connectors using its R-Cloud APIs, and aims to have 100 connectors available within a few months. Asigra also has a connector SDK for SaaS app developers.
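To make the connector problem concrete, here is a hedged sketch of the kind of interface such an SDK might standardize, so the SaaS vendor implements the app-specific part once rather than integrating with every backup supplier separately. The class and method names are hypothetical and are not HYCU’s R-Cloud or Asigra’s actual APIs.

```python
from abc import ABC, abstractmethod
from typing import Iterable

class SaaSConnector(ABC):
    """Hypothetical connector contract a backup engine could program against.
    Each SaaS vendor would implement this once for its own app."""

    @abstractmethod
    def list_objects(self, since: str | None = None) -> Iterable[dict]:
        """Enumerate backup-worthy objects (mailboxes, records, boards...),
        optionally restricted to changes after an ISO 8601 timestamp."""

    @abstractmethod
    def export_object(self, object_id: str) -> bytes:
        """Serialize one object so the backup engine can store it immutably."""

    @abstractmethod
    def restore_object(self, object_id: str, payload: bytes) -> None:
        """Write a previously exported object back into the SaaS tenant."""

class JiraConnector(SaaSConnector):
    """Sketch of a vendor-side implementation; the data is a placeholder,
    a real connector would call the app's REST API."""

    def list_objects(self, since=None):
        yield {"id": "PROJ-1", "type": "issue"}

    def export_object(self, object_id):
        return b'{"id": "PROJ-1", "summary": "..."}'

    def restore_object(self, object_id, payload):
        pass  # would write the payload back to the tenant

# A backup engine only needs the abstract interface, not app-specific code.
connector = JiraConnector()
print(next(iter(connector.list_objects())))
```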

Suppliers like Alcion can build connectors to a few major SaaS apps on their own but, since there are over 17,000 SaaS apps available, according to HYCU, Backup-as-a-Service suppliers cannot realistically write connectors for them all. We note that Alcion wants to protect other business-critical SaaS services, so it will face this problem.

Similarly, we cannot expect the SaaS app suppliers to write connectors for their apps to every Backup-as-a-Service supplier: Alcion, Asigra, BackupLABS, Clumio, Cohesity, Commvault Metallic, Druva, HYCU, OwnBackup et al.

Alcion is betting that its AI-driven ransomware protection and other features will propel it forward in the Microsoft 365 market, particularly with potential Veeam channel support. We think it will produce its own connectors for major SaaS apps like Salesforce and ServiceNow. After that, it is unclear how it will extend its SaaS app coverage with more connectors.