
Seagate pats itself on back for flat Q3 results

Seagate reported flat Q3 revenues of $2.73bn (up 0.4 per cent) and net income of $329m, up 2.8 per cent. The disk drive maker said it sold more nearline, high-capacity drives and is ramping up 18TB shipments from the current 16TB high end.

In the earnings call yesterday, CEO Dave Mosley said: “Seagate delivered an outstanding March quarter, executing well across multiple dimensions,” referring to revenues, operating margin, earnings-per-share and share buybacks.

He added: “Strong cloud data centre demand and ongoing recovery in the enterprise markets drove our highest ever HDD shipments of 140 exabytes, and a record mass capacity revenue of more than $1.6 billion.”

Financial summary:

  • Gross margin – 27.1 per cent (27.4 per cent a year ago)
  • Operating margin – 14.1 per cent (13.8 per cent a year ago)
  • Cash flow from operations – $378m
  • Cash and investments – $1.2bn
  • Diluted EPS – $1.48 ($1.38 a year ago)

Demand was strong in the enterprise nearline and data centre markets. PC drive demand was steady and the company noted increased demand for mission-critical 2.5-inch (10K rpm) drives. Video surveillance and imaging demand was down but should rise in the next quarter, the company said. The average capacity per drive rose to 5.1TB from 4.1TB a year ago. Shipments of 16TB, 18TB and 20TB drives represented nearly half of all the exabytes Seagate shipped in the quarter.

Mosley’s prepared remarks revealed that Seagate is “servicing the vast majority of market demand for 16TB and higher capacity drives. We’ve started to aggressively ramp 18-terabyte volume, and current demand suggests strong sequential growth through at least the calendar year.”

It is also shipping Mach.2 dual actuator drives, having “recently begun the high volume ramp of MACH.2 drives with a leading hyperscale customer and plan to expand shipments to additional customers later in the calendar year.”

Heat-assisted magnetic recording (HAMR) drives are being evaluated by users: “Today customers are testing 20TB HAMR drives in their production environments, which offers valuable feedback that we are factoring into our product roadmaps.”

But Seagate is hedging its HAMR tech bets, planning to “begin shipping a few versions of 20-terabyte drives in the second half of the calendar year,” Mosley said. A shingled magnetic recording version was also mentioned in the earnings call, along with drives with different firmware environments, to cater for hyperscale customers who need 20TB drives in a variety of formats.

Lyve Drive

The Lyve Drive external disk drive program is expanding cautiously. Mosley said Seagate is “on track to have four Lyve Cloud sites up by the end of the calendar year. We are getting ecosystem support and have now been certified with each of the leading backup software vendors.”

He is excited about “future potential for Lyve products and services, which open a large and growing market opportunity for Seagate estimated to reach about $50bn by 2025.”

Outlook

Mosley said: “Seagate continues to execute well and remains excited about the tremendous opportunities we foresee ahead, both in the near-term and longer term, driven by massive growth of data.”

Seagate expects fourth quarter revenues of $2.85bn, plus or minus $185m, which represents 13.1 per cent Y/Y growth at the $2.85bn midpoint. This would make for full-year revenues of $10.5bn, the same as last year. That growth rate is a substantial acceleration on the current quarter’s 0.4 per cent Y/Y growth and, if it were to continue over the next few quarters, Seagate’s FY2022 results could show a substantial rise over FY2021.
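As a back-of-the-envelope check on that guidance (a rough sketch – the implied year-ago figure is derived from the numbers above, not taken from Seagate’s filings):

```python
# Quick check of the guidance figures quoted above.
q4_guidance = 2.85     # $bn, midpoint of guided Q4 revenue
yoy_growth = 0.131     # 13.1 per cent year-on-year growth at that midpoint

implied_prior_q4 = q4_guidance / (1 + yoy_growth)
print(f"Implied year-ago Q4 revenue: ${implied_prior_q4:.2f}bn")          # ≈ $2.52bn
print(f"Guidance range: ${2.85 - 0.185:.3f}bn to ${2.85 + 0.185:.3f}bn")  # $2.665bn to $3.035bn
```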

Rubrik gains NetApp as a reseller

NetApp is to resell Rubrik’s Cloud Data Management software for data protection, security, compliance and governance.

Rubrik provides software to back up application data and help customers with data security, compliance and governance. It can use NetApp StorageGRID object storage as an on-premises target system for storing the backed-up data, and NetApp target facilities in the public cloud.

Kim Stevenson, an SVP and GM at NetApp, said: “Rubrik and NetApp together are delivering data management solutions for digital transformation in a hybrid multi-cloud world.”

The companies will offer a common set of data management tools across a data fabric spanning the on-premises and public cloud environments. Rubrik and NetApp say the combination of Rubrik Cloud Data Management and enhanced NetApp ONTAP software will provide consolidation benefits, deeper cloud integration and continuous data availability.

Rubrik’s Go Business Edition and Go Foundation Edition software will now appear on NetApp’s global price list. This deal will help both Rubrik and NetApp channel partners combine to win business.

Comment

This is a big win for Rubrik and gives it added credibility with NetApp’s salesforce and its channel partners. It will also enhance Rubrik’s standing in the overall enterprise data protection and governance market and reinforce its status in comparison to Cohesity, Commvault, Veeam and others as a major player to be reckoned with.

Regarding HPE’s May 4 Unleash the Power of Data event

HPE is promoting a REALLY important webcast on May 4 entitled “Unleash the Power of Data,” and we think we have worked out what it is about.

Tweets began appearing a few days ago.

A teaser video appeared on an HPE webpage:

HPE May 4 event teaser video.

The webpage references a downloadable eBook, Unleashing the Power of Your Data – For Dummies.

This dummy followed the link and downloaded this book, an HPE special edition:

It is organised into six chapters:

  1. Changing the Game with Data
  2. Establishing an Intelligent Data Strategy
  3. Understanding Intelligent Data Platform Components
  4. Transforming Your Business and Your IT
  5. The Cloud, the Edge, and Your Valuable Data
  6. Ten (Or So) Benefits of an Intelligent Data Platform

The book “examines how intelligence changes everything, the role of an intelligent data strategy, and how your business and IT can be transformed with an intelligent data platform at the heart of this strategy.” It sets out to answer what it calls nine key questions:

  1. Do you have a data strategy?
  2. What elements make up an intelligent data strategy?
  3. What makes up an intelligent data platform?
  4. Is your infrastructure able to predict and prevent problems before they occur?
  5. Is your data residing on infrastructure that’s designed to be high-performing and available?
  6. Do you have a data protection strategy?
  7. Can you achieve data agility while enabling hybrid cloud?
  8. Are you getting the most value from your data?
  9. Are you optimising the cost of storing that data at each step of the life cycle?

The basic message is that HPE storage products are integrated and organised with AI-driven global intelligence and AIOps to form an intelligent data platform.

The HPE products and offerings mentioned en route through the book include the Ezmeral software portfolio, InfoSight, Nimble Storage, Primera, SimpliVity, StoreOnce, Cloud Volumes Backup, Cloud Bank Storage, and GreenLake.

HPE’s Intelligent Data Platform concept has been around for a couple of years, certainly since June 2019. The book describes it like this: “An intelligent data platform collects data not just from storage devices, but from servers, virtual machines (VMs), networks, and other infrastructure elements across the stack. It applies AI and ML to spot what’s not right in order to predict and prevent issues.

It uses predictive analytics to anticipate and prevent issues across the infrastructure stack and to speed resolution when issues do occur.”

It’s about having a self-managing, self-healing and self-optimising infrastructure “with workloads that operate across the cloud, in on-premises environments and at the edge.”

Almost two years later, HPE is set to make a song and dance about it. We think it is going to centre on innovations around InfoSight, HPE’s intelligent cloud-based system monitoring and predictive analytics service. The company acquired this with Nimble Storage and has extended the technology to cover SimpliVity edge HCI and Primera data centre arrays as well as HPE’s servers.

We think HPE will announce that InfoSight has been developed into a tool looking at storage and more in a hybrid multi-cloud and on-premises setting with greater AI capabilities to get the right data into the right locations.

Filebase raises $2m to raise decentralised storage game

Filebase has gained $2m seed funding for its geo-replicated object storage cloud, which has edge caching and is claimed to cost up to 90 per cent less than equivalent mainstream public cloud archives.

Filebase uses decentralised peer-to-peer network technology, which traditionally means slow performance and can entail non-standard interfaces. Filebase’s caching and use of the S3 interface sidestep these concerns and make its network of interest to business developers needing highly reliable, long-term storage for archiving and disaster recovery.

Founded in 2019, the company offers a 5GB free tier to all users, with no expiration or trials. A subscription costs $5.99 per month and includes 1 TB of storage and 1 TB of transfer. Additional storage costs $0.0059/GB, as do additional outbound transfers. There is no charge for writing data (ingress).
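For illustration, a rough monthly bill under this pricing works out as follows. The sketch below assumes 1,024GB per TB and ignores billing details, such as proration, that Filebase has not published.

```python
# Illustrative Filebase monthly cost estimate from the published prices above.
BASE_FEE = 5.99          # $ per month, includes 1 TB storage and 1 TB egress
INCLUDED_TB = 1
OVERAGE_PER_GB = 0.0059  # $ per GB for extra storage and for extra egress

def monthly_cost(storage_tb: float, egress_tb: float) -> float:
    extra_storage_gb = max(storage_tb - INCLUDED_TB, 0) * 1024
    extra_egress_gb = max(egress_tb - INCLUDED_TB, 0) * 1024
    return BASE_FEE + OVERAGE_PER_GB * (extra_storage_gb + extra_egress_gb)

# e.g. 2 TB stored and 1.5 TB downloaded in a month
print(f"${monthly_cost(2, 1.5):.2f}")  # $5.99 + $6.04 + $3.02 ≈ $15.05
```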

Zac Cohen, co-founder of Filebase, said: “Disaster recovery is now dead simple, thanks to the native geo-replication that’s offered by decentralised networks. Enterprises and IT professionals no longer need to worry about planning for a costly and complex DR strategy.”

Filebase supplies the object storage software technology and maintains a cluster of application servers connected to various third-party decentralised networks, such as the Sia, Skynet and Storj networks, to provide the geo-replicated capacity. It plans to add the Filecoin and Arweave networks by the end of the year. All these networks have erasure coding and use Blockchain for data integrity. Filebase provides an on-ramp to them that abstracts away their proprietary features, including crypto-currency billing.

Joshua Noble, Filebase CEO and co-founder, said the company has built a “performant access layer to decentralised storage networks, which are among the most distributed and secure networks in the world, with a familiar S3-compatible interface that developers know and generally love.”

Users access Filebase through a browser-based dashboard and use an S3-compatible API to write and read data. All interactions with the underlying networks are abstracted, and there are turnkey configurations to ease operations. Filebase guarantees that objects are stored with a 3x replication factor across thousands of server nodes around the globe in the underlying decentralised networks.
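Because the service is S3-compatible, access from code looks like ordinary S3 usage pointed at a different endpoint. Here is a minimal boto3 sketch; the endpoint URL, bucket name and credentials are placeholders rather than confirmed Filebase values.

```python
# Minimal sketch of writing and reading an object over an S3-compatible API.
# The endpoint, bucket and credentials below are illustrative placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-gateway.com",  # S3-compatible gateway endpoint
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Upload a backup object...
with open("db-dump.tar.gz", "rb") as f:
    s3.put_object(Bucket="my-backups", Key="archive/db-dump.tar.gz", Body=f)

# ...and read it back.
obj = s3.get_object(Bucket="my-backups", Key="archive/db-dump.tar.gz")
data = obj["Body"].read()
```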

This decentralisation adds to read and write latency as object data has to be recovered from myriad servers which can be hundreds if not thousands of miles distant. Filebase has developed edge caching technology to increase throughput and lower read and write response times. If a recently or frequently accessed object is downloaded from its service, there is a high probability that this object will be cached in the edge layer. If data is cached then users don’t pay for any outbound data bandwidth and the time to the first byte can be less than 100ms. 

It claims this makes it possible to build applications and user experiences on decentralised networks that are indistinguishable from applications built on Amazon S3, rendering them a viable alternative to centralised networks, like AWS, for the first time.

Filebase says basic public-cloud storage is affordable, but geographically-redundant cloud storage is a different matter. One storage bucket on Filebase is equivalent to 3 regionally-distributed buckets in AWS. If you want to store 1 TB of data in 3 regions with Amazon, it’s going to cost nearly $200 per month. The same level of service and redundancy can be achieved on Filebase for $5.99 per month.

GridGain gets big speed boost from app-direct Optane

In-memory computing supplier GridGain is gaining a 10x-100x speed boost by supporting Optane PMem and Intel’s AVX-512 instructions.

GridGain provides an in-memory facility for running transactions, streaming and analytics applications using clustered x86 server nodes in a grid defined by a distributed, massively parallel architecture. Its base software was donated to the Apache Software Foundation in 2014, where it became Apache Ignite. 

Nikita Ivanov

Nikita Ivanov, founder and CTO of GridGain Systems, said in a statement: “Native support in GridGain for … Optane persistent memory combined with vectorised computations is the ultimate solution for advancing our vision and gaining that extra boost. It eliminates the need to copy data from PMem to DRAM and enables much more data to be processed within a single CPU cycle, delivering a tremendous increase in performance for financial institutions, telcos, transportation companies, animation and gaming studios, and more.”

The software provides a single overall pool of memory, made up of DRAM and byte-addressable Optane PMem 200 series drives, which are used in AppDirect mode. (Block-level access to data in PMem is already supported in GridGain 8.) The Optane capacity is used as a high-density, low-latency storage tier for analytical data and training data sets. Changing to AppDirect mode provided a 10x performance boost compared to the same system using Intel DC P4510 SSDs instead of Optane.

GridGain graphic.

In addition, GridGain’s coming 9.0 software release adds support for AVX-512 and other SIMD x86 CPU instructions. SIMD stands for Single Instruction, Multiple Data; it enables a single instruction to operate on multiple data points in parallel.

AVX-512 extends Intel’s earlier 256-bit AVX vector instructions to 512-bit registers. Intel said its use can accelerate performance in scientific simulations, financial analytics, artificial intelligence (AI)/deep learning, 3D modelling and analysis, image and audio/video processing, cryptography and data compression.
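The data-parallel idea is easy to illustrate. The snippet below is only a conceptual NumPy analogy, not GridGain’s AVX-512 code path: a single vectorised operation replaces an element-by-element loop, and the underlying kernels can apply one instruction to many elements at once where the CPU supports it.

```python
# Conceptual illustration of SIMD-style data parallelism (not GridGain's code).
import numpy as np

prices = np.random.rand(1_000_000)
quantities = np.random.rand(1_000_000)

# Scalar style: one multiply per loop iteration.
totals_scalar = [p * q for p, q in zip(prices, quantities)]

# Vectorised style: one operation over whole arrays, which the underlying
# kernels can execute on many elements per CPU instruction (e.g. AVX-512).
totals_vectorised = prices * quantities

assert np.allclose(totals_scalar, totals_vectorised)
```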

GridGain will support AVX-512 natively to accelerate vectorised computations over the data stored in both the PMem and DRAM layers of the GridGain storage engine.

Alper Ilkbahar, Intel’s VP for the Data Platforms Group and GM of the Optane Group, said: “This is definitely a situation where the whole is greater than the sum of the parts. The extraordinary performance gained by Intel Optane persistent memory 200 series and AVX-512 instructions in combination with the GridGain In-Memory Computing Platform will help transform data processing in the industry.”

GridGain 9.0 will support the PMem 200 series drives, AVX-512 and the latest Ice Lake 3rd gen Xeon processors. It will become available in the next few months.

Why croit thinks Ceph is an ‘unbeatable general storage solution’ – and why it will only get better

Ceph software is a singular data storage technology, as it is open source and offers block, file and object access to data. But it has a reputation for being slow, complicated to use, and hard to maintain. Also, SUSE’s decision to withdraw from the Ceph market, which I revealed last month, raises questions about mainstream support for the technology.

So what are the benefits of Ceph, how can users overcome its challenges, and will this storage ever hit the big time? These were some of the questions I put to Martin Verges, co-founder and CEO of croit GmbH, a Ceph-specialist startup based in Munich.

Martin Verges

Blocks & Files: Why is Ceph so important? Some might say that, since it provides block, file and object access to data, it does not provide specific high performance access for any one of these access methods.

Martin Verges: Ceph is the only open source storage solution available that combines block, file and object storage in one system. It is therefore an unbeatable generalist solution. Many users need the resulting flexibility much more than the possible performance gain of individual edge-case high performance solutions.

Blocks & Files: Who are the main users of Ceph in terms of workload types and vertical markets?

Martin Verges: Ceph can be used for almost all business cases due to its flexibility. It can be found in low-cost archive and backup clusters as well as performance-optimised NVMe clusters for virtual machines or as a primary storage in high performance compute environments.

Blocks & Files: Why was croit founded and when? What is its purpose?

Martin Verges: We founded croit GmbH to simplify technologies and thus make them accessible to the widest possible range of users. As our first product we have developed a Ceph storage-based management solution, which coupled with support and service enables everyone to use Ceph in the enterprise in the best possible and reliable way. We are also working on implementing the same methodology and user friendly approach to Intel DAOS, which will be released this summer.

Blocks & Files: How many customers does croit have and how does it make its money?

Martin Verges: We currently have over 200 customers around the world. The main streams of income for the company are software licensing, professional consulting, reactive and emergency support and training offers.

Since the company was founded [in 2017], we have been growing continuously at well over 100 per cent year over year.

Blocks & Files: What effect will the withdrawal of SUSE from the Ceph market have?

Martin Verges: As a founding member of the Ceph Foundation, we deeply regret SUSE’s withdrawal from the Ceph market. SUSE has contributed greatly to the development of Ceph in the past and we are very grateful for that. Nevertheless, we as a company have benefited, and continue to benefit, from the departure through increased demand for our solution and consulting.

Blocks & Files: Ceph is thought to be hard and difficult to use. Why is this and how can the perception be changed?

Martin Verges: Because of the richness of features and flexibility that Ceph offers, Ceph is not easy to understand and operate. Although it gets better from version to version, deploying the open source version in production is quite a complex process and prone to problems due to all the options available to the operator. That’s why there are specialised companies like us that offer customers the ease of use and security they need to deploy Ceph in the enterprise environment.

Furthermore, this is exactly what makes our solution stand out: we are the only provider to offer a horizontally and vertically integrated solution that is completely web-based and allows the operation of Ceph clusters in a simple and reliable way.

croit roadmap for Ceph.

Blocks & Files: Ceph is criticised for being relatively slow compared to other software products dedicated to block, file or object access. What does croit think about this view?

Martin Verges: This criticism is sometimes not wrong; however, it mostly concerns small clusters. Ceph can also be very performant with the right design and hardware. The Ceph developers are working hard to make Ceph better.

For example, the switch from FileStore to BlueStore has halved the IO overhead. With the SeaStore / Crimson OSD under development, there will be further significant performance improvements. Also, for example, with the new Pacific release, the inflation of particularly small files and objects has been massively reduced.

Blocks & Files: How does croit view Ceph in the context of supplying data to AI-type GPU servers such as those from Nvidia?

Martin Verges: We at croit think that more development effort is needed to make Ceph suitable for more usage scenarios. In general, however, Ceph is an excellent and sufficiently powerful solution for many workloads. It works reliably and especially very securely apart from the data sheets. We do see that there is also a need for very high performance storage for which Ceph is not the best fit, and that is one of the reasons we have been working closely with Intel on productising DAOS in the same manner as we did for Ceph.

Blocks & Files: Can Ceph be used to supply storage to stateful containers; for example, via the Kubernetes CSI plug-in?

Martin Verges: Yes, it is possible to use Ceph with the Kubernetes CSI interface without any problems. This allows persistent data to be processed by multiple containers simultaneously via the CephFS file system, as well as exclusive block devices.

Blocks & Files: How does croit view the future of Ceph?

Martin Verges: We believe Ceph will continue to dominate the market as the best open source storage solution. Furthermore, due to the ongoing development, it can be assumed that Ceph will be adapted to modern workloads and constantly expanded with further functions.

Ceph is therefore the right choice for future-oriented storage solutions, which, as can currently be seen very clearly with SUSE, works independently of individual companies. This provides investment security, flexibility and a wide range of options.

Redis database gets strong consistency, AI inferencing and global speed

Redis Labs announced Redis Database 7.0. The in-memory NoSQL database software will deliver faster performance for a globally distributed Redis database, more reliable data, better searches and AI assistance, the company said at Redisconf yesterday.

Specifically, Redis Labs is announcing:

  • RedisRAFT to provide strong consistency
  • RediSearch with new indexing and querying capabilities
  • RedisJSON in-memory manipulation of JSON documents
  • RedisAI inferencing engine

Yiftach Shoolman.

“Through these innovations, we believe Redis has become the de facto data platform for a new wave of digital experiences by changing the way developers can build true real-time applications and then deploy them anywhere, cloud, multi-cloud, hybrid-cloud or on-premises, in a globally distributed manner, closer to where their customers are,” Yiftach Shoolman, Redis Labs co-founder and CTO, said.

RedisRAFT

RedisRAFT uses the Raft (Reliable, Replicated, Redundant, And Fault-Tolerant) consensus algorithm, in which a cluster of servers elects a leader and agrees on values – such as a hash table describing cluster state transitions – in a fault-tolerant way. Consensus continues even if a minority of the servers fail.

Raft enables parallel processes in a cluster to see data accesses in the same order. This contrasts with ‘weak consistency’, where cluster nodes can apply data accesses in different orders, resulting in data value differences between the servers. These can be corrected in an ‘eventually consistent’ system, but in the meantime wrong data values may be read, such as when a server is unavailable because of a fault.

The benefit of RedisRAFT is that it provides both strong consistency and high availability. It has passed the Jepsen tests, which check data consistency guarantees, with no unresolved issues and with Redis performance levels maintained.
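The core of the strong-consistency guarantee is easier to see in miniature. The sketch below is a simplified majority-quorum write, not RedisRAFT’s actual implementation: a write only counts as committed once a strict majority of replicas has acknowledged it, so a minority of failed nodes cannot leave divergent values behind.

```python
# Simplified majority-quorum commit, illustrating the idea behind Raft-style
# strong consistency. A teaching sketch, not RedisRAFT's implementation.

def replicate(value, replicas):
    """Send the value to each replica; report success only on a majority of acks."""
    acks = 0
    for replica in replicas:
        try:
            replica.append(value)     # stand-in for an append-entries RPC
            acks += 1
        except ConnectionError:
            continue                  # a failed replica simply does not ack
    return acks > len(replicas) // 2  # committed only with a strict majority

cluster = [[], [], [], [], []]        # five in-memory "replica logs"
if replicate({"key": "balance", "value": 100}, cluster):
    print("write committed")          # safe even if a minority of nodes had failed
```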

RedisJSON

RedisJSON provides a tree-like, hierarchical document store in which storage and querying are faster than using JSON with the Lua programming language and core Redis data structures. RediSearch has gained integration with RedisJSON so that developers can now natively store, index, query, and perform full-text search on documents faster than before. Redis says that the combination of RediSearch and RedisJSON provides integrated data models which can combine data from different sources.

RediSearch and RedisJSON can be deployed in a globally distributed manner, useful for disaster recovery purposes and also to enable applications to run where the customers are located, saving time by avoiding data movement.
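For a flavour of how the combination is used, here is a minimal sketch with the redis-py client against a server that has the RedisJSON and RediSearch modules loaded (for example, Redis Stack); the key, index name and fields are invented for illustration.

```python
# Minimal sketch: store a JSON document and query it, assuming a Redis server
# with the RedisJSON and RediSearch modules loaded. Keys and fields are invented.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Store a JSON document natively.
r.execute_command("JSON.SET", "product:1", "$",
                  json.dumps({"name": "ssd drive", "price": 120}))

# Index JSON documents with the product: prefix for full-text and numeric queries.
r.execute_command("FT.CREATE", "idx:products", "ON", "JSON",
                  "PREFIX", "1", "product:",
                  "SCHEMA", "$.name", "AS", "name", "TEXT",
                  "$.price", "AS", "price", "NUMERIC")

# Full-text search on the indexed name field.
print(r.execute_command("FT.SEARCH", "idx:products", "@name:ssd"))
```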

Active:Active and RedisAI

Redis says it uses an active:active geo-distributed topology based on conflict-free replicated data types (CRDTs) in a global database running across multiple clusters. It claims this provides global data distribution, spanning on-premises, multiple public clouds and hybrid environments, with local access speed, to deliver sub-millisecond latency.

The RedisAI inferencing engine provides a feature store containing features, which are values calculated from raw data – for example, the average monthly cost of some activity, or a statistical likelihood (z-score) that a financial transaction is fraudulent. A feature is a pre-built data calculation usable by data scientists developing specific analytics processes or models. RedisAI enables models to be served where the features are stored, thus, Redis Labs claims, improving AI-based application performance by up to two orders of magnitude.
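To make the z-score example concrete: the feature is a standardised distance from the customer’s average, so a transaction several standard deviations above typical spend scores high. A minimal sketch with invented figures:

```python
# Worked z-score example for a fraud-likelihood feature (illustrative numbers).
import statistics

past_amounts = [42.0, 38.5, 55.0, 47.2, 40.1, 51.3]  # customer's recent spend, $
mean = statistics.mean(past_amounts)
stdev = statistics.stdev(past_amounts)

new_transaction = 480.00
z_score = (new_transaction - mean) / stdev
print(f"z-score: {z_score:.1f}")  # a large value flags the transaction as unusual
```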

Availability

RedisRaft will be generally available with the release of Redis 7.0 in the second half of 2021.

The integration of RedisJSON and RediSearch is in private preview and will be generally available in the second half of 2021 with Active-Active support.  

RedisAI, as an online feature store, is available today for on-premise deployments and will be available for Redis Enterprise Cloud in the second half of 2021.  

Hey MSPs! Join Zadara’s Federated Edge to fight cloud giants

Zadara has launched a global Federated Edge offering for managed service providers (MSPs), who will sell private cloud services to their customers using Zadara’s global infrastructure and so fend off the public cloud giants.

The firm supplies zStorage (on-premises or co-lo storage arrays as a service), along with zNetwork and zCompute (on-premises servers with VM images), which are based on its recent acquisition of NeoKarm. The storage, network and compute resources are used on a pay-as-you-go basis.

The idea is that MSPs don’t have to make capital expenditures to set up points-of-presence near their customers. Instead they can use Zadara’s Federated Edge program to provision IT ‘as-a-service’ private cloud offers as close as necessary to those customers’ workloads. Zadara provides all of the hardware and software on a shared revenue basis.

Nelson Nahum, Zadara’s CEO, said in a statement: “We strive to be a true partner to MSPs, not just a technology provider. We understand their unique challenges and have designed the Federated Edge Program with their specific needs in mind…[It] harnesses the collective power of MSPs, where the whole is greater than the sum of its parts.”

Zadara graphic.

All the infrastructure – servers, network and storage – is provided and available on demand via Zadara’s Federated Edge network. Operators pay only for usage. Zadara says Federated Edge Zones, or points of presence, available in cities across the world, ease concerns around data sovereignty and compliance issues.

Dave McCarthy, research manager, edge strategies, for IDC, supplied a supporting statement: “With their Federated Edge, Zadara is providing a lifeline for MSPs looking to boost their points of presence and deploy anywhere in the world the same way that they deploy in their own data centres.

“Latency requirements, cost considerations, operational resiliency, and security/compliance factors all contribute to the need to deploy infrastructures closer to where data is generated and consumed – at the far edge of a network’s reach, away from the centralised cloud.”

Dell is developing a managed storage service with Project APEX. It’s not much of a stretch to see APEX including PowerEdge servers as a service too. Pure Storage has a similar deal through Equinix.

Public cloud suppliers have started supplying on-premises kit – for example, Amazon with Outposts and Microsoft with Azure Stack, which is supported by Dell EMC and Pure Storage.

Zadara is responding to this increased competition by recruiting MSPs to act as service channel partners and suggesting they can get closer to customers and offer more tailored services than the cloud titans.

Zerto’s triple zinger for SaaS, Kubernetes and the AWS cloud

Zerto made three announcements at ZertoCon 2021 today: Zerto Backup for SaaS; support for Kubernetes; and expanded AWS backup services. The disaster recovery specialist also outlined features of its next major software release, Zerto 9.0.

Zerto Backup for SaaS (ZBaaS) runs on a secure private infrastructure that delivers – the company says – data immutability, compliance, and guaranteed data availability. The service uses technology from Copenhagen-based Keepit to support Microsoft 365, Dynamics 365, Salesforce, and Google Workspace.

David Osman, Zerto director, technology alliances, said in a statement: “The cloud has enabled people to do business anywhere and at any time, resulting in more critical data constantly being placed in the cloud. The problem is that leaving SaaS backup as an afterthought can result in debilitating loss, especially since most SaaS vendors do not provide this critical service as a standard inclusion.”

Frederik Schouboe, Keepit CEO, said his company was “excited to partner with an industry leader like Zerto and … thrilled our platform is being recognised for true cloud SaaS data protection and management, with added capabilities like archiving, eDiscovery, and open APIs.”

ZBaaS features granular recovery, data availability outside the normal production data centre, and an independent secure data backup stored at a different location for added ransomware protection.

Z4K

Zerto for Kubernetes (Z4K) takes Zerto’s continuous data protection technology to Kubernetes-orchestrated containers, providing backup and disaster recovery for on-premises and public cloud workloads. Z4K supports Microsoft Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), Google Kubernetes Engine (GKE), IBM Cloud Kubernetes Service, and Red Hat OpenShift.

Kubernetes’ CSI interface has opened the door to widespread container data backup competition. Zerto has joined the fray to jostle with Commvault’s Metallic, Druva, HYCU, Mayadata, Pure Storage’s Portworx-based technology, Replix, Robin, Trilio, Veeam’s Kasten acquisition and more.

Zerto 9.0, the next major software release, will add disaster recovery across AWS Regions or Availability Zones; backup to S3-compatible storage such as Cloudian HyperStore; cloud tiering for AWS and Azure; and immutability in the public cloud. 

Gil Levonai, Zerto CMO, said in a statement: “Zerto and AWS customers can build cloud-native solutions and infrastructure, underpinned by a single solution that delivers data protection, recovery, and migration of data to and across AWS.”

Organisations hit limits when trying to replicate thousands of Elastic Compute Cloud (EC2) instances across AWS Regions, according to Zerto. Zerto’s DR for AWS will improve volume replication concurrency across regions, use data APIs to reduce reads and writes, and optimise orchestration workflows to complete RPOs and RTOs in minutes without using agent software.

A Zerto spokesperson said: “Zerto DR for AWS is aimed towards protecting EC2 instances cross-region and cross availability zones. The focus for us was to deliver a solution that can protect at scale; this means an RPO of 15 mins when protecting up to 1,000 EC2 instances across regions/zones.”

Cloud tiering for AWS and Azure tiers cloud data from online frequent access storage classes into cheaper infrequent access storage classes all the way through to archive storage such as Amazon S3 Glacier and Azure Archive. Users define retention policies in Zerto.

Immutability settings for backups in AWS can be managed within the Zerto UI to set how long backups can remain unaltered. This is intended to safeguard cloud backups against ransomware and its malicious deletion or modification of data. 

Other new features include Enhanced Backup Management, Instant VM Restore From journal and File Restore from LTR.

Zerto 9.0 is available in beta test and will be generally available in July 2021.

How does computational storage interact with the host server?

Computational storage – adding compute capability to a storage drive – is becoming a thing. NGD, Eideticom and ScaleFlux have added compute cards to SSDs to enable compute processes to run on stored data, without moving that data into the host server memory and using its CPU to process the data. Video transcoding is said to be a good use case for computational storage drives (CSDs).

But how does the CSD interact with a host server? Blocks & Files interviewed ScaleFlux’s Chief Scientist, Tong Zhang, to find out.

Tong Zhang

Blocks & Files: Let’s suppose there is a video transcoding or database record processing application. Normally a new video file is written to a storage device, or new records appear in the database. A server application is aware of this and starts up the processing of the new data in the server. When that processing is finished the transformed data is written back to storage. With computational storage the overall process is different. New data is written to storage. The server app now has to tell the drive processor to process the data. How does it do this? How does it tell the drive to process a piece of data?

Tong Zhang: Yes, in order to off-load certain computational tasks into computational storage drives, host applications must be able to adequately communicate with computational storage drives. This demands the standardised programming model and interface protocol, which are being actively developed by the industry (e.g., NVMe TP 4091, and SNIA Computational Storage working group).

ScaleFlux CSDs

Blocks & Files: The drive’s main activity is servicing drive IO, not processing data. How long does it take for the drive CPU to process the data when the drive is also servicing IO requests? Is that length of time predictable?

Tong Zhang: Computational storage drives internally dedicate a number of embedded CPUs (e.g., ARM cores) for serving drive IO, and dedicate a certain number of embedded CPUs and domain-specific hardware engines (e.g., compression, security, searching, AI/ML, multimedia) for serving computational tasks. The CSD controller should be designed to match the performance of the domain-specific hardware engines to the storage IO performance.

As with any other form of computation off-loading (e.g., GPU, TPU, FPGA), developers must accurately estimate the latency/throughput performance metrics when off-loading computational tasks into computational storage drives.

Blocks & Files: When the on-drive processing is complete how does the drive tell the server application that the data has been processed and is now ready for whatever happens next? What is the software framework that enables a host server application to interact with a computational storage device? Is it an open and standard framework?

Tong Zhang: Currently there is no open and standard framework, and the industry is very actively working on it (e.g., NVMe.org, and SNIA Computational Storage working group).

ScaleFlux CSD components.

Blocks & Files: Let’s look at the time taken for the processing. Normally we would have this sequence: Server app gets new data written to storage. It decides to process the data. The data is read into memory. It is processed. The data is written back to storage. Let’s say this takes time T-1. With computational storage the sequence is different: Server app gets new data written to storage. It decides to process the data. It tells the drive to process the data. The drive processes the data. It tells the server app when the processing is complete. Let’s say this takes time T-2. Is T-2 greater or smaller than T-1? Is the relationship between T-2 and T-1 constant over time as storage drive IO rises and falls? If it varies then surely computational storage is not suited to critical processing tasks? Does processing data on a drive use less power than processing the same data in the server itself?

Tong Zhang: The relationship between T-1 and T-2 depends on the specific computational task and the available hardware resource at host and inside computational storage drives. 

For example, if computational storage drives internally have a domain-specific hardware engine that can very effectively process the task (e.g., compression, security, searching, AI/ML, multimedia), then T-2 can be (much) smaller than T-1. However, if computational storage drives have to rely solely on their internal ARM cores to process the task and meanwhile the host has enough idle CPU cycles, then T-2 can be greater than T-1.

Inside computational storage drives, IO and computation tasks are served by different hardware resources. Hence they do not directly interfere with each other. Regarding power consumption, computational storage drives in general consume less power. If current computational tasks can be well served by domain-specific hardware engines inside computational storage drives, of course we have shorter latency and meanwhile lower power consumption.

If current computational tasks are solely served by ARM cores inside computational storage drives, the power consumption can still be lower, because we largely reduce data-movement-induced power consumption and because of the low-power nature of ARM cores.
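Zhang’s T-1 versus T-2 comparison can be summarised in a toy latency model. The figures below are invented purely to show how the trade-off flips depending on whether the drive has a suitable domain-specific engine.

```python
# Toy latency model for the T-1 vs T-2 comparison above. All figures are
# invented; they only illustrate how the trade-off can flip either way.

def t1_host_processing(read_ms, host_compute_ms, write_ms):
    # T-1: read data into host memory, process on the host CPU, write results back.
    return read_ms + host_compute_ms + write_ms

def t2_in_drive_processing(command_ms, drive_compute_ms, notify_ms):
    # T-2: send a command to the CSD, process in place, receive a completion notice.
    return command_ms + drive_compute_ms + notify_ms

# Case 1: the CSD has a domain-specific engine (e.g. compression), so T-2 < T-1.
print(t1_host_processing(80, 40, 80), t2_in_drive_processing(1, 30, 1))    # 200 vs 32

# Case 2: only the CSD's ARM cores, while the host CPU sits idle, so T-2 > T-1.
print(t1_host_processing(80, 40, 80), t2_in_drive_processing(1, 400, 1))   # 200 vs 402
```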

Blocks & Files: I get it that 10 or 20 drives could overall process more data faster than having each of these drives’ data processed by the server app and CPU – but how often is this parallel processing need going to happen?

Tong Zhang: Data-intensive applications (e.g., AI/ML, data analytics, data science, business intelligence) typically demand highly parallel processing over a huge amount of data, which naturally benefit from the parallel processing inside all the computational storage drives.

Comment

For widespread use, CSDs will require a standard way to communicate with a host server, so that the host can ask them to do work and be told when the work is finished. Dedicated processing hardware on the CSD, separate from the normal drive IO-handling hardware, will be needed to ensure the processing takes a predictable amount of time.

Newer analytics-style workloads that require relatively low-level processing of a lot of stored data can benefit from parallel processing by CSDs instead of the host server CPUs doing the job. The development of standards by NVMe.org, and the SNIA’s Computational Storage working group will be the gateway through which CSD adoption has to pass for the technology to become mainstream.

We also think that CSDs will need a standard interface to talk to GPUs. No doubt the standards bodies are working on that too.

Nebulon expands from storage focus to overall HCI infrastructure

Nebulon, which launched its hardware-assisted cloud-defined storage in June last year, has broadened its scope to become a smart infrastructure SaaS supplier providing a better hyper-converged infrastructure (HCI) offering.

The startup’s first product was a Storage Processing Unit (SPU), an add-in FH-FL PCIe card with an 8-core, 3GHz ARM CPU plus an encryption/dedupe offload engine. This is fitted inside servers and managed through a data management service delivered from Nebulon’s cloud. Server-attached storage drive capacity is aggregated across servers, delivering a hyper-converged (HCI) system experience. The company has now expanded this remotely managed storage remit into – wait for it – server-embedded infrastructure software delivered as a service.

Nebulon is calling this ‘smartInfrastructure’. CEO Siamak Nazari explained in a statement: “Customers want the cloud experience for their on-premises infrastructure across their core, hosted and edge deployments.”

The company is introducing two reference architecture products, branded smartEdge and smartCore, using Supermicro Ultra servers. The products deliver self-service infrastructure provisioning, infrastructure management-as-a-service, and enterprise shared and local data services. These include dedupe, compression, erasure coding, encryption, snapshots, clones and mirroring in a scale-out storage architecture. smartInfrastructure supports any application: containerised, virtualised and bare-metal, and is available anywhere, from core to edge to hosted data centres.

Nebulon said IT administrators and application owners benefit from simple deployment, zero-touch remote management, easy at-scale automation, AI-based insights and actions, and behind-the-scenes software updates. Zero server resources are used in delivering Nebulon’s enterprise data services, eliminating HCI limitations on server density and keeping 100 per cent of the server CPU, DRAM and network usable for applications. 

DPU angle

In essence, Nebulon’s SPU is a type of data processing unit (DPU) or SmartNIC, like the Fungible product and Nvidia’s BlueField SmartNIC, which offloads low-level storage, network and security tasks from the host CPU.

Nebulon – like Fungible and Nvidia – has a composability angle, stating on its website that you can “dynamically compose your infrastructure for your application when combining servers and storage media in a cluster.” 

smartInfrastructure details

Nebulon’s smartInfrastructure comprises its ON AI-assisted cloud control plane, which powers the SPU, described as an IoT endpoint-based data plane. The SPU is embedded in a vendor’s application server and used to provide the data services. Nebulon claims the ON service and SPU products can replace an enterprise SAN – so storage is still a prime feature of Nebulon’s offering.

Nebulon SPU

Nebulon provides a catalogue of services for different applications. These services will cover the provisioning of O/S and software packages, data volumes and data services for selected compute nodes. A customer’s IT organisation can contribute their own services to the catalogue. Users can then provision on-premises IT infrastructure that includes compute services (servers), the operating system on these compute nodes and its configuration, data storage for the clustered application installed on the servers, and associated data services.

The smartInfrastructure is delivered to customers as a service in the cloud and does not require customer maintenance. Updates to the on-premises infrastructure software components, including the data plane or server drive firmware, can be rolled out enterprise-wide with the click of a button.

Nebulon smartCore and Nebulon smartEdge are reference architectures under the smartInfrastructure solution that are focused on specific deployment scenarios in enterprise data centres or as a hosted solution (smartCore), and at the enterprise edge (smartEdge). 

Nebulon infrastructure software is a subscription-based offering. It told us the SPUs/IoT endpoints come as a standard option from the customer’s default server vendor.

Takeaway

The takeaway message here is that Nebulon smartInfrastructure is server-embedded, infrastructure software and an alternative to hyperconverged infrastructure software. The Nebulon Supermicro reference architecture offerings are available now.

IBM quarterly storage HW revenues continue declining

IBM’s quarterly storage hardware revenues have continued their long term decline. In the first 2021 quarter earnings report the company disclosed that “Power and Storage Systems declined,” with Power servers falling by 13 per cent and storage hardware 14 per cent.

Update: story amended reflecting discovery of storage hardware sales number. 21 April 2021.

IBM noted in the earnings presentation yesterday that “Storage and Power performance reflects product cycle dynamics.” Sales of the recently launched FlashSystem 5200 have not pushed storage hardware revenues higher as the product was not fully available in the quarter.

In its first 2021 quarter, ended March 31, IBM reported revenues of $17.7bn, up 1 per cent Y/Y and its first growth quarter after declines in all four 2020 quarters.

IBM segment revenue results:

  • Cloud and Cognitive Services – $5.4bn – up 34 per cent Y/Y
  • Global Business Services – $4.2bn – down 1 per cent 
  • Global Technology Services – $6.4bn – down 5 per cent
  • Systems – $1.4bn – up 2 per cent
  • Global Financing – $240m – down 20 per cent

Within the Systems segment, the smallest of its four main business segments, overall hardware revenues rose 10 per cent Y/Y while operating systems software declined 18 per cent. Z mainframe business revenues were up 49 per cent Y/Y, acting as the Systems segment growth engine. CFO James Kavanaugh said in prepared remarks: “That’s very strong growth, especially more than six quarters into the z15 product cycle.”

The sales jump was attributed to customers appreciating mainframe reliability and security in the time of the pandemic with increased online purchases and remote working.

As noted above, Power server revenues fell by 13 per cent in the quarter and storage hardware revenues by 14 per cent. We calculate this to mean storage hardware brought in $359m, compared to the year-ago $417.6m.
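The arithmetic behind our estimate is simple; a quick check using the figures above:

```python
# Back-of-the-envelope check of the storage hardware figure quoted above.
prior_year_q1 = 417.6  # $m, Q1 2020 storage hardware revenue
decline = 0.14         # 14 per cent year-on-year decline disclosed by IBM
print(round(prior_year_q1 * (1 - decline), 1))  # 359.1, i.e. about $359m
```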

Storage hardware is a tiny fraction of IBM’s results, representing 2 per cent of its $17.7bn overall revenues.

IBM chairman and CEO Arvind Krishna is concerned with big picture issues, such as spinning off Global Technology Services – Kyndryl – and returning IBM to growth in a hybrid cloud world, using AI, Red Hat and quantum computing.

His prepared remarks included this comment: “We see the hybrid cloud opportunity at a trillion dollars, with less than 25 per cent of workloads having moved to the cloud so far. … We are reshaping our future as a hybrid cloud platform and AI company. … IBM’s approach is platform-centric. Linux, Containers and Kubernetes are the foundation of our hybrid cloud platform which is based on Red Hat OpenShift. We have a vast software portfolio, Cloud Paks, modernised to run cloud-native anywhere.”

Compared to this multi-billion dollar future, storage hardware, bringing in a few hundred million dollars, is small and shrinking potatoes. Its sales have seen a consistent, long-term decline, as a chart of quarterly revenues shows:

There have been transitory storage hardware revenue rises, for example, one from Q1 2019 to Q1 2020, but that has not been repeated over the Q1 2020 to Q1 2021 period. Instead there has been another decline of 14 per cent.

IBM also sells storage software, such as Spectrum Scale, and storage services such as IBM Cloud Object Storage. There is also a storage element in Red Hat’s sales. These various storage software revenues are not aggregated and revealed by IBM, and so we cannot know how IBM’s total storage business – hardware and software – is doing. All in all, it is a confusing and incomplete picture with a disappointing storage hardware element.