
Your occasional storage digest with Samsung, Dell EMC, Nutanix and more

Tom’s Hardware has published a scoop-ette on an 8TB Samsung 870 QVO SSD using QLC (4 bits/cell) NAND and produced in the 2.5-inch format. The entry-level capacity is 1TB.

News of the 870 leaked via an Amazon listing, which was pulled shortly after the Tom’s Hardware story. Samsung’s current 860 QVO uses the company’s 64-layer 3D V-NAND in QLC format, and we understand the 870 QVO uses the later, denser 100+ layer 3D V-NAND generation.

The drive should have a SATA III interface like the 860, with these drives built for affordable capacity rather than high performance.

Rakers round-up

Wells Fargo senior analyst Aaron Rakers has given his subscribers a slew of company updates, following a series of virtual meetings with suppliers. Here are some of the insights he has gleaned.

Intel’s 144-layer 3D NAND is built with floating gate technology and a stack of three 48-layer components. This will be more expensive than single stack or dual-stack (2 x 72-layer) alternatives.

Optane Traction: “Over a year post launch (April 2019), Optane has now been deployed at 200 of the Fortune 500 companies, and has had 270 production deal wins, and an 85 per cent proof of concept to volume deployment conversion rate. In addition to optimization work with SAP Hana and VMware, [Intel] noted that the partner ecosystem / community has discovered new use cases for Optane, such as in AI, HPC, and open source database workloads.”

FPGA maker Xilinx: “Computational Storage. Xilinx continues to highlight computational storage as an area of FPGA opportunity in data centre. This includes the leverage of FPGAs for programmable Smart SSD functionality – Samsung representing the most visible partner; [Xilinx] noting that the company has active product engagements with several others.”

Dell EMC PowerStore: “The new Dell EMC PowerStore system also looks to compete more effectively against Pure Storage with Anytime Upgrades and Pay-per-Use consumption. The PowerStore systems can be deployed in two flexible pay-per-use consumption models with short and long-term commitments (including new one-year flexible model). This will likely be positioned against Pure Storage’s Pure-as-a-Service (PaaS) model. Anytime Upgrades allow customers to implement a data-in-place controller upgrade that will most likely be viewed as a competitive offering against Pure’s Evergreen Storage subscription (enabling a free non-disruptive controller upgrade after paying 3-years of maintenance).”

Nutanix and HPE: “Nutanix’s relationship with HPE continues to positively unfold. [Nutanix CFO Dustin] Williams noted a strong quarter for the partnership in terms of new customers. He said that Nutanix has been integrated into Greenlake but it is still in its infant stages.” 

Seagate: “Seagate will be launching a proprietary 1TB flash expansion card for the upcoming (holiday season) Xbox Series X, and the company will be the exclusive manufacturer of the product. While not a high margin product, this alignment provides brand recognition as well as potential to drive higher-capacity HDD attach for additional (cheaper) storage. Regarding the move to SSDs within the next-gen game consoles, we would note that Seagate has shipped 1EB of HDD capacity to this segment over the past two quarters, and thus we would consider the move as having an immaterial impact to the company.”

SVP Business & Marketing Jeff Fochtam characterised Seagate’s SSD “strategy as being complementary to its core HDD business. He noted that Seagate has been on a strong profitable growth path with SSDs, which he credits to the company’s supply chain efforts and strategic partnerships. The company currently has >10 per cent global share in consumer portable SSDs, after rounding out the portfolio ~1 year ago.” 

Nutanix Xi Clusters

Nutanix has been busy briefing tech analysts about Xi Clusters.

The company told Rakers: “Nutanix Xi Clusters (Hybrid Multi-Cloud Support): clusters give [customers] the ability to run software either in the datacenter or in the public cloud. This makes the decision non-binary. Clusters are in large scale early availability today, GA in a handful of weeks first with AWS and then with Azure. It will be cloud agnostic. The ability to run the whole software stack in the public cloud strengthens the company’s position in the core business by giving the customer the optionality to run Nutanix licenses in the public cloud at the time of their choosing.” 

And we learn from Nutanix’s briefing with William Blair analyst Jason Ader: “Through its Xi Clusters product, Nutanix enables customers to run Nutanix-based workloads on bare metal instances in AWS’s cloud (soon to be extended to Azure), leveraging open APIs and AWS’s native networking constructs. This means that customers can use their existing AWS accounts (can even use AWS credits) and VPCs and seamlessly tap into the range of native AWS services. From a licensing perspective, Nutanix makes it simple to run Nutanix’s software either on-premises or in the cloud, allowing customers to move their licenses as they so choose.”

Shorts

Backup biz Acronis has signed a sponsorship deal with Atlético de Madrid, and is now the football club’s official cyber-protection partner.

Accenture has used copy data manager Actifio to automate SQL database backup for Navitaire, a travel and tourism company owned by Amadeus, the airline reservation systems company.

SSD supplier Silicon Power has launched a US70 PCIe Gen 4 SSD in M.2 format and 1TB and 2TB capacities. It uses 3D TLC NAND, has an SLC cache, and delivers read and write speeds up to 5,000MB/s and 4,400MB/s, respectively.

Silicon Power US70 M.2 SSD.

Some more details of the SSD in Sony’s forthcoming PlayStation 5 games console have emerged. It has 825GB capacity and is a 12-channel NVMe SSD with a PCIe 4.0 interface and M.2 format. The drive has 5GB/sec read bandwidth for raw data and up to 9GB/sec for compressed data. In comparison, Seagate’s FireCuda 520 M.2 PCIe Gen 4 SSD, which comes in 500GB, 1TB and 2TB capacities, also delivers 5GB/sec read bandwidth. An SK Hynix PE8010 PCIe 4.0 SSD delivers 6.5GB/sec read bandwidth. Check out this Unreal Engine video for a look at what the PS5 can do.

Unreal Engine PlayStation 5 video.

Stellar Data Recovery has launched v9.0 of its RAID data recovery software. The new version adds recovery of data from completely crashed and un-bootable systems, and a Drive Monitor to check hard drive health.

The Storage Performance Council has updated the SPC-1 OLTP benchmark with five extensions that cover data reduction, snapshot management, data replication, seamless encryption and non-disruptive software upgrade. They provide a realistic assessment of a storage system’s ability to support key facets of data manageability in the modern enterprise.

Automotive AI company Cerence is building AI models using WekaIO’s filesystem, which won the gig following a benchmark shoot-out with Spectrum Scale and BeeGFS.

Striim, a supplier of software to build continuous, streaming data pipelines from a range of data sources, has joined Yellowbrick Data’s partner program.

WANdisco, which supplies live data replication software, has raised $25m in a share placement. The proceeds will strengthen the balance sheet, increase working capital and fund near-term opportunities with channel partners. The UK company said it continues to work towards run-rate breakeven by capitalising on the Microsoft Azure LiveData Platform.

GigaOm: Cohesity, Komprise and Commvault lead unstructured data management pack

Blocks & Files has seen an extract of a soon-to-be-published GigaOm report that assesses unstructured data management suppliers.

Sixteen vendors are covered by analyst Enrico Signoretti in the GigaOm Radar for Unstructured Data Management. They are Aparavi, AWS Macie, Cohesity, Commvault, Druva, Google Cloud DLP, Hitachi Vantara, Igneous, Komprise, NetApp, Panzura, Rubrik, Quantum, Scality, SpectraLogic and Veeam.

Enrico Signoretti

Signoretti confirmed the imminent publication of the report. He told us: “As you know, interest in Unstructured Data Management is skyrocketing. Users want to know what they have in their storage infrastructures and need tools to decide what to do with it. We have identified two categories (infrastructure- and business-focused).

“The first category provides immediate ROI and improves overall infrastructure TCO, while the latter addresses more sophisticated needs, including compliance for example. Vendors are all very active and the roadmaps are exciting.” 

In the report, Signoretti writes: “Leading the pack we find Cohesity, Komprise and Commvault.” Hitachi Vantara and faster-moving Druva are moving deeper into the leaders ring. Challengers Igneous and Rubrik are entering the leaders ring. Dell EMC, HPE and IBM are not present in this overall group of suppliers.

Draft GigaOm Unstructured Data Management radar screen diagram

GigaOm has already published Key Criteria for Evaluating Unstructured Data Management, which is available to GigaOm subscribers and provides context for the Radar report.

GigaOm Radar details

GigaOm’s Radar Screen is a four-circle, four-axis, four-quadrant diagram. The circles form concentric rings and a supplier’s status – new entrant, challenger, or leader – is indicated by placement in a ring.

The four axes are maturity, horizontal platform play, innovation and feature play.

There is a depiction of supplier progression, with new entrants growing to become challengers and then, if all goes well, leaders. The speed and direction of progression is shown by a shorter or longer arrow, indicating slow, fast and out-performing vendors.

The inner white area is for mature and consolidated markets, with very few vendors remaining and offerings that are mature, comparable, and without much space for further innovation. 

The radar screen diagram does not take into account the market share of each vendor. 

Scality’s Zenko cloud data controller gains data moving feature

Scality, the object storage vendor, has added data-moving to its Zenko data management software. The upgrade turns Zenko into a data orchestrator and controller that works across multiple public clouds and on-premises file and object stores.

This has echoes of the file lifecycle management capabilities of Komprise and InfiniteIO and the global metadata-based activities of Hammerspace.

Zenko sprang out from Scality’s engineering team in September 2018. It is positioned as an object and file location engine, a single namespace interface through which data can be stored, retrieved, managed and searched across multiple disparate private and public clouds, enabled by metadata search and an extensible data workflow engine. 

Zenko overview video.

But at launch, it could not move data. Now it has “an open source, vendor neutral data mover across all clouds, whether private or public like AWS,” Scality CEO Jerome Lecat said.

Giorgio Regni, Scality CTO, added: “This release provides deeper integration with AWS. Customers can now write directly to AWS S3 buckets and Zenko will see this data and manage it. Prior to this release, customers had to write the data into Zenko to apply workflow policies like tiering and lifecycle. Now any existing AWS S3 bucket and on-premises NFS (i.e. Isilon and NetApp) volume can be discovered by Zenko and form part of Zenko’s global namespace.”

The Zenko software supports AWS S3, Azure Blob, Google Cloud and Wasabi public clouds. On-premises systems supported include Scality’s RING and other S3 object storage, Ceph, and NAS (NFS only). Zenko inspects these sources and imports object and file metadata into its own store. Applications interface to Zenko with a single API and can search for and access objects in this store, with Zenko effectively acting as an access endpoint for the various source object and file storage repositories.

The data moving capability means Zenko can move objects and files between the source locations, as workload needs dictate.

The Zenko store is kept up to date over time, not in real time, by using asynchronous updates. These are triggered with mechanisms such as AWS S3 Bucket Notifications, Lambda functions and AWS Identity and Access Management (IAM) policies for cross-site access control.

These updates can trigger Zenko actions. For example, objects might have a specific metadata tag attached to them, such as “archive”. This could initiate a Zenko data-moving workflow action to archive the object into a public cloud cold store, or a Fujifilm Object Archive tape library. Other tags could initiate a replication exercise or cause data to be moved to specific target sites and applications.
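As an illustration of how such a tag-driven workflow might be kicked off from the application side, here is a minimal Python sketch using boto3. The bucket name, object key and “archive” tag are assumptions for illustration only; the tag a Zenko workflow reacts to, and the action it takes, are defined in Zenko’s own workflow configuration rather than by AWS.

```python
import boto3

# Hypothetical bucket and key names; the "archive" tag is illustrative and
# would need a matching workflow defined in Zenko.
s3 = boto3.client("s3")

s3.put_object_tagging(
    Bucket="analytics-archive-demo",
    Key="reports/2020/may.parquet",
    Tagging={"TagSet": [{"Key": "archive", "Value": "true"}]},
)

# An S3 bucket notification (to SQS, SNS or a Lambda function) lets Zenko
# learn of the change asynchronously; the workflow bound to the tag can then
# move the object to a cloud cold store or a tape target.
```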

Zenko Orbit screen

Scality could develop Zenko further by enhancing its Zenko Orbit storage monitoring and analytics component to move data in response to policies that set, for example, cost and capacity limits.

Regni said: “We are working with several ISV and system partners in Europe, the US and Japan to help them accelerate new solutions in cloud and object data management based on Zenko.”

There is a free version of open source Zenko and a licensed enterprise version. You can check out a Zenko White Paper for more information.

Infinidat closes mystery funding round

Infinidat, the high-end storage array maker, has completed a D-round of funding with existing investors, but it is not saying how much it raised. Prior to this round, the company had raised $325m in three slugs since 2010.

Infinidat said the cash “will be used to build on new initiatives, such as the increasing demand for flexible consumption models in the market, strengthening the company’s growth plans and enabling it to build further on its industry leadership position. It will also be used for technical research and product development.”

The news accompanies a management reshuffle, with Moshe Yanai relinquishing the chair. His replacement, executive chairman Boaz Chalamish, will oversee the two newly appointed co-CEOs.

Chalamish was most recently chairman and CEO at Clarizen and his background includes jobs at VMware, HP and Mercury. His appointment comes in the wake of Yanai resigning his CEO role last month, stepping aside to become the Chief Technology Evangelist. Two co-CEOs were appointed in his place – COO Kariel Sandler and CFO Nir Simon.

At the time Yanai said he was closely collaborating with Infinidat investors TPG and Goldman Sachs “to drive towards our next phases of growth”.

Boaz Chalamish

Infinidat has also promoted three execs: Catherine Vlaeminck to VP worldwide marketing, Dan Shprung becomes EVP, EMEA and APJ, and Steve Sullivan is now EVP, Americas.

StorONE touts Optane Flash Array

StorONE, the data storage startup, has crafted a super all-flash array by twinning Optane and QLC Flash SSDs in a 2-tier, 40TB, dual Xeon SP server chassis.

The Optane Flash Array (OFA) runs StorONE’s S1 Enterprise Storage Platform software and is intended as a replacement upgrade for first generation flash arrays.

StorONE said the OFA, with its Optane-QLC combination, should cost less than a standard all-flash array but deliver more performance at equivalent capacity levels.

The design is akin to taking Intel’s H10 Optane+QLC flash drive and implementing its basic hardware scheme at array level. We described it earlier this month and can now provide more details.

StorONE S1 Optane Flash Array.

The initial product chassis contains two storage tiers; three Optane DC P4800X 750GB SSDs for performance and five Intel D5-P4320 NVMe 7.68TB QLC SSDs for capacity. 

Blocks & Files S1 Optane Flash Array diagram

The Optane drives form a performance tier and eliminate the need for cache in memory. They also stack up random writes and send them to the QLC tier as a small number of sequential writes which are optimised for the QLC NAND page size. This prolongs SSD endurance.
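StorONE has not published the details of this destaging logic, but the general idea of absorbing small random writes in a fast tier and flushing them as large, page-aligned sequential writes can be sketched roughly as follows. All sizes, thresholds and names here are illustrative, not StorONE’s or Intel’s figures.

```python
# Conceptual sketch only: absorb small random writes in a fast (Optane) tier
# and destage them as large sequential writes to the capacity (QLC) tier.

QLC_PAGE_SIZE = 64 * 1024              # illustrative page size, not a spec
FLUSH_THRESHOLD = 16 * QLC_PAGE_SIZE   # destage once this much is buffered

class WriteCoalescer:
    def __init__(self, capacity_tier):
        self.buffer = []               # (logical_offset, data) pairs in the fast tier
        self.buffered_bytes = 0
        self.capacity_tier = capacity_tier

    def write(self, offset, data):
        self.buffer.append((offset, data))
        self.buffered_bytes += len(data)
        if self.buffered_bytes >= FLUSH_THRESHOLD:
            self.flush()

    def flush(self):
        # One large sequential write instead of many small random ones; a real
        # implementation would also track where each logical offset now lives.
        payload = b"".join(data for _, data in sorted(self.buffer))
        self.capacity_tier.append(payload)
        self.buffer.clear()
        self.buffered_bytes = 0

# Example: the capacity tier modelled as a simple append-only log.
qlc_log = []
coalescer = WriteCoalescer(qlc_log)
```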

The S1 software promotes data from the QLC tier to the Optane tier when it identifies a read performance advantage. 

StorONE OFA supports configurations with 3, 4, 6 or 8 Optane drives and from 4 to 16 Intel QLC SSDs. Scaling Optane drives increases performance. Scaling the QLC SSDs increases capacity.

Performance

A StorONE Intel Optane Flash Array lab report details some performance numbers.

This OFA configuration, with its eight drives, delivers over one million read IOPS. Benchmarks include:

  • 1.05 million random read IOPS, 0.14ms latency
  • 310,000 random write IOPS, 0.6ms latency
  • 10GB/sec sequential read throughput
  • 2.8GB/sec sequential write throughput

How does this compare to a non-Optane StorONE all-flash array?

In September 2018 StorONE tested its S1 system using HGST SAS SSDs inside a Western Digital 2U24 chassis and reported 1.7 million random read 4K IOPS, delivered at less than 0.3 ms latency. The system provided 15GB/sec sequential reads, and 7.5GB/sec sequential writes.

StorONE’s George Crump told us the SAS SSD S1 array, with its 24 drives, exceeded the OFA’s bandwidth and IOPS numbers because of its ability to drive more IO with drive parallelism. However, the eight-drive OFA had lower latency than the SAS system.

We think a 24-drive OFA, with eight Optane drives and 16 NVMe QLC SSDs, would deliver a larger number of IOPS and greater bandwidth than the S1 24-drive SAS system. StorONE said its system delivers linear scalability, so we think this configuration could deliver 3 million IOPS and 30GB/sec sequential read throughput.
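That estimate is simple proportional scaling from the published eight-drive figures; a quick back-of-envelope check, assuming the linear scalability StorONE claims:

```python
# Extrapolating the eight-drive OFA results to a hypothetical 24-drive system,
# assuming perfectly linear scaling (our assumption, not a StorONE benchmark).
measured_drives = 8
measured_read_iops = 1_050_000        # 1.05 million random read IOPS
measured_read_bw_gb_s = 10            # GB/sec sequential read

scale = 24 / measured_drives          # 3x the drives
print(f"{measured_read_iops * scale:,.0f} IOPS")        # ~3,150,000
print(f"{measured_read_bw_gb_s * scale:.0f} GB/sec")    # ~30 GB/sec
```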

Blocks & Files expects the S1 OFA to be made available in the next few weeks. We might ponder the idea of an all-Optane OFA. This would be suited for extreme performance use cases – but be hellish expensive.

Huawei, Pure and IBM enterprise storage sales up; Dell, Hitachi, HPE and NetApp are down

The Covid-19 pandemic sent enterprise storage revenues down in 2020’s first quarter but Huawei, Pure Storage and IBM went against trend and grew revenues.

The latest edition of IDC’s Worldwide Quarterly Enterprise Storage Systems Tracker reveals total storage capacity shipped in the quarter slipped 18.1 per cent Y/Y to 92.7EB. 

Sebastian Lagana, research manager at IDC, issued a quote: “The external OEM market faced stiff headwinds during the first quarter as enterprises across the world had operations impacted by the global pandemic. ODMs once again generated growth, taking advantage of increasing spend from hyperscalers – demand that we anticipate will remain solid through the first half of 2020.”

The 92.7EB of total storage capacity comprises server SANs, ODM (Original Design Manufacturers) sales to hyperscalers like Amazon and Facebook, and sales by OEMs to the enterprise external storage market.

ODM capacity shipped went down 20.2 per cent to 54.8EB, but revenues rose 6.9 per cent to $4.9bn. Enterprise external storage did the opposite; capacity rose 3 per cent but revenues fell 8.2 per cent to $6.5bn.

The total all-flash array (AFA) market generated $2.8bn in revenue, up 0.4 per cent. The hybrid flash array (HFA – disk plus SSD) market generated $2.5bn, down 11.5 per cent. An industry source tells us the disk array portion of the enterprise external storage market declined 18 per cent.

Vendor numbers

There was a wide divergence in the fortunes of individual suppliers, as an IDC table comparing the first quarters of 2020 and 2019 shows: 

Dell continues to dominate in revenue and market share, but revenues fell 8.2 per cent in line with the market. We have charted vendor revenue growth rate changes to bring out the differences:

Are these changes significant? After all, there have been minor revenue growth differences for years with little substantive effect over time, as a chart of IDC Storage Tracker enterprise storage vendor revenues for the past 15 quarters makes clear.

Enterprise external storage vendor revenues from IDC’s Storage Tracker since 3Q 2016

Supplier positions

Dell is still top of the tree by a wide margin. Covid-19 has sent the market down but IBM saw 3.8 per cent growth, driven by a mainframe refresh cycle drawing high-end DS8800 sales in its wake.

Huawei grew revenues at the fastest rate, up 17.7 per cent. It has dipped in and out of IDC’s top five vendor list and has shown a more jumpy curve than Pure.

HPE and NetApp have swapped positions regularly as have Hitachi and IBM. That implies the latest figures do not indicate substantive changes in these vendors’ positions.

Nor did it affect Pure Storage that much, with growth at 7.7 per cent, though we see its comparatively long-term growth trend relaxing a little in this latest Storage Tracker edition. Fewer customers stopped buying kit and services, and the total AFA segment – Pure’s sole market – grew 0.4 per cent, while Pure grew 7.7 per cent.

Pure told us the market in Japan shrank 4.3 per cent while Pure grew 37.5 per cent. In LATAM the market grew 19.9 per cent and Pure grew 99.9 per cent. The North America market shrank 9.2 per cent and Pure grew 11.8 per cent. 

Mission-critical computing and HCI: the time has come

Sponsored Hyperconverged infrastructure (HCI) has been around for at least a decade, but adoption continues to grow apace, as shown by figures from research firm IDC which indicate that revenue from hyperconverged systems grew 17.2 per cent year on year for the fourth quarter of 2019, compared to 5.1 per cent for the overall server market.

Although it has become common for HCI to run general purpose workloads, some IT departments are still wary of using this architecture for the mission-critical enterprise applications on which their organisation depends for day to day business. Now, with technologies available as part of the Second Generation Intel® Xeon® Scalable processor platform, HCI can deliver the performance and reliability to operate these workloads, while providing the benefits of flexibility and scalability.

HCI is based on the concept of an appliance-like node that can serve as an infrastructure building block, enabling the operator to scale by adding more nodes or adding more disks. Each node integrates compute, storage and networking into a single enclosure, as opposed to separate components that have to be sourced and configured separately.

The real value of HCI is in the software layer, which virtualizes everything and creates a pool of software-defined storage using the collective storage resources across a cluster of nodes. This software layer facilitates centralised management and provides a high degree of automation to make HCI simpler for IT professionals to deploy and manage.

But that software-defined storage layer may be one reason why some organisations have been wary of committing to HCI for those mission-critical roles.

Doing it the traditional way: SANs

Enterprise applications, whether customer relationship management (CRM), enterprise resource planning (ERP) or applications designed for online transaction processing (OLTP), rely on a database backend for information storage and retrieval. This requirement will typically be met by a database system such as Oracle or SQL Server.

Traditionally, the database would run on a dedicated server, or a cluster of servers in order to cope with a high volume of transactions and to provide failover should one server develop a fault. Storage would be provided by a dedicated storage array, connected to the server cluster via SAN links. This architecture was designed so that the storage can deliver enough performance in terms of I/O operations per second (IOPS) to meet the requirements of the database and the applications using it.

But it means the database, and possibly the application, is effectively locked into its own infrastructure silo, managed and updated separately from the rest of the IT estate. If an organisation has multiple application silos such as this, it can easily complicate data centre management and hinder moves towards more flexible and adaptable IT infrastructure.

It also pre-dates the introduction of solid state drives (SSDs), which have a much higher I/O capacity – and much lower latency – than spinning disks. For example, a single Intel® 8TB SSD DC P4510 Series device is capable of delivering 641,800 read IOPS.

Partly, this is because of the inherent advantages of solid-state media, but also because newer SSDs use NVMe as the protocol between the drive and host. The NVMe communications protocol was created specifically for solid state media and uses the high-speed PCIe bus to deliver greater bandwidth than a legacy interface such as SAS while supporting multiple I/O queues. The NVMe protocol also ensures performance is not compromised by delays in the software stack.

Software-defined

With HCI, the database can run on a virtual machine, and the software-defined storage layer means that storage is distributed across an entire cluster of nodes. Every node in the cluster serves I/O and this means that as the number of hosts grows, so does the total I/O capacity of the infrastructure.

This distributed model also means that if a node goes down, performance and availability do not suffer too much. Most HCI platforms also now feature many of the capabilities of enterprise storage arrays as standard, such as snapshots and data deduplication, while built-in data protection features make disaster recovery efforts much easier.

Advances in technology mean that the Second Generation Intel® Xeon® Scalable processors tend to feature more CPU cores per chip than earlier generations. This presents organisations with the opportunity to reduce the number of nodes required for a cluster to run a particular workload, and thus make cost savings.

But as the total I/O capacity depends on the number of hosts, such consolidation threatens to reduce the overall IOPS of the cluster. Fortunately, SSDs boast enough IOPS to counteract this, especially Intel® Optane™ DC SSDs, which are architected to deliver enough IOPS for the most demanding workloads. In tests conducted by Evaluator Group, a storage analyst firm, a four-node HCI cluster with Optane™ DC SSDs outperformed a six-node cluster using NAND flash SSDs under the IOmark-VM workload benchmark, with both configurations having a target of 1,000 IOmark-VMs.

Optimise the cache layer

It is common practice to implement tiered storage in HCI platforms. Inside each node in a cluster, one drive is treated as a cache device – typically an SSD – and the other drives are allocated as a capacity tier. In the past, capacity drives have been rotating hard drives, but these days the capacity tier is also likely to be SSD.

In this configuration, the cache tier effectively absorbs all the writes coming from every virtual machine running on the host system, which means it is critical to specify a device with very low latency and very high endurance for this role. In other words, you need a device that would not ‘bog down’ as those extra CPU cores are put to work.

Intel® Optane™ SSDs fit the bill here, because Intel® Optane™ is based on different technology to the NAND flash found in most other SSDs. Current products such as the Intel® Optane™ SSD DC P4800X series have a read and write latency of 10 microseconds, compared with a read/write latency of 77/18 microseconds for a typical NAND flash SSD.

In terms of endurance, Intel claims that a half terabyte flash SSD with an endurance of three Drive Writes Per Day (DWPD) over five years provides three petabytes of total writes. A 375GB Optane™ SSD has an endurance of 60 DWPD for the same period, equating to 41 petabytes of total writes, representing around a 14x endurance gain over traditional NAND.
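Those total-write figures follow directly from the DWPD arithmetic; a quick check of the numbers in the paragraph above:

```python
# Endurance arithmetic behind the figures quoted (approximate).
days = 5 * 365

nand_pb = 0.5 * 3 * days / 1000       # 0.5TB drive x 3 DWPD over five years
optane_pb = 0.375 * 60 * days / 1000  # 375GB Optane drive x 60 DWPD

print(round(nand_pb, 1))              # ~2.7 PB, rounded up to "three petabytes"
print(round(optane_pb, 1))            # ~41 PB
print(round(optane_pb / nand_pb))     # ~15x on the raw numbers; ~14x using the rounded 3PB figure
```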

The capacity tier of the storage serves up most of the read accesses and can therefore consist of SSDs that have greater capacity but at lower cost and endurance. Intel’s second generation of 3D NAND SSDs based on QLC technology is optimised for read-intensive workloads, making them a good choice for this role.

Furthermore, IT departments can use the greater efficiency of Intel®  Optane™ SSDs to make cost savings by reducing the size of the cache tier required. Intel claims that the cache previously had to be at least 10 per cent of the size of the capacity tier. But with the performance and low latency of Intel® Optane™, 2.5 to 4 per cent is sufficient. This means a 16TB capacity tier used to require a 1.6TB SSD for caching but now customers can meet that requirement with a 375GB Intel® Optane™ SSD.
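The cache-sizing saving works out as follows; the 16TB capacity tier is the example used above, and the percentages are Intel’s rule-of-thumb figures:

```python
# Cache-tier sizing for a 16TB capacity tier, using the rules of thumb quoted.
capacity_tb = 16

old_cache_tb = capacity_tb * 0.10        # 1.6TB with the old 10 per cent guideline
new_cache_low = capacity_tb * 0.025      # 0.4TB at 2.5 per cent
new_cache_high = capacity_tb * 0.04      # 0.64TB at 4 per cent

print(old_cache_tb, new_cache_low, new_cache_high)
# A 375GB Optane SSD sits at roughly the 2.5 per cent mark for this example.
```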

Boosting memory capacity

Another feature of Intel® Optane™ is that the technology is byte-addressable, so it can be accessed like memory instead of block storage. This means that it can expand the memory capacity of systems, boosting the performance of workloads that involve large datasets such as databases, and at a lower cost compared to DRAM.

To this end, Intel offers Optane™ DC Persistent Memory modules, which fit into the DIMM sockets in systems based on Second Generation Intel® Xeon® Scalable processors. The modules are used alongside standard DDR4 DIMMs but have higher capacities – currently up to 512GB. The latency of the modules is higher than DRAM, but a tiny fraction of the latency of flash.

These Optane™ memory modules can be used in two main ways. In App Direct Mode, they appear as an area of persistent memory alongside the DRAM and need applications to be aware there are two different types of memory. In Memory Mode, the CPU memory controller uses the DRAM to cache the Optane™ memory modules, which means it is transparent to applications as they just see a larger memory space.

In other words, App Direct Mode provides a persistent local store for placing often accessed information such as metadata, while Memory Mode simply treats Optane™ as a larger memory space.
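As a rough illustration of the difference from an application’s point of view: in App Direct Mode the persistent memory is typically exposed as a DAX-capable filesystem that an application can memory-map for direct load/store access, whereas Memory Mode needs no application change at all. The sketch below assumes a DAX filesystem mounted at /mnt/pmem (the path and file name are illustrative); production code would normally use Intel’s PMDK libraries to get proper cache-flush and fencing guarantees.

```python
import mmap
import os

# Assumes App Direct Mode with a DAX filesystem mounted at /mnt/pmem;
# the path and file name are illustrative.
PATH = "/mnt/pmem/appdirect-demo.bin"
SIZE = 64 * 1024 * 1024  # 64MB region

fd = os.open(PATH, os.O_CREAT | os.O_RDWR)
os.ftruncate(fd, SIZE)

# Memory-mapping the file gives direct load/store access to persistent
# memory, bypassing the block I/O stack.
buf = mmap.mmap(fd, SIZE)
buf[0:11] = b"hello pmem\n"
buf.flush()   # msync; PMDK offers finer-grained persistence primitives
buf.close()
os.close(fd)

# In Memory Mode, by contrast, the Optane modules simply appear as a larger
# (volatile) main memory and no application change is needed.
```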

VMware, whose platform accounts for a large share of HCI deployments, added support for Optane™ DC Persistent Memory in vSphere 6.7 Express Patch 10. In tests using Memory Mode, VMware found it could configure a node with 33 per cent more memory than using DRAM alone. With the VMmark virtual machine benchmark suite [PDF], VMware said this allowed it to achieve 25 per cent higher virtual machine density and 18 per cent higher throughput.

In conclusion, HCI might have started out as a simpler way to build infrastructure to support virtual machines, but advances in technology now mean it is able to operate even mission critical workloads. With Second Generation Intel® Xeon® Scalable processors and Intel® Optane™ DC SSDs, HCI can deliver the I/O, low latency and reliability needed to support enterprise applications and their database back-ends.

It can also potentially deliver cost savings, as the greater efficiency of Intel® Optane™ storage means that fewer drives or nodes may be required to meet the necessary level of performance.

Sponsored by Intel®

NetApp loves Iguazio’s AI pipeline software

Each month, NetApp’s Active IQ handles up to 10 trillion data points, fed by storage arrays deployed at customer sites. Data volumes are growing.

Active IQ uses various AI and machine learning techniques to analyse the data and sends predictive maintenance messages to the arrays. The service has a lot of hardware capacity at its disposal, including ONTAP AI systems, twin Nvidia GPU servers, fast Mellanox switches and ONTAP all-flash arrays. But what about orchestrating and managing the AI data pipeline fed by those arrays? This is where Iguazio comes in.

NetApp is using Iguazio software for end-to-end machine learning pipeline automation. According to Iguazio, this enables real-time MLOps (Machine Learning Operations), using incoming data streams.

The Iguazio-based Active IQ system has led to 16x storage capacity reduction, 50 per cent reduction in operating costs, and fewer compute nodes, NetApp says. Also new AI services for Active IQ are developed at least six times faster.

Unsurprisingly, NetApp people are enthusiastic. Shankar Pasupathy, NetApp chief architect for Active IQ, said in a prepped quote: “Iguazio reduces the complexities of MLOps at scale and provides us with an end-to-end solution for the entire data science lifecycle with enterprise support, which is exactly what we were after.”

NetApp has now partnered with Iguazio to sell their joint data science ONTAP AI solution to enterprises worldwide. Iguazio also has a co-sell deal with Microsoft for its software running with Azure, and a reference architecture with its software on Dell EMC hardware. It is carving out a leading role as an enterprise AI pipeline management software supplier.

Let’s take a closer look at the Active IQ setup.

Hadoop out of the loop

NetApp needs real-time speed, scalability to cope with the massive and growing streams of data, and the ability to run Active IQ on-premises and in the cloud. It also wants Active IQ to learn more about customer array operations and get better at predictive analytics.

NetApp’s initial choice for handling and analysing the incoming Active IQ data was a data warehouse and Hadoop data lake. But the technology was too slow, too complex and scaling was difficult.

Active IQ dashboard.

The fundamental issue is that real-time processing involves multiple stages with many interim data sets and types, and various kinds of processing entities such as containers and serverless functions.

This complexity means careful data handling is required. Get it wrong and IO requests multiply and multiply some more, overwhelming the fastest storage media.

AI pipeline

Iguazio’s software speciality is organising massive amounts of data and metadata in such a way as to make applying AI techniques in real time possible. Its software provides a data abstraction layer on which these processing entities can run.

An AI pipeline involves many stages:

  • Raw data ingest (streams)
  • Pre-processing (decompression, filtering and normalization)
  • Transformation (aggregation and dimension reduction)
  • Analysis (summary statistics and clustering)
  • Modeling (training, parameter estimation and simulation)
  • Validation (hypothesis testing and model error detection)
  • Decision making (forecasting and decision trees)

There are multiple cloud-native applications, stateful and stateless services, multiple data stores, and data selection and filtering into subsequent stores involved in this multi-step pipeline. 
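In code terms, such a pipeline is essentially a chain of stage functions, each producing an interim data set that the next stage consumes. The sketch below is a generic illustration of that structure, not Iguazio’s or NetApp’s implementation; the stage names simply mirror the list above and the toy data is invented.

```python
# Generic multi-stage pipeline sketch; each stage returns an interim data set.
def ingest(stream):      return list(stream)                        # raw records
def preprocess(recs):    return [r.strip().lower() for r in recs]   # filter/normalise
def transform(recs):     return {r: recs.count(r) for r in set(recs)}
def analyse(features):   return {"distinct": len(features), "total": sum(features.values())}
def model(stats):        return {"threshold": stats["total"] / max(stats["distinct"], 1)}
def validate(m, stats):  return stats["total"] > 0 and m["threshold"] >= 0
def decide(m, ok):       return "alert" if ok and m["threshold"] > 2 else "no action"

records  = ingest(["disk_warn", "disk_warn", "latency_high "])
features = transform(preprocess(records))
stats    = analyse(features)
m        = model(stats)
print(decide(m, validate(m, stats)))
```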

NetApp and Iguazio software

Iguazio’s unified data model is aware of the pipeline stages and the need to process metadata to speed data flow. The software runs in a cluster of servers, with DRAM providing an in-memory metadata database, and NVMe drives holding interim data sets.

Iguazio data model concept.

NetApp uses Trident, a dynamic storage orchestrator for containers that integrates with Docker and Kubernetes and provisions persistent storage from NetApp systems. Iguazio integrates with the Trident technology, linking a Kubernetes cluster and serverless functions to NetApp’s NFS and Cloud Volumes Storage. Iguazio is compatible with the Kubeflow 1.0 machine learning software that NetApp uses.
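For context on what Trident’s dynamic provisioning looks like from the consuming side, here is a minimal sketch using the Kubernetes Python client to request a persistent volume that a Trident backend would satisfy. The storage class name (“ontap-nas”), claim name, size and namespace are assumptions for illustration, not details NetApp or Iguazio have published for this deployment.

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in a pod

# Request a volume from a Trident-managed storage class (name is illustrative).
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="iguazio-interim-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],
        storage_class_name="ontap-nas",
        resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
# Trident watches for the claim and dynamically provisions a volume on the
# backing NetApp system to satisfy it.
```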

Iguazio detects patterns in the data streams and relates them to specific workloads on specific array configurations. It identifies actual and potential anomalies, such as a performance slowdown, pending capacity shortage, or hardware defect.

Then it generates actions, such as sending alert messages, to the originating customer system and all customers with systems likely to experience similar anomalies. It does this in real time, enabling NetApp systems to take automated action or sysadmins to take manual action.

Zerto provides disaster recovery for containerised apps

Zerto is branching out from disaster recovery of virtual machines to offer general backup services. The company will also cover cloud-native applications with its continuous data protection (CDP) and journalling technology.

The company announced the plans, along with a roadmap for its IT Resilience Platform, today at the ZertoCON virtual customer conference.

Specifically, Zerto has announced Zerto for Kubernetes (Z4K – our acronym) to protect applications running on Amazon Elastic Kubernetes Service (Amazon EKS), Google Kubernetes Engine (GKE), Microsoft Azure Kubernetes Service (AKS), Red Hat OpenShift, and VMware Tanzu.

Gil Levonai.

Zerto’s Gil Levonai, CMO and SVP of product, said in prepared remarks: “With the clear shift towards containers based application development in the market, we are looking to extend our platform to offer these applications the same level of resilience we have delivered to VM-based applications.

“While next-gen applications are built with a lot of internal availability and resilience concepts, they still require an easy and simple way to recover from human error or malicious attacks, or to be mobilised and recovered quickly without interruption. This is where Zerto can help.”

Zerto for Kubernetes

Z4K protects persistent data and can protect, move and recover containerised applications as one consistent entity, including associated Kubernetes objects and metadata. It features continuous journaling for Kubernetes apps, including Persistent Volumes, StatefulSets, Deployments, Services, and ConfigMaps. This journal can provide thousands of recovery checkpoints.

It has always-on replication to provide protection and recovery of Kubernetes persistent volumes within and between clusters, data centres or clouds.

Entire applications with their component containers can be recovered as entities in an ordered way. Z4K can instantiate a full running copy of an entire application in minutes from any point in time for recovery from data corruption or ransomware without impacting production or for testing.

There are automated Z4K workflows for failover, failover test, restore, rollback restore, and commit restore. The Z4K software is managed through a kubectl plug-in and native Kubernetes tooling.

Competitive positioning

Z4K is not provisioning storage to containers – unlike Kasten and Portworx which also offer  containerised application protection. Deepak Verma, Director, Product Strategy at Zerto, told Blocks & Files it runs as a native K8s application which provides CDP-based journaling for persistent volumes running on the cluster nodes plus the backup, restore, and DR orchestration necessary to execute any desired use cases for resilience of K8s applications.

Verma said: “Kasten, judging from public documentation, appears to be relying on the snapshot mechanism of the cloud platforms to capture data, with a minimum timeframe of five mins. Zerto for K8s on the other hand is providing CDP for persistent volumes at a 5-10 second interval and capturing all the necessary attributes of a K8s application in a consistency group to recreate locally or remotely down to a very granular point in time.”

In summary, Z4K has “less complexity and vendor lock-in than Portworx and more granular RPOs and having point-in-time consistency across multiple containers compared with Kasten”.

CPU load

Blocks & Files pictures a Kubernetes system executing thousands of containers over a few hours. Z4K is app-aware, so the host server has to run these containers and also do the granular continuous journaling for Z4K, with its thousands of checkpoints. We asked Zerto what burden this additional load places on the host server processors.

Verma said: “Zerto for K8s is built upon the same core journaling and CDP technology and intellectual property that Zerto has successfully deployed and improved for the last 10 years in very large VM environments.

“The relative overhead for most environments has been less than 10 per cent CPU and memory at the host level during busy times of high change rate. The predictable scaling of which is very well understood as well. Since we are still at the alpha stage, we have not run extensive performance tests, but do not expect the technology to be much different than what we currently utilise as guidelines.

“Part of the success of Zerto has been its very efficient CDP mechanism. For production application we view this as a minor overhead to provide the level of RPOs and RTOs that customers have come to expect from Zerto.”

Z4K will be available in an early adopter program later in 2020 and goes on general release next year.

Roadmap

Zerto said it is decoupling operational recovery from backup because continuous journaling eliminates the need for snapshot-based backup. 

Levonai said: “Historically, top-tier, customer-facing applications would be protected with multiple data protection and disaster recovery solutions while lower-tier applications would be protected with high RPO backups only, or not protected at all. Zerto is levelling the playing field by applying its CDP technology to each of these … applications, transforming the backup market.”

The company plans to extend its IT Resilience Platform with a mix of one-to-many replication, local and remote journaling, long-term repositories, and short-term and long-term retention policies. The aim is to meet various SLA needs at various cost levels.

The company will offer in-cloud protection and DR for AWS, Azure and GCP. There will be tiering to cloud object stores, with built-in data reduction technology, for long-term retention.

Zerto will develop new workflows to simplify file or VM restores back to production from a local or remote journal. Additional roadmap features include added VM auto-protection, encryption and security, automatic VRA lifecycle management for maintenance, features for managed service providers, new analytics functionality and an improved licensing mechanism.

Pure Storage delivers Purity six appeal

Pure Storage today launched the sixth generation of the Purity FlashArray OS. The upgrade includes extended disaster recovery, integrated file support and NVMe-over-Fabrics with RoCE for Cisco UCS and VMware.

Purity 6.0 adds a DirectMemory Cache, multi-factor authentication, increased volume management and simpler quoting for Pure1 management software. A Cloud Block Store is in beta test on Azure.

James Kelly, senior systems administrator at Chapman University, a private university in Orange County, California, said in a prepped quote: “The unified SAN and NAS capabilities of this new FlashArray OS represent a game-changer for our highest-performance file-based workloads that otherwise need to run in all-block environments. It offers us a great way to cost-effectively run VDI or performance-critical file-based applications right alongside our key enterprise and research workloads.”

Purity 5.0, announced in June 2017, delivered initial NFS and SMB file services support and synchronous replication. There was also a demonstration of end-to-end NVMe over fabrics to Cisco UCS servers using a 40Gbit/s RoCE v2 Ethernet link at the v5.0 launch.

Active DR

Pure is offering “active disaster recovery” built on new continuous replication technology. It uses active-passive replication for geo distances, and provides a near-zero RPO (three to four seconds). It is bi-directional, there are no journals to manage and Pure says it provides a fast recover/failover time. Customers have test failover, failover, reverse and failback functionality.

Pure offers a wide range of replication options according to availability requirements and price points: synchronous, active-active, replication with ActiveCluster, snapshot-based asynchronous replication, and now continuous replication.

There are validated designs for VMware Site Recovery Manager and applications such as Microsoft SQL, Oracle, SAP and MongoDB. 

Files

In April 2019, Pure Storage acquired Compuverde, a Swedish storage software developer. Purity 6.0 incorporates Compuverde’s file access technology, which sits alongside the existing block protocol above the data reduction and encryption layers in the Purity software stack. This makes FlashArray a unified file and block storage platform.

Block and file data benefit from global deduplication and compression of the shared storage pool in FlashArray. 

iPhone screen grab of Pure slide.

Alex McMullan, Pure’s VP & CTO, International, told us that the current FlashBlade file support is aimed at high-performance file applications such as big data and machine learning environments. FlashArray files is suited to user-style file services that don’t need the scale-out capability of FlashBlade.

He said FlashArray files will get near-synchronous replication. Unstructured data on Purity’s new file services can be protected with Veeam and CommVault backup offerings. 

Other features

Pure announced the third generation FlashArray//X R3, which can be fitted with an Optane cache, in February. Purity 6’s DirectMemory Cache tries to satisfy read requests from this cache and delivers 50 per cent improvement in read latency – down to 125µs. Customers can add Optane capacity in 3TB or 6TB packs of 750GB DirectMemory Modules.

The NVMe-RoCE (RDMA over lossless Converged Ethernet) has a validated design for Cisco UCS servers. Pure will offer NVMe/TCP support but does not say when.

Purity 6.0 hits the streets on June 18. Evergreen Storage subscription customers receive all Purity 6.0 features, with no additional licenses or added support costs. Cloud Block Store for Azure should be delivered later this year.

Snowflake ‘preps $20bn IPO’

Snowflake is prepping an IPO this year that could value the data warehousing startup at up to $20bn.

The company has already submitted a confidential IPO filing with the US SEC, according to the Financial Times, citing unnamed sources.

Snowflake has raised $1.4bn in VC funding, including a $479m G-series round earlier this year which priced the company at $12.4bn. Salesforce was co-lead investor.

Snowflake CEO Frank Slootman said at the time that the company was poised to turn cashflow positive this year, revenue grew by 174 per cent last year, and would soon top $1bn. Snowflake claims more than 2,000 customers.

He told the San Francisco Business Times that the $12.4bn valuation could be higher following an IPO: “The reason is that our growth trajectory is so fierce and our addressable market is so large. When companies grow so fast, as Snowflake has, the valuation may seem like a big number now but not later. When I was with ServiceNow (as CEO), the valuation was $2.5bn when we went out and now it has a $65bn valuation.”

Slootman replaced CEO Bob Muglia in May 2019. At that time Blocks & Files suggested Slootman’s track record of acquisition and successful IPO-based growth must have been attractive to Snowflake’s investors looking for a great exit.

HPE updates Primera and Nimble arrays

HPE has refreshed the upper mid-range Primera arrays and the lower mid-range Nimble arrays. Let’s take a closer look.

The Primera array gets NVMe support via Primera OS v4.2. HPE said all-flash Primera already delivers 75 per cent of I/O within 250 μs latency. Now Primera with all-NVMe supports twice the number of SAP HANA nodes at half the price. HPE has not provided an IO latency number for an all-NVMe Primera array.

Primera nodes and controllers

Primera has been given an AI sub-system so it can self-optimise array resource utilisation in real time.

A Primera array, with the Peer Persistence feature, can now replicate to three global sites for extra and transparent data access reliability if metro-scale disasters occur. The array also gets near-instant asynchronous replication over extended distances. This has a one minute recovery point objective.

A replication target site can be in the public cloud and replication can provide data for test and development and analytics.

Primera automation has been optimised for providing storage to VMware VVOLs and containers. The container support comes via a Kubernetes CSI plug-in and the VVOL support includes disaster recovery via Site Recovery Manager. Nimble arrays already support VVOLs and a CSI plug-in.

Nimble Storage

Nimble features include six nines availability and two-site replication. They also have asynchronous replication on-premises or to the cloud for extended distances. The arrays get global three-site replication via NimbleOS 5.2.

Storage class memory (SCM) is now supported and cuts response time in half for the all-flash Nimble array, according to HPE. It says SCM is too expensive to use as a storage tier for Nimble customers. A relatively small amount of SCM is used instead as a cache to speed IO. It provides an average sub-250 μs latency, at near the price of an all-flash array.

InfoSight system management has been extended for Nimble arrays with cross-stack analytics for Hyper-V. These identify abnormal performance issues between storage and Virtual Machines, and under-utilised virtual resources. 

Availability 

Primera OS 4.2 is available worldwide in Q3 2020 at no additional charge for customers with valid support contracts. Primera All-NVMe is available to order direct and through channel partners. The Primera CSI Driver for Kubernetes 1.1.1 is now available.

HPE Nimble Storage 1.5TB Storage Class Memory Adapter Kits are available now for Nimble Storage AF60 and AF80 All Flash Arrays. Nimble 3 Site Replication is available as part of NimbleOS 5.2 release for any customer with an active support contract. InfoSight Cross Stack Analytics for Hyper-V is available next month for any Nimble customer with an active support contract. 

HPE launched the Primera arrays a year ago as a massively parallel product line, intended to replace the 3PAR array. They have a 100 per cent availability guarantee. Nimble and its array technology was acquired by HPE in March 2017. Both arrays are available as a service through HPE’s GreenLake.