
Amazon Files get CRAB protection

Amazon has added cross-region and cross-account copy capabilities to AWS Backup for Amazon FSx, calling this CRAB for FSx.

Amazon FSx enables customers to create secure, durable backups of their file systems. Before today’s launch, these backups could only reside in the same AWS Region and account as the file system itself. Now FSx backups can be copied to other AWS Regions to meet business continuity goals, or across accounts to meet backup compliance needs.

The news was outlined in an AWS blog by senior solutions architects Adam Hunter and Fathima Kamal. “With this launch, storage and application administrators can copy backups across AWS Regions to implement cross-region business continuity. Administrators can now also copy backups across AWS accounts to protect their backups from accidental user activity, account compromises, and ransomware.”

“Organisations can now easily access and share data for quality assurance and testing, without running any risk of affecting production data.” 

AWS CRAB diagram

Backups are managed from a central console, through which you can set up automated backup schedules, retention policies, and lifecycle management. Data is encrypted in transit and at rest, and backup activity logs aid compliance audits.

The blog has walk-throughs of setting up CRAB runs, with illustrative screenshots.
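For administrators who prefer scripting to the console, a minimal sketch of starting a cross-region copy job through the AWS Backup API (boto3’s start_copy_job call) might look like the following; the vault names, ARNs and IAM role are illustrative placeholders rather than values from the AWS walk-through.

    import boto3

    # Minimal sketch: copy an existing FSx recovery point to a backup vault in
    # another AWS Region using AWS Backup's StartCopyJob API. All names, ARNs
    # and the IAM role are illustrative placeholders.
    backup = boto3.client("backup", region_name="us-east-1")  # source Region

    response = backup.start_copy_job(
        RecoveryPointArn="arn:aws:backup:us-east-1:111122223333:recovery-point:EXAMPLE",
        SourceBackupVaultName="fsx-source-vault",
        # For a cross-account copy, the destination vault ARN would sit in the
        # other account, which must grant access through a vault access policy.
        DestinationBackupVaultArn="arn:aws:backup:eu-west-1:111122223333:backup-vault:fsx-dr-vault",
        IamRoleArn="arn:aws:iam::111122223333:role/service-role/AWSBackupDefaultServiceRole",
    )
    print("Copy job started:", response["CopyJobId"])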

AWS Backup protects user accounts and resources in Elastic Compute Cloud (EC2), Elastic Block Store (EBS) volumes, Relational Database Service (RDS) databases (including Aurora clusters), DynamoDB tables, Elastic File System (EFS), FSx for Lustre, FSx for Windows File Server, and Storage Gateway volumes.

AWS Storage Gateway is a hybrid storage service that enables on-premises applications to use AWS Cloud storage.

AWS Backup competes with in-AWS cloud data protection suppliers Clumio, Commvault, Druva, HYCU and others. Typically these suppliers offer multi-cloud protection, whereas AWS Backup is, naturally, restricted to AWS.

Huawei looks beyond von Neumann to build 100x denser storage systems

At its April 12 Global Analyst Summit in Shenzhen, Huawei said it is researching 100x denser storage systems that break free of the constraints of traditional von Neumann architecture.

We asked the presenter, William Xu, Director of the Board and President of Huawei’s Institute of Strategic Research, to explain his thinking.

William Xu.

He told us: “Storage capacity and performance are issues that must be addressed for future storage systems.”

We need to be able to store a lot more data and to get access to it faster. Each problem needs to be addressed separately.

Capacity

Xu said: “Firstly, we need much higher storage capacity. Capacity density should be 100 times higher than what we currently have; existing storage media cannot achieve this level due to restrictions surrounding process and power consumption. 

“To overcome the capacity hurdle, we need breakthroughs in new technologies including large-capacity and low-latency in-memory computing technologies, ultra-large capacity media technologies such as DNA storage and high-dimensional optical storage, as well as ultra-large storage space model and coding technologies.”

The storage industry is already exploring DNA and optical storage technologies. The reference to large-capacity, low-latency in-memory computing hints at DRAM advances, replacements or substitutes such as storage-class memory.

Performance

Turning to performance, Xu said: “Secondly, we must significantly improve storage performance. As the data access bandwidth of storage systems increases from TBs to PBs and access latency drops from milliseconds to microseconds, we require performance density to increase by 100 times of what we have today.

“Under the von Neumann architecture, data needs to be transmitted between CPUs, memory, and storage media. The PCIe and DDR bandwidths we currently have will not be able to keep up with network performance growth. To proceed, we need to move past the von Neumann architecture and shift away from CPU-centric storage and towards memory- and data-centric storage.”

He’s not saying we need faster PCIe buses or DRAM socket speeds. Instead he’s thinking more about not moving data to compute: “We also need to focus more on computing migration rather than data migration.”

That means bringing compute closer to data, as is the case with computational storage and in-memory computing.

Egnyte deals with Microsoft 365 content sprawl

Microsoft 365’s multiple applications may mean there are mini-content silos to manage – which Egnyte now unifies and syncs with Azure Blob storage.

Egnyte is a file sync and share provider with built-in compliance and governance features to protect sensitive data. Microsoft 365 (M365) office apps include Exchange, Word, Excel, OneDrive, PowerPoint, SharePoint and Teams. Egnyte integrated with Teams last year and today has added more integrations.

Tony Schwingel, Director of IT Service Delivery at Brookfield Properties, outlined an issue with M365 content sprawl in a statement: “Our employees love using Microsoft 365 but with the multiple applications, including Teams, SharePoint, OneDrive, and desktop applications, we find it increasingly difficult to manage and secure all of our content.”

Rajesh Ram.

Rajesh Ram, Egnyte co-founder and chief experience officer, said in a press statement: “Accelerated by the rapid shifts to remote work, Microsoft’s cloud suite, increasingly front-ended by Microsoft Teams, is now pervasive among organisations of all sizes, which has introduced new and unfamiliar risks.

“Mid-sized organisations, in particular, struggle to implement the various compliance, security and privacy tools that Microsoft offers as add-on options to its service. The complexity challenges are particularly acute in regulated industries, such as financial services and healthcare.”

Egnyte and Microsoft.

That’s an opportunity for Egnyte, which has added support for:

  • Automatically identifying and classifying sensitive content in Exchange Online and Exchange Server emails and attachments to help prevent improper disclosure or accidental data loss,
  • Classifying and finding regulated and sensitive content across different SharePoint and OneDrive libraries through a single screen interface,
  • Users accessing and co-editing Word, Excel, and PowerPoint files using the respective Microsoft desktop apps with IT and security teams governing the data through Egnyte.

Sensitive content includes social security numbers, credit card numbers and other personally identifiable information.
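As a crude illustration of what pattern-based classification involves (our sketch, not Egnyte’s engine, which adds validation, context analysis and many more data types), a couple of regular expressions are enough to flag likely social security and card numbers in a blob of text.

    import re

    # Illustrative only: naive patterns for US social security numbers and
    # 16-digit card numbers. Real classifiers add checks such as Luhn validation.
    SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
    CARD_RE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

    def flag_sensitive(text: str) -> dict:
        """Count likely SSN and card-number matches in a piece of text."""
        return {
            "ssn_like": len(SSN_RE.findall(text)),
            "card_like": len(CARD_RE.findall(text)),
        }

    print(flag_sensitive("Ref 123-45-6789, card 4111 1111 1111 1111"))
    # {'ssn_like': 1, 'card_like': 1}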

Egnyte has also released the Egnyte Public Cloud Connector, which syncs selected files between Egnyte and Microsoft Azure Blob object storage. It can support virtual desktop infrastructure (VDI) environments, data processing in Azure and archiving files to Azure, and it can also be used with cloud environments other than Azure.

Samsung switches up with the lightning-fast PCIe 4.0 PM9A1 client SSD

Samsung has launched the PM9A1, a super-fast PCIe 4.0 client SSD with 2TB capacity that matches Western Digital’s Black SN850 at 1 million random read IOPS and 7GB/sec sequential read bandwidth.

Jaejune Kim, corporate SVP of memory marketing at Samsung Semiconductor, said in a statement: “The PM9A1 represents a big step forward for SSD technology. From our newest generation V6 NAND, to the custom firmware and controller, everything is developed in house to deliver the best performance available in the market.”

The company is shipping the drive to the OEM market. Jim Nottingham, GM for the advanced compute and solutions unit at HP, a Samsung customer, said there is “clear demand for PCIe Gen 4’s capabilities and HP is working closely with Samsung to offer future support for the PM9A1, delivering increased bandwidth for even greater performance and seamless workflows for data-intensive users.”

Samsung PM9A1 M.2 drive

Life in the Fast Lane

Blocks & Files thinks that the PCIe Gen 4 interface, twice as fast as PCIe Gen 3, is a performance landmark for laptop, desktop and server systems. The NVMe interface running over the PCIe Gen 4 bus helps to produce low-latency, high-speed drives that leave SATA, SAS and PCIe Gen 3 SSDs in the slow lane.

The PM9A1 delivers up to 850,000 random write IOPS and its sequential write speed is up to 5.2GB/sec. WD’s Black SN850 just pips it on sequential write speed, as it reaches up to 5.3GB/sec. The drive uses Samsung gen 6 V-NAND with 128 layers and TLC format (3 bits/cell). The gumstick format drive has a Samsung Elpis controller and firmware, and a DRAM buffer.

A prior Samsung PM981a gumstick drive with a PCIe Gen 3 interface provided up to 580,000/500,000 random read/write IOPS and up to 3.5/3.0 GB/sec sequential read/write speed, which emphasises the PCIe Gen 4 speed boost.
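A quick back-of-the-envelope comparison of those published figures (our arithmetic, not Samsung’s) shows the size of the generational jump.

    # Rough generational comparison using the spec-sheet numbers quoted above.
    pm981a = {"random read IOPS": 580_000, "random write IOPS": 500_000,
              "seq read GB/s": 3.5, "seq write GB/s": 3.0}
    pm9a1 = {"random read IOPS": 1_000_000, "random write IOPS": 850_000,
             "seq read GB/s": 7.0, "seq write GB/s": 5.2}

    for metric in pm981a:
        print(f"{metric}: {pm9a1[metric] / pm981a[metric]:.1f}x")
    # Roughly 1.7x on random read and write IOPS, 2.0x on sequential read and
    # 1.7x on sequential write.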

Samsung has not yet released endurance, warranty or reliability numbers.

DDN ports EXAScaler storage software to BlueField-2 DPU

DDN has demonstrated its EXAScaler storage controller software running on Nvidia’s BlueField-2 data processing units (DPUs) which replace the existing EXAScaler controllers and add a security layer to enhance data integrity.

EXAScaler is a Lustre parallel file system array. BlueField-2 and BlueField-3 devices are host server offload accelerators that look after data centre networking, storage and cybersecurity tasks.

DDN Marketing VP Kurt Kuckein wrote in a blog post yesterday: “EXAScaler Software Defined on BlueField eliminates the need for dedicated storage controllers and allows storage to scale on-demand and simply in proportion to the network.” Effectively DDN is supplying JBOD hardware and its EXAScaler software which runs in the Arm-powered BlueField-2 SoC.

This EXAScaler-BlueField integration was demonstrated at Nvidia’s virtual GPU Technology Conference on April 12.

Video of Oehme’s Nvidia GTC session.

Sven Oehme, DDN CTO, presented a session titled ‘The Secure Data Platform for AI’, during which he demonstrated an EXAScaler/BlueField-2 array linked to an Nvidia DGX GPU server and feeding it with data.

Oehme said that running the EXAScaler storage stack on BlueField meant less network management complexity, with BlueField managing everything above the base physical connectivity layer. BlueField also provided security measures that isolated the EXAScaler storage and its data from attackers. 

Kuckein wrote: “IO data now inherits all the benefits of BlueField’s network isolation, significantly reducing the number of attack vectors available to a malicious actor. Right away, hacked user accounts, rogue users, man in the middle attacks, hacked root accounts and other avenues for malicious activity are eliminated.” [His emphasis.]

The EXAScaler software stack was split in the demo between array (server) software running in the array’s BlueField-2 DPU and client (DGX) software running in the DGX GPU server’s BlueField-2 DPU. There was effectively a private 200Gbit/s network link between the two BlueField cards.

The demo showed multiple tenants, with each tenant seeing only their portion of the EXAScaler datastore and making virtual API calls to the EXAScaler array.  

Oehme said the initial software port to BlueField-2 has been completed. Over the next couple of quarters, further DDN EXAScaler software releases will add integration with GPUDirect storage (which bypasses the server CPU), end-to-end zero copy, and application Quality of Service.

The end-to-end zero copy means that data is transmitted from and to the array and the DGX with no time-consuming intermediate copying for data preparation at either end.

Comment

Here we have DDN’s EXAScaler array software running on proprietary Nvidia hardware. We don’t know how the performance of the BlueField-2 hardware (8 x Arm A72 cores) compares to the existing EXAScaler dual active:active Xeon controllers but suspect it’s a lot less.

An EXAScaler array using DDN’s SFA18KX hardware has four gen 2 Xeon Scalable Processors and delivers up to 3.2 million IOPS and transfers data at 90GB/sec.

We suspect that BlueField-3, with 16 x Arm A78 cores, would be needed to approach this performance level.

When DDN and Nvidia add GPUDirect support to EXAScaler BlueField, that could provide DDN with a lower cost product than other storage suppliers supporting GPUDirect, such as Pavilion Data, WekaIO and VAST Data. The EXAScaler-BlueField-GPUDirect performance numbers will be very interesting to see.

Another angle to this DDN EXAScaler BlueField port lies in asking which other storage suppliers could do likewise. DDN has said both NetApp and WekaIO are BlueField ecosystem members. Will either port their software to BlueField?

Lastly, Blocks & Files sees a parallel between DDN porting its EXAScaler storage software to a directly-connected and Arm-powered DPU and Nebulon developing its own Arm-powered Storage Processing Unit (SPU) hardware to provide storage array functionality using SSDs in the server hosting the Nebulon SPU card. The storage software stack runs in the DPU/SPU card and reads and writes data to drives in an attached chassis – JBOF with DDN, and server with Nebulon.

ExaGrid becomes cash-positive, passes 3,000 customers

ExaGrid recruited more than 120 new customers in the first 2021 quarter and has become cash-flow positive. The data backup startup said it also posted record sales in EMEA. “We have actually had three really strong quarters in a row. The company continues to grow,” ExaGrid president and CEO Bill Andrews told us.

The quarter saw ExaGrid bookings grow more than 40 per cent and revenue 24 per cent as the total customer count moved past 3,000. Blocks & Files calculates that, since the third 2018 quarter, ExaGrid has won an average of 60 new customers each quarter. “Remember, we go after the upper mid-market to the enterprise so these are not smaller customers,” Andrews said.

ExaGrid supplies a scale-out, globally deduplicating backup appliance with a so-called disk-cache Landing Zone for fast restores of the most recent backups. There are two backup tiers: the Performance Tier for the next-fastest restores and the virtual air-gap Retention Tier for slower restores from deduplicated backup files.

ExaGrid scale-out hardware.

The company in January announced v6.0 of its software, which contained the ransomware-combatting ExaGrid Retention Time-Lock feature with immutable backups that could not be deleted. Andrews said at the time: “ExaGrid has the only backup storage solution with a non-network-facing tier, delayed deletes, and immutable data objects.”

Today, Andrews said: “ExaGrid has upgraded over 1,000 customers to Version 6.0 and all of those customers have the Retention Time-Lock for Ransomware Recovery feature turned on.”

The company claims a 75 per cent competitive win rate against suppliers such as Dell EMC (PowerProtect), HPE (StoreOnce),  Quantum (DXi), and Veritas.

Comment

Data protection startups are pushing the idea of backup-as-a-service for on-premises and in-cloud applications, provided by software running in the public cloud. The fastest restore for in-cloud apps will be from in-cloud backup stores, while the fastest on-premises restores will be from on-premises backup appliances.

That is ExaGrid’s market. It is a big market.

Dell EMC-provided IDC chart of Purpose-Built Backup Appliance supplier revenue market shares in 2020. Dell had 47 per cent of a market worth $4.33bn.

An IDC chart published this month by Dell EMC shows that the purpose-built backup appliance market generated $4.33bn in revenues in 2020. Dell EMC soaked up a 47 per cent share – approximately $2.035bn. With nothing else to go on but rough visuals from the pie chart above, we think the unnamed second- and third-placed vendors look to have over 20 per cent market share. Who could they be? Our entirely unscientific hunch is that HPE StoreOnce is in second place. Is ExaGrid the third-placed vendor?

Nvidia unveils BlueField-3 DPU. It’s much faster

Nvidia says its next-generation BlueField SmartNIC device, launched today, will deliver the most powerful software-defined networking, storage and cybersecurity acceleration capabilities for data centres.

Update: 13 April 2021. Updated Nvidia roadmap diagram added with revised SPECint numbers.

A doubled Arm core count, doubled network speed and PCIe Gen 5 support will make Nvidia’s third-generation BlueField hardware deliver a five times higher SPECint score than the gen 2 BlueField device. Nvidia is placing BlueField-3 in the AI and “accelerated computing” market, and says the DPU will free up x86 servers for application processing.

Jensen Huang, Nvidia founder and CEO, said in a press statement: “A new type of processor, designed to process data centre infrastructure software, is needed to offload and accelerate the tremendous compute load of virtualization, networking, storage, security and other cloud-native AI services. The time for BlueField DPU has come.”

Nvidia gained the BlueField technology via its acquisition of Mellanox, announced in March 2019. BlueField is a system-on-chip (SoC) that offloads a host server CPU by performing tasks such as interfacing to storage devices, security checking, and network data transmission. Sample shipping of BlueField-2 was announced in November last year and the device is now generally available.

BlueField-3 SoC.

BlueField-3 will have 16 x Arm A78 cores, providing ten times more compute power than BlueField-2, 400Gbit/s bandwidth, and the aforementioned PCIe Gen 5 support – four times faster than PCIe Gen 3. Nvidia will include accelerators for software-defined storage, networking, security, streaming, line-rate TLS/IPSEC cryptography, and – like BlueField-2 – precision timing for 5G telco and time-synchronised data centres.

BlueField-3 is rated at 42 SPECint and 1.5 TOPS (TeraOps).

BlueField roadmap. An earlier version of this roadmap depicted BlueField-2X and BlueField-3X devices, each with an attached GPU. The GPU is integrated into the BlueField SoC in the BlueField-4 generation.

In comparison, BlueField-2 is equipped with eight Arm A72 cores, 200Gbit/s Ethernet/InfiniBand networking, dedicated security engines and two VLIW accelerators. It supports NVMe over Fabrics (NVMe-oF) Storage Direct, encryption, elastic storage, data integrity, compression, and deduplication. BlueField-2 delivers 9 SPECint and 0.7 TOPS.

Nvidia’s DOCA software development kit (SDK) enables developers to create software-defined, cloud-native, DPU-accelerated services for BlueField-2 and BlueField-3, leveraging industry-standard APIs.

Nvidia DOCA diagram.

Both BlueField products can isolate data centre security policies from the host CPU and build a zero-trust data centre domain at the edge of a server. BlueField-3 can also function as the monitoring, or telemetry, agent for Nvidia’s Morpheus cloud-native cybersecurity framework. “Morpheus combines Mellanox in-server networking and Nvidia AI to do real-time, all-packet inspection to anticipate threats and eliminate them as they arise,” Huang said.

Morpheus uses machine learning to identify, capture and take action on threats and anomalies such as leaks of unencrypted sensitive data, phishing attacks and malware. It can analyse every network packet with line-rate speed and no data replication.

A BlueField ecosystem is being set up by Nvidia with:

  • Server suppliers Dell Technologies, Inspur, Lenovo and Supermicro integrating BlueField DPUs into their systems,
  • Cloud service providers Baidu, JD.com and UCloud using BlueField,
  • Support from system software suppliers Canonical, Red Hat and VMware,
  • Fortinet and Guardicore in the cybersecurity area supporting BlueField,
  • Support by storage providers DDN, NetApp and WekaIO, 
  • Edge IT platform support from Cloudflare, F5 and Juniper Networks.

BlueField-3 is backwards compatible with BlueField-2 and expected to sample in the first quarter of 2022.

Comment

BlueField is comparable to Fungible’s DPU in its scope, but the Fungible Data Center processor and FS1600 Storage Processor are elements in Fungible’s composable data centre scheme, which uses its own network fabric. Nvidia is not presenting BlueField as a means to the end of composing disaggregated data centre resources.

However, BlueField is software-defined. That implies that APIs could be developed so that servers and storage systems front-ended by BlueField could function in a composable data centre environment.

The diagram above, showing the BlueField roadmap to gen 4, is hardware-centric. Blocks & Files expects Nvidia to increase the software element as its ideas about the data centre future meet and react to composable data centre concepts being promoted by suppliers such as Fungible, HPE (Synergy) and Liqid.

We think Nvidia will in due course add Compute Express Link (CXL) support to BlueField, as the CXL link can run across a PCIe Gen 5 bus.

Amazon Re:Freezer makes it easier to transfer Glacier vault archives

Amazon has announced the S3 Glacier Re:Freezer serverless solution, which transfers an entire Glacier data vault to another S3 storage class such as Glacier Deep Archive.

The idea is to transfer an entire archive vault of data in a single operation, with Amazon Lambda functions taking care of the source vault inventory, restore and archive copy operations, and data integrity checks. There is a dashboard to track progress.

An Amazon whats-new post states: “Deploying this solution allows you to seamlessly copy your S3 Glacier vault archives to more cost effective storage locations such as the Amazon S3 Glacier Deep Archive storage class.”

Glacier is Amazon’s S3 object storage service to store low-access rate data in a low-cost repository with data access taking minutes to hours. Glacier Deep Archive costs less than Glacier and takes longer to access data. Ordinary data access can take up to 12 hours with bulk data access at the PB level taking up to 48 hours.

Three points. Firstly, the source Glacier vault is not deleted after the copy; a user has to do that manually. Secondly, the Amazon post clearly states that Re:Freezer “automatically copies entire Amazon S3 Glacier vault archives to a defined destination Amazon Simple Storage Service (Amazon S3) bucket and S3 storage class.” It does not restrict the target S3 storage class to Glacier Deep Archive.

S3 Re:Freezer architecture diagram.

We take that to mean that the source Amazon Glacier archive contents can be copied to any S3 storage class: S3 Standard, S3 Intelligent Tiering, S3 Standard-Infrequent Access (IA), S3 One Zone-IA as well as S3 Glacier Deep Archive.
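For a sense of what the solution automates per archive, the underlying API calls are roughly as follows. This is a simplified, single-archive sketch (Re:Freezer itself stages data through Lambda, SQS and an S3 staging bucket), and the vault, bucket and archive identifiers are placeholders.

    import time

    import boto3

    # Simplified sketch of the per-archive copy path that Re:Freezer automates:
    # retrieve an archive from a Glacier vault, then write it to an S3 bucket
    # under the desired storage class. Identifiers below are placeholders.
    glacier = boto3.client("glacier")
    s3 = boto3.client("s3")

    job = glacier.initiate_job(
        accountId="-",
        vaultName="legacy-vault",
        jobParameters={"Type": "archive-retrieval", "ArchiveId": "EXAMPLE-ARCHIVE-ID"},
    )

    # Glacier retrievals take hours; an SNS notification is the usual trigger,
    # but polling keeps the sketch self-contained.
    while not glacier.describe_job(accountId="-", vaultName="legacy-vault",
                                   jobId=job["jobId"])["Completed"]:
        time.sleep(600)

    output = glacier.get_job_output(accountId="-", vaultName="legacy-vault", jobId=job["jobId"])
    s3.put_object(
        Bucket="refrozen-archives",
        Key="legacy-vault/EXAMPLE-ARCHIVE-ID",
        Body=output["body"].read(),
        StorageClass="DEEP_ARCHIVE",  # or STANDARD, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING
    )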

Our third point? We perceive potential nomenclatural confusion. If data is in the Glacier archive vault it is already ‘frozen’, and a service that transfers an archive to S3 Glacier Deep Archive ought to be called DeepFreezer – not Re:Freezer. Alternatively, a service that transfers it out of the Glacier archive to faster-access S3 storage classes could be called DeFreezer.

Blocks & Files envisages Azure, the Google Cloud Platform and other public cloud data archive services responding to this Amazon move with their own bulk archive copy services.

A GitHub README page contains a lot of information about Re:Freezer and deserves a look if you are interested in bulk copying data from S3 Glacier to another S3 storage class.

Gartner: external storage spend to lag server spend 2019-2025

Server-based IT expenditure will grow three times faster than external storage spending, over 2019-2025, according to Gartner forecasts.

Wells Fargo senior analyst Aaron Rakers has presented tabulated Gartner data to his subscribers which reveals that total IT end-user spending will grow 5.9 per cent from $3.84tn in 2019 to $5tn in 2025. He noted: “Worldwide IT spending includes spending on end-user devices (PCs, smartphones, etc.), data centre systems (servers, storage, and networking), enterprise software, IT services, and communications services.”

Data centre server spend will grow from $80bn in 2019 to $111.6bn in 2025, a 5.6 per cent CAGR. External controller-based storage was worth $28bn in 2019 and is forecast to reach $29.9bn in 2025, a 1.8 per cent CAGR.
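As a sanity check on those growth rates, applying the compound annual growth rate formula to the server endpoints reproduces the quoted figure to within rounding (our arithmetic, not Gartner’s).

    # CAGR = (end / start) ** (1 / years) - 1
    def cagr(start: float, end: float, years: int) -> float:
        return (end / start) ** (1 / years) - 1

    # Data centre server spend: $80bn (2019) to $111.6bn (2025), six years.
    print(f"{cagr(80.0, 111.6, 6):.1%}")  # ~5.7%, in line with the quoted 5.6 per cent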

Table supplied by Aaron Rakers.

The big growth areas are in software:

  • Enterprise Application Software; 12.4 per cent CAGR from 2019’s $217.8bn to 2025’s $380bn,
  • Infrastructure Software; 10.7 per cent CAGR from 2019’s $258.8bn to 2025’s $422.8bn,
  • Managed Services and Cloud Infrastructure Services; 10.2 per cent CAGR from $472.7bn in 2019 to $692.4bn in 2025.

Wish HCI could do more? Maybe it’s time for some disaggregation

Lego bricks

Sponsored Hyperconverged infrastructure (HCI) ticks a lot of boxes for a lot of organizations. Recent figures from IDC showed growth in the overall converged systems market was static in the fourth quarter of 2020, but sales growth for hyperconverged systems accelerated, up 7.4 per cent and accounting for almost 55 per cent of the market.

It’s not hard to see why this might be. The pandemic means that many organizations are having to support largely remote workforces, meaning a surge of interest in systems that can support virtual desktop infrastructure. Those all-in-one systems seem to offer a straightforward way to scale up compute and storage to meet these challenges in a predictable, albeit slightly lumpy, way.

But what if your workloads are unpredictable? Are you sure that your storage capacity needs will always grow in lockstep with your compute needs? Looked at from this point of view, HCI may be a somewhat inflexible way to scale up your infrastructure, leaving you open to the possibility of paying for and managing storage and/or compute resources that you don’t actually need. Suddenly that tight level of integration is a source of irritation. Aggravation even.

This is why HPE has begun to offer customers a path to “disaggregation” with the HPE Nimble Storage dHCI line, which allows compute nodes, in the shape of HPE ProLiant servers, to share HPE Nimble storage arrays, while still offering the benefits of traditional HCI.

HPE’s Chuck Wood, Senior Product Marketing Manager, HPE Storage and Hyperconverged Infrastructure, says that while the classic HCI model delivers when it comes to deployment and management, admins still face complexity when it comes to the actual infrastructure.

“In traditional HCI, when you need to do Lifecycle Management, like adding nodes or even doing upgrades, all of those operations can be disruptive, because your apps and your data are on the same nodes,” he says.

VM-centric experience

At the same time, he explains, customers want to apply HCI to workloads beyond VDI and user computing: “There was this simplicity piece that you wanted to bring to more and more workloads – databases, business critical and mixed workloads.”

At first, it might seem counter-intuitive to split up the hardware elements that make up traditional HCI systems in the pursuit of simplification.

But as Wood explains, HPE’s approach is centered around delivering a simple VM-centric experience: “The VM admin wants to operate and manage workloads, virtual machines, you don’t want to be worrying about the infrastructure. To them, that’s where a lot of the complexity lies.”

So, in practice, “what we’re offering that’s aggregated is this simplified management … vCenter is the management plane, and our tools to manage Nimble Storage dHCI are embedded completely in vCenter.”

At the same time, Nimble dHCI has a Container Storage Interface (CSI) driver, so, as Wood puts it, “you can run VMs alongside containers. We can be that common infrastructure.”

“The disaggregated part is just the notion that you can add more ProLiants if all you need is memory, and compute and CPU resources,” he says. “Or you can add disks and all-flash Nimble shelves, if all you require is capacity for storage.”

This all allows for a much more incremental, and non-disruptive, approach to building out the infrastructure. For instance, adding an HPE Nimble dHCI array to an existing setup of HPE ProLiants and approved switches is a five-step process that takes just 15 minutes, Wood says.

It’s worth noting that adopting HPE Nimble-based dHCI architecture does not require you to start from scratch. “If you have existing ProLiant Gen9 or Gen10 servers, we can actually bring them into Nimble Storage arrays,” says Wood, though to date, he says, most customers have built from the ground up.

Resilient infrastructure

Resiliency is further underpinned by HPE InfoSight, HPE’s AI-powered intelligent data platform, which aggregates telemetry from HPE Nimble systems to provide predictive management and automation. According to HPE, InfoSight can automatically resolve 86 percent of issues, while time spent managing problems is slashed by 85 percent.

The combination of ease of deployment and management, disaggregated architecture and automation via HPE InfoSight means HPE can commit to a guaranteed six nines of data availability for HPE Nimble Storage dHCI systems. “It yields a better infrastructure from a resiliency perspective,” says Wood. “If you don’t have a lot of dynamics in your data center, and you just sort of set it and forget it for your general purpose workloads, then HCI is great, it’s perfect. But when it’s a more dynamic environment, where you don’t know what you’re going to need three to six months from now, disaggregated dHCI can be a better play.”

This inevitably prompts the question: if you need to cater for changeable workloads, wouldn’t it make more sense to use the cloud to supplement your existing storage infrastructure, before moving entirely off-prem?

Wood says the cloud is absolutely part of HPE’s offering, in the shape of its HPE Cloud Volumes service. This gives customers the benefits of keeping control of their data on-prem, while having the ability to flex into the cloud as needed, further reducing the temptation to over-provision. HPE Nimble dHCI also offers support for Google Anthos.

And HPE Nimble dedupe and compression features further reduce the need for over-provisioning on prem, or in the cloud, he says. “Many customers are getting five-to-one dedupe on their backups and primary storage, so it gives you very good capability and resource utilization.”

On-prem

Even companies that began as cloud-native are now finding they need to build out on-prem infrastructure, says Wood. “There’s a tipping point that ‘we have to have something on premises for test dev or to scale’,” he explains. Having on-prem infrastructure, supplemented by the cloud, gives customers far more control of their destiny, he says.

Wood cites the example of Australian pet insurance firm PetSure which had been using AWS to support VDI for its employees and provide remote veterinary services for customers’ pets. It switched to on-prem HPE Nimble dHCI as the Covid-19 pandemic hit, to support newly remote workers while also expanding its mixed workloads.

Similarly, he says, a healthcare customer in the UK opted to process X-rays locally, because it couldn’t afford to take the chance on latency that a cloud-based approach could involve.

“That’s why not everything is going to go to the cloud,” he says. “Because you need that local CPU memory and disk to give you the speed that your business requires to do your transactions.”

And, HPE would argue, you can now do that with a lot less aggravation, and aggregation, than you could in the past.

This article is sponsored by HPE.

Ionir frees data from its location trap

Profile Last month Blocks & Files interviewed Ionir marketing head Kirby Wadsworth and CEO Jacob Cherian to find out more about this storage startup.

Ionir stores block data using a generated name and virtual, time-sensitive location-mapping metadata, so that data can be moved in location and in time to enable persistent volume mobility – fast mobility to any point in time – for Kubernetes applications.

Ionir software provides persistent block storage volumes to cloud-native applications using Kubernetes. It uses technology developed by Reduxio, a company that failed to progress and was the former home of many of Ionir’s founders.

Wadsworth said Ionir’s software is “elastic and dynamic with near-infinite capacity and performance,” and quite different from traditional block storage.

The location anchor

He said that in traditional block storage, data is defined by its location: a storage volume and an offset from the start of the volume. The volume is a logical location mapped to physical disk or NAND storage media.

Ionir’s technology accepts an incoming data write and generates a hash value from its contents, called its Name. It also generates a timestamp so that data with that name can be accessed by time.

Ionir data naming.

The data is then stored in a volume and has three basic metadata items associated with it: name, physical location (volume + offset), and time. The volume and offset are collectively called a Tag.

Now assume a second data write comes in and its generated hash has the same value as the first item’s. Then no new data is stored; only new metadata is recorded, namely the second item’s tag and time. This is intrinsic deduplication in the block datastore: “Any block repeating an existing one is mapped to it,” Wadsworth said.

Blocks & Files combination of two Ionir slides showing data name, volume, offset and time relationships plus the inherent deduplication

If a third item comes in and its name is different, the new data is stored together with its metadata; name, tag and time. The chart above depicts this process, with unique data items represented by green lego bricks.

This combination of features means that the Ionir block store provides continuous data protection with no need for a separate data protection application. There is also no need for snapshots, or for application quiescing to enable a snapshot to be taken.

The Ionir structure can be seen as a key:value store with the name being the key, and the data being the value. The data can be arbitrarily large, according to Ionir.
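A minimal sketch of the naming, timestamping and deduplication idea described above (our illustration of the concept, not Ionir’s implementation) could look like this.

    import hashlib
    import time
    from typing import Optional

    # Conceptual sketch of a content-addressed, timestamped block store with
    # intrinsic deduplication, as described above. Not Ionir's actual code.
    class BlockStore:
        def __init__(self):
            self.data_by_name = {}  # Name (content hash) -> block payload
            self.journal = []       # (timestamp, volume, offset, name) metadata

        def write(self, volume: str, offset: int, block: bytes) -> str:
            name = hashlib.sha256(block).hexdigest()  # the block's Name
            if name not in self.data_by_name:         # payload is stored only once
                self.data_by_name[name] = block
            # Every write records new metadata: the Tag (volume + offset) and time.
            self.journal.append((time.time(), volume, offset, name))
            return name

        def read(self, volume: str, offset: int, as_of: Optional[float] = None) -> bytes:
            """Return the block at (volume, offset) as of a given point in time."""
            for ts, vol, off, name in reversed(self.journal):
                if vol == volume and off == offset and (as_of is None or ts <= as_of):
                    return self.data_by_name[name]
            raise KeyError("nothing written at that address by that time")

Because every write is just another metadata entry, reading ‘as of’ an earlier timestamp is a lookup rather than a restore, which is the property behind the continuous data protection described above.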

We suggested it was similar to object storage in that a data item’s name (Ionir) or address (object storage) is a hash generated from its contents. Cherian was emphatic in denying this: “It is not an object storage system.”

Data Teleport

Here we have the basics of Ionir’s Data Teleport technology. If data has to be moved back in time – to a point just before a ransomware attack, for example – the restore to a previous point in time, using the timestamps, takes about a second, meaning a one-second RPO.

If data has to be moved to a new location the metadata is moved first and this takes up to 40 seconds – it depends upon the size of the metadata store – after which the data can be accessed from the new location as the name in the metadata store is mapped to the original location. The location maps are altered as the data itself is physically moved, using smart pre-fetch algorithms. In effect the data is cloned.

Ionir says its Data Teleport technology makes data quickly portable and available at any location in an on-premises and public cloud ‘space’, which can include multiple on-premises locations and public clouds in a hybrid, multi-cloud infrastructure.

Cloud-native technology

Ionir’s software has three components, called the Pitcher, the Catcher and the Keeper.

The three Ionir software components.

The Pitcher is the front end; it receives data read and write requests from applications and compresses/decompresses data. The Catcher manages and provides access to metadata, and carries out the intrinsic deduplication. The Keeper writes data to storage media, using NVMe-accessed flash storage.

These three software components are written as microservices and are Kubernetes-orchestrated containers. Other containers request storage services from the Ionir containers using Kubernetes’ CSI interface.

The Ionir software supports automated tiering.

Its route to market is via a small direct sales force and channel partners, such as service providers outside the hyperscaler segment. Wadsworth said “Kubernetes levels the playing field,” and this benefits a small startup.

Future developments

What about the future?

Cherian said: “Today, we store blocks of data, put together into volumes and presented to applications.” But: “A Pitcher could be developed to have a new interface.” An actual key:value interface was mentioned as an example in passing.

Also: “A new Keeper could be developed to support objects and different storage media. … We could have an S3 front end or backend. We’re storage protocol and access agnostic.”

Your occasional storage digest with Intel Ice Lake, composable systems and HCI

The big enterprise hardware event this week centres on Intel’s Ice Lake Xeon processors, with HPE saying its Synergy composable system will support Ice Lake compute nodes. We envisage a sequence of rolling upgrades across the industry’s HCI and external storage systems as suppliers adopt Ice Lake technology.

Following on from last month’s launch of Fungible’s data centre processor, IDC has written a report forecasting the rise of composable, disaggregated systems, like Fungible’s.

Synergy with Ice Lake

HPE’s Synergy composable server will support Intel’s gen 3 Xeon Ice Lake CPUs, giving it a boost in processing power.

Synergy provides servers dynamically composed for workloads from pools of physical and virtual compute, storage, and network fabric resources. When the workload completes the components are returned to the pools for re-use.

The Synergy 480 Gen10 Plus is a two-socket compute module with up to 40 per cent more performance than gen 2 Xeon servers, thanks to a faster CPU with up to 40 cores, eight rather than six memory channels, use of the PCIe Gen 4 bus, and support for Optane PMem 200 Series rather than the earlier PMem 100 drives.

The 480 Gen10 Plus will be available in the summer.

Synergy systems are available as a service through HPE GreenLake.

Ice Lake, storage controllers and HCI

Blocks & Files expects hyperconverged infrastructure hardware suppliers and external storage system suppliers to announce powerful new products using Ice Lake gen 3 Xeon processors in the next few months.

Ice Lake CPUs have up to 40 cores, compared to gen 2 Xeon’s 26-core maximum, and a third more memory channels than gen 2 Xeons, enabling 2TB of DRAM per CPU. These new processors also support the PCIe Gen 4 bus, which is twice the speed of PCIe Gen 3.

Inspur says its M6 servers, which use Ice Lake, have overall storage density and IOPS (input/output operations per second) scaled up by 3 times and 3.2 times, respectively over its gen 2 Xeon servers.

All-in-all, this should enable at least 30-40 per cent more virtual machines supported in an HCI node, compared with today’s gen 2 Xeon boxes. HPE has confirmed to us that HPE SimpliVity HCI systems will support Ice Lake.

Intel Ice Lake Gen 3 Xeon CPU.

Object storage nodes using Ice Lake servers should get a similar boost in power as will scale-out filer nodes. External block arrays and unified block+file arrays should also receive a boost in power. We could expect raw capacity per controller to increase quite substantially and data reduction speed to accelerate as well.

Purpose-built backup appliances, such as Dell EMC’s PowerProtect line, should also be capable of significant Ice Lake performance and capacity boosts. Expect a continual sequence of Ice Lake-updated storage and HCI systems over the next few quarters.

IDC composability report

IDC analysts have detected a divide that is opening up between current and next-generation workloads. They say composable, disaggregated systems can bridge the gap.

The IDC report ‘Composable/Disaggregated Infrastructure Systems: Catalysts for Digital Transformation’ explains that current workloads “assume infrastructure resiliency and require computing technologies such as virtualization and clustering that provide portability and transparently preserve application state. They also depend on external shared storage arrays that provide multiple layers of resiliency and redundancy.”

In contrast, next-generation and cloud-native workloads “are designed to be horizontally scalable, with state maintained in resilient data layer processes provided by the infrastructure itself. They are designed to run on industry-standard computing platforms and inside lightweight application containers and carry no bias toward specific silicon architecture.”

IDC’s report was produced by IDC Custom Solutions and is being distributed by Fungible, a maker of function-offload accelerators.

The report authors declare that “emerging composable/disaggregated infrastructure solutions provide the necessary scale, agility, and flexibility needed for cohosting current-generation and next-generation workloads on a common data centre infrastructure architecture.”

Fungible FS1600 Storage Server DPU

Composable/disaggregated infrastructure is based on “special silicon engines called function-offload accelerators (FAs) (also called data processing units or DPUs and SmartNICs) to disaggregate compute, storage, and networking resources into shared resource pools, which in turn can be allocated to workloads on demand in a software-defined manner.”

Shorts

Alluxio, in collaboration with Intel, has announced the availability of an in-memory software acceleration layer using 3rd Gen Intel Xeon Ice Lake CPUs and Optane persistent memory (PMem) 200 series drives. It’s aimed at analytics and AI pipeline workloads and uses Optane’s Storage over App Direct feature.

Research outfit Counterpoint says average smartphone NAND flash capacity crossed the 100GB mark in 2020 for the first time. It was 140.9GB in iOS phones in the fourth 2020 quarter but 95.7GB in Android phones during the same period.

Google Database Migration Service is available to help enterprises migrate their databases and other business IT infrastructure to Google’s public cloud. Freedom Financial Services used it and the Google Kubernetes Engine (GKE) to migrate its 1TB MySQL database to Google’s Cloud SQL in 13 hours.

M-Files, an information management company, today announced the acquisition of Hubshare and its secure information exchange platform to bolster external content file sharing and collaboration.

Pavilion Data Systems is partnering with Graphistry to supply data analytics functionality using Nvidia Magnum IO and GPUDirect technologies. Graphistry has integrated data UI tools like Jupyter notebooks, Streamlit dashboards, and Graphistry visual analytics with the Nvidia RAPIDS GPU ecosystem (Dask_cuDF, GPU Direct storage, DGX A100) and with the Pavilion HyperParallel Data Platform.

Platform9 has updated its Platform9 Managed Kubernetes (PMK) product with IPv6 support for 5G deployments. It’s claimed to deliver near line-rate performance for network packets.

SaaS business Redstor has announced new Azure services. There is an AI engine to detect malware, and the ability to back up and recover Kubernetes environments on Azure. Redstor can also migrate and recover full systems to Azure from any cloud or data centre. 

ScaleOut Software has released an on-premises version of its ScaleOut Digital Twin Streaming Service with all of the analytics and visualisation tools available in its Azure-hosted cloud version. It enables organisations to develop and deploy streaming analytics applications on their own servers, and so obviate public cloud security concerns.

Cloud storage supplier Wasabi is partnering with Atempo to offer the Tina backup and restore functionality across physical and virtual servers for applications, databases, NAS, Office 365, and more.

The Linac Coherent Light Source (LCLS) at the Stanford Linear Accelerator Centre (SLAC) National Accelerator Laboratory has deployed WekaIO filesystem software to store data for its LCLS Free Electron Laser (FEL) X-ray snapshots of atoms and molecules. These provide atomic resolution detail on ultrafast timescales to reveal fundamental processes in materials, technology, and living things.