
Looking through the Veeam lens: hybrid IT arrives, containers and SaaS app protection needed

A snapshot of the IT environment provided by a Veeam survey shows that hybrid IT has become reality, cloud disaster recovery is difficult and a minority interest, and Office 365 data needs backing up — as does cloud-native app data.

Data protector Veeam has issued a 2021 Cloud Protection Trends Report based on data from a survey run by an independent research firm. It looked at data protection in four areas of IT: hybrid cloud, disaster recovery to the cloud, SaaS app protection and container protection.

There are three highlighted results:

  • Almost half of the 1550 respondents run production apps in the public cloud;
  • 80 per cent use the public cloud in their disaster recovery (DR) arrangements;
  • Twice as many organisations use a third party backup for Office 365 compared to a year ago.

The survey showed that hybrid IT — the mix of on-premises and public cloud IT environments — is a reality. Backup is changing from reliance on physical and virtual servers in the data centre to increased reliance on hosted virtual machines in the public cloud.

Veeam survey report chart.

More than half (55%) of the respondents used the public cloud for normal production workloads, 47 per cent for high-priority production workloads, 36 per cent for development and 21 per cent for disaster recovery.

Application movement is not one-way to the cloud. Some 23 per cent of respondents have decided to bring applications that were moved to the cloud back on-premises. That movement reinforces a requirement that data protection facilities should cover both the on-premises and the public cloud worlds (AWS, Azure, GCP, etc.), and the environments within them: virtual machines and containers.

The public cloud is popularly used for DR, with less than a fifth of respondents (10%) not using it. For example, 40 per cent store backed up data in the cloud, for on-premises restoration, 39 per cent use the cloud as a secondary site for DR, and 27 per cent have a purpose-built DR-as-a-Service arrangement involving the public cloud.

Respondents using the public cloud in their DR arrangements pointed out there were difficult areas that could be smoother — network configuration and connectivity, securing remote sites, firing up servers and verifying remote server functionality. Expense was another issue.

On the more positive side, there was widespread acceptance that data used by SaaS applications, like Office 365, needs to be backed up by the SaaS user. This is to protect against accidental deletion, cyber and other malicious attacks, and to meet compliance needs.

There was also acknowledgement that containerised apps and data need protecting. In both this and the SaaS application case, the decision to protect and to manage the protection was split between backup admin staff and SaaS/PaaS admin staff.

Comment

In the world of hybrid IT, apps and data are running on physical, virtual and containerised servers in data centres, as well as on virtual and containerised platforms in the public clouds, and are also accessed as services from the cloud. This inherently makes the provision of a single backup environment covering all three environments a formidable task. Even more so if restoration is desired outside the source environment.

Ideally there should be a single or unified data protection control plane. This would enable different employees (backup admins, SaaS/PaaS admins, DevOps, compliance and line-of-business people) to co-operate, co-ordinate and integrate their data protection needs.

You can download a copy of the PowerPoint deck-style 14-page report from Veeam’s web site.

Inspur goes world record SPC-1 benchmarking in the Optanical garden

A clustered 16-controller Inspur storage array system using Optane and NVMe SSDs has set an SPC-1 performance world record, overtaking Huawei.

The Storage Performance Council’s SPC-1 benchmark tests the performance attributes of a storage array responding to business-critical workload IO requests. It measures SPC-1 input/output operations per second (IOPS), price per IOPS, response time, capacity and price per gigabyte.

Inspur’s HF18000 G5-1 system scored 23,001,502 SPC-1 IOPS, beating the previous record set by Huawei of 21,002,561 IOPS. The Huawei Dorado 18000 v6 array used 576 x 1.92TB NVMe SSDs, and Inspur’s array had 576 SSDs as well. But that number was made up of 480 x 1.92TB NVMe SSDs plus 96 x 375GB Optane SSDs. The P4800X Optane drives tipped the balance.
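The drive counts and scores quoted above can be checked with a few lines of arithmetic. This is a rough sketch using only the figures in the text. The raw capacities are simple drive-count multiplications, not the capacities reported in the SPC-1 disclosures, and the variable names are our own.

```python
# Figures quoted in the article; everything else is derived arithmetic.
TB_PER_NVME_SSD = 1.92
TB_PER_OPTANE_SSD = 0.375  # 375GB P4800X

huawei_drives = 576
inspur_nvme_drives = 480
inspur_optane_drives = 96

huawei_iops = 21_002_561
inspur_iops = 23_001_502

# Total drive count and raw (unformatted) capacity for each system.
inspur_drives = inspur_nvme_drives + inspur_optane_drives
huawei_raw_tb = huawei_drives * TB_PER_NVME_SSD
inspur_raw_tb = (inspur_nvme_drives * TB_PER_NVME_SSD
                 + inspur_optane_drives * TB_PER_OPTANE_SSD)

# Relative IOPS improvement of the Inspur result over the Huawei result.
iops_gain_pct = (inspur_iops / huawei_iops - 1) * 100

print(f"Inspur drives: {inspur_drives}, Huawei drives: {huawei_drives}")
print(f"Raw capacity: Inspur {inspur_raw_tb:.1f}TB, Huawei {huawei_raw_tb:.1f}TB")
print(f"Inspur IOPS gain over Huawei: {iops_gain_pct:.1f} per cent")
```

On these numbers Inspur used the same number of drives and less raw flash capacity than Huawei, yet scored roughly a tenth more IOPS, which is consistent with the Optane drives making the difference.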

A chart shows SPC-1 ranking by IOPS.

Fujitsu was first to push performance past 7.5 million IOPS, but Huawei blew that away with its all-NVMe SSD system. And now Inspur has used Optane acceleration to overtake Huawei.

There are no Western suppliers in the top 10 SPC-1 rankings:

This benchmark is largely a game played by Chinese suppliers now. 

Plotting the top results in a 2D space defined by IOPS and $/IOPS we can’t see a general trend of lowered cost per IOPS as performance increases:

The chart shows that Inspur systems do tend to be lower cost than Huawei ones.

Will Western suppliers of external storage arrays — Dell, HPE, Infinidat, NetApp, Pure and VAST Data for example — bother with SPC-1 benchmarks anymore? There is little to no indication that their marketing of product performance needs an SPC-1 boost. And the SPC-1 test does not reflect NVMe-over-Fabrics technology, nor NVIDIA’s GPU Direct for that matter. In this sense it is yesterday’s benchmark and needs replacing.

Having said that, this is a great result for Optane — Intel will be pleased.

Intel sees CXL as rack-level disaggregator with Optane connectivity

Intel foresees the CXL bus enabling rack-level disaggregation of compute, memory, accelerators, storage and network processors, with persistent memory on the CXL bus as well.

This was revealed when Intel presented a keynote pitch on the Compute Express Link (CXL) at the IEEE Hot Interconnects event.

CXL is based on the coming PCIe Gen-5.0 bus standard to interconnect processors and fast, low-latency peripheral devices.

Intel’s presenter was its Fellow and Director of I/O Technology and Standards, Dr Debendra Das Sharma, and he started his session by looking at Load-Store IO. This form of IO (loading and storing data into memory locations) is relevant because server memory capacity needs are rising due to the basic requirement to compute more data faster in AI, machine learning and other data-intensive applications such as genomics and big data analytics.

Load-Store IO is much faster than network IO, which transfers packets or frames of data, but it is typically limited to taking place inside a server using CPU-level interconnect. Das Sharma said Load-Store IO physical layer (PHY) latencies are less than 10ns, whereas fast networking PHY latencies range from more than 20ns to more than 100ns.

The aim is to extend Load-Store IO out of the server, and the way to do that is to use the PCIe Gen-5 bus as a base. The memory in connected devices can then be treated as cached, write-back memory by the server processor, and not need a DMA data transfer to move data between devices and the physical server CPU-attached memory. An Intel slide shows this: 

Das Sharma mentioned three usage models for such CXL bus systems: 

  • Type 1 — Caching Devices and Accelerators accessed via network interface cards (NIC);
  • Type 2 — Accelerators (GPU, FPGA) with their own memory, such as HBM;
  • Type 3 — Memory buffers for memory bandwidth and capacity expansion.

PCIe Gen-5 is fast enough to support server access to a pool of DDR5 DRAM across the CXL interconnect, and have it treated as usable DRAM by a server CPU. This, Das Sharma said, is poised to be an industry game-changer. It decouples compute from the traditional DIMM memory bandwidth and capacity limitations.

In the future NVDIMM (non-volatile DIMMs or persistent memory) could move to CXL, with DRAM backed up by storage-class memory (SCM) or NAND.

There could be computational storage devices: storage drives with on-board processors, memory and caching. They would do on-drive compression, encryption, RAID, key:value store compaction, search or vector processing for AI/ML applications. They would have a DMA engine for moving data and use PCIe services such as NVM-Express.

Das Sharma says CXL enables systems to scale with heterogeneous processing and memory, with a shared cacheable memory space accessible to all using the same mechanisms.

In fact there could be a cluster-wide memory tier — a byte-addressable data store, scalable to petabytes:

CXL would provide a Load-Store IO fabric at rack-level, disaggregating compute, memory pools, accelerator pools (GPU, FPGA), storage pool and network processing units across racks. 

A June 2021 version of Das Sharma’s presentation given at the EMEA Storage Developer Conference can be seen on YouTube.

Comment

The only SCM made by Intel is Optane. Das Sharma is talking about having Optane NVDIMMs hooked up to the CXL interconnect and having that capacity accessed by servers across the CXL link. He says persistent memory would then be cacheable, multi-headed for failover, and hot-pluggable.

As there is a 150+-member CXL consortium, we might expect other vendors’ SCM products to play nice in this space as well — such as potential ones from Micron, Samsung and others.

We think Das Sharma’s eventual rack-level disaggregation with CXL has a composability angle with sets of compute, memory, accelerator, storage and network processors dynamically composed to run specific application workloads.

Rubrik teams with Microsoft to develop Zero Trust anti-ransomware Azure services

Microsoft has made an investment in Rubrik as part of a strategic deal involving co-development of Zero Trust anti-ransomware services in the Azure public cloud with joint go-to-market activities.

The two companies will provide Microsoft 365 and hybrid cloud data protection and integrated cloud services on Azure. Zero Trust is the rejection of the idea that a network end-point location or user is trustworthy in itself. Every access session has to be validated for legitimacy and access level change, and every user authenticated.

Bipul Sinha.

Bipul Sinha, Rubrik CEO and co-founder, told us in a Zoom briefing: “The sky is the limit in terms of how we can help our customers’ digital transformation.” 

Microsoft and Rubrik will ensure Microsoft 365 data is secure, discoverable, and always accessible in the case of a ransomware or other malicious attack, accidental deletion, or corruption. They will provide long-term archival of Microsoft 365 data for regulatory compliance purposes.

Rubrik also offers additional Microsoft 365 facilities — including instant search and restore, and policy-based management at scale.

Adding trust to Azure

As part of its Zero Trust approach, Rubrik-stored data is natively immutable and may not be modified, encrypted, or deleted by ransomware. Joint Azure-Rubrik customers will be able to recover their data after an attack and avoid paying any ransom.

The companies say that applications such as SAP, SQL, Oracle, VMware, and enterprise NAS workloads can tightly integrate protection and automation with Azure.

Sinha said: “Together with Microsoft, we are delivering tightly integrated data protection while accelerating and simplifying our customers’ journey to the cloud.”

Nick Parker, Corporate VP for Global Partner Solutions at Microsoft, said: “Customers, across industries, are migrating to the cloud to drive business transformation and realise growth … We believe that integrating Rubrik’s Zero Trust data management solutions with Microsoft Azure and Microsoft 365 will make it easy for customers to advance their Zero Trust journey and increase their digital resilience.”

If customers perceive that their data in Azure is not safe against ransomware attacks, they will stop migrating to Azure. Rubrik is Microsoft’s anti-ransomware weapon and ally to make sure that doesn’t happen.

The amount of Microsoft’s equity investment in Rubrik wasn’t disclosed. A Bloomberg report suggests it was in the low tens of millions of dollars, at a $4 billion valuation.

Comment

Microsoft has literally hundreds of thousands of enterprise customers, a few thousand of which are also Rubrik customers. Microsoft hasn’t disclosed much about the size of its Azure customer base, but one source reckons it is growing at 120,000 customers per month, and another has data on 204,306 Azure business customers. Let’s say it’s between half a million and a million customers — the potential for Rubrik penetration of that base is clearly immense. 

In a way, this Rubrik deal with Microsoft can be seen as its response to Cohesity’s relationship with AWS.

Rubrik can present itself as Microsoft’s recommended and trusted partner for protecting customers’ data against ransomware and other threats in the Azure cloud. That is a huge, huge win for Rubrik.

Rambus HBM subsystem more than doubles HBM2E speed

Extended high-bandwidth memory (HBM2E) is barely here, yet Rambus already has a third-generation HBM subsystem ready for use, and it goes more than twice as fast.

Server and GPU memory capacity and speed are set to rocket up from today’s socket-connected X86 server DRAM. NVIDIA GPUs have already abandoned that and are using HBM: stacked memory dies connected to a physically close GPU using an interposer instead of socket channels.

We can view an HBM+interposer+processor system as using chiplets, with the HBM part being one chiplet, and the processor another, both connected via the interposer.

HBM stacks memory dies one atop another to increase package capacity beyond a DRAM DIMM and uses a set of wide channels to break socket connectivity limitations. Intel’s Ice Lake generation of server processors supports eight memory channels per socket, up from six in the preceding second-generation Xeon. That’s not enough for multi-GPU servers and for developing analytics and machine learning models, which need huge amounts of data stored in memory for fast and repetitive access.

The JEDEC standards organisation has created three HBM standards: HBM, HBM2 and HBM2E, with E standing for extended. Each one of these increases performance and capacity beyond the prior generation, as a table shows:

This table is a work-in-progress with missing entries.

HBM3

The memory fabrication industry is pushing its technology boundaries beyond HBM2E with a third HBM generation: HBM3. There is no JEDEC standard for this as yet, although we understand JEDEC is working on one.

There are three HBM memory die suppliers: Micron, Samsung and SK hynix. SK hynix has discussed its HBM3 technology. Micron talked about “HBMnext” in August last year, saying it could have 4-Hi and 8-Hi stacks and a 3.2Gbit/sec data rate. That sounds like an HBM2E-class technology.

Samsung has ideas about putting processing elements directly into memory (PIM) to speed computation, and details of this technology are yet to emerge. It would be proprietary to Samsung. We have yet to hear from Samsung about its views on HBM3.

Rambus

Rambus has produced an HBM3 interface subsystem consisting of a PHY (physical layer in OSI model) and digital controller. This subsystem supports an up to 8.4Gbit/sec data rate, way beyond SK hynix’s 5.2Gbit/sec HBM3 data rate, as seen in the table. It delivers 1.075TB/sec of bandwidth, beating SK hynix’s 665GB/sec.
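The bandwidth figures follow directly from the per-pin data rates if we assume each HBM stack presents a 1,024-bit-wide interface. That width is not stated above; it is our assumption, based on the interface width used by HBM stacks to date. A minimal sketch, with a function name of our own choosing:

```python
def stack_bandwidth_gb_per_sec(pin_rate_gbit_per_sec: float,
                               interface_width_bits: int = 1024) -> float:
    """Peak bandwidth of one HBM stack in GB/sec.

    pin_rate_gbit_per_sec: data rate per pin, in Gbit/sec.
    interface_width_bits: assumed stack interface width (1,024 bits is
    an assumption, not a figure taken from the article).
    """
    # Total bits per second across the interface, divided by 8 bits per byte.
    return pin_rate_gbit_per_sec * interface_width_bits / 8


rambus_gb_per_sec = stack_bandwidth_gb_per_sec(8.4)
sk_hynix_gb_per_sec = stack_bandwidth_gb_per_sec(5.2)

print(f"Rambus at 8.4Gbit/sec per pin: {rambus_gb_per_sec:.1f}GB/sec")
print(f"SK hynix at 5.2Gbit/sec per pin: {sk_hynix_gb_per_sec:.1f}GB/sec")
print(f"Ratio: {rambus_gb_per_sec / sk_hynix_gb_per_sec:.2f}x")
```

Under that assumption the 8.4Gbit/sec rate works out at about 1,075GB/sec (1.075TB/sec) and the 5.2Gbit/sec rate at about 665GB/sec, matching the numbers quoted.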

High Bandwidth Memory diagram, with PHY elements.

The Rambus PHY+Controller combo is accompanied by 2.5D interposer and package reference designs to help memory system designers develop products faster. 

An IDC associate VP, Soo Kyoum Kim, provided a context quote for Rambus’s announcement: “The memory bandwidth requirements of AI/ML training are insatiable with leading-edge training models now surpassing billions of parameters. The Rambus HBM3-ready memory subsystem raises the bar for performance enabling state-of-the-art AI/ML and HPC applications.”

Rambus HBM3 PHY+controller diagram.

There are huge implications for overall system performance and design here. We could, possibly, see the replacement of socket-connected DRAM DIMMs by HBM. We could see HBM as a new, faster-than-basic-DRAM tier in the memory-storage hierarchy. Our sister publication The Next Platform has suggested that Optane could then replace DRAM in the hierarchy.

We could then see X86-GPU servers with extremely large memories capable of running analytical and machine learning model development workloads orders of magnitude faster than today. The Rambus and SK hynix HBM3 technology glimpses suggest a tantalising future.

Lenovo claims global number-2 position in mainstream storage

Server and PC manufacturer Lenovo says it has secured the second-place ranking in worldwide mainstream storage supply. Has it really?

The claim was made during the company’s first fiscal 2022 quarter earnings release, in which it reported revenues of $16.93 billion, up 27 per cent year-on-year, while its profits rose 119 per cent to $466 million from the year-ago $213 million.

The company is split into three divisions:

  • Solutions and Services Group  (SSG) — revenues of $1.18B, up 38 per cent year-on-year with an operating profit of $264M, up 51 per cent.
  • Infrastructure Solutions Group (ISG) — revenues of $1.84B, up 14 per cent, and slightly improved operating loss of $11M compared to last year’s $60M loss.
  • Intelligent Devices Group (IDG) — revenues of $14.666B, up 28 per cent, and $1.1B operating profit, up 43 per cent.

IDG is a PC and smartphone business while SSG is the services business. ISG is Lenovo’s old datacentre group, which sells ThinkAgile and ThinkSystem server and storage products, with some storage products being OEM’d from NetApp. Its markets include supercomputing/HPC, enterprises, and small and medium businesses.

Lenovo says that ISG recorded its best overall results for five years and investments it has made are paying off. ISG has outperformed the market for six consecutive quarters, and is now close to turning profitable. 

Lenovo ISG Outlook from Q1 fy 2022 earnings presentation.

There was record revenue from cloud service providers (CSPs) who bought servers and storage. Lenovo recorded its highest revenue for five years from its enterprise, small and medium business (ESMB) customers. Its high-margin storage, HPC and Hybrid Cloud segments grew faster than the market with record revenues.

ISG, Lenovo says, is number three on X86 server sales, following leader Dell Technologies and second-placed HPE. It also claimed to be number two in mainstream storage worldwide, and backed it up with this slide:

Lenovo ISG summary from Q1 fy 2022 earnings presentation.

What does “mainstream storage” mean? Lenovo doesn’t say. Nor does it quote any research analyst numbers to back up its claim.

Gartner and IDC views

In Gartner’s November 2020 Magic Quadrant for Primary arrays Lenovo was classed as a Challenger, behind DDN. It was behind all the suppliers in the Leaders quadrant — meaning Pure Storage, NetApp, Dell, HPE, IBM, Huawei, Hitachi Vantara and Infinidat.

Lenovo did not appear at all in Gartner’s October 2020 distributed file systems and objects MQ.

An IDC Worldwide Quarterly Enterprise Storage Systems Tracker in June this year did not include Lenovo in its top five list of storage suppliers — Dell, NetApp, HPE, Hitachi and Huawei.

It’s possible Lenovo is counting its acting as an ODM (Original Design Manufacturer) for cloud service providers (CSP) and hyperscaler customers as its storage business. That business would not be visible in the Gartner and IDC enterprise storage systems reports. Lenovo’s ODM business is, in fact, substantial.

A Lenovo ISG earnings presentation slide shows the split between CSP and enterprise and SMB customers:

On average, it looks to be roughly a 60:40 split in favour of enterprise and SMB customers, with the Q1 fy2022 CSP revenues being above $800 million. We do not know what proportion of this is storage.

Dell’s latest quarterly storage revenues were $3.8 billion, NetApp’s $1.56 billion and HPE’s $1.14 billion. A $1.835 billion quarterly ISG revenue number for Lenovo is split between servers and storage, implying that its storage revenue must be less than $1.835 billion.

For Lenovo to claim the number-two storage sales slot by revenue implies its quarterly storage revenue was below Dell’s $3.8 billion and above NetApp’s $1.56 billion. Therefore, by our crude comparative math, and if Lenovo’s claim is real, Lenovo’s Q1 fy2022 storage revenues were between $1.56 billion and $1.835 billion. That is impressive, if true.
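The crude comparative math can be written out explicitly. This sketch uses only the revenue figures quoted above and assumes, as the argument does, that "number two" means ranked second by quarterly storage revenue. The variable names are ours.

```python
# Latest quarterly storage revenues quoted in the article, in $ billion.
rival_storage_revenue_bn = {"Dell": 3.8, "NetApp": 1.56, "HPE": 1.14}

# Lenovo ISG revenue covers servers and storage, so it caps storage revenue.
lenovo_isg_revenue_bn = 1.835

ranked = sorted(rival_storage_revenue_bn.values(), reverse=True)
leader_bn, runner_up_bn = ranked[0], ranked[1]

# To be number two, Lenovo must sit below the leader and above the
# current runner-up; it also cannot exceed total ISG revenue.
lower_bound_bn = runner_up_bn
upper_bound_bn = min(leader_bn, lenovo_isg_revenue_bn)

claim_is_feasible = lower_bound_bn < upper_bound_bn

print(f"Implied Lenovo storage revenue: ${lower_bound_bn}bn to ${upper_bound_bn}bn")
print(f"Implied minimum storage share of ISG revenue: "
      f"{lower_bound_bn / lenovo_isg_revenue_bn:.0%}")
```

The second figure is the telling one: for the claim to hold on these numbers, storage would have to account for roughly 85 per cent or more of ISG revenue, leaving very little for servers.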

We have asked Lenovo’s investor relations and public relations people what the “mainstream storage” term means and who it is supplying. So far it has been unable to reply. We are really looking forward to finding out.

WekaIO, Tesla and Hitachi Vantara

WekaIO’s President sees the company as the Tesla of storage suppliers, and says OEM Hitachi Vantara is making inroads into the Dell EMC Isilon customer base as Weka crosses the chasm between it and general enterprise use.

WekaIO’s scalable, parallel and high-performance filesystem software has made its name in high-performance computing and become popular in enterprises that have HPC use cases — such as AI, machine learning, and genomics. It’s now set to cross over into more general enterprise file workloads.

BMW motorcycle-riding Jonathan Martin became WekaIO’s President this month. He had previously been the Chief Marketing Officer at Hitachi Vantara, serving from March 2019 to May 2021. He was the CMO at Pure Storage before that, and the CMO at EMC before that. During his time at Hitachi Vantara, in July 2020, Hitachi V signed up to OEM WekaIO’s filesystem software, with the Weka software integrated with the Hitachi Content Platform (HCP).

Jonathan Martin astride his BMW steed.

That product supports NFS and SMB file access, S3, and tiers data to public cloud object stores. The hardware can be hybrid flash/disk or all flash, and the HCP software can be used in virtual appliances or — supporting S3 only — in AWS. Weka’s software supports S3 and tiering to the cloud as well.

Martin told us about his perception of WekaIO’s market situation in a briefing last week, and likens Weka to Tesla. Before Tesla, he says, the automobile industry had been in a prolonged period of incremental development, with no fundamental technology changes. Tesla changed all that by developing a battery-powered chassis and software-augmented driving and ownership experience.

Tesla’s first car was a roadster with phenomenal performance, and then it broadened its product set with the up-market S, the mid-range Model 3, the X SUV and, latterly, a truck. Tesla became the world’s most valuable car company in stock market capitalisation terms in June last year. All other car manufacturers are following in Tesla’s footsteps and trying to catch up.

Martin sees Weka’s phenomenally fast and scalable Matrix filesystem software demonstrating a Tesla-like level of innovation in the staid storage industry, where incremental development has been the norm for decades.

Chasm-crossing

Tesla used the Roadster as an attention-grabbing product and then began crossing the chasm to ordinary, everyday sedan acceptance with the upmarket S. Once it landed with this it expanded its market with the Model 3, all the while riding the oncoming environmental wave of non-polluting electric vehicles.

Martin sees WekaIO crossing the chasm from specialised HPC use cases to general commercial use by riding the wave of HPC-style or class file storage use cases needed for AI, machine learning and data-centric analytical processing. Customers buy Weka for one use-case and then find that, of course, its software can be used for other file-using applications — and give them a speed boost as well.

A startup storage software vendor faces obstacles penetrating the general enterprise market, where large incumbent suppliers have sold filers for a long time — such as Dell EMC’s Isilon and NetApp. IBM, with its parallel Spectrum Scale software, is not as present in the enterprise market as Dell EMC and NetApp and it, like WekaIO, is trying to broaden its enterprise appeal on the back of AI and machine learning use cases and NVIDIA GPUDirect support.

Qumulo, like WekaIO, is a filesystem startup, but is positioned as a lower-performing product. It is making enterprise progress and is targeting the Isilon and NetApp customer bases. Its founders are ex-Isilon people, so it can speak to the Isilon customer base with a strong voice.

Martin sees WekaIO as having fundamentally better software but, in our opinion, it is positioned as an HPC specialist and needs partners’ help to get into the general enterprise market. Step forward Hitachi Vantara as an on-ramp and bridge into that market.

Hitachi Vantara

Hitachi Vantara is an enterprise-focussed storage array, filer and object storage supplier with a focus on being a team player in the Hitachi industrial group — which spans power generation, trains, making parts for automobile manufacturing, smart cities and industrial automation.

Martin said that the Hitachi Vantara OEM deal with WekaIO is hugely successful, with dramatic takeouts of large Dell EMC Isilon customers — think Global 200. He wouldn’t name them, saying they did not want to publicise any competitive advantage they were gaining. 

As an ex-Hitachi Vantara CMO he is well aware of Hitachi V’s capabilities and the best way to partner and help its sales force.

Take this at face value. Hitachi V is pulling off Isilon takeouts using WekaIO. Those customers will have done competitive bake-offs pitting WekaIO against Dell EMC’s PowerScale (Isilon successor), Qumulo, probably NetApp and other filesystem suppliers. If WekaIO can win a series of these and, having landed such customers, grow its use cases, then it is set for general enterprise penetration.

Comparing WekaIO to Tesla is setting a high bar, a very high bar. It implies that WekaIO’s filesystem software sales are going to grow and grow, overtaking those of other suppliers.

The firm has raised $66.7 million in funding, a small amount compared to other software startups such as Cohesity ($660M), Rubrik ($552M) and Druva ($475M). Qumulo itself has raised $351 million. Weka last raised cash in 2019, and may return to its investors for more cash in the future.

There are several strategic investors amongst Weka’s backers: Hewlett-Packard, NVIDIA, Qualcomm, Seagate, and Western Digital. That could help attract other investors to a potential future round.

BMW DC Roadster electric motorbike concept.

WekaIO has to grow at a high and sustained rate to justify Martin’s Tesla comparison. And that means it has to make progress in the general enterprise market. If Hitachi Vantara’s experience is repeated elsewhere in WekaIO’s channel, then the Tesla resemblance could become real, and Martin could afford a new motorcycle. Perhaps even a coming BMW DC Roadster with a driveshaft instead of a chain.

Etoro takes the Silk road to Azure

Silk’s accelerated storage IO in the public cloud can make lifting and shifting non-cloud-native databases and other workloads to the cloud straightforward, and have them operate much faster and with sub-millisecond latency.

Online trader eToro, which started up in 2007, has grown to more than 20 million customers hitting its on-premises data centres with trading requests. These data centres couldn’t keep up, and eToro decided to move its databases and applications across to the Azure cloud and take advantage of its ability to scale quickly.

It considered doing so using PaaS (Platform-as-a-Service), but that would have required eToro to change its database and application code, adding middleware, to use the underlying Azure platform facilities. This was unrealistic.

The alternative was IaaS (Infrastructure-as-a-Service), with Azure server, network, operating systems, and storage facilities presented virtually. This was much faster to implement, but the resulting performance, particularly of SQL Server, wasn’t good enough.

Israel Kalush, VP of Engineering, eToro, said: “While some applications can be migrated easily into the cloud, others — especially ones that require high throughput IO with very low latency — are more complicated and require significant redesign. Significant redesign is expensive and, therefore, less likely to take place.”

Silk provides virtual storage array services in the Azure cloud, developed from its on-premises Kaminario all-flash array code base. The cloud-native software delivers high-speed storage IO by using, and protecting, Azure’s fast and unprotected ephemeral OS disks, which incur no storage cost. In effect, it is a database acceleration layer of software.

Etoro video describing Silk use.

The Silk code provides compression, zero-footprint clones, and inline deduplication. Silk says its users are then able to reduce the amount of cloud resources they need, and cut cloud costs by around 30 per cent.

eToro decided to use this Silk storage layer between its software and the Azure facilities, and found its software ran up to ten times faster on Azure than without Silk. Using Silk made its Azure incarnation capable of supporting hundreds of thousands of database transactions a second at low latency.

Kalush said: “We have some extremely IO-intensive databases. Silk was the only provider that actually promised and delivered on sub-millisecond latency for those database applications.” eToro has found that this low response time is maintained under heavy loads.

It also says its Azure adoption time was cut in half by using Silk, obviating the need to refactor its code. Check out a video to find out more.

Weekly digest of storage news featuring IBM, Dell, Box, Toshiba, NVIDIA and Kioxia plus a flotilla of smaller items

The themes this week are containers, flash, flash arrays, enterprise file-based collaboration and disk drives. Everything seems to be selling more, growing more, shipping more, developing more — except Box revenues, which is why activist investor Starboard Value is upset. Really upset.

IBM developing OpenShift storage product

IBM’s Red Hat acquisition has led to its deciding to build a storage product around OpenShift, Red Hat’s Kubernetes orchestrator software. An IBM job-spec for a software product management team reads: “This position will be leading a new OpenShift focused Software Defined Storage product to market. As Product Manager, you will work cross-functionally within IBM and Red Hat teams to drive the product success; coordinate roadmap priorities across IBM stakeholder Brand teams; and play a key role in managing third-party relationships. The ideal candidate for this role will have considerable technical and market understanding of OpenShift and Container workloads.”

We might expect a late 2022/early 2023 product announcement.

IDC publicises Dell AFA success

IDC has charted five AFA suppliers’ revenue rises since their products were launched. Dell has made the chart public: 

It shows the suppliers’ revenues normalised to months following product launch, not in real calendar time. Dell is by far the leader — with its lead growing — followed by NetApp and then a group of three: HPE, Pure Storage and IBM.

It doesn’t tell us anything we don’t know, but it would be most excellent if VAST Data could be added to the slide. Perhaps Jeff Denworth, a co-founder and its chief marketeer, could oblige?

Starboard declares war on Box board

Activist investor Starboard Value has had enough of delay, obfuscation and alleged deception at file-sharer and collaboration company Box, by its board and CEO Aaron Levie. It says Box has consistently failed to deliver on vital business targets whilst the self-interested board entrenches its own interests at the expense of shareholders.

Starboard has gone public with its concerns, appealing to shareholders via an extraordinary 184-slide PowerPoint presentation — diatribe, really. It’s literal death by PowerPoint, with the intended victims being Box board members. 

Reading between the lines it seems to us that Starboard wants co-founder and CEO Aaron Levie gone.

The deck repeats itself again and again, almost as if saying when you have a hammer then (a) everything is a nail, (b) get another hammer, and (c) get a third one to really get that nail hammered down. The points it makes do seem quite powerful. Read it yourself and see.

Toshiba disk ship data

Toshiba posted a massive 30 per cent quarter-on-quarter rise in nearline disk drive shipments in the second 2021 quarter, with 2.79 million units and 34.25 EB in capacity shipped — sequential growth of 32 per cent and 33 per cent respectively.

According to TRENDFOCUS, Toshiba’s four-quarter average unit and capacity shipped growth rates for nearline HDDs led all companies at 35 per cent and 46 per cent, respectively — higher than the industry average and showing Toshiba is taking percentage revenue market share from competitors Seagate and Western Digital.

Total nearline HDD shipments topped 19 million units in 2CQ21, compared to 17.19 million set in 2CQ20. Nearline HDD exabytes shipped reached 243 EB for the quarter.
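
As a rough cross-check, the figures quoted above imply an average nearline drive capacity and an approximate Toshiba share of the quarter's nearline shipments. The sketch below is our own back-of-envelope arithmetic, not a TRENDFOCUS calculation. It assumes decimal units (1 EB = 1,000,000 TB) and treats the "topped 19 million" industry total as exactly 19 million, so the unit share it prints is an upper bound.

```python
# Back-of-envelope arithmetic on the nearline HDD figures quoted in the text.
TB_PER_EB = 1_000_000  # decimal units (our assumption)

toshiba_units_m = 2.79    # million nearline drives shipped by Toshiba
toshiba_eb = 34.25        # exabytes shipped by Toshiba
industry_units_m = 19.0   # "topped 19 million", treated as exactly 19 (assumption)
industry_eb = 243.0       # industry nearline exabytes for the quarter

# Average capacity per Toshiba nearline drive, in TB
avg_tb_per_drive = (toshiba_eb * TB_PER_EB) / (toshiba_units_m * 1_000_000)

# Toshiba's approximate share of units and of capacity shipped
unit_share_pct = 100 * toshiba_units_m / industry_units_m
capacity_share_pct = 100 * toshiba_eb / industry_eb

print(f"Average capacity per drive: {avg_tb_per_drive:.1f} TB")
print(f"Unit share (upper bound):   {unit_share_pct:.1f}%")
print(f"Capacity share:             {capacity_share_pct:.1f}%")
```

On these inputs the average comes out at a little over 12TB per drive, with Toshiba at somewhat under 15 per cent of units and about 14 per cent of exabytes shipped.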

Toshiba is the third-ranked supplier in the industry, but it is making progress and not being left behind.

NVIDIA GPUDirect limitations

GPU supplier NVIDIA has listed quite a long set of known limitations for its GPUDirect Storage (GDS). For example:

  • The RTX series of GPUs supports only compatibility mode.
  • With any NVIDIA software components installed, downgrading from RHEL 8.4 to RHEL 8.3 is not supported.
  • For DDN EXAScaler, checksum is disabled in the read/write IO path.
  • For WekaFS, checksum is disabled in the read/write IO path.
  • cuFile APIs are not supported with applications using the fork() system call.
  • GDS Compatibility mode works on GDS qualified file systems — EXAScaler, ext4, IBM Spectrum Scale, VAST, and WekaFS.
  • GDS with IOMMU enabled or ACS enabled are not guaranteed to work functionally or in a performant way with all non-DGX based platforms.

IBM Spectrum Scale limitations are documented here.

The presence of these limitations is not that surprising in a complex new product with multiple third parties integrating their products with GDS. Expect them to be sorted fairly quickly.

Kioxia’s UFS performance numbers kept secret 

Kioxia has announced new UFS v3.1 memory cards using its 5th generation BiCS flash — meaning 112-layer flash, thought to be TLC (3 bits/cell). There are 256GB and 512GB capacities in 0.8 and 1.0mm-high packages respectively. Kioxia will not publicly reveal performance numbers. We asked, and were told they were only available under a non-disclosure agreement.

All Kioxia publicly says is that the new drives improve performance by 30 per cent for random reads and 40 per cent for random writes.

We calculated sequential speeds from a previous Kioxia UFS product — a UFS v2.1 product from December 2017 — and applied Kioxia’s percentage uplifts with subsequent generations. We also noted the announcement in March of a 1TB 112-layer UFS v3.1 drive being sampled by Kioxia, which did have sequential performance numbers supplied. This data was tabulated:

On this basis we think that Kioxia’s latest 256/512GB 112-layer UFS drives output around 2GB/sec sequential read bandwidth and 1GB/sec sequential write bandwidth. That’s just our guesstimate.

Shorts

Civo, a cloud-native service provider, has a Civo Academy — a free, full Kubernetes learning programme, consisting of over 50 videos from in-house developers at Civo. It is available now and needs email registration to access. 

The ninth Coldago Research Storage Unicorn Note edition June 2021 lists 15 private storage companies with a $1 billion valuation minimum. It includes companies belonging to private equity firms, i.e. not all of them are startups. Companies listed in alphabetic order are: Acronis, Barracuda Networks, Cohesity, DDN, Druva, Infinidat, Kaseya, Nasuni, OwnBackup, Qumulo, Rubrik, VAST Data, Veeam Software, Veritas Technologies and Wasabi Technologies. Will WekaIO join this list?

Cloud data protector Datto reported Q2 CY2021 revenues of $151.6 million, up 22 per cent year-on-year, with profits of $16.9 million, up 93 per cent year-on-year. CEO Tim Weller said: “Our second quarter results mark one of the strongest quarters in our history and are a clear indication of the power of the MSP model”. Subscription revenue growth accelerated to 21 per cent year-over-year and it added 500 net new partners in the quarter. Datto has new product launches in cloud and security planned for the second half of 2021. 

Data warehouse opponent Dremio has released a survey saying data warehouse users are dissatisfied. It says “A staggering 94 per cent of data leaders voice serious concerns over data warehouses and only 22 per cent saw a full return on investment. In order to run analytics, enterprises are making multiple copies of their data — 12 copies on average.” Dremio says they should use a data lake instead.

FileShadow announced an iOS smartphone app that connects data repositories from the cloud — such as Box, Dropbox, Google Drive, iCloud and Slack — with local storage — macOS, Windows Desktops, Windows Virtual Desktops — and network and direct-attached storage devices. Users can manage collections, apply tags, view and publish files, and manage their accounts from the app. With machine learning, data can be searched or organised based on file content, OCR results, GPS location or image analysis. A search for sailing can find images with a sailboat or the word “sailing” in a document. Check Apple’s App Store for a copy. 

Chinese server supplier Inspur and market researcher Omdia have released an Open Computing White Paper at the third OCP China Day 2021 summit meeting. Omdia predicts that 40 per cent of servers worldwide will be based on open standards by 2025. The paper takes a look at computing’s environmental impact. Download the paper here.

Micron announced the launch of its Crucial P5 PCIe NVMe SSD for the consumer PC market. It uses Micron’s latest 176-layer 3D NAND, enabling lower power consumption, higher speeds and greater density than the prior 96-layer product. The P5 delivers up to 6600MB/sec sequential read speeds (nearly double the prior generation) as well as 66 per cent faster sequential write speeds, 67 per cent faster random read and 40 per cent faster random write speeds than the prior Crucial Gen-3 (96-layer) SSDs.

Model9, which produces VTL software writing to on-premises or public cloud object storage, announced the availability of its Cloud Data Manager for Mainframe in the Azure Marketplace.

Phison has introduced its PS7101 PCIe 5.0 Redriver integrated circuit (IC), claimed to fix signal attenuation and signal noise caused by the transmission process on the motherboard or riser card in PCs and servers. The PS7101 has high gain and high linearity, which counterbalance attenuation and noise in the motherboard. Phison is also developing a PCIe 5.0 Retimer IC to suit different product environments in addition to the PCIe 5.0 Redriver IC.

Semiconductor IP developer Rambus has an HBM3-ready memory interface subsystem consisting of a fully-integrated PHY and digital controller supporting data rates of up to 8.4Gbit/sec and 1.075TB/sec of bandwidth, more than double that of high-end HBM2E memory subsystems. Rambus has interposer and package reference designs to speed customers’ products to market.
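
The two Rambus figures are consistent if the per-pin data rate is multiplied across the full width of the memory interface. The sketch below is our own arithmetic and assumes a 1,024-bit-wide interface, which is the usual width for an HBM stack but is not stated in the announcement.

```python
# Relating the per-pin data rate to the aggregate bandwidth quoted by Rambus.
per_pin_gbit_s = 8.4         # per-pin data rate from the text, in Gbit/sec
interface_width_bits = 1024  # assumed HBM interface width (not given in the text)

# Aggregate bandwidth: pins x rate, converted from bits to bytes
bandwidth_gb_s = per_pin_gbit_s * interface_width_bits / 8
bandwidth_tb_s = bandwidth_gb_s / 1000

print(f"{bandwidth_gb_s:.1f} GB/sec = {bandwidth_tb_s:.3f} TB/sec")
```

Under that assumption the result is about 1,075GB/sec, in line with the 1.075TB/sec figure quoted.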

Redis Labs, which supplies an open-source in-memory NoSQL database, is rebranding to just Redis. This will not affect the licensing of open source Redis, which will continue to be BSD licensed, nor the governance model, which was introduced last year.

Scality’s RING object storage software running on HPE 4510 Apollo servers is being used by Australian broadcast service vendor MediaHub. It’s using this to offer Storage-as-a-Service aimed at the Australian broadcast market. HPE and Scality will be supporting ArkHub — MediaHub’s new low-cost data storage service — providing customers with storage and access to their expanding archives, without additional ingress, egress and retrieval fees.

SIOS announced the GA release of its Protection Suite for Linux v9.5.2 clustering software, with enhanced automation and application failover orchestration to make operating high availability (HA) clusters in complex SAP S/4HANA environments easier and more reliable. It also recently released SIOS Protection Suite for Windows v8.8.0 clustering software, making HA clustering in the cloud faster and easier.

Lend me your ears: Secondary storage systems supplier Spectra Logic says it’s starting up a podcast series called The Spectra Current. These will be fronted by its VP for Corporate Marketing, Betsy Doughty, and she bigs up the podcast, saying: “We’ve lined up a variety of remarkable guests who have agreed to share their unique stories and individual insights with our listeners.” The first episode, covering The History of Spectra Logic, is posted on the Spectra Logic web site. In it, Betsy says she could not be more excited. Excellent.

Talend’s Data Fabric provides data integration and governance with a trust score indicating a dataset’s reliability. It has been updated with high-performance integrations to leading cloud intelligence platforms, a self-service API portal, collaborative data governance capabilities, and private connections between Amazon AWS and Microsoft Azure to ensure data security.

Data warehouser Teradata announced strong adoption in the first half of 2021 of its Teradata Vantage cloud data analytics platform. It listed 20 customers including Sony Pictures Entertainment, Gap, Jupiter Networks, Kobe Steel and MGM Resorts. With Vantage, enterprise-scale companies can eliminate silos and query all their data, all the time, regardless of where the data resides — in the cloud using object stores, on multiple clouds, on-premises or any combination thereof.

TerraMaster U12 storage server.

Chinese supplier TerraMaster has launched its U12, a 12-bay storage server with a quad-core Xeon E-2224G processor and 8GB of DDR4 memory upgradeable to 64GB. There are four gigabit network interfaces and two PCIe Gen-3 x16 slots, with support for 10Gbit NICs and RAID cards. The U12 delivers up to 3000MB/sec with over 500,000 IOPS. The CPU is upgradeable.

Toshiba says its 18TB MG09 disk drives work with all common Adaptec host bus adapters (HBAs) and RAID adapters from Microchip.

Quantum hires Cisco exec as chief sales boss

On the back of reporting its first Q1 growth quarter since FY2017, Quantum has hired Cisco executive John Hurley to be its Chief Revenue Officer.

John Hurley

Hurley is — was — Cisco’s VP for the global commercial segment and that capped a 13-year stint at the networking business. Before that he spent three years with Dell as an area sales VP for the midwest and, prior to that, was at Cisco — again — as a client director looking after Ford. This is a large enterprise sales guy through and through.

Quantum CEO Jamie Lerner said: “The appointment of John Hurley demonstrates the scope of our ambition as a company. His experience working with the largest global enterprise, commercial, and service provider customers will prove invaluable as we accelerate our growth trajectory.” 

Hurley said of the move: “Quantum’s experience in helping clients orchestrate colossal amounts of video and unstructured data sets the company apart from the competition, and I look forward to building on this to generate growth in new markets around the globe.”

Lerner added: “We’re now supporting organisations in cloud services, government, media and entertainment, research, transportation, finance, and beyond to achieve their digital transformation goals. Not only in the markets we’ve traditionally served, but also in emerging areas that are increasingly harnessing the power of video and data to drive business forward.”

Comment

The appointment says that a big league enterprise sales exec at a blue chip company — Cisco — thinks it worthwhile taking a punt on Quantum, a much smaller company with a strong and unfolding recovery story following a troubled past.

It’s a tribute to Lerner that Quantum’s situation has improved so much that he can recruit execs of Hurley’s standing.

ChaosSearch growing fast amidst mountains of chaotic data

Data analytics startup ChaosSearch is growing like crazy as its analytics software enables more needles to be found in haystacks faster.

The company, which recruited IBMer Ed Walsh as its CEO a year ago, started out as a log data analytics business but is now expanding its remit to include more general, multi-modal analyses of data lakes stored in cloud object storage repositories.

It increased revenues 611 per cent year-on-year — a growth rate indicative of a low starting point — and has doubled its headcount this year and should double again by the end of the year. The headcount increases come off the back of a $40 million B-round fund raise in December 2020. 

ChaosSearch also tripled its customer base, which now includes organisations like Equifax, Blackboard, Klarna, Armor, Hubspot, and BAI Communications. 

Andreessen Horowitz paper

There is a paper produced by Andreessen Horowitz entitled Emerging Architectures for Modern Data Infrastructure. This nicely, if in a somewhat complicated way, positions data sources, ingests, lakes, warehouses, and analytical operations for newer data infrastructures concerned with data/workflow pipelines for analysing historical and operational data.

Here is the authors’ unified architecture diagram:

This diagram excludes transactional systems (OLTP), log processing, and SaaS analytics apps.

ChaosSearch would be positioned in the Historical and Predictive columns of this diagram and combine ad hoc queries and real-time analytics.

The paper’s authors point out that the data warehouse is the backbone of the analytics ecosystem, and it needs loading from sources with Extract/Transform/Load (ETL) procedures. Data lakes are the foundation of the operational ecosystem and take in raw data.

There is a trend of developing combined data lake/data warehouse functionality with a unified data representation. If analytics and search runs can take place on raw data, then ETL procedures can be sidestepped.

Indexing background

Thomas Hazel.

ChaosSearch CTO Thomas Hazel briefed us about its technology, which is built on a new way of indexing content. For content to be searched it has to be indexed, and an index supporting full text search is large. Hazel found a way of combining compression and indexing that is much more space-efficient than existing methods.

His method takes a source file and compresses and indexes it by separating out what is being compressed — data elements or symbols — from its position in a file.

Imagine a text file which looks like this: “The cat had black hair. The dog had brown hair, unlike the cat.”

We have two occurrences of “cat” — one starting at the fifth byte from the start of the file, and the other starting at the sixtieth byte from the start of the file, if my counting is correct. A single entry for cat would be found in the symbol file and there would be another entry in the position file looking something like this: <5><60>
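
To make the symbol/position split concrete, here is a toy sketch of the idea using the sentence above. It is only an illustration of separating symbols from their positions; it is not ChaosSearch's actual format or algorithm, and the `build_index` function and its word-level tokenising are our own assumptions. Offsets are 1-based to match the byte counts in the text, and the sample is plain ASCII so character and byte offsets coincide.

```python
import re
from collections import defaultdict


def build_index(text):
    """Toy model: return (symbols, positions) for a piece of ASCII text.

    symbols   -- each distinct word stored once (the "symbol file")
    positions -- word -> list of 1-based start offsets (the "position file")
    """
    positions = defaultdict(list)
    for match in re.finditer(r"[A-Za-z]+", text):
        # match.start() is 0-based; add 1 to match the article's counting
        positions[match.group()].append(match.start() + 1)
    symbols = sorted(positions)
    return symbols, dict(positions)


text = "The cat had black hair. The dog had brown hair, unlike the cat."
symbols, positions = build_index(text)

print(symbols)           # each word appears once, however often it occurs
print(positions["cat"])  # [5, 60]
```

Each word is stored once in the symbol list, while the position table holds only integers, which is what makes it amenable to the arithmetic-style operations Hazel describes.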

The overall space taken up by these two files can be up to 50 per cent less than with existing compression methods such as ZIP. That means that, when the files are loaded into memory for processing, less DRAM is needed.

The data file deliberately loses locality, but the index is lossless. Hazel says this makes data reduction (compression and deduplication) much more effective.

The position file is composed of integers, and mathematical operations, such as Join and Order, can be applied to them. He says: “By doing this, because locality is [now] so manipulable, the shape or schema can be altered easily. We can apply schemas on the fly. I can do transformations at petabyte scale.”

Hazel says that separate silos are no longer needed, such as ones having extracted, transformed and loaded (structured) data and others for raw, log data. “There are no separate silos. We have it all in ChaosSearch.”

He talked about the Snowflake data warehouse, saying: “it is great at searching relational data but it can’t do things very well with log data; it has no search … You can do Find operations in particular columns but you can’t do full text search.” You can do both with ChaosSearch.

Unified representation

He says ChaosSearch technology can search both operational log data and business intelligence (structured) data “and correlate log data to the BI database.” It’s a unified representation, as described in the Andreessen Horowitz paper.

ChaosSearch is adding relational data search to its product this year along with SQL access to log data.

It stores all its data in cloud object storage, meaning AWS S3, Azure blob and Google Cloud Storage. Hazel suggests a query run via ChaosSearch on a 9TB COS store could complete in seconds. “We’re highly optimised, index-optimised, for COS.”

Check out a white paper describing ChaosSearch’s technology in detail here.

Comment

The notion of combining real-time analytics with historical analyses and methods has also been adopted by Kyligence and by Ahana.

Dremio also has relevant technology. The whole area is a hotbed of development. We wonder if secondary data management companies like Cohesity and Rubrik will be forging links with analytics companies so that their customers can look deeper into all the petabytes of backup data they are amassing.

Cohesity’s worldwide sales boss takes a walk to ‘entirely different industry’

Michael Cremen, Cohesity’s Chief Revenue Officer, is quitting the company for a new opportunity, the details of which have not yet been disclosed.

Cohesity is a fast-growing and well-funded data management startup transitioning to a SaaS business model. The company has just reported a standout quarter. Founded in 2013, it was valued at $3.7 billion in March and has taken in $660 million in funding, with the most recent round bringing in $250 million last year.

Michael Cremen

Cremen was recruited as Cohesity’s first CRO in November 2019. He came from being a Veritas SVP for two years, responsible for sales, management, operations, and the financial performance of the Veritas business in the United States, Latin America, and Canada. Before that he had spent 16 years at Hitachi Data Systems, finishing up as EVP for global sales. There was a one-year stint as a portfolio exec at IBM Global Technology Services between the Hitachi and Veritas periods.

A Cohesity spokesperson said: “Michael Cremen … has decided to pursue a new opportunity in an entirely different industry. Michael will remain in his role for much of the quarter, managing the company’s Worldwide Field Organization and working closely with Cohesity’s executive team. In the meantime, an executive search is underway.”

He added this statement: “Michael has contributed in many ways to Cohesity’s ongoing success, including hiring or promoting industry veterans to lead the company’s sales and partner organizations globally. Under this dynamic leadership team, the company has continued to grow and report record-breaking results. With thousands of enterprises around the globe, including two of the top five Fortune 500, embracing Cohesity’s next-gen data management to mitigate threats from ransomware attacks and do more with data, Cohesity looks forward to onboarding another world-class CRO to manage an incredibly high performing team and take Cohesity’s exceptional performance to the next level.”

Cohesity will want to ensure that its sales momentum isn’t slowed by Cremen’s departure. We understand Cremen will formally leave Cohesity at the end of September.