
Catalogic partners with Backblaze to tackle ransomware and tape maintenance

LTO tape

Data protection vendor Catalogic has turned to Backblaze as a cloud destination to counter the twin evils of ransomware and expensive tape maintenance costs.

The two companies unveiled a partnership today that will see Catalogic’s DPX data protection suite serve up Backblaze’s B2 Cloud Storage as an “infinitely scalable” backup target. DPX provides block-level protection for physical and virtual servers, with Catalogic claiming it can reduce backup time by 90 per cent.

At the same time, Catalogic’s CloudCasa data protection, backup, and recovery service will support Backblaze B2 as a “low cost and secure” backup destination. The CloudCasa service is aimed at protecting Kubernetes, cloud databases, and cloud-native applications. In both cases, Backblaze will appear as an object storage option for backups.

The companies are pitching the tie-up as offering customers a massive speed premium over traditional backup, while also delivering a 75 percent reduction in costs compared to the competition.

That competition includes tape, with the newly minted partners’ statement saying: “Using Backblaze B2 for tape replacement… provides a significant cost reduction opportunity.” The DPX service uses S3 Object Lock to ensure backup and archive data is immutable.
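
The mechanics of S3 Object Lock are straightforward: an object is written with a retention mode and a retain-until date, and the store refuses to delete or overwrite it before that date. Below is a minimal, hypothetical sketch in Python using boto3 against an S3-compatible endpoint; the endpoint, bucket name, credentials, and 30-day retention period are illustrative assumptions, not details of the Catalogic or Backblaze products.

```python
from datetime import datetime, timedelta, timezone
import boto3

# Hypothetical S3-compatible endpoint and credentials. Object Lock must be
# enabled on the bucket when it is created for retention settings to apply.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",  # example endpoint
    aws_access_key_id="KEY_ID",
    aws_secret_access_key="APPLICATION_KEY",
)

retain_until = datetime.now(timezone.utc) + timedelta(days=30)

# Write a backup object that cannot be deleted or overwritten until the
# retention date passes, which is what makes the copy effectively immutable.
with open("server01.img", "rb") as backup_file:
    s3.put_object(
        Bucket="dpx-backups",                      # assumed bucket name
        Key="backups/2022-03-11/server01.img",
        Body=backup_file,
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=retain_until,
    )
```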

Despite the focus on immutability, both companies are in the midst of rapid change.

Catalogic offloaded its ECX copy data management business to IBM last year to focus on cloud protection and security. The move followed its entry into Kubernetes persistent volume backup in 2020 with the launch of CloudCasa, putting it up against heavyweights including Commvault, Dell, Druva, Pure, and Veeam.

Backblaze was founded in 2007 and IPO’d last November with the aim of capitalizing on what CEO Gleb Budman described as a future “built on independent clouds.” Its most recent figures showed losses rising faster than revenues as it looked to grab a slice of the potentially enormous cloud backup market. In January, it partnered with Kasten by Veeam to offer an expanded ransomware protection and backup service.

Vcinity turns to Dell to help customers defy data gravity

Vcinity is banking on a tie-up with Dell’s OEM Engineered Solutions team to make it easier for data-heavy customers to adopt its hybrid cloud data platform, four years after first unveiling the technology.

The partnership is pitched as offering a “turnkey” solution to deliver its hybrid cloud software to customers who want to take advantage of cloud compute resources, without having to actually move vast amounts of data into the cloud.

Vcinity’s platform includes a Global Fabric taking advantage of RDMA and an integrated parallel file system, which delivers a single global namespace. Vcinity says its RDMA implementation isn’t restricted to InfiniBand and RoCE, meaning it can run over any network topology and, crucially, be extended outside the datacenter.

The platform can be attached to enterprise LANs as a standard NAS, or as a transitional high-speed storage tier, offering “geo-diverse” data exchange over metropolitan area networks or WANs. Data can be seen as NFS, SMB or S3, without the need for additional agents.

The company claims this means compute can be deployed whether on-prem, in the cloud, or at the edge, without having to migrate or cache data. The company is selling the tech as a way for heavyweight data sets – such as pharma, genomic or media – to use compute in the cloud without the headaches, security worries, and expense of getting the data into the cloud and back again.

The spec sheet for the Dell tie-up features two software-only options, the ULT X-1000v and ULT X-1000s, which offer throughput up to 3Gbps and 7Gbps respectively. The ULT X-1000e and ULT X-1000 come as physical appliances and offer up to 10Gbps and 60Gbps respectively.

Vcinity came out of stealth mode in 2018 after three “family office investors” backed its acquisition of technology originally developed by Bay Microsystems. It demonstrated its Ultimate X products in 2019 by transferring 1PB of file data across a 4,350-mile 100Gbit/s WAN in 23 hours and 16 minutes, claiming it was able to utilize up to 97 percent of available bandwidth.

That topped a Bay Microsystems demo a year earlier, which saw it shift 1PB across a 2,800-mile network in just over 23 hours.

Intel: Third-gen Optane to be announced in the coming weeks

Intel will soon announce its third-generation Optane products and is working on another iteration in the series. This was stated by Intel veep Kristie Mann during a Storage Field Day presentation on Friday.

Optane is Intel’s implementation of its 3D XPoint technology, developed initially in partnership with Micron. Last year, Micron exited the business of 3D XPoint chip and drive manufacturing and sold off its XPoint foundry in Lehi, Utah, which supplied components to Intel. First-generation XPoint was succeeded in 2020 by generation two, code-named Alder Stream for SSDs and Barlow Pass for persistent memory.

During the presentation, Mann said Intel is “about to announce gen 3 and working on the next generation [which] will do CXL Memory tiering.” This marries up with what Intel told us earlier this month. Mann’s team on Friday talked about two new Optane use cases, though it added no information about actual gen 2 and 3 Optane products or timescales.

Our snap of Intel’s memory tier time line, shown during Friday’s briefing

The audience was told that Intel is working closely with VMware on getting vSphere and allied software developed to use Optane persistent memory. Project Capitola was cited, and Intel is working with other hypervisor makers, too. Asked about these partnerships, Intel cloud architecture consultant Flavio Fomin said, “yes, there is a lot happening” outside the VMware world. For him, “tiered memory is the new normal,” meaning DRAM is treated as tier zero memory and Optane persistent memory as tier one. 

Hypervisors ought to be tweaked to know about and take advantage of the difference between DRAM and XPoint memory, while application-level code does not need to be altered. We are left to envisage that Microsoft with Hyper-V, Nutanix with AOS, and Red Hat with KVM are all working with Intel on Optane persistent memory support.

Use cases

The first new use case is an Intel in-house example involving searching data ingested through Kafka and Splunk, with the information stored on a VAST Data system with 650TB of Optane SSD and QLC NAND SSD capacity. A search of this data took 29.5 seconds on an HPE Ezmeral system using direct-attached SSDs, and 25.1 seconds on Intel’s setup. We note that VAST’s software had reduced the already-compressed Splunk data; the 650TB VAST box could store more than 1.6PB of Splunk data.

The second use case was large-scale similarity searches, which are used by websites of the Baidu, eBay, Amazon, and YouTube ilk to make recommendations to netizens. This process involves searching billion- to trillion-item datasets, with data stored as feature vectors and, separately, graphs. Intel principal engineer Dr Jawad Khan told the presentation’s audience that Intel won a metric in the Billion-Scale Approximate Nearest Neighbor Search Challenge at NeurIPS 2021.
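
To make the feature-vector idea concrete, here is a brute-force nearest-neighbor sketch in Python with NumPy. It is illustrative only: billion-scale systems of the kind Intel describes use approximate indexes (graph- or quantization-based) rather than an exhaustive scan, and the vectors here are random placeholders.

```python
import numpy as np

# Toy catalogue: 10,000 items, each represented by a 128-dimensional feature vector.
rng = np.random.default_rng(0)
items = rng.standard_normal((10_000, 128)).astype(np.float32)
items /= np.linalg.norm(items, axis=1, keepdims=True)   # normalize for cosine similarity

def top_k_similar(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k catalogue items most similar to the query vector."""
    query = query / np.linalg.norm(query)
    scores = items @ query          # cosine similarity via dot product
    return np.argsort(-scores)[:k]  # highest-scoring items first

# A user's recent behaviour encoded as a feature vector (random here); the
# top-k items would be served as recommendations.
user_vector = rng.standard_normal(128).astype(np.float32)
print(top_k_similar(user_vector))
```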

Intel’s slide comparing its tech and that of its rivals

He contrasted a dual-socket 28-core Intel Xeon server fitted with 512GB of DRAM and 2TB of Optane persistent memory, costing $14,664 at street prices, with an Nvidia A100 GPU-powered server system needing a pair of 64-core AMD host processors and 2TB of DRAM, the overall system costing around $150,000.

The Optane box was 8 to 19 times lower in cost than the Nvidia system for the same level of performance, or so we were told.

We are left thinking that Intel is still committed to Optane and its 3D XPoint technology, hammering away at extending the ecosystem of supporting partners, and pushing forward with denser gen-3 and -4 XPoint technology. That implies it is making or has made manufacturing plans, and has gen-3 persistent memory and SSD product families planned in some detail and gen-4 equivalents in outline.

Blocks & Files’ Optane roadmap diagram

We’ll hear more later this month, or in April, when we expect Crow Pass XPoint products to be unveiled.

Storage news ticker – 11 March

Ticker tape women in Waldorf Astoria

DCIG president and founder Jerome Wendt has blogged about the Five Challenges of Managing Air-gapped Technologies Using Backup Software. We’ve not seen this issue raised elsewhere. He says storing backups on air-gapped technologies prevents ransomware from accessing backups to delete or encrypt them. The five challenges or problem areas are: (1) No support for physically air-gapped technologies, (2) Limited or no support for bucket or object lock, (3) Unacceptable performance, (4) Cost creep, and (5) Policy creation and management. DCIG’s 2020-21 TOP 5 Enterprise Anti-ransomware Backup Solutions Report discusses backup products that can meet the challenges.

DDN A3I system.

DDN announced a partnership with Aspen Systems, the manufacturer of HPC products, to help the Pacific Northwest National Laboratory (PNNL) with the Coastal Observations, Mechanisms and Predictions Across Systems and Scales (COMPASS) project. The project will enhance the predictive understanding of coastal systems, including their response to short-term and long-term changes. Aspen Systems will use DDN’s A3I systems, EXAScaler parallel filesystems, and Storage Fusion Architecture (SFA) appliances to provide the custom HPC setup needed to meet the PNNL workload requirements.

Archive storage supplier FalconStor announced Q4 and full calendar 2021 results, with Q4 revenues of $3.8m compared to $3.65m a year ago. There was a $191K net loss, versus the year-ago $109K loss. FalconStor recorded a 20 per cent increase in subscription revenue to $1.4m, compared to $1.1m a year ago. Progress is slow. Full 2021 revenues were $14.2m, down from $14.8m in 2020, with a $203K profit that came in well below the prior year’s $1.1m profit. Year-end cash was $3.2m, compared to $1.9m at the end of 2020. FalconStor CEO Todd Brooks said: “We are making good progress against our strategic plans to reinvent FalconStor and enable secure hybrid cloud backup and data protection.”


The Texas Advanced Computing Center (TACC) has selected GigaIO to provide composable disaggregated infrastructure for its Dell-built Lonestar6 supercomputer, citing the ability to mix and match accelerators, the open-standards nature of GigaIO’s platform, and the energy savings from reduced power, cooling, and footprint.

Realtime database supplier Redis is partnering with Tecton and the two have integrated their products to enable low-latency, scalable and cost-effective serving of features to support operational Machine Learning (ML) applications. The Tecton feature store is a central hub for ML features – real-time data signals that power ML models. Tecton allows data teams to define features as code using Python and SQL, and then automates ML data pipelines, generates accurate training datasets and serves features online for real-time inference. Feature stores typically use key-value databases as online storage for low-latency serving. The Redis Enterprise Cloud, a fully managed Database-as-a-Service (DBaaS), can be the online store and provides 3x faster serving latencies compared to Amazon DynamoDB, while reducing the cost per transaction by up to 14x, says Redis.
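
As a rough illustration of the serving pattern described above, and not Tecton’s actual API, the sketch below writes precomputed feature values into Redis under a per-entity key and reads them back in a single low-latency lookup at inference time; the key layout and feature names are assumptions.

```python
import redis

# Connect to a Redis instance acting as the online feature store.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# The offline pipeline (e.g. a batch job) writes the latest feature values per entity.
r.hset("features:user:12345", mapping={
    "purchases_last_7d": 3,
    "avg_basket_value": 42.50,
    "days_since_signup": 180,
})

# The online inference path fetches all features with one low-latency lookup.
features = r.hgetall("features:user:12345")
print(features)   # {'purchases_last_7d': '3', 'avg_basket_value': '42.5', ...}
```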


Western Digital CEO details how he got the storage juggernaut back on track

Western Digital CEO David Goeckeler told investors this week that the company had fallen behind in disk drive tech, so he decided to restructure the executive leadership.

David Goeckeler

Goeckeler explained at the Morgan Stanley 2022 Technology, Media & Telecom Conference: “I’ve spent the last two years kind of rebuilding Western Digital, the executive team, the way the company is structured, all kinds of things… we spent a lot of time over the last two years, really getting the right leadership in place and getting the structure put in place in the company [so] that we could really produce and get the most out of the franchises we have.”

Those are the hard disk drive (HDD) and flash (SSD) business units inside WD.

He continued: “We split the product portfolio in two, put business units in place, brought general managers in so they could really focus on executing and driving the right roadmap decisions for the portfolio. And I feel like our technology roadmap is just in a vastly better position.”  The two general managers are Robert Soderbery for flash and Ashley Gorakhpurwalla for HDDs.

WD and Kioxia JV

On arriving at WD, Goeckeler found that the flash foundry joint venture with Kioxia had relationship problems. Previous CEO Steve Milligan had overseen contentious legal wrangles with Toshiba as it tried to spin off its foundry business. 

Goeckeler told the conference:  “I have invested quite a bit in making sure that this relationship is very, very strong. And it is, I mean, on an engineering basis, you probably wouldn’t know that there’s two different companies… I talk to my peer [Kioxia CEO Nobuo] Hayasaka every couple of weeks and make sure we’re really tightly engaged on what is a very, very important relationship for both of us.”

SSDs

Robert Soderbery

Goeckeler also disclosed at the event that the latest BiCS5 112-layer 3D NAND technology has proved very capital-efficient, saying WD “can use a lot of the same tooling in the fab that was used for BiCS4,” the previous 96-layer technology. More than half the bits coming out of the fabs are now built with BiCS5 technology.

He said: “Our client SSD portfolio is very strong, [in a] very good position… There’s been a complete transition of… our client hard drives to client flash… The company has played that transition extremely well… We have a strong position in consumer. It’s always good to have a captive franchise in your portfolio. The SanDisk brand is very, very strong. We have a strong position in mobile. Gaming has been very good for us. But the one pillar of the portfolio that we needed to build out was enterprise SSD… it’s an important TAM [total addressable market].”

Unlike the client drive area, where flash cannibalization is pretty well complete, HDDs in the datacenter remain strong, with nearline drives dominating. WD has found it hard going in the enterprise SSD market but is making progress. Goeckeler said: “We build our own controller. It’s been a big goal of mine since I came here and I think, really, calendar year ‘21 was a breakthrough for us.” 

He said: “We started the year with… our first qualification at one of the big web-scale players. We deployed throughout the year at that player. As we went throughout the year, we qualified at the second and the third and then also two big enterprise OEMs in the storage space.

“It’s a multi-step story. The qualification is the big piece of that and we feel very good about where we’re at.”

The recent chemical contamination incident has caused a $250m loss of production. Prices are going up and Goeckeler is selling bits in more expensive devices: “We don’t have as much volume as we used to. So we’re going to mix it in a way that has the most value.”

HDDs

Ashley Gorakhpurwalla

Goeckeler admitted WD failings at the 16TB drive level, saying: “We fell behind… I don’t think that’s a secret. I’ve been very open about that.”

WD is now back on a par with competitors Seagate and Toshiba, regaining parity at the 18TB level, according to the CEO. “Our big focus on 18 was to get back on our front foot and lead that capacity point, which we did.”

The company has used its NAND technology to increase HDD capacity through higher track density with its OptiNAND drives. A 20TB drive is sample shipping and there is a roadmap out to 30TB using microwave-assisted magnetic recording (MAMR), after which HAMR technology will be needed.

Another track density increase tech, shingled magnetic recording (SMR), which slows down data rewrite speeds and needs drive or host system software management, is becoming more popular with customers and important to WD. Goeckeler said: “We’re seeing customers now really start to adopt that.” 

He explained: “SMR is going to give you 10 percent to 20 percent more capacity on a drive, and when the drives get to 20 terabytes, that’s significant. And we’re seeing very good adoption now on SMR, where if you go back a year ago, it wasn’t quite there.” 

Overall Goeckeler doesn’t expect WD to be hit by component supply shortages as a result of Russia’s invasion of Ukraine. He was asked: “Any issues with Ukrainian supply chain whether neon gas, palladium from Russia, anything like that, that you can see impacting you?”

He replied: “No, we’ve looked at this very, very closely. It’s obviously a very tragic situation. But from our business perspective, we’ve got multiple suppliers for any gases and very low risk.”

Comment

The Goeckeler medicine is working. WD has healed relations with Toshiba/Kioxia and he has helped the disk drive business to recover. If he can build up the enterprise SSD business as well, then WD should be in great shape for the future. Goeckeler was not asked about supplying flash to the automotive sector, but we don’t suppose he and Soderbery are going to let that market escape their attention.

Flash and software-defined storage are taking over the datacenter

A report by SWZD titled “Hardware Trends in 2022 and Beyond” says that although the public cloud is taking customer spend away from on-premises datacenters, they’re still crucial infrastructure. 

The report shows on-premises server spending declined from 33 per cent of 2020 IT budgets to 30 per cent in 2022. By 2023, SWZD (Spiceworks Ziff Davis) expects 50 percent of workloads to run in public clouds, up from 40 percent today. However, hardware still represents the largest IT spending category and physical servers won’t go away. Instead they will evolve to integrate more seamlessly with cloud infrastructure.

Server-attached SSDs are now growing in popularity. A majority of companies, 55 percent, use SATA-based SSDs in servers, with an additional 14 percent planning to do so within two years. A minority, 40 percent, use SAS SSDs, with another 16 percent intending to do so within two years.

NVMe SSD use is growing faster. SWZD found that 13 percent of businesses reported using NVMe locally in their physical servers in 2019. That has jumped to 37 percent in the 2022 report and will grow to 54 per cent inside two years. 

NVMe adoption is even higher among enterprises with 500+ employees, at 45 percent.

External flash array storage was used by 14 percent of businesses in SWZD’s 2019 report; 24 per cent currently use them and that will rise to 44 per cent within two years. 

Only 14 percent of the smallest companies (1-99 employees) currently use all-flash storage, compared to 39 percent of larger enterprises. 

Other findings in the report concern technology use by percentage of businesses:

  • Software-defined storage (39 percent) 
  • Server automation/orchestration (37 percent) 
  • Software-defined networking (36 percent) 
  • Integration with a public cloud (36 percent) 
  • Workload migration (34 percent)

Currently, 36 percent of organizations have integrated their on-premises infrastructure with a public cloud. Within two years, an additional 18 percent plan to implement this capability, meaning the majority of companies are expected to have hybrid cloud capabilities by the end of 2023.

Nearly half (49 percent) of enterprises (500+ employees) are hybrid cloud-capable, compared to approximately 29 percent of SMBs (1-499 employees). Enterprises plan to add this capability at a faster rate going forward, with three out of four businesses planning to integrate with a public cloud within two years.

Public cloud-style billing is becoming more popular. Only 25 percent of organizations use this consumption-based infrastructure technology today but 57 percent of enterprises expect to adopt “pay-as-you-go” consumption-based infrastructure by the end of 2023.

Overall, the larger the enterprise, in headcount terms, the faster the adoption of new technology.

Enterprises are looking for more automation and more remote management. Overall there is great encouragement in the report for all-flash array, hybrid flash-disk array, and hybrid on-premises/public cloud infrastructure, particularly for subscription-based offerings such as Dell APEX, HPE GreenLake, and Pure Storage’s Pure1.

Azure Oracle database speed doubles with Silk storage: source


New Azure compute instances have doubled Oracle database speeds when using Silk storage, B&F has been told.

Silk’s software provides storage to apps running Azure virtual machines by using Azure’s Ephemeral OS disks. These are created on the local Azure VM storage but are not saved to Azure Storage. Silk spins up a Data Pod, a set of Azure Compute instances, and aggregates their performance with its own Flex software providing orchestration, resilience and enterprise features including RAID. As a result customers, through Silk, are effectively using Azure Compute to provide storage, and get to take advantage of Microsoft’s discounts for reserved compute instances.

In November, Azure previewed new Ebsv5 and Ebdsv5 Azure Virtual Machines, which use third-generation Intel Xeon Platinum 8370C (Ice Lake) processors in a hyper-threaded configuration. These new VM compute instances are memory-optimized and have 300 per cent higher VM-to-disk storage throughput and IOPS than the existing best Azure VM instances.

They offer up to 120,000 IOPS and 4,000 MB/sec of remote disk storage throughput. Microsoft said they are “ideal for the most demanding data-intensive workloads, including large relational databases such as SQL Server, high-performance OLTP scenarios, and high-end data analytics applications.”

A person close to Silk and Azure told us that with the “new EBS V5 Azure VMs … our throughput performance doubled from 5GB/sec to 10GB/sec without a single line of our code changing.”

We were shown a screenshot of an Oracle database running sequential, analytic-like workloads. “Crucially, it is running on a single VM, yet driving over 10GB/sec of bandwidth. This is an insane amount of data for a single-instance Oracle database to power.” 

Screenshot of an Oracle Database in Azure, using Silk storage, running sequential, analytic-like workloads

Oracle in Azure is licensed by CPU core so the fewer of them you use, the better it is financially. We were told: “For Oracle customers, this massive amount of data means their CPUs can be served faster, which drives higher CPU utilization, which means more efficient use of their hugely expensive Oracle processor core licenses.”

The source claimed customers are in some cases moving from high-end Oracle deployments like RAC or Exadata to Silk to achieve the performance they need.

Put this together with Azure’s multi-year discounts on Azure compute instances and the price/performance becomes even more attractive, they said.

“Even at list, the Azure pricing calculator gives 60-70 percent discounts when committing for three years, which would be typical for a customer deploying a database such as Oracle.” They said: “When we talk about cost, this is more important than any monthly cloud infrastructure costs.”

As far as we know, no other software-defined storage supplier is using Azure Ephemeral OS disks in this way. It makes Silk unique and appears to give it a significant price and performance advantage over other suppliers’ block storage in Azure. Silk supports the Google Cloud Platform as well. We know AWS has instance store (ephemeral) volumes and Silk works its ephemeral compute instance volume magic on that as well. It’s got the main public clouds covered.

We have asked Oracle to comment and the reply was: “This is not something we can comment on.”

Storage news ticker – 10 March

Atlan, whose software functions as a virtual hub for data assets ranging from tables and dashboards to models and code, has raised $50m in B-round funding at a $450m valuation. The round was led by Insight Partners, Salesforce Ventures, and Sequoia Capital India. Atlan’s tech enables teams to create a single source of truth for all their data assets and collaborate through deep integrations with tools like Slack, data warehouses like Snowflake and Redshift, BI tools like Looker, Sisense, and Tableau, data science tools, and more. The B-round comes just eight months after a $16m Series A led by Insight Partners and high-profile angels like Bob Muglia, former CEO of Snowflake.

Drive data erasure firm Blancco put out a report it says reveals shortcomings in public sector policies on device sanitization. The report, titled The Price of Destruction: Exploring the Financial & Environmental Costs of Public Sector Device Sanitization, is based on survey findings from 596 government IT leaders in the US and eight other countries. For the 70 organizations surveyed in the US, the costs for SSD destruction and replacement reached between $6.9m and $7.3m. It says that, despite 54 per cent of global respondents agreeing that reuse of SSDs is better for the environment than physical destruction, and almost all respondents (93 per cent) saying their organization had defined plans to reduce the environmental impact caused by destroying IT equipment, less than a quarter (22 per cent) are actively implementing those plans.

Dell Technologies and Amazon have partnered to validate Dell EMC PowerStore infrastructure with Amazon EKS Anywhere. EKS Anywhere is a deployment option enabling customers to create and operate Kubernetes clusters on-premises using VMware vSphere while making it possible to have connectivity and portability to AWS public cloud environments. EKS Anywhere provides operational consistency and tooling with AWS EKS. Details are available in this blog.

Diamanti, which supplies Kubernetes app life cycle management software, has been awarded 10 patents. Several of them involve inventing “methods and systems for converged networking and storage,” referring to the common APIs it has developed to expand access to Kubernetes. It says it is allowing any third-party storage or network provider to plug their services into the Kubernetes ecosystem. Diamanti will focus on harvesting deep telemetry data to get near real-time metrics and tracing on cluster and workload performance levels. It is building AI/ML capabilities over this real-time telemetry data to allow customers to continuously monitor and optimize the resource allocations of their Kubernetes clusters. Diamanti also says it is expanding its reach globally.

Lenovo has an expanded SMB portfolio with new single-socket ThinkSystem V2 servers, newly enhanced TruScale Infinite Storage, and a range of services. The new single-socket ThinkSystem ST50 V2, ST250 V2, and SR250 V2 servers are for business-critical applications in retail, manufacturing, and financial services. The ThinkSystem DM5100F is an affordable, feature-rich all-flash SAN storage system, and the ThinkSystem DM storage systems have built-in automatic ransomware protection.

Object storage software supplier MinIO recently added the ability to retire and upgrade server pools without compromising data availability. It says this feature is what it calls “enterprise enterprise” because most people don’t need it. But those that do…well, you get the point. A blog post shows how to decommission a server pool, view the status of a decommission in-progress or cancel it.

Lisa-Marie Namphy.

Kubernetes-native data platform provider Ondat said Lisa-Marie Namphy has been named to the company’s advisory board. She is Head of Developer Relations at database company Cockroach Labs, a CNCF Ambassador, and a longtime advocate of open source, serving as a volunteer organizer of the San Francisco Bay Cloud Native Containers User Group – one of the world’s largest in the Cloud Native Computing Foundation (CNCF). Namphy joins Cheryl Hung, the former VP of ecosystem at the Cloud Native Computing Foundation/Linux Foundation, who was added to the board last October.

She writes in a blog post: “I still struggle with the shift away from the ‘StorageOS’ name, but it was entirely right. The focus of the new Ondat brand is on Kubernetes-native delivery of the data services that developers want, and this is exactly what end-users now need. We are entering a time where how you do state in Kubernetes and what you do with state in Kubernetes become the all-important factors.”

OWC has announced storage and connectivity products for Apple’s new iPad Air with M1 and Mac Studio. They include:

– Envoy Pro SX super-fast extreme rugged portable SSD delivering up to 2847MB/sec with today’s and tomorrow’s Thunderbolt and USB4-equipped Macs. Available in 240GB, 480GB, 1TB, and 2TB capacities.
– Atlas S Pro UHS-II V90 SD media cards in 32GB, 64GB, 128GB and 256GB capacities.
– ministack STX with a universal SATA HDD/SSD bay, an NVMe M.2 PCIe SSD slot, and 3x Thunderbolt (USB-C) ports, in capacities ranging from a 0GB enclosure (add your own drives) to 2TB, 6TB, 8TB, 10TB, 14TB, and 18TB.
– ThunderBlade delivers transfer speeds up to 2800MB/sec and is available in 1TB, 2TB, 4TB, 8TB, 16TB and 32TB capacities.

Veeam Software announced GA of the Veeam Backup for Microsoft 365 v6 SaaS offering. V6 provides automation and scalability for enterprise organizations and service providers, time savings in handling restores and not having to build and maintain your own portal, enhanced security with multi-factor authentication (MFA) access to restore data, and recovery confidence with a secondary copy of data in low-cost object storage.

Lightbits Labs heading towards the public cloud and MSPs

Now in its seventh year, Lightbits Labs is broadening its scope from being an NVMe/TCP-focused storage vendor to becoming a data services supplier which happens to use NVMe/TCP as part of its composable, disaggregated and scale-out storage. Its future looks to be based on moving to the public cloud, selling to MSPs, and using an Intel partnership and vSphere certification to open doors and raise credibility.

Lightbits Labs was started up in 2015 by a group of seven co-founders led by chairman Avigdor Willenz and CEO Eran Kirzner. In 2019, it announced the Lightbox SuperSSD storage appliance, a 2U x 12 or 24-slot box with its LightOS software providing a global flash translation layer that looks after wear-leveling across the SSDs, plus scale-out capability.

The OS also provides high-availability, thin provisioning, compression, RAID, erasure coding, and multi-tenant quality of service. There was an optional LightField acceleration card, with data reduction, data protection, NVMe/TCP, and global FTL acceleration, several years ahead of today’s SmartNICs, but that has been dropped.

It added a Kubernetes CSI plugin in 2020, and snapshots and thin clones came along in 2021, as well as multi-way replication and clustering.

This system was similar in overall scope to other NVMe storage arrays of the time, from Apeiron, DSSD, E8, Excelero, and Pavilion Data. Apeiron failed. DSSD was bought by EMC and subsequently canned. E8 was bought by AWS and Excelero has just been acquired by Nvidia. All these startups faced the same problem: the major incumbents bought or developed their own all-flash arrays and added NVMe networked access, often using RoCE (RDMA over Converged Ethernet), which needs lossless datacenter-class Ethernet.

Pete Brey

NVMe/TCP offers speed approaching that of remote direct memory access (RDMA) while using ordinary Ethernet, so it provides less expensive data access. The incumbents have adopted this as well, and it means that Lightbits Labs is selling and marketing its NVMe/TCP all-flash storage in competition with Dell, HPE, Hitachi Vantara, IBM, NetApp, and Pure Storage, as well as newcomers like StorOne.

The big issue is how to differentiate itself from the pack with a product tech message that is unique and relevant. In a briefing, Pete Brey, Lightbits’ VP for product marketing, who was hired in December last year, said that recent business results have been encouraging, with the customer count doubling in 2021 compared to 2020, increased deal sizes, and a 2.3x expansion in its sales pipeline.

One differentiating factor is price, with Brey saying: “We can deliver the same performance at a much lower cost than competitors like NetApp and Pure.”

He said Lightbits’ LightOS was the first software-defined storage to be certified for vSphere, and there are more than 10 vSphere customers running LightOS proof-of-concept tests and “a lot of runway with Tanzu and ESX.” Other deals are focused on OpenStack.

Another partnership is with Intel, which has invested in the company, while a German cloud services provider is a joint Intel and Lightbits customer. The LightOS software supports Optane SSDs.

Lightbits and Intel slide September 2020

Brey talked about the idea of LightOS running in the public cloud as the Lightbits Cloud Data Platform. The focus would be on edge clouds with general availability later this year. He discussed applications moving from private to public clouds and between public clouds, and said they needed a consistent (storage) interface across these environments.

“This is where Lightbits shines because it can deliver a consistent interface across the environments.”

Brey thinks the customer needs to buy a complete system and Lightbits should encourage an ecosystem that could deliver this. It could cover on-premises systems with hardware and software included, and it could also cover the public cloud, with managed service providers (MSPs) delivering a storage service as part of their offer.

He mentioned the idea of developing an AIOps capability so that customers would not need a storage admin.

We’d suspect that Lightbits may need a slug of go-to-market funding in a year or so. Its last funding event was with Intel Capital in 2020, with $55 million raised so far.

There was a phenomenal burst of all-flash storage hardware and software array creativity in Israel in the 2000-2015 era, with E8, Excelero, XtremIO, Lightbits, and StorOne all part of it. Lightbits and StorOne are still around and growing, but the others are gone or their technology is now part of something else.

We could see Lightbits focusing more on MSPs, and it may also find success selling to enterprises via a network of services-based resellers as a vSphere-certified software-defined storage supplier. Let’s check back at the end of the year and see how well it has done.

InfiniFS: Scientists claim to have solved the 100-billion-file problem

Chinese computer scientists reckon they have found a way to get through a metadata access bottleneck slowing file access when you’re looking for that one file in 100 billion.

Filesystems all store files inside a file and folder structure, giving users control over where files are stored. This is in contrast to a database, where a record is added and the database decides where it goes based on the record type and contents. If a user then needs to search for the record, a search language such as SQL is needed, e.g. SELECT * FROM Fruits WHERE Fruit_Color=‘Red’.

It’s different with files as you can just tell an OS or application to open a named file, and the system will happily tell you what files exist and where:

Mac OS folder and file listing

I can use this file-folder listing to tell the system to open the file InfiniFS.pdf and it will then go to the right drive location (block address) and get it for me. Simple. But this is just my Mac PC and there are only a few thousand files in total on it, of which only a thousand or so, spread across tens of folders and sub-folders, are of interest to me, the rest being system files. A disk or flash drive is like a house, with folders being equivalent to rooms, sub-folders to shelving units, and files to the items stored on the shelves in those units. It’s easy to enter the house, walk to the right room, find the right shelving unit and then the correct shelf.

The drive name and file access navigation path through the drive/folder/sub-folder structure are metadata items; special data describing where and how to access the actual content (inside a file) that I want.

But when the PC is actually a cluster of servers with millions of files, this metadata becomes extremely large, and more so when you consider that the filesystem software will also check whether you have permission to read or, separately, write to a file.

The 100-billion-file problem

Now let’s imagine the situation when there are thousands of servers and up to 100 billion files. It is now a distributed filesystem and the file location task is immensely difficult. The metadata is being checked constantly as users or applications read files, write to files, delete files, create new ones, and move groups of files from one sub-folder to another.

There have to be specific metadata servers, dedicated solely to metadata processing, and they operate in a coordinated fashion within a single file namespace. The metadata processing burden has to be load-balanced across these servers so that file access bottlenecks can be prevented from slowing things down.

But that means my file access request may have to hop from one metadata server to another; it’s no longer local to one server, and the access takes more time to process. Access paths can become long, with 10, 11, 12 or more nested folders to navigate, and the top-level section of the file-folder structure, the drive and top-level directories, could become processing hotspots, further delaying access as queues build up.

Five Chinese researchers devised InfiniFS to solve these three issues. Their scientific paper, “InfiniFS: An Efficient Metadata Service for Large-Scale Distributed Filesystems”, states in its abstract: “Modern datacenters prefer one single filesystem instance that spans the entire datacenter and supports billions of files. The maintenance of filesystem metadata in such scenarios faces unique challenges, including load balancing while preserving locality, long path resolution, and near-root hotspots.”

Three-way fix

Their InfiniFS scheme has three pillars:

  1. It decouples the access and content metadata of directories so that the directory tree can be partitioned with both metadata locality and load balancing.
  2. InfiniFS has speculative path resolution to traverse possible file access paths in parallel, which substantially reduces the latency of metadata operations.
  3. It has an optimistic access metadata cache on the client side to alleviate the near-root hotspot problem, which effectively improves the throughput of metadata operations.

An architectural diagram from the paper illustrates these three notions:

The key idea of the access and content metadata separation “is to decouple the access metadata (name, ID, and permissions) and content metadata (entry list and timestamps) of the directory, and further partition these metadata objects at a fine-grained level.”

InfiniFS stores metadata objects as key-value pairs: 
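
The sketch below is our own illustration of the idea, with assumed key formats rather than the paper’s exact encoding: access metadata is keyed by the parent directory’s ID plus the child’s name, while content metadata is keyed by the directory’s own ID, so the two halves can be placed on different metadata servers.

```python
# Illustrative only: decoupled directory metadata laid out as key-value pairs.
# Access metadata (name, ID, permissions) is keyed by (parent ID, name) so it
# can be grouped with the parent; content metadata (entry list, timestamps) is
# keyed by the directory's own ID so it can be grouped with its children.
metadata_store = {
    ("access", 42, "projects"): {"id": 1001, "permissions": 0o755},
    ("content", 1001): {"entries": ["reports", "data.csv"], "mtime": 1647000000},
}

def lookup(parent_id: int, name: str) -> int:
    """Resolve one path component using only its access metadata."""
    return metadata_store[("access", parent_id, name)]["id"]

def list_dir(dir_id: int) -> list:
    """List a directory using only its content metadata."""
    return metadata_store[("content", dir_id)]["entries"]

print(lookup(42, "projects"))   # 1001
print(list_dir(1001))           # ['reports', 'data.csv']
```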

Generally, filesystems treat the directory metadata as a whole. When partitioning the directory tree – dividing it between different metadata servers – they have to split the directory either from its parent or its children onto different servers, which unintentionally breaks the locality of related metadata. The InfiniFS access and content decoupling function aims to fix this problem: “We group related metadata objects to the same metadata server to achieve high locality for the metadata processing phase.”

Another diagram depicts the decoupling concepts and is worth studying carefully to appreciate a file and directory metadata split and the access and content metadata separation (access, half-filled red circles, and content, half-filled blue circles):

The idea is to enable searches of a directory tree sub-structure (per-directory groups) that can be done inside a single metadata server and avoid hops to other metadata servers.

For the speculative path resolution, “InfiniFS uses a predictable ID for each directory based on the cryptographic hash on the parent ID, the name, and a version number. It enables clients to speculate on directory IDs and launch lookups for multi-component paths in parallel.” The researchers add: “Speculative path resolution (S-PR) reduces the latency of path resolution to nearly one network round-trip time, if correctly predicted.”
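
In other words, a client can compute a candidate ID for every component of a path without first resolving the parents, then issue all the lookups at once. A minimal sketch of that idea follows, with a simplified hash input and no handling of renames or version bumps, so it should be read as an assumption-laden illustration rather than the paper’s exact algorithm.

```python
import hashlib

ROOT_ID = 0

def predict_dir_id(parent_id: int, name: str, version: int = 0) -> int:
    """Derive a directory ID deterministically from its parent ID, name, and version."""
    digest = hashlib.sha256(f"{parent_id}/{name}/{version}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def speculative_ids(path: str) -> list:
    """Compute predicted IDs for every component of a path in one local pass,
    so lookups for all components can be launched in parallel instead of one
    network hop per component."""
    ids, parent = [], ROOT_ID
    for name in path.strip("/").split("/"):
        parent = predict_dir_id(parent, name)
        ids.append((name, parent))
    return ids

# All of these lookups could be sent to the metadata servers concurrently; if a
# component's version has changed (e.g. after a rename), the client falls back
# to step-by-step resolution for that component.
print(speculative_ids("/home/alice/projects/report.txt"))
```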

With regard to the caching: “Cache hits will eliminate lookup requests to near root directories, thereby avoiding hotspots near the root and ensuring scalable path resolution.”

Performance

The researchers compared InfiniFS metadata processing performance with that of LocoFS, IndexFS, HopsFS, and CephFS, using a RAM disk to avoid drive IO speed differences. InfiniFS had higher throughput and lower latency.

The researchers conclude: “The extensive evaluation shows that InfiniFS provides high-performance metadata operations for large-scale filesystem directory trees.”

This is a 17-page paper and we have only provided a bare overview here. The paper is free to download and may suggest ways that, for example, Ceph could be improved to match filesystems with better performance, such as Qumulo Core and WEKA’s Matrix.

NetApp withdraws from Russia

Vladimir Putin

NetApp is suspending its Russian operations in light of the Ukraine invasion.

The company has employees in Russia and sells to local businesses via an OEM agreement with Fujitsu. Tech companies that have already withdrawn from Russia include Amazon, Alphabet, AMD, Apple, Cisco, Cogent Communications, Dell, GlobalFoundries, Google, HP, HPE, IBM, Intel, Meta, Microsoft, Oracle, Samsung, SAP, TSMC, Veeam, VMware, and more.

A NetApp statement said: “NetApp is complying with the legal framework that the governments of the United States, European Union and United Kingdom have recently established, including all sanctions and regulations that are applicable to our business.” 

Therefore: “As a direct result, NetApp has temporarily suspended business operations in Russia and Belarus.”

Other big vendors have made similar noises. Cisco CEO Chuck Robins said in a March 3 statement: “Cisco is stopping all business operations in Russia and Belarus and will continue to focus on supporting our Ukrainian employees, customers and partners while providing humanitarian aid and accelerating our efforts to protect organisations in Ukraine from cyber threats. We stand with Ukraine and condemn this unjustified war.”

A March 4 statement by Microsoft President and Vice Chair Brad Smith said: “Like the rest of the world, we are horrified, angered and saddened by the images and news coming from the war in Ukraine and condemn this unjustified, unprovoked and unlawful invasion by Russia.”

Fujitsu said late last week it will “donate 1 million US dollars (approximately 115 million yen) to UNHCR, the UN Refugees Agency, to provide urgently needed humanitarian support for the many people displaced by the ongoing crisis in Ukraine and countries in the surrounding region.” The company refused to comment on whether it will pull business operations out of Russia.

NetApp added: “We are deeply concerned about the safety and security of the international community and our thoughts of care, support and resiliency are with all those impacted during this troubling time. NetApp is committed to the safety of our workers in Russia and Ukraine, and we are in constant contact with them to offer support. As we navigate this crisis, we encourage care, compassion, and empathy within our global communities.”

The wider European region “is an important market for NetApp, and we will continue to work carefully to assess and respond to any updates across the region as the situation evolves.”

Druva adds all-in-one snapshot, backup, DR service for Amazon EC2

Druva has expanded its SaaS data protection portfolio to include Amazon’s EC2 with combined snapshot, backup, and disaster recovery (DR) capabilities.

This service provides secure, air-gapped backups to protect against ransomware with a 50 per cent lower TCO (total cost of ownership) than Amazon’s services, Druva said. The air-gapping is virtual in that the EC2 (Elastic Compute Cloud) data is stored in Druva’s own AWS storage account and not in a physically separate offline tape cartridge.

David Gildea, Druva VP of product, said in a statement: “By combining critical snapshot, backup, and disaster recovery capabilities in a single seamless interface, AWS customers can significantly increase their data resilience, reduce costs, and ideally position their company for future cloud growth.”

AWS provides its own basic EC2 data protection service, snapshotting the underlying EBS (Elastic Block Store) volume with an initial full snapshot and then adding incremental change snapshots. It also has the AWS Backup Service which covers entire EC2 instances, not just EBS volumes, as well as the RDS, DynamoDB, and EFS services. 
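
For context, creating one of those basic EBS snapshots is a single API call. The sketch below uses boto3 with an illustrative volume ID and tag; it shows AWS’s native mechanism, not anything Druva-specific.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a point-in-time snapshot of an EBS volume backing an EC2 instance.
# The first snapshot of a volume is full; later snapshots store only the
# blocks that changed since the previous one.
response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",   # illustrative volume ID
    Description="Nightly snapshot of application data volume",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [{"Key": "backup-tier", "Value": "daily"}],
    }],
)
print(response["SnapshotId"])
```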

Druva says it does more for less money by combining snapshots, backup, and DR into a single service with the air-gapped backups. These are not part of Amazon’s standard EBS snapshot service, and users would have to pay for extra AWS operations to get the same level of protection.

The Druva play includes source-side global deduplication and automated cold-storage tiering for long-term retention. There is a central Druva management facility for tens to thousands of AWS accounts, through which point-in-time restores can be made across AWS regions and accounts, taking only minutes, Druva claims.

There are several Druva competitors selling EC2 protection, including Clumio, Commvault Metallic, and HPE’s Zerto unit. We put a quick table together comparing these data protection products. Clumio, Druva, Metallic, and Zerto protect more than just AWS’s EC2 environment, but we’re focusing only on EC2 here:

Clumio sells EC2 snapshots and backup but not disaster recovery. Metallic is backup without snapshots or DR. Zerto is DR and backup without snapshots. All these suppliers use a virtual air gap as protection against ransomware.

Druva says it provides combined snapshot orchestration for fast, operational recovery, secure, air-gapped backups for ransomware protection, and AWS cloud-based DR so customers no longer need any extra software or hardware or a second data centre to provide DR for applications running in EC2 instances, or elsewhere in the IT environments that Druva protects, such as Microsoft 365, Oracle, Microsoft SQL, AWS databases, VMware and Hyper-V servers, NAS, Windows, and Linux servers.

Think of Druva’s EC2 product as another brick in its wall of SaaS-based data protection suites, and expect its in-cloud coverage to expand further over time. It has yet to tap the Azure and Google markets, but it can and probably will.

Grab a datasheet here. Druva’s cloud backup for EC2 will be generally available via the Druva Data Resiliency Cloud this spring.