Storage startup StorONE has developed a fast-recovering, ransomware-protecting, flash-to-disk auto-tiering version of its software called S1:Backup.
S1:Backup is a backup target for Commvault, HYCU, Rubrik and Veeam’s data protection software to provide fast ingest to ransomware-proof storage based on a 4–8 SSD flash tier with automated movement of older backups to high-capacity RAID-protected disk — more than 15PB of it. StorONE also has the ability for the clustered S1 array to be a standby production system, thus offering instant recovery.
Gal Naor, StorONE’s founder and CEO, said: “When we examined the points of exposure, we discovered a gap between backup software innovations and storage hardware capabilities, which led to vulnerabilities and higher cost. Legacy backup storage is the cause of that gap.”
In his view, “S1:Backup fills it and enables companies to complete their ransomware recovery strategy and extract the full value from modern backup software by eliminating ransomware concerns and elevating backup to deliver high-availability.”
StorONE claims S1:Backup delivers 100 per cent ransomware resiliency, arguing that existing ransomware-protecting systems “force customers to use S3 for immutability. S1:Backup is the industry’s only solution that delivers immutability across all its protocols, including NFS, SMB, S3, iSCSI, Fibre Channel, and NVMe-oF.”
The company claims that high-capacity disk drives can be used in its S1:Backup because its software rebuilds a failed 18TB disk drive in three hours while other suppliers’ arrays can take days or even more than a week. And that situation only gets worse as capacities rise to 30TB and beyond. Newer, higher-capacity drives can be mixed with older, lower-capacity drives in the StorONE system as well, making scale-up more straightforward.
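To put those rebuild claims in rough perspective, here is a back-of-envelope calculation. It assumes a conventional rebuild is bottlenecked by a single replacement drive's sustained write speed, taken here as roughly 250MB/sec, which is our assumption rather than a StorONE or competitor figure; in practice conventional rebuilds run far slower than that because the array keeps serving I/O, which is how they stretch into days or weeks.

```python
# Rough rebuild-time arithmetic. The 250MB/sec single-drive write speed is an
# assumption for illustration, not a vendor-supplied number.
DRIVE_TB = 18
SINGLE_DRIVE_WRITE_MBPS = 250

bytes_to_rebuild = DRIVE_TB * 1e12

# Conventional rebuild: limited by the write speed of one replacement drive
conventional_hours = bytes_to_rebuild / (SINGLE_DRIVE_WRITE_MBPS * 1e6) / 3600
print(f"Single-drive-limited rebuild: ~{conventional_hours:.0f} hours")  # ~20 hours at full speed

# StorONE's claim of 18TB in three hours implies this aggregate rebuild rate
claimed_hours = 3
aggregate_gb_per_sec = bytes_to_rebuild / (claimed_hours * 3600) / 1e9
print(f"Implied aggregate rebuild rate: ~{aggregate_gb_per_sec:.1f} GB/sec")  # ~1.7 GB/sec
```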
StorONE Ransomware Protection graphic.
S1:Backup captures every backup job in an immutable state, handling hundreds of incoming incremental backup streams with 30-second snapshot intervals, and retains immutable copies indefinitely. Its software can operate at normal speed when its disks are 90 per cent full — there is no performance drop-off past the 50 per cent utilisation level.
This high utilisation supports long-term storage. The software detects and corrects so-called bit-rot errors, which can occur in long-term disk storage; Naor claims this feature is unique to StorONE. S1:Backup can also consolidate block-level incremental backups and create virtual fulls in minutes by using its 100TB+ flash tier.
StorONE also claims a pricing advantage — its pricing is openly available on its website so that customers can make comparisons with competing suppliers’ systems.
When the S1:Backup becomes a standby production system, backed-up VMs can operate in the array and use block, file and object protocols. The S1:Backup system can expand and be used for other use cases — such as archive, NAS, virtual machine storage, databases and even HPC and AI.
Comment
StorONE has close to 100 customers, Naor says. “We are growing fast from several aspects,” he told Blocks and Files, and mentioned repeat orders from existing customers as one example. “Repeat orders give us a lot of confidence.”
The alliances with Commvault, HYCU, Rubrik and Veeam, together with its S1:Backup features — such as ransomware protection, fast ingest and restore, stand-by production capability, and high disk utilisation and fast RAID rebuild capabilities — should make it an attractive alternative to other backup storage targets.
Naor thinks the market is not yet ready for all-flash backup targets such as FlashBlade because they are currently too expensive; he argues S1:Backup can provide equivalent performance with cheaper archiving.
Commvault has appointed a new Area VP & GM of UK & Ireland: Stuart Abbott. It says he brings extensive experience to his new role, including leading both UK and Ireland accounts at Dell Technologies, and building the current Global Alliances business for EMEA at Dell Financial Services, plus previously working in partnership with multinational organisations including Capgemini and Atos.
…
Databricks vs Snowflake wars — round 3. Databricks now claims that Snowflake’s rebuttal of Databricks’ claimed TPC-DS Power benchmark superiority was itself wrong. A blog states: “We stand by our blog post and the results: Databricks SQL provides superior performance and price performance over Snowflake, even on data warehousing workloads (TPC-DS).” The blog claims Snowflake used a different (pre-baked) dataset in its testing, which skewed the results in its favour. When Databricks re-ran the test against the official TPC-DS dataset using Snowflake’s software, the result was worse than Snowflake’s published number:
BSC is Barcelona Supercomputer Centre test result.
Databricks’ blogger writes: “We agree with Snowflake that benchmarks can quickly devolve into industry players ‘adding configuration knobs, special settings, and very specific optimizations that would improve a benchmark’. Everyone looks really good in their own benchmarks. So instead of taking any one vendor’s word on how good they are, we challenge Snowflake to participate in the official TPC benchmark.” Well … yes.
…
Datacentre composability supplier Liqid has released a white paper with industry analyst firm Enterprise Strategy Group (ESG) highlighting the potential of technologies such as its Matrix composable software to underpin a sustainable infrastructure for next-generation applications, including artificial intelligence and machine learning (AI/ML), high-performance computing (HPC), and cloud and edge computing environments, and to drive intelligent global infrastructure expansion.
…
Nebulon announced its smartIaaS infrastructure-as-a-service offering, designed to help cloud providers deliver new services at lower cost across both hosted and customer-owned datacentres. It also announced that UK-based service provider Inca Cloud has chosen smartIaaS with Supermicro as a part of its new cloud service: WSO by Inca. The service will be built for both hosted and private cloud deployments and will provide enterprises with a multi-cloud solution as an alternative to standalone Google Cloud, AWS and Microsoft Azure. As a reminder, Nebulon’s technology can streamline the costs of a CSP’s existing services by offloading all data services from the server CPU, memory, and network to Nebulon’s SPU (services processing unit) IO controller in each server.
…
Fox Sports is using OpenDrives’ scalable NAS systems for NASCAR and NFL coverage, and will be deploying them for Qatar 2022, the FIFA World Cup event. The two say OpenDrives provides a modular, portable system with a turnkey installation that dramatically reduces complexity, physical footprint and set-up time: a traditional architecture takes days or weeks to set up, while the OpenDrives system can be up and running in less than an hour. Using IP-based standard open protocols, OpenDrives integrates with best-in-class broadcast technology and data solution providers, including CMSI, EVS, Western Digital, Signiant, Google, Aspera and Arista, to serve as the centralised hub powering seamless onsite connections and real-time access to content.
…
There will be a SmartNICs Summit event at the San Jose DoubleTree Hotel from April 26–28, 2022. It will focus on network adapters that can process data and protocols faster. SmartNICs promise better networks with little extra cost or complexity. The Summit will feature vendor keynotes, expert tables and technology and market updates. It will also offer sessions on architectures, development methods and applications. And it will include panels on choosing the right adapter and long-term trends. Chuck Sobey, Summit chairperson, said: “The event will educate designers, present the state of the art and describe standards and open source projects.”
…
Big Data analytics supplier Vertica is partnering with NetApp to use its StorageGRID object system as an on-premises data source for its cloud-native analytics software. Vertica with NetApp StorageGRID has a fast multi-site, active-active architecture and can perform queries 10–25 times faster than conventional databases. Customers experience improved node recovery, superior workload balancing, and more rapid compute provisioning. Vertica says the separation of compute and storage architecture of Vertica in Eon Mode allows administrators to use NetApp StorageGRID as the main data warehouse repository or as a data lake.
…
Western Digital is shipping 20TB-capacity Ultrastar DC HC560 and WD Gold disk drives with its OptiNAND technology. They use a nine-platter design (2.2TB/platter) with a CMR (Conventional Magnetic Recording) format — these are not shingled drives. Both drives have a SATA interface and spin at 7,200rpm. The Gold is “for use in enterprise-class datacentres and storage systems” while the DC HC560 is for “hyperscale cloud, CSPs, enterprises, smart video surveillance partners, NAS suppliers and more”. They overlap, in other words.
Take a deep breath. Hammerspace, the unstructured data silo-busting supplier, has raised its game to provide a Global Data Environment (GDE) across file, block and object storage on-premises and in public clouds with policy-driven automated data movement and services.
That’s quite a mouthful, and it means what it says. It builds on Hammerspace’s existing file and object metadata-based technology that unified distributed file and object silos into a single network-attached storage (NAS) resource. This can give applications on-demand access to unstructured data in on-premises private, hybrid or public clouds. The idea is to keep data access points updated with the GDE metadata so that all data is visible and can be accessed from anywhere.
CEO and founder David Flynn told Blocks and Files: “Data Gravity is central to the story. In effect, what we’re doing is elevating data to exist in an antigravity field able to access any infrastructure.”
Molly Presley, SVP marketing, said: “For years, I have spoken to customers in Media & Entertainment, Life Sciences, Research Computing and Enterprise IT that constantly struggled with sharing their file data with distributed users. They have tried to build their own solutions, they have tried to integrate multiple data movers, storage solutions, and metadata management solutions and never been satisfied with the results. Hammerspace took on this really tough innovation challenge and created an elegant, efficient, integrated solution.”
Global Data Environment
Hammerspace’s GDE can use existing storage resources, assimilating them, taking them over in effect.
An organisation’s stored data is then accessed, as it were, through a logical Hammerspace front end or gateway.
It’s this concept that lies behind its notion of storageless storage, espoused in December last year. Data, Flynn believes, should be free to move between block, file and object silos, wherever they are. We note that GDE is based on file metadata technology, and how it interacts with block storage and volumes will be interesting to find out. For example, if files or objects are moved to block storage, do they become one volume or several? We can ask a similar question about block-to-file or block-to-object conversion.
Setting that aside, GDE supports persistent data in Kubernetes environments and treats clouds as regions, with cross-region access supported.
A layer of file-granular data services exists above the storage silo abstraction layer and includes snapshots, replication, file versioning, tiering with policy-driven data movement, deduplication, compression, encryption, WORM, delete and undelete. Policies can be created for any metadata attribute, including entity type, name, creation and modification date/time, and owner. Snapshots can be recovered at file level and also at the fileshare level.
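As a purely illustrative sketch of what such a metadata-driven policy could look like if expressed in code (this is not Hammerspace's actual policy syntax or API; the attribute names simply mirror those listed above):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: move files owned by "render-farm" and untouched for 90 days
# to a cheaper tier. Purely illustrative; not Hammerspace's configuration language.
policy = {
    "name": "age-out-render-output",
    "match": {
        "entity_type": "file",
        "owner": "render-farm",
        "modified_before": datetime.now(timezone.utc) - timedelta(days=90),
    },
    "action": {"move_to_tier": "object-archive"},
}

def matches(entry: dict, match: dict) -> bool:
    # An entry qualifies if every attribute in the match clause is satisfied
    return (entry["entity_type"] == match["entity_type"]
            and entry["owner"] == match["owner"]
            and entry["modified"] < match["modified_before"])

example = {"entity_type": "file", "owner": "render-farm",
           "modified": datetime(2021, 1, 5, tzinfo=timezone.utc)}
if matches(example, policy["match"]):
    print(policy["action"])  # {'move_to_tier': 'object-archive'}
```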
Hammerspace will dedupe and compress when replicating or moving data over the WAN when data is stored on object storage. The Hammerspace software can run on bare metal servers, in virtual machines (ESX, Hyper-V, KVM and Nutanix AHV) and in the three main public clouds.
Gradually, more and more single-store-type environments are emerging. Ceph covers block, file and object. Unified block and file systems get object support through S3. Suppliers, led by pioneering NetApp, are erecting data fabric structures covering the on-premises and public cloud environments. Public cloud suppliers are establishing on-premises beachheads, such as Amazon’s Outposts and Azure Stack.
Komprise and others are building multi-vendor, multi-site file lifecycle management systems. And now Hammerspace emerges with its comprehensive offering. As Flynn says: “We are talking about universal access to the data.”
David Flynn.
He asks: “How can you have [data] local to this user, local to this application, local to this datacentre? To each and every one of them simultaneously, without ever copying the data, without it being a copy of the data? It’s a different instance of the same piece of data. And that’s in essence what we are doing by empowering metadata to be the manager of the data.”
Flynn aims to change the relationship between data and infrastructure: “Changing the relationship is done through intent-based orchestration, through granular orchestration, through live data orchestration. And nobody else does that, where you can have statements of objective, and have the system pre-position the data where you’re going to need it.”
Access has to be fast. “Your system has to be able to serve data, once it is positioned onto a specific locale. It has to be able to serve it in parallel, in the highest performance fashion.”
Flynn does not like the notion of an actual central data store. “If you look at other attempts to address globalising data and its access, invariably, all of them use the notion of a central silo — a big central object store. Then you stretch and put file access to where you use a central file or you stretch and you put caches in front of it. It almost makes [things] worse now, because you’re not only dependent on a centralised [store] but now each of your decentralised points become critical points of failure as well.”
Spinning up Hammerspace GDE
Flynn told us: “Hammerspace can be spun up at the click of a mouse through APIs. As a matter of fact, we have one of the major cloud vendors — actually, they did the work to automate the spin-up of Hammerspace. And have been positioning Hammerspace as the way to do multi-region, even within their cloud.” He wouldn’t say which public cloud this was.
”This, the ability to simply turn on Hammerspace, and have it stretch your data from on-prem into the cloud, or from one region to the cloud to another, through an API and in a matter of mere minutes, is extremely powerful.
“So this ends up being super simple to spin up through … scripting, a presence of your data, and you don’t even have to wait for the data to get there. Because the system replicates metadata, and the data comes granularly. And based on policy, so you’re not waiting for whole datasets to come.”
Customers
We suggested Hammerspace was asking a lot of its customers to trust Hammerspace with their data crown jewels.
Flynn agreed. “Yes, we are asking our clients to trust us to be their data layer. So in the end, they are subjugating all of the storage infrastructure to Hammerspace. And using that to automate, arguably, one of the most manual processes in the IT world — the placement and movement of data across different storage infrastructure. It’s kind of a sin. There’s nothing more digital than data. And yet, the thing in the most desperate need of digital automation is how we manage the mapping of data to the infrastructure.”
Customers have responded. “This is something where the company has made major strides in the past year [with] some very large organisations that have gone into production. And even in the seven figure deals, million dollar-plus range. And even at that, it’s still just a fraction of how far they want to grow. So we have three of the world’s largest telcos, some of the largest online game manufacturers, media and entertainment companies.”
How did Flynn know this was the right time to make a big push into the market?
“When we had one of these telcos I was talking about in production for nine months, continuous operation, zero downtime, running mission-critical applications, 24/7, 365 days a year, over 3,000 metadata ops per second. So you look [at that] as a startup guy, you’re looking for signs, and how do you know when you’re ready? It’s hard, right? Because if you go too soon, you can stub your toe. If you go too late you miss opportunity.
“As soon as I had not just that, but other proof points similar to it, we were there. … The first thing that we decided that we want to go after [is] the enterprise NAS world, and the need to move enterprise NAS workloads to the cloud.”
Partners and execs
So the software is ready, customers are interested and the time is ripe for launch. What next? Hammerspace is ramping up its go-to-market operation, having recently hired:
Jim Choumas as VP channel sales;
Chris Bowen as SVP global sales;
Molly Presley as SVP marketing.
There’s a mass of related activity. Sales teams have been hired for the southern California media and entertainment market, high tech companies in northern California, and life sciences in the Boston area.
It’s set up a new partner program, Partnerspace, which integrates with the channel for 100 per cent of its customer engagements.
DataCore sells Hammerspace software as its vFilo product. We expect many more channel partners will be recruited.
Fungible has increased its NVMe/TCP storage-to-server bandwidth from 6.55 million to 10 million IOPS by replacing Mellanox NICs with its own Storage Initiator cards, and claimed a world record.
Update. Fungible comparison to Pavilion Data section added 17 Nov 2021.
Fungible has developed its own Data Processing Unit processor chip, saying it’s better for so-called east-west communications in a datacentre than x86-based servers and standard Ethernet NICs and switches, Fibre Channel switches, etc. The FS1600 is a clusterable 2U x 24-slot NVMe SSD array with Fungible’s DPU chip and software controlling it, that produced 6.55 million IOPS when linked to a server using Mellanox ConnectX-5 NICs.
Eric Hayes, CEO of Fungible, said: “The Fungible Storage Initiator cards developed on our … Fungible DPU free up tremendous amounts of server CPU resources to run application code, and the application now has faster access to data than it ever had before.”
The 10 million IOPS run took place at the San Diego Supercomputer Center and used a Gigabyte R282-Z93 server with two 64-core AMD EPYC 7763 processors, 2TB of memory, and five PCIe 4 expansion slots.
Gigabyte R282-Z93 server.
The slots were filled with five Fungible S1 cards, which linked across an Ethernet LAN of unrevealed bandwidth to an unrevealed number of FS1600 Fungible storage target arrays of unrevealed capacity. Unlike the system using Mellanox NICs, which almost saturated the host server’s CPUs, this one, running Fungible’s NVMe/TCP Storage Initiator (SI) software, took up only 63 per cent of the Gigabyte server’s 128 cores — meaning 47 of them could run application code.
Fungible storage initiator cards.
John Graham, a principal engineer with appointments at SDSC and the Qualcomm Institute at UC San Diego, said: “The Fungible solution has set a new bar for storage performance in our environment. The results are potentially transformational for large-scale scientific cyber-infrastructure such as the Pacific Research Platform (PRP) and its follow-on, the National Research Platform (NRP).”
Fungible provided, he said: “A high-performance storage solution that achieves our planned density and cost requirements.”
Comment
If we say each Fungible S1 card delivered 2 million IOPS from an FS1600 then 10 of them could provide 20 million IOPS and thus equal the Pavilion HyperParallel Flash Array’s 20 million IOPS system performance, in theory. That could be distributed across ten servers in a datacentre. If Fungible has an equally compelling price/performance message to back up its performance message then we have a new entrant in the enterprise storage array market as well as in the supercomputing and high-performance computing market.
One aspect of its setup, though, is that its storage initiator cards are proprietary — they are effectively not fungible, as other Ethernet NICs cannot sustain the 2 million IOPS-per-card performance number. Of course, once Fungible’s composability software is fully available then the cards will be fully fungible within its composable datacentre infrastructure scheme.
Fungible vs Pavilion
We are told by a source familiar with the situation that it takes Pavilion 20 x86 processors and 40 x 100GbitE ports in its standard enclosure to get its 20 million IOPS number. Fungible’s 10 million IOPS came from a single FS1600 node with two DPU chips, drawing 750W. That’s a considerable difference in power draw right there.
We understand that there’s no data resilience in the single-server, single-node Fungible 10 million IOPS setup; it was built purely for raw performance. A theoretical 20-DPU system would have an upper limit of 400 million IOPS. With Fungible’s current design there are only 6 x 100Gbit ports per DPU, which means you top out at around 6.5 million IOPS per DPU.
We hear that Fungible’s DPU was designed to support around 20 million IOPS per processor, but it’s not economically prudent to include the amount of networking per box to support that performance profile.
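Pulling the quoted figures together, a quick calculation shows where this run sits against those per-DPU ceilings; every number below comes from the claims above rather than from any new measurement.

```python
# Figures quoted above for the Fungible FS1600 and its DPUs
fs1600_iops = 10_000_000           # single-node result in the SDSC run
dpus_per_fs1600 = 2
network_limit_per_dpu = 6_500_000  # imposed by 6 x 100Gbit ports per DPU
design_limit_per_dpu = 20_000_000  # claimed per-processor design target

per_dpu = fs1600_iops / dpus_per_fs1600
print(f"{per_dpu/1e6:.1f}M IOPS per DPU in this run")                          # 5.0M
print(f"{per_dpu/network_limit_per_dpu:.0%} of the network-limited ceiling")   # ~77%
print(f"{20*design_limit_per_dpu/1e6:.0f}M IOPS theoretical 20-DPU limit")     # 400M
```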
Composable disaggregated systems supplier Liqid is making life easier for customers by integrating a PCIe fabric switch inside its new EX-4400 expansion chassis so customers don’t need to buy separate switches.
Ordinarily, customers buy a Liqid PCIe fabric switch and then Liqid expansion chassis into which they slot PCIe-connected GPUs, FPGAs, NVMe SSDs, and Optane storage-class memory drives. These form a resource pool and are dynamically hooked up to bare-metal x86 + DRAM servers by Liqid’s Matrix software to compose complete server systems to run applications. Components are returned to the pool when no longer needed.
Sumit Puri.
Sumit Puri, CEO & co-founder, Liqid, said: “While software is our business, the EX-4400 was born out of desire for industry-leading simplicity and PCIe device density. Now we’re thrilled to offer the world’s first and only fabric-enabled, high-density expansion chassis powered by the most advanced CDI software in the world.”
Liqid supplies 2U LQD 300 and 4U LQD 400 expansion chassis. It has added the EX-4400 with a built-in fabric switch and 16x onboard 16GB/sec PCIe 4 interface ports supporting up to 16 directly connected host servers. A single host can consume up to four fabric ports, for 64GB/sec throughput.
There are two models:
Liqid EX-4410, supporting up to 10 FHFL double-width devices
Liqid EX-4420, supporting up to 20 FHFL single-width devices
Both include independent PCIe slot power management to allow PCIe devices to be added and removed without powering down the chassis, preventing disrupted access to the other PCIe devices. Liqid is calling this combined chassis and switch concept CDI Simplified.
Backblaze raised $100 million from its November 11 IPO in which it aimed to sell 6.2 million shares in a range set at $15 to $17. They were sold at $16, netting the company $99,200,000. Shares are now trading at $25.50, 59.4 per cent higher than the offer price. Investors like this cloud backup and storage providing company’s prospects.
…
Data migrator-cum-data manager Datadobi has added API programmability to its DobiMigrate data mover software. With v5.13, customers and partners can extend existing automated storage provisioning workflows with data migration steps, enabling automation of file and object migrations, reorganisation and/or clean-up projects. Organisations can first use the storage system APIs to provision a new group of on-premises or cloud storage, and then use the DobiMigrate API to set up the NAS or object migration. Following the cutover to the new storage, the administrator can again use storage system APIs to deprovision the original storage. The process is fully auditable.
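To illustrate the shape of that provision-migrate-deprovision workflow, here is a minimal sketch. The endpoint paths, payloads and hostnames are invented for the example and are not DobiMigrate's or any storage vendor's real API; the point is simply that each stage becomes a scripted call rather than a manual step.

```python
import requests

# Hypothetical endpoints for illustration only; the real DobiMigrate v5.13 and
# storage-array APIs will differ.
STORAGE_API = "https://storage.example.com/api"
DOBI_API = "https://dobimigrate.example.com/api"

# 1. Provision the new storage via the storage system's own API
new_share = requests.post(f"{STORAGE_API}/provision",
                          json={"protocol": "nfs", "size_tb": 50}).json()

# 2. Set up and start the migration through the DobiMigrate API
migration = requests.post(f"{DOBI_API}/migrations",
                          json={"source": "nfs://old-filer/projects",
                                "target": new_share["path"]}).json()
requests.post(f"{DOBI_API}/migrations/{migration['id']}/cutover")

# 3. After cutover, deprovision the original storage
requests.delete(f"{STORAGE_API}/shares/old-projects")
```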
…
HPE has updated its Azure Stack HCI on Apollo 4200 validated solution, adding the Apollo 4200 Gen10 Plus server with updated Intel Ice Lake processors, support for faster and larger-capacity 3200MT/sec DRAM, Optane Persistent Memory 200 series, and select GPU/FPGA support. This 2U server can accommodate high-performance NVMe and SAS/SATA SSDs, and capacity-oriented 3.5-inch or 2.5-inch disk drives (HDDs) as well. It runs Azure Stack HCI software v21H2, which includes support for Azure workloads, Azure Virtual Desktop (in preview) and new Azure management capabilities like Azure Arc-enabled VM provisioning and management. HPE is working to get the Apollo 4200 Gen10 Plus into the Azure Stack HCI catalog as soon as General Availability of Azure Stack HCI version 21H2 takes place.
…
An IBM source tells us adding and removing memory is already a reality and supported by SAP HANA running on IBM Power9 servers. There is an IBM “Dynamic LPAR” (DLPAR) operation to add memory to or remove memory from a running LPAR on Power9 Server. LPARs must use HANA 2.0 SPS05 revision 52 (or newer) and SLES 15 SP2 or RHEL 8.3. Use a DLPAR operation to adjust memory if you immediately need more memory to fulfill a critical business task, and shutting down the SAP HANA system is not possible. When using a DLPAR operation on an SAP HANA system, it is recommended to verify the NUMA layout of the system afterwards due to a possible performance impact. For details see the SAP tech note.
…
Chinese supplier Memblaze launched its PBlaze6 6530 series PCIe 4.0 x4 enterprise SSD using 176-layer 3D NAND and supporting NVMe v1.4. Max random read IOPS are 1.1 million and max random write IOPS are 450,000, with sequential read/write bandwidth reaching 6.8GB/sec and 4.8GB/sec respectively. Overall write latency is 10μs. It features an 11W typical write power consumption and offers dynamic power adjustment from 6W to 14W in 1W steps. Capacity levels are 1.92TB, 3.84TB and 7.68TB, and its write endurance can be up to 1.5 DWPD (five years). It comes in PCIe add-in-card and 2.5-inch form factors. A PBlaze6 6536 variant offers three capacity sizes (1.6TB, 3.2TB and 6.4TB), and its endurance can be up to 3.3 DWPD (five years).
…
Panasas announced the appointment of Bret Costelow as EVP of global sales to oversee sales and business development worldwide. His CV includes being VP global sales and alliances at DDN and director global sales, software and services sales, HPC, at Intel with the open-source Lustre parallel file system.
…
Paragon Software announced the availability of its Paragon File System SDK for embedded developers. It says Paragon FS is a new, AUTOSAR-compliant file system designed with flash memory in mind. It focuses on flash memory longevity and performance under the heaviest use cases and fits a wide range of hardware — from tiny, low-resource IoT devices to heavy automotive virtual cockpits. Paragon FS consists of independent modules, like VFS and various levels of cache to fine-tune performance, so that unneeded modules do not increase the footprint or affect performance.
…
Quantum announced its ActiveScale object storage product has achieved AWS Outposts Ready designation, part of the AWS Service Ready Program. ActiveScale object storage systems can now be used alongside an AWS Outposts deployment to provide Amazon S3-compliant object storage for Amazon Web Services running on an AWS Outposts rack.
…
Cloud data warehouser Snowflake announced that data scientists, data engineers, and application developers can now use Python — the fastest-growing programming language — natively within Snowflake as part of its Snowpark developer framework. Snowpark for Python is currently in private preview.
…
VAST Data announced that Jump Trading Group selected VAST’s Universal Storage as the foundation for its high-performance computing (HPC) cloud infrastructure. As a global research-based trading firm, Jump employees include quants, technologists and researchers needing good technology to innovate and push scientific breakthroughs in the field of algorithmic trading. A VAST-Jump video will tell you more.
Kioxia has boosted the performance of its M.2 2230 matchbook-size SSD by adding a PCIe 4 x 4 interface.
The new BG5 drive comes in 256GB, 512GB and 1TB capacities and is destined for OEM use in small notebooks, gaming consoles and similar equipment. It has pretty much the same capacities as the Jan 2019-era preceding BG4 drive but reverts to TLC NAND from the BG4’s QLC flash.
Neville Ichhaporia, VP SSD marketing and product management at Kioxia America, said: “With this latest addition to our comprehensive PCIe 4.0 SSD portfolio, Kioxia is delivering premium performance to the mainstream swim-lane by enabling PCIe 4.0 without sacrificing affordability.”
M.2 2230 drives are 22mm wide and 30mm long. These drives are DRAM-less, using NVMe Host Memory Buffer (HMB) technology to map the drive’s contents. Performance is aided by an on-drive SLC NAND cache.
The BG4 was optimised for reading over writing, but the BG5 has more balanced read and write performance. Here is our tabulation of BG4 and BG5 performance numbers together with the percentage increases:
Kioxia has not supplied an endurance number. It does say the BG5 is a virtual multi-LUN (VML)-enabled client SSD. It supports NVMe v1.4, the TCG Pyrite and Opal standards, and End-to-End Data Protection. There is a Power Loss Notification signal and support for basic management commands over the System Management Bus (SMBus). It is also available in the longer M.2 2280 format (22mm x 80mm).
The BG5 is sampling to OEMs and Kioxia partners, and no prices have been released.
Memory virtualizer MemVerge is supporting the checkpointing of distributed, multi-threaded high-performance computing jobs by allying with the DMTCP Project.
Checkpointing is the saving of an application’s state so that it can be restarted if the system fails. Saving the state of a single application is a well-understood technique, but saving the collective state of an application that is distributed across several compute nodes and also multi-threaded is vastly more difficult. MemVerge says that checkpointing is almost impossible for complex distributed HPC apps with massive datasets. The open-source Distributed MultiThreaded Checkpointing Project (DMTCP) has achieved this.
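For readers new to the idea, the toy sketch below shows single-process, application-level checkpointing: the program periodically writes its state to disk and, on restart, picks up where it left off. It illustrates the concept only; DMTCP's point is that it does this transparently for unmodified, distributed, multi-threaded binaries, with no code changes like these.

```python
import json, os

CHECKPOINT = "state.ckpt"

def save_checkpoint(state: dict) -> None:
    # Write to a temporary file, then atomically rename, so a crash mid-write
    # cannot leave a corrupt checkpoint behind
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CHECKPOINT)

def load_checkpoint() -> dict:
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"iteration": 0, "total": 0}

state = load_checkpoint()  # resume after a failure, or start fresh
for i in range(state["iteration"], 1_000_000):
    state["total"] += i
    state["iteration"] = i + 1
    if i % 10_000 == 0:    # periodic checkpoint
        save_checkpoint(state)
print(state["total"])
```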
Mark Nossokoff, senior research analyst at Hyperion Research, provided a statement: “Bringing checkpointing capability to big memory architectures with pooled, distributed memory across multiple nodes operating on large datasets should further enable adoption of in-memory computing techniques within the HPC and AI communities. Kudos to MemVerge for stepping up to provide the industry stewardship to make DMTCP a commercial reality.”
DMTCP transparently checkpoints a single-host or distributed computation in user-space — with no modifications to user code or to the OS. It works on most Linux applications, including Python, Matlab, R, GUI desktops, MPI, etc. It’s usable for workloads including VLSI circuit simulators, circuit verification, formalisation of mathematics, bioinformatics, network simulators, high energy physics, cybersecurity, big data, middleware, mobile computing, cloud computing, virtualization of GPUs, and general high performance computing (HPC).
The MemVerge-DMTCP partnership will facilitate DMTCP’s move into the market and includes:
MemVerge developers joining the DMTCP Project and contributing to open-source development;
MemVerge providing commercial support for the open-source DMTCP software; and
MemVerge integrating the fully tested and supported version into application-specific Big Memory Solutions.
MemVerge has started collaborating with the National Energy Research Scientific Computing Center (NERSC) to optimise MPI-Agnostic Network-Agnostic (MANA), a plugin on top of DMTCP that has been used for transparent checkpointing of MPI (Message Passing Interface) on the Cori and Perlmutter supercomputers.
NERSC DMTCP diagram.
NERSC documentation states: “DMTCP implements a coordinated checkpointing, as shown in the figure [above]. There is one DMTCP coordinator for each job (computation) to checkpoint, which is started from one of the nodes allocated to the job, using the dmtcp_coordinator command. Application binaries are then started under the DMTCP control using the dmtcp_launch command, connecting them to the coordinator upon startup. For each user process, a checkpoint thread is spawned that executes commands from the coordinator (default port: 7779). Then, DMTCP starts transparent checkpointing, writing checkpoint files to the disk either periodically or as needed. The job can be restarted from the checkpoint files using the dmtcp_restart command later.”
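Strung together, those commands look roughly like the sketch below. The coordinator flags, checkpoint interval, restart script name and application name are illustrative and should be checked against the DMTCP manual; only the command names and default port come from the description above.

```python
import subprocess

PORT = "7779"  # DMTCP's default coordinator port, per the NERSC description

# Start a coordinator on one node of the job allocation; the 600-second
# checkpoint interval is an arbitrary choice for this example
subprocess.Popen(["dmtcp_coordinator", "--daemon", "--port", PORT,
                  "--interval", "600"])

# Launch the (unmodified) application binary under DMTCP control
subprocess.run(["dmtcp_launch", "--coord-port", PORT, "./my_simulation", "input.dat"])

# After a failure, restart from the checkpoint files via the generated restart script
subprocess.run(["bash", "./dmtcp_restart_script.sh"])
```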
Charles Fan, CEO of MemVerge, said: “Distributed checkpointing is a perfect complement to ZeroIO In-Memory Snapshot technology that MemVerge has pioneered. We look forward to collaborating with the DMTCP community on future technology and market development.”
Gene Cooperman, a professor at Northeastern University, and leader of the DMTCP Project, said: “The collaboration among NERSC/LBNL, MemVerge, and the DMTCP open-source community will bring reliable and efficient transparent checkpointing to MPI (and later to CUDA) for the production market. While DMTCP and MANA will always remain free and open source, the use of MemVerge technology for rapid writing of memory to stable storage will bring an important enhancement to this technology.”
The MemVerge-DMTCP partnership should enable MemVerge to make more progress in selling its Big Memory technology to HPC customers — particularly ones using DMTCP code.
Backblaze is trading publicly after its IPO ran with no fuss. It went public on November 11. A Backblaze blog reads: “It means we have more resources with IPO proceeds to increase investment in the development of our Storage Cloud platform and the B2 Cloud Storage and Computer Backup services that run on it.” Backblaze plans to expand its sales and marketing efforts to bring Backblaze to more businesses, developers, and individuals.
…
Codenotary’s open-source immudb database (download here) provides data immutability at scale and, with its v1.1. update, can be deployed in cluster configurations for applications that require high scalability — up to billions of transactions per day — and high availability. The Amazon S3 storage cloud can be used as a back end tier so that the database isn’t limited in capacity by on-premises disk space. Data in immudb comes with cryptographic verification at every transaction to ensure there is no tampering possible. It supports SQL, making it possible to move data to immudb without having to make changes to applications.
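As a generic illustration of what per-transaction cryptographic verification buys (a simple hash chain, not immudb's actual Merkle-tree mechanism), any retroactive edit to an earlier record breaks verification of everything after it:

```python
import hashlib, json

def entry_hash(prev_hash: str, payload: dict) -> str:
    # Each entry's hash commits to the previous hash, chaining the history together
    data = prev_hash + json.dumps(payload, sort_keys=True)
    return hashlib.sha256(data.encode()).hexdigest()

GENESIS = "0" * 64
ledger, prev = [], GENESIS
for payload in [{"op": "set", "key": "balance", "value": 100},
                {"op": "set", "key": "balance", "value": 80}]:
    prev = entry_hash(prev, payload)
    ledger.append({"payload": payload, "hash": prev})

def verify(entries) -> bool:
    prev = GENESIS
    for e in entries:
        prev = entry_hash(prev, e["payload"])
        if prev != e["hash"]:
            return False
    return True

print(verify(ledger))                      # True
ledger[0]["payload"]["value"] = 1_000_000  # tamper with history
print(verify(ledger))                      # False
```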
…
Fujifilm Recording Media USA and the iRODS Consortium announced a collaboration and integration, creating a joint solution built upon FujiFilm Object Archive objects-on-tape software and the iRODS data management platform. FujiFilm Object Archive becomes a deep-tier archive storage target while iRODS provides a data management platform for users. The Object Archive software has been tested with the iRODS S3 plugin and fully supports the Amazon S3 abstraction that iRODS provides. Fujifilm and the iRODS Consortium jointly added functionality comparable to Amazon’s Glacier to the iRODS S3 Resource Plugin. This new functionality will be available as part of the upcoming iRODS 4.2.11 release.
…
HPE Primera arrays can now be managed with a cloud operational experience like the newer Alletra arrays. They are managed in the cloud with GreenLake for storage and its SaaS-based Data Services Cloud Console; both Primera and Alletra arrays can be managed through this console. The change comes courtesy of Primera OS 4.4, and more details can be found in an HPE blog.
…
Data security company Protegrity announced the appointment of ex-Pure Storage COO Paul Mountford as the company’s new CEO. Mountford said: “Protecting sensitive data in use, in transit, and in storage across cloud and on-premises environments is one of the most pressing concerns for companies as hybrid-cloud computing and AI solutions accelerate. Protegrity is the best solution on the market today, and I look forward to working with our customers, our partners, and the Protegrity team in further establishing Protegrity as the leading platform for protecting critical data for enterprises globally.”
…
TigerGraph has updated its open source Graph Data Science Library with:
Library collection — 20+ new algorithms, including embedding algorithms for graph ML.
Library structure and management — Improved organization, grouping algorithms by category, and placing each algorithm in its own folder with a README and Change Log file. The repository will use tags to identify major releases.
A blog by VP of Machine Learning and AI Victor Lee discusses the details.
…
Startup Zesty has raised $35 million in A-round funding. Its Zesty Disk product uses machine learning to proactively expand and shrink AWS cloud disk volumes according to real-time application needs. It creates a virtual disk composed of several EBS volumes; volumes can be detached to reduce provisioned storage, or additional volumes attached to increase it. Attaching additional EBS volumes to the virtual disk has the added benefit of extra IOPS, for a significantly lower price. Download a Solution Brief here.
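To show the AWS primitives such a scheme sits on top of (this is not Zesty's code; the region, instance ID, device name and sizes are placeholders), expanding the virtual disk amounts to creating and attaching another EBS volume:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create an additional gp3 volume in the instance's availability zone
vol = ec2.create_volume(AvailabilityZone="us-east-1a", Size=100, VolumeType="gp3")
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

# Attach it to the instance; a volume-manager layer (Zesty's virtual disk, in its
# case) would then grow the filesystem to span the new capacity
ec2.attach_volume(VolumeId=vol["VolumeId"],
                  InstanceId="i-0123456789abcdef0",
                  Device="/dev/xvdh")

# Shrinking works the other way round: drain data off a member volume, then
# detach and delete it to cut the provisioned capacity and its cost
# ec2.detach_volume(VolumeId=vol["VolumeId"])
# ec2.delete_volume(VolumeId=vol["VolumeId"])
```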
Snowflake competitor Databricks claimed a TPC-DS benchmark record for its data lakehouse technology and said a study showed it was 2.5x faster than Snowflake. Databricks lacks integrity according to Snowflake, which has come out fighting, saying the study was flawed.
Databricks claims that source data in its so-called data lakehouse can be analysed faster than if it were filtered and processed through an Extract-Transform-Load (ETL) procedure and then loaded into a data warehouse, such as Snowflake’s, for analysis. TPC-DS is a decision support benchmark with audited results. Databricks achieved 32,941,245 QphDS @ 100TB, beating the previous world record held by Alibaba’s custom-built system, which achieved 14,861,137 QphDS @ 100TB.
Databricks announced the research team at Barcelona Supercomputing Center (BSC) ran a different benchmark comparing Databricks SQL and Snowflake, and found that Databricks SQL was 2.7x faster than a similarly sized Snowflake setup. They benchmarked Databricks using two different modes: on-demand and spot (underlying machines backed by spot instances with lower reliability but also lower cost). Databricks was 7.4x cheaper than Snowflake in on-demand mode, and 12x in spot.
A Snowflake blog by founders Benoit Dageville and Thierry Cruanes said Snowflake had deliberately not engaged in “benchmarking wars and making competitive performance claims divorced from real-world experiences. This practice is simply inconsistent with our core value of putting customers first.”
Also: “Anyone who has been in the industry long enough can likely attest to the reality that the benchmark race became a distraction from building great products for customers.” However, in this Databricks instance, “Though Databricks’ results are under audit as part of the TPC submission process, it’s turned the communication of a technical accomplishment into a marketing stunt lacking integrity in its comparisons with Snowflake.”
The two founders say “The Snowflake results that it published were not transparent, audited, or reproducible. And, those results are wildly incongruent with our internal benchmarks and our customers’ experiences.”
The Databricks blog included this chart:
The TPC-DS power run consists of running 99 queries against the 100TB scale TPC-DS database.
Snowflake took issue with the Databricks-Barcelona result and ran the test itself:
It said: “Out of the box, all the queries execute on a 4XL warehouse in 3,760s, using the best elapsed time of two successive runs. This is more than two times faster than what Databricks has reported as the Snowflake result, while using a 4XL warehouse, which is only half the size of what Databricks indicated it used for its own power run.”
But Databricks was still faster, though not by as much. However, Snowflake is developing 5XL warehouse technology and claims: “Our 5XL in its current form significantly beats Databricks in total elapsed time (2,597s versus 3,527s), and we expect material improvements when it reaches general availability.”
Price/performance
Databricks also said the Barcelona study showed it had vastly better price/performance than Snowflake:
The Snowflake founders dislike Databricks’ price/performance comparison too, saying it is misleading. “Our Standard Edition on-demand price for a 4XL warehouse run in the AWS-US-WEST cloud region is $256 for an hour. Since Snowflake has per-second billing, the price/performance for the entire power run is $267 for Snowflake, versus the $1,791 Databricks reported on our behalf.” Here is its chart showing this:
So, again, Databricks was better than Snowflake, although by much less of a margin. However, the Snowflake founders argue: “Using Standard Edition list price, Snowflake matches Databricks on price/performance: $267 versus $275 for the on-demand price of the Databricks configuration used for the 3,527s power run that was submitted to TPC.”
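The per-second billing arithmetic behind Snowflake's $267 figure checks out from the numbers both companies have published; the Databricks cost below is simply the $275 on-demand figure Snowflake quotes, not something recomputed here.

```python
# Snowflake's claimed cost for the 100TB TPC-DS power run under per-second billing
snowflake_rate_per_hour = 256     # $ per hour, Standard Edition 4XL, AWS-US-WEST
snowflake_elapsed_s = 3760        # best of two successive runs
snowflake_cost = snowflake_rate_per_hour * snowflake_elapsed_s / 3600
print(f"Snowflake power run: ${snowflake_cost:.0f}")   # ~$267

# Figures Snowflake cites for the Databricks configuration submitted to TPC
databricks_elapsed_s = 3527
databricks_cost = 275
print(f"Databricks power run: ${databricks_cost} in {databricks_elapsed_s}s")
```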
They say interested parties can run the Snowflake TPC-DS benchmark power run themselves. It only takes a few mouse clicks and about an hour of elapsed time. Snowflake itself “will not publish synthetic industry benchmarks as they typically do not translate to benefits for customers.”
Certainly not in this instance, as it would show that Databricks is slightly faster at roughly similar price/performance.
GPU-influenced storage controller startup Nyriad has pulled in $28 million in new funding from existing investors.
Nyriad’s startup idea was to have a combined GPU plus CPU controller to cope with tremendously high IO bandwidth. It did not go well. Its Nsulate technology was inspired by a proposed New Zealand Square Kilometre Array astronomy research effort which came to naught. The startup saw its founders ousted and a new CEO appointed by the investor-led board in September. The global HQ was relocated to the USA as well.
Guy Haddleton, Nyriad chairman of the board and lead investor, issued a new funding announcement statement: “Nyriad will deliver a breakthrough data storage solution. Equally important, the company’s vision is backed up by a seasoned executive leadership team, exceptional engineering talent, and a strategy that focuses on delivering business value for customers. That this round of funding is coming almost entirely from existing investors is a strong statement in our confidence in Nyriad’s ability to execute and succeed in the marketplace.”
Herb Hunt.
CEO Herb Hunt said: ”We’re reimagining storage with an architecture that delivers an entirely new way to control and manage storage devices at scale. This new round of funding ensures that solutions based on our new architecture will be available to customers early in 2022.”
Comment
Nyriad says its technology “architecture combines the power of GPUs and CPUs to deliver an unprecedented combination of performance, resilience, and efficiency, enabling massive amounts of data and multiple data types to be managed in a single storage system that is simple to deploy, operate, scale, and maintain”.
Hunt spent 32 years at IBM, finishing up as a VP. Then he went to Siebel Systems for four years as a CTO and then SVP for Strategy, followed by four years as an EVP at a private equity company, the Symphony Technology Group. This group has investments in RSA Security, McAfee and FireEye amongst others. He then founded and spent nine years as a managing principal at Transformation Services, helping client businesses develop their technologies, business and go-to-market structures.
It looks as if Hunt has an enterprise IT background followed up with a period spotting value in unwanted or troubled companies and helping build them up to realise that value. From his CEO role at Nyriad he says: “Storage solutions must empower businesses to grow, adapt, and stay competitive in a data-driven world. Nyriad is on a fast track to provide businesses with these breakthrough capabilities.”
We are looking at a Nyriad array that will hold massive amounts of data and multiple data types, and use persistent memory. This is the world of VAST Data, StorONE, Pavilion Data and others. These companies are shipping product, and Nyriad has to catch up with them and produce something that will make customers sit up and take notice. It’s a big ask, and $28 million says the investors think Nyriad’s answer will be a good one. Otherwise it’s throwing good money after bad.
StrongBox is looking likely to change its name to StrongLink as it pivots away from the high-end cross-silo file and object management market to pursue the broader enterprise market.
This was the thrust of a briefing from new CEO Andrew Hall following the dismissal of the company’s former CEO Floyd Christofferson and some other executive leaders in the past month.
Hall, who has been a Portfolio Manager at StrongBox’s parent company PartnerOne Capital, told us: “I then had nothing to do with the last four years of StrongLink but I was aware in the last month that there was friction between the investors and the management team as it related to the focus of the product because Floyd [Christofferson] and Erik [Murrey, the CTO] … were very much biased towards the high-end supercomputing space.”
Andrew Hall.
The investors were concerned that the company was not growing fast enough. It had accumulated some 30 or so high-end customers — such as NASA, the Library of Congress, Bosch, Canon, and Amadeus — but the software required detailed understanding and management, resulting in lengthy sales cycles, customer-specific customisations and proof-of-concept (POC) exercises.
Hall said: “The intention of the Partner One investors was to put it in a can and sell it to lots of people to do exactly the same thing, as opposed to rolling up our sleeves in every new project, having to develop one-off stuff. … That’s what led to the departure of Floyd and Erik, as well as some of the other engineering team, which in fairness was a short-term cost-based decision while we regrouped.”
PartnerOne is very well funded, which means the product re-development costs can be comfortably afforded. There is no plan for PartnerOne to cast StrongBox adrift, with Hall arguing: “There was never any intent to take our foot off the pedal and stop spending money. What there was an intent to do, was understand that we were putting the money in the direction that it had some better potential for higher levels of return.”
Hall is not new to StrongBox. He was CEO of PartnerOne-owned ETI-Net from 1996 to 2016 and then became an advisor to PartnerOne. ETI-Net bought the archiving, VTS and other assets of the crashed Crossroads Systems and set up StrongBox Data Solutions to develop and sell the StrongBox file management and archiving software.
“Specifically in the month that I’ve been here,” he says, “on a daily basis, we’ve looked at how we can extract some of the overall StrongLink functionality and bundle a … version that’s easier to deploy … as opposed to a nine-month sales cycle with a proof of concept in the tenth or twelfth month, we can put some code in the hands of the guys in the cubicles who are responsible for storage. They can get visibility that they haven’t had as to the dispersion of their various storage silos and open a conversation about where the economics are of layering in StrongLink with some second- and third-tier storage.”
This is going to involve “reformulating the visualisation in what we call our control panel”.
Hall sees this product, we might call it StrongLink Lite, being “brought to the table as part of a package of backend hardware, cloud services, pick an element”. Through channel partners, in other words. He’s also going to be recruiting new people to help in this effort.
Doing a Zuckerberg
He will probably change the company’s name to StrongLink, after its products. That would make life simple and, we suppose, mark a fresh beginning. The company’s website is also being refreshed.
The actual number of StrongBox customers is not, at the moment, well understood as some have come through reselling partner Fuji, but 30 plus/minus ten per cent was mentioned. Hall says that the willingness of StrongBox customers to reference the product after successful projects was something he valued.
That heritage is surely going to help provide impetus and credibility to StrongLink Lite or whatever the new product is called when it reaches the market — some time, we estimate, in 2022. We’ll look forward to that.