
Hammerspace’s full coverage in its market positioning radar diagram

Hammerspace is making progress selling its Global Data Environment file data location, management and sharing services – because it provides more functionality than either the enterprise and cloud storage players or the data management and protection suppliers.

Update: Panzura’s disagreement with Hammerspace’s characterisation of its company added as a bootnote, 12 Sep 2022.

This is part two of our look at Hammerspace’s technology positioning, market positioning and business progress. It’s laid out a radar screen-type diagram with 12 dimensions of storage infrastructure functionality and plotted out the screen occupancy of vendors in the enterprise and cloud storage, data management and protection areas.

Founder and CEO David Flynn said in a briefing: “In the world of data and storage, there are many different capabilities which matter: having the right file system protocols and semantics, being able to talk to Windows as well as Linux, being able to talk legacy.”

“There’s really important data services, like snapshots and clones, all of this stuff that has made Data ONTAP the gold standard for manageability of data, your security of making sure that you’re not vulnerable to theft, or somebody encrypting your data, compliance to legal regulation.”

Talking about Hammerspace’s 12-dimension radar diagram, he said: “In this diagram, they all have an equal amount of real estate. But some things are more important. And I would say performance and scale and data protection are some of the bigger ones.”

The radar screen diagram indicates a supplier’s strength on any one dimension or axis by giving it a score from 0 percent at the center out to 100 percent at the periphery. Hammerspace has plotted its view of the radar screen diagram coverage of the enterprise and cloud storage vendors, with Dell EMC (Isilon/PowerScale), NetApp, Pure Storage, Qumulo and VAST Data highlighted as example enterprise storage vendors:
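
As a minimal sketch – using placeholder dimension names and scores rather than Hammerspace’s figures or tooling – such a 12-axis radar chart can be drawn with a polar plot, scoring each dimension from 0 at the centre to 100 at the edge:

import numpy as np
import matplotlib.pyplot as plt

dimensions = [f"dimension {i+1}" for i in range(12)]       # placeholder axis names
scores = [90, 85, 80, 75, 70, 60, 50, 40, 45, 55, 65, 95]  # illustrative 0-100 scores

angles = np.linspace(0, 2 * np.pi, len(dimensions), endpoint=False).tolist()
ax = plt.subplot(polar=True)
ax.plot(angles + angles[:1], scores + scores[:1])           # close the polygon
ax.fill(angles + angles[:1], scores + scores[:1], alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(dimensions, fontsize=8)
ax.set_ylim(0, 100)                                         # 0 at the centre, 100 at the periphery
plt.show()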

The enterprise storage suppliers (orange outline) are strongly positioned in a section of the chart running from the top right round to the lower left. This area includes file protocols and semantics, data services, security, compliance and performance. The chart indicates they have weaker coverage in the scalability, data protection and disaster recovery areas, and low coverage of the cloud-native, metadata workflow and remote access dimensions.

The cloud storage suppliers – AWS, Azure and Google – have strengths in the cloud-native area (unsurprisingly) and disaster recovery and scalability areas (again unsurprisingly). Hammerspace marks them down a tad in their coverage of other areas’ functionality: file semantics and protocols round to performance.

A second radar diagram looks at the data management and protection area with three groups of suppliers:

These three groups have more limited domain expertise and applicability. Hammerspace would say that, like the enterprise and cloud storage suppliers, they are all excellent at what they do. What it supplies – on top, so to speak – is additional functionality that they don’t have, because it has taken the file system layer out of the storage system layer in an application’s data access stack. We discussed its view of this in our part one story.

Here is the radar screen diagram with all five sets of vendors shown plus Hammerspace’s view of its own positioning:

Hammerspace’s technology provides an abstraction layer of services that the other players can use. This suggests there are partnering opportunities between the other suppliers and Hammerspace. Put another way, though, this diagram suggests that customers can integrate things like data orchestration and metadata-driven workflows, provided by Hammerspace, into the other suppliers’ offerings to get a more complete file-based data services capability.

Why should they do this? Because they will find their IT operations more efficient and less expensive. We’ll discuss Hammerspace’s view of this and take a look at its business progress in the third part of this series of articles. In other words, we’ll find out if customers are responding to Hammerspace’s view of the data universe and whether operating that way is beneficial for them.

Bootnote

A Panzura spokesperson contacted us to say that they disagreed with the Hammerspace characterisation of their company. They said: “Panzura is no longer a “File object gateway”. More than three years ago, Panzura refounded and overhauled the entire company from staff to solution. It is, in fact, the exact same category “enterprise hybrid, multi-cloud data management” as Hammerspace. If you wanted to compare, it would be important to note that Hammerspace is $3 million in ARR and Panzura is 15X larger. This among other stats reflects that Panzura actually leads the enterprise hybrid, multi-cloud data management space.”

We should understand that “Since its refounding in 2020, Panzura has become the market leader in hybrid multi-cloud data management with skyrocketing rapid growth. Within three years, Panzura has achieved a record-breaking rise in annual recurring revenue of 485 percent and is growing at four times the rate of its nearest competitors. Additionally, the company has an 87 NPS score.”

Does mission critical data mean taking things slow? Nah, let’s take it to the Max

Sponsored Feature We’re used to hearing how data is the secret to unlocking your organization’s potential, if only you’re brave enough to let it flow freely.

The problem is that experienced tech leaders and data specialists in large organizations are likely to be far less cavalier about letting financial, credit or medical data “just flow”. As Dell engineering technologist Scott Delandy explains, “They’re relatively risk averse, just by the nature of the types of applications that they run.”

The mission critical data systems underpinning these applications have been built up over years, decades even, with a sharp focus on “quality, stability and reliability,” he says, because “You always expect your credit card to work. You always expect the thing that you bought to ship and – usually – get there on time. And you expect to walk out of the hospital.”

At the same time, these organizations know that they do need to run new workloads, pouring their data into AI for example, or building out new, more agile but still critical applications on containers and microservices rather than traditional monoliths.

“Now, as they’re being asked to support next generation workloads, they don’t want to have to rebuild everything from scratch,” says Delandy. “They want to take the existing infrastructure, [or just] the operational models that they have in place, and they want to be able to extend those to these new workloads.”

Containerized OS

These were the challenges Dell had in mind when it began to plan the refresh of its PowerMax line, the latest in a series of flagship storage systems that sit at the heart of tech infrastructure in the vast majority of Fortune 500 companies. The most recent update introduces two new appliances, the 2500 and the 8500, which feature new Intel Xeon Scalable CPUs, NVMe dynamic fabric technology, and 100Gb InfiniBand support, as well as a new storage operating system, PowerMaxOS 10.

Given the focus on containerized applications, it shouldn’t be a surprise that the new OS is containerized itself, making it easier for Dell to develop and launch new features, and to share them across both PowerMax and the vendor’s other storage platforms.

“Because of the way the microcode, the software, the operating environment has been developed, that gives us the ability to cross pollinate data services between the different platforms,” says Delandy. One example of this cross pollination is a new set of file capabilities, which first appeared on the PowerStore platform, and is now also available on PowerMax.

But users still face the challenge of how to straddle the traditional VM world and modern applications built around containers – an architecture that was never designed with persistent storage in mind – and access the same automated, low touch, invisible type of workflow.

“And that’s why a lot of the work that we’ve been doing [is] around integrating to things like CSI (Container Storage Interface),” he explains, “by putting a level of automation between the CSI API and the automation that we have on the infrastructure.”

This is based on Dell’s Container Storage Modules technology, “which is the connection between the CSI APIs and the infrastructure APIs, and it allows you to do all of those higher level things around replication, visibility, provisioning, reporting.”
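
As a rough illustration of the Kubernetes side of that flow – a generic CSI provisioning request rather than Dell’s Container Storage Modules API, and with a hypothetical “powermax-block” storage class name – a containerized application simply claims a volume and the CSI driver plus back-end automation do the rest:

from kubernetes import client, config

config.load_kube_config()                        # or load_incluster_config() inside a pod

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="oracle-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="powermax-block",     # hypothetical CSI-backed storage class
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
    ),
)

# The CSI driver behind the storage class provisions the volume on the array.
client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="default", body=pvc)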

This is a key example of “Taking the things that people have already built for and have had in place for decades and saying, ‘Okay, I’m just gonna go ahead and plug these new workloads in.’”

It also allows for active-active Metro replication for both VM and containerized applications, even though CSI block data is managed very differently to VM block data, Delandy explains. “We’ve been doing that in the VM world for like 100 years, right?”

Radical density

The software improvements combine with the hardware improvements to enable what Delandy describes as “radical density”, offering more effective capacity with less physical storage, to the tune of 4PB of effective capacity in a 5U enclosure, or 800TB of effective storage per rack unit.

One significant contributor to this is the ability to support higher density flash, while also supporting “very granular capacity upgrades”.

Key to this is Flexible RAID, which allows single drives to be added to a pre-existing RAID group. This means, “When we put an initial configuration onto a user’s floor, we can use very dense flash technology, because we know if we start off with 15 terabyte drives, we’ll see 30 terabyte drives in the next release. We know that when the customer needs to upgrade, we can just add another 15 terabytes versus having to add 150 terabytes.”

Further flexibility comes courtesy of Dell’s Advanced Dynamic Media Enclosure technology which decouples the compute and storage elements of the PowerMax appliance. This allows more options on the balance of compute nodes versus storage capacity, as well as scalability. It also heads off the dilemma users face when an array starts topping out on performance, but because they have no way to upgrade the controllers, they are forced to add another entire array.

But even with the improvements that have led to this radical density, the remorseless growth of data means that admins still have to consider just how much they want to keep on their premium storage platforms. PowerMax has had multi-cloud capabilities since its previous generation of appliances, with the ability to connect to, copy and move data from primary block stores on site to an S3 object store. “That can either be a private object store, like in the Dell world, it could be a PowerScale, or it could be an ECS platform. Or it could be a cloud provider.”

The latest generation, Delandy continues, brings higher performance, resiliency, and high availability. But also, he says, “a better understanding of the use cases.” For example, analysis of users’ arrays suggests up to a quarter of capacity is taken up with snapshot data. There are perfectly good reasons why companies want to keep snapshots, but it also makes perfectly good sense to move them off the primary storage and into the cloud, for example.

“Now you can run more transactional databases, more VMs, more Oracle, more SQL, more applications that need the throughput and the processing power of the array versus just holding all this stale, static data.”

It’s big, but is it secure?

Whether the data is transactional or static, security is the “number one thing” that users want to talk about these days, Delandy says. Often the conversation is simply a question of highlighting to users the pre-existing features in the system: “It’s really helping them understand what things they can do and what types of protection they already have, and how to enable what are the best practices around the security settings for that.”

But the biggest concern customers have is “somebody getting into the environment and not finding out fast enough that you’ve been breached.”

Two new features are crucial here. One is the inclusion of hardware root of trust, which sees cryptographic keys fused onto the controller chips. Everything then has to be authenticated against these, from boot ups to upgrades and driver updates. This significantly reduces the risk of a bad actor obtaining backdoor access to the systems.

In addition, PowerMax now uses anomaly detection to monitor the storage and detect changes to the types of datasets being written – including any failure to meet the 4:1 data reduction rates the system’s updated reduction algorithms can deliver. “One of the things that we look at is what’s the reducible versus non reducible rate, and how does that change. We have it set to be so sensitive, that if we start to see changes in reducibility rates, that can indicate that something is being encrypted,” explains Delandy.
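
A simplified sketch of that idea – a generic compression check, not Dell’s actual algorithm – is shown below: ransomware-encrypted blocks are effectively incompressible, so a sharp fall in the achieved reduction ratio against a learned baseline is treated as an anomaly signal.

import zlib

def reduction_ratio(block: bytes) -> float:
    """Original size divided by compressed size; ~1.0 means incompressible data."""
    return len(block) / max(1, len(zlib.compress(block)))

def detect_anomaly(ratios: list[float], baseline: float, threshold: float = 0.5) -> bool:
    """Flag when the recent average ratio falls well below the learned baseline."""
    recent = ratios[-100:]
    return (sum(recent) / max(1, len(recent))) < baseline * threshold

# Example: a 4:1 baseline with recent blocks compressing at only ~1.1:1
print(detect_anomaly([1.1] * 100, baseline=4.0))  # True -> possible encryption under way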

It’s a huge advantage for customers if they can get an indication of ransomware being at work within minutes, because encryption by ransomware typically plays out over days, weeks, or even months.

The ability to introduce automation, and balance both the long and short term view, is crucial to the whole PowerMax ethos. Dell has sought to take what was already a highly reliable platform and make it simultaneously a highly flexible platform on which it can deliver rapid innovation.

But as Delandy says, Dell has to ensure it is taking a deliberate, targeted approach to actually solving customer problems. Or, put another way, “We’re not just doing it because it’s cool.”

Sponsored by Dell.

NetApp sees ‘early signs of relief in supply availability’

Supply chain woes, war in Ukraine, inflation – none of these appear to have had much effect on NetApp, which declared strong results for its first fiscal 2023 quarter and forecast continued growth for the next, with CFO Mike Berry remarking: “We are seeing early signs of relief in supply availability.”

Revenues of $1.59 billion were reported for the quarter ended July 29, up 9 percent on the year, with profits of $214 million, up 5.9 percent annually. The outlook for the next quarter is for revenues between $1.595 billion and $1.745 billion, $1.67 billion at the mid-point, which would represent a 6.3 percent year-on-year increase.

Berry said in his earnings call remarks: “In Q1, despite elevated freight and logistical expense, significant component cost premiums and unprecedented FX headwinds, we delivered solid revenue with both operating margin and EPS coming in above the high end of guidance.”

CEO George Kurian added: “We delivered a great start to the year, with company all-time Q1 highs for billings, revenue, gross margin dollars, operating income, and EPS, fueled by broad-based demand across our portfolio and geographies. Achieving record results in the face of ongoing macroeconomic uncertainty, decades-high inflation, and supply constraints underscores our disciplined operational management.” 

NetApp revenues
A chart of quarterly revenues by fiscal years shows this is NetApp’s ninth consecutive growth quarter and confirms its Q1 record revenue (blue line)

He said that the COVID pandemic and turbulent macroeconomy – meaning inflation, the Ukraine-Russia war, and supply chain woes – caused an urgency for customers to respond to “the complexities created by rapid data and cloud growth, multi-cloud management, and the adoption of next-generation technologies, such as AI, Kubernetes, and modern databases.” Dollar foreign exchange rates did not work in NetApp’s favor either.

Financial summary

  • EPS: $0.96 compared to $0.88 a year ago.
  • Cash flow from operations: $281 million. It was $242 million a year ago.
  • Free cash flow: $216 million, down from last quarter’s $343 million but up from the year-ago $191 million.
  • Net cash: $802 million compared to $1.489 billion a quarter ago.
  • Gross margin: 66.7 percent vs 69.3 percent a year ago.
  • Deferred revenue: $4.2 billion, up 7 percent annually.

Product revenues were $786 million, up 7 percent. Software product revenues grew 15 percent year-on-year to $476 million, while hardware product revenues declined 2 percent on the year to $310 million.

AFA revenues

The run rate for all-flash arrays (AFA) was said to be $3 billion, implying $750 million AFA revenue for the quarter. This run rate was down on the prior quarter’s $3.2 billion.

Kurian attributed this to “the combination of some supply constraints and also we have an FX (foreign exchange) headwind to product revenue.” 

There was a possible switch in customer spending as well. “At times like this in the past, we have seen customers choose to buy more economic configurations in certain cases. So we had a strong quarter in our hybrid flash segment, which is really targeting customers who want to buy the most cost-effective configuration.”

Back in June, Pure, which only sells all-flash arrays, reported 50 percent year-on-year revenue growth to $620.4 million. Dell’s Q1 2023 results were reported in May with 16 percent revenue growth overall and storage growing at 9 percent to $4.2 billion. That storage growth rate is the same as NetApp’s, with Pure outpacing both of them.

Inside NetApp’s installed base, 32 percent of customers are now running all-flash arrays. It was 31 percent and 29 percent in the previous and year-ago quarters, respectively. 

NetApp revenues
NetApp’s revenues and profits fy2017 to fy2023 with Q1 column tops circled

Public cloud on the Spot

NetApp’s public cloud area is doing well, helped by the acquisition of Instaclustr, which contributed about $35 million of annual recurring revenue (ARR) in the quarter. Kurian said there was “strong demand for our Public Cloud services. Public Cloud ARR grew 73 percent year-over-year, exiting Q1 at $584 million. Public Cloud segment revenue grew 67 percent from Q1 a year ago to $132 million and dollar-based net revenue retention rate of 151 percent remains healthy. We continue to expand our Public Cloud customer base, the penetration into our Hybrid Cloud installed base, and the percentage of customers using multiple of our public cloud services.”

Public cloud revenues – led by Azure NetApp Files, the largest portion, AWS FSx for ONTAP, and Google CVS – are a small but constantly growing fraction of NetApp’s main (hybrid cloud) revenues.

NetApp cloud revenues

CFO Mike Berry said: “Our Public Cloud business had an outstanding quarter, with excellent performance by our Cloud Volume service offerings from AWS, Azure and Google Cloud, which collectively grew ARR over 100 percent year-over-year. We also saw improved execution in our CloudOps portfolio.”

Kurian commented: “We’ve had a really good start to the year in the Spot portfolio, where we’ve had new sales leadership, strong disciplined execution in the product team and in the field organization, and I feel pretty good about the focus so far.”

William Blair analyst Jason Ader said there was “a rebound in the Spot portfolio and continued Cloud Volumes Service strength across the three big hyperscalers (collectively up triple digits).”

Kurian answered an earnings call question about the public cloud business, saying: “We see strong demand for our offerings because they help our customers use the cloud more efficiently. Our storage cloud services are much more efficient than sort of hyperscaler native services, so you can get more performance for less dollars using our capabilities.”

NetApp will be adding sales and marketing head count in the public cloud area. It provided a forecast for Public Cloud ARR at the exit of fiscal 2023 in the range of $780 million to $820 million.

Supply chain

Wells Fargo analyst Aaron Rakers said public cloud ARR is expected to grow 10-11 percent quarter-on-quarter in the next quarter, implying 72 percent annual growth. He added: “NetApp noted that it is seeing early signs of supply chain relief and expects inventory normalization to be a tailwind to FCF (Free Cash Flow) through fy2023.”

Looking ahead, Mike Berry said: “We are seeing early signs of relief in supply availability. The timing of a full supply recovery remains uncertain… We are cautiously optimistic that supply constraints will ease further in the second half of our fiscal year, reducing our dependence on procuring parts at significant premiums. We should also start to see a benefit from declining prices for our hardware components.” Good news on the supply chain front will be welcome.

NetApp provided a full fiscal 2023 forecast of 6 to 8 percent revenue growth over 2022. At the mid-point that means $6.76 billion, which would be a record amount.

Hammerspace fixes broken data access layering

The basic idea behind Hammerspace’s technology is that the file system should sit above data orchestration and storage layers in the application data access stack and not be the bottom layer.

This leads to cross-vendor and cross-cloud data orchestration and access. Hammerspace CEO and founder David Flynn briefed us on the ramifications of this repositioning, as well as on Hammerspace’s business progress and recent revenue growth. We’ll look at the file system layer repositioning here, then move on to the follow-on features and advantages, with reference to a radar diagram, and to business progress in the later part 2 and part 3 articles.

Here’s a Hammerspace briefing deck slide to show David Flynn’s layering views:

The left-hand stack is today’s stack with file systems embedded in a storage system layer with a data management and protection layer above that. These three layers sit below the application. The right-hand stack has the storage system at the bottom with the file system layer sitting directly beneath the application. The application talks to the file system layer and it then talks to the data orchestration layer, not a data management and protection layer, which, in turn, operates on the storage layer.

New layer cake

David Flynn

In Flynn’s view: “File systems have been embedded within the storage system or service. If I use EFS, that’s inside of Amazon and a specific region. If I use a NetApp that’s in a specific data centre in a specific locality, and data management has always been done between these. This leads to a fractured and shifting view of data, because the act of managing data is actually creating different data, because you’re copying from one file system to another in a different thing.”

“And that leads to [what] I would call data schizophrenia. Where’s my data? I have no control of my data. Either I imprison it in a silo, or I manage it by copy.”

Flynn said: “Data management disappears as a separate thing and it becomes data orchestration, behind the facade of the file system.”

“What we are proposing is radical in that it’s a reordering of the very layering, that whole industries have been built around. Data management and storage infrastructure have assumed this broken ordering and layering. And so what we’re saying is, look, pull the file system out. The file system should be adjacent to the applications and the users. It should transcend any of the infrastructure, and that allows data management to disappear as a separate thing and become data orchestration behind the file system.”

This re-layering provides a single view of data: “It gives you a singular consistent view of data, data that you have minute and powerful control of, because, through the file system, you’re expressing how the data should be rendered, how it should exist across your highly decentralised and highly diverse infrastructure.” 

This requires a new kind of file system. 

File system development history

Flynn sees a progression of file system development over time: “Going back to the beginning, file systems have always been the heart of the problem, and have always been the most challenging part of the operating environment, whether it was operating systems before or cloud operating environments. The file system has been the most difficult thing.”

Hammerspace slide

“DOS was named after a disk operating system, right? The old FAT16, FAT32. And it’s what made Microsoft after they pulled that file system out of the mainframe and put it into the individual OS, and allowed a commoditization away from mainframes.

“NetApp took that file system and embedded it in an appliance. Remember, they took the file system and put it into the array. And that was the defining difference between a RAID array or Fibre Channel SAN, and NAS. And what made NAS so sticky is it had a file system embedded in it, which gives you a shared view of data. It improved the scope and enabled the data to be more global than it was before. And there was no going back from that.”

“We went forward with Isilon, where you now were able to scale out across multiple nodes. And so they improved the scalability. And by the way, they didn’t take market share from NetApp, they created a new market by taking it to a new scale, where file systems had not gone before.”

The public cloud vendors wanted more scale and prioritized that over file system performance: “Along comes folks who want to build the cloud. They want a utility model of computing, they want to host multiple tenants in the hundreds and thousands in their data centre-scale mainframe. And building a file system is massively difficult. So what do they do?

“They throw out the baby with the bathwater, and they say, let’s invent super simple storage. Let’s have it do nothing but data retention, not have any performance. Accessing OSes don’t know how to talk to it naturally; you’d have to rewrite your application. You don’t have random read and write. The OS can’t just page it in. Right. So they basically said, let’s throw out everything that’s hard and everything that’s useful out of a file system.”

He backs this up by saying Amazon’s EFS “doesn’t support Windows and doesn’t have snapshots and clones.”

He asks: “And why has NetApp and Isilon survived with the onslaught of cloud? It’s because their file system, the WAFL file system, and OneFS, they’re really the only games out there when it comes to a true enterprise grade capable file system. And the file systems in the cloud are very poor approximations of that.”

In his thinking, Hammerspace’s Global Data Environment platform is what comes next: “What I’m telling you is we are the logical successor here.”

Hammerspace is: “building a file system that is now fully spanning of all infrastructure, including across whole data centres, that has never existed before. And we did that by using parallel file system architecture. And pushing open standards to have that parallel file system plumbing.”

Flynn says: “The NFS 4.2 open source implementation came from my team; the standard, the open source, and we put Windows SMB and old NFS v3 in. We bridge them on top. So now they get the benefit of scalability of a parallel file system. But you can still serve all the legacy protocols. It’s the best of both worlds: parallel file system scalability with legacy protocols in front of it.”

He said: “In the supercomputing world, in HPC, they’ve used parallel file system architecture to scale performance. It just turns out, not surprisingly, that that same architecture lets you scale, across third party infrastructure, across data centres, and to support many, many different clients, because now you can instantiate this on behalf of each client on each customer. But it’s a new breed of file system. It’s a file system that operates by reference, instead of a file system that’s operating by being in the storage infrastructure.”

Therefore: “At the end of the day, what this means is that you have unified metadata. And you have a single consistent view of data, regardless of location – regardless of the location where you’re using the data or the location where the data is originally housed. And through AI and machine learning, we can orchestrate data to be where you need it when you need it.”

Management

He says: “We’re in desperate need of digital automation, to automate the management of data.” And it can’t be done with old-style layering.

Flynn asserts that: “If the management is outside of the file system then, by copying between file systems, you’re fracturing the view of the data. It’s different data. You have to put the management behind the file system before you can automate it.”

In his view: “You have to fix the layering and the entire industry, these industries worth tens of billions of dollars, have been operating under that broken assumption that management is outside of the file system.”

Once data management, as a separate upper stack layer, is replaced by a data orchestration layer below the file system, then you can automate data management; it becomes a set of orchestration functions. 

Up-down flow in Hammerspace layers

Flynn says current file data sharing – between the dispersed entities in a film studio’s video production supply chain, for example – works on the basis of file copying.

He says that: “What Hammerspace represents for the first time is the ability to share data by reference by having people in the same file system across different organisations. … the point is that you’re sharing data by reference within the file system instead of sharing data by copy.”

If we say the dispersed people in this data supply chain operate in the same file system namespace then that might make things clearer. The filesystem moves files to where they are needed instead of the application or users making and sending copies around this ecosystem, from silo to silo. How does Hammerspace achieve this?

It uses metadata tagging. It has an integration with Autodesk ShotGrid, the production management and review toolset for VFX, TV, movie production, animation and games teams.

Autodesk ShotGrid asset library screen

Flynn said: “So ShotGrid, just by saying I want to do this rendering job over in this data centre, will label the metadata in the file system and that will cause the data to be pre-orchestrated [moved] into the other data centre, just the stuff that’s actually needed for that job.”

Here, the application, ShotGrid, gives its intent to the file system. It needs a file or files to be made available to users at a distant data centre and inputs that intent as a metadata item. The file system picks this up and orchestrates the data movement from the source storage system to the destination storage system – showing the up-down interaction between the layers in Hammerspace’s re-organised application data stack. ShotGrid is not concerned with the underlying storage systems at all here.
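
Conceptually, the flow looks something like the sketch below. The tag name, site names and helper functions are hypothetical, not Hammerspace’s or ShotGrid’s actual APIs; the point is that the application writes an intent tag as metadata and the data layer reacts to it.

# Conceptual sketch only: "place-on-site", the site names and orchestrate() are
# hypothetical stand-ins, not Hammerspace's or ShotGrid's real interfaces.
metadata_store = {}   # path -> {tag: value}, standing in for file system metadata

def tag_file(path: str, site: str) -> None:
    """Application (e.g. a render manager) declares where the file is needed."""
    metadata_store.setdefault(path, {})["place-on-site"] = site

def orchestrate(placement: dict) -> None:
    """Data layer reacts to intent tags; the application never touches storage."""
    for path, tags in metadata_store.items():
        wanted = tags.get("place-on-site")
        if wanted and placement.get(path) != wanted:
            print(f"moving {path} -> {wanted}")   # a real system would move the data itself
            placement[path] = wanted

placement = {"/shots/sc01/plate.exr": "datacenter-A"}
tag_file("/shots/sc01/plate.exr", "datacenter-B")
orchestrate(placement)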

****

Part two of this examination of Hammerspace will look at its radar positioning diagram and then we’ll have a check on its business progress in a part three.

Zilliz raises $60m for cloud vector database

Zilliz

Startup Zilliz has raised $60 million to boost engineering and go-to-market efforts for its cloud vector database.

In July we wrote that Zilliz, founded in 2017 with $53 million funding, had developed the Milvus open-source vector database. It’s aimed at helping AI applications turn unstructured data into intelligent, usable information for applications such as new drug discovery, computer vision, recommendation engines, and chatbots.

Charles Xie, Zilliz founder and CEO, said: “Milvus has now become the world’s most popular open-source vector database with over a thousand end-users. We will continue to serve as a primary contributor and committer to Milvus and deliver on our promise to provide a fully managed vector database service on public cloud with the security, reliability, ease of use, and affordability that enterprises require.”

We understand that vector databases are designed to index vector embeddings for search and retrieval by comparing values and finding those that are most similar to one another. A vector embedding consists of numeric values in arbitrary dimensions, hundreds or even thousands of them, that describe a complex data object.

These objects can be as simple as words, and move up the complexity scale to include sentences, multi-media text, images, video, and audio sequences. 

Machine learning models are used to convert unstructured data into vector embeddings, which are stored as objects in the database. The vector database carries out basic create, read, update and delete (CRUD) operations on its object content and provides metadata filtering. It can then be searched without users needing to know specific keywords or metadata classifications for the stored objects. The search term is processed into a vector using the same machine learning system used to create the database’s contents (embedded objects). The returned results can be identical or similar (near-neighbor) to the search term.

We might imagine a vector database of images could be asked to find all occurrences of a kitten, for example, or a film/TV program website could provide recommendations to a user based on their viewed video material.
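
As a generic illustration of how that search works – plain cosine similarity over stored embeddings, not Milvus’ or Zilliz Cloud’s API – the query is embedded with the same model as the stored objects and compared against them:

import numpy as np

def cosine_similarity(query: np.ndarray, stored: np.ndarray) -> np.ndarray:
    """Similarity of one query vector against every stored embedding."""
    return (stored @ query) / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query) + 1e-12)

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10_000, 512))       # stand-ins for stored image/text embeddings
labels = [f"object-{i}" for i in range(10_000)]   # e.g. image IDs

query = rng.normal(size=512)                      # embedding of a query such as "kitten"
scores = cosine_similarity(query, embeddings)
top5 = np.argsort(scores)[::-1][:5]               # near-neighbour results
print([labels[i] for i in top5])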

Zilliz says modern AI algorithms use feature vectors to represent the deep semantics of unstructured data, necessitating purpose-built data infrastructure to manage and process them at scale. 

Its fully managed offering is currently in private preview for early access on Zilliz Cloud. This is available by invitation to customers for testing and feedback before becoming more broadly available. The long-term idea for Zilliz Cloud is for it to become a fully managed vector database-as-a-service (DBaaS) providing an integrated platform for vector data processing, unstructured data analytics, and enterprise AI application development.

The funding round is an extension to Zilliz’s initial $43 million Series B round, and was led by Prosperity7 Ventures, a diversified growth fund under Aramco Ventures, with participation from existing investors Temasek’s Pavilion Capital, Hillhouse Capital, 5Y Capital, and Yunqi Capital. Total Zilliz funding is now $113 million.

The $60 million cash influx follows profound growth from Zilliz. Milvus downloads crossed the one million mark, tripling from 300,000 downloads a year ago; production users grew by 300 percent; GitHub stargazers grew 200 percent to over 11,000; and the number of contributors doubled.

To apply for early access to the Zilliz Cloud preview, fill out the form here.

XenData kit takes tape copies of cloud archives

Archiving system supplier XenData has launched an appliance which makes a local tape copy of a public cloud archive to save on geo-replication and egress fees.

XenData’s CX-10 Plus is a networked rackmount box containing an SSD system drive and a 14TB disk drive cache. Its system software sends incoming data out to the public cloud and retains a local synchronized copy of the cloud archive files on LTO tape cartridges.

XenData CX-10 Plus
2-rack unit CX-10 Plus atop a 2-cartridge LTO tape drive chassis

CEO Phil Storey stated: “The CX-10 Plus has two key benefits. Creating a local synchronized copy of every file written to the cloud gives peace of mind. And the Appliance easily pays for itself by minimizing cloud storage fees.”

That’s because having the local tape copy of the archive means customers don’t have to pay public cloud geo-replication fees and, with restores coming from local tape, they don’t have to pay public cloud egress fees either. The CX-10 Plus pricing starts from $11,995, which doesn’t cover the necessary tape appliance – two managed LTO drives or an LTO autoloader.

XenData CX-10 Plus diagram

The system’s data flow starts with one or more users sending it files for archiving. They land on a 14TB disk drive. A multi-threaded archive process then writes the data to the public cloud – AWS S3 (Glacier Flexible Retrieval, Glacier Deep Archive), Azure Blob (Hot, Cool and Archive tiers), Wasabi S3, and Seagate Lyve.

All archive files uploaded to the cloud are retained on the disk cache for a defined retention period, typically a day; the disk is not that big. Every few hours the files are synchronized to LTO, creating a mirror copy of the file-folder structure that has been archived to the cloud.
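
In outline – purely as a conceptual sketch rather than XenData’s implementation – the cache, cloud and tape copies are kept in step along these lines:

import time

RETENTION_SECONDS = 24 * 3600   # "typically a day"

cache = {}                      # path -> arrival timestamp on the disk cache
cloud, tape = set(), set()

def archive(path: str) -> None:
    cache[path] = time.time()
    cloud.add(path)             # the real appliance uploads with multiple threads

def sync_to_tape() -> None:
    tape.update(cloud)          # mirror the cloud file-folder structure onto LTO

def expire_cache() -> None:
    now = time.time()
    for path in [p for p, t in cache.items() if now - t > RETENTION_SECONDS]:
        del cache[path]         # safe to drop: copies exist in the cloud and on tape

archive("/archive/projectA/shot01.mov")
sync_to_tape()
expire_cache()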

XenData says the CX-10 Plus is optimized for media archives enabling users to play video files directly from the cloud. It integrates with many media applications including Media Asset Management systems.

The CX-10 Plus fits into XenData’s appliance range:

  • X1 – Appliance to manage local LTO tape drives
  • CX-10 Plus – cloud archive appliance with local LTO tape backup 
  • X20-S Archive appliance – to manage 2-drive external LTO tape library
  • X20-S – 2-drive LTO tape library
  • X40-S – 4-drive LTO tape library
  • X40-S Archive appliance – to manage 4-drive external LTO tape library
  • X60-S Archive appliance – to manage up to 10-drive external LTO tape library
  • X100 Archive – manage LTO robotic libraries that scale to 100+ PB

An S3 object storage interface may be added to any XenData LTO Appliance via a software upgrade. This creates a private cloud that competes with public cloud storage services such as AWS Glacier and the Archive Tier of Azure object storage.

XenData was started in 2001 and has a headquarters office in Walnut Creek, California, and an outlying office in Cambridge, UK. The CX-10 Plus will be available in September.

Storage news ticker – August 23

Storage news

High-end storage array supplier Infinidat has announced NVMe/TCP certification with vSphere ESXi, and vVols replication with VMware Site Recovery Manager (SRM) integration. These new capabilities complement Infinidat’s VMware integrations, including vSphere, vRealize, and VMware Tanzu. VMware’s general support of NVMe/TCP was a milestone reached less than a year ago with vSphere 7.0U3. InfiniBox vVols replication with SRM adds VMware-native disaster recovery to Infinidat’s vVols implementation. SRM users can coordinate InfiniBox asynchronous replication for vVols via VMware Storage Policy-Based Management without needing a separate Storage Replication Adapter (SRA). Infinidat customers not using vVols can continue using the existing InfiniBox SRA to manage InfiniBox replication at no charge.

Enterprise data observability supplier Acceldata has released its Data Observability Platform (DOP), which looks at data in cloud-native, multi-cloud, hybrid or on-premises environments. The DOP is said to ensure reliability of data pipelines, provide visibility into the data stack – including infrastructure, applications, and users – to help identify, investigate, prevent, and remediate data issues. Acceldata customers include Oracle, PubMatic, PhonePe (Walmart), Pratt & Whitney, DBS, and others.

UK and Ireland-based INET Computer Solutions helps its clients safeguard their systems and data with Arcserve offerings, including the new SaaS Backup to protect Microsoft 365 data. INET has worked with Arcserve for over 10 years to provide its clients with backup and recovery services. It uses Arcserve ShadowProtect to protect data for a large number of clients. Featuring HeadStart Restore, this enables INET to recover systems and data for its clients in the event of an incident in the time it takes to reboot a server. INET also uses it to safeguard its own systems and data.

… 

Data protector Commvault says that Gartner has given it the highest Product Score across all three use cases in the 2022 Critical Capabilities for Enterprise Backup and Recovery Software Solutions report: data center environments (4.23/5), cloud environments (4.18/5), and edge environments (4.22/5). Commvault Complete Backup & Recovery scored highest in all three use cases evaluated in this research.

NVMe/TCP storage supplier Lightbits has announced its latest patent (US 11,385,798) for a “method and system for application-aware management of write operations on non-volatile storage.” The technology reduces write amplification in SSDs. Muli Ben-Yehuda, Lightbits co-founder and Chief Scientist, said: “Lightbits Intelligent Flash Management uses the ideas in this patent as well as other issued and pending patents for maximizing SSD performance and endurance. The exact write amplification depends on the application data streams and the classification process. In the worst case, Lightbits IFM can provide overall write amplification of less than 2.3x for random 4K write workloads with the system at 80 percent capacity utilization. This is the write amplification for the system as a whole, including both the SSDs’ internal write amplification and any additional software-induced write amplification.”

The patented technology takes account of some application-specific aspects of data being sent to the SSD: “We (1) classify different data streams based on their metadata properties as having certain characteristics (e.g., how long the block is likely to remain alive before being overwritten, i.e., Time Before Rewrite (TBR)); (2) write the data to a certain region of the SSD which is associated with this classification; (3) when needed, use the system’s GC process to correctly place mis-classified data. This is a natural fit for zoned SSDs but can also be done on commodity SSDs and does not require zoned-SSD support.”
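
An illustrative sketch of that idea – not Lightbits’ code, and with hypothetical metadata fields – classifies each incoming write by an expected Time Before Rewrite and directs it to an SSD region reserved for that class, so blocks with similar lifetimes are garbage-collected together:

from dataclasses import dataclass

@dataclass
class Write:
    lba: int
    stream_id: int        # hypothetical metadata carried with the write
    is_metadata: bool

def classify_tbr(w: Write) -> str:
    """Map metadata properties to an expected-lifetime class."""
    if w.is_metadata:
        return "short"    # journal/metadata blocks tend to be rewritten quickly
    return "long" if w.stream_id == 0 else "medium"

REGIONS = {"short": [], "medium": [], "long": []}   # stand-ins for SSD zones/regions

def place(w: Write) -> None:
    REGIONS[classify_tbr(w)].append(w.lba)          # GC later fixes any mis-classified data

place(Write(lba=1024, stream_id=3, is_metadata=False))
print({k: len(v) for k, v in REGIONS.items()})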

Rockport Networks recently posted a video interview with Dr Alastair Basden from Durham University discussing the use of the Rockport scalable fabric at his tier 1 national supercomputing facility. Additionally, Rockport and Intersect360 Research published a whitepaper on the top considerations and trends when architecting today’s HPC clusters based on 2022 feedback from HPC users – you can download it here. On August 24 at 1pm ET, Rockport has a webinar, “Security at the Speed of Research”, with CANARIE, the University of Toronto and Carleton University’s Cyber Lab, to discuss the current threat landscape, explain how research environments are unique, and discuss a Canadian-based innovation that can mitigate network risks without sacrificing performance. Full details and registration info can be found here.

Object storage system supplier Scality has a customer win case study. Canada’s Edmonton Police Service overhauled its storage infrastructure environment to improve data availability, scalability, and accessibility while saving public funds. The deployed system includes the HPE GreenLake edge-to-cloud platform, HPE scalable object storage with Scality RING, HPE 3PAR StoreServ 8000 storage, and the HPE Apollo 4510 Gen10 and HPE Apollo 4500 systems, with Veeam for backup. This system saves $2 million CAD over five years, reduces backup times by 76.9 percent, provides infinite scalability and reliable performance, and achieves multisite resiliency with datacenter rapid recovery.

But why a 3PAR 8000 when Primera and Alletra arrays have been announced to replace them? A Scality spokesperson said: ”The contract with Edmonton was signed about two years ago, just around the time Primera was announced (and before Alletra). They chose the 8000 because it was a well-established product and met the customer’s needs. Also, because Greenlake is essentially a cloud service, the physical infrastructure is not as relevant; the service level it provides is what matters.”

Talend has been named a leader in the 2022 Gartner Magic Quadrant for Data Integration Tools. This is the seventh consecutive time that Talend has been positioned in the Leaders Quadrant based on the company’s ability to execute and completeness of vision. For a complimentary copy of Gartner’s complete report, click here.

TidalScale, whose software “glues” AWS EC2 bare-metal instances and on-premises physical servers together so that they function as a single larger system, is now available in the AWS Marketplace. TidalScale requires no changes to applications or operating systems and is deployable within minutes. Anyone with an AWS account can now purchase TidalScale’s software-defined server technology through their AWS account in just a few clicks.

StorPool adds NVMe/TCP and NFS and ports to AWS

StorPool Storage has added NVMe/TCP access to its eponymous block storage system, file access with NFS, and ported it to AWS.

This v20.0 release also adds more business continuity, management and monitoring upgrades, and extends the software’s compatibility. StorPool was started in Bulgaria in 2011 to provide a virtual SAN using the pooled disk and SSD storage – StorPool – of clustered servers running KVM. It has been extensively developed and improved steadily over the years since then. For example, v19.3 came along in August last year adding management features and broad NVMe SSD support. v19.4 in February brought faster performance, updated hardware and software compatibility, management and monitoring changes, and improvements in the business continuity area.

A statement from CEO Boyan Ivanov said: “With each iteration of StorPool Storage, we build more ways for users to maximize the value and productivity of their data. These upgrades offer substantial advantages to customers dealing with large data volumes and high-performance applications, especially in complex hybrid and multi-cloud environments.”

StorPool diagram.

StorPool says its storage systems are targeted at storing and managing data of primary workloads such as databases, web servers, virtual desktops, real-time analytics solutions, and other mission-critical software. The company’s product was classed as a challenger in GigaOm’s radar report looking at Primary Storage for Midsize Businesses in January this year.

NVMe/TCP and NFS

The added NVMe/TCP access, which is becoming a block access standard, provides an upgrade for iSCSI access, using the same Ethernet cabling. Customers experience high-performance, low-latency access to standalone NVMe SSD-based StorPool storage systems, using the standard NVMe/TCP initiators available in VMware vSphere, Linux-based hypervisors, container nodes, and bare-metal hosts. The NVMe target nodes are highly available. If one fails, StorPool fails over the targets to a running node in the cluster. 

The NFS server software instances on v20 are also highly available. They run in virtual machines backed by StorPool volumes and managed by the StorPool operations team. These NFS servers can have multiple file shares. The cumulative provisioned storage of all shares exposed from each NFS Server can be up to 50TB. 

StorPool is careful to say that NFS is for specific use cases, mentioning three. Firstly, this NFS supports moderate-load use cases for access to configuration files, scripts, images, and for email hosting. Secondly, it can support cloud platform operations, such as secondary storage for Apache CloudStack and NFS storage for OpenStack Glance. Thirdly, it’s good for throughput-intensive file workloads shared among internal and external end users. Think of workloads such as video rendering, video editing, and heavily loaded web applications.

However, NFS file storage on StorPool is not suitable for IOPS-intensive file workloads like virtual disks for virtual machines.

StorPool on AWS

StorPool storage can now be deployed in sets of three or more i3en.metal instances in AWS. The solution delivers more than 1.3 million balanced random read/write IOPS to EC2 r5n and other compatible compute instances (m5n, c6i, r6i, etc.). StorPool on AWS frees users of per-instance storage limitations and can deliver this level of performance on any instance type with sufficient network bandwidth. It achieves these numbers while utilizing less than 17 percent of client CPU resources for storage operations, leaving the remaining 83 percent for the user application(s) and database(s).

A chart shows latency vs IOPS of 4KB mixed read/write storage operations on an r5n client instance. The StorPool storage system, when running on 5x i3en instances, delivers more than 1,200,000 IOPS at very low latency, compared to io2 Block Express, which tops out at about 260,000 of the same type of IOs.

StorPool graph

Read the technical details about StorPool on AWS here.

StorPool on AWS is intended for single-node workloads needing extremely low latency and high IOPS, such as large transactional databases, monolithic SaaS applications, and heavily loaded e-commerce websites. Also, workloads that require extreme bandwidth block performance can leverage StorPool to deliver more than 10 GB/sec of large block IO to a single client instance. Several times more throughput can be delivered from a StorPool/AWS storage system when serving multiple clients.

ESG Practice Director Scott Sinclair said: “Adding NVMe/TCP support, StorPool on AWS and NFS file storage to an already robust storage platform enables StorPool to better help their customers achieve a high level of productivity with their primary workloads.”

Comment

With its 20th major release, StorPool’s storage software is mature, reliable, fast and feature-rich. Think of it as competing with Dell PowerStore, NetApp ONTAP, HPE Alletra, IBM FlashSystem, and Pure Storage in the small and medium business market, where customers may need unified file and block access in a hybrid on-premises and AWS cloud environment. Find out more about StorPool v20 here.

Burlywood launches FlashOS firmware for SSDs

SSD controller startup Burlywood has launched its FlashOS software for building software-defined SSDs, claiming the longest-lasting drive performance in the industry – but it gives few detailed numbers to back this up.

FlashOS firmware is supposed to analyze application workloads at the flash controller level, then tune the SSD for better production application performance and endurance. Firmware is loaded into an SSD’s controller and acts as the drive’s operating system – also called the Flash Translation Layer, as it converts host logical addressing to physical drive mapping.

Burlywood CEO Mike Jones said: “FlashOS will transform what the industry expects from an SSD and ignite the next wave in datacenter innovation.” CTO and founder Tod Earhart added: “In the market today, datacenter workloads are not being met. We are excited to bring FlashOS to cloud and datacenters that are tired of simply accepting SSD performance failure rates.”

The promised feature list includes:

  • Self-Adaptive Architecture driven by real production workload analysis, provides visibility for data placement intelligence to optimize performance and efficiency;
  • Data Stream splitting to NAND physical partitions minimizes the noisy neighbor problem;
  • Longest-lasting drive performance in the world;
  • Best in class latency consistency, eliminating spikes;
  • Lower Total Cost of Ownership (TCO) through higher endurance;
  • Advanced software algorithms with a flexible base, enables rapid adaptation to changing production workloads;
  • Best performing applications in write-heavy workload environments.

Burlywood says an unnamed premier European storage manufacturer was the first to embrace FlashOS before its public release. Its production SSD is specifically designed and optimized for typical datacenter workloads. It claims the drive delivers “2–5x” better performance with a steady state, consistent latency, and endurance over its life under real-world, complex workloads without any host modifications.

Burlywood chart

It also claims customers realize 2x greater drive endurance resulting in a significant reduction in drive replacements over the life of the platform. But no actual numbers are supplied.

This sounds great – doubled drive endurance, no latency spikes for consistent performance, and great write-heavy performance. How do you get it? Burlywood says customers can realize the benefits of FlashOS in two different ways. The first is buying SSD drives directly from one of the manufacturers using its controller in their SSDs – but it doesn’t say who these manufacturers are.

The second is by licensing FlashOS and using Burlywood’s SSD reference design to have one of its contract manufacturing partners build SSDs to its specifications.

A Burlywood product sheet document talks about a FlashOS SSD in 2.5-inch, U.3 format with 8TB capacity using TLC NAND and a PCIe 4 x4, NVMe 1.4 interface. Its sequential read and write performance numbers are up to 7GB/sec reading and 4GB/sec writing. That’s pretty similar to Intel D7-P5500/5600 and Solidigm D7-P5520/5620 PCIe 4 SSDs. The endurance is not revealed, but charts show the FlashOS SSD’s superiority in write workloads and consistent bandwidth and write latency over time.

Burlywood chart

Comment

Effectively these FlashOS SSDs are out of reach unless you are a large enterprise/hyperscaler customer who can afford the time, complexity, and expense involved in having Burlywood arrange a contract manufacturer for you.

It needs to answer the question of why you would bother, when FADU partner SK hynix can supply a 2, 4 or 8TB PCIe 4 SSD with FADU’s FC4121 controller, which is claimed to deliver the industry’s highest PCIe 4 performance with consistent low latency and superior QoS. The read performance is up to 7.2GB/sec and write performance up to 4.4GB/sec – faster than the FlashOS SSD. This FADU SSD, on the other hand, is a relatively known quantity compared to the FlashOS SSD.

Burlywood chart

Burlywood was started up in 2015 in Longmont, Colorado, by then-CEO Tod Earhart, and its initial financing was funded primarily by angel investors, not VCs. It had a $1.4 million round in 2017 and a $10.6 million A-round in 2018. The A-round was led by Michael Jones and John Scarano with participation from Acadia Woods Partners. There has been no subsequent funding.

Mike Jones.

Jones became CEO in April this year. LinkedIn shows 13 Burlywood employees, most focused on firmware, and just three executives – CEO, CTO, and an operations/HR person. A spokesperson said: “Mike has an extensive technology background, having built, and led several thriving high-tech companies. He was inspired by the opportunity he saw to bring meaningful innovation to the storage industry, providing better SSD performance for data centers and enterprises. Mike is leveraging his experience in launching and building successful companies to help Burlywood launch its flagship product and rapidly scale its presence in the storage market.”

FlashOS could be great SSD firmware, but Burlywood needs a way to get more detail on that out to prospective customers. At the moment, we can’t see how it is going to ignite the next wave in datacenter innovation without a major increase in market presence and momentum.

Storage news ticker – August 22

Storage news

Dell has updated its multi-hypervisor-supporting PowerFlex HCI software to v4.0, adding unified tools and UIs for lifecycle management and IT Ops with automated storage management across file and block storage services in the unified PowerFlex Manager. PowerFlex file services complement the existing block storage. NVMe/TCP protocol support has been added for extra host connectivity. There’s more information in a PowerFlex blog.

CXL momentum is building. Analyst Dylan Patel says every major semiconductor and datacenter company has joined the standard with a wave of first-generation devices nearing release. Products coming from some 20 companies will include CPUs, GPUs, and accelerators, switches, NICs, DPUs, IPUs, co-packaged optics, memory expanders, memory poolers, and memory sharers. Read this substack post to find out more.

Gigabyte has announced its Aorus Gen5 10000 SSD supporting PCIe 5 and with 200-plus layer 3D NAND in 1, 2 and 4TB M.2 2280 format. It delivers 1.3 million/1.16 million random read/write IOPS and 12.5/10.1 GB/sec sequential read/write bandwidth. The drive uses Phison’s PS5026-E26 8-channel controller and comes with a removable copper heat sink. The performance is on a par with other gaming gumstick PCIe 5 drives from ADATA and Apacer. For the latter it should be; they both use the same Phison controller.

Aorus storage

Here is an Infinidat slide from an Eric Herzog presentation at FMS 2022, which makes some good general points:

Flash storage, Infinidat

Industry standards body JEDEC has published JESD220F: Universal Flash Storage 4.0. UFS 4.0 introduces significant bandwidth and data protection improvements over the earlier version of the standard. It leverages the M-PHY version 5.0 specification and the UniPro version 2.0 specification to double the UFS interface bandwidth and enable up to ~4.2 GB/sec for read and write traffic. The UFS 4.0 standard also introduces a Multi-Circular Queue definition for more demanding storage I/O patterns, and an advanced RPMB interface allows increased bandwidth and protection for secure data. An update to the complementary JESD223E UFSHCI 4.0 standard, and a new companion standard for UFS version 3.1 and above, JESD231 File Based Optimization, have also been published. All three standards are available for download from the JEDEC website.

Open-source relational database supplier MariaDB has bought CubeWerx for its cloud-native geospatial technology for an undisclosed amount, and will offer it through its fully managed SkySQL service. CubeWerx manages data in tiers for extreme scale and high performance. It uses MariaDB to manage frequently used vector data (e.g. a geolocation) and for intelligent caching, while relying on unlimited cloud storage for raster data, which tends to be voluminous (e.g. satellite imagery). With this approach, the companies say, data volume is no longer a limiting factor and geospatial data can grow to any scale.

Jags Ramnarayan, VP and GM for SkySQL at MariaDB Corporation, said: “While other databases such as PostgreSQL and Oracle have added geospatial capabilities directly into the database, we are taking a modern cloud-native approach of managing virtually infinite amounts of geospatial data on low-cost, durable cloud storage and providing OGC standards-based REST APIs to access the data. We believe this approach will allow MariaDB to leapfrog the database world for geospatial application development.”
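For a rough idea of the vector-data tier described above, here is a minimal sketch using MariaDB’s standard spatial SQL from Python. The connection details, table and geometries are hypothetical, and this is generic MariaDB functionality rather than the CubeWerx/SkySQL geospatial service itself (the raster/object-storage tier is not shown).

```python
# Minimal sketch, assuming MariaDB Connector/Python and a reachable server;
# credentials, table and geometries are hypothetical placeholders.
import mariadb

conn = mariadb.connect(host="127.0.0.1", user="demo", password="demo", database="geo")
cur = conn.cursor()

# A simple vector-data table holding point geometries.
cur.execute("""
    CREATE TABLE IF NOT EXISTS sensor_locations (
        id INT PRIMARY KEY AUTO_INCREMENT,
        name VARCHAR(64),
        location POINT NOT NULL
    )
""")

# Insert a geolocation supplied as WKT text.
cur.execute(
    "INSERT INTO sensor_locations (name, location) VALUES (?, ST_GeomFromText(?))",
    ("buoy-17", "POINT(-0.1276 51.5072)"),
)

# Find points inside a bounding polygon and read the coordinates back out.
cur.execute("""
    SELECT name, ST_X(location), ST_Y(location)
    FROM sensor_locations
    WHERE ST_Contains(ST_GeomFromText(
        'POLYGON((-1 51, 1 51, 1 52, -1 52, -1 51))'), location)
""")
for name, lon, lat in cur.fetchall():
    print(name, lon, lat)

conn.commit()
conn.close()
```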

Intel and MinIO have jointly announced a stronger collaboration so that MinIO’s object storage software can deliver better performance on Xeon processors. The two will “deliver superior performance across mission critical workloads, creating optimized infrastructure options for customers.” That’s it then; game over for AMD and Arm.

Napatech says it has reached the milestone of over 350,000 programmable SmartNIC port shipments and has more than 400 customers globally using them in infrastructure processing unit (IPU) and data processing unit (DPU) designs. It says it is the number one vendor of FPGA-based SmartNICs. Hyperscalers were the early adopters, and more than 70 percent of all SmartNIC ports deployed globally are based on FPGAs.

Cloud file data services supplier Nasuni has announced the immediate availability of Nasuni Access Anywhere, a file access offering for hybrid and remote workers based on its acquisition of Storage Made Easy (SME) back in June. Enterprises can extend the Nasuni File Data Platform to give the hybrid workforce high-performance file access and the ability to manage files from anywhere, on any device. The add-on service is a secure, VPN-less offering enabling resilient access to files from any location, and it integrates with Microsoft 365, Microsoft Teams and Slack for frictionless collaboration through a single platform.

FADU goes for SSD controller gold

FADU is a fabless SSD and controller company making screamingly fast NVMe technology, such as its coming PCIe 5 drives with 14.6GB/sec read and 10.4GB/sec write bandwidth. Where does it come from, and how has it achieved this?

It was founded in South Korea in 2016 by four co-founders:

  • Jackie Lee – first CEO and an investor;
  • Lee Ji-hyo (Jihyo Lee) – current CEO and ex-Bain partner;
  • Nam Eyee-hyun (Peter Nam) – CTO;
  • Lee Dae-keun (Dae Keun) – COO.

The NAND technical smarts came from Peter Nam – Eyee Hyun Nam on LinkedIn – who has a PhD in computer science from Seoul National University, spent two years at SK hynix as a project leader, and then started working on FADU ideas with the others.

FADU E1.S Echo drives using the FC5161 controller

Jihyo Lee was a manager and then partner at Bain & Company in Korea where he initiated and then led Bain Seoul’s tech sector operations. Jackie Lee was, we understand, also a Bain partner and consultant for memory companies. He is a FADU investor and left about two years after it was started.

FADU is funded by South Korean investors, all private, and there have been three rounds of funding with a fourth expected soon. In May it was reportedly seeking ₩200 billion ($157 million) for global expansion. It raised ₩30 billion ($38 million) in February this year.

That report also said FADU’s business partner, SK hynix, signed a deal to supply SSDs to Meta (Facebook as was) earlier this year. FADU reported revenues of ₩5.2 billion in 2021 with an operating loss of ₩33.7 billion.

The company opened a Mountain View office in 2019, and John Baskett (ex-Memblaze and Broadcom), VP business development & strategic partnerships, and Anu Murthy (ex-Seagate and Toshiba), VP marketing, are both based there.

In 2022 FADU expects to ship around 300,000 controllers. There should be a million-plus shipped in 2023 as FADU’s production ramps up for hyperscaler customers.

Technology

FADU’s controllers are based on ASIC technology with a RISC-V CPU core and hardware accelerators. We’re told the controller architecture features minimal shared bus traffic. The company optimizes three SSD controller attributes: increasing performance (IOPS, bandwidth and lower latency); lowering active power consumption by 30 percent or so compared to competitors; and increasing reliability and consistency with better quality of service. The drives are pre-conditioned to prevent any fresh-out-of-the-box performance spikes.
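Pre-conditioning, in general SSD benchmarking practice, means filling the drive before measuring steady-state performance. The sketch below shows that generic practice driven from Python with the fio tool; it is not FADU’s own procedure, the device path is a placeholder, and running it destroys data on the target device.

```python
# Minimal sketch of a benchmark-style preconditioning pass using fio, driven
# from Python. Generic practice, not FADU's procedure; the device path is a
# placeholder and this overwrites the whole drive.
import subprocess

DEVICE = "/dev/nvme0n1"  # hypothetical target drive

# Two full sequential write passes put the drive into a steady state before
# any performance measurement is taken.
subprocess.run([
    "fio",
    "--name=precondition",
    f"--filename={DEVICE}",
    "--rw=write",
    "--bs=128k",
    "--ioengine=libaio",
    "--iodepth=32",
    "--direct=1",
    "--loops=2",
], check=True)
```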

The design also addresses heat issues and enables SSDs to stay within their heat limits (thermal design power envelope) without throttling performance.

FADU diagram

FADU also claims it delivers a rich and widening feature set with each new generation. We know of three generations of its controller technology:

  • FC3081 – PCIe 3, 8 channels, TLC, 100,000 IOPS per watt; used in the Bravo SSD and in mass production;
  • FC4121 – PCIe 4, 12 channels, TLC, OCP-compliant, ZNS support; doubles FC3081 performance; used in the Delta SSD (EDSFF E1.S and U.2 form factors) and in mass production;
  • FC5161 – PCIe 5, 16 channels, TLC and QLC, ZNS support, <5.2W average power; used in the Echo SSD (EDSFF and U.2 form factors), available in 2023. FAL in the diagram above stands for flash acceleration layer.

FADU competes with SSD controller companies such as Microsemi, Phison, and Silicon Motion. It sells specifically to OEMs and hyperscalers, will private-label its SSDs, and does not sell to consumers. SK hynix is the only customer we know about. FADU co-presented with Meta at FMS 2022, which indicates a close relationship, although FADU would not confirm specifics.

The firm has said it aims to go public in 2023 and move towards profitability. It has protected its technology with seven Korean patents and three in the US.

Comment

FADU came out of nowhere, so to speak, in 2016 and said it would build a better SSD controller geared for hyperscalers and OEMs. That’s what it has done, with eye-catching performance, consistently low latency, and lower power draw than the legacy competitors.

Its progress has been dramatic, and it has to carry on that way – with hyperscalers taking millions of its controllers – for its planned IPO to be successful. We think that if its controller technology is really solid, SSD suppliers may be casting potentially acquisitive eyes over its books. Then they could have all the FADU technology goodness just for their own SSDs and grab market share.

SingleStore integrates with SAS Viya

SAS Viya screen

SingleStore has added an upstream analytics player to its ecosystem with SAS and its Viya offering.

SAS is a long-time analytics processing software producer. Viya is its cloud-native AI, analytics and data management business intelligence software, which runs the CAS (Cloud Analytic Services) multi-server parallel processing system. SingleStore is a combined transactional and analytical database that can run either in memory or in a hybrid memory and storage drive (disk, SSD) mode. Its combined transactional and analytical features, it says, make time-consuming extract, transform and load (ETL) procedures faster. This integration enables the use of Viya analytics and AI technology on data stored in SingleStore’s cloud-native real-time database.

SingleStore CEO Raj Verma said: “The integration of SingleStore’s hybrid, multi-cloud database into the parallel analytics engine, SAS Viya, will dramatically improve performance, reduce cost, and enable real-time applications for organizations.”

SAS CTO Bryan Harris added: “For many organizations, getting value out of analytics, machine learning and AI is often associated with complexity, extended timelines and significant costs. I’m excited to introduce the next generation of analytic architecture, SAS Viya with SingleStore, that is hyper-focused on addressing each of these challenges.”

Viya is a multi-machine (server or VM) distributed platform. A controller server and worker servers networked together, usually via Ethernet, collectively provide SAS Cloud Analytic Services (CAS). The CAS setup talks to accessing clients through an SPRE (SAS Programming Run-time Environment) server.

SAS Viya explainer video screen grab

In Viya analytics processing, multiple tables can be loaded and retained in memory for repeated use. Each table is distributed across the workers, processed in parallel, and can be saved to storage. Multiple users can use the same in-memory data for different processing tasks.

SPRE provides client sessions for individual users. It includes the SAS Foundation and SAS Studio as the user programming interface. Users access SAS Viya from a browser on a remote client machine, generally using the SAS Studio interface. SPRE is used to start a CAS session on the CAS server controller. Viya can run on-premises or in the AWS, Azure, and GCP clouds.
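To give a flavour of what client access to CAS looks like, here is a minimal sketch using SAS’s SWAT Python package to start a CAS session, load a table into memory and run a summary action. The hostname, port, credentials and CSV file are hypothetical placeholders.

```python
# Minimal sketch, assuming SAS's SWAT package (python-swat) and a reachable
# CAS controller; hostname, port, credentials and the CSV file are
# hypothetical placeholders.
import swat

# Start a CAS session on the controller (5570 is the usual binary port).
conn = swat.CAS("cas-controller.example.com", 5570,
                username="demo", password="demo")

# Load a local CSV into CAS memory, where it is distributed across the workers.
orders = conn.read_csv("orders.csv", casout=dict(name="orders", replace=True))

# Run an in-memory summary action, executed in parallel across the workers.
print(orders.summary())

conn.terminate()
```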

Data needs to be imported from external sources into Viya, with Excel spreadsheet files and CSV data loaded into a SAS Drive. Alternatively, data connectors can be used, either serial or parallel. SAS already has around 20 connectors for Viya:

Connectors for SAS Viya, coming to SingleStore

Now SingleStore gets added to the list. SAS and SingleStore say their integration enables SAS’s AI and machine learning analytics to be executed directly against relational database tables in SingleStore. 
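SingleStore speaks the MySQL wire protocol, so the database side of such a setup can be sketched with any MySQL-compatible client. The hedged example below creates and populates a table that an analytics layer such as Viya (via the new connector, not shown) could then query; the host, credentials and schema are hypothetical placeholders.

```python
# Minimal sketch, assuming a reachable SingleStore database; host, credentials
# and table are hypothetical. PyMySQL works because SingleStore speaks the
# MySQL wire protocol.
import pymysql

conn = pymysql.connect(host="svc-example.singlestore.com", user="demo",
                       password="demo", database="sales")
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS orders (
            order_id BIGINT PRIMARY KEY,
            region VARCHAR(32),
            amount DECIMAL(12,2),
            ordered_at DATETIME
        )
    """)
    cur.executemany(
        "INSERT INTO orders VALUES (%s, %s, %s, %s)",
        [(1, "EMEA", 120.50, "2022-08-01 09:00:00"),
         (2, "APAC", 310.00, "2022-08-01 09:05:00")],
    )
    # A quick aggregate of the kind an analytics engine might run or push down.
    cur.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
    for region, total in cur.fetchall():
        print(region, total)
conn.commit()
conn.close()
```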

For its part, SingleStore has connectors for Spark, IBM Cognos analytics, and SAS-competitor Tableau.

The SingleStore-SAS partnership was first announced in December 2020. It has taken 20 months for the two to develop software to integrate SAS Viya and SingleStore. It’s a win-win as Viya is now an analytics processing layer for SingleStore and SingleStore an integrated external data source for Viya.

More information can be found on SingleStore’s website now, and on the SAS website later today.