Nasuni launches Edge for Amazon S3

Nasuni Edge for Amazon S3 is now available to provide unified file and object storage at edge locations.

The biz supplies cloud file services from its File Data Platform, in which edge caching appliances provide access through a global namespace to a centralized file store built on an S3 object storage base. Now it has taken that backend file-to-S3 conversion capability and applied it to the edge caching devices, giving them select S3 API support.

David Grant.

Nasuni president David Grant explained in a statement: “Nasuni has been a long-time AWS partner, and this latest collaboration delivers the simplest solution for modernizing an enterprise’s existing file infrastructure. With Nasuni Edge for Amazon S3, enterprises can support legacy workloads and take advantage of modern Amazon S3-based applications. Nasuni Edge for Amazon S3 allows an organization to make unstructured data easily available to cloud-based AI services.”

Nasuni Edge for Amazon S3 runs in an Amazon EC2 instance and integrates with AWS Outposts, Wavelength Zones, or Local Zones. It enables applications and app developers to read and write using the Amazon S3 API to access AWS Local Zones, AWS Outposts, and on-premises environments. These S3 users get access through S3 to Nasuni’s edge device caching, global namespace, file sharing, and cloud file services such as data protection, ransomware recovery, and data intelligence.
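For a sense of what that S3 access looks like in practice, here is a minimal sketch using boto3; the endpoint URL, bucket name, keys, and credentials are hypothetical placeholders rather than Nasuni-documented values.

```python
# Sketch: reading and writing through an S3-compatible edge endpoint with boto3.
# The endpoint, bucket, object keys, and credentials below are illustrative only.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://nasuni-edge.example.internal",  # hypothetical edge appliance address
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Write an object through the edge cache...
s3.put_object(Bucket="shared-files", Key="reports/q1.csv", Body=b"region,revenue\nEMEA,100\n")

# ...and read it back; the same data is reachable by file clients via the global namespace.
obj = s3.get_object(Bucket="shared-files", Key="reports/q1.csv")
print(obj["Body"].read().decode())
```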

Nasuni's edge caching is claimed to provide LAN-like data access performance. It supports the S3, SMB (CIFS), and NFS protocols, and extends file metadata limits: more tags per file, more characters per tag, and a larger overall metadata size. It also supports petabyte-sized workloads.

Locally generated data may have to be sent to a cloud or datacenter-based analytics app, or both, and adding the S3 protocol to the edge device can make that easier.

Alex Serban, senior manager of site reliability and operations at Electronic Arts, enthused: “This approach to engaging with Nasuni not only enhances our operational efficiency, but also establishes a standardized process. The potential of Nasuni Edge for Amazon S3 becomes evident in its ability to boost performance, speeding software build distribution to teams around the globe. This improvement is instrumental in accelerating the delivery of games to the global market.”

Get access to a Nasuni Edge for Amazon S3 datasheet here and find out more by reading a Nasuni blog.

Weka shines in cloud performance and efficiency benchmark

Weka running in the public cloud has made big gains in the SPECstorage Solutions 2020 benchmark arena with clear wins in four of its five categories.

Update: a note on Weka's withdrawal of its latency-based comparisons has been added in the bootnote, 23 April 2024.

SPECstorage Solutions 2020 is a Storage Performance Evaluation Corporation (SPEC) validated test of file storage performance in five workload scenarios: AI image processing, representative of AI TensorFlow image processing environments; Electronic Design Automation (EDA); Genomics; Software Builds; and Video Data Acquisition (VDA). Each workload's results include jobs or builds, ORT (Overall Response Time), and other measures detailed in a supplier's submission on the benchmark's results webpage. Weka supplied mostly winning on-premises results using Samsung SSDs in January 2022.

Weka’s VP for product marketing, Colin Gallagher, said: “What makes these results so interesting isn’t just that Weka is either raw or effectively the #1 result in all the SPEC 2020 benchmarks – it’s the impact of being able to handle any IO profile with zero tuning changes between benchmarks … For customers, this latency advantage shows up as real wall-clock time reductions in time-to-completion of jobs.”

The company's parallel file system runs across a cluster of nodes – fast ephemeral public cloud instances – and uses kernel bypass technology, with no manual tuning needed. We have charted each benchmark workload's results in a 2D space defined by a horizontal axis for the number of jobs or builds, and a vertical axis for overall response time (ORT). Suppliers with results furthest to the right complete more jobs, and those lower down complete them faster.
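To show how that 2D view can be constructed, here is a short illustrative plotting sketch; the data points are placeholders, not actual SPEC submissions.

```python
# Sketch: plot benchmark results with achieved jobs/builds on the x-axis and
# overall response time (ORT, ms) on the y-axis. Values are illustrative only.
import matplotlib.pyplot as plt

results = {
    "Supplier A": (5000, 1.4),   # (achieved jobs, ORT in ms)
    "Supplier B": (6200, 0.9),
    "Supplier C": (2400, 1.1),
}

for name, (jobs, ort) in results.items():
    plt.scatter(jobs, ort)
    plt.annotate(name, (jobs, ort))

plt.xlabel("Achieved jobs/builds (further right completes more work)")
plt.ylabel("Overall response time, ms (lower completes it faster)")
plt.title("SPECstorage Solutions 2020 workload results")
plt.show()
```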

AI image processing

Weka ran in Azure to provide a direct comparison with Qumulo, also running in Azure, and also provided an AWS result focused on sheer speed.

Gallagher said Weka “beat Qumulo by 175 percent in raw performance at 64 percent of the infrastructure cost on Microsoft Azure. When factoring in latency, Weka can perform 2.5x the number of jobs in the same amount of time as Qumulo at 25 percent of the cost per job.” 

In AWS, “Weka delivers 6x higher load count (2400) than Qumulo at 76 percent the cost per job. Net: Big or small, in multiple clouds, Weka is faster and has a better cost-per-job.”

Electronic Design Automation

Here Weka beats an eight-node NetApp AFF A900 array with a lower ORT.

Gallagher said: “NetApp: 6,300 jobs at 1.39ms ORT.  Weka: 6,310 jobs at 0.87ms ORT. Weka delivered a 60 percent faster response time in the cloud vs NetApp’s fastest eight-node system (AFF A900 NVMe). Effective result: Weka can process 10,000 jobs in the same time NetApp can do 6,300.”

Genomics

Weka in the public cloud overtook UBIX Technology with 2,000 jobs achieved.

Software Builds 

Here Weka did not score an outright win. “It achieved 3,500 builds with an ORT of 0.74ms, landing at #2 for load concurrency,” Gallagher said. NetApp’s eight-node AFF A900 system was faster but it “did 6,120 builds at 1.58ms ORT. Weka’s advantage of being half the latency means an effective 7,472 builds in the same time earning an effective #1 result.”


The SW Build workload in this benchmark is a continuation of the one in SPECsfs2014_SP2 and NetApp led that too with its prior AFF A800 systems. The workload description reads: “The software build type workload is a classic metadata intensive build workload. This workload was derived from analysis of software builds, and traces collected on systems in the software build arena. Conceptually, these tests are similar to running unix ‘make’ against several tens of thousands of files. The file attributes are checked (metadata operations) and if necessary, the file is read, compiled, then data is written back out to storage.”

For some reason Weka performs less well on this kind of test run, although it does its work in half the time.

Video Data Acquisition

Another clear-cut win. Gallagher said: “Weka still holds the #1 spot from two years ago with a small on-prem system (8,000 streams). We beat it with 12,000 streams. Net: On-prem or in the cloud, Weka is the highest performing video platform around.”

Weka running in AWS and once in Azure pretty much trounces the other suppliers in the survey, except for NetApp in the SW Builds workload. Overall, Gallagher concluded: “With the hyperscalers continually improving their back-end infrastructure, the public cloud can now provide comparable results to on-prem for storage.” Just not yet for SW builds, although Weka is able to claim a calculated win by taking response time into account.

Bootnote

Just to be clear, the latency points that Weka mentioned are not part of the SPEC report. An updated Weka blog removes them and notes: “In a prior version of this blog, we went beyond what is allowed in the SPECstorage 2020 reporting rules. For SPECstorage 2020, the only items to have a comparison are the load point value and ORT. Anything else, such as applying latency as an extrapolation, is an estimation and should not be used. SPEC specifically calls this out at https://www.spec.org/fairuse.html. With our sincere apologies, we have updated the blog on the recent WEKA results.”

OptraSCAN combines pathology imaging with ‘affordable’ storage

OptraSCAN, a digital pathology solution provider headquartered in the US, has announced the availability of a combined image management and storage product to “enable and simplify the digital transformation of pathology laboratories.”   

The OptraSCAN IMAGEPath image management system is an open platform that can easily integrate with scanners from multiple manufacturers and over 20 different image formats, in addition to being compatible with external third-party AI applications for pathology.  

The combined offer includes image management and 50 GB to 100 TB of storage with multiple adoption tiers at “an extremely affordable monthly subscription fee.”

“This provides the flexibility to fit the needs of a wide range of users – from one single pathologist to large, enterprise-level solutions,” the provider said.

IMAGEPath includes a conferencing feature for real-time synchronized or asynchronous review of a case, supporting online collaboration and second opinions from remote pathologists. Smart features such as case prioritization and an audit trail are also included, along with the ability for users to customize the user interface to optimize lab operations and efficiency.  

Abhi Gholap, OptraSCAN

“Now that we have addressed the data acquisition challenge with our affordable digital scanners, we are looking beyond this to address the challenges of image management, storage and access,” said Abhi Gholap, OptraSCAN CEO.  

“We designed our solution to simplify image management and image storage at a very compelling price point. We can address the needs of a single researcher, with a few images to share, all the way to a large institution, where an open system provides the flexibility and interoperability they need.”

Features for pathologists, pharma partners, AI partners, and academic research centers include case management, an analytics framework, user authentication, and an audit trail.

OptraSCAN has five patents across its devices, AI analytics, and cloud streaming platforms. It has received CE-IVDR certification for general safety and performance of its scanners, and its systems are being used at more than 100 sites globally.  

Blocks & Files asked OptraSCAN for prices of the scanning and storage packages, and ballpark figures for the data costs per TB.

Arnol Rios, OptraSCAN

Arnol Rios, VP, sales at OptraSCAN, said: “We have several scanner platforms, selling for between $25,000 and $160,000, with several options and configurations within that range. The data storage pricing varies widely, with the most economical pricing for higher storage volumes of 10 TB or more.”

He said data bundles, which include data storage and image management user licenses, come with various pricing tiers. As an example, a bundle with two user licenses and 10 TB can cost $300 per month/$3,600 per year. The same bundle with 50 TB would cost $900 per month/$10,800 per year. “The savings increase as the storage volume increases,” said Rios.

The California-based firm has more than 23 distributors in the US, the UK, Denmark, Italy, Greece, France, Morocco, Russia, China, India, the UAE, South Korea, Singapore, and Japan.

Equinix brings in Google head as CEO to scale data services

Global datacenter services heavyweight Equinix is to replace its CEO as part of a planned transition in “late Q2.”

Update: Merrie Williamson hire note added at end of story, 14 March 2024.

Current president and CEO Charles Meyers will move to the role of executive chairman, and current Google Cloud go-to-market president Adaire Fox-Martin will take over Meyers’s two roles in the second quarter.

Adaire Fox-Martin, Equinix

Peter Van Camp, currently executive chairman, will step away from his responsibilities as a board member to take the role of “special advisor” to the board.

Meyers joined the company in 2010, and was appointed CEO in 2018. During his tenure, he has helped extend the Platform Equinix offering to more than 70 markets across 33 countries – supporting cloud data management and storage vendors, managed service providers, network as-a-service (NaaS) providers, communications service providers, and enterprises in the process.   

“I am confident that Adaire’s capabilities and experience will be deeply additive to our team and our culture, helping us meet the evolving needs of our customers, fuel our growth and unlock the extraordinary power of Platform Equinix,” said Meyers. “I am grateful to our board for their support of my desired transition timeline. As executive chairman, I will actively support Adaire as she leverages her tremendous global experience to extend and expand our market leadership.”

With more than 25 years of experience in the technology sector, Fox-Martin is currently head of Google Ireland, and is the global president of go-to-market for Google Cloud – leading sales, professional services, the partner ecosystem, and customer success efforts. Prior to Google, she held senior global positions at both SAP and Oracle. The Equinix CEO hire is a pretty tidy one as she has been a member of the Equinix board of directors since 2020.

“In today’s dynamic digital landscape, Equinix has uniquely amassed global reach, highly differentiated ecosystems, strong partner relationships, and an innovative range of product and service offerings, collectively forming a robust and future-proofed platform to address diverse customer challenges,” said Fox-Martin. “I will leverage my experience to drive continued innovation and growth.”

Dell, NetApp, Pure Storage, and Seagate all have offerings based on Equinix colo datacenters.

Update: Equinix has also appointed Merrie Williamson as EVP and Chief Customer and Revenue Officer (CCRO), effective March 25. Williamson previously served as Corporate VP of Azure Infrastructure and Digital and Application Innovation at Microsoft and was responsible for commercial sales strategy and execution for the core multibillion-dollar Azure business.

Storage news ticker – March 14

Virtual distributed file system supplier Alluxio has been included for the second consecutive year in Forbes’s “America’s Best Startup Employers” which ranks the 500 top startups displaying excellence across employer reputation, employee satisfaction, and growth.

The DNA Data Storage Alliance, a SNIA Technology Affiliate, has announced its first two specifications for storage of digital data in DNA. They define a recommended method for storing basic vendor and CODEC information within a DNA data archive, and the CODEC provides the required conversion from digital information to DNA and back.

Unlike existing storage media, DNA does not have a fixed physical structure with a known start point. The two specifications are Sector Zero and Sector One. Sector Zero defines the minimal amount of information needed for the archive reader to identify the CODEC used to encode Sector One, as well as a pointer to the company or organization that synthesized or “wrote” the DNA. The reader can then access the data in Sector One and identify the CODEC used for the main body of the archive. The Sector One spec includes information such as a description of contents, a file table, and parameters to transfer to a sequencer.
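To make the layering concrete, here is a small illustrative model of the information the two sectors are described as carrying; the field names and the codec registry are assumptions for this sketch, not the specifications' actual encoding.

```python
# Illustrative model of the Sector Zero / Sector One layering described above.
# Field names and the codec registry are assumptions, not the spec's encoding.
from dataclasses import dataclass

@dataclass
class SectorZero:
    codec_id: str      # identifies the CODEC used to encode Sector One
    synthesizer: str   # pointer to the organization that "wrote" the DNA

@dataclass
class SectorOne:
    description: str        # description of the archive's contents
    file_table: dict        # file name -> location in the main archive body
    sequencer_params: dict  # parameters to hand to the sequencer
    main_body_codec: str    # CODEC used for the main body of the archive

# Hypothetical registry mapping codec identifiers to Sector One decoders.
CODECS = {
    "example-codec/1.0": lambda raw: SectorOne("demo archive", {}, {}, "example-codec/1.0"),
}

def read_archive(sector_zero: SectorZero, sector_one_raw: bytes) -> SectorOne:
    # A reader decodes Sector Zero first, uses codec_id to pick the Sector One
    # decoder, then uses Sector One's metadata to decode the main archive body.
    decode = CODECS[sector_zero.codec_id]
    return decode(sector_one_raw)
```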

The Sector Zero and Sector One specs are now publicly available, allowing companies to adopt and implement them. DNA Data Storage Alliance members include Catalog Technologies, Quantum Corp, Twist Bioscience Corp, and Western Digital.

A much higher quality Huawei OceanStor Arctic magneto-electric drive slide has come to us via analyst Tom Coughlin: 

The actual drive is labelled “Magneto-electric Disk” (MED) with the commentary “Driver + Tape.” The right-hand upper section of the slide indicates that OceanStor Arctic is robot-free, meaning the MEDs are read and written in place, and employs sealed disks. That must mean that there is a tape drive inside the MED, unlike existing tape technology that relies on separate tape cartridges and drives. There must be, if the slide text is right, two motors per MED – one for the disk and one for the tape.

The right-hand lower section says it’s a 72TB disk with a 20 percent lower TCO than tape and 90 percent lower power consumption than hard disk drives. We aren’t told if the 72TB is raw or logical (after compression) capacity. We recently suggested that MED might involve spun-down disks and are now more convinced of this, given the two motors point above.

One more thing: The Magneto-electric storage chassis containing the MEDs must have a server to provide MED content cataloging and access. It may, and probably will, provide parallel access to the MED units it contains.

Huawei's Data Storage Product Line has reiterated that it cannot reveal more details about OceanStor Arctic at this point in time.

IBM is expanding its sustainability portfolio with a suite of capabilities:

  • Identifying and understanding sustainability possibilities with IBM IT Sustainability Optimization Assessment, with the option to leverage IBM’s portfolio of project and labor services to implement these actions.
  • Evaluating end of life or underutilized IT equipment and placing it back into the circular economy with IBM Asset Recovery and Disposition.
  • Securely removing end of life data with IBM Data Erasure Services.

Read a solution brief here.

TrueNAS supplier iXsystems today announced the company has been named a Strong Performer in the 2024 Gartner Peer Insights Voice of the Customer for Primary Storage. The report can be accessed here (subscription required). 

Micron announced the appointment of ex-Intel CFO and CEO Robert “Bob” Swan to its board of directors. Swan was the stabilizing CEO at Intel after Brian Krzanich’s period, and before Pat Gelsinger returned as CEO in 2021.

NAKIVO Backup & Replication for Proxmox VE will be the first backup offering to support data protection for Proxmox virtual environments. It can back up and restore data, applications, and operating systems from Proxmox VMs. NAKIVO already supports VMware, Hyper-V, and Nutanix AHV.

NetApp released commissioned YouGov research on the state of AI adoption in the UK business landscape. It found that just half (51 percent) of British organizations understand how AI can benefit their operations, with only 20 percent of UK businesses having a strong understanding of how they can harness AI technology. Spend on AI projects is set to increase in 2024 as leaders see it as crucial to their future business success. Most IT leaders are adopting AI to remain competitive.

NetApp has hired Pravjit Tiwana as GM and SVP of cloud storage. He will focus on accelerating the growth of NetApp’s first-party storage services in all three public clouds and reporting to CPO Harvinder Bhela. Most recently, Tiwana served as CTO at Gemini and CEO at Gemini APAC. Prior to that, Tiwana was general manager at AWS leading Edge Services, including Amazon CloudFront, AWS Edge Computing, AWS Data Transfer Operations, and Amazon Productivity applications.

Hyperscale analytics data storage supplier Ocient has scored $49.4 million in extended B-round funding, having raised $40 million in its 2021 B-round. Participants include Buoyant Ventures, Levy Family Partners, Riverwalk Capital, and Wolf Capital Management, as well as all prior major investors. Total funding is now $119 million. Ocient saw 109 percent year-over-year growth in revenue in its last fiscal year. The cash will be used to advance product capabilities and deliver hyperscale data analytics systems to its global customer base. Ocient CEO Chris Gladwin said: “The close of this latest round of financing is an indication that the need for the solutions we bring to market is growing across industries, and geographies.”

Data protector Rubrik announced GA of its Rubrik Enterprise Proactive Edition (EPE) supporting Data Security Posture Management (DSPM) for cloud, SaaS, and on-premises environments. This quickly follows Rubrik’s recent acquisition of Laminar.

Unstructured data manager Starfish celebrated its tenth anniversary, saying it manages well over an exabyte of capacity across its client base. The typical Starfish customer uses high-performance parallel file systems and scale-out NAS to service production workloads. These environments consist of billions of files, tens and sometimes hundreds of petabytes of capacity, and have myriad data management challenges.

Analyst house TrendForce says that starting this year, the HBM market’s attention will shift from HBM3 to HBM3e, with expectations for a gradual ramp in production through the second half of the year, positioning HBM3e as the mainstream HBM technology. SK hynix gained HBM3e validation in the first quarter, closely followed by Micron, which plans to start distributing HBM3e products toward the end of the first quarter, in alignment with Nvidia’s planned H200 deployment by the end of the second quarter. Samsung, slightly behind in sample submissions, is expected to complete its HBM3e validation by the end of the first quarter, with shipments rolling out in the second. Samsung is poised to significantly narrow the market share gap with SK hynix by the end of the year, reshaping the competitive dynamics in the HBM market.

Virtual datacenter supplier VergeIO has launched ioGuardian for VergeOS backup to minimize downtime if there are multiple drive or node failures. ioGuardian offers inline recovery, ensuring near-continuous data access without the need for traditional recovery time frames.

Veritas Technologies, about to be acquired by Cohesity, announced its position as a Top Player in the 2024 Information Archiving Market Quadrant from analyst firm The Radicati Group. This is the sixth time it’s been so placed with its Alta set of products. 

Cloud storage supplier Wasabi has added the National Hockey League’s (NHL) Vancouver Canucks to its sports sponsorship roster. Wasabi becomes the Preferred Cloud Storage of the Canucks with extensive branding integration for the 2024 NHL season, while the Canucks benefit operationally from Wasabi’s cloud data storage services via a multi-year technology deal.

DBOS raises $8.5M to get cloud apps off the ‘dirt roads’

DBOS, a developer of what it claims is the “world’s first cloud-native operating system,” has raised $8.5 million in seed funding and released its first product.  

The Cambridge, MA firm is co-founded by Turing Award laureate and Postgres creator Mike Stonebraker, and Databricks co-founder and CTO Matei Zaharia, along with a joint team of MIT and Stanford computer scientists. It has announced the release of DBOS Cloud, a transactional serverless application development platform, which it says makes cloud applications “vastly easier to develop, deploy, and secure.”

DBOS co-founder Mike Stonebraker

The funding was led by Engine Ventures and Construct Capital, along with Sinewave and GutBrain Ventures. Engine Ventures general partner Reed Sturtevant now joins Andy Palmer, Peter Kraft, and Stonebraker on the DBOS board.

“Cloud applications have been running on decades-old operating systems, the equivalent of driving cars and trucks cross-country on dirt roads – DBOS provides a necessary and overdue overhaul,” said Sturtevant.

Based on years of joint MIT-Stanford research and development, the company created DBOS (database-oriented operating system), which runs operating system services on top of a high-performance distributed database. The result is said to be a scalable, fault-tolerant, and cyber-resilient foundation for cloud-native applications, with the added ability to store all state, logs, and other system data in SQL-accessible tables.

“The cloud has outgrown 33-year-old Linux, and it’s time for a new approach,” said Stonebraker. “If you run the OS on a distributed database as DBOS does, fault tolerance, multi-node scaling, state management, observability, and security get much easier. You don’t need containers or orchestration layers, and you write less code because the OS is doing more for you.”
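As a purely conceptual sketch of that idea (this is not the DBOS API), keeping application state and its operation log in SQL tables means both commit atomically and can be inspected with ordinary queries:

```python
# Conceptual sketch only, not the DBOS API: application state, logs, and other
# system data live in ordinary SQL tables and commit in the same transaction.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE app_state (key TEXT PRIMARY KEY, value TEXT)")
db.execute("CREATE TABLE op_log (ts TEXT DEFAULT CURRENT_TIMESTAMP, event TEXT)")

def handle_request(user: str) -> None:
    # The state change and its log record either both land or neither does,
    # which is the kind of guarantee a database-backed OS layer can offer.
    with db:
        db.execute("INSERT OR REPLACE INTO app_state VALUES (?, ?)", ("last_user", user))
        db.execute("INSERT INTO op_log (event) VALUES (?)", (f"served {user}",))

handle_request("alice")
print(db.execute("SELECT * FROM op_log").fetchall())  # observability via plain SQL
```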

DBOS Cloud is initially available for developers to build and run serverless functions, workflows, and applications. They are promised reduced complexity of development, deployment, and operations as a result, while also “increasing cyber security and cyber resilience.”

Zaharia said: “With DBOS, developers can build applications in days that now take months on conventional cloud platforms. They can also seamlessly use the same tools they use today, so there is very little learning curve before benefiting from the rapid development, guaranteed transactions, and increased cybersecurity of DBOS.”

DBOS Cloud includes support for stateful functions and workflows, built-in fault tolerance with “guaranteed” once-and-only-once execution, time-travel debugging, SQL-accessible observability data, and the enablement of cyber attack self-detection and self-recovery.

“The cybersecurity implications of DBOS are truly transformative,” added Michael Coden, DBOS co-founder and former head of cyber security practice at BCG Platinion. By simplifying the cloud application stack, DBOS is said to “greatly reduce” the attack surface of cloud applications. On top of that, DBOS enables self-detection of cyber attacks “within seconds” without the use of expensive external analytics tools. And it can restore itself to a pre-attack state “in minutes.” Coden enthused: “It’s a DevSecOps game-changer.”

The company said it will use the raised funds to grow its engineering team and to enhance the transactional computing platform and its components.

Stonebraker spoke to sister publication The Register in December about his new organization, read it here. And elsewhere in our family of publications, TheNextPlatform took a deep dive on DBOS, and that article is here.

Meta hooks up with Hammerspace for advanced AI infrastructure project

Meta has confirmed Hammerspace is its data orchestration software supplier, supporting 49,152 Nvidia H100 GPUs split into two equal clusters.

The parent of Facebook, Instagram, and other social media platforms says its “long-term vision is to create artificial general intelligence (AGI) that is open and built responsibly so that it can be widely available for everyone to benefit from.” The blog authors say: “Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads.”

Hammerspace has been saying for some weeks that it has a huge hyperscaler AI customer, which we suspected to be Meta, and now Meta has described the role of Hammerspace in two Llama 3 AI training systems.

Meta’s bloggers say: “These clusters support our current and next generation AI models, including Llama 3, the successor to Llama 2, our publicly released LLM, as well as AI research and development across GenAI and other areas.”

A precursor AI Research SuperCluster, with 16,000 Nvidia A100 GPUs, was used to build Meta’s gen 1 AI models and “continues to play an important role in the development of Llama and Llama 2, as well as advanced AI models for applications ranging from computer vision, NLP, and speech recognition, to image generation, and even coding.” That cluster uses Pure Storage FlashArray and FlashBlade all-flash arrays.

Meta’s two newer and larger clusters are diagrammed in the blog:

Meta AI graphic

They “support models larger and more complex than that could be supported in the RSC and pave the way for advancements in GenAI product development and AI research.” The scale here is overwhelming as they help handle “hundreds of trillions of AI model executions per day.”

The two clusters each start with 24,576 Nvidia H100 GPUs. One has an RDMA over RoCE 400 Gbps Ethernet network system, using Arista 7800 switches with Wedge400 and Minipack2 OCP rack switches, while the other has an Nvidia Quantum2 400Gbps InfiniBand setup.

Meta’s Grand Teton OCP hardware chassis houses the GPUs, which rely on Meta’s Tectonic distributed, flash-optimized and exabyte scale storage system. 

This is accessed through a Meta-developed Linux Filesystem in Userspace (FUSE) API and used for AI model data needs and model checkpointing. The blog says: “This solution enables thousands of GPUs to save and load checkpoints in a synchronized fashion (a challenge for any storage solution) while also providing a flexible and high-throughput exabyte scale storage required for data loading.”
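As a rough sketch of what synchronized checkpointing to a shared, FUSE-mounted store can look like, consider the following; the mount path, file layout, and polling-based wait are illustrative assumptions, not Meta's actual scheme.

```python
# Sketch: each training rank writes its checkpoint shard to a shared,
# FUSE-mounted path, then waits until every rank's shard is visible.
# Path, layout, and the crude polling barrier are illustrative assumptions.
import os, pickle, time

CKPT_DIR = "/mnt/tectonic/llama3/step_001000"   # hypothetical FUSE mount point

def save_checkpoint(rank: int, world_size: int, model_state: dict) -> None:
    os.makedirs(CKPT_DIR, exist_ok=True)
    with open(os.path.join(CKPT_DIR, f"shard_{rank:05d}.pkl"), "wb") as f:
        pickle.dump(model_state, f)
    # Block until all shards exist on the shared filesystem; real deployments
    # would use a collective barrier rather than polling a directory listing.
    while sum(n.startswith("shard_") for n in os.listdir(CKPT_DIR)) < world_size:
        time.sleep(1)
```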

Meta has partnered with Hammerspace “to co-develop and land a parallel network file system (NFS) deployment to meet the developer experience requirements for this AI cluster … Hammerspace enables engineers to perform interactive debugging for jobs using thousands of GPUs as code changes are immediately accessible to all nodes within the environment. When paired together, the combination of our Tectonic distributed storage solution and Hammerspace enable fast iteration velocity without compromising on scale.”   

Hammerspace diagram of its Meta installation

The Hammerspace diagram above provides its view of the co-developed AI cluster storage system.

Both the Tectonic and Hammerspace-backed storage deployments use Meta’s YV3 Sierra Point server fitted with the highest-capacity E1.S format SSDs available. These are OCP servers “customized to achieve the right balance of throughput capacity per server, rack count reduction, and associated power efficiency” as well as fault tolerance.

Meta is not stopping here. The blog authors say: “This announcement is one step in our ambitious infrastructure roadmap. By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s.”

Cloudian becomes a file mount point

Cloudian is supporting AWS Mountpoint, so that applications can issue file calls to a data lake stored on Cloudian HyperStore S3-compatible object storage.

HyperStore is a classic S3-compatible object storage system. Cloudian added NFS and CIFS/SMB access to it last year and has now extended this. AWS Mountpoint is an open source file-level interface to AWS’s S3 buckets which doesn’t support complex file-folder operations and is not POSIX-compliant. It makes datalakes stored in S3 accessible to file-level applications. Cloudian has layered Mountpoint on top of HyperStore and its customers can now access objects using file protocols, simplifying the integration of HyperStore into existing file-based applications.
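By way of illustration, once a bucket is mounted the application side is just ordinary file I/O; the mount command, endpoint, bucket, and paths below are assumptions for the sketch, not documented Cloudian values.

```python
# Sketch: after mounting a HyperStore bucket with something like
#   mount-s3 my-datalake /mnt/datalake --endpoint-url https://hyperstore.example.internal
# (bucket, path, and endpoint are illustrative), applications use plain file I/O
# while the data stays in object storage.
from pathlib import Path

mount = Path("/mnt/datalake")

# Read data lake inputs with standard file calls.
for f in sorted(mount.glob("images/*.jpg")):
    data = f.read_bytes()
    # ... hand off to an ML or rendering pipeline ...

# Write results back. Note that Mountpoint deliberately omits operations such
# as directory renames that map poorly onto S3 object APIs.
out_dir = mount / "results"
out_dir.mkdir(exist_ok=True)
(out_dir / "run1.txt").write_text("done\n")
```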

Jon Toor, Cloudian’s chief marketing officer, explained in a statement: “By bridging the gap between local file systems and an on-prem Cloudian cluster, we are delivering on our promise to make object storage as accessible and functional as possible for our customers.”

The Mountpoint-Cloudian combo provides:

  1. Native mounting of object storage buckets as local file systems, using standard Linux commands and traditional file operations.
  2. Streamlined workflows, eliminating the need to copy data to local storage.
  3. Simultaneous data access, allowing multiple clients to read the same data and make use of the parallel processing and throughput of Cloudian’s HyperStore.
  4. A unified view of object data through both file and object APIs.

Cloudian says common use cases for AWS Mountpoint include large-scale machine learning, autonomous vehicle simulation, genomic analysis, data ingest (ETL), and image rendering. 

Amazon Scholar James Bornholt blogs: “We’re designing Mountpoint around the tenet that it exposes S3’s native performance, and will not support file system operations that cannot be implemented efficiently against S3’s object APIs. This means Mountpoint won’t try to emulate operations like directory renames that would require many S3 API calls and would not be atomic. Similarly, we don’t try to emulate POSIX file system features that have no close analog in S3’s object APIs. This tenet focuses us on building a high-throughput file client that lowers costs for large scale-out datalake workloads.”

Mountpoint is a restricted file system and not a full-featured one such as Amazon’s Elastic File System (EFS) or its FSx series of products. Bornholt says filesystem operations – like file and directory rename, OS-managed permissions and attribute mappings, POSIX features and others – don’t overlap with S3 object storage. This restricts Mountpoint’s applicability to use cases such as datalakes which, Bornholt claims, don’t need them.

Cloudian’s support for AWS Mountpoint is now available. Read Bornholt’s blog for more information.

ScaleFlux makes go-faster computational storage SoC

Computational storage drive developer ScaleFlux has a new System-on-Chip (SoC) that it claims quadruples capacity, doubles sequential read/write bandwidth and random read IOPS, and nearly doubles the random write IOPS number.

Update: 5016 raw and logical capacity set at 256 TB. 14 March 2024.

The startup currently supplies CSD 2000 and CSD 3000 NVMe-format solid state drives. The CSD 3000 has an SFX 3016 drive controller with integrated hardware compression accelerators that provide up to four times more logical than physical capacity. The claimed result is SSDs with up to 10x better performance, latency, endurance, and economics than ordinary NVMe SSDs.

A statement from ScaleFlux CEO, Hao Zhong, claimed that the new SoC’s design choices “will provide significant advantages over other controllers for AI-centric workloads, based on our discussions with strategic customers with extensive AI deployments.”

The SFX 5016’s performance compared to the earlier SoC looks like this:

The 5016’s raw and logical capacities are the same: 256 TB. A ScaleFlux spokesperson told us: “Even though the data will still compress and thus see performance and endurance gains, we won’t be able to address more than 256TB of space.”
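As a back-of-the-envelope illustration of that cap, the raw capacities and compression ratios below are assumed inputs, not ScaleFlux specifications:

```python
# Sketch of the 256 TB logical address cap; inputs are assumptions, not specs.
ADDRESSABLE_CAP_TB = 256

def logical_capacity_tb(raw_tb: float, compression_ratio: float) -> float:
    # Compression multiplies usable space, but the controller can address at
    # most 256 TB of logical capacity however well the data shrinks.
    return min(raw_tb * compression_ratio, ADDRESSABLE_CAP_TB)

print(logical_capacity_tb(64, 4.0))    # 256.0 -> a smaller drive can expose up to 4x its raw space
print(logical_capacity_tb(256, 4.0))   # 256.0 -> at 256 TB raw, logical and raw capacities match
```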

The random write IOPS number drops to 750,000 for incompressible data. The 5016 is claimed to have “stunning low latency” – with no number provided.

There are more differences between the 3016 and 5016 SoCs:

  • The design team bumped up the internal buses, memory controller capability to LPDDR5, and NAND interface to enable the chip to take advantage of the faster host interface (PCIe 5).
  • A newer 7nm process provides better power efficiency – nearly tripling the IOPS/Watt of the SFX 3016. It needs 6W power when typically active, and <2W when idle.
  • Improved ECC and NAND management capabilities to support multiple generations of TLC and QLC NAND from several vendors.

ScaleFlux EVP of engineering Fei Sun was thrilled with the speed of the SoC’s development to production status, which was as fast as that of the SFX 3016: “Achieving first pass production on a highly complex SoC once may be dismissed as lucky. But twice in a row is a result of teamwork and discipline.”

The ScaleFlux team utilized a firmware and silicon co-design process, which enabled feature-complete firmware to be available to the silicon bring up team from day one.  

A forthcoming CSD 5000 drive will be based around the SFX 5016 SoC. Sampling of the controller with turnkey firmware has begun for key customers planning to build their own drives, alongside samples of the ScaleFlux CSD 5000 drive design.

Vawlt seals in €2.15M for its distributed data storage ‘supercloud’

Lisbon-based startup Vawlt Technologies has secured an additional €2.15 million ($2.35 million) round of funding to help widen the reach of its “supercloud” distributed storage system.  

The last round of funding was in 2021, and the total funding now stands at €3 million ($3.27 million).

Three new investors made up the latest round, including round leader Lince Capital, along with Basinghall and Beta Capital. There was also participation from existing investors Armilar Venture Partners and Shilling VC, and further investment from two former Cisco and OVHcloud executives who act as business advisors to the firm. Vawlt was founded in 2018 by researchers from the LASIGE research and development group within the University of Lisbon.

Ricardo Mendes.

“This injection of capital will not only propel us into new markets and reinforce our support for channel partners, it will also fuel the continuous innovation of our product to provide our customers with exactly what they need,” said Ricardo Mendes, CEO of Vawlt.

The startup is also expanding its team, and is now looking to fill positions in both business and product development.  

The firm’s proposition seems to be all-encompassing when it comes to file storage and management. Data is distributed in multiple clouds or on-premise nodes, creating that supercloud which enables companies to take advantage of “best-of-breed” multiple storage environments through a single pane of glass. All users’ data is “always available”, promised the firm: “even if some of the clouds are down, if they lose or corrupt the data, and also in the case of a ransomware attack”.  

Only the data owner has access to file contents, and all the data is encrypted at the client side. Data never goes through Vawlt’s servers. It travels directly between the users’ machines and the storage clouds. Some of the techniques used by Vawlt include erasure coding and Byzantine-quorum systems “for both dependability and cost-effectiveness”.  
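To illustrate the general idea (this toy sketch uses single-parity XOR, not Vawlt's actual coding or quorum scheme), data split across several providers with a parity share survives the loss of any one of them:

```python
# Toy illustration of spreading data across clouds with one parity share;
# this is simple XOR parity, not Vawlt's actual erasure coding or quorum logic.
from functools import reduce

def split_with_parity(data: bytes, n_data: int = 3) -> list:
    size = -(-len(data) // n_data)  # ceiling division
    shares = [data[i*size:(i+1)*size].ljust(size, b"\0") for i in range(n_data)]
    parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*shares))
    return shares + [parity]        # e.g. one share per cloud provider

def recover_missing(shares: list) -> list:
    # Rebuild the single missing share by XORing the surviving ones.
    missing = shares.index(None)
    survivors = [s for s in shares if s is not None]
    shares[missing] = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))
    return shares

shares = split_with_parity(b"quarterly-report.pdf contents", n_data=3)
shares[1] = None                     # simulate one cloud provider being unavailable
restored = recover_missing(shares)
print(b"".join(restored[:3]).rstrip(b"\0"))   # original data is intact
```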

“In an era of increasing concerns around data security and privacy, Vawlt’s highly skilled team has been able to materialise almost a decade of research into a leading product in the supercloud environment, positioning the company at the forefront of the future of cloud storage,” gushed Vasco Pereira Coutinho, CEO of Lince Capital.

Vawlt’s platform enables channel partners to tailor and optimise storage to their customers’ respective needs and most common use cases, whether that be to support “hot” or “cold” data.  

The default interface is the Vawlt file system, and you can then activate NFS/SMB, S3-API or FTP interfaces according to your needs.  

In terms of public clouds, the main ones to work with in the provider’s pool are AWS, Microsoft, Google, IBM, Oracle, OVH and Backblaze.

Prices vary according to the selected specifications, such as provider list, location, volume size, download quota or storage interfaces.

According to the company’s website, Hot storage for recurrent storage and sharing starts at €30.90 ($33.71) per terabyte per month, plus VAT. Warm storage optimised for editable data, accessed less frequently, starts at €16.90 ($18.44) per TB/month, plus VAT. Immutable solutions for files that don’t require changes upon storage are €15.90 ($17.35), and archival systems for files that are stored for long periods of time and rarely read cost €7.90 ($8.62) per TB.

Generative AI and the wizardry of the wide-open ecosystem

COMMISSIONED: IT leaders face several challenges navigating the growing generative AI ecosystem, but choosing a tech stack that can help bring business use cases to fruition is among the biggest. The number of proprietary and open-source models is growing daily, as are the tools designed to support them.

To understand the challenge, picture IT leaders as wizards sifting through a big library of magical spells (Dumbledore may suffice for many), each representing a different model, tool or technology. Each shelf contains different spell categories, such as text generation, image or video synthesis, among others.

Spell books include different diagrams, incantations and instructions just as GenAI models contain documentation, parameters and operational nuances. GPT-4, Stable Diffusion and Llama 2 rank among the most well-known models, though many more are gaining traction.

Moreover, this “library” is constantly growing, making it harder for IT leaders to keep up with the frenetic pace of conjuring – er – innovation. Kind of like chasing after a moving staircase.

You get the idea. If you’re unsure, you can brush up on Harry Potter books or films. In the meantime, here are three key steps to consider as you begin architecting your AI infrastructure for the future.

Pick models and modular architecture

As IT leaders adopted more public cloud software they realized that a witches’ brew of licensing terms, proprietary wrappers and data gravity made some of their applications tricky and expensive to move. These organizations had effectively become locked into the cloud platforms, whose moats were designed to keep apps inside the castle walls.

If you believe that GenAI is going to be a critical workload for your business – 70 percent of global CEOs told PwC it will change the way their businesses create, deliver and capture value – then you must clear the lock-in hurdle. One way to do this is to pick an open model and supporting stack that affords you flexibility to jump to new products that better serve your business.

Tech analyst Tim Andrews advocates for reframing your mindset from predicting product “winners” to one that allows you to exit as easily as possible. And a modular software architecture in which portions of your systems are isolated can help.

Fortunately, many models will afford you flexibility and freedom. But tread carefully; just as spell books may harbor hidden curses, some models may drain the organization’s resources or introduce biases or hallucinations. Research the models and understand the trade-offs.

Choose infrastructure carefully

GPU powerhouse NVIDIA believes that most large corporations will stand up their own AI factories, essentially datacenters dedicated to running only AI workloads that aim to boost productivity and customer experience. This will be aspirational for all but the companies who have the robust cash flow to build these AI centers.

Public cloud models will help you get up and running quickly, but, if right-sizing your AI model and ensuring data privacy and security are key, an on-premises path may be right for you. Your infrastructure is the magic wand that enables you to run your models. What your wand is made of matters, too.

In the near term, organizations will continue to run their AI workloads in a hybrid or multicloud environment that offers flexibility of choice while allowing IT leaders to pick operating locations based on performance, latency, security and other factors. The future IT architecture is multicloud-by-design, leveraging infrastructure and reference designs delivered as-a-Service. Building for that vision will enable you to run your GenAI workloads in a variety of places.

Know this: With organizations still evaluating or piloting GenAI models, standardization paths have yet to emerge. As you build, you must take care to head off technical debt as much as possible.

Embrace the wide-open ecosystem

While wizards may spend years mastering their spells, IT leaders don’t have that luxury. Eighty-five percent of C-suite executives said they expect to raise their level of AI and GenAI investments in 2024, according to Boston Consulting Group.

It’s incumbent on IT leaders to help business stakeholders figure out how to create value from their GenAI deployments even as new models and iterations regularly arrive.

Fortunately, there is an open ecosystem of partners to help mitigate the challenges. Open ecosystems are critical because they help lower the barrier to entry for most mature technology teams.

In an open ecosystem, organizations lacking the technical chops or financial means to build or pay for LLMs can now access out-of-the-box models that don’t require the precious skills to train, tune or augment models. Trusted partners are one of the keys to navigating that ecosystem.

Dell is working with partners such as Meta, Hugging Face and others to help you bring AI to your data with high-performing servers, storage, client devices and professional services you can trust.

Keeping your options open is critical for delivering the business outcomes that will make your GenAI journey magical.

Learn more about Dell Generative AI Solutions.

Brought to you by Dell Technologies.

MinIO goes macro for mega AI workloads

MinIO has developed an Enterprise Object Store to create and manage exabyte-scale data infrastructure for commercial customers’ AI workloads.

The data infrastructure specialist provides the most popular open source object storage available, with more than 1.2 billion Docker pulls, and is very widely deployed. However, the Enterprise Object Store (EOS) product carries a commercial license. 

AB Periasamy.

AB Periasamy, co-founder and CEO at MinIO, issued a statement: “The data infrastructure demands for AI are requiring enterprises to architect their systems for tens of exabytes while delivering consistently high performance and operational simplicity.

“The MinIO Enterprise Object Store adds significant value for our commercial customers and enables them to more easily address the challenges associated with billions of objects, hundreds of thousands of cryptographic operations per node per second or encryption keys or querying an exabyte scale namespace. This scale is a defining feature of AI workloads and delivering performance at that scale is beyond the capability of most existing applications.”

EOS features include:

  • Catalog – Enables indexing, organizing, and searching a vast number of objects using the familiar GraphQL interface, facilitating metadata search in an object storage namespace.
  • Firewall – Aware of the AWS S3 API, it facilitates object-level rule creation, from TLS termination and load balancing to access control and QoS, at object-level granularity. It is not IP-based or application-oriented.
  • Key Management Server – A MinIO-specific, highly available KMS implementation optimized for massive data infrastructure. It deals with the specific performance, availability, fault-tolerance, and security challenges associated with billions of cryptographic keys, and supports multi-tenancy.
  • Cache – A caching service that uses server DRAM to create a distributed shared cache for ultra-high performance AI workloads.
  • Observability – Data infrastructure-centric collection of metrics, audit logs, error logs, and traces.
  • Enterprise Console – A single pane of glass for all the organization’s instances of MinIO, including public clouds, private clouds, edge, and colo instances.

As we understand it, ultra-high performance AI applications run in massive GPU server farms. MinIO’s pooled DRAM cache does not operate inside the GPU servers in these farms, and the GPUs there cannot directly access MinIO’s pooled x86 server DRAM cache.

On that basis, we suggested to MinIO CMO Jonathan Symons that MinIO’s cache, although designed for ultra-high performance AI applications, does not support direct supply of data by cache-sharing to the GPU processors used for these ultra-high performance AI applications.

Symons told us GPUDirect-supporting filers use networking links – as MinIO does – to send data to GPU server farms. “GPUs accessing the DRAM on SAN and NAS systems using GPUDirect RDMA are also limited by the same 100/200 GbE network HTTP-based object storage systems use. RDMA does not make it magically better when the network between the GPUs and the storage system is maxed out.

“Nvidia confirmed that object storage systems do not need certification because they are entirely in the user space. Do you see Amazon, Azure or GCP talking about GPUDirect for their object stores? No. Is there any documentation for Swiftstack (the object store Nvidia purchased) on GPUDirect? No. SAN and NAS vendors need low level kernel support and they need to be certified.

“GPUDirect is challenged in the same way RDMA was in the enterprise. It is too complex and offers no performance benefits for large block transfers. 

“Basically we are as fast as the network. So are the SAN and NAS solutions taking advantage of GPUDirect. Since neither of us can be faster than the network, we are both at the same speed. To qualify for your definition of ultra-high performance, do you have to max the network or do you simply have to have GPUDirect?”

Point taken, Jonathan.

MinIO’s Enterprise Object Store is available to MinIO’s existing commercial customers immediately, with SLAs defined by their capacity.