
XenData Z20 slings media files from SMB to cloud and back

XenData has launched an on-prem gateway appliance for moving Windows SMB media files to and from public cloud object storage.

XenData provides storage products, such as on-prem X-Series tape archives and Media Portal viewers for the media and entertainment industry and allied customers. The Z20 Cloud Media Appliance is a Windows 11 Pro x86 box that hooks up to a local SMB network and can move files to object storage in the cloud, utilizing both online and archive tiers.

XenData Z20

CEO Dr Phil Storey stated: “The Z20 makes it easy for users to store media files in the cloud and it is especially useful when content is stored on the lower-cost Glacier and Azure Archive tiers. It allows users to check that they are rehydrating the correct media files before incurring rehydration and egress fees. Furthermore, it provides users with a self-service to easily restore individual files without the need to bother IT support staff.”

The system has a multi-tenant, web-based UI. It’s compliant with Microsoft’s security model and can be added to an existing Domain or Workgroup. Remote users are supported over HTTPS when an SSL security certificate is added. Physically, the device is a 1 RU rack-mount appliance with 4 x 1 GbE network ports and options for additional 10 and 25 GbE connectivity.

XenData previously released Cloud File Gateway software, running on Windows 10 Pro and Windows Server, to enable file-based apps to use cloud object storage such as AWS S3, Azure Blob, and Wasabi S3 as an archive facility. In effect, it has updated this software to support deep cloud archive tiers, such as AWS S3 Glacier Deep Archive and Azure’s Archive tier, and added Media Asset Viewer functionality to give users a self-serve capability.

Using the web-based UI, users can display media file previews and change the storage tier for a selected file, for example rehydrating a file from a deep archive and then downloading it.

The Z20 is available from XenData Authorized Partners worldwide and is priced at $9,880 in the US.

Arcitecta rolls out Mediaflux Real-Time to streamline global media workflows

Arcitecta is rolling out Mediaflux Real-Time, a real-time content delivery and media management offering aimed at media production professionals.

Australia-based Arcitecta provides distributed data management software, its Universal Data System, which supports file and object storage with a single namespace and tiering across on-premises SSDs, disk, and tape, plus the public cloud, together with a Livewire data mover and a metadata database. Its Mediaflux Multi-Site, Mediaflux Edge, and Mediaflux Burst products let geo-distributed workers collaborate more effectively, with faster access to shared data across normal and peak usage times. Mediaflux Real-Time goes further, providing virtually instant access to media content.

Jason Lohrey, Arcitecta

Jason Lohrey, CEO and founder of Arcitecta, stated: “Mediaflux Real-Time is revolutionary and will power the future of live production, supporting continuous file expansion such as live video streams and enabling editors to work with those files in real-time, even while they are still being created.”

He said Arcitecta’s Livewire data transfer module “securely moves millions or billions of files at light speed” to accelerate workflows. “In pre-release previews, broadcasters have praised Mediaflux Real-Time as ‘a game-changer’ for live broadcast, live sports, and media entertainment production.”

Mediaflux Real-Time is hardware, file-type, and codec agnostic, delivering centralized content management, network optimization, collaboration tools, security, and cost efficiency. Customers can organize storage and metadata for easy access and retrieval, have a reliable infrastructure for handling large file transfers, and use version control and integrated feedback systems. They can share content with multiple locations in real time and grow files with live content. The content can be protected with encryption and access controls. 

Arcitecta Mediaflux LiveWire with Dell PowerScale and ECS

Arcitecta is aiming the product at editors in sports production, broadcast, and media entertainment environments who need access to growing video file content “for live productions and rapid post-event workflows. Editors working remotely often experience delays due to slow transfers and playback speeds, which extend the time to the final product.” Remote editors can work collaboratively, creating highlight reels or editing live footage almost instantly, “dramatically cutting post-production time.”

Mediaflux Real-Time supports real-time editing with faster content delivery, removes single-location workflow bottlenecks, and enhances remote collaboration. Content can be played back in real time across sites. It “eliminates the need to buy and configure dedicated streams or connections to each editing location, requiring only a single stream to transfer the data to multiple sites – reducing cost and infrastructure requirements.”

We asked Arcitecta how Mediaflux Real-Time differs from the 2024 release of Livewire. Lohrey told us: “Mediaflux Real-Time is a file system (shim) that intercepts all file system traffic and uses Livewire to transport changes to other locations/file systems in real-time.”

“Livewire is a system/fabric that can be asked to transmit a set of data from A to N destinations. What is different here is that we are transmitting file system operations as they happen. For that to happen our file system end point is in the data path and dispatching changes/modifications to other end-points with Livewire. That is, we have tapped into (by being in the data path) the file system and teeing off the modifications as they happen.” In practice, this means:

  • I make a file -> transmitted
  • I rename a file -> transmitted
  • I write to a file -> transmitted
  • I delete a file -> transmitted (although the receiving end may decide not to honor that)
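
For illustration only, here is a minimal, hypothetical sketch of that tee-off idea in Python, using the watchdog library to forward file events to a transport queue. A real in-path shim sees operations synchronously in the data path, and this is not Arcitecta’s code; the watched path and the queue standing in for Livewire are assumptions.

```python
# Conceptual sketch only: a user-space watcher that tees file system events
# off to a transport queue, standing in for the in-path shim plus Livewire
# described above. Not Arcitecta code; the path and queue are hypothetical.
import queue
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

transport = queue.Queue()  # stand-in for a Livewire-style transport to other sites

class TeeHandler(FileSystemEventHandler):
    def on_created(self, event):
        transport.put(("create", event.src_path))   # I make a file -> transmitted
    def on_moved(self, event):
        transport.put(("rename", event.src_path, event.dest_path))  # rename -> transmitted
    def on_modified(self, event):
        transport.put(("write", event.src_path))    # write -> transmitted
    def on_deleted(self, event):
        transport.put(("delete", event.src_path))   # delete -> transmitted (receiver may ignore)

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(TeeHandler(), path="/mnt/media", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        observer.stop()
        observer.join()
```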

Mediaflux Real-Time is available immediately. It is part of the Mediaflux and Livewire suite and works seamlessly with a wide range of storage and infrastructure solutions and protocols.

Arcitecta and Dell Technologies will showcase Mediaflux Real-Time, combined with Dell PowerScale and ECS, in the Dell Technologies booth #SL4616 at the NAB Show, April 6-9, at the Las Vegas Convention Center.

Storage news ticker – March 31

Amazon Web Services (AWS) wants to build a two-story datacenter for tape storage in Middlesex, England. Its building application has been granted. The planning documents say: “This datacenter will be a data repository which requires significantly less power consumption than a typical datacenter. This building will be designed to house tape media that provides a long-term data storage solution for our customers. It will utilize magnetic tape media.” The lucky tape system supplier has not been identified.

AWS tape storage datacenter planned for Hayes, Middlesex

CoreWeave dropped its IPO target price to $40/share for 37.7 million shares, valuing it at around $23 billion and raising $1.5 billion. It had planned to sell between $47 and $55/share, with 49 million shares on offer, valuing it at up to $32 billion and raising up to $2.7 billion. The company reported 2024 revenues of almost $2 billion with a net loss of $863 million. CoreWeave shares should be available on Nasdaq today. It is thought that Microsoft’s reported withdrawal from datacenter leases, implying a lower-than-expected growth rate for GPU-heavy AI processing, spooked investors during CoreWeave’s pre-IPO investor roadshow.

DDN has been recognized as a winner of the 2025 Artificial Intelligence Excellence Awards by the Business Intelligence Group for its Infinia 2.0 storage system. See here for more details on the awards and a complete list of winners and finalists.

HighPoint announced its RocketStore RS654x Series NVMe RAID enclosures, measuring less than 5 inches tall and 10 inches long, with PCIe 4.0 x16 switch architecture, built-in RAID 0, 1, and 10, and up to 28 GBps transfer speeds. These four- and eight-bay enclosures are specifically designed for 4K and 8K video editing, 3D rendering, and other high-resolution applications.

IBM announced Storage Ceph as a Service so clients can leverage the block+file+object storage software as a fully managed, cloud storage experience on-premises. It’s designed to reduce operational costs by aligning spending with actual usage, avoiding under-utilization and over-provisioning, and scaling on-demand. Prices start at $0.026/GB/month. More information here.

NVMe TCP-connected block storage supplier Lightbits has an educational blog focusing on block storage, which “is evolving into a critical component of high-performance, accelerated data pipelines.” Read it here.

Microsoft has announced new capabilities for Azure NetApp Files (ANF): 

  • A flexible service level separates throughput and capacity pricing, saving customers 10-40 percent – think of it as a “pay for the capacity you need, and scale the performance as you grow” model.
  • Application Volume Groups are now available for Oracle and SAP workloads, simplifying management and optimizing performance.
  • A new cool access tier with a snapshot-only policy offers a cost-effective solution for managing snapshots – allowing customers to benefit from cost savings without compromising on restore times.

A blog has more.

OneTrust has launched the Privacy Breach Response Agent, built with Microsoft Security Copilot. When a data breach occurs, privacy teams have to analyze security requirements and regulatory privacy requirements if personal data is compromised. Privacy and breach notification regulations are fragmented and complex, varying by geography and type of data, and the notification windows are often very short. The Privacy Breach Response Agent enables privacy teams to evaluate the scope of the incident, identify jurisdictions, assess regulatory requirements, generate guidance, and coordinate and align with the InfoSec response team. More information on the agent can be found here.

Other World Computing (OWC) launched its Jellyfish B24 and Jellyfish S24 Storage products. The Jellyfish B24 delivers a cost-effective, high-capacity solution for seamless collaboration and nearline backup, while the Jellyfish S24 offers a full SSD production server with lightning-fast performance for demanding video workflows. The B24 has four dedicated SAS ports to which you can connect B24-E expansions via a mini-SAS cable, included with every expansion chassis. By adding four B24-E expansion chassis to a B24 head unit, the total storage capacity can reach up to 2.8 petabytes.

The SSDs in the S24 are the OWC Mercury Extreme Pro SSDs. The S24 can be combined with an OWC Jellyfish S24-E SSD expansion chassis for up to 736 TB of fast SSD storage.

M&E market-focused file and object storage supplier OpenDrives is introducing a cloud-native data services offering, dubbed Astraeus, that merges on-premises, high-performance storage with the ability to provision and manage integrated data services as in the public cloud. Customers can “easily repatriate their data, bringing both data and cloud-native applications back on-premises and into the security of a private cloud.” Compute and storage resources scale independently with dynamic provisioning and orchestration capabilities. Astraeus follows an unlimited-capacity pricing model, licensed per node instead of per capacity, for cost predictability. OpenDrives will be exhibiting at the 2025 NAB Show in Las Vegas, Booth SL6612 in the South Hall Lower, April 6-9.

PNY announced its CS2342 M.2 NVMe SSD in 1 and 2 TB capacities with PCIe Gen 4 x4 connectivity. It has up to 7,300 MBps sequential read and 6,000 MBps sequential write speeds. The product supports TCG Pyrite and has a five-year or TBW-based warranty.

The co-CEO of Samsung, Han Jong-Hee, has died from a heart attack at the age of 63. Co-CEO Jun Young-hyun, who oversees Samsung’s chip business, is now the sole CEO. Han Jong-Hee was responsible for Samsung’s consumer electronics and mobile devices business.

SMART Modular Technologies announced it is sampling its redefined Non-Volatile CXL Memory Module (NV-CMM) to Tier 1 OEMs based on the CXL 2.0 standard in the E3.S 2T form factor. “This product combines non-volatile high-performance DRAM memory, persistent flash memory and an energy source in a single removable EDSFF form factor to deliver superior reliability and serviceability for data-intensive applications … PCIe Gen 5 and CXL 2.0 compliance ensures seamless integration with the latest datacenter architectures.” View it as a high-speed cache tier.

SMART Modular’s NV-CMM details

There will be an SNIA Cloud Object Storage Plugfest in Denver from April 28 to 30. Learn more here. There will also be an SNIA Swordfish Plugfest at the same time in conjunction with SNIA’s Regional SDC Denver event. Register here.

Team Group announced the launch of the TEAMGROUP ULTRA MicroSDXC A2 V30 Memory Card, which delivers read speeds of up to 200 MBps and write speeds of up to 170 MBps. The ULTRA MicroSDXC A2 V30 meets the A2 application performance standard with a V30 video speed rating and a lifetime warranty.

Tiger Technology has officially achieved AWS Storage Competency status.

Financial analyst Wedbush has identified eight publicly owned suppliers it believes will benefit greatly from an exploding AI spending phase by businesses. It says: “While there is a lot of noise in the software world around driving monetization of AI, a handful of software players have started to separate themselves from the pack … We believe the use cases are exploding, enterprise consumption phase is ahead of us in the rest of 2025, launch of LLM models across the board, and the true adoption of generative AI will be a major catalyst for the software sector and key players to benefit from this once in a generation Fourth Industrial Revolution set to benefit the tech space.” Wedbush identifies Oracle and Salesforce as the top opportunities. The others are Amazon, Elastic, Alphabet, IBM, Innodata, MongoDB, Micron Technology, Pegasystems, and Snowflake. “The clear standout over the last month from checks has been the cloud penetration success at IBM which has a massive opportunity to monetize its installed base over the next 12 to 18 months.”

Prophecy 4.0 brings self-service data prep to Databricks analysts

Databricks data lake analysts can use Prophecy software to build their own data prep pipelines for downstream analytics and AI processing.

Prophecy, which describes itself as a data transformation copilot company, provides a Databricks-focused AI and analytics data pipeline development tool that collects data from multiple corporate sources – structured or unstructured, on-premises or in the cloud. The software then transforms the data and delivers it to Databricks SQL queries. Its AI-powered visual designer generates standardized, open code that extracts, transforms, and loads the required data. It automatically builds the necessary data pipelines and tests, generates documentation, and suggests fixes for errors. The v4.0 ETL product delivers self-service, production-ready data preparation that operates within guardrails defined by central IT.
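
For a sense of what such generated pipeline code can look like, here is a hedged, hypothetical PySpark sketch of a simple extract-transform-load job writing a governed table for Databricks SQL. It is not Prophecy’s actual output; the paths, table, and column names are invented.

```python
# Illustrative only: the sort of extract-transform-load code a visual designer
# might generate for Databricks. Paths, table names, and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_prep").getOrCreate()

# Extract: raw CSV landed from an analyst's source system
raw = spark.read.option("header", True).csv("/Volumes/raw/sales/orders.csv")

# Transform: basic cleansing and typing so downstream SQL can use the data
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount") > 0)
)

# Load: write a governed table for analysts to query with Databricks SQL
clean.write.mode("overwrite").saveAsTable("analytics.sales.orders_clean")
```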

Roger Murff, VP of Technology Partners at Databricks, said: “Organizations have put their most valuable data assets into Databricks, and Prophecy 4.0 makes it easier than ever to make that data available to analysts. And because Prophecy is natively integrated with the Databricks Data Intelligence Platform, platform teams get centralized visibility and control over user access, compute costs, and more.”

Prophecy Studio visual design interface

Prophecy v4.0 features include: 

  • Secure data loading from commonly used sources such as files via SFTP (Secure File Transfer Protocol), SharePoint, Salesforce, and Excel or CSV files from analysts’ desktops
  • Last-mile data operations, allowing analysts to send results to Tableau and notify stakeholders via email
  • Built-in automation with a drag-and-drop interface that lets analysts run and validate pipelines without the need for separate tools
  • Data profiles showing distribution, completeness, and other attributes
  • Packages of reusable, governed components
  • Simplified version control
  • Real-time pipeline observability to track performance and detect failures, thereby reducing downtime

A blog by Prophecy’s Mitesh Shah, VP for marketing and analyst relations, declares: “Self-service data preparation has been a game-changer for accelerating AI and analytics.” Analysts can get data prepared themselves and don’t have to hand the task off to data engineers, theoretically saving time and duplicated effort.

Shah says that as “Prophecy is deeply integrated with Databricks, we allow organizations to enforce cluster limits and cost guardrails automatically.” That prevents data prep costs from getting out of hand. Prophecy works with Databricks’ Unity Catalog and “analysts inherit existing permissions from Databricks.”

Prophecy Studio code interface
Raj Bains

Prophecy was founded by CEO Raj Bains and Vikas Marwaha in 2017. It raised $47 million in a B-round in January, justified by 3.5x revenue growth in 2024 with 160 percent net revenue retention from existing customers. It has raised a total of $114 million, $35 million in a 2023 B-round, $25 million in a 2022 A-round, and $7 million before that.

Bains stated: “Analysts can design and publish pipelines whenever they want, with security, performance, and data access standards predefined by IT. We’ve visited companies where analysts would outline data workflows in their data prep tools and then engineers downstream would recode the entire pipeline from scratch with their ETL software. It was a huge waste of time and energy. With Prophecy 4.0, everything is done once.”

There will be a Prophecy v4.0 in action webinar on April 24; those looking to attend can register here.

Infinidat and Veeam push VMware migration to Red Hat OpenShift

Infinidat and Veeam are encouraging VMware migration to Red Hat OpenShift Virtualization (RHOS-V) by providing immutable backups from a RHOS-V system using Infinidat storage to an InfiniGuard target system.

The two companies are positioning their joint initiative as a way to protect important petabyte-scale virtualization workloads, with billions of files, migrated from VMware. Their scheme is based on Veeam’s ability to protect the underlying RHOS-V Kubernetes container workloads using its Kasten v7.5 software and Infinidat’s CSI driver. Infinidat provides InfiniBox block, file, and container-access storage arrays that use memory caching to speed data access across all-flash, hybrid, and disk drive storage media. Infinidat is in the process of being acquired by Lenovo.

Erik Kaulberg

Erik Kaulberg, VP of Strategy and Alliances at Infinidat, stated: “Infinidat’s comprehensive support for Veeam Kasten v7.5 enables large-scale Kubernetes production deployments that are reliable, robust, and cyber secure … InfiniBox systems can scale to hundreds of thousands of persistent volumes. Partners like Veeam and Red Hat help fuel our containers innovation pipeline, providing a steady stream of enhancements that help our joint customers simplify all aspects of their container storage environments at enterprise scale.”

Gaurav Rishi, VP for Kasten Product and Partnerships at Veeam, said: “Kubernetes has become a vital part of enterprise infrastructure, especially in large enterprises and service providers, from its infancy as a DevOps application development and deployment environment to now being a production platform for delivering enterprise-class business applications. It is essential for our mutual customers that Veeam and Infinidat provide a highly cyber resilient, highly scalable, and highly performant next-generation data protection solution.”

Gaurav Rishi

Kasten v7.5 was released earlier this month and extended source system support to RHOS-V and also SUSE Virtualization. The software was faster at backing up large data volumes – for example, achieving 3x faster backups of volumes containing 10 million small files. It provided multi-cluster FIPS support to adhere to strict US government benchmarks, visibility into immutable restore points, and support for object lock capabilities in Google Cloud Storage.

The v7.5 release added Infinidat InfiniBox integration, based on Infinidat’s InfiniSafe immutable snapshot technology for persistent file and block volumes. The release also added NetApp support. Veeam now includes Infinidat in its Veeam Ready for Kubernetes program.

Analyst house GigaOm rated Veeam’s Kasten subsidiary as a Leader and Outperformer for the fifth time in its latest GigaOm Radar for Kubernetes Protection.

Mike Barrett

Infinidat and Veeam say customers can now bring new, existing, and large-scale VMware and other virtual machine workloads and virtualized applications to Kubernetes and container deployments, such as RHOS-V.

Mike Barrett, VP and GM for Hybrid Platforms at IBM-owned Red Hat, said: “As the virtualization landscape continues to evolve, many organizations are looking for a future proof virtualization solution. Red Hat OpenShift provides a complete application platform for both modern virtualization and containers, and through our collaboration with Infinidat and Veeam, users can leverage enhanced capabilities to scale and protect their VM and Kubernetes workloads.”

The smart way to tackle data storage challenges

SPONSORED FEATURE: Object storage used to trade performance for cost and scalability. With features like high performance, data intelligence, and simple management, HPE is updating the technology for modern use cases.

When object storage first launched in the late 1990s, it enabled companies to tackle a perennial problem: how to store large amounts of data at low cost. That requirement hasn’t gone away, with research company IDC predicting that the volume of enterprise data stored globally will expand at a CAGR of 28.2 percent from 2022-27. But now there’s an additional need to process information faster to meet the demands of more modern applications and workloads.

Object storage scales out to manage vast amounts of data across distributed systems of general-purpose storage nodes. It’s excellent for unstructured data like multimedia files, storing extensive metadata with those objects for more advanced data management and retrieval. However, this mass-scale storage traditionally came with a performance penalty.

Companies have lived with that trade-off by restricting their object storage to low-performance applications such as archiving and large digital repositories where retrieval speed wasn’t a factor. Now, with the rise of data-intensive applications like analytics, AI, and modern data protection, there’s more demand for high-performance object storage that can handle low-latency, high-speed data storage and retrieval.

Traditional object storage can’t keep up

Traditional object storage has not been able to keep up. In November 2024, HPE addressed the issue by launching HPE Alletra Storage MP X10000, powered by AMD EPYC™ embedded processors. The company’s first home-grown entry into the object storage market is an all-flash solution that adds speed to scalability and data intelligence capabilities, while also making object storage easier to manage.

The X10000 handles the same high-volume storage applications that legacy object storage tackles, but its fast operation makes it particularly well-suited for modern use cases, including AI, says HPE. It’s good at supporting the AI lifecycle, for example – an application area that IDC estimates will drive $21.9 billion in global enterprise spending by 2028 – because its low-latency retrieval can help to accelerate training and inference.

HPE is especially targeting HPE Alletra Storage MP X10000 at the newer generation of generative AI (GenAI) applications. With enterprises embracing retrieval-augmented generation (RAG) as a way to tailor large language model (LLM) technology to their own applications and data, they need rapid retrieval of indexed unstructured data.

The driving force behind the X10000’s improved performance is what HPE calls data intelligence, a real-time process in which data is scanned as it is ingested into the object store. HPE creates vector embeddings from the data (numerical values that represent semantic meaning), enabling it to quickly retrieve data based on similar concepts. The device stores these embeddings in a vector database so that large language models can fold the retrieved data into their responses.

Data intelligence makes data stored on the X10000 ready for AI applications to use as soon as they are ingested. The company has a demo of this, where the X10000 ingests customer support documents and enables users to instantly ask it relevant natural language questions via a locally hosted version of the DeepSeek LLM. This kind of application wouldn’t be possible with low-speed legacy object storage, says the company.
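
As a generic illustration of the vector-embedding retrieval described above (not HPE’s implementation), the sketch below embeds a few documents at ingest time and retrieves the closest match to a natural-language query by cosine similarity. The model name and document texts are examples only.

```python
# Conceptual sketch of embedding-based retrieval: turn text into vectors at
# ingest time, then find semantically similar content at query time.
# Generic example code, not HPE's data intelligence pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # example embedding model

# Ingest: embed documents as they land in the object store
docs = [
    "How to reset the storage controller password",
    "Firmware upgrade procedure for the flash modules",
    "Replacing a failed NVMe drive without downtime",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)   # unit-length vectors

# Query: embed the question and rank documents by cosine similarity
query_vec = model.encode(["drive swap while the system stays online"],
                         normalize_embeddings=True)[0]
scores = doc_vecs @ query_vec        # cosine similarity, since vectors are normalized
best = int(np.argmax(scores))
print(docs[best], float(scores[best]))
```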

The X10000’s all-NVMe storage architecture helps to support low-latency access to this indexed and vectorized data, avoiding front-end caching bottlenecks. Advances like these provide up to 6x faster performance than the X10000’s leading object storage competitors, according to HPE’s benchmark testing.

Software-defined flexibility

HPE designed the X10000 from the start to be deployed on-premises and in the cloud thanks to its software-defined architecture. Its control software runs on a containerized Kubernetes (K8s) platform. The X10000 is a modular system based on what HPE has dubbed a Shared Everything Disaggregated Architecture (SEDA), with compute and storage (Just a Bunch Of Flash, or JBOF) nodes scaling independently of each other. Organizations can lean towards performance or capacity according to their own needs when adding modules, rather than having to scale storage and compute linearly with each other. HPE’s research suggests that this disaggregated architecture can reduce storage costs by up to 40 percent.

The containerized architecture opens up options for inline and out-of-band software services, such as automated provisioning and life cycle management of storage resources. It is also easier to localize a workload’s data and compute resources, minimizing data movement by enabling workloads to process data in place rather than moving it to other compute nodes. This is an important performance factor in low-latency applications like AI training and inference.

Another aspect of container-based workloads is that they can all interact with the same object storage layer. The X10000 architecture offers native support for Amazon’s S3 API, widely regarded as the de facto standard for object storage.

Data analytics is the third leg of this market, representing a $17.1 billion opportunity in 2028, calculates IDC. These applications increasingly draw on data lakes holding all manner of structured and unstructured data. The X10000’s low latency makes it suitable for advanced real-time analytics applications, which increasingly use AI algorithms.

Good for data protection

While AI and analytics are strong growth areas for the X10000, it’s also likely to gain significant traction in high-speed, scalable backup and restore applications. This is a focal point for HPE, which views data protection and backup storage as a core market for the device. Its all-flash architecture gives it the IOPS to recover and restore data quickly, which becomes more important as the scale of backup data grows. Meanwhile, high-performance recovery helps to minimize the business impact of an outage.

Enterprises can rarely, if ever, afford data corruption or leakage, which is why the X10000 includes several features designed to secure data. These include always-on monitoring and auditing for data integrity, along with object lock-based immutability for ransomware protection. Customers can apply authentication and encryption options, both for in-place and in-flight data. The system also features erasure coding, which gives customers more data resilience while preserving storage space; splitting data copies across multiple nodes and using parity blocks enables customers to recover data without storing full copies of it.
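
As a toy illustration of the parity idea (real systems use Reed-Solomon-style codes spread across many nodes, and this is not HPE’s scheme), a single XOR parity block is enough to rebuild any one lost data block without keeping a full copy:

```python
# Toy illustration of erasure coding's core idea: one XOR parity block can
# rebuild any single lost data block. Not HPE's implementation; real systems
# use more sophisticated codes across many nodes.
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data_blocks = [b"node-A-chunk", b"node-B-chunk", b"node-C-chunk"]
parity = xor_blocks(data_blocks)          # stored on a fourth node

# Suppose node B is lost: rebuild its chunk from the survivors plus parity
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
assert recovered == data_blocks[1]
```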

This performance boost has already been demonstrated in real-world environments. French technology integrator and IT service provider AntemetA was a prime candidate to beta test the X10000, given its work with AI applications and its provision of backup-as-a-service options for customers.

AntemetA primarily tested the X10000 for data protection and data analytics applications, explained its pre-sales architect Jeff Charpentier at a recent HPE Discover conference. The former is a key application for the company given regulatory changes in the EU. “Data protection is evolving at the moment. There are regulations around the world, like DORA [the Digital Operational Resilience Act] in Europe, and our customers want to accelerate recovery,” he says.

The X10000 passed the company’s tests, and then some, according to Charpentier. “We were quite amazed by the performance,” he continues. “On this system, we were able to reach 40 gigabits per second of write, and also 40 gigabits per second of read.”

Mastering storage management

Charpentier was also impressed by the X10000’s management features, which are based on the Data Services Cloud Console (DSCC). This integrates with HPE GreenLake to create a cloud-based management system that enables customers to manage their entire HPE storage infrastructure – including on-premises and cloud-based systems like the HPE Alletra Storage MP B10000 (formerly GreenLake for Block) – in a single interface.

Unifying management of the entire HPE Alletra Storage range makes it easier to monitor and control not just structured and unstructured data from that one interface, but also block, file, and object storage. This is a big advantage for the X10000 over legacy object storage systems which can be difficult to manage, says HPE.

DSCC enables customers to control functions ranging from performance optimization based on AI-driven predictive analytics, through to rapid data restoration. It supports security features including encryption, intrusion detection, and detailed auditing. It also helps to simplify onboarding with automatic detection, deployment, and integration of new hardware components. That was a big deal for Charpentier, who experienced a component failure during his beta test.

“We do maintenance for our customers, so we are used to failing parts and failures,” he says. “During our test, we had a failure of one flash module among the 24. HPE just shipped a module, we replaced it online, and everything went on as usual with no interruption to the service.”

By supporting identity management for controlled access, DSCC provides a role-based management model that makes it easier for customers to use the same system for multiple functions without confusion, Charpentier adds: “What is interesting also in the DSCC console for management is that you can segregate infrastructure management roles from the DevOps roles.”

Maximizing performance for multiple AI applications

HPE has been forging industry cooperation to drive storage performance at a deep technical level, enabling direct memory access transfers between GPU memory, system memory, and indexed metadata storage. This results in reduced latency and CPU overhead, says HPE, with a further boost to overall system performance delivered by the HPE Alletra Storage MP’s AMD EPYC embedded CPUs. The AMD EPYC embedded processors at the system’s core are designed to offer a scalable x86 CPU portfolio delivering maximum performance with enterprise-class reliability in a power-optimized profile.

HPE is clearly targeting a big list of applications for the new X10000 model, which means it had to build in sufficient flexibility for customers to tailor it for their specific use cases. The company has also tried to make that scalability more financially manageable by offering the product on a subscription basis.

How can companies best take advantage of a unit like the X10000? It’s a massively scalable system, but you don’t have to start big. Beginning with as few as three nodes, you can test your assumptions and gain confidence with the management interface before scaling up, concentrating on storage capacity, compute, or a mixture of both. That will help you to minimize the total cost of ownership by avoiding overprovisioning.

Object storage isn’t going anywhere. IDC expects it to grow at a five-year CAGR of 14.9 percent through 2027 across on-premises and public cloud environments. But it needed reinventing for a new age of AI, data analytics, and data protection at mass scale. The X10000’s software-defined architecture and data intelligence feature deliver high performance while also simplifying management by unifying everything under a single HPE GreenLake cloud interface.

With these enhancements, HPE is banking not so much on nudging the object storage needle as sending it spinning at rapid speed. It seems to be succeeding – and it has the customer stories to prove it.

Sponsored by Hewlett Packard Enterprise and AMD.

Datadobi boosts StorageMAP for smarter data management

The Datadobi StorageMAP v7.2 release adds metadata and reporting facilities to help customers lower costs, operate more sustainably, and track a wider range of object storage data.

StorageMAP software scans and lists a customer’s file and object storage estates, both on-premises and in the public cloud, and can identify orphaned SMB protocol data. There are new archiving capabilities that allow customers to identify and relocate old or inactive data to archive storage, freeing up primary data stores on flash or disk. Datadobi cites a Gartner study, saying: “By 2028, over 70 percent of I&O [infrastructure and operations] leaders will implement hybrid cloud storage strategies, a significant increase from just 30 percent last year.” The implication is that managing a hybrid on-prem/public cloud file and object data estate efficiently will become more important.

Michael Jack, Datadobi

CRO Michael Jack stated: “Unstructured data continues to grow at an unprecedented pace, yet many I&O leaders still struggle to gain appropriate levels of visibility and control over their environments.”

StorageMAP has a metadata scanning engine (mDSE) with parallelized multi-threaded operations, a metadata query language (mDQL), an unstructured data workflow engine (uDWE) and an unstructured data mobility engine (uDME) to move between storage tiers and locations. It works across on-premises and public cloud environments, converts SMB and NFS files to S3 objects, and is deployed as a Linux VM.

Datadobi file scanning uses multi-threading to process directory structures in parallel. As object storage, with its flat address space, does not have any nested file/directory structure, StorageMAP’s scanning engine splits the object namespace into subsets, scanning them in parallel to lower scan times.
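
As a generic sketch of the prefix-splitting approach described above (not StorageMAP’s engine; the bucket name and prefixes are hypothetical), a flat S3 namespace can be partitioned by key prefix and the subsets scanned in parallel threads:

```python
# Generic sketch of parallel object-namespace scanning by prefix.
# Not StorageMAP code; the bucket name and prefix scheme are examples.
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "media-archive"        # hypothetical bucket

def scan_prefix(prefix):
    """List every object under one prefix and return (prefix, count, bytes)."""
    count = total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix):
        for obj in page.get("Contents", []):
            count += 1
            total += obj["Size"]
    return prefix, count, total

# Split the flat namespace into subsets (here: by first character of the key)
prefixes = [chr(c) for c in range(ord("a"), ord("z") + 1)]

with ThreadPoolExecutor(max_workers=8) as pool:
    for prefix, count, size in pool.map(scan_prefix, prefixes):
        print(f"{prefix}*: {count} objects, {size} bytes")
```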

New metadata facilities in v7.2 let customers track costs, carbon emissions, and other StorageMAP tags with greater precision.

v7.2 introduces automated discovery for Dell ECS and NetApp StorageGRID object stores, enabling customers to identify their tenants and their associated S3 buckets. It extends its orphan data functionality to NFS environments so that it can identify and report on data not currently owned by any active employee. This feature applies to all data accessed over the SMB and NFS protocols.

The new software can find and classify data suitable for GenAI processing, “enabling businesses to feed data lakes with relevant, high-quality datasets” for use in retrieval-augmented generation (RAG). An enhanced licensing model lets customers scale their use of StorageMAP’s features according to their specific requirements.

Datadobi told us that, today, data is found based on customer-supplied criteria applied to intrinsic metadata and to assigned, enriched metadata such as StorageMAP tags. Customers can search for files satisfying certain metadata properties, such as last access, owner (active or deactivated), file path, and file type. This is what Datadobi calls “intrinsic metadata,” i.e. metadata that comes with the file (or object) in the storage system. These sources of metadata help to identify which data is not relevant for feeding into AI training or for use in RAG-based queries.

In the future, StorageMAP will employ algorithms that examine patterns in metadata to point to potentially interesting files or objects that are candidates for deep scanning of their content. Output from that analysis can result in tags being assigned, and can also serve as a guide for which files or objects to copy to storage feeding GenAI, agentic AI, or other next-generation applications. The core problem with deep scanning massive numbers of files and objects is the time required to conduct the scans. Creative methods are therefore required to contain the deep scans and focus them on subsets of data that will allow meaningful insight to be reached in a timely fashion. Solutions that execute brute-force scans of billions of files and objects would literally run for years, which is not tenable.

Bootnote

The Gartner study is titled “Modernize File Storage Data Services with Hybrid Cloud.” The report has three recommendations:

  • Implement hybrid cloud data services by leveraging public cloud for disaster recovery, burst for capacity, burst for processing and storage standardization.
  • Build a three-year plan to integrate your unstructured file data with the public cloud infrastructure as a service (IaaS) to match your objectives, SLAs, and cost constraints.
  • Choose a hybrid cloud file provider by its ability to deliver additional value-add services, such as data mobility, data analytics, cyber resilience, life cycle management, and global access. 

Databricks partners with Anthropic, Palantir on enterprise AI

Datalake supplier Databricks has signed a deal with Palantir, developed a better way to fine-tune large language models (LLMs), and is including Anthropic’s Claude LLMs with its datalake in a five-year alliance.

Databricks bought generative AI model maker MosaicML for $1.3 billion in June 2023 and developed its technology into Mosaic AI. This is a set of tools and services to help customers build, deploy, and manage GenAI models with retrieval-augmented generation (RAG) applied to their proprietary data. It is integrated into Databricks’ datalake, the Data Intelligence Platform. Now it is also integrating the Palantir Artificial Intelligence Platform (AIP) with the Data Intelligence Platform and offering Anthropic’s reasoning-level AI models through it. The TAO (Test-time Adaptive Optimization) initiative uses test-time compute to augment and simplify model tuning without needing labeled data.

Ali Ghodsi

Ali Ghodsi, Databricks co-founder and CEO, stated: “We are bringing the power of Anthropic models directly to the Data Intelligence Platform – securely, efficiently, and at scale – enabling businesses to build domain-specific AI agents tailored to their unique needs. This is the future of enterprise AI.”

Anthropic co-founder and CEO Dario Amodei said: “This year, we’ll see remarkable advances in AI agents capable of working independently on complex tasks, and with Claude now available on Databricks, customers can build even more powerful data-driven agents to stay ahead in this new era of AI.”

Palantir’s AIP is a workflow builder that uses GenAI models to analyze natural language inputs, generate actionable responses, and, in military scenarios where it could analyze battlefield data, suggest strategies and tactical responses. Earlier Palantir AI systems, Gotham and Foundry, are data integration and analytics systems designed to help users analyze complex, disparate datasets and produce action-oriented responses, particularly, in Gotham’s case, for military and national security datasets. AIP builds on Palantir’s ontology (Foundry) framework, a digital representation of an organization’s operations, to bring more context to its responses.

Databricks is bringing Palantir’s military-grade security to its Unity Catalog and Delta Sharing capabilities. The Unity Catalog is a unified governance software layer for data and AI within the Databricks platform. Delta Sharing is Databricks’ way of sharing third-party data. Databricks says that joint Databricks and Palantir customers will be able to use GenAI, machine learning, and data warehousing within a secure, unified, and scalable environment.

Dario Amodei in YouTube video

Rory Patterson, chairman of the board of Databricks Federal, said the combination of Unity Catalog and Delta Sharing with the Palantir system “will deliver the best of both worlds to our joint customers.”

The Anthropic deal enables Databricks to offer Anthropic models and services, including its newest frontier model, Claude 3.7 Sonnet, natively through the Databricks Data Intelligence Platform, available via SQL query and model endpoint. This means that “customers can build and deploy AI agents that reason over their own data.” Claude can “handle large, diverse data sets with a large context window to drive better customization.”

Databricks says its Mosaic AI provides the tools to build domain-specific AI agents on customers’ own data “that deliver accurate results with end-to-end governance across the entire data and AI lifecycle, while Anthropic’s Claude models optimize for real-world tasks that customers find most useful.”

The Unity Catalog works with Anthropic Claude, providing the ability for users to enforce access controls, set rate limits to manage costs, track lineage, implement safety guardrails, monitor for potential misuse, and ensure their AI systems operate within defined ethical boundaries. Customers can customize Claude models with RAG by automatically generating vector indexes or fine-tuning models with enterprise data.

The fine-tuning angle brings us to TAO and its way of side-stepping the need for labeled data. An LLM trained on general data can be fine-tuned through additional training using input items paired with labeled output items that clearly indicate the desired response. This “teaches” the model to generate better responses by adjusting its internal parameters as it compares its predictions against the labels. For example, the input could be “Is rain wet?” for a weather-related session, with an answer of “Yes.”

However, this can involve a huge amount of human labeling effort, with tens or even hundreds of thousands of input-output labels. A Databricks blog explains that TAO uses “test-time compute … and reinforcement learning (RL) to teach a model to do a task better based on past input examples alone, meaning that it scales with an adjustable tuning compute budget, not human labeling effort.”

The model “then executes the task directly with low inference costs (i.e. not requiring additional compute at inference time).” Unexpectedly, TAO can achieve better model response quality than traditional fine-tuning. According to Databricks, TAO can bring inexpensive, source-available models like Llama close to the performance of proprietary models like GPT-4o, o3-mini, and Claude 3.7 Sonnet.

The blog says: “On specialized enterprise tasks such as document question answering and SQL generation, TAO outperforms traditional fine-tuning on thousands of labeled examples. It brings efficient open source models like Llama 8B and 70B to a similar quality as expensive models like GPT-4o and o3-mini without the need for labels.” TAO should enable Databricks users to get more capability out of less expensive GenAI models.
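
As a highly simplified, hypothetical sketch of that loop (not Databricks’ method), the idea is to spend extra compute generating candidate responses for unlabeled prompts, score them automatically, and keep the best as self-generated training pairs for a subsequent update. The generator and scorer below are trivial stand-ins.

```python
# Highly simplified sketch of the TAO idea described above: use extra compute
# at tuning time to generate candidates for unlabeled inputs, score them, and
# train on the preferred ones. The generator and scorer are trivial stand-ins,
# not Databricks' models or method.
import random

def generate_candidates(prompt, n=4):
    # Stand-in for sampling n responses from the model being tuned
    return [f"{prompt} -> draft answer {i} ({random.random():.2f})" for i in range(n)]

def reward(prompt, response):
    # Stand-in for an automatic scorer (e.g. a reward model or task checker)
    return len(response)   # placeholder heuristic

def tao_step(unlabeled_prompts):
    training_pairs = []
    for prompt in unlabeled_prompts:
        candidates = generate_candidates(prompt)                  # test-time compute
        best = max(candidates, key=lambda r: reward(prompt, r))   # pick preferred response
        training_pairs.append((prompt, best))                     # self-generated "label"
    # A real pipeline would now run a reinforcement-learning / fine-tuning update
    # on these pairs; here we just return them.
    return training_pairs

print(tao_step(["Is rain wet?", "Generate SQL to count orders per day"]))
```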

To learn more about the Databricks and Anthropic partnership, sign up for an upcoming webinar with Ghodsi and Amodei. Anthropic’s Claude 3.7 Sonnet is now available via Databricks on AWS, Azure, and Google Cloud.

Data sovereignty in focus as Europe scrutinizes US cloud influence

Analysis: Rising data sovereignty concerns in Europe following Donald Trump’s election as US President in January are increasing interest in Europe-based storage providers such as Cubbit, OVHcloud, and Scaleway.

The EU has GDPR, NIS2, and DORA regulations that apply to customer data stored in the bloc. However, US courts could compel companies under US jurisdiction to disclose data, potentially overriding EU privacy protections in practice.

EU data regulations

GDPR, the General Data Protection Regulation, harmonized data privacy laws across Europe with regard to the automated processing of personal data as well as rules relating to the free movement of personal data and the right to have personal data protected. GDPR’s scope applies to any organization located anywhere that processes the personal data of EU residents.

NIS2, the EU’s Network and Information Security 2 cybersecurity directive, took effect in October last year with operational security requirements, faster incident reporting, a focus on supply chain security, harsher penalties for non-compliant organizations, and harmonized rules across the EU. While NIS2 focuses on the security of network and information systems, GDPR is concerned with the processing of personal data.

DORA, the EU regulation on Digital Operational Resilience for the financial sector, establishes uniform cybersecurity requirements for financial bodies in the EU. DORA and NIS complement and coexist with GDPR.

There is an EU-US Data Privacy Framework (DPF) set up in 2023. This is a legal agreement between the EU and US intended to allow the secure transfer of personal data to US companies that participate in the framework, thus ensuring that the data is protected at a level comparable to the EU’s GDPR. 

US companies – excluding banks and telecom providers – can self-certify through the Department of Commerce, committing to privacy principles like data minimization, purpose limitation, and transparency. Periodic reviews by the European Commission and data protection authorities will monitor compliance.

GAIA-X

Lastly, there is the European GAIA-X cloud framework, launched in 2019 to create a federated, secure, and sovereign digital cloud infrastructure for Europe. It aims to ensure that European data remains under European control, adhering to EU laws such as GDPR. But it is only a framework, and new and existing cloud service suppliers in the EU have to choose to adopt it.

US suppliers such as AWS, Microsoft Azure, and Palantir have joined GAIA-X, though not as full members. They remain subject to US jurisdiction under the CLOUD Act, potentially compromising the initiative’s goals. French founding member Scaleway left the organization in 2021 over such doubts.

US supplier EU data sovereign clouds

US-based public cloud suppliers have set up operations they say comply with GDPR rules. AWS has set up its European Sovereign Cloud, standalone cloud infrastructure physically located in the EU (starting with Germany), operated by EU-resident personnel, and designed to keep all data and metadata within EU borders. It enables customers to select specific EU region centers, such as Frankfurt, Ireland, and Paris, for data storage and processing.
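
At its simplest, that kind of residency control starts with pinning resources to an EU region when they are created. The generic example below is ordinary S3 usage, not the AWS European Sovereign Cloud tooling itself, and the bucket name is hypothetical; it creates a bucket constrained to the Frankfurt region.

```python
# Generic example of pinning data to an EU region at creation time.
# Ordinary S3 usage, not AWS European Sovereign Cloud tooling; the bucket
# name is hypothetical.
import boto3

s3 = boto3.client("s3", region_name="eu-central-1")   # Frankfurt

s3.create_bucket(
    Bucket="example-eu-resident-data",
    CreateBucketConfiguration={"LocationConstraint": "eu-central-1"},
)
```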

Azure supports GDPR constraints with an EU Data Boundary concept. This ensures customer data for services such as Azure, Microsoft 365, and Dynamics 365 is stored and processed within EU and European Free Trade Association (EFTA) regions. Azure also provides multiple EU regions, such as Germany, France and Sweden, to further localize data within the EU geography and says it supports the GAIA-X framework.

Google Cloud partners with EU suppliers, such as T-Systems in Germany, to offer local sovereign cloud options, restricted to, for example, Belgium, Finland, or Germany. Data residency and operations are managed within Europe, sometimes with encryption keys controlled by external partners rather than Google. Even Oracle has set up an EU-only sovereign cloud.

US law and EU data sovereignty

However, certain US legal rights affect the situation and raise doubts about the ability of US-based EU sovereign cloud providers to refuse US government requests for access to EU citizens’ data.

Section 702 of the Foreign Intelligence Surveillance Act (FISA 702), added in 2008, authorizes the warrantless collection of foreign communications by US intelligence agencies such as the NSA, targeting non-US persons located outside the United States for national security purposes. A Court of Justice of the European Union (CJEU) ruling in 2020 declared that FISA 702’s lack of judicial oversight and redress for EU citizens makes US privacy protections inadequate under GDPR.

The 2018 CLOUD (Clarifying Lawful Overseas Use of Data) Act raises questions about the vulnerability of US public cloud suppliers to government demands for access to their EU-stored data. It allows authorities to compel US-based tech companies to provide data about a specific person or entity, stored anywhere in the world, under a warrant, subpoena, or court order.

Companies can challenge such US orders in court if they conflict with foreign laws like GDPR, and if the target isn’t a US person and doesn’t reside in the US. This CLOUD Act could override the 2023 EU-US Data Privacy Framework.

Until a court rules that US public clouds and supplier-controlled EU sovereign clouds are not subject to FISA 702 and/or CLOUD Act requests for access to EU citizens’ data, and can refuse them, their solid adherence to EU data privacy laws must be in doubt.

We might imagine what could happen if the Trump administration told a US public cloud supplier to give it access to an EU citizen’s data. The supplier could well accede to such a request.

The most certain way for an EU organization to ensure that citizens’ private data is not accessible to US government inspection is to store it in strictly EU-controlled IT facilities, such as its own systems, France’s OVHcloud, Cubbit’s decentralized cloud, or other regional cloud storage suppliers.

Cohesity launches NetBackup 11 with quantum-proof encryption

In what appears to be a tangible sign of its commitment to its acquired Veritas data protection product portfolio, Cohesity has issued a v11.0 major release of NetBackup.

It features new encryption to defend against quantum attacks, more user behavior monitoring, and protection for a wider range of cloud services.

Vasu Murthy, Cohesity

Cohesity SVP and chief product officer Vasu Murthy stated: “This represents the most powerful NetBackup software release to date for defending against today’s sophisticated threats and preparing for those to come … The latest NetBackup features give customers smarter ways to minimize the impact of attacks now and post-quantum.”

NetBackup v11.0 adds quantum-proof encryption, claiming to guard against “harvest now, decrypt later” quantum computing attacks. Quantum-proof encryption is designed to resist attacks from quantum computers, which could potentially break current encryption methods. Encrypted data could, it is theorized, be copied now and stored for later decryption using a yet-to-be-developed quantum computer. Post-quantum cryptography uses, for example, symmetric cryptographic algorithms and hash functions that are thought to be resistant to such attacks. The US NIST agency is involved in these quantum-proof encryption efforts.
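
As a generic illustration of the symmetric-key approach mentioned above (not Cohesity’s NetBackup implementation), AES with a 256-bit key is generally considered to retain adequate strength against known quantum attacks; the sketch below encrypts and decrypts a data block with AES-256-GCM.

```python
# Illustration only: symmetric encryption with a 256-bit key (AES-256-GCM),
# the class of algorithm generally considered resistant to known quantum
# attacks. Generic library usage, not Cohesity's NetBackup implementation.
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit key keeps ~128-bit strength against Grover's algorithm
nonce = os.urandom(12)                      # 96-bit nonce, unique per message
aesgcm = AESGCM(key)

backup_block = b"deduplicated backup segment"
ciphertext = aesgcm.encrypt(nonce, backup_block, b"segment-0001")   # third arg: associated data

# Later: decrypt and verify integrity in one step
plaintext = aesgcm.decrypt(nonce, ciphertext, b"segment-0001")
assert plaintext == backup_block
```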

Cohesity claims that NetBackup v11.0 “protects long-term confidentiality across all major communication paths within NetBackup, from encrypted data in transit and server-side dedupe, to client-side dedupe, and more.”

v11.0 also includes a wider range of checks looking for unusual user actions, which Cohesity says is a “unique capability” that “can stop or slow down an attack, even when threat actors compromise administrative credentials.”

It automatically provides recommended security setting values. An included Adaptive Risk Engine v2.0 monitors user activity, looking for oddities such as unusual policy updates and user sign-in patterns. Alerts and actions can then be triggered, with Cohesity saying “malicious configuration changes can be stopped by dynamically intercepting suspicious changes with multi-factor authentication.”

A Security Risk Meter provides a graphic representation of various risks across security settings, data protection, and host communication. 

Security Risk Meter screenshot

NetBackup 11.0 has broadened Platform-as-a-Service (PaaS) protection to include Yugabyte, Amazon DocumentDB, Amazon Neptune, Amazon RDS Custom for SQL Server and Oracle Snapshots, Azure Cosmos DB (Cassandra and Table API), and Azure DevOps/GitHub/GitLab. 

It also enables image replication and disaster recovery from cloud archive tiers like Amazon S3 Glacier and Azure Archive.

The nonprofit Sheltered Harbor subsidiary of the US FS-ISAC (Financial Services Information Sharing and Analysis Center) has endorsed NetBackup for meeting the most stringent cybersecurity requirements of US financial institutions and other organizations worldwide.

A Cohesity blog by VP of Product Management Tim Burowski provides more information. NetBackup v11.0 is now available globally.

Rubrik taps NTT DATA to scale cyber-resilience offering globally

Fresh from allying with Big Four accounting firm Deloitte, Rubrik has signed an agreement with Japanese IT services giant NTT DATA to push its cyber-resilience services.

NTT DATA’s revenues for fiscal 2024 were $30 billion, with $18 billion of that generated outside Japan. Rubrik is much smaller, with revenue of $886.5 million in its latest financial year. Its focus on enterprise data protection and cyber-resilience led to a 2024 IPO and product deals with suppliers including Cisco, Mandiant, and Pure Storage. Rubrik and NTT DATA have been working together for some time and have now signed an alliance pertaining to security services.

Hidehiko Tanaka, Head of Technology and Innovation at NTT DATA, said: “We recognize the critical importance of cyber resiliency in today’s digital landscape. Our expanded partnership with Rubrik will significantly enhance our ability to provide robust security solutions to our clients worldwide.”

NTT will offer its customers Rubrik-influenced and powered advisory and consulting services, implementation and integration support, and managed services. Its customers will be able to prepare cybersecurity responses before, during, and after a cyber incident or ransomware attack affecting their on-premises, SaaS, and public cloud IT services. The NTT DATA partnership includes Rubrik’s ransomware protection services.

In effect, Rubrik is adding global IT services and accounting firms to its sales channel through these partnerships, giving it access to Fortune 2000 businesses and public sector organizations, and potentially offering an advantage over competitors such as Commvault, Veeam, and Cohesity. Once it has won a new customer this way, Rubrik will no doubt put its sales team to work on cross-sell and upsell opportunities.

Ghazal Asif, VP, Global Channel and Alliances at Rubrik, said: “As a trusted and longstanding strategic partner, Rubrik is proud to expand our collaboration with NTT DATA for cyber resilience. Together, we will empower hundreds of organizations with differentiated offerings that ensure rapid recovery from ransomware attacks and other cyber threats, no matter where their data lives.”

NTT DATA has used Commvault internally in the past, as a 2018 case study illustrates, and NTT UK is a Veeam partner.

N2WS adds cross-cloud backup for Azure, AWS, Wasabi

N2WS says its latest backup software provides cross-cloud backup across AWS, Azure, and Wasabi, which it claims lowers costs, and enables direct backup to cold-tier storage on Azure Blob and Wasabi S3.

N2WS (Not 2 Worry Software) backup uses cloud-native, platform-independent, block-level snapshot technology with an aim to deliver high-speed read and write access across Azure, AWS, and third-party repositories like Wasabi S3. It charges on a per-VM basis and not on a VM’s size. N2WS says it provides vendor-neutral storage and, with its cloud-native snapshot technology, delivers control over cloud backups, designed to unlock “massive cost savings.” N2WS is a backup and recovery supplier that has not broadened into cyber-resilience.

Ohad Kritz

Ohad Kritz, CEO at N2WS, claimed in a statement: “We excel in protecting data – that’s our specialty and our core strength. While others may branch into endpoint security or threat intelligence, losing focus, we remain dedicated to ensuring our customers are shielded from the evolving IT threat landscape.”

N2WS claims Backup & Recovery v4.4 offers:

  • Lower long-term Azure storage costs with up to 80 percent savings through Azure Blob usage, and more predictable costs compared to Azure Backup, with new per-VM pricing ($5) and optimized tiering
  • Seamless cross-cloud automated archiving with low-cost S3-compatible Wasabi storage
  • Faster, more cost-effective disaster recovery via Direct API integrations with AWS and Azure with greater immutability
  • New custom tags improve disaster recovery efficiency, with better failover and failback
  • Targeted backup retries, which retry only failed resources to cut backup time and costs

The tiered Azure backup is similar to N2WS’s existing and API-linked AWS tiering. Azure VM backup pricing is based on VM and volume size. An N2WS example says that if you have 1.2 TB of data in one VM, the cost of using Azure Backup would be $30 plus storage consumed versus $5 plus storage consumed for the same VM using N2WS.

N2WS Azure customers can also select both cross-subscription and cross-region disaster recovery for better protection with offsite data storage, we’re told.

The cross-cloud backup and automated recovery provides isolated backup storage in a different cloud and sidesteps any vendor lock-in risks associated with, for example, Azure Backup.

N2WS uses backup tags to identify data sources and apply backup policies. For example, AWS EC2 instances, EBS volumes, and RedShift databases are identified with key-value pairs. N2WS scans the tags and applies the right backup policy to each tagged data source. If a tag changes, a different backup policy can be applied automatically.
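
As a generic sketch of tag-driven policy selection (not N2WS code; the tag key, values, and policy names are hypothetical), the example below finds EC2 instances carrying a backup tag and maps each tag value to a policy:

```python
# Generic sketch of tag-driven backup policy selection, as described above.
# Not N2WS code; tag keys, values, and policy names are hypothetical examples.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Map tag values to backup policies (frequency/retention defined elsewhere)
POLICIES = {"gold": "hourly-30d", "silver": "daily-14d", "bronze": "weekly-8w"}

def instances_for_policy():
    assignments = {}
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(
        Filters=[{"Name": "tag-key", "Values": ["backup-tier"]}]
    ):
        for reservation in page["Reservations"]:
            for inst in reservation["Instances"]:
                tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
                tier = tags.get("backup-tier", "").lower()
                if tier in POLICIES:
                    assignments[inst["InstanceId"]] = POLICIES[tier]
    return assignments

print(instances_for_policy())   # e.g. {'i-0abc123...': 'hourly-30d'}
```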

It says its Enhanced Recovery Scenarios introduce custom tags, enabling the retention of backup tags and the addition of fixed tags, such as marking disaster recovery targets. This improvement enhances the differentiation between original and DR targets during failover. v4.4 has a Partial Retry function for policy execution failures, which retries backing up only the failed sources and not the successful ones.

N2WS Backup & Recovery v4.4 is now generally available. Check out N2WS’s cost-savings calculator here.