Backup target system supplier ExaGrid now supports Commvault’s Metallic backup software.
ExaGrid supplies deduplicating backup appliances that have a direct-to-disk landing zone with post-ingest deduplication to a virtually air-gapped, non-network-facing tier of disk storage for long-term retention, with a Retention Time-Lock option. The landing zone provides fast backup ingest, quick restores, and instant VM recoveries. ExaGrid has included Commvault as a supported backup vendor for nearly 15 years. Commvault’s Metallic is a SaaS backup application that leads the cloud-native data protection market, according to a GigaOm Sonar report. ExaGrid is now providing support to Metallic users who want to back up data on-premises.
Bill Andrews, ExaGrid’s president and CEO, said in a statement: “ExaGrid has been a target for Commvault software for nearly 15 years. We are excited to offer even more value to Commvault users by expanding our support to Metallic as well.”
Incoming Metallic backup data can be deduplicated and compressed at a data reduction ratio of up to 5:1. ExaGrid applies its own deduplication on top of that and can achieve a total data reduction ratio of up to 15:1 – three times more than Commvault alone. This increases the effective capacity of the ExaGrid system, lowering its cost per TB.
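The capacity arithmetic behind those ratios is straightforward; here is a minimal sketch (the 100TB raw figure is hypothetical, for illustration only, not ExaGrid sizing guidance):

```python
# How stacked data reduction ratios translate into effective capacity.
# The 100TB raw figure is hypothetical, for illustration only.
def effective_capacity_tb(raw_tb, reduction_ratio):
    """Logical data that fits on raw_tb of disk at a given reduction ratio."""
    return raw_tb * reduction_ratio

metallic_ratio = 5     # up to 5:1 from Metallic's own dedupe and compression
total_ratio = 15       # up to 15:1 once ExaGrid dedupes on top

raw_tb = 100
print(effective_capacity_tb(raw_tb, metallic_ratio))  # 500 TB logical
print(effective_capacity_tb(raw_tb, total_ratio))     # 1500 TB logical
print(total_ratio / metallic_ratio)                   # 3.0x more than Commvault alone
```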
For flexibility, ExaGrid supports Commvault Metallic with dedupe on or off, compression on or off, and any combination of these options.
ExaGrid supports many other backup source systems and protocols:
Acronis
Arcserve
Bacula
BridgeHead Healthcare Data Management
Dell EMC Networker
HPE Zerto
HYCU
IBM Spectrum Protect, IBMi/LaserVault
MicroFocus
Veeam Backup & Replication
Veeam backup to S3 object storage
Veritas BackupExec, NetBackup (NetBackup Accelerator, OST Integration, integration with NetBackup media server hardware, AIR (Auto Image Replication), GRT (granular level restore), optimized deduplication, and instant recovery)
ExaGrid also accepts backup data directly from applications such as IBM, IDERA, Microsoft SQL, Oracle (Dump and RMAN dump), and Redgate. It can be a target system for Linux/UNIX Direct Tar. Check out more source data sender details here.
Metallic supports a variety of other on-premises backup storage target systems, such as Dell PowerProtect and HPE StoreOnce, plus on-premises S3 object storage systems like Cloudian HyperStore, Dell ECS, and others.
The Acronis Threat Research Unit is sharing new insights into a misconfiguration in Microsoft Exchange Online settings that could result in email spoofing. Users who have a hybrid configuration of on-premises Exchange and Exchange Online communicating via connectors, as well as those who use a third-party email security solution, could be vulnerable to misconfiguration exploitation. The details are as follows: in July 2023, Microsoft announced changes to its Domain-based Message Authentication, Reporting and Conformance (DMARC) policy. If a tenant recipient’s domain points to a third-party email security solution that sits in front of Microsoft 365, Honor DMARC will not be applied unless enhanced filtering for connectors is enabled; if enhanced filtering is misconfigured, Honor DMARC will be ignored. You can find the full report here.
…
Steven Campbell
Ceramic-based archive storage startup Cerabyte has hired Steven Campbell, a former CTO with Hitachi Global Storage Technology (HGST) and Western Digital, as its CTO. He will define the company’s technical vision, integrating evolving customer needs to execute a leadership roadmap. Cerabyte says that throughout his career, Campbell’s contributions have been pivotal in shaping data storage technology – including implementing shingled magnetic recording (SMR) and driving the development and launch of the first helium-filled hard drive. In addition to HGST and Western Digital, Steve was previously CEO of Singapore-listed Magnecomp International, Thai-listed Magnecomp Precision Technology, and InnoTek.
…
Object storage supplier Cloudian has won Fastweb – one of Italy’s main telecommunications operators with 3.3 million wireline and 3.6 million mobile customers – as a customer. Fastweb’s Cloudian deployment includes 3.4PB of capacity in two datacenters. Cloudian CMO Jon Toor tells us: “Telcos continue to be a major market for us worldwide. Cloudian’s first customer, way back when, was a telco in Japan, so it makes sense.”
…
Backup and data management supplier Cohesity has recorded 26 percent revenue growth for its latest fiscal year, according to CEO Sanjay Poonen talking to Bloomberg.
…
IT and security data engine startup Cribl has raised $319 million in a Series E round, bringing its total capital raised to over $600 million and increasing its valuation by 40 percent to $3.5 billion. The round was led by new investor Google Ventures, with participation from GIC, CapitalG, IVP, and CRV. This is one of Google Ventures’ largest investments to date and marks the first by general partner Michael McBride, who will also join Cribl’s board of directors, where he will work closely with CEO Clint Sharp to help steer the business toward IPO. Cribl has doubled its annual recurring revenue (ARR) year-over-year and maintained net dollar retention (NDR) above 140 percent over the past 12 months.
Cribl has seen triple-digit customer growth for five straight years and a quarter of Fortune 500 companies are Cribl customers. Last year, Cribl surpassed $100 million in ARR, becoming the fourth-fastest infrastructure company to reach centaur status, following Wiz (1.5 years), HashiCorp (three years), and Snowflake (3.5 years). Recently, Cribl appeared on the Forbes Cloud 100 list (second consecutive year), Fortune Cyber60, Inc. 5000’s Fastest Growing Private Companies, CRN Security 100, and more. With more than 700 employees worldwide, Cribl is actively broadening its business footprint, with a presence in the US, Europe, and Australia.
…
Joseph George
Supercomputer and enterprise storage supplier DDN says Joseph George has joined the DDN team as field CTO and VP for Strategic Alliances. He was previously VP for strategic alliances at HPE, and with Cray until September 2018.
…
Dell says its PowerScale OneFS v9.9 OS adds support for 61TB QLC SSDs and 200GbE networking for both front-end and back-end fabrics. The front-end fabric provides 200GbE connectivity between clients and the PowerScale cluster, enabling seamless ingestion of massive datasets and ensuring your GPUs never go hungry for information. The back-end fabric provides 200GbE interconnects between storage nodes, meaning the PowerScale cluster itself becomes a high-speed data highway with rapid communication and efficient data distribution. Support will initially be for the Nvidia ConnectX-6 VPI 200GbE network card.
Dell says the AI market is rapidly evolving, and the infrastructure supporting it must evolve as well. 200GbE is proving to be a critical technology for enabling the next generation of AI applications; its speed, scalability, and low latency make it a good match for the demanding requirements of AI workloads. PowerScale may well support 400GbE and 800GbE speeds in the future.
…
Gartner gurus have predicted that 75 percent of enterprises will prioritize SaaS app backup as a critical requirement by 2028, compared to 15 percent in 2024, due to the increasing risk of IT outages. Worldwide end-user SaaS spending is projected to grow by 20 percent, totaling $247.2 billion in 2024 and forecast to reach nearly $300 billion in 2025. Michael Hoeck, senior director analyst at Gartner, said: “Given the vulnerability of SaaS data to errors, cyber attacks, and vendor mishaps, robust backup solutions are indispensable. Integrating backup as a service (BaaS) is essential for safeguarding cloud workloads and maintaining operational continuity.”
The SaaS application backup market is growing rapidly; initially led by specialized startups, it now also includes established enterprise backup and recovery software companies. Gartner says users should adopt third-party SaaS backup solutions to complement the native capabilities of SaaS vendors. By 2028, 75 percent of large organizations will adopt BaaS alongside on-premises tools to back up cloud and on-premises workloads. Gartner clients can read more in “Top Trends in Enterprise Backup and Recovery for 2024.”
…
On-prem and SaaS backup supplier HYCU now supports Microsoft’s Azure AD replacement, Entra ID, within its R-Cloud platform. This latest SaaS integration brings the total number of HYCU-supported applications and cloud services to more than 80. HYCU says Entra ID is the cornerstone of identity management for millions of organizations worldwide, with over 610 million monthly active users as of 2023 – including approximately 400 million from Microsoft tenants and 210 million from non-Microsoft workloads. HYCU for Microsoft Entra ID is available immediately in the HYCU Marketplace.
R-Cloud’s Entra ID support includes:
One-click restore of configurations, from individual items to the entire Microsoft Entra ID tenant;
Autopilot backups with “backup assurance,” providing 24/7 protection with complete logging and notifications;
Ransomware-proof copies stored in customer-controlled, immutable cloud storage;
Instant visualization of the entire data estate, exposing unprotected applications and third-party risks;
Unified protection and recovery across Microsoft Entra ID and Okta (Workforce Identity Cloud and Customer Identity Cloud) along with AWS IAM;
Additional protection and restore of Amazon Virtual Private Cloud (VPC), Amazon Route 53, AWS Web Application Firewall (WAF), AWS Parameter Store, Amazon Key Management Services (KMS).
…
Cloud file services supplier Nasuni has announced rapid growth in the media and advertising industry, claiming a 121 percent increase in data under management over the last 24 months. Leading media companies – including 9Rooftops Marketing, Centaur Media, TBWA, Omnicom, and Crain Communications – are actively looking to implement GenAI into production processes.
…
Data protector N2WS has named Jay Iparraguirre as global VP of sales and Nir Veledniger as head of customer success. These appointments come on the heels of hiring Alon Maimoni as CRO in March 2024. N2WS’s statement refers to its transformation to maximize the value of the N2WS platform and drive revenue growth by reshaping the company’s customer success, go-to-market strategy, and channel business. Iparraguirre will help move the company’s sales strategy from a reactive to a proactive approach, with an increased emphasis on outbound sales and channel partnerships. As head of customer success, Veledniger will focus on implementing robust onboarding processes and educational initiatives to ensure customers maximize product value.
From left, Alon Maimoni, Jay Iparraguirre, and Nir Veledniger
…
GPU supplier Nvidia announced its NIM Agent Blueprints catalog of pretrained, customizable AI workflows. NIM Agent Blueprints provide a jump-start for developers creating AI applications that use one or more AI agents. They include sample applications built with Nvidia NeMo, Nvidia NIM and partner microservices, reference code, customization documentation, and a Helm chart for deployment. The first NIM Agent Blueprints now available include a digital human workflow for customer service, a generative virtual screening workflow for computer-aided drug discovery, and a multimodal PDF data extraction workflow for enterprise retrieval-augmented generation (RAG) that can use vast quantities of business data for more accurate responses. NIM Agent Blueprints are free for developers to experience and download and can be deployed in production with the Nvidia AI Enterprise software platform.
…
Korea’s Chosun Daily reports that SK hynix is gradually increasing wafer input at its M15 production line in Cheongju, with the goal of boosting monthly wafer output by approximately ten percent early next year. Subsidiary Solidigm, facing strong SSD demand, turned a profit in the second quarter and plans to increase production by around five percent starting early next year. Solidigm leads the market with its 60TB QLC enterprise SSDs. SK hynix plans to release 128TB SSDs in early 2025.
…
Virtualized datacenter supplier VergeIO closed one of its most successful quarters ever, with record sales, a full pipeline, partner wins, and an expanding presence in Europe, Asia, and Latin America. It set a record for new customers that was 50 percent higher than any prior quarter, and twice the number closed in the first quarter – due in no small part to organizations seeking alternatives to VMware. Sales were also four times those of the second quarter of 2023. The biz saw a fivefold increase in its sales pipeline, including double-digit large enterprise prospects, and incoming interest from more than 6,000 potential new customers. Ten new reseller partners signed on during the quarter, and 80 percent of enrolled partners brought in new business opportunities. Outside the US, VergeIO now has active resellers in Canada, England, and elsewhere in Europe, Asia, and South America. New customers have signed on from Liechtenstein, Australia, Indonesia, Taiwan, England, and Canada.
…
Western Digital’s PCIe 5.0 DC SN861 enterprise SSD uses a controller from South Korean company Fadu, according to an AnandTech teardown.
…
Western Digital is investing ฿23.5 billion ($693 million) to expand its HDD manufacturing capacity in Thailand. The expansion is expected to generate an additional ฿200 billion ($5.897 billion) per year in exports. WD’s disk drive ASP is $163, so the export uplift implies the manufacturer is building an extra 36.18 million HDDs a year. In its latest quarter, WD manufactured 12.1 million drives – an annualized run rate of roughly 48 million – so the Thailand investment would increase its HDD manufacturing capacity by roughly 75 percent.
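A quick sketch of that arithmetic, taking the article’s dollar conversion and ASP at face value (an illustration, not WD guidance):

```python
# Sanity check of the Thailand export arithmetic using the article's figures.
export_uplift_usd = 5.897e9            # ฿200 billion a year, converted to dollars
asp_usd = 163                          # WD's disk drive average selling price

extra_drives = export_uplift_usd / asp_usd
annual_run_rate = 12.1e6 * 4           # 12.1M drives a quarter, annualized

print(round(extra_drives / 1e6, 2))    # 36.18 million extra drives a year
print(round(annual_run_rate / 1e6, 1)) # 48.4 million drives a year today
print(round(extra_drives / annual_run_rate * 100))  # ~75 percent capacity uplift
```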
A joint venture between Rubrik and MSP Assured Data Protection is taking managed Rubrik data protection and security services to prospective customers in Latin America.
Simon Chappell
Assured Data Protection (ADP) is Rubrik’s largest global managed services supplier. It has main offices in both the UK and US, and operates 24/7 in more than 40 countries where it has deployments. It has datacenter infrastructure in six worldwide locations.
ADP has expanded operations into Latin America, including the joint venture establishing Rubrik’s presence in the region through Assured’s 24/7/365 managed service. It reckons enterprises of all sizes across the continent will benefit from a more flexible approach to Rubrik deployment and be able to “rapidly recover” from cyberattacks, including ransomware.
Ghazal Asif Farhadi, Rubrik’s VP for Global Channels & Alliances, stated: “Rubrik is proud to support Assured’s new expansion into Latin America. Organizations need to enable cyber resilience in the face of the increasing cyber threat landscape and there is no better way for Latin American companies to do that than working with Rubrik and Assured.”
Ghazal Asif Farhadi
ADP has set up operations in markets such as Mexico, Peru, Costa Rica, Chile, and Colombia, with customers including Niubiz, América TV, Ferromex, Farmacias Roma, and others.
Fiorella Minaya Rey, ADP’s Latin America Channel Account Manager and a former Rubrik LATAM sales rep, stated: “To ensure that enterprises of all sizes can take advantage of Rubrik, there will be no limitations on the amount of data that customers can secure – it can be as small or as large as required.”
ADP is launching a regional Center of Excellence located in Costa Rica, staffed by engineers, and has hired Spanish-speaking technical, sales, and client services staff across the region to provide 24/7 support in all countries, across sectors including manufacturing, banking, education, and others.
Fiorella Minaya Rey
Simon Chappell, ADP CEO, said: “Our launch into Latin America is another important step forward in our ambitious growth plans for 2024 and beyond … Our new Center of Excellence is a strategic move to help build on our best-of-breed customer service for customers and will see all of our customers across the globe benefit from its expertise.”
ADP’s Costa Rica Center of Excellence will operate with a mix of human technical skills and automation, and many threats will be negated before customers are aware of them. This model and approach to cyber resiliency will be deployed by ADP as it expands into other regions, beyond Latin America, North America, and Europe.
The company launched its first Canadian datacenter in April.
Hitachi Vantara has partnered with Broadcom to produce an updated Hitachi Unified Compute Platform RS, powered by the newly launched VMware Cloud Foundation v9 software, unveiled at VMware Explore 2024.
Update: Hitachi Vantara query answers added. 30 August 2024
The Unified Compute Platform RS (UCP Rack Scale) relies on Hitachi’s Virtual Storage Platform One (VSP One) arrays and Cisco Nexus switches. UCP RS is turnkey infrastructure with lifecycle management features and a pay-per-use consumption model. It runs traditional VMs and containers, and is suited for on-premises AI workloads. Hitachi Vantara claims its design reduces greenhouse gas emissions and energy costs, stating that the inclusion of VSP One arrays reduces CO2 emissions by up to 96 percent and datacenter storage footprint by up to 35 percent.
Paul Turner, VP for products in Broadcom’s VMware Cloud Foundation Division, said in a statement: “This not only addresses the current challenges of data management and infrastructure modernization, but also aligns with organizations’ sustainability goals, making it an essential tool for businesses looking to thrive in the era of generative AI and beyond.”
UCP RS supports both vSAN (server-based virtualized storage) and external (VSP One) storage, and the pair say it offers 100 percent data availability. The infrastructure and hardware firmware lifecycle automation comes through Hitachi’s UCP Advisor and SDDC Manager.
Hitachi Vantara says its new UCP RS has “a particular focus on healthcare IT” as “the healthcare sector is experiencing exponential growth in data volume, velocity, and variety.”
We asked the company some questions about this new UCP RS:
Blocks & Files: Does the “RS” mean Rack Scale?
Hitachi Vantara: Yes, but we do not expand RS in the official brand.
Blocks & Files: Is it true that the Hitachi Unified Compute Platform RS combines compute, storage, and networking resources into a single, pre-configured, and optimized system that can be rapidly deployed, easily managed, and scaled by adding racks as needed?
Hitachi Vantara: Yes, it’s a co-engineered turnkey solution that brings together all datacenter components and delivers advanced automation and enterprise-grade scale, performance, and availability. The solution can be scaled up to eight racks per system.
Blocks & Files: Does it use servers supplied by Cisco Systems – Cisco UCS (Unified Computing System) servers? Where do the compute and networking come from, and from which suppliers?
Hitachi Vantara: Servers are Hitachi-branded DS and HA series; networking switches are from Cisco, or Brocade (Broadcom) for FC-SAN.
Blocks & Files: Isn’t this just a Hitachi version of the Dell EMC-Cisco Vblock?
Hitachi Vantara: Hitachi’s UCP RS, a co-engineered integrated solution, provides deep integration between UCP Advisor and VCF Operations (SDDC Manager) for a unified management experience for day0-2 cloud operations including provisioning, configuration, lifecycle management, monitoring, auditing, troubleshooting, etc. Network management and firmware upgrades are reliably managed by UCP Advisor. UCP Advisor delivers an advanced multi-tenancy framework that aligns well with workload domains with VMware Cloud Foundation. The biggest source of differentiation is the most scalable, resilient Virtual Storage Platform (VSP) arrays for enterprise private and hybrid cloud environments such as 100% data availability guarantee, and 3-DC fail-safe architecture, enabling multi-failure domains that match or exceed the availability offered by public cloud regions.
Blocks & Files: Can you elaborate on how UCP RS might address healthcare’s data challenges?
Hitachi Vantara: Yes, data sovereignty issues make healthcare providers store data on-premises.
VCF v9
Broadcom’s VMware Cloud Foundation (VCF) 9 features:
Unified Operations and Automation with a self-service cloud portal for provisioning services, reducing the total number of management consoles from more than a dozen to just a single console each for operations and automation. New integrated workflows are intended to simplify the transition between operations and automation tasks, and better insights and analytics are designed to enable more proactive management.
Expanded VCF Import reduces the complexity and downtime associated with manual migrations of existing environments into VCF. It will have the ability to import VMware’s NSX, vDefend, Avi Load Balancer, and more complex storage topologies into existing VCF environments, and use and integrate older versions of existing infrastructure. A new UI will simplify management and deployment.
Memory Tiering with NVMe to enhance data-intensive applications, such as AI, databases and real-time analytics by reducing latency and accelerating data throughput, which is good for training and inference tasks and helps with storage cost-efficiency.
Integrated VCF Multi-Tenancy previously provided separately by VMware Cloud Director.
Native VPC Deployment to simplify networking by enabling users to access self-service isolated connectivity without VLAN complexities and enable non-disruptive integration with existing networks.
Accelerated Adoption of VMware Private AI Foundation with Nvidia for deploying, managing, and scaling AI-driven applications securely and efficiently on VCF-based private clouds with vGPU profile visibility, GPU reservations, data indexing and retrieval service, and an AI agent builder service.
Unified VCF Security Management with a centralized information hub and a new security view. Configuration drift detection will correlate and proactively notify IT of inconsistencies in system configurations across an entire VCF fleet.
Native vSAN-to-vSAN Data Protection with Deep Snapshots: vSAN remote snapshot replication will have a deep history of immutable snapshots, reduce downtime with enterprise-grade DR orchestration, and simplify the management experience with a unified appliance. It supports vSAN disaggregated storage for increased scalability and storage efficiency. Customers can use the immutable vSAN snapshots to recover from ransomware attacks with an on-premises Isolated Recovery Environment, complementing the existing cloud-based ransomware recovery offering.
Advancing Cyber Threat Prevention: vDefend will expand to deliver new features such as distributed firewall rule impact analysis to help simplify micro-segmentation security policy operations; distributed intrusion detection and prevention (IDPS) enhancements to support large, dense and multi-instance VCF environments; rapid threat assessment to help harden security posture by enabling threat profiling of VCF environments; and on-premises malware prevention for regulated organizations that require air-gapped deployment of VCF. Project Cypress will deliver GenAI-based intelligent assistance to help IT security teams proactively triage sophisticated threat campaigns and recommend remediation options.
Purnima Padmanabhan
Broadcom has also launched VMware Tanzu Platform 10 and Tanzu AI Solutions. Purnima Padmanabhan, Broadcom’s Tanzu division GM, said in a statement: “Now, with Tanzu Platform’s built-in AI development framework, developers can build high-performing, intelligent apps, regardless of their experience level or knowledge of Python. Tanzu AI Solutions help development teams go from dabbling in the sandbox to deploying enterprise-ready intelligent applications to production with confidence.”
Tanzu Platform 10 automates both application and platform management tasks such as patching vulnerabilities, performing rolling upgrades, and enforcing policies with broad visibility and AI-powered insights. Developers can use simple operations to automate secure container builds, bind services to apps, deploy code with a single command, and easily scale applications.
Pinecone has made its AWS-supporting vector database available on the Azure and Google clouds and added a bulk object storage insert capability.
Startup Pinecone has developed a database specifically to store vector embeddings – the numeric representations of text, image, audio, and video objects used in semantic search by generative AI’s large language models (LLMs) to build responses to users’ requests. Its vector database, already available on AWS, is serverless, meaning users don’t need to be concerned about the underlying server instance infrastructure across what are now three public clouds.
Jeff Zhu
Pinecone’s Director of Product Management, Jeff Zhu, writes in a blog post: “With our vector database at the core, Pinecone grounds AI applications in your company’s proprietary data.” A customer’s own data can be used to help with RAG (Retrieval-Augmented Generation), enabling a generally trained LLM to base its responses on a customer’s proprietary data, making them more accurate and less prone to erroneous results or “hallucinations.”
Cisco’s Sujith Joseph, Principal Engineer, Enterprise AI & Search, writes that by using Pinecone’s “vector database on Google Cloud, our enterprise platform team built an AI assistant that accurately and securely searches through millions of our documents to support our multiple orgs across Cisco.”
Pinecone has released Pinecone Assistant in beta as a managed service on Google Cloud. This, Zhu says, “delivers high-quality and dependable answers for text-heavy technical data such as financial and legal documents.” Pinecone points out that “all the infrastructure, operations, and optimization of a complex Q&A system are handled for you” through a simple API.
Pinecone serverless code example
Zhu writes: “Along with AWS and GCP, you can now build with serverless on the cloud and region that suits you best [and] we’re introducing new features to give you greater control and protection over your data. This includes backups for serverless indexes and more granular access controls.”
The backups, he adds, “enable seamless backup and recovery of your data.” Available to all Standard and Enterprise users, these features allow you to:
Protect your data from system failures or accidental deletes
Revert bad updates or deletes and restore an index to a known, good state
Meet compliance requirements (e.g. SOC 2 audits)
“You can manually backup and restore your serverless indexes via a simple API. Backups for serverless are now in public preview for all three clouds.”
The granular access controls come from an API Key Roles feature, also in public preview. These “enable Project Owners to set granular access controls – NoAccess, ReadOnly, or ReadWrite – for both the Control Plane and Data Plane within Pinecone serverless.”
Pinecone will “be introducing more User Roles at the Organization and Project levels in the coming weeks. Organization-level User Roles including Org Owner, Billing Admin, Org Manager, and Org Member will let you determine access to managing projects, billing, and other users in the organization. Project-level User Roles including Project Owner, Project Editor, and Project Viewer will let you determine access to API keys, the Control and Data planes, and other users in the project.”
Object storage upload
The company has also enabled bulk insert of records from object storage, which – we’re told – is up to six times cheaper than doing normal record-by-record updates and inserts (upserts, in Pinecone’s terminology). It’s intended for the insertion of millions of objects. Data is read from a secure bucket in a customer’s object storage. This is an asynchronous, background, long-running batch-like process with, Pinecone says, “no need for performance tuning or monitoring the status of your import operation.”
Pinecone bulk insert code example
As a cost example, Pinecone suggests that “ingesting 10 million records of 768-dimension will cost $30 with bulk import.”
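That quoted figure squares with simple arithmetic, assuming each vector value is a 32-bit float (the datatype isn’t stated, so that is an assumption) and the $1.00/GB early-access rate:

```python
# Back-of-the-envelope check of the $30 bulk-import example above.
# Assumes 4 bytes per dimension (float32) - an assumption, since the
# datatype isn't stated - and the flat $1.00/GB import rate.
records = 10_000_000
dims = 768
bytes_per_value = 4

size_gb = records * dims * bytes_per_value / 1e9
cost_usd = size_gb * 1.00              # flat $1.00/GB import pricing

print(round(size_gb, 2))               # 30.72 GB
print(f"${cost_usd:.2f}")              # $30.72, in line with the quoted ~$30
```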
Customers first need to integrate their object store (e.g. Amazon S3) with Pinecone. This integration lets the customer store IAM credentials for their object store, which can be set up or managed via the Pinecone console. Import is done from a new API endpoint that supports Parquet source files.
We asked if Pinecone vectorizes the object data during the insert from object storage. Zhu told us: ”The current implementation requires that the content has already been vectorized. We are exploring vectorization during import for a future release.”
We understand that by developing a specialized and serverless vector database, Pinecone reckons it will have both a performance (search time, latency) edge on competitors, including other dedicated vector databases and multi-protocol databases such as SingleStore.
Availability
Pinecone serverless is available for all Standard and Enterprise customers and currently supports the “eastus2” (Virginia) region on Azure, and Google Cloud’s “us-central1” (Iowa) and “europe-west4” (Netherlands), with more regions coming soon for both. Customers can start building on Pinecone serverless using one of Pinecone’s sample notebooks or subscribe through the Azure or Google Cloud marketplaces.
Import from object storage is now available in early access mode for Standard and Enterprise users at a flat rate of $1.00/GB. It is currently limited to Amazon S3 for serverless AWS regions. Support for Google Cloud Storage (GCS) and Azure Blob Storage will follow in the coming weeks.
Bulk imports are limited to 200 million records at a time during early access and import operations are restricted to writing records into a new serverless namespace; you currently cannot import data into an existing namespace.
Bootnote
Pinecone’s VP of R&D, Ram Sriharsha, has provided a technical deep dive on the company’s serverless database, writing: “Traditionally, vector databases have used a search engine architecture where data is sharded across many smaller individual indexes, and queries are sent to all shards. This query mechanism is called scatter-gather – as we scatter a query across shards before gathering the responses to produce one final result. Our pod-based architecture uses this exact mechanism.
“Before Pinecone serverless, vector databases had to keep the entire index locally on the shards. This approach is particularly true of any vector database that uses HNSW (the entire index is in memory for HNSW), disk-based graph algorithms, or libraries like Faiss. There is no way to page parts of the index into memory on demand, and likewise, in the scatter-gather architecture, there is no way to know what parts to page into memory until we touch all shards.
“We need to design vector databases that go beyond scatter-gather and likewise can effectively page portions of the index as needed from persistent, low-cost storage. That is, we need true decoupling of storage from compute for vector search.”
His blog then describes Pinecone’s serverless approach and provides performance numbers.
Faced with the possibility of being delisted by Nasdaq due to its stock being worth less than $1, Quantum is implementing a reverse stock split to drive up the per-share price above a dollar.
The data management and tape library business was told by Nasdaq in March that, because its stock price had been below the exchange’s $1 minimum for more than 30 days, it had to regain compliance with this listing rule or leave. Quantum previously used a reverse stock split in April 2017 to avoid a similar NYSE delisting threat. The company later transferred to Nasdaq and has now had to repeat the maneuver.
Quantum’s stock price had declined in recent months due to accounting issues related to the pricing of standalone components in product bundles. This affected its ability to file SEC reports for its first, second, and third quarters of fiscal 2024, and led to a recalculation of its results for fiscal 2022 and 2023.
The business had also been hit by a hyperscaler ceasing purchases of its tape libraries, which resulted in a substantial revenue drop. This was revealed when it filed its fiscal 2024 numbers in June. At the time, chairman and CEO Jamie Lerner said: "Our full year 2024 results reflect a significant reduction of revenue from our largest hyperscale customer, which we had expected would scale down over time but instead stopped placing orders at the end of fiscal Q1 2024."
Now Quantum is implementing a 1-for-20 reverse stock split, meaning investors will own 1 share of its common stock for every 20 they owned before. If the 20 shares were priced at $0.19 each, then the price for one new share will nominally be 20 x $0.19, or $3.80. The split was approved by the company’s stockholders at its 2024 annual meeting. The reverse stock split took effect at 4:01 p.m. Eastern Time on August 26, reducing the number of outstanding shares of the company’s common stock from approximately 95,849,938 to approximately 4,792,497.
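The arithmetic of the split can be checked directly, using the figures from the article (the $0.19 price is the article's own illustrative example):

```python
# Worked example of Quantum's 1-for-20 reverse stock split.
old_shares = 95_849_938
ratio = 20
new_shares = round(old_shares / ratio)   # ≈ 4,792,497 post-split shares

old_price = 0.19
new_price = ratio * old_price            # nominally $3.80 per new share

# Total market value is unchanged: share count drops 20x while the
# per-share price rises 20x.
```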
The reverse stock split will affect all shareholders uniformly and will not change any shareholder’s percentage ownership interest.
Quantum is focusing on its ActiveScale object storage and Myriad scale-out, all-flash, storage operating system software to drive revenues and regain profitability and long-term growth.
HighPoint has lifted the covers off a new line of external NVMe RAID enclosures, designed to elevate Gen4 storage applications to “new heights”, it says.
The RocketStor 654x series promises x16 transfer performance and “nearly half a petabyte” of storage capacity, which can be integrated into any x86 platform with a free PCIe x16 slot.
HighPoint RocketStor 654x.
The enclosures are aimed at industrial and edge computing platforms, and professional workstation environments. They are designed to support applications that must rapidly access, transfer, and process large volumes of data, such as AI training, media and entertainment post-production, and medical imaging or diagnostic platforms.
Available with four or eight hot-swappable, vertically aligned 2.5-inch drive bays, and powered by the supplier’s PCI switching architecture and RAID technology, the enclosures can deliver up to 28GB/s of transfer bandwidth. They also support up to eight DC and Enterprise class U.2 and U.3 media, configured into as many as four independent RAID 0, 1 or 10 arrays.
The ability to fully optimize x16 lanes of Gen4 upstream or direct CPU bus bandwidth, combined with self-bifurcation (managing and distributing x48 lanes of internal host bandwidth), is a “major game-changer,” claimed HighPoint.
“This advanced architecture enables each RocketStor 654x enclosure to reach its full performance and capacity potential, while allowing for a remarkably compact 4.84 inch chassis,” it said.
Each system provides x16 lanes of dedicated PCIe Gen4 upstream bandwidth, and x4 lanes of dedicated downstream bandwidth to each U.2 or U.3 SSD. The architecture is also designed to minimize latency and “significantly enhances” signal integrity to streamline I/O transmission between the host computer and NVMe storage, we’re told.
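A back-of-envelope check shows the quoted 28 GB/s figure is plausible against the x16 Gen4 link it rides on (this is generic PCIe arithmetic, not HighPoint's own accounting):

```python
# PCIe Gen4 signals at 16 GT/s per lane with 128b/130b encoding,
# giving roughly 1.97 GB/s of usable bandwidth per lane.
lanes = 16
per_lane_gbs = 16 * (128 / 130) / 8      # ≈ 1.969 GB/s per lane
raw_x16_gbs = lanes * per_lane_gbs       # ≈ 31.5 GB/s theoretical ceiling

# 28 GB/s delivered is roughly 89 percent of the x16 Gen4 ceiling,
# before protocol and controller overhead.
utilization = 28 / raw_x16_gbs
```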
The external RocketStor 654x enclosures are equipped with low-decibel cooling fans, and full manual fan-control enables administrators to adjust the cooling system based on ambient conditions, including an option to disable the fan for workflows that demand complete silence. HighPoint says the external form factor enables admins to easily expand or upgrade storage capacity for any PC platform with PCIe Gen4 x16 connectivity.
The NVMe RAID enclosures will begin shipping next month, and will be available worldwide direct from the supplier online or through approved distributors and resellers.
The RocketStor 6542AW 8-Bay PCIe 4.0 x16 External NVMe RAID Enclosure has a list price of $2,299. The price for the RocketStor 6541AW 4-Bay PCIe 4.0 External NVMe RAID Enclosure is yet to be confirmed.
Data lakehouse provider Starburst has outlined the key data management targets and practices driving AI projects, based on research conducted among enterprises.
It commissioned a report from TheCUBE Research, which surveyed 300 IT professionals from diverse industries across the US and Western Europe, confirming the “critical role” of real-time hybrid data access and “robust” security in successful AI implementations.
“Real-time data access and robust security are paramount in the successful deployment of AI technologies,” said Shelly Kramer, managing director and principal analyst at TheCUBE Research. “The insights from this comprehensive report provide a roadmap for businesses to align their data strategies with their AI innovation goals.”
The report found strong AI adoption intent, with 87 percent of organizations expressing a “strong” or “very strong” desire to implement AI within the next 12 months, with “significant progress” reported by 86 percent of respondents.
However, in terms of technical obstacles, 52 percent of organizations said they faced “significant hurdles” in organizing structured data for machine learning with AI applications, and 50 percent cited difficulty in preparing unstructured data for retrieval-augmented generation (RAG) in AI deployments.
Additionally, 49 percent said aligning business intelligence metrics with features required for predictive analytics was a challenge, and 41 percent said it was tricky to use LLMs to refine and extract structure from semi-structured information for use in SQL DBMSes.
The most significant operational barriers to accessing high-quality data for AI projects are data privacy/security concerns (28 percent) and data volume (25 percent). Other barriers cited by respondents include insufficient data quality or reliability (17 percent), and a lack of the right tools and talent – both 11 percent.
Almost two-thirds (62 percent) of those surveyed highlight real-time data access as “critical” for AI success. Respondents also identified strategies for building a data-driven culture: increasing awareness of data’s value (69 percent), fostering cross-functional collaboration (66 percent), and embedding that culture company-wide (61 percent).
Other key trends in data management are shaping the AI landscape, with 52 percent of respondents adopting data governance and federated data access strategies to improve data quality and accessibility across systems, including on-premises and in the cloud.
In addition, 59 percent are using cloud-based platforms for scalability, and 61 percent are using agile methodologies for data project management.
“With our advanced and user-friendly open hybrid lakehouse platform, customers can navigate the complexities of data management with greater ease, efficiency, and accuracy, to drive transformative AI outcomes,” said Justin Borgman, co-founder and CEO of Starburst.
To help it scale up, Starburst recently appointed Steven Chung as president, Tobias Ternstrom as chief product officer, and Adam Ferrari as senior vice president of engineering.
DDN enterprise storage subsidiary Tintri is developing a disaster recovery feature with autonomous detection and alerting to combat ransomware attacks.
Tintri supplies virtual machine-aware VMstore storage managed with VMware storage concepts rather than traditional block storage. There is, for example, no provisioning of LUNs or volumes. VMstore users can instantly clone, replicate, and snapshot VMs, and sync data between VMs and between the disks of VMs.
The GLAS-DP (Global Live Analytics System – Data Protection) facility autonomously detects, alerts, and enables rapid recovery from a ransomware event afflicting a VMstore, Tintri says. It is based on real-time analytics of VMstore device IO streams and metadata. GLAS-DP enables customers to employ recovery capabilities via the Tintri Global Center AI platform. This provides management of a customer’s VMstore systems from a single console.
GLAS-DP is a suite of facilities that provides instantaneous alerts on potential threats, allowing IT managers to restore points in time, recover within milliseconds, and roll forward, we’re told. Two-factor authentication is included with the suite as is a central dashboard. It continuously collects and analyzes data from Tintri storage systems across multiple locations, with the analytics providing insights into data usage patterns, performance metrics, and potential risks. It can detect unusual access patterns or performance bottlenecks, enabling Tintri admins to deal with the issues.
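The kind of unusual-access-pattern detection described here can be illustrated with a toy statistical check on an IO-rate stream. Tintri's actual GLAS-DP analytics are proprietary; this sketch only conveys the general idea of flagging samples that deviate sharply from a recent baseline:

```python
from statistics import mean, stdev

def is_anomalous(io_rates, latest, z_threshold=3.0):
    """Flag the latest IO-rate sample if it deviates from the recent
    baseline by more than z_threshold standard deviations.

    A ransomware encryption run typically produces a sudden surge in
    write IO, which a check like this would surface for alerting.
    """
    mu, sigma = mean(io_rates), stdev(io_rates)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold
```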
We understand that this is thought by Tintri to be particularly relevant given the CrowdStrike incident and corruption of thousands of Windows systems, and myriad recent malware attacks.
Phil Trickovic
Back in July, Tintri SVP Phil Trickovic stated that CrowdStrike experienced a “major cybersecurity technology issue with their endpoint and detection response agent (EDR). This is resulting in widespread disruption on a global level, as many installations are affected across the world.”
He said: “VMstore offers near zero ‘in frame’ environment rollback. To further enhance recovery options Tintri delivers near zero cloud and second site recovery point objectives. Workloads running on VMstore can rapidly rollback the base OS to the last usable point-in-time. Recovery and restoration take place in seconds ensuring continuous business operations.”
Recovery in milliseconds is the goal with GLAS-DP. The suite can optimize the timing and frequency of backups, local and remote snapshots, and sync/async replication to provide both protection and resource use efficiency.
Tintri will be exhibiting at the VMware Explore conference in Las Vegas (booth #1518), August 26-29, where it is unveiling GLAS-DP. It will also showcase its turnkey and managed Tintri Cloud Platform and container-driven Tintri Cloud Engine (TCE), which decouples Tintri’s AI-powered software from the VMstore T7000 hardware platform. TCE runs on the AWS public cloud, serving as an AWS VM with EBS storage.
GLAS-DP will be available for customers using the Tintri Global Center AI platform in September 2024.
Japan’s Nikkei reports that Kioxia has filed for an IPO on the Tokyo Stock Exchange, with the listing scheduled for October.
A valuation in excess of ¥1.5 trillion ($10.3 billion) is suggested, and Kioxia told the paper: “Preparations are underway for a listing at an appropriate time.” Kioxia is the fifth largest SSD supplier globally by units and capacity shipped, according to TrendForce numbers for the second 2024 quarter. Samsung, Western Digital, Micron, and SK hynix lead the market in that order. The FT says Kioxia is seeking to raise at least $500 million.
In April we reported that NAND and SSD manufacturer Kioxia was preparing an IPO to recapitalize itself, against the background of a ¥900 billion ($5.8 billion) loan repayment due in June. Kioxia expects to return to profitability in its next financial year, ending March 2025, as the NAND market picks up, boosted by demand for fast AI data storage and model training checkpoints.
Earlier IPO plans were curtailed by the Covid pandemic and a NAND market slowdown. That market is now recovering. A possible merger with Kioxia’s NAND manufacturing joint-venture partner Western Digital in 2023 was derailed by SK hynix.
Kioxia is 56.24 percent owned by a Bain Capital-led private equity consortium, which bought its stake from Toshiba in 2017 for $18 billion and includes NAND and SSD maker SK hynix amongst its investors. Toshiba still has a 40.64 percent stake in Kioxia. The Nikkei, quoting unnamed sources, says Bain and Toshiba would sell off part of their holdings in phases after the IPO. Conversely, it’s possible SK hynix could increase its stake to have more influence over, and a closer relationship with, Kioxia.
A ¥1.5 trillion valuation would signify a 5:1 price:earnings ratio based on annual net income of ¥300 billion, according to the FT. This is less than the approximate 10:1 PE ratio of Samsung and Western Digital, due to Kioxia’s debt, history, market position and an upside incentive to potential stock buyers.
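The FT's ratio falls straight out of the two figures quoted:

```python
# P/E sanity check from the FT figures.
valuation_yen = 1.5e12        # ¥1.5 trillion suggested valuation
net_income_yen = 300e9        # ¥300 billion annual net income

pe_ratio = valuation_yen / net_income_yen   # 5.0, i.e. roughly 5:1
```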
Wedbush analyst Matt Bryson told subscribers today that the $10 billion valuation “is roughly just half the valuation that Kioxia was seeking in 2020 when it last attempted to list.” He added that such an IPO would have “positive implications for WD. While Western Digital only has about 2/3 of the capacity of Kioxia, we believe its NAND business is also more profitable and WD has significantly less debt than Kioxia.”
Analyst house Gartner has named Veeam number one for market share in the global enterprise backup and recovery software market for the first time, displacing Veritas from the top position.
Veeam achieved this ranking as part of the Gartner Market Share Analysis: Enterprise Backup and Recovery Software, Worldwide, 2023 report, based on a market share of 15.1 percent, revenues of $1.5 billion and 11.8 percent year-over-year growth in 2022-2023.
Blocks & Files has obtained a summary of the paid report from Gartner to examine the performance of other major vendors in the backup and recovery market. But first the puff from Veeam in response to its new Gartner ranking.
“Today, every organization relies on the availability of data no matter what happens. When the worst happens – whether due to ransomware or a natural disaster or an inadvertent security update – data resilience is critical,” said Anand Eswaran, CEO at Veeam, in a statement. “At Veeam, we are powering data resilience from backup and recovery to end-to-end ransomware protection for over 550,000 organizations in over 150 countries.”
The report provides an analysis of vendor market share, vendor performance, key trends affecting the market, and significant mergers and acquisitions for the calendar year.
Gartner states: “The enterprise backup and recovery software market grew at 5.1 percent in 2023, ending the year at nearly $10 billion in total revenue.
“Veeam returned to double-digit growth after experiencing revenue growth of 9.4 percent in 2022. In 2023, it expanded its revenue by 11.8 percent, and became number one in revenue and market share.”
However, it was a mixed bag for the other players in the market. “While the vendor showing the fastest revenue growth grew at close to 19 percent, revenue growth was slower for most market leaders, with an average under 3 percent,” said the analyst.
Cohesity and Acronis actually grew the most in terms of revenue, by 18.9 percent and 16 percent, respectively.
The top five in terms of revenue are: Veeam ($1.5 billion), Veritas ($1.49 billion), Dell ($1.27 billion), IBM ($1.04 billion), and Commvault ($753 million).
While Veeam holds a 15.1 percent market share, Veritas slipped to 15 percent, with Dell on 12.8 percent, IBM on 10.4 percent, and Commvault with 7.6 percent. Illustrating the crowded market, other vendors made up 39.2 percent of the market.
Compared to their 2022 growth, Veritas, IBM and Commvault experienced higher revenue growth of 3.2 percent, 3.3 percent and 5.3 percent, respectively, while Dell experienced a revenue decrease of 1.8 percent.
Veeam told us it is twice as large as Commvault, 3.3x bigger than Rubrik, and 3.5x bigger than Cohesity in revenue. In fact, Veeam is almost as big as Cohesity, Commvault, and Rubrik combined ($1.503 billion vs $1.637 billion), and Veeam has the fastest growth of the top five leaders (2x Commvault, 3.5x Veritas and IBM).
We have charted the actual top five revenue numbers and estimated Rubrik and Cohesity revenues, using Veeam’s 3.3x and 3.5x multiplier comparisons, respectively:
Gartner numbers plus calculated Rubrik and Cohesity revenues
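The Rubrik and Cohesity estimates follow directly from Veeam's multiplier claims, and they reconcile with the "almost as big as the three combined" comparison in the article:

```python
# Estimating Rubrik and Cohesity revenues from Veeam's multiplier claims.
veeam = 1.503e9               # Veeam 2023 revenue per Gartner
rubrik_est = veeam / 3.3      # ≈ $455M (Veeam says it is 3.3x Rubrik)
cohesity_est = veeam / 3.5    # ≈ $429M (Veeam says it is 3.5x Cohesity)

commvault = 753e6             # Commvault revenue per Gartner
# Sum matches the article's combined figure of $1.637 billion.
combined = commvault + rubrik_est + cohesity_est
```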
With Veritas set to merge with Cohesity, this combination may well become the market leader in terms of sales and market share going forward. Their combined 2023 revenues were $1.92 billion, putting them clearly in the lead.
Data lake startup Onehouse is launching a vector embeddings generator to automate pipelines as part of its managed ELT cloud service.
Onehouse provides a fully managed data lakehouse designed to be universal – ingesting data at terabyte scale in minutes from any source and supporting all query engines and standards such as Iceberg and Hudi. Its products include LakeView, a free data lakehouse observability tool for the OSS community, and Table Optimizer, which automates data lakehouse table optimizations. Vector embeddings are symbolic representations of multiple aspects of text, audio, image, and video data that can be searched to find similar groups of vectors – enabling, for example, the location of images of a particular product or the generation of text in AI applications such as large language models (LLMs).
Its ELT service automates embeddings pipelines – continuously delivering data from streams, databases, and files on cloud storage to foundation models from OpenAI, Voyage AI, and others. The models then return the embeddings to Onehouse, which stores them in highly optimized tables on the user’s data lakehouse.
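The flow described above can be sketched in a few lines. This is a hypothetical illustration of a continuous embeddings pipeline; all function and field names here are invented for the example, and this is not Onehouse's or any model provider's actual API:

```python
def embed_batch(texts, model):
    # In production this would be a batched call to a foundation model
    # embeddings endpoint (OpenAI, Voyage AI, etc.).
    return [model(t) for t in texts]

def run_pipeline(records, model, table):
    """Read new records, generate embeddings, and write the vectors back
    to a lakehouse table keyed by record id, so that later incremental
    runs can update vectors as the underlying data changes."""
    vectors = embed_batch([r["text"] for r in records], model)
    for record, vector in zip(records, vectors):
        table[record["id"]] = {"text": record["text"], "embedding": vector}
    return table
```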
Vinoth Chandar
Vinoth Chandar, founder and CEO of Onehouse, stated: “AI is going to be only as good as the data fed to it, so managing data for AI is going to be a key aspect of data platforms going forward.”
Chandar is PMC chair of the Apache Hudi project, the open source transactional data lake framework with processes for ingesting, managing, and querying large volumes of data. He led the creation of Apache Hudi while at Uber in 2016.
He said: “Hudi’s powerful incremental processing capabilities also extend to the creation and management of vector embeddings across massive volumes of data. It provides both the open source community and Onehouse customers with significant competitive advantages, such as continuously updating vectors with changing data while reducing the costs of embedding generation and vector database loading.”
Onehouse believes the data lakehouse – with its open data formats on top of scalable, inexpensive cloud storage – is becoming the natural platform of choice for centralizing and managing the vast amounts of data used by AI models. Users are able to choose what data and embeddings need to be moved to downstream vector databases.
Onehouse graphic
By adding a vector embeddings generator, Onehouse claims its customers can streamline their vector embeddings pipelines to store embeddings directly on the lakehouse. This provides all of the lakehouse’s capabilities around update management, late-arriving data, concurrency control and more, while scaling to the data volumes needed to power large-scale AI applications.
Onehouse integrates with vector databases, such as Pinecone and Zilliz, to enable high-scale, low-latency serving of vectors for real-time use cases. The data lakehouse stores all of an organization’s vector embeddings and serves vectors in batch, while hot vectors are moved dynamically to the vector database for real-time serving. This architecture provides scale, cost, and performance advantages for building AI applications such as LLMs and intelligent search.
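The hot/cold split described here amounts to a tiering policy. A minimal sketch, with an invented access-frequency rule standing in for whatever criteria Onehouse actually applies:

```python
def promote_hot_vectors(access_counts, lakehouse_vectors, threshold=10):
    """Copy frequently queried vectors into the low-latency vector
    database, while the lakehouse remains the system of record for
    all embeddings (batch serving still reads from the lakehouse)."""
    return {vid: lakehouse_vectors[vid]
            for vid, count in access_counts.items()
            if count >= threshold}
```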
Onehouse screenshot
Kaushik Muniandi, engineering manager at consumer market research business NielsenIQ, was quoted in the Onehouse announcement: “Text search has evolved dramatically. The traditional tools have complications on their own, as in, ingress of data and egress when we would want to move out. Vector embeddings on data lakehouse not only avoids the ingress and egress complexities and cost but also can scale to massive volumes. We found that vector embeddings on data lakehouse is the only solution that scales to support our application’s data volumes while minimizing costs and delivering responses in seconds.”
Onehouse was founded in 2021 and has raised $68 million in funding, with a June B-round contributing $35 million, a year after a $25 million A-round. It uses the term ELT (Extract, Load, and Transform) in its communications material rather than the more common ETL (Extract, Transform, and Load).