Storage news ticker – 23 May 2024

Cribl, which supplies datalake and data engine software, has announced the launch of the Cribl Technology Alliance Partner (TAP) program, a global ecosystem of technology partners bringing new integrations and validated offerings to customers to transform their data management strategy. With hundreds of existing integrations being used by customers today, we’re told the Cribl TAP provides new integrations with the world’s most widely-used technology providers, expanded partner support, and increased choice for customers to select the data management tools that best fit their needs.

Databricks Ventures (DV) has launched its second fund: the Databricks AI Fund. DV head Andrew Ferguson said: “Just as software ate the world, we believe AI is now eating software. As such, the pace of change in the AI ecosystem has dramatically accelerated. The era of AI in the enterprise has arrived, and our new AI Fund embodies Databricks Ventures’ commitment to supporting a new generation of founders and startups in this critically important ecosystem. To bolster the ecosystem around the Databricks Data Intelligence Platform, we will be aggressively seeking out investments in innovative, early- to growth-stage startups that are utilizing or enabling AI in innovative ways on top of or alongside our platform.”

“Just since last fall, we’ve announced investments in six AI-focused companies: Anomalo, Cleanlab, Glean, Mistral AI, Perplexity and Unstructured. These portfolio companies fall in very different sectors of the landscape — from open-source LLM development (Mistral AI) to AI-powered data quality monitoring (Anomalo) — but all leverage the power of AI to deliver superior customer experiences. And with these varied investments, we have forged deeper partner and integration relationships across the AI value chain that benefit our portfolio companies and our common customers — and help us build a strong, differentiated ecosystem around the Databricks platform.”

Astra DB NoSQL database supplier DataStax has launched Astra Vectorize, a feature that performs embedding generation on the server side. It includes new integrations with OpenAI and Microsoft Azure OpenAI Service, designed to accelerate and simplify embedding generation for developers. These integrations allow organizations to compare different embedding models with just a couple of clicks, saving hours or days of development time. Astra Vectorize is an addition to the Astra Data API that will allow users to provide raw, unstructured data, such as a piece of text or an image, as part of an insert, update, or vector search operation, and the vector embedding for that data will be automatically generated by Astra DB.
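To make the idea of server-side embedding generation concrete, here is a minimal Python sketch. It is not the Astra Data API: the `ToyCollection` class, its `insert` and `search` methods, and the `toy_embed` function are all hypothetical names invented for illustration, and the hashed bag-of-words embedding is a stand-in for a hosted model such as those from OpenAI. The point is only that the caller passes raw text and the store generates the vector itself, both on insert and on search.

```python
import math
import zlib

DIM = 256  # toy embedding width, chosen arbitrarily for this sketch


def toy_embed(text: str) -> list[float]:
    """Stand-in for a hosted embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyCollection:
    """Hypothetical in-memory collection that embeds data on the 'server' side."""

    def __init__(self, embed=toy_embed):
        self._embed = embed
        self._docs: list[tuple[str, list[float]]] = []

    def insert(self, text: str) -> None:
        # The caller supplies raw text only; the vector is generated here.
        self._docs.append((text, self._embed(text)))

    def search(self, query: str, limit: int = 1) -> list[str]:
        # The query is embedded with the same model, then ranked by cosine
        # similarity (a plain dot product, since all vectors are unit length).
        q = self._embed(query)
        ranked = sorted(
            self._docs,
            key=lambda doc: sum(a * b for a, b in zip(q, doc[1])),
            reverse=True,
        )
        return [text for text, _ in ranked[:limit]]


collection = ToyCollection()
collection.insert("tape capacity shipped in 2023")
collection.insert("vector database for retrieval")
print(collection.search("vector database"))
```

Swapping in a different `embed` callable when constructing the collection mimics, very loosely, the model comparison the announcement describes.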

Dell will offer an integrated turnkey hyperconverged appliance combining the Nutanix Cloud Platform and Dell servers. Nutanix Cloud Platform for Dell PowerFlex will combine the Nutanix Cloud Platform and its AHV hypervisor for compute with Dell PowerFlex for storage. The companies will collaborate on engineering, go-to-market, support and services, and the joint systems will be sold by Dell sales teams and partners worldwide. The joint systems from Dell and Nutanix are currently under development and will be available to customers in early access later this year.

Stream data lake startup Hydrolix has closed a $35 million Series B round led by S3 Ventures. Prior investors such as Nava Ventures, Wing Ventures, AV8 Ventures and Oregon Venture Fund also participated. The company’s total funding to date is $68 million. Hydrolix said it doubled revenues in Q3 and Q4 of 2023, developed new partnerships and grew another 75 percent in Q1 2024. Hydrolix says it offers a streaming data lake built to power log-intensive applications. Its software combines real-time stream processing, low-latency indexed search, decoupled storage and high-density compression to create a high-performance, lower-cost data management platform designed to handle the hyper-growth industry demand for long-term data retention. All Hydrolix data is “hot,” eliminating the need to manage multiple storage tiers. This approach allows Hydrolix to offer customers real-time query performance at terabyte scale for a radically lower cost compared to other cloud data platforms, we’re told.

IBM ran performance tests with Storage Scale CES S3 (non-containerized S3), using COSBench with large (1GB) and small objects. Its blog concludes: “With the current cluster setup, the bandwidth measured for CES S3 for reading large objects is 63 GB/s and 24 GB/s for writes. For reading small objects, the maximum number of operations per second was in the range of 56000, using object sizes of 1KB and 4 KB. Interesting bandwidth results were observed with 4MB object size in combination with 256 and 512 workers, getting peaks of 70GB/s.

“Also, it was observed that CPU utilization increased based on the number of COSBench workers. Starting with a very low utilization for 1 and 8 workers and having a max utilization for greater number of workers. Performance engineering work will continue with the execution of diverse tests. In future entries, we will describe performance evaluations using COSBench with different workload characteristics as well as other benchmarking tools.”
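As a rough illustration of why large-object results are quoted in GB/s and small-object results in operations per second, the short Python sketch below converts between the two using only the figures in the quote. It assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) and treats the "range of 56000" figure as exactly 56,000; the variable names are ours, not IBM's.

```python
GB = 10**9  # decimal gigabyte, an assumption about the units in the quote
KB = 10**3  # decimal kilobyte, same assumption

# Large objects: bandwidth divided by object size gives objects per second.
large_object_bytes = 1 * GB
large_reads_per_s = 63 * GB / large_object_bytes
large_writes_per_s = 24 * GB / large_object_bytes

# Small objects: operations per second times object size gives bandwidth.
small_ops_per_s = 56_000
small_read_mb_per_s = {
    size_kb: small_ops_per_s * size_kb * KB / 10**6 for size_kb in (1, 4)
}

print(f"1GB objects: {large_reads_per_s:.0f} reads/s, {large_writes_per_s:.0f} writes/s")
for size_kb, mb_per_s in small_read_mb_per_s.items():
    print(f"{size_kb}KB objects at {small_ops_per_s} ops/s: {mb_per_s:.0f} MB/s")
```

On these assumptions, the small-object runs move a few hundred MB/s at most, so the operation rate rather than raw bandwidth is the meaningful figure for them.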

France-based Kalray announced Ngenea for AI, a new edition of its Data Acceleration Platform that it says is fine-tuned for AI data pipelines. AI demands new ways to ingest and access massive volumes of data. The pitch for Ngenea for AI is that it helps speed up performance and simplify data management. Ngenea for AI is the companion to two Ngenea editions for Media & Entertainment and HPC customers. Ngenea for AI empowers AI innovators to speed up their ingest performance and access their unstructured data from a unified, global namespace, the company says.

Ngenea for AI adds data indexing and search capabilities, which users can employ to feed any data to Smart Vision, GenAI and RAG applications. It incorporates a high-performance storage tier for the most data-intensive AI workloads, powered by a parallel file system that can manage petabytes of data and billions of files, whether data lives on the edge, in the cloud, or on premises.

Optional hardware acceleration via Kalray’s DPUs including the TURBOCARD4 (TC4) allows parallel processing in an asynchronous way for ultra-demanding workflows, via a complementary architecture to GPUs.

A record 152.9 EB of total LTO tape capacity (compressed at 2.5:1) shipped in 2023, up 3.14 percent on 2022, driven in part by rapid data generation and the increased infrastructure requirements of hyperscalers and enterprises.
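The compressed figure and the growth rate imply two further numbers that are not stated directly: the native (uncompressed) capacity shipped, and the 2022 total. The Python sketch below derives both from the figures above. It assumes the 2.5:1 ratio applies uniformly to all shipped capacity and that the 3.14 percent growth is measured on the same compressed basis; the variable names are illustrative.

```python
compressed_eb_2023 = 152.9  # compressed capacity shipped in 2023, in exabytes
compression_ratio = 2.5     # stated compression assumption (2.5:1)
growth_over_2022 = 0.0314   # stated year-on-year growth

# Native capacity is the compressed figure divided by the compression ratio.
native_eb_2023 = compressed_eb_2023 / compression_ratio

# The 2022 compressed total is implied by reversing the growth rate.
implied_compressed_eb_2022 = compressed_eb_2023 / (1 + growth_over_2022)

print(f"2023 native capacity: {native_eb_2023:.2f} EB")
print(f"Implied 2022 compressed capacity: {implied_compressed_eb_2022:.1f} EB")
```

On those assumptions, the native capacity works out to roughly 61 EB and the implied 2022 compressed total to roughly 148 EB.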

N-able, which supplies data protection software to MSPs, is working with the MSPAlliance to help equip MSPs to meet compliance requirements using MSPAlliance’s Cyber Verify program. The Cyber Verify program is available globally to N-able MSP partners and is built to help MSPs identify and adhere to industry gold standards, stand up and grow a Compliance-as-a-Service practice, and comply with cyber-regulations. The program also offers N-able partners a customized experience, helping them understand how to potentially utilize the N-able offerings in their tech stack to strengthen their compliance initiatives.

IBM-owned Red Hat and Nutanix have announced an expanded collaboration to use Red Hat Enterprise Linux (RHEL) as an element of the Nutanix Cloud Platform. The platform foundation is AOS, which combines components of a traditional OS with additional services and packages. AOS will now build on RHEL for traditional operating system capabilities. Nutanix will also contribute to CentOS Stream, working with Red Hat and the broader open source community on hypervisor functionality, networking and storage performance for emerging artificial intelligence (AI) workloads on RHEL.

NVIDIA and Microsoft have expanded their collaboration: 

  • The latest AI models developed by Microsoft, including the Phi-3 family of small language models, are being optimized to run on NVIDIA GPUs and made available as NVIDIA NIM inference microservices. 
  • NVIDIA cuOpt, a GPU-accelerated AI microservice for route optimization, is now available in Azure Marketplace via NVIDIA AI Enterprise. Developers can now easily integrate the cuOpt microservice, backed by enterprise-grade management tools and security, into their cloud-based workflows to enable real-time logistics management for shipping services, railway systems, warehouses and factories.
  • NVIDIA and Microsoft are delivering a growing set of optimizations and integrations for developers creating high-performance AI apps for PCs powered by GeForce RTX and NVIDIA RTX GPUs. 

Vector database supplier Pinecone this week launched Pinecone serverless into general availability. This vector database, which is designed to make generative artificial intelligence (AI) accurate, fast, and scalable, is now ready for mission-critical workloads. Over the past few months, more than 20,000 organizations have been using it in public preview, including names like Notion, Gong and You.com, as well as many smaller companies and individual developers. Users have been able to reduce costs by up to 50x, we’re told, while building more accurate AI applications at scale. Pinecone research shows that the most effective method to improve the quality of generative AI results and reduce hallucinations (unintended, false, or misleading information presented as fact) is by using a vector database for Retrieval-Augmented Generation (RAG).

Samsung is working on 3D DRAM according to a wccftech report, with vertically-mounted transistors via a VCT (Vertical Channel Transistor) technique in a 4F Square cell structure. Samsung has a 16-layer DRAM stacking target. This leads to greatly increased DRAM chip capacity. A Fred Chen tweet showed a Samsung slide about the technology. It suggests a product won’t appear until the 2030s.

Samsung 3D DRAM slide from Fred Chen.

Snowflake’s revenue growth continues unabated.

Cloud datawarehouser Snowflake has announced its FY25 Q1 financial results and the acquisition of TruEra, an AI observability platform. Revenues were $828.7 million, up 32.9 percent year-on-year, with a loss of $317.8 million, deeper than the year-ago $226 million loss. The customer count rose to 9,822 from 8,167.
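The quoted figures can be cross-checked with a few lines of arithmetic. The Python sketch below derives the implied year-ago revenue, the percentage by which the loss deepened, and the growth between the two customer counts. All inputs come from the paragraph above; the variable names are illustrative and the outputs are rounded approximations, not reported figures.

```python
revenue_m = 828.7            # FY25 Q1 revenue, $ million
revenue_growth = 0.329       # stated year-on-year growth
loss_m = 317.8               # FY25 Q1 loss, $ million
prior_loss_m = 226.0         # year-ago loss, $ million
customers = 9822
prior_customers = 8167

# Reverse the growth rate to get the implied year-ago revenue.
implied_prior_revenue_m = revenue_m / (1 + revenue_growth)

# Relative change in the loss and in the customer count.
loss_increase = loss_m / prior_loss_m - 1
customer_growth = customers / prior_customers - 1

print(f"Implied year-ago revenue: ${implied_prior_revenue_m:.1f} million")
print(f"Loss deepened by: {loss_increase:.1%}")
print(f"Customer count growth: {customer_growth:.1%}")
```

The loss therefore grew faster in percentage terms (about 41 percent) than revenue (32.9 percent), while the customer count rose by about 20 percent.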

TruEra’s technology helps evaluate the quality of inputs, outputs, and intermediate results of LLM apps, as well as identifying risks such as hallucination, bias, or toxicity. This investment will bring LLM and ML observability to Snowflake’s AI Data Cloud, and is aimed at enabling it to provide deeper functionality to help organisations drive AI quality and trustworthiness.

Dr. Julian Chesterfield.

Edge HCI supplier StorMagic has appointed Dr. Julian Chesterfield as its CTO. Based in Cambridge, UK, Dr. Chesterfield has a Master of Science from University College London and a Ph.D. in computer science from Cambridge University. At Cambridge, he was one of the creators of the Xen open source hypervisor that was ultimately acquired by Citrix Systems. Dr. Chesterfield then founded the Sunlight.io hyperconverged infrastructure (HCI) platform, and served as the company’s CTO to support its growth. During his career as CTO and technology architect at companies such as XenSource, Citrix, and OnApp, he has developed hypervisor, software-defined storage and application management technologies.

Enterprise data manager Syniti will showcase a proof of concept that demonstrates using autonomous AI agents, known as the Syniti Squad, to address data quality issues. This will be done at the SAP Sapphire & ASUG Annual Conference in Orlando, Florida from June 3-5 and at SAP Sapphire Barcelona from June 11-13. The Syniti Squad will ask customers about their business challenges or data problems, and the AI agents will automatically collaborate to analyze the user-defined focus area, identify insights, and generate custom business rules tailored to the customer’s specific goals. Syniti told us: “Our proof of concept includes custom tools that are assigned to the agents; the agents are assigned an LLM, custom tools and task, they then have inner dialog and external dialog with humans and other agents” to deliver more accurate results.

Jason Yaeger.

Virtualized data center supplier VergeIO has hired Jason Yaeger as its new SVP of Engineering. Yaeger will oversee all engineering operations, focusing on advancing product strategy, optimizing technology governance, and fostering an environment of innovation. Yaeger has been a self-employed strategic advisor at NYJL advisors for almost five years since being co-founder and CEO at TenacityAI for three and a half years.

Veritone, which designs human-centered AI offerings, has announced a strategic partnership with Creative Artists Agency (CAA), an entertainment and sports agency, to power the CAAvault, a synthetic media vault conceived by CAA to serve the entertainment community and its participants (talent). CAA is using Veritone’s Digital Media Hub (DMH) technology to store the intellectual property of the participating talent’s name, image, likeness and all associated metadata, like synthetic counterparts, including digital scans and voice recordings, in the vault.