Storage news ticker – June 21

Anomalo has expanded its platform that monitors the quality of structured data in data warehouses and data lakes to monitor unstructured text. We’re told this makes it possible for enterprises to discover, curate, use, and ingest high volumes of text data without the risk of using low quality data, which is helpful for GenAI applications. Unstructured text documents can be organized and evaluated for data quality against various document and document collection characteristics, including length, duplicates, topics, tone, language, abusive language, PII, and sentiment. Users can evaluate the quality of a document collection and identify issues in individual documents, reducing the time needed to organize and use unstructured text data. The feature is currently in private beta.

Databricks has launched its 2024 State of Data + AI Report. Featuring data from its 10,000-plus customers across the globe, this research provides an in-depth look into how organizations across industries are approaching AI. Key stats include:

  • Across all organizations, 1,018 percent more models were registered for production this year compared to last year.
  • The use of vector databases to enable the customization of AI models grew 377 percent in the last year.
  • Across both Llama and Mistral users, 77 percent choose models that are 13B parameters or smaller.

More information can be found here.
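
For readers converting these growth figures into multipliers, a quick sketch follows. It assumes "N percent more" means the new value is (1 + N/100) times the old one; the function name is ours for illustration and does not come from the Databricks report.

```python
def growth_multiplier(percent_more: float) -> float:
    """Convert an 'N percent more' figure into a new/old ratio (assumed reading)."""
    return 1 + percent_more / 100


# Figures quoted in the report summary above.
for label, pct in [("models registered for production", 1018),
                   ("vector database use", 377)]:
    print(f"{label}: {growth_multiplier(pct):.2f}x last year's level")
```

On that reading, 1,018 percent more is roughly an 11-fold increase, and 377 percent growth is a little under five-fold.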

Databricks and Informatica have expanded their partnership, bringing Informatica’s AI-powered Intelligent Data Management Cloud (IDMC) capabilities into the Databricks Data Intelligence Platform. This will enable customers to deploy enterprise-grade GenAI applications at scale, based on a foundation of high-quality, trusted data and metadata. The expanded partnership includes four new capabilities:

  • GenAI solution blueprint for Databricks DBRX  
  • Native Databricks SQL ELT 
  • Cloud data integration-free service (CDI-Free) on Databricks Partner Connect 
  • Full IDMC support via Unity Catalog

The DNA Storage Alliance has announced new governing board members Biomemory, Entegris, and Imagene to join Catalog, Quantum, Twist Bioscience, and Western Digital. Biomemory is a pure play DNA data storage startup that aims to develop end-to-end DNA data storage systems. Entegris is a leading supplier of advanced materials and critical process solutions for the semiconductor, life sciences, and other high-tech industries. Imagene has more than 20 years of expertise in room temperature storage and stability assessment of nucleic acids and bio-specimens.

Document relational database company Fauna is launching new schema features to solve a problem faced by developers building GenAI, edge, and IoT applications: Ensuring data consistency and security at scale in document databases. With these new features, Fauna enables developers to define their database schema in their application code, combining the flexibility of document databases with the data integrity of relational databases – something it claims MongoDB and DynamoDB can’t deliver.

The FMS (Future of Memory and Storage) 2024 show has opened nominations for its Best of Show Awards across various categories including:

  • Most Innovative Sustainability Technology
  • Most Innovative Artificial Intelligence (AI) Application
  • Most Innovative Hyperscaler Implementation
  • Most Innovative Customer Implementation
  • Most Innovative Startup Company
  • Most Innovative Technology
  • Most Innovative Consumer Application
  • Most Innovative Enterprise Business Application

Nominations are due by 6pm PDT on June 21 and may be completed online here. FMS will be held August 6-8 at the Santa Clara Convention Center. The FMS24 BOS Awards Ceremony is August 6, from 5:30 to 7pm in the Exhibit Hall FMS Theater. Winners will be announced and awards given at that time. Register for the event here.

In-memory streaming data supplier Hazelcast has hired distributed compute specialist Anthony Griffin as chief architect. He will focus on applying his serverless compute and financial systems skills to evolve Hazelcast’s unified real-time data platform with a focus on mission-critical and AI applications. Griffin served most recently as a senior engineering leader at AWS Lambda, an event-driven, serverless compute platform provided by Amazon.

We’re told IBM has announced the latest version of the Virtualize software for FlashSystem and SVC. Version 8.7.0 brings a raft of major and minor enhancements and the instantiation of a scalable storage platform using Flash Grid. Flash Grid is a scalable storage platform comprising multiple FlashSystem or SVC systems with federated management, AI-powered data placement recommendations, and flexible deployment options. v8.7.0 features:

  • Flash Grid – scale out up to eight FlashSystem or SVC systems to be managed as one including non-disruptive workload mobility between members of the grid.
  • New topology designs for Policy-based High Availability (PB-HA) including fully active-active pathing models and non-uniform host configurations. Minor additions with further OS support and clustering using SCSI-PR.
  • Up to two RDMA-based Ethernet partnerships between systems.
  • Pauseless Volume Group Snapshot (VGS) triggering. Default grain size changes for VGS and the ability to convert ThinClones into full Clones.
  • Reclaim of unmapped space in Standard Pools (using VDM under the covers)
  • Auto FCM firmware update of candidate drives
  • Auto download of Security Patches from Fix Central
  • Some minor tweaks:
    • VMware discovers volumes as “Flash” without manual changes at ESXi
    • Volume Groups support multi-tenancy (ownership group assignments)
    • Improved vVol scalability and VASA service RAS
    • Full performance stats available via REST-API

India’s largest full-service stockbroking firm, Kotak Securities, is using Infinidat’s InfiniBox storage solution to support business growth, millions of time-sensitive customer transactions daily, and operational cost control initiatives. Kotak has a footprint of 175-plus branches, 1,300-plus franchisees and satellite offices across more than 370 cities in India. It adopted Infinidat’s flexible consumption “pay as you grow” approach, which allows the customer to only pay for the storage capacity they need and use. Read the full case study here.

Wedbush analyst Matt Bryson reckons Micron is expanding its HBM production in the US and considering extending a Malaysia test and assembly plant to build HBM chips. It wants to triple its HBM market share to 25 percent, the same as its DRAM market share. Currently, Nvidia buys nearly half of the global HBM output.

NetApp says its Spot by NetApp software has achieved the FinOps Certified Platform certification from the FinOps Foundation, validating the platform’s ability to provide the depth and breadth of capabilities organizations need to practice sound cloud financial management. Spot by NetApp has also expanded its certified platform with the general availability of its Cost Intelligence and Billing Engine solutions.

Cloud file services and collaboration biz Panzura has hired Petra Davidson as Global Head of Marketing. She is a returnee, having been VP Customer Experience before she left in January 2023. CEO Dan Waldschmidt has also recruited Thomas Morelli from Cisco, after 2.5 years in the Office of the Chief Strategy Officer, to be his VP and Global Comms head.

Petra Davidson (left) and Thomas Morelli (right)

Redis did well in a vector database performance benchmark using the Qdrant framework. This is the first side-by-side comparison of the new, improved Redis Query Engine released June 20. Redis claims to have taken the top spot from Qdrant, and says its tests show that it is faster for vector database workloads than any other vector database tested, at recall >= 0.98. Redis has 62 percent more throughput than the second-ranked database for lower-dimensional datasets (deep-image-96-angular) and 21 percent more throughput for high-dimensional datasets (dbpedia-openai-1M-angular). Read a Redis blog to find out more.

Redis benchmark chart
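
The "recall >= 0.98" qualifier matters because approximate vector search trades accuracy for speed. As a rough illustration (this is not the benchmark's actual code, and the function name and sample IDs are hypothetical), recall for a single query can be computed as the fraction of the true nearest neighbors that the engine returned:

```python
def recall_at_k(true_ids, returned_ids, k):
    """Fraction of the k true nearest neighbors present in the top-k results."""
    truth = set(true_ids[:k])
    found = truth.intersection(returned_ids[:k])
    return len(found) / len(truth)


# Hypothetical IDs: exact neighbors versus what an approximate index returned.
exact = [11, 42, 7, 93, 5]
approx = [11, 7, 42, 5, 68]
print(recall_at_k(exact, approx, k=5))  # 0.8
```

Throughput comparisons are generally only meaningful between engines tuned to a similar recall level, which is presumably why the figures are quoted at a fixed threshold.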

Object storage supplier Scality announced a large-scale deployment of its RING distributed file and object storage solution to optimize and accelerate the data lifecycle for high-throughput genomics sequencing laboratory SeqOIA Médecine Génomique. SeqOIA is one of two national laboratories integrating whole genome sequencing into the French healthcare system to benefit patients with rare diseases and cancer. Alban Lermine, IS and Bioinformatics Director of SeqOIA, said: “In collaboration with Scality, we have solved our analytics processing needs through a two-tier storage solution, with all-flash access of temporary hot datasets and long-term persistent storage in RING.”

Seagate is now selling refurbished disk drives. It has set up an official storefront on eBay as a direct channel for consumers to access factory-certified hard drives as part of the Seagate Circularity Program. Seagate says it’s committed to enabling the secure reuse of storage devices and reducing hard drive shredding. Hard drive shredding entails breaking down a hard drive into tiny pieces so that data cannot be recovered. As rare earth materials contained in those parts cannot be reused, hard drive shredding harms the environment and is not sustainable.

The SNIA STA forum (STA) announced completion of the 20th Serial Attached SCSI (SAS) Plugfest around 24G SAS. The plugfest brought together eight SAS equipment manufacturers in Austin, Texas, and was co-located with SNIA Regional SDC Austin for the first time. Test results were audited by an independent engineering consultant. The following companies attended the plugfest, underscoring a commitment to advancing SAS technology: AIC, Amphenol Corp, Broadcom, ConnPro, Kioxia, Microchip Technology, Samsung, and Teledyne LeCroy Corp.

Tiger Technology says Tiger Bridge subscriptions available on Azure Marketplace are enrolled in Microsoft Azure Consumption Commitment (MACC). This means that every dollar of a Tiger Bridge subscription purchased via Azure Marketplace counts toward a customer’s available MACC spend and can be conveniently tracked within its MCA or EA billing account.

Super-fast file system supplier WEKA has accumulated 113 patents to stop competitors using its technology. It has over 70 patent applications still pending. Co-founder and CEO Liran Zvibel said: “When my co-founders and I started WEKA a decade ago, we wanted to create a radically different approach to data management that was infinitely simpler and more customer-centric, which required a complete rethink of traditional data architectures and developing deeply differentiated intellectual property. Crossing this patent milestone validates our day-one vision to build a revolutionary new approach that eliminates the compromises and challenges of legacy data infrastructure to enable organizations to thrive in the AI era.”

DigiTimes reports that China’s YMTC NAND fabber has moved virtually all its production to its 232-layer fourth generation (Xtacking 3.0) process, and is working on its fifth generation with ~300 layers. This is despite US technology export restrictions. For reference, Micron is at the 232-layer level, Samsung at 236, SK hynix at 238, and Kioxia/WD at 218. Samsung is developing a 286-layer product and SK hynix’s gen 8 technology is 321 layers.