AI data pipelines matter more than models

Scality, the object storage supplier, believes it is a mistake for competing businesses to think they can differentiate themselves through the particular AI foundation model they use.

Model features will converge. What matters more is getting the right data to the models and agents so that they generate accurate and up-to-date responses.

Giorgio Regni.

Scality CTO Giorgio Regni tells us: “As foundation models become broadly available and increasingly interchangeable, they stop being the differentiator. What matters more now (and will matter even more going forward) is the pipeline: how you collect, shape, govern, and deliver data to those models in a way that’s fast, efficient, and reliable at scale.”

It’s not enough to be a bare object storage supplier. You have to be hooked into AI data pipelines to aid the scanning, filtering, and selection of data. In August, we learnt Scality’s RING object storage can be integrated with a vector database and the LangChain framework to feed data to RAG workflows for AI models like GPT.
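To make the idea concrete, here is a minimal sketch of such a RAG feed in Python, assuming an S3-compatible RING endpoint. The endpoint URL, bucket name, and model choices are illustrative assumptions, not Scality’s published integration code, and FAISS stands in for whichever vector database a deployment actually pairs with LangChain:

```python
# A sketch only: endpoint, bucket, and models are illustrative assumptions.
import boto3
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# RING exposes the S3 API, so a standard boto3 client can read the corpus.
s3 = boto3.client("s3", endpoint_url="https://ring.example.com")

texts = []
for obj in s3.list_objects_v2(Bucket="corpus").get("Contents", []):
    body = s3.get_object(Bucket="corpus", Key=obj["Key"])["Body"].read()
    texts.append(body.decode("utf-8", errors="ignore"))

# Chunk the raw text so each embedding covers one retrievable passage.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.create_documents(texts)

# Index the chunks; FAISS stands in for whatever vector database is used.
index = FAISS.from_documents(chunks, OpenAIEmbeddings())

# At query time, retrieve relevant chunks and hand them to the model as
# context: the RAG step that keeps answers accurate and up to date.
question = "What changed in the latest design review?"
context = "\n\n".join(
    d.page_content for d in index.similarity_search(question, k=4)
)
answer = ChatOpenAI(model="gpt-4o-mini").invoke(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```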

Regni thinks a change of focus is underway, from models to pipelines: “We’re seeing this shift everywhere. The real advantage lies in the systems that manage the full lifecycle of enterprise data — not just moving it around, but versioning it, enriching it, and keeping full context and control. That’s the part most organizations are still struggling with. And it’s why we believe pipelines, not models, are quickly becoming the true competitive edge.”

He sees the storage media tiering model consolidating to two basic layers: “From our point of view, this shift is architectural, too. The old five-tier pyramid is breaking down. What we see working in real deployments, especially at scale, is a collapse to two tiers: fast local flash on the GPU servers, and object storage for everything else. That’s it. Flash gives you the bandwidth and latency to keep the GPUs busy. Object gives you the scale, durability, and metadata to store and govern everything that’s not actively in use.”

Scality RING seen as an AI data pipeline bulk data and staging store.

The backend object storage is an upstack data feeder and downstack data receiver, not the top level: “Your pipeline needs to understand and operate on that model. The flash becomes your dynamic working set, constantly hydrated with new data and drained for checkpoints, snapshots, and derivatives. Any extra hops, extra tiers, or added complexity? That’s overhead. That’s latency. That’s where GPUs go idle and budgets start bleeding.”

In his view: “The cloud hyperscalers figured this out a while ago. They’re all converging on this simplified stack. And they’re building pipelines that make full use of it, pipelines that move fast, scale wide, and don’t break governance in the process. That’s what the leaders are doing. That’s where the edge is now.”

The corollary is that enterprises and other organizations developing their own AI data pipelines should do the same: follow the hyperscaler two-tier stack model, with flash for two-way data sprints to and from the GPUs, and object storage for the long-term data storage marathon. Scality’s RING XP product uses a GPU server’s local storage drives as the fast access tier, with microsecond-class latencies, on top of bulk object storage.

Scality RING XP Graphic.
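A minimal sketch of that hydrate-and-drain loop, in Python with boto3 against an S3-compatible endpoint; the endpoint URL, bucket names, and mount point are assumptions for illustration, not RING XP’s actual interface:

```python
# A sketch only: endpoint, buckets, and paths are illustrative assumptions.
import os
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com")
FLASH = "/mnt/nvme/working-set"  # local flash on the GPU server

def hydrate(key: str) -> str:
    """Pull a dataset shard from bulk object storage onto local flash."""
    local = os.path.join(FLASH, os.path.basename(key))
    os.makedirs(FLASH, exist_ok=True)
    s3.download_file("training-data", key, local)
    return local

def drain_checkpoint(local_path: str, key: str) -> None:
    """Push a checkpoint from flash back to durable object storage."""
    s3.upload_file(local_path, "checkpoints", key)

# Typical loop: stage a shard, train on it, persist the checkpoint, then
# evict what is no longer hot to free flash for the next working set.
shard = hydrate("shards/epoch-01/part-0001.tar")
# ... GPU training step on `shard` goes here ...
drain_checkpoint("/mnt/nvme/ckpt/step-1000.pt", "run-42/step-1000.pt")
os.remove(shard)
```

Prefetching the next shard while the current one trains is what keeps the flash working set “constantly hydrated” and the GPUs busy; every extra hop between the two tiers is idle time.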

Two points. First, Regni sees no major role for file storage here, but then he speaks for an object storage supplier that has added file protocol support (NFS and SMB) on top of object. Second, the object layer could itself be tiered, with, for example, a public cloud S3 Glacier-type backend for old and largely inactive objects.
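On that second point, tiering inside the object layer is commonly expressed as an S3 lifecycle rule. A hedged sketch, assuming the backend honours Glacier-style storage classes; the class name and age threshold are illustrative and depend on what the cold tier actually supports:

```python
# A sketch only: the storage class and threshold are assumptions.
import boto3

s3 = boto3.client("s3", endpoint_url="https://ring.example.com")
s3.put_bucket_lifecycle_configuration(
    Bucket="corpus",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "demote-cold-objects",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every object
            # Objects older than 180 days move to the cold tier.
            "Transitions": [{"Days": 180, "StorageClass": "GLACIER"}],
        }]
    },
)
```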