It’s not either block storage or file/object storage for AI training and inference. It’s both.
AI large language model (LLM) training requires masses of data, stored as files and objects rather than blocks. The training process ingests this unstructured data, turning it into tokens and then vectors. Even AI inference with natural language input handles unstructured data, not block data.
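As a toy illustration of that unstructured-data path, the sketch below maps text to token IDs and then to embedding vectors. The vocabulary, tokenizer, and embedding table are stand-ins for illustration only, not any real model's pipeline.

```python
# Toy illustration: unstructured text -> tokens -> vectors.
# Vocabulary, tokenizer, and embedding table are hypothetical stand-ins.
import numpy as np

vocab = {"the": 0, "gpu": 1, "loads": 2, "data": 3, "<unk>": 4}

def tokenize(text: str) -> list[int]:
    """Map whitespace-split words to integer token IDs."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

# A made-up embedding table: one 8-dimensional vector per token ID.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 8))

token_ids = tokenize("The GPU loads data")
vectors = embedding_table[token_ids]   # shape: (4 tokens, 8 dimensions)
print(token_ids, vectors.shape)
```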
How, then, does a block storage software supplier like Lightbits Labs make progress in the AI training market?
The GPU servers used in AI training actually need two kinds of data. One is the unstructured data used to train the LLMs. The other is the stored software that operates the GPU servers themselves, typically organized as Kubernetes pods (containers), which has to be loaded onto the servers from somewhere.
Think of VMware servers and their bare-metal ESXi hypervisor: the bootable ESXi software plus NSX networking, vSAN storage, and sundry drivers and firmware. This ESXi software bundle is packaged as an image, an ISO file used for installation and booting. A virtual machine disk file (VMDK) is the primary disk image format in VMware, containing the guest OS, applications, and data, in other words the “disk image” of the VM. It is held on storage media, such as a local SSD, and booted into life on the server.
Now translate this, roughly, into the Kubernetes world with container images. When a GPU server system can have hundreds, if not thousands, of GPUs, loading container images onto those servers becomes a huge job. To keep GPU downtime to a minimum, image loading should start quickly, meaning low latency, and complete quickly, meaning high throughput. Doing this through a file or object protocol interface adds latency and slows throughput compared to pulling this system-level data through a block interface.
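A rough way to see why both terms matter: the time to load one image is roughly time-to-first-byte plus image size divided by throughput, and the fleet is only as fast as its pulls. The sketch below is a back-of-envelope model; the latencies, throughputs, and image size are illustrative assumptions, not Lightbits or vendor figures.

```python
# Back-of-envelope model of container image load time across a GPU fleet.
# All figures are illustrative assumptions, not measured vendor numbers.

def image_load_seconds(image_gb: float, first_byte_latency_ms: float,
                       throughput_gbps: float) -> float:
    """Time to pull one image: time-to-first-byte plus transfer time."""
    transfer_s = (image_gb * 8) / throughput_gbps   # GB -> gigabits, then / Gbit/s
    return first_byte_latency_ms / 1000 + transfer_s

# Hypothetical comparison for a 20 GB image: a higher-latency, lower-throughput
# path versus a lower-latency, higher-throughput block path.
slow = image_load_seconds(image_gb=20, first_byte_latency_ms=50, throughput_gbps=5)
fast = image_load_seconds(image_gb=20, first_byte_latency_ms=1, throughput_gbps=40)

print(f"per-server load: slow path ~{slow:.1f}s, fast path ~{fast:.1f}s")
```

With pulls happening in parallel across thousands of servers, that per-server difference is roughly the difference in how long the whole fleet waits before work can start.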

Lightbits Labs CEO Eran Kirzner says the company is selling its software to deal with two main use cases: transitions from VMware to alternatives, and migration to Kubernetes. The latter, Kirzner says, applies 100 percent to AI workloads, cloud, and GPU cloud datacenters.
These transitions are happening in e-commerce and finance as well. Each has peaky workloads and requires the fastest possible response and the highest-performance data delivery to cope with peaks as they rise. When an e-commerce site heads into Black Friday, it needs software loaded onto its dynamically scalable estate of virtualized and/or containerized servers as near instantly as possible, so that it doesn’t lose a moment of trading time. Server unavailability translates directly into lost dollars.
In the AI training world, Lightbits software is serving images to GPU servers so that they can be switched from one training run to another with minimal downtime. These hyper-expensive processors waste a lot of money if they sit idle.
Lightbits’ market is not typically a VMware-type environment. Kirzner says: “Everything is OpenShift, KubeVirt, Kubernetes, and OpenStack,” but “we still have some large VMware customers. The beauty is that, on the same Lightbits cluster, you can run VMware and you can run a new (containerized) datacenter.”

The number of Kubernetes cores involved can be huge. Kirzner tells us: “One of our largest customers has more than two million Kubernetes cores connected to our environment. It’s pretty massive.” At 64 to 128 cores per node, that could mean between roughly 15,500 and 31,000 nodes, each of which has to have a container image moved onto it. The scale at the high end is super-large, but other smaller customers might have 10,000 or even just 1,000 cores.
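A quick calculation shows how that node estimate falls out of the core count; the cores-per-node figures are assumptions about typical server configurations, not customer data.

```python
# Rough node-count estimate from a Kubernetes core count.
# Cores-per-node values are assumptions (typical dense servers), not customer data.
total_cores = 2_000_000

for cores_per_node in (128, 64):
    nodes = total_cores // cores_per_node
    print(f"{cores_per_node} cores/node -> ~{nodes:,} nodes needing a container image")
# 128 cores/node -> ~15,625 nodes; 64 cores/node -> ~31,250 nodes
```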

Lightbits’ cloud service provider customers typically exhibit this pattern as they embrace AI. Kirzner says: “Almost every cloud provider we have is also going to have an AI cloud. And here, one of the important parts is not just performance. Our hall of fame is performance and latency, very high performance, very low latency. If you compare us to Ceph, we are going to be between 5x to 10x better performance, and the same 5x to 10x better latency.”
Apart from performance and latency, provisioning is another highly important aspect, Kirzner says. “I’ll explain why. Because when you need to provision 10,000 systems, 20,000 systems, and start to run a workload, you need all the images to fit in, and then to start simultaneously. If your workload is inference or your workload is training [then] this has to happen really, really fast, and we help some of the cloud providers to go from hours of provisioning to minutes of provisioning.” That’s hours of expensive GPU idle time cut down to minutes.
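To put that in money terms, here is a hedged back-of-envelope sketch. The fleet size, hourly GPU cost, and provisioning times are illustrative assumptions, not figures from Lightbits or its customers.

```python
# Illustrative idle-cost estimate for provisioning time; every input is an assumption.
gpus = 10_000                 # hypothetical size of the GPU fleet being provisioned
cost_per_gpu_hour = 2.50      # hypothetical all-in cost per GPU hour (USD)

def idle_cost(provisioning_hours: float) -> float:
    """Cost of the whole fleet sitting idle while images are provisioned."""
    return gpus * cost_per_gpu_hour * provisioning_hours

hours_path = idle_cost(3.0)          # e.g. three hours of provisioning
minutes_path = idle_cost(10 / 60)    # e.g. ten minutes of provisioning
print(f"~${hours_path:,.0f} idle vs ~${minutes_path:,.0f} idle, "
      f"saving ~${hours_path - minutes_path:,.0f} per provisioning cycle")
```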
Any large GPU farm will need block storage, like Lightbits software, to load the system and LLM images onto the GPU servers, and also file and/or object storage to hold the data that the LLMs will consume.