Interview. We had an opportunity to talk with Jim Liddle, Nasuni chief innovation officer for Data Intelligence and AI, and the conversation covered edge AI, ensuring data for AI is resilient, and considering whether malware-infected agents using MCP could wander at will within an AI data space. The conversation has been edited for clarity and brevity.
Blocks and Files: Nasuni was early into AI. I remember a year ago, further back than that, and now the rest of the industry has caught up and all the focus seems to be on Nvidia. We’ve got to do AI training, we’ve got to do AI inferencing. And so all the storage companies, almost without exception, are supporting GPUDirect for files and for objects. They’re supporting Nvidia’s GPUs, its AI stack, NeMo retrievers, NIM microservices, and all that stuff. There’s Dell, there’s HPE, there’s VAST, there’s NetApp, Pure – all significant Nvidia partners, and AI factories this, AI factories that. Where does Nasuni play in this space?
Jim Liddle: So I guess a couple of things. If you talk to people like Microsoft and other vendors, the companies who are using AI to do training, real training, amount to less than 5 percent. That’s not the business, is it? It’s not the future. No. And the reason for that is it’s expensive. If you want to train on even a very small training set as a company, it’s going to cost you about a million dollars just to do the training, not even counting hiring the staff and getting the data ready.

Blocks and Files: I think I can see where you’re going with this, which is you’re inherently a company with an edge focus.
Jim Liddle: We are edge to cloud, edge to cloud. We’ve got what I would say are three pillars of AI that very few of the vendors can match: edge to cloud, the global namespace, and AI data resiliency.
Why is edge-to-cloud so important? Because that – the edge – is where all your employees are, and the AI services are in the cloud. Most of the enterprises are using hyperscalers. How do you get the data from here to there? It’s easy to do it once. It’s easy to have a NetApp in one place and go, oh, let me see if I can get a pipeline going to get my data there. What do you do if you’ve got 12 arrays in 12 different locations, and how do you do that every single day, every single hour?
We are multi-vendor. We don’t care what the hardware is. And that edge, that hybrid nature of it, obviously it wasn’t designed for AI; it just happens to be an absolutely perfect match for Nasuni to be able to move data from here to there without even thinking about it. You have workers working here every day, every hour. And a customer doesn’t have to worry about migrating data from edge-to-cloud or back because it’s inherent inside the software.
It happens. It just absolutely moves back into the global namespace. And the one thing about AI that’s absolutely, fundamentally true is that it wants a single source of truth for the data, because you get better context, okay?
Blocks and Files: You’re now in a position to have a single virtual center for all the company’s proprietary data. How do you get it to the GPU servers? Do you do that yourself?
Jim Liddle: I would argue that GPUs become really important when you’re doing training. Yes. Companies, enterprise companies, do they really care about training? Not really. What they care about is how they get the best value out of their domain information from an AI perspective.
Blocks and Files: So we’ll have a generally trained AI model, we’ll have access to it, and then we’ll feed it with our own data.
Jim Liddle: And they’re using retrieval augmented generation (RAG) or they’re using agents. If you think about what we do today: we have an edge server, we have the cloud here, the namespace, and then we go through the edge to get back into the data.
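[To make the RAG pattern he mentions concrete, here is a minimal retrieve-then-prompt sketch. The embed() and ask_llm() functions are hypothetical stand-ins for whatever embedding model and cloud LLM a real pipeline would call; retrieval here is just cosine similarity over placeholder vectors.]

```python
# Minimal RAG sketch: retrieve the most relevant company documents for a
# question, then hand them to the model as context. embed() and ask_llm()
# are hypothetical stand-ins for a real embedding model and cloud LLM.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in: a deterministic placeholder vector; a real pipeline would
    # call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.standard_normal(384)
    return vec / np.linalg.norm(vec)

def ask_llm(prompt: str) -> str:
    # Stand-in: a real pipeline would send the prompt to the chosen cloud model.
    return "[model answer grounded in the supplied company context]"

def retrieve(question: str, docs: list[str], k: int = 3) -> list[str]:
    q = embed(question)
    scores = [float(q @ embed(d)) for d in docs]  # cosine similarity of unit vectors
    best = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]
    return [docs[i] for i in best]

def answer(question: str, docs: list[str]) -> str:
    context = "\n---\n".join(retrieve(question, docs))
    return ask_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

[The point is simply that the model’s answer is grounded in whatever documents the data layer can surface, which is why the freshness and completeness of that data matter so much in what follows.]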
Blocks and Files: Suppose I’ve been listening to another supplier and they say inference at the edge is where it’s going to be, because you can’t sit at the edge and rely on communications back to the datacenter for your inferencing. You need the inference to work with local data because data’s got gravity.
Jim Liddle: I’d refer you back to the fact that they say that because they can’t get the data from here to there. We can. All of the data has gravity and that’s why we cache it locally for the applications to get access to it. But your AI doesn’t need to be there if the data is just seamlessly moving back to the cloud, where you’ve got big, heavy-scale AI that can work on it at scale. You don’t need to inference at the edge. I’m not saying there aren’t certain instances where you’d want to.
You can actually have all of the edges communicate directly, transiently, back to the namespace. A lot of the vendors say, oh, you’ve got to inference at the edge because it’s hard to move the data back into the cloud, but not with Nasuni. It just goes back instantly; literally, you can be working on a file here and an hour later it’s there and AI’s got access to it.
So a guy over here can go, oh, I need to know the latest update on such and such and it’s there. It just gets told. That’s a huge differentiator.
The other thing I would say around that, and I guess this is the way I think the industry will go from an edge perspective; imagine you’ve got a Nasuni customer. They’ve got 12 locations around the world, not unusual. People who buy Nasuni tend to be big enterprise, heavy duty customers.
Now imagine agentic AI is taking off; this is the year of agentic AI. Say all of your 12 locations are taking orders and they’re all going into one particular directory in the namespace. So it’s all coalesced here. However, you have an agent over in Phoenix that needs access to those orders, but it also needs access to data from CRM and other systems that are not in the cloud; they’re over here.
So with Nasuni, the data from all of those 12 other locations can be pinned down to that one server in Phoenix. Every time somebody puts something in, it’s all going into that global space. Ultimately it’s all being fed out to Phoenix, where the agent is saying, oh, I need all of the other information, and then I need to go to the CRM and then other systems. It’s all getting processed locally at the edge. That’s a hard architecture to replicate if you aren’t using Nasuni.
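[A rough illustration of the agent step described here: orders written at any site arrive in a directory pinned to the local edge cache, and the agent enriches them with on-premises CRM data. The path and the fetch_crm_records() helper are hypothetical; this sketches the pattern, not Nasuni’s API.]

```python
# Hypothetical sketch of one agent step at the Phoenix edge: read orders
# that the global namespace has pinned to the local cache, then join them
# with CRM data living in other on-premises systems.
from pathlib import Path
import json

ORDERS_DIR = Path("/nasuni/cache/orders")  # hypothetical locally pinned directory

def fetch_crm_records(customer_ids):
    # Stand-in for a query against an on-premises CRM system.
    return {cid: {"tier": "unknown"} for cid in customer_ids}

def process_new_orders():
    # Orders written at any of the 12 sites arrive here via the namespace.
    paths = ORDERS_DIR.glob("*.json") if ORDERS_DIR.exists() else []
    orders = [json.loads(p.read_text()) for p in paths]
    crm = fetch_crm_records({o["customer_id"] for o in orders})
    for o in orders:
        o["customer"] = crm[o["customer_id"]]  # enrich locally, no per-order round trip
    return orders
```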
And I can see that it’s going to happen. I mean, some of the agents will be in the cloud. Sure. But you’re right, some of the agents will be inferencing at the edge, and you need to be able to shuffle the data around. It’s not one location. It’s easy when it’s one location. It’s not easy when it’s 12 locations or 20 locations, and that’s what is going to happen. You’ll end up with these little multi-step processes that can solve a particular problem in the enterprise, and they’ll do a step and go, oh, I need to get access to the new data.
Blocks and Files: Should I think of data in this environment as being like the sea? So in a sense, wherever you are in an ocean, the sea is the sea is the sea. It’s the same. So wherever you are in the Nasuni data namespace, the data is there. You can access it. The data is the data is the data is the data.
Jim Liddle: Wherever you are in the world. The whole point of it, I guess, when you strip it all back, is that AI needs access to global data to be the most effective. Not just data from Phoenix or just data from London. If data from London and data from Phoenix have some context between each other, you want the AI system to see both, not just one. You’ll have used AI and asked it questions; the more relevant data you give it and the more context you give it, the better.
Blocks and Files: This could be like a virtual desktop. I’m sat here with what looks like my PC desktop. It’s actually just a terminal connected to some central place. So I could be using an AI system sitting at some edge location, but it’s actually running up in the cloud.
Jim Liddle: Absolutely. So this idea of inference at the edge in that environment is tough. I’m not saying you don’t have to use it in certain circumstances with heavy-duty stuff, but your day job as a company is to run your business. It’s not to go and use AI; you solve problems. Will there be AI at the edge? Sure, in some circumstances. And there’ll be other situations – we’ve got some companies doing that already today, but they’ve made strategic decisions around purchasing GPUs because they think they’ll get better ROI and TCO over time. But for a lot of companies, if you look at what the use cases are, they’re pretty simple.
Blocks and Files: The implication here is that you’re not going to see heavy-duty Dell or HPE servers with GPUs inside them sat in small offices.
Jim Liddle: No, nor in remote environments at all. It’s not going to happen. I don’t see it. I mean, Nvidia has just released DGX Spark, as you probably know, which is kind of an AI PC for the desktop. Do we see employees sat at a DGX Spark doing RAG workloads? I don’t think so. It’s so expensive. It’s about four grand for a start, and then it still needs, obviously, technical expertise to set things up.
Ultimately, what a person in the company wants to do is ask an AI a question, but they want it to be answered not from the foundational model’s knowledge; they want it to be answered from their own company knowledge.
Blocks and Files: And the AIs are going to be running in the equivalent of an AI execution space. They’re in the cloud. It’s global. They’re not sat on my local hardware.
Jim Liddle: No. Look, even if you can do that, employees at businesses are not going to do that. I would say for the stock-standard business, what they’re interested in is: how can you let me, Mr. Nasuni, get access to my data with the AI that I choose to use, which in 90-odd percent of cases is going to be in the cloud on Microsoft Azure OpenAI or AWS? And how can I do that to take advantage of the applications and tools they’re giving me to make it easier to leverage that data from an AI perspective?
Blocks and Files: I think that what you are providing is a way for the data, via your global namespace, to be fed to AI models.
Jim Liddle: Correct.
Blocks and Files: You’re not going to get involved in doing detailed RAG data preparation yourself. It doesn’t make any sense to do that. People will use models or pipeline stuff sitting on top of you for that.
Jim Liddle: Correct. I go back to the three core precepts. What are the core architectural precepts of Nasuni? The first one is edge to cloud. That’s so key. If you can’t get the latest data to the AI service, you’ve got a problem. The second one is the global namespace because if you are moving the data there, it’s got to be visible to all of your company locations. There’s no point being visible just for one. And then the third one is AI data resilience.
We’ve seen ransomware becoming more sophisticated, and that’s probably been driven in part by AI itself, because these threat actors are using AI to make the ransomware better. Once you start to get some of those business processes and agents embedded in the enterprise, what are they doing? They’re accessing data from all different places, including the center. It’s an open door, and once you’ve got a hundred of those running and your enterprise relies on them, if your data gets locked down you’ll be scrambling around wondering why something stopped.
Blocks and Files: You’re working out where it stopped, and why, when you’ve got a hundred AI agents as well as your normal human users accessing and processing the data.
Jim Liddle: It is horrendous, and you’re going to need AI data resilience. This really is the next step in ransomware resilience, to be honest, because the underlying threat is still the same: your data gets hijacked. But here the data becomes just as important to AI.
Blocks and Files: What will Nasuni do for that?
Jim Liddle: We’re doing a few things for that. First of all, because we’ve got the snapshot technology in the architecture itself, we’re automatically doing those snaps in the background. All of the snaps we do are immutable, so we can easily roll back.
It sounds trite. I always hate saying it because it sounds a bit markety, but if you look at alternative backup strategies, the problem you’ve generally got with backup is that you take an initial backup and then you do incrementals. But very few companies go back, check that, and roll it forward to see if it’s going to work. Actually getting the whole thing back and running takes time and effort. Whenever we do our snaps, they’re versions of the original object. For us to move back through the versions to a snap, we just change a pointer.
Blocks and Files: What that means is you’ve got a time machine.
Jim Liddle: Yes. You can actually step back in time in minutes. You don’t have to regenerate an entire backup from a foundation and 2,000 incrementals. We literally just move the pointer.
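[The pointer-flip idea can be illustrated with a toy version chain. This is an illustrative sketch of immutable versioning in general, not Nasuni’s actual implementation.]

```python
# Toy illustration of immutable, versioned snapshots: every write creates
# a new version, and "restore" is just moving the current pointer, not
# replaying a base backup plus thousands of incrementals.
class VersionedObject:
    def __init__(self, content):
        self.versions = [content]   # immutable history, never edited in place
        self.current = 0            # pointer to the live version

    def write(self, content):
        self.versions.append(content)
        self.current = len(self.versions) - 1

    def read(self):
        return self.versions[self.current]

    def restore(self, version):
        self.current = version      # rollback is one pointer change, O(1)

f = VersionedObject("v0: clean file")
f.write("v1: edited file")
f.write("v2: encrypted by ransomware")
f.restore(1)                        # instant recovery to the last good version
assert f.read() == "v1: edited file"
```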
We’ve also got our ransomware protection. It looks not just at the file headers but also for anomalous behavior. And what it does is it kind of says: oh, Chris has locked 50 files in under 30 seconds, or maybe under a second. I’m going to lock Chris out. I’m going to send a report to the admin, and it’s going to be up to the admin to decide what to do with it. Let Chris back in again – or not, because that doesn’t look like the type of behavior that should be happening. And that’s built into the product today.
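[The rule he describes amounts to rate-limiting suspicious file operations per user. A simplified sliding-window sketch follows; the thresholds and function names are illustrative, not Nasuni’s.]

```python
# Simplified sliding-window detector: if one user locks or encrypts too
# many files too quickly, suspend them and flag the admin. A real product
# would tune these thresholds and inspect file contents as well.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 30
MAX_EVENTS = 50

events = defaultdict(deque)   # user -> timestamps of recent suspicious ops
suspended = set()

def notify_admin(user: str, count: int) -> None:
    print(f"ALERT: {user} locked {count} files in {WINDOW_SECONDS}s; access suspended")

def record_file_lock(user: str, now: float | None = None) -> None:
    now = time.time() if now is None else now
    q = events[user]
    q.append(now)
    while q and now - q[0] > WINDOW_SECONDS:
        q.popleft()                    # drop events outside the window
    if len(q) > MAX_EVENTS and user not in suspended:
        suspended.add(user)            # lock the user out
        notify_admin(user, len(q))     # admin decides whether to reinstate
```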
Blocks and Files: How about attacking this from another angle, which is that you’ll have AI agents accessing the data and those agents will have behavioral profiles. So you need to track what the agents are doing and if an agent is doing something different, anomalous, would you lock an agent out or am I going off on a tangent?
Jim Liddle: No, I think you’ve hit on a good point. I’ll run with your analogy. You’ve probably heard of the Model Context Protocol, or MCP. We are looking heavily at MCP. If you look at MCP today, Claude is an MCP client, so it can connect to any MCP server. All you’ve got to be careful of is that the MCP server you are connecting to is not a poisoned MCP server.
You’ve got to be really careful. Now, if you are controlling it all end to end, then that’s fine. It’s a closed system. But if you’ve downloaded an agent from somewhere and just embedded it into your agent framework, who’s to say that that agent hasn’t been compromised at some point? It’s supposedly an agent from a channel partner. Who knows?
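[One defensive pattern implied here is to pin a fingerprint of every downloaded agent or MCP server when it is vetted, and refuse anything that has changed since. A hypothetical sketch; the registry and hashing scheme are assumptions on our part, not part of MCP itself.]

```python
# Hypothetical trust check before wiring a downloaded agent or MCP server
# into a framework: only accept it if its code matches the fingerprint
# pinned when the component was first vetted.
import hashlib

def fingerprint(code: bytes) -> str:
    return hashlib.sha256(code).hexdigest()

PINNED: dict[str, str] = {}   # component name -> sha256 pinned at approval time

def approve(name: str, code: bytes) -> None:
    PINNED[name] = fingerprint(code)

def is_trusted(name: str, code: bytes) -> bool:
    # Refuse anything unknown, or anything whose code changed since vetting.
    return PINNED.get(name) == fingerprint(code)

agent_code = b"def run(): ..."          # pretend this was downloaded
approve("orders-agent", agent_code)     # vetted once, fingerprint pinned
assert is_trusted("orders-agent", agent_code)
assert not is_trusted("orders-agent", b"tampered " + agent_code)
```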
Blocks and Files: That’s frightening. That’s really frightening.
Jim Liddle: It is. And I think in the enterprise, most of them are just going to gravitate towards … closed doors.