Commissioned: In the age of automation and generative AI (GenAI), it’s time to re-think what “datacenter” really means. For those who have become heavily invested in public cloud, the datacenter might not be the first place you think of when it comes to automation and GenAI, but these technologies are rapidly changing what is possible in all environments.
Ten or fifteen years ago, when businesses started bypassing IT by swiping credit cards and setting developers loose on cloud resources, the public cloud was absolutely the right move. In most large organizations, internal customers were often ignored, or their needs were not being fully met. They wanted flexibility, they craved scalability and they needed a low up-front cost to allow incubation projects to flourish.
If time stood still, perhaps the dire prognosticators of the datacenter’s end would have been right. I myself was quite the cloud evangelist before learning more about the other side of the fence. So why hasn’t this extinction-level event come to pass? Because the datacenter has adapted. Sure, there are “aaS” and subscription models now available on-premises; but the real stabilizing force has been automation.
Which brings us to the story of the day: GenAI and how it can augment automation in the datacenter to deliver an experience nearly on par with the public cloud. Before we get there, though, we need to look at the role automation and scripting have played in the datacenter. We’ll start by explaining some essentials, then we’ll unpack why automation and GenAI have changed what is possible on-premises.
Cloud operating model and infrastructure as code
Let’s start with the basics: the foundation of cloud was infrastructure as code and the idea of consuming IT as a Service. Your developers never had to talk to a storage admin, IT ops person, or the networking team to rapidly spin up an environment and get to work. This should be table stakes in 2023, and the good news is that it’s entirely possible to build it for yourself. Adopting this operational model means IT is leveraging policies and processes alongside automation to remove friction from the environment.
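To make the infrastructure-as-code idea concrete, here is a minimal Python sketch of the core loop: describe the environment you want as data, compare it with what exists, and let automation work out the steps. The `VmSpec` type, the `plan` function and the machine names are all hypothetical, invented for illustration; real tools are far more thorough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    """Hypothetical description of one virtual machine."""
    name: str
    cpus: int
    memory_gb: int


def plan(desired, actual):
    """Return the actions needed to move `actual` toward `desired`."""
    desired_by_name = {vm.name: vm for vm in desired}
    actual_by_name = {vm.name: vm for vm in actual}
    actions = []
    for name, spec in desired_by_name.items():
        if name not in actual_by_name:
            actions.append(("create", name))
        elif actual_by_name[name] != spec:
            actions.append(("resize", name))
    for name in actual_by_name:
        if name not in desired_by_name:
            actions.append(("delete", name))
    return actions


# What the developer asks for, written down as code.
desired = [VmSpec("dev-web", 4, 16), VmSpec("dev-db", 8, 64)]
# What is currently running (made-up example data).
actual = [VmSpec("dev-web", 2, 8), VmSpec("old-test", 2, 4)]

actions = plan(desired, actual)
print(actions)
```

The developer only edits the `desired` list; nobody has to file a ticket with the storage or networking team to find out what needs to change.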
Visual representation of the end experience when you’ve automated a cloud operating model
Automation toolsets and telemetry data
Today there are many automation, management and telemetry/AIOps products available that provide unparalleled control and insights into datacenters. Data is the foundation of AI and of managing a datacenter effectively. The control and visibility now available in datacenters is often a superset of what can be achieved in the public cloud – although the hyperscalers have done a great job in that department as well. Given the cloud’s multitenant nature, cloud providers must obscure some of the operational knowledge to keep every customer secure. This results in architectural decisions that limit how some monitoring systems can be deployed and what data can be collected. One important area of focus is ensuring that you’re heavily integrating these solutions, embracing automation and infrastructure as code, measuring and monitoring everything, and using a cohesive workflow for all your roles.
Visual representation of a common automation/management stack
The next wave of IT automation with GenAI
This brings us to the next evolution of the datacenter incorporating GenAI. Let me share a fun story about a past role where the client made the marketing consultant build an HCI deployment hands-on lab for physical and virtual infrastructure, and then didn’t provide any subject matter experts to help. If it’s not clear, that marketing consultant was me, and it was probably one of the most challenging projects I’ve ever worked on. I used code snippets and YouTube tutorials to get to the foundation of how to do such a task. I spent weeks assembling the puzzle, figuring out how each puzzle piece fit together. By some miracle I actually managed to get it right, even though I didn’t know much about coding. Anyway, here’s wonderwall… I mean here’s GenAI doing that.
GenAI is the Search Engine and code assembly machine we were looking for
Now mind you, in my hands-on lab I was doing a lot more than just installing Windows Server, but there is no doubt in my mind that if I asked it to provide the rest of that process, it could. What’s so important is that with the infrastructure-as-code mentality, and in new environments where developers may not be familiar with these types of calls or runbooks, GenAI is a new ally that can really help. Many people don’t realize how prevalent common infrastructure scripts are, and oftentimes they’re written by the tech companies themselves. Both hardware and software vendors have large runbook repositories; sometimes it’s just a matter of finding them. Enter GenAI. Another important consideration is that the infrastructure itself is intelligent and secure. These commands can be pushed out to thousands of servers for remote management purposes. This greatly lowers the bar on managing your environment.
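As a rough illustration of pushing one command out to many servers, the Python sketch below fans a request out across a thread pool. Everything specific here is an assumption: `run_on_host` is a placeholder that only returns a string, the host names are invented, and a real version would call whatever remote management interface your hardware or software vendor provides.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_host(host, command):
    """Placeholder for a real remote management call (hypothetical)."""
    # A real implementation would contact the server here.
    return f"{host}: would run '{command}'"


def push_command(hosts, command, max_workers=32):
    """Send the same command to every host in parallel, keyed by host name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda h: run_on_host(h, command), hosts)
        return dict(zip(hosts, results))


# One thousand invented server names.
hosts = [f"server-{n:04d}" for n in range(1, 1001)]
results = push_command(hosts, "report firmware version")
print(len(results), results["server-0001"])
```

The point is the shape, not the details: once a runbook step is expressed as a function, running it against one server or a thousand is the same line of code.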
GenAI and process building
One of my favorite customer engagement stories might sound a little long in the tooth – somewhat like those stories of being lost or unable to reach someone that are unfathomable to those who grew up with smartphones. We hear a ton of talk about containers, but when I broached this topic with one customer, he said, “I can’t even keep my VMware admins 18 months, what makes you think I could ever do containers?” This is something I’ve thought a lot about and it’s probably the biggest challenge with technology: if I don’t have the skillset, how could I possibly onboard it? Enter GenAI’s next incredible friction reducer: writing or finding documentation.
In just two prompts we have a routine and highly valuable process documented and ready to use
We’ve long had access to an incredible amount of information, but until now there has been no practical way to parse it all. This all changes with GenAI. Now, instead of navigating search and sifting through code repositories, a simple natural language query or prompt can yield the documentation needed. Instead of hours of looking for answers, extensive documentation is at your fingertips in minutes. This lowers many of the barriers to embracing technology. Imposter syndrome, skill gaps, and switching costs: you’re on notice.
Thousands of possibilities but AIOps is next
I want to acknowledge the wealth of ways this technology can help us run a datacenter. Probably the next one to add significant value is AIOps. That rich telemetry data can tell us a lot but also tends to have a signal-to-noise ratio problem. We’re simply generating too much data for human beings to analyze and comprehend it all. By pushing this data into GenAI and using natural language as an interface, we will extend insights to a broader audience and make it possible to ask questions we may never have thought of when looking at charts and raw data. The mean time to resolution should drop sharply when we use this kind of data. But there are drawbacks, which brings us to our final point.
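One way to picture the signal-to-noise problem is the small Python sketch below. It flags the unusual samples in a metric and packs only those into a short natural language prompt, rather than handing over every raw data point. The threshold, the sample data and the prompt wording are all assumptions made for illustration, and the sketch stops short of calling any model.

```python
from statistics import mean, pstdev


def find_outliers(samples, threshold=2.0):
    """Return (index, value) pairs more than `threshold` std devs from the mean."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        return []  # every sample is identical, so nothing stands out
    return [(i, v) for i, v in enumerate(samples) if abs(v - mu) / sigma > threshold]


def build_prompt(metric, samples, question):
    """Summarize a metric into a compact prompt for a language model."""
    lines = [f"Metric: {metric}", f"Samples: {len(samples)}, mean: {mean(samples):.1f}"]
    for index, value in find_outliers(samples):
        lines.append(f"Outlier at sample {index}: {value}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Made-up CPU utilization readings with one obvious spike.
cpu = [20, 22, 21, 19, 23, 20, 95, 21, 22, 20]
prompt = build_prompt("cpu_percent", cpu, "What could explain the spike?")
print(prompt)
```

A simple statistical filter like this is only a stand-in for what AIOps products do, but it shows the division of labor: code reduces the noise, and the natural language interface handles the question.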
GenAI and automation change what’s possible, but we must use it carefully
Two of the major challenges with GenAI must be addressed: intellectual property (IP) leakage and the technology’s tendency to “hallucinate,” or make things up. Let’s unpack each and determine how to embrace the technology without stumbling during implementation.
First, let’s discuss IP leakage. In any scenario where data is being sent to GenAI models that are delivered as a service, we risk leaking IP. Much like the early days of public cloud and open S3 buckets, early experimenters created risk for their companies through misuse or misunderstanding. The best way to counter this is to have a centralized IT strategy, insert GenAI tools into your common workflows or development pipeline, and prioritize building your own GenAI on-premises for highly sensitive data that cannot go to an AIaaS offering that is constantly learning from your data.
The other benefit of bringing a large language model (LLM) in house is that you can make it more precise and put guardrails on it, so the responses it generates fit the context of your own business. The guardrails can also stop some of the “hallucinating,” i.e., when the GenAI is compelled to answer but provides inaccurate and/or made-up information to comply with the request. This is a common problem with GenAI. The reality is these tools are all still in their infancy. Just as most teams work testing into their release pipeline, this too is an area where more rigor should be applied prior to pushing to production. I’m a big proponent of human in the loop, or human-assisted machine learning, as a way to reduce mistakes with AI.
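Here is a minimal Python sketch of what a guardrail plus a human in the loop might look like for GenAI-suggested commands. The allowlist, the blocked characters and the `approve` callback are all hypothetical choices for illustration; a production guardrail would need far more rigor than this.

```python
import shlex

# Hypothetical allowlist of programs a generated command may invoke.
ALLOWED_PROGRAMS = {"systemctl", "df", "uptime"}
# Shell characters that could chain or redirect commands.
BLOCKED_CHARACTERS = ";|&`$><"


def passes_guardrail(command):
    """Automated check: single allowlisted program, no shell chaining."""
    if any(ch in command for ch in BLOCKED_CHARACTERS):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes or similar
    return bool(tokens) and tokens[0] in ALLOWED_PROGRAMS


def gate(command, approve):
    """Run the guardrail first, then ask a human before anything executes."""
    if not passes_guardrail(command):
        return "rejected"
    return "approved" if approve(command) else "declined"


# `approve` stands in for a real review step, such as a ticket or chat prompt.
print(gate("df -h", lambda cmd: True))
print(gate("uptime; rm -rf /", lambda cmd: True))
```

The automated check catches the obvious problems cheaply, and the person gets the final say on everything else, which is the same pattern most teams already use for code review before a release.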
The future is automated
The datacenter is here to stay, but it can be radically transformed with GenAI and automation. These tools can augment our workflows and help IT Ops and developers achieve superhuman capabilities, but they are not a direct replacement for people. As you roll out your AI and automation strategies it’s important to think about what you’re trying to accomplish and what level of automation your organization is comfortable with. The future is bright and the ability to innovate anywhere is now a reality.
Brought to you by Dell Technologies.