The hype surrounding generative AI means it’s no longer a tool to experiment with during a lunch break – it’s now a technology that businesses are under pressure to adopt. The likes of ChatGPT have spawned new use cases that are upending entire industries. Whether organizations realize it yet or not, this shift will ultimately force them to consider how the infrastructure they use will hold up under the weight of such developments.
Many things are happening at once that are giving companies options as well as dilemmas. Firstly, AI – far from a new technology in itself – is now demanding more of the underlying infrastructure that powers it, because of the resource-hungry way in which large language models (LLMs) are trained. Compute, GPU technologies and other accelerators central to a successful rollout of generative AI systems have been advancing quickly, which means businesses are in a much better place than they might have been. However, cloud computing infrastructure as we know it needs to evolve to keep pace with growing demand.
Secondly, companies are under pressure to invest in AI – many for the first time – and to show progress in this space. Although many businesses may be clear on what they want to achieve with AI, they might not have a clear roadmap of the steps to take from an IT perspective, or of how long to allow to get a system in place.
We know that many cloud solutions aren’t suitable for generative AI because generative AI models, like LLMs, are much bigger than previous AI models. Until now, cloud infrastructure did not need the number of specialized chips and GPUs that AI now requires. But because most AI models operate in the cloud, the cloud has to be a major focus for deploying generative AI systems successfully.
Senior Director, Global Digital Innovation Marketing and Strategy at Hitachi Vantara.
Considering the individual needs of your business
However, not all businesses should look to explore generative AI by building their own LLMs, or indeed use AI in the same way as others. In the first instance, firms need to decide the role they want to play and, crucially, how much they want to invest.
Insights from McKinsey earlier this year separated the use cases of generative AI for businesses into three categories: Taker, Shaper, and Maker. Very few companies are or can be Makers, meaning they create their own LLMs or generative AI. Most companies are more likely to take on the role of “Shaper”, which means integrating AI models with their own internal data and systems, or “Taker”, which means using publicly available models and tools. What these types of businesses may yet fail to realize is that it is in fact possible to begin the generative AI journey with a relatively small, on-premise or hybrid system.
Before launching into the generative AI journey, the first essential step for organizations is to consider which option is best for their business. There are numerous ways to reap the rewards of this new technology, but not without putting in the initial thinking and groundwork.
Should businesses look on premise?
Looking ahead to the available options, the question could be asked: while cloud features heavily in the operation of generative AI solutions, does it mean AI solutions must live there exclusively?
Without specialist infrastructure optimized for AI, cloud costs could begin to scale prohibitively, bearing in mind the heavy usage needed to train and run generative AI models. Companies could therefore start to consider on-premise or hybrid solutions, even if they have been staunch proponents of cloud computing for the past decade. On-premise storage, in some circumstances, becomes more cost-effective in the long run but, on the other hand, there’s the initial capital expenditure to get the equipment up and running. On-premise solutions also often require a dedicated IT team for maintenance, updates, and troubleshooting, while cloud providers handle much of the IT infrastructure management for you.
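The cost trade-off above can be framed as a simple break-even calculation: how many months of cloud spend it takes to exceed the up-front cost of running equivalent capacity on-premise. The sketch below is illustrative only – the function and the figures in the example are assumptions, not benchmarks.

```python
def breakeven_months(capex: float, onprem_monthly: float, cloud_monthly: float) -> float:
    """Months until cumulative cloud spend exceeds on-premise
    capital expenditure plus on-premise running costs.

    All inputs are hypothetical planning figures.
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        # Cloud is already cheaper per month; on-premise never breaks even.
        return float("inf")
    return capex / monthly_saving

# Example: $120k of hardware vs $2k/month on-prem running costs
# vs $7k/month cloud costs -> break-even after 24 months.
print(breakeven_months(120_000, 2_000, 7_000))
```

Real comparisons would also need to account for hardware refresh cycles, staffing, and utilization, but even a rough model like this can anchor the on-premise-versus-cloud discussion in numbers.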
The decision to use on-premise, hybrid or cloud solutions for deploying generative AI should be based on a careful assessment of a business’s specific requirements, including data sensitivity, scalability, budget, expertise, and deployment speed. For example, if a business deals with highly sensitive or confidential data – such as a healthcare business – on-premise solutions may offer greater control and security. Hosting AI on-premise could mean that businesses can keep all data within their own infrastructure, reducing the risk of data breaches and ensuring compliance with privacy regulations.
On the other hand, cloud-based solutions often provide better accessibility and collaboration capabilities. They enable remote access to AI resources, making it easier for teams to work together from different locations, something that is becoming more common in the post-Covid-19 working landscape. On-premise solutions may offer less flexibility in this regard.
What steps should a company take to roll out generative AI effectively?
Above all, deploying generative AI, in the cloud or on-premise, involves several considerations to ensure scalability, reliability, and security. Whether a business is deploying a language model like ChatGPT, an image generator, or any other generative AI, it’s important to select the right approach and the right cloud provider, as each will have its own set of AI tools and services to match a business’s specific needs. Other considerations include data security and access control – implementing strict access controls and authentication mechanisms so that only authorized users or services can reach the AI system – and making sure that the deployment complies with relevant laws and regulations.
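In practice, the access-control piece often amounts to a gate in front of the model endpoint that checks who is calling before any prompt is processed. The sketch below shows that shape in miniature; the role names, the in-memory token map, and the `generate` stub are all illustrative assumptions – a real deployment would delegate to an identity provider rather than a hard-coded table.

```python
# Minimal sketch of an access-control gate in front of an AI endpoint.
# Roles, tokens, and the model call are hypothetical placeholders.

ALLOWED_ROLES = {"ml-engineer", "analyst"}

# Stands in for a lookup against an identity provider.
TOKEN_ROLES = {
    "token-abc": "ml-engineer",
    "token-xyz": "guest",
}

def authorize(token: str) -> bool:
    """Return True only if the token maps to an approved role."""
    return TOKEN_ROLES.get(token) in ALLOWED_ROLES

def generate(prompt: str, token: str) -> str:
    """Refuse to serve the model unless the caller is authorized."""
    if not authorize(token):
        raise PermissionError("caller is not authorized to use the model")
    # ...call the actual generative model here...
    return f"response to: {prompt}"
```

The point is simply that the check happens before the model is invoked, so unauthorized callers never touch the AI system or the data behind it.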
It’s also worth keeping up to date with the latest cloud services and AI-related updates from any chosen cloud provider. New features and improvements can often enhance the performance and security of AI deployment and many providers’ community forums, support channels, and third-party AI communities can offer help and a space to share knowledge.
Another key consideration as organizations start to bet on AI by putting those solutions and tools into production is to make sure they have the infrastructure and reliability to support this change. They need to be able to consistently serve up the right data, at the right time, to the right users, to ensure they’re hitting the mark. It’s here that enterprise class storage and data management software is required, which can automate the “massaging” of the data to ensure it works with the AI technology.
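The “massaging” described above can start as simply as normalizing, de-duplicating, and filtering records before they ever reach the model. The sketch below illustrates that idea; the field names (`text`, `source`) are assumptions for the example, not a reference to any particular product.

```python
def massage_records(records: list[dict]) -> list[dict]:
    """Normalize raw records so they are usable by a downstream AI
    pipeline: strip whitespace, drop empties, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        text = (rec.get("text") or "").strip()
        if not text or text in seen:
            continue  # skip blank and duplicate entries
        seen.add(text)
        cleaned.append({"text": text, "source": rec.get("source", "unknown")})
    return cleaned
```

Enterprise-class data management software automates this kind of work at scale, but the underlying job is the same: serving up the right data, in the right shape, every time.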
The short and long way to begin
With all this said, infrastructure isn’t the only consideration when introducing generative AI. Businesses also need to think long and short. They need to commit to the long-term potential that generative AI can deliver, and look for ways to use the technology to start delivering results in a meaningful way. This is the fun bit – diving into AI – but that shouldn’t mean getting involved in the technology with no plan.
The best starting point is to pick a strategic focus, identify any potential concerns around privacy, regulations and ethics, and experiment. Along the way, look for areas where specific KPIs can be measured and tracked, to see how much AI is benefitting the business using real data. Reassess and repeat until you’re happy with the solution you’ve rolled out.
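Measuring that benefit can be as plain as comparing a KPI before and after an AI pilot. A minimal sketch – the metric and the numbers in the example are made up for illustration:

```python
def kpi_uplift(baseline: float, with_ai: float) -> float:
    """Percentage change in a KPI relative to its pre-AI baseline."""
    return (with_ai - baseline) / baseline * 100

# Example: average support tickets resolved per day, before vs after
# piloting an AI assistant (hypothetical figures).
print(kpi_uplift(40.0, 50.0))  # 25.0 -> a 25% uplift
```

Tracking a handful of such numbers across each iteration gives the “reassess and repeat” loop something concrete to act on.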
There’s no one-size-fits-all solution for deploying generative AI, whether in the cloud or elsewhere. What’s important is to treat it as an ongoing process that requires regular maintenance, updates, and adaptation to changing requirements. By following these recommendations, businesses can go on to create robust and reliable generative AI systems that support their business strategy.