GPT-in-a-Box 2.0: an easy entry to generative AI

Today, every company is exploring opportunities in artificial intelligence, and especially generative AI. AI has the potential to transform a wide range of industries and applications, including healthcare (answering patient questions), fraud detection in money transactions, autonomous vehicles, robotics, financial analysis, and so on. Where traditional machine learning was once prominent, generative AI is taking over thanks to its ability to learn from existing data and generate new content from it. At the same time, there is the fear that business-sensitive data leaks out when generative AI stores its data in a cloud environment the customer has no control over. Nutanix partnered with NVIDIA, Hugging Face, and others to let customers use generative AI on-premises, without the cloud and its risks.

Nutanix GPT-in-a-Box is a solution that combines Nutanix’s hyper-converged infrastructure capabilities with an AI platform to provide a powerful and user-friendly option for businesses seeking to enhance customer service and streamline operations. The core of Nutanix GPT-in-a-Box is a hyper-converged infrastructure solution that integrates storage, networking, and compute resources into a single, cohesive platform. This is complemented by an AI platform that enables the development and deployment of large language models, ML operations, and generative AI apps. This integration allows businesses to simplify and automate their IT infrastructure, reducing the complexities of managing multiple systems.

Nutanix GPT-in-a-Box provides a sandbox environment for developers to create gen AI tools and applications. Through the use of MLOps (Machine Learning Operations), data scientists and IT professionals can collaborate to automate, deploy, and monitor machine learning models in production. MLOps merges machine learning with DevOps practices to establish a seamless workflow for the development and deployment of machine learning models and large language models.

What is an LLM?

A generative AI needs a source from which it can pull its information; for text-based AI this is called a large language model (LLM). There are different ways to give an AI its model:

  • Create a new large language model. This is the hardest and most time-consuming option of all: building an LLM from scratch requires a tremendous amount of data and the expertise of data scientists. It is a route that only big corporations such as Google or Facebook can realistically take.
  • Fine-tune a pre-existing LLM. Fine-tuning refers to adjusting the parameters, hyperparameters, or other configuration of an existing model to improve its performance on a specific task or dataset. This involves training the LLM on additional data or modifying its learning process to better fit the requirements of the task at hand (a fine-tuning sketch follows this list).
  • Use Retrieval-Augmented Generation, also called RAG. Normally a generative AI gets its information from its LLM; if that data is out of date or not customer-specific, the AI will still use it, because it only knows what is in the LLM. RAG adds a content store, which can be open internet sources or a customer's internal documents, databases, or policies. The model first queries that content store for information relevant to the task it was given, and then uses the retrieved information to answer the question or perform the task (a RAG sketch follows this list).
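
To make the fine-tuning option more concrete, below is a minimal sketch that continues training a small pre-trained model with the Hugging Face Transformers Trainer API. The distilgpt2 model, the hypothetical company_docs.txt training file, and the hyperparameters are assumptions made for this sketch, not part of GPT-in-a-Box.

```python
# A minimal fine-tuning sketch (assumptions: distilgpt2 as a small stand-in model,
# a hypothetical company_docs.txt file with one training example per line).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "distilgpt2"                     # small stand-in for a production LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token     # GPT-style tokenizers have no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the domain-specific text and tokenize it.
dataset = load_dataset("text", data_files={"train": "company_docs.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-llm", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()   # continues training the pre-trained weights on the new data
```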
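
And here is a minimal sketch of the RAG pattern itself: retrieve the most relevant document from a content store, then hand it to the generator as context. The embedding model, the tiny in-memory document list, and gpt2 as the generator are placeholders chosen purely for illustration.

```python
# A minimal RAG sketch (assumptions: a tiny in-memory document list as the content
# store, all-MiniLM-L6-v2 for embeddings, gpt2 as a stand-in generator).
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

documents = [
    "Refunds are processed within 14 days of the return being received.",
    "Support is available Monday to Friday, 09:00-17:00 CET.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")      # embedding model for retrieval
generator = pipeline("text-generation", model="gpt2")   # stand-in for a validated LLM
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

def answer(question: str) -> str:
    # 1. Retrieve the document most relevant to the question.
    q_embedding = embedder.encode(question, convert_to_tensor=True)
    best = int(util.cos_sim(q_embedding, doc_embeddings).argmax())
    context = documents[best]
    # 2. Augment the prompt with the retrieved context, then generate.
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    return generator(prompt, max_new_tokens=50)[0]["generated_text"]

print(answer("How long do refunds take?"))
```

In a real deployment the in-memory list would be replaced by a vector database or document index, but the retrieve-then-generate flow stays the same.
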
How does GPT-in-a-Box work?

Now we know why Nutanix and GPT work together to deliver a turnkey solution to customers for their AI needs. But how does it work inside the Nutanix environment, and what are the prerequisites?
The following diagram summarizes it perfectly:

Nutanix GPT-in-a-Box is built on top of the Nutanix platform:

  • GPU-enabled nodes are nodes with one or more GPUs in them to take care of the compute load.
  • Nutanix AOS is the well-known Nutanix operating system.
  • On top of that runs the Nutanix hypervisor AHV with Prism Element.
  • Kubernetes is integrated with the Nutanix cluster and Prism Central.
  • Nutanix Files and Objects are proven data storage solutions.

On top of the standard Nutanix platform and its perks, the GPT-in-a-Box solution adds:

  • PyTorch is an open-source machine learning library based on the Torch library. It is used for the natural language processing part of the solution (a short inference sketch follows this list).
  • MLOps is a core function of machine learning engineering: an engineering discipline that aims to unify the development and deployment of ML systems.
  • Foundation models are validated large language models, trained on massive data sets, that the generative AI uses to perform its tasks.
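
To illustrate how PyTorch and a foundation model come together at inference time, the sketch below loads a small placeholder model and generates text on a GPU when one is available. The model name is an assumption made for illustration, since GPT-in-a-Box ships its own validated models.

```python
# A minimal inference sketch (assumption: distilgpt2 as a placeholder for one of the
# validated foundation models shipped with the solution).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"   # use the node's GPU when present

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2").to(device)

inputs = tokenizer("Running generative AI on-premises means", return_tensors="pt").to(device)
with torch.no_grad():                                      # inference only, no gradients
    output_ids = model.generate(**inputs, max_new_tokens=40)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
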
How to implement Nutanix GPT-in-a-Box?

Simply having a Nutanix cluster is not enough to run GPT on it. The cluster must be sized for AI usage, which means the Nutanix Sizer adds the specific hardware and licenses to the cluster that are needed to run GPT-in-a-Box.

The first requirement is GPU-enabled nodes, because GPT needs a lot of parallel compute power, which is provided by the GPUs in every node. The second requirement is file and object storage to fine-tune and run the chosen GPT models. This is part of the Nutanix Cloud Infrastructure Ultimate license, which is included in the Nutanix GPT-in-a-Box service. (Of course, next to the AI-specific needs, the cluster keeps its native Nutanix functions, with the AHV hypervisor, resilience, and ease of management.)

The AI compute can run on two different platforms in a Nutanix cluster: a Kubernetes cluster or an Ubuntu VM, although for production workloads a Kubernetes cluster is recommended. Nutanix uses open-source software to deploy and run the AI workloads, including PyTorch and the Kubeflow MLOps toolset (a minimal pipeline sketch follows below). Every generative AI also needs a large language model; this model is where the AI gets its information from. It can be a generic LLM from Nutanix's list of validated models, or you can build your own. Nutanix also supports running multiple LLMs on the same cluster.
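
As a rough idea of what an MLOps workflow on that Kubernetes cluster could look like, here is a minimal Kubeflow Pipelines (kfp v2) sketch with two placeholder steps: preparing data and deploying a model. The component names, base images, paths, and the validated-llm model name are assumptions made for illustration, not the actual GPT-in-a-Box pipeline.

```python
# A minimal Kubeflow Pipelines (kfp v2) sketch. Component names, images, and paths
# are illustrative assumptions, not the actual GPT-in-a-Box pipeline.
from kfp import compiler, dsl

@dsl.component(base_image="python:3.11")
def prepare_data(source_path: str) -> str:
    # Placeholder: pull documents from Nutanix Files/Objects and preprocess them.
    return source_path

@dsl.component(base_image="python:3.11")
def deploy_llm(model_name: str, data_path: str) -> str:
    # Placeholder: load the chosen validated LLM and expose an inference endpoint
    # (a real component would use a GPU-enabled base image).
    return f"deployed {model_name} using data at {data_path}"

@dsl.pipeline(name="llm-inference-pipeline")
def llm_pipeline(model_name: str = "validated-llm"):
    data = prepare_data(source_path="/mnt/files/corpus")
    deploy_llm(model_name=model_name, data_path=data.output)

# Compile to a pipeline spec that can be submitted to Kubeflow on the cluster.
compiler.Compiler().compile(llm_pipeline, package_path="llm_pipeline.yaml")
```

The compiled llm_pipeline.yaml is what would then be uploaded to and scheduled by the Kubeflow instance running on the Kubernetes cluster.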

Conclusion

Nutanix GPT-in-a-Box is a very easy way to get your own generative AI running in your data center, with a language model you selected and enriched with your own business data. It is a safe way of accelerating your business with generative AI, and this is just scratching the surface of what generative AI can do.