Modelcars from Red Hat

Last week I wrote about modelcars and how you can build them yourself. Following on from this, the Red Hat AI Business Unit also provides some ready-made modelcars that you can use right away: https://quay.io/repository/redhat-ai-services/modelcar-catalog

Just repeat step 5 from last week's post with the image you want and you're done.
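For reference, serving one of these catalog images boils down to pointing a KServe InferenceService at the OCI image. This is a minimal sketch: the model tag, the namespace, and the runtime name are assumptions, so check the catalog and your cluster for the actual values.

```shell
# Deploy a modelcar image from the catalog as a KServe InferenceService.
# Namespace, service name, image tag, and runtime name are placeholders.
oc apply -n my-models -f - <<'EOF'
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: granite-example
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM
      runtime: vllm-runtime   # must match a ServingRuntime in the namespace
      storageUri: oci://quay.io/redhat-ai-services/modelcar-catalog:granite-3.1-8b-instruct
EOF
```

The `oci://` storageUri is what makes this a modelcar deployment: KServe pulls the model weights from the container image instead of from S3 or a PVC.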

CPU models

I am currently working on an OpenShift AI PoC, and one of the requirements is to deploy an LLM that runs on CPU only. The team can't use GPUs at the beginning, even though they know that LLMs only reach their full potential on them. Later, the production system will have GPUs and will be able to use other, more common models. For now, a good CPU-only solution needs to be found.

We started with TinyLlama, following steps similar to this guide from vLLM. (The guide does not take into account that enterprise clusters are usually hidden behind a proxy, but it is excellent for a quick start.)
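For context, a local CPU-only run of vLLM looks roughly like this. This is a sketch along the lines of the vLLM CPU guide: the model name and cache size are assumptions, the exact flags vary between vLLM versions, and behind a corporate proxy you would additionally need HTTP_PROXY/HTTPS_PROXY set.

```shell
# Size of the KV cache in GiB for the CPU backend (assumed value).
export VLLM_CPU_KVCACHE_SPACE=8

# Start the OpenAI-compatible server on CPU with a small model.
vllm serve TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --device cpu \
  --dtype bfloat16

# Once it is up, query it like any OpenAI-style endpoint:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0", "prompt": "Hello", "max_tokens": 16}'
```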

Unfortunately, in the RAG use case with an internal document, the LLM did use the supplied information, but it also hallucinated a lot, so the result was not really convincing.

I heard about the Pleias model from a colleague. It is well suited to RAG use cases, is not too big, and delivers good results. Unfortunately, this model is not yet available as a modelcar, but I have heard that people are already working on it.

As soon as the model can be easily used via modelcar, I will let you know.

How to backup all of your OpenShift AI stuff (Work in progress)

In order to reproduce errors within the OpenShift AI PoCs, I have built an instance on our demo platform, which I will soon delete again for cost reasons.

I wanted to know the best way to preserve my work with a backup. My goal was a Helm chart that can be deployed on clean OpenShift clusters. In a chat I had heard about the Helm chart for a CPU-only RAG use case, so the obvious approach was to grab all the object types that appear in that chart and hope for the best.

It's a work in progress, but I'm on it. I fetched all of these objects and trimmed them down with oc neat (an awesome tool for YAML backups that removes status information and other cluster-managed fields):

oc get deployment -o yaml | oc neat > deployments.yaml
oc get inferenceservice -o yaml | oc neat > inferenceservices.yaml
oc get notebook -o yaml | oc neat > notebooks.yaml 
oc get pvc -o yaml | oc neat > pvcs.yaml 
oc get route -o yaml | oc neat > routes.yaml       
oc get secrets -o yaml | oc neat > secrets.yaml
oc get svc -o yaml | oc neat > services.yaml  
oc get servingruntime -o yaml | oc neat > servingruntimes.yaml
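The same export can also be written as a loop, which makes it easy to extend the list of resource types later. A small sketch, assuming oc neat is installed as an oc plugin and that you are in the right namespace:

```shell
# Back up a list of resource types from the current namespace,
# stripping status and other cluster-managed fields with oc neat.
for kind in deployments inferenceservices notebooks pvcs \
            routes secrets services servingruntimes; do
  oc get "$kind" -o yaml | oc neat > "$kind.yaml"
done
```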

I'm quite confident that all of my objects are saved. Now I'll have to remove the objects that aren't really part of my own deployment, put the rest into a Helm chart, provide nice values, and it should be done. Did I mention that I'm a Helm fanboy?
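The rough plan for turning those files into a chart could look like this. A sketch only: the chart name is an assumption, and the manifests would still need to be parameterized with values before this is a real chart.

```shell
# Scaffold a chart and replace the generated templates
# with the backed-up, neat-ified manifests.
helm create openshift-ai-backup
rm -rf openshift-ai-backup/templates/*
cp deployments.yaml inferenceservices.yaml notebooks.yaml pvcs.yaml \
   routes.yaml secrets.yaml services.yaml servingruntimes.yaml \
   openshift-ai-backup/templates/

# Render locally to check the output, then install on a clean cluster:
helm template openshift-ai-backup
helm install my-poc openshift-ai-backup
```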