Configuring the chat bot
The sample retail application includes a built-in chat interface that allows customers to interact with the store using natural language. The feature can help customers find products, get recommendations, and get answers to questions about store policies. In this module, we'll configure the chat component to use our Mistral-7B model served through vLLM.
Let's reconfigure the UI component to enable the chat bot functionality and point it to our vLLM endpoint:
First, the Kustomize configuration that applies the patch:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../../base-application/ui
patches:
  - path: deployment.yaml
```
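If you want to inspect what the patch produces before applying anything, kustomize can render the manifests locally; the path here is hypothetical and should point at the directory containing this `kustomization.yaml`:

```bash
# Render the patched manifests without applying them (path is hypothetical)
kubectl kustomize ~/environment/eks-workshop/modules/aiml/chatbot/ui
```

The rendered output should match the `Deployment/ui` manifest shown next.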
The full rendered `Deployment/ui`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/type: app
  name: ui
  namespace: ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: service
      app.kubernetes.io/instance: ui
      app.kubernetes.io/name: ui
  template:
    metadata:
      annotations:
        prometheus.io/path: /actuator/prometheus
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/component: service
        app.kubernetes.io/created-by: eks-workshop
        app.kubernetes.io/instance: ui
        app.kubernetes.io/name: ui
    spec:
      containers:
        - env:
            - name: RETAIL_UI_CHAT_ENABLED
              value: "true"
            - name: RETAIL_UI_CHAT_PROVIDER
              value: openai
            - name: RETAIL_UI_CHAT_MODEL
              value: /models/mistral-7b-v0.3
            - name: RETAIL_UI_CHAT_OPENAI_BASE_URL
              value: http://mistral.vllm:8080
            - name: JAVA_OPTS
              value: -XX:MaxRAMPercentage=75.0 -Djava.security.egd=file:/dev/urandom
            - name: METADATA_KUBERNETES_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: METADATA_KUBERNETES_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: METADATA_KUBERNETES_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          envFrom:
            - configMapRef:
                name: ui
          image: public.ecr.aws/aws-containers/retail-store-sample-ui:1.2.1
          imagePullPolicy: IfNotPresent
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 45
            periodSeconds: 20
          name: ui
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              memory: 1.5Gi
            requests:
              cpu: 250m
              memory: 1.5Gi
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
      securityContext:
        fsGroup: 1000
      serviceAccountName: ui
      volumes:
        - emptyDir:
            medium: Memory
          name: tmp-volume
```
And the diff, showing the environment variables the patch adds:

```diff
         app.kubernetes.io/name: ui
     spec:
       containers:
         - env:
+            - name: RETAIL_UI_CHAT_ENABLED
+              value: "true"
+            - name: RETAIL_UI_CHAT_PROVIDER
+              value: openai
+            - name: RETAIL_UI_CHAT_MODEL
+              value: /models/mistral-7b-v0.3
+            - name: RETAIL_UI_CHAT_OPENAI_BASE_URL
+              value: http://mistral.vllm:8080
             - name: JAVA_OPTS
               value: -XX:MaxRAMPercentage=75.0 -Djava.security.egd=file:/dev/urandom
             - name: METADATA_KUBERNETES_POD_NAME
               valueFrom:
```
This configuration makes the following important changes:
- Enables the chat bot component in the UI
- Configures the application to use the OpenAI provider, which works with vLLM's OpenAI-compatible API
- Specifies the model name, which the OpenAI-compatible endpoint requires in each request
- Sets the endpoint URL to `http://mistral.vllm:8080`, connecting to our Kubernetes Service for the vLLM Deployment (you can verify this endpoint directly, as shown below)
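Before testing through the UI, it can be worth confirming that the endpoint speaks the OpenAI protocol. The sketch below sends a minimal chat completion request from a throwaway pod inside the cluster; the pod name and curl image are arbitrary choices, while the URL and model name come from the configuration above.

```bash
# Launch a temporary pod with curl and send a minimal OpenAI-style chat request
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://mistral.vllm:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "/models/mistral-7b-v0.3",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 50
  }'
```

A `GET` to `/v1/models` on the same base URL lists the model names vLLM is serving, which is useful if the `model` value is in doubt.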
Let's apply these changes to our running application:
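This is a standard `kubectl apply -k` against the directory containing the kustomization above; the path below is hypothetical and depends on where the workshop files are checked out in your environment. You should see output similar to the following:

```bash
# Path is hypothetical: point -k at the directory holding the kustomization.yaml above
kubectl apply -k ~/environment/eks-workshop/modules/aiml/chatbot/ui
```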
```text
namespace/ui unchanged
serviceaccount/ui unchanged
configmap/ui unchanged
service/ui unchanged
deployment.apps/ui configured
```
With these changes applied, the UI will now display a chat interface that connects to our locally deployed language model. In the next section, we'll test this configuration to see our AI-powered chat bot in action.
While the UI is now configured to use the vLLM endpoint, the model needs to be fully loaded before it can respond to requests. If you encounter any delays or errors when testing, this may be because the model is still being initialized.
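One way to check is to look at the vLLM pods and probe the API directly. This sketch assumes the vLLM workload runs in the `vllm` namespace, which the Service DNS name `mistral.vllm` implies:

```bash
# Check that the vLLM pods are Running and Ready (namespace inferred from the Service DNS name)
kubectl get pods -n vllm

# The OpenAI-compatible server typically starts answering only once the model
# weights are loaded, so a successful response here is a good readiness signal
kubectl run curl-check --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://mistral.vllm:8080/v1/models
```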