Inference Stack

Ingress

This guide details how to create network access to your models deployed in a Kubernetes cluster.

AWS

Prerequisites

Make sure the AWS Load Balancer Controller is installed in your cluster. Amazon Elastic Kubernetes Service (EKS) provides two options for exposing applications, both of which can be configured with the Inference Stack: an Application Load Balancer and a Network Load Balancer.

Application Load Balancer

To provision an Application Load Balancer, add the following values to the Inference Stack Helm chart:

ingress:
  className: alb
  annotations:
    helm.sh/resource-policy: keep
    alb.ingress.kubernetes.io/load-balancer-name: onwards-ingress
    alb.ingress.kubernetes.io/target-type: ip
    # alb.ingress.kubernetes.io/scheme: internet-facing
    # alb.ingress.kubernetes.io/healthcheck-path: /healthz
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix

Note

Add alb.ingress.kubernetes.io/scheme: internet-facing to the annotations section for public access; otherwise the load balancer defaults to internal.
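As an illustration, the following sketch shows the same values with public access and a health check enabled. The /healthz path comes from the commented example above and is an assumption about what your service exposes; example.com is a placeholder host.

```yaml
ingress:
  className: alb
  annotations:
    helm.sh/resource-policy: keep
    alb.ingress.kubernetes.io/load-balancer-name: onwards-ingress
    alb.ingress.kubernetes.io/target-type: ip
    # Public access; omit this line to keep the load balancer internal.
    alb.ingress.kubernetes.io/scheme: internet-facing
    # Assumes the backend answers health checks on /healthz.
    alb.ingress.kubernetes.io/healthcheck-path: /healthz
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
```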

Network Load Balancer

To provision a Network Load Balancer, add the following values to the Inference Stack Helm chart:

onwards:
  loadBalancer:
    enabled: true
    nameOverwrite: onwards-service-loadbalancer
    port: 80
    annotations:
      helm.sh/resource-policy: keep
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
      # service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
      # service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /healthz

Note

Add service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing" to the annotations section for public access; it will default to being internal otherwise.
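A sketch of a public-facing Network Load Balancer with an HTTP health check may help here. For a Service, the AWS Load Balancer Controller reads health check settings from service.beta.kubernetes.io annotations rather than alb.ingress.kubernetes.io ones. The http protocol and the /healthz path are assumptions about the backend.

```yaml
onwards:
  loadBalancer:
    enabled: true
    nameOverwrite: onwards-service-loadbalancer
    port: 80
    annotations:
      helm.sh/resource-policy: keep
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
      service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
      # Public access; omit this line to keep the load balancer internal.
      service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
      # The health check path only applies when the protocol is http or https
      # (the default protocol is tcp). Assumes the backend serves /healthz.
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
      service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/healthz"
```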

GCP

Warning

Any Ingress resource deployed on GKE without the kubernetes.io/ingress.class annotation is interpreted as an external Ingress resource, which deploys an external load balancer for the Service.

External Application Load Balancer

To provision an External Application Load Balancer, add the following values to the Inference Stack Helm chart:

ingress:
  annotations:
    helm.sh/resource-policy: keep
    # kubernetes.io/ingress.class: "gce"
    # kubernetes.io/ingress.global-static-ip-name: <reserved-static-ip-name>
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix

Info

The ingressClassName field is not supported by Google Load Balancers, so there is no need to provide className.
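The sketch below pins the Ingress class explicitly and attaches a reserved address. It assumes a global static external IP address has already been reserved in the project. On GKE the annotation takes the name of that reserved address rather than the address itself; inference-external-ip is a hypothetical name.

```yaml
ingress:
  annotations:
    helm.sh/resource-policy: keep
    # Explicit external class; this is also the behaviour when the annotation is absent.
    kubernetes.io/ingress.class: "gce"
    # Hypothetical name of a reserved global static external IP address.
    kubernetes.io/ingress.global-static-ip-name: "inference-external-ip"
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
```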

Internal Application Load Balancer

To provision an Internal Application Load Balancer, add the following values to the Inference Stack Helm chart:

ingress:
  annotations:
    helm.sh/resource-policy: keep
    kubernetes.io/ingress.class: "gce-internal"
    # kubernetes.io/ingress.regional-static-ip-name: <reserved-static-ip-name>
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
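
For a fixed internal address, a sketch along these lines may be useful. It assumes a regional static internal IP address has already been reserved in the cluster's region and VPC. The annotation takes the name of that reservation; inference-internal-ip is a hypothetical name.

```yaml
ingress:
  annotations:
    helm.sh/resource-policy: keep
    kubernetes.io/ingress.class: "gce-internal"
    # Hypothetical name of a reserved regional static internal IP address.
    kubernetes.io/ingress.regional-static-ip-name: "inference-internal-ip"
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
```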

Azure

Warning

If the service.beta.kubernetes.io/azure-load-balancer-internal annotation is not present, the load balancer is created as external by default.

External Access

To provision an External Load Balancer, add the following values to the Inference Stack Helm chart:

onwards:
  loadBalancer:
    enabled: true
    nameOverwrite: onwards-service-loadbalancer
    port: 80
    annotations:
      helm.sh/resource-policy: keep
      # service.beta.kubernetes.io/azure-load-balancer-ipv4: 10.240.0.25
      # service.beta.kubernetes.io/azure-load-balancer-ipv6: 10.240.0.26

Internal Access

To provision an Internal Load Balancer, add the following values to the Inference Stack Helm chart:

onwards:
  loadBalancer:
    enabled: true
    nameOverwrite: onwards-service-loadbalancer
    port: 80
    annotations:
      helm.sh/resource-policy: keep
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      # service.beta.kubernetes.io/azure-load-balancer-ipv4: 10.240.0.25
      # service.beta.kubernetes.io/azure-load-balancer-ipv6: 10.240.0.26
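
As a sketch of a more constrained internal setup, the values below place the load balancer in a specific subnet and assign it a fixed private address. The subnet name apps-subnet is hypothetical, and the address 10.240.0.25 is only illustrative; it is assumed to be an unused address inside that subnet.

```yaml
onwards:
  loadBalancer:
    enabled: true
    nameOverwrite: onwards-service-loadbalancer
    port: 80
    annotations:
      helm.sh/resource-policy: keep
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"
      # Hypothetical subnet in the cluster's virtual network.
      service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "apps-subnet"
      # Assumed to be a free address within the subnet above.
      service.beta.kubernetes.io/azure-load-balancer-ipv4: 10.240.0.25
```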