Starting with a new technology is never easy, especially when the technology itself is new. I recently had the chance to use KFServing. There were some bumps along the way, but with help from the KFServing team, I was able to stand up a model as a REST API and test it.
KFServing serves machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks such as TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX, with autoscaling, scale-to-zero, and canary rollout features. The basic sample exposes a model directly as a REST API (serverless model serving; you don't need to build an image).
For the setup, I installed Kubeflow 1.0-RC on GKE, then installed KFServing as follows.
Install KFServing
export TAG=0.2.2
kubectl apply -f ./install/$TAG/kfserving.yaml
Steps
1. The InferenceService cannot be deployed in a namespace that carries the control-plane label; otherwise the storage-initializer init container does not get injected into the deployment. For example, the kubeflow namespace has the control-plane label.
2. With the latest Kubeflow install, there is a bug in KFServing: Knative Serving sends probes to the Istio ingress gateway, but they are blocked by the ingress-jwt policy. The workaround is to exclude the probe paths:
kubectl edit policies ingress-jwt -n istio-system
trigger_rules:
- excluded_paths:
  - prefix: /v1/models
  - exact: /
https://github.com/kubeflow/kfserving/issues/668
3. By default, it only works in the default namespace. The workaround is to turn off Istio RBAC for the custom namespace using the following YAML (note that ClusterRbacConfig is deprecated in favor of AuthorizationPolicy):
apiVersion: "rbac.istio.io/v1alpha1"
kind: ClusterRbacConfig
metadata:
  name: default
  namespace: istio-system
spec:
  mode: ON_WITH_INCLUSION
  inclusion:
    namespaces: ["istio-system"]
4. The RC yaml has a bug in the ingress gateway configuration. Edit the inferenceservice-config ConfigMap:
kubectl edit cm inferenceservice-config -n kfserving-system
https://github.com/kubeflow/kfserving/issues/629
and change
"ingressGateway" : "knative-ingress-gateway.knative-serving"
to
"ingressGateway" : "kubeflow-gateway.kubeflow"
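For context, after the edit the ingress section of the inferenceservice-config ConfigMap looked roughly like the sketch below; the ingressService value is an assumption from a default Istio install and may differ in your cluster:

```yaml
data:
  ingress: |-
    {
      "ingressGateway": "kubeflow-gateway.kubeflow",
      "ingressService": "istio-ingressgateway.istio-system.svc.cluster.local"
    }
```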
Create an InferenceService from a model
kubectl apply -f docs/samples/sklearn/sklearn.yaml -n <custom namespace>
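For reference, the sklearn.yaml sample is roughly the following v1alpha2 InferenceService; the exact storageUri may differ in your checkout of the repo:

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  default:
    predictor:
      sklearn:
        storageUri: "gs://kfserving-samples/models/sklearn/iris"
```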
Wait for the following to show READY:
kubectl get inferenceservices sklearn-iris -n <custom namespace>
Test inference: port-forward the Istio ingress gateway, then send a request:
kubectl port-forward --namespace istio-system $(kubectl get pod --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}') 8080:80
curl -v -H "Host: sklearn-iris.<custom namespace>.example.com" http://localhost:8080/v1/models/sklearn-iris:predict -d @./docs/samples/sklearn/iris-input.json
Result
{“predictions”: [1, 1]}
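The curl call can also be scripted. Below is a minimal Python sketch: it assumes the port-forward from the previous step is running, keeps the same placeholder Host header, and uses a payload mirroring the shape of iris-input.json (the feature values are illustrative):

```python
import json
import urllib.request

# Two iris samples (sepal length/width, petal length/width), mirroring the
# shape of docs/samples/sklearn/iris-input.json; "instances" is the V1
# prediction payload key.
payload = {"instances": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]}
body = json.dumps(payload).encode()

# The Host header is what routes the request through the Istio ingress
# gateway to the right InferenceService; <custom namespace> is a placeholder.
req = urllib.request.Request(
    "http://localhost:8080/v1/models/sklearn-iris:predict",
    data=body,
    headers={"Host": "sklearn-iris.<custom namespace>.example.com"},
    method="POST",
)

# With the port-forward running, this would send the request:
# predictions = json.loads(urllib.request.urlopen(req).read())["predictions"]

# Parsing a response such as the one shown above:
predictions = json.loads('{"predictions": [1, 1]}')["predictions"]
print(predictions)  # -> [1, 1]
```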