
While I had a rough idea of where Rancher fits in the ecosystem as a cluster management solution, I hadn’t played with it. Also, Rancher 1.x had its own orchestration engine and so on, and I wasn’t that interested. Fast forward to 2017, and Rancher is putting its weight behind Kubernetes and going all in.
Rancher 2.0 was announced in September last year, followed by a stream of RCs before moving to beta. And then a couple of weeks ago, 2.0 went GA.
So now that I’m in a scenario where we might need Ops teams to manage multiple k8s installations across bare metal and different public clouds, I thought I’d give Rancher a try.
What follows is a quick rundown of what worked, what didn’t, and/or stuff that took some twiddling to get working. Overall, it left me quite gung-ho about Rancher, and I can see its value on the Ops side of the house, but I’m not sure I’d commit to the Rancher specifics from a development/deployment standpoint.
Sticking to `kubectl`, `helm`, and the native k8s resource model is much more portable than dealing with Rancher’s workloads and load balancers. I have no use for another layer of abstraction over native k8s.
I also found the documentation a little behind. For example: Rancher Labs' quick start guide still directs users to the beta, whereas the quick start linked from the home page installs the GA. You’ve been warned.
Setup
- Single box Azure k8s cluster with `rancher/server:preview` worked.
- Deployed nginx on it.
  - nginx was not reachable, and I didn’t see an LB configured in Azure.
  - Turns out I had to configure the firewall to allow the traffic - shouldn’t this have been done automatically?
    - Also, there’s no indication that this bit has to be done by the user - you sort of assume that Rancher has enough info to go do this on its own.
  - Also, no Azure load balancer - I had to configure the traffic rule directly on the worker node’s NSG (rough sketch below).
    - If you end up having multiple workers, then I suppose you have to do this individually for each worker :(
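For reference, the rule I ended up adding by hand looks roughly like this via the Azure CLI - a minimal sketch, where the resource group and NSG names are placeholders for whatever Rancher provisioned in your setup:

```bash
# Placeholder names - substitute your own resource group and worker NSG.
# Opens inbound TCP port 80 on the worker node's network security group
# so the nginx workload is reachable from outside.
az network nsg rule create \
  --resource-group rancher-rg \
  --nsg-name worker-node-nsg \
  --name allow-nginx-http \
  --priority 1000 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --destination-port-ranges 80
```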
- Tried deploying kubernetes-dashboard, but that didn’t work (sketch of what I ran below).
  - After `kubectl proxy`, http://localhost:8001/ui shows a Rancher page.
  - Uninstalled kubernetes-dashboard.
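For context, what I ran was essentially the stock dashboard install followed by a proxy; the manifest URL is the one that was current at the time of writing and may well have moved since:

```bash
# Install the upstream dashboard (manifest URL as of this writing),
# then proxy the API server to localhost.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
kubectl proxy
# Browsing to http://localhost:8001/ui then showed a Rancher page instead.
```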
- Tried logging into the cluster with `kubectl proxy`.
  - Got a logged-out page with reload/logout options. Hitting logout invalidates the token.
  - The kubectl token in the config is invalidated too - so you have to get a new kubectl config again (a quick way to confirm this is sketched below).
  - That doesn’t change the situation with logging into the cluster, though.
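If you hit the same thing, here’s how I’d verify the credentials really are dead before grabbing a fresh kubeconfig from the Rancher UI - a minimal sketch:

```bash
# Show which cluster and user the current context points at.
kubectl config view --minify

# Any authenticated call now fails as Unauthorized until the kubeconfig
# is replaced with a freshly generated one from the Rancher UI.
kubectl get nodes
```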
- Set up a multi-VM Azure k8s cluster.
  - Ran into kubelet failing its health check on Azure.
  - Ditched `preview` and moved to rancher master (`rancher/rancher` - quick-start one-liner below). After that, the cluster came up.
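For the record, the switch amounted to re-running the single-node server container with the other image - essentially the quick-start one-liner (ports as documented at the time):

```bash
# Single-node Rancher server using the GA image instead of the preview one.
docker run -d --restart=unless-stopped \
  -p 80:80 -p 443:443 \
  rancher/rancher
```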
- Nginx workload on the multi-node cluster.
  - Scaled to multiple pods.
  - There’s a facility to set up an Ingress load balancer - but the docs say the L7 LB isn’t supported on Azure.
  - The L4 LB is supported on Azure Container Service (AKS) only (see the sketch after this list).
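Since L4 is AKS-only anyway, the portable route is the native Service object rather than Rancher’s load balancer screen - a minimal sketch, assuming a deployment named `nginx` already exists:

```bash
# Expose the nginx deployment via a cloud L4 load balancer (works on AKS).
kubectl expose deployment nginx --type=LoadBalancer --port=80

# Wait for Azure to provision the external IP.
kubectl get service nginx --watch
```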
- Other stuff I saw
  - If you shut down a worker node from Azure and bring it back up later, Rancher doesn’t seem to fix itself up. Deleting the node and adding a new one works, though.
  - Setting up an AKS cluster works just fine, and I assume this would be the preferred approach for folks setting up a cluster on Azure. AKS also means you can use L4 routing.
  - Importing an existing cluster also works smoothly - I had another AKS cluster, and getting it imported into Rancher was as simple as a `kubectl apply -f` (placeholder one-liner below).
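For reference, the import flow boils down to running the one-liner that the Rancher UI generates when you register the cluster; the server URL and token below are placeholders:

```bash
# Generated by Rancher's "Import Existing Cluster" flow - copy the real
# command from the UI; this URL/token is a placeholder.
kubectl apply -f https://<rancher-server>/v3/import/<generated-token>.yaml
```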
Conclusion
- What’s nice
  - Slick UI - good for non-dev folks who aren’t comfy with the CLI.
  - Provisioning, scaling, and managing the cluster is automated across different clouds.
  - Can set up multiple node pools for different node profiles.
  - Scaling nodes in a cluster is easy.
  - Helm is integrated.
  - Centralized authentication and RBAC, with integration for AD/LDAP and other providers.
- What’s not that great
  - Cloud support can be spotty. YMMV.
    - It took me some time to find out that they don’t support an L7 LB on Azure with VMs.
  - Documentation - I’ve already cribbed about this, but I have to say it again.
    - For example: no indication of what Rancher will not do for you on a specific cloud.
    - For example: for Azure, there’s a panel in the UI with about 20-odd params and little explanation about them anywhere.
  - Running into P1s on the first run doesn’t inspire confidence... this is still rough.
    - For example: the issue with not being able to log in to the cluster dashboard.
  - Semantics? What does 'workload' map to? It seems to be either a deployment or a helm chart.
    - Basically, another layer of abstraction means new terms to wrap your head around.
    - OTOH, I never tried Rancher 1.x - so maybe folks who are used to that have an easier path moving to 2.0.
Don’t read too much into my nitpicking above - these are rough edges that’ll get sorted out in point releases. If there’s one killer feature in Rancher, it’s the centralized authentication and RBAC with LDAP/AD, along with the unified cluster management across different k8s clusters.