Published: 2024-11-22
Senior Software Engineer
Your Role Accountabilities...
· Operating Kubernetes clusters (version upgrades and managing critical Kubernetes components like Karpenter)
· Security vulnerability mitigations (rolling out security patches to Kubernetes nodes) and cost optimizations (scaling of our thousands of servers running hundreds of in-house built services, supporting millions of customers across the globe).
· As a Senior Software Engineer on the team, you will work on Kubernetes cluster operational tasks such as upgrading the Kubernetes version. This includes analyzing what changes in the new version and how it affects the workloads running on each cluster, e.g. deprecated k8s APIs and controller compatibility. You’ll use tools to scan clusters for things that need remediation, define the remediation actions, and execute them. As part of rolling out upgrades to hundreds of clusters, some ranging up to a thousand nodes, you will identify automation opportunities to reduce toil.
· To keep the runtime infrastructure secure, you will roll out security patches to Kubernetes nodes (EC2 AMI updates) on a regular basis (PCI compliance demands that new patches be installed at least every 30 days) and help improve the automation tooling so that patching eventually runs on an automated schedule.
· To keep our runtime cost low, you will take part in cost optimization efforts, such as tweaking the Karpenter node scaling strategies, increasing the bin-packing efficiency of pods on nodes, ensuring the right node family type is used (m, c, r, spot and Graviton/ARM) and identifying over-scaled infrastructure.
· To enable faster time to new markets and onboarding of new tenants to our streaming platform, you’ll work on automating the creation of new Kubernetes clusters together with bootstrapping of critical platform capabilities like service mesh, deployment systems and the observability stack.
Qualifications and Experience...
· At least 1-2 years of Kubernetes experience.
· At least 1-2 years of AWS experience.
· At least 5 years of software development, infrastructure management or operations experience.
· Ability to write code for automation in Python, Bash or Golang.
· Experience designing infrastructure CI/CD pipelines, e.g. Jenkins or GitHub Actions.
· Experience with Infrastructure as Code (IaC), preferably Terraform.
· Familiarity with Helm templating for k8s manifests.
· Understanding of how GitOps tooling like ArgoCD or Flux works.
· Experience rolling out infrastructure changes to production by following a change management workflow.
· Knowledge of which metrics to monitor during a change rollout to identify problems.
· Strong ownership mentality during rollouts: stop or roll back and fix when problems occur, and escalate to management or on-call if you get stuck.
· Strong sense of security, always using least-privilege access and firewall configurations when needed for maintenance.
· Understanding of how running workloads on the Kubernetes clusters may be affected by cluster changes or node rotations.
· Willingness to talk to service development teams and understand their challenges when they report problems during maintenance windows.
· Ability to define and measure KPIs and honor SLAs for infrastructure maintenance.
· Experience with Git and GitHub PR workflows.
· Experience in working with Agile – Sprints, Epics/Stories, Jira.
Apply by email to faraja.makati@wbd.com