Senior DevOps Engineer, Nutanix Kubernetes & AI Platform
Vision Unlimited · Abu Dhabi, United Arab Emirates
Department: AI Delivery Engineering
Reports To: Head of AI Infrastructure or DevOps Manager
Location: On-site
Employment Type: Full-time
About the Role
We are seeking a highly skilled and experienced Senior DevOps Engineer to design, deploy, manage, and optimize Nutanix-based Kubernetes (K8s) infrastructure that powers our enterprise AI delivery platform. In this critical role, you will own the end-to-end lifecycle of Kubernetes clusters on Nutanix, from architecture and provisioning to monitoring, scaling, and security, enabling rapid delivery of AI services, machine learning pipelines, and intelligent applications.
You will work closely with AI/ML engineers, data scientists, and platform teams to ensure our infrastructure is resilient, scalable, secure, and optimized for AI/ML workloads (e.g., training, inference, real-time analytics).
Key Responsibilities
- End-to-End Kubernetes Platform Ownership: Design, deploy, manage, and maintain production-grade Kubernetes clusters on Nutanix Karbon (or native K8s on Nutanix AHV), ensuring high availability, performance, and security.
- AI/ML Infrastructure Architecture: Architect and implement scalable, cost-efficient infrastructure tailored for AI workloads, including GPU orchestration, distributed training, model serving, and data-intensive pipelines.
- Infrastructure as Code (IaC): Automate provisioning and configuration of Nutanix K8s environments using Terraform, Ansible, Helm, and GitOps workflows (e.g., ArgoCD/Flux).
- CI/CD for AI Services: Build and maintain secure, efficient CI/CD pipelines for deploying AI microservices, model endpoints, and data processing jobs into K8s environments.
- Observability & SRE Practices: Implement comprehensive monitoring, logging, and alerting (using Prometheus, Grafana, ELK, OpenTelemetry, etc.) with SLO/SLI tracking for AI platform reliability.
- Security & Compliance: Enforce zero-trust networking, RBAC, pod security standards, image scanning, and secrets management (e.g., HashiCorp Vault) aligned with enterprise security standards.
- Performance Optimization: Tune K8s scheduling, storage (Nutanix Files/Objects), networking (CNI), and resource allocation (CPU/GPU/memory) for AI/ML workloads.
- Collaboration & Enablement: Partner with AI/ML engineers to onboard models and services onto the platform; document best practices and provide self-service tooling.
- Disaster Recovery & Backup: Implement and test backup/recovery strategies for K8s workloads and persistent data using Nutanix-native or third-party tools (e.g., Velero).
Required Qualifications
- 5+ years of DevOps/SRE experience, including 3+ years focused on Kubernetes in production environments.
- Deep hands-on experience with Nutanix (AHV, Prism, Karbon, Files, Objects) and managing K8s on-prem or hybrid.
- Proven track record designing and operating AI/ML infrastructure (e.g., Kubeflow, MLflow, Seldon, KServe, Ray).
- Expertise in Infrastructure as Code: Terraform, Helm, Ansible, GitOps.
- Strong scripting/automation skills (Python, Bash, Go).
- Experience with GPU orchestration (NVIDIA device plugins, MIG, CUDA) in K8s.
- Solid understanding of networking, storage, and security in K8s (CNI, CSI, RBAC, OPA/Gatekeeper).
- Familiarity with CI/CD tools (GitLab CI, Jenkins, GitHub Actions) and artifact management (Harbor, JFrog).
- Experience with observability stacks (Prometheus, Grafana, Loki, Tempo, OpenTelemetry).
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
Preferred Qualifications
- Nutanix certifications (e.g., NCP-MCI, NCP-DS).
- CNCF certifications (CKA, CKAD, CKS).
- Experience with multi-cluster management (Rancher, Anthos, OpenShift).
- Knowledge of MLOps practices and tools (MLflow, TFX, Kubeflow Pipelines).
- Experience in regulated industries (finance, healthcare) with compliance needs (SOC 2, HIPAA, GDPR).
Why Join Us?
- Lead the infrastructure backbone for cutting-edge AI products used by millions.
- Work with a world-class team of AI researchers, engineers, and product innovators.
- Shape the future of on-prem/cloud-hybrid AI infrastructure at scale.
- Competitive compensation, equity, and benefits.
About the Employer

Vision Unlimited was founded in May 2003 and has grown to become the largest recruitment organization in the territory of Punjab, Haryana, Himachal, J & K, MP, Rajasthan and UP. The organization has delivered sustained top-class performance and set high standards of service. Nearly all the big names in the industry look to us when sourcing manpower from these territories, as do aspiring candidates seeking direction for their futures and careers.