DevJobs

DevOps SRE Engineer

Skills
  • Python
  • Bash
  • ML
  • MongoDB
  • Elasticsearch
  • PostgreSQL
  • CI/CD
  • Azure DevOps
  • GitHub Actions
  • AWS
  • GCP
  • Azure
  • Docker
  • Kubernetes
  • Helm
  • Terraform
  • Grafana
  • ArgoCD
  • CloudFormation
  • EKS
  • GitLab CI
  • GKE
  • Kibana
  • OpenShift
  • Prometheus
  • AI

DeepKeep is seeking a DevOps SRE Engineer to join its DevOps team. This role focuses on maintaining cloud infrastructure, deployment automation, and CI/CD workflows across AWS and GCP. The successful candidate will join a team of experienced DevOps engineers and contribute to Kubernetes cluster maintenance, CI/CD automation, automated testing, and infrastructure optimization. Close collaboration with engineering teams is essential to ensure smooth and secure deployments.

Key Responsibilities and Impact:

  • Lead and execute cloud infrastructure and DevOps operations on AWS and GCP.
  • Oversee client-specific and SaaS deployments, ensuring seamless, secure, and scalable solutions.
  • Develop and optimize CI/CD pipelines, enhancing automation, testing, and deployment processes.
  • Deploy and maintain containerized AI/ML models on Kubernetes (EKS/GKE).
  • Optimize GPU and compute resource allocation for machine learning workloads.
  • Ensure security, monitoring, and compliance in cloud environments.
  • Implement Infrastructure as Code (IaC) using Terraform and Helm.
  • Collaborate with customers and technical partners to maintain live site reliability.
  • Collaborate with engineering and product teams to align DevOps workflows with business objectives.
  • Continuously improve DevOps processes by identifying bottlenecks and implementing optimizations.

Desired Skills and Experience:

  • 3+ years of hands-on experience in DevOps, Site Reliability Engineering (SRE), or cloud engineering.
  • Experience with at least one cloud provider (AWS, GCP, Azure), with strong knowledge of Kubernetes, networking, and cloud security.
  • Experience with CI/CD and GitOps methodologies, utilizing tools such as ArgoCD, GitHub Actions, GitLab CI, or Azure DevOps.
  • Proficiency in Infrastructure as Code (IaC) tools like Terraform or CloudFormation.
  • Strong scripting and automation skills (Python, Bash, etc.).
  • Experience managing containerized workloads with tools such as Docker, Kubernetes, Helm, and OpenShift.
  • Proven ability to troubleshoot and optimize cloud environments for performance and reliability.
  • Experience with observability solutions (Grafana, Prometheus, Kibana, etc.).
  • Ability to work in a fast-paced environment and solve complex DevOps challenges.

Advantageous Skills:

  • Experience with AI/ML workloads (training, optimization, and managing environments for AI models).
  • Experience in customer-facing tasks.
  • Experience with database management (MongoDB, PostgreSQL, Elasticsearch, etc.).

Why Join DeepKeep?

At DeepKeep, you'll be at the forefront of AI security and cloud infrastructure. This hands-on role is ideal for a passionate DevOps or SRE engineer eager to work with highly capable engineers on the latest tech stack.
