Senior SRE Engineer - Cloud Operations

Qdrant
arbeitnow
Remote
Berlin
professional / experienced
Posted 2 days ago

Skills & Technologies

Remote
Engineering

Job Description

Qdrant is a cutting-edge vector database company on a mission to revolutionize how organizations manage and query unstructured data. Our open-source engine and managed cloud solutions power AI-driven search, recommendation, and data discovery at scale. We are a remote-first company, building a global team of passionate engineers to push the boundaries of database infrastructure.

As a Senior DevOps / SRE Engineer on the Cloud Operations team, you will focus on keeping Qdrant Cloud reliable, observable, and secure as usage and infrastructure complexity grow. Your primary responsibility is operational excellence: stability, incident response, and continuous improvement of production systems.

This role is operations-heavy and suits engineers who thrive on owning reliability and reducing operational risk at scale.

Tasks

  • Operate and maintain production cloud infrastructure at scale
  • Own Kubernetes infrastructure, networking, and deployment pipelines
  • Improve monitoring, logging, alerting, and operational visibility
  • Lead incident response, root cause analysis, and follow-up actions
  • Reduce operational toil through automation and better tooling
  • Improve reliability, security, and performance of production systems
  • Collaborate closely with Platform and Regions & Clusters teams
  • Maintain and evolve runbooks, operational procedures, and alerts
  • Participate in on-call rotations and continuous reliability improvements

Requirements

Must have

  • 5+ years of experience in DevOps, SRE, or infrastructure operations roles
  • Strong hands-on experience operating Kubernetes in production
  • Solid knowledge of Linux systems, networking, and cloud infrastructure
  • Experience working with AWS, GCP, or Azure
  • Strong understanding of monitoring, alerting, and incident management
  • Experience with infrastructure-as-code and automation tooling
  • Comfortable owning on-call responsibilities and production incidents
  • Strong operational mindset and clear communication skills

Nice to have

  • Experience with Terraform or similar IaC tools
  • Familiarity with Prometheus, Grafana, Loki, or OpenTelemetry
  • Exposure to security, compliance, or hardening initiatives
  • Scripting experience in Python, Bash, or Go
  • Experience in SaaS, cloud, or data infrastructure environments

Benefits

  • Competitive salary, equity, and benefits
  • Fully remote setup with flexible working hours
  • Clear ownership of reliability and operational excellence
  • Opportunity to work on mission-critical customer-facing infrastructure
  • Strong collaboration with platform and engineering teams

If you enjoy keeping complex systems reliable and improving operations through automation and discipline, we'd love to hear from you.

Recruiting Agencies and Headhunters, please only via hirebuffer.com?ref=qdrant
