Data & AI · Posted 1 day ago

Senior Site Reliability Engineer

Databricks
Data & AI · Series C+

The role

Keep the lakehouse running. SRE for the Unity Catalog control plane that serves thousands of enterprise customers.

Databricks is the lakehouse company: a Series C+ business with tens of thousands of enterprise customers.

What you'll do

  • Operate the Unity Catalog control plane across regions
  • Drive reliability investments: SLOs, error budgets, incident review
  • Build automation that eliminates repetitive toil

You should have

  • 7+ years operating large distributed systems in production
  • Fluency in at least one major cloud (AWS, GCP, or Azure)
  • Strong programming skills (Python, Go, or similar)

Nice to have

  • Data infrastructure background
  • Apache Spark internals

The stack

AWS · Terraform · Kubernetes · Observability · Python

How they hire

Step 01 · Phone screen · 30 min
Step 02 · Onsite loop · 5 × 45 min
Step 03 · Total cycle · 10–16 days
