Role Overview
As our Senior SRE, you will ensure system reliability and scalability by building automated, cost-optimized environments with Infrastructure as Code (IaC), balancing high performance with financial efficiency. You will optimize CI/CD pipelines for fast, consistent releases, own monitoring and observability, and define critical SLAs and SLOs. To keep systems stable, you will lead Root Cause Analysis (RCA) to identify and mitigate underlying failures. You will work alongside the Machine Learning, Engineering, and Pre-sales teams as one unified team to ensure the smooth delivery of our work.
In this role, you will get to:
- Automate infrastructure, configuration, and deployments using IaC to ensure easy maintenance, consistency, and scalability.
- Ensure system availability and performance through continuous monitoring and proactive maintenance.
- Implement and manage SLAs to drive system efficiency and exceed customer expectations.
- Optimize cloud resource utilization to balance high performance with cost-efficiency.
- Design and optimize CI/CD pipelines to ensure fast, reliable, and automated software releases.
- Develop runbooks and procedures for streamlined incident response and routine maintenance.
- Collaborate with engineering teams to resolve production issues and implement reliability best practices.
- Lead incident response and post-mortems to identify root causes and prevent recurrence.
- Mentor engineers and advocate for DevOps culture to drive technical excellence across the organization.
You'll be successful if you have:
- 3–5 years of hands-on experience in designing, building, and maintaining cloud infrastructure and applying SRE practices in large-scale systems.
- Container Expertise: In-depth knowledge of containerization and container orchestration, specifically Docker and Kubernetes, to manage applications at scale.
- Cloud Proficiency: Hands-on experience with cloud components (VMs, Serverless, Storage, Networking) to ensure optimal utilization and cost-effectiveness.
- Advanced Automation & IaC: Strong skills in Terraform, ArgoCD, Helm, and GitLab CI.
- Observability: Experience with monitoring stacks (specifically Prometheus/Grafana) and defining measurable SLAs.
- Architectural Mindset: Ability to scope projects, define architectures, and select technologies based on project requirements.
- Security-First Approach: A "secure by design" mindset with experience in troubleshooting complex production issues.
- Leadership & Collaboration: Ability to work across teams and mentor junior/mid-level engineers through code reviews and workshops.
- Communication: Excellent verbal and written skills for engaging with both technical and non-technical stakeholders.
It’s a plus if you have:
- Secret Management: Experience with SOPS (Secrets Operations) for managing encrypted secrets within GitOps workflows.
- Pre-sales Experience: Experience gathering customer requirements and estimating project scope.
- Security Expertise: Hands-on experience with security scanning and penetration testing tools such as Nessus, Nikto, or Nmap.
- Certifications: Active cloud provider (AWS/GCP/Azure) or CNCF certifications (CKA/CKAD).
- Service Mesh: Experience utilizing technologies like Istio or Linkerd.
- Advanced Deployments: Implementation of canary or blue/green deployments for controlled and safe releases.
- Multitasking: Ability to effectively manage and deliver on multiple high-priority projects simultaneously.

Be a part of the Engineering team
“Lay the greatest foundation for the best AI solution”
5 steps for interview (depends on the department)
This is your chance to build your career in a growing data-driven industry.
Copyright © 2026 Sertis Co.,Ltd. - All rights reserved.





