Senior Staff Site Reliability Engineer
8 hours ago
Job Description
Job Title: Senior Staff Site Reliability Engineer
Location: Bangalore
About Movius
At Movius, we solve a critical gap companies face with employee-to-client communication over voice and messaging. We are the leading global provider of Secure Communication as a Service (SCaaS). Our flagship solution, MultiLine, enhances workflows, resolves compliance gaps and unifies cross-channel messaging. Movius AI-powered solutions enable businesses to build strong and lasting relationships with their customers in a company-owned, controllable system. Welcome to Phone 3.0.
Headquartered in Alpharetta, GA, with offices in Silicon Valley, Bangalore, India, New York, and London, Movius partners with leading global wireless carriers like T-Mobile, Vodafone, TELUS, BT, Singtel & more. To learn more about Movius, visit .
*Your Opportunity*
We are looking for a Senior Staff Site Reliability Engineer (SRE) with strong technical expertise in distributed systems, cloud infrastructure, observability, and automation.
In this role, you will be responsible for improving the reliability, scalability, and performance of our production and pre-production systems. You will work hands-on in designing and implementing SRE frameworks, automating key reliability workflows, and building a culture of operational excellence.
You will also work closely with product engineering, QA, and DevOps teams to define SLOs/SLIs, enhance monitoring and alerting, and strengthen our overall reliability practices.
*What You'll Do*
- Reliability Engineering & Architecture
  - Design and maintain highly available, fault-tolerant systems on AWS.
  - Implement service reliability models based on SLOs, SLIs, and error budgets.
  - Continuously improve system performance, scalability, and resilience.
- Automation & Infrastructure-as-Code (IaC)
  - Build and maintain automation pipelines using Terraform, Ansible, Bitbucket, and Jenkins.
  - Develop reusable IaC modules for multi-account and multi-environment AWS setups.
  - Automate operational processes for provisioning, scaling, monitoring, and recovery.
- Observability & Monitoring
  - Define observability standards and create dashboards using Elastic Stack, Grafana, or Prometheus.
  - Implement intelligent alerting using AIOps and anomaly detection tools.
  - Work with development teams to ensure proper telemetry and trace coverage.
- Incident Management & RCA
  - Lead major incident response and ensure quick service restoration.
  - Conduct blameless post-incident reviews and implement preventive actions.
  - Create and maintain runbooks, escalation matrices, and reliability playbooks.
- Performance & Capacity Planning
  - Analyse performance bottlenecks and propose tuning or optimization strategies.
  - Lead capacity forecasting and ensure the system can handle growth demands.
- Collaboration & Mentorship
  - Partner with development, QA, and DevOps teams to embed SRE principles.
  - Coach and mentor junior engineers on reliability engineering and automation.
- Documentation & Knowledge Management
  - Maintain detailed architecture diagrams, design documents, and operational procedures.
  - Document SLOs, automation workflows, and change management reports.
- Technical Leadership
  - Lead technical discussions, reliability reviews, and performance retrospectives.
  - Promote a code-driven, automation-first reliability culture across teams.
*What You Bring*
Education
- Bachelor's degree in Computer Science, Information Technology, or equivalent experience.
Experience
- 8+ years in SRE or DevOps roles managing large-scale distributed systems.
- Proven hands-on experience in cloud operations (AWS preferred), automation, and CI/CD pipelines.
- Experience in the Telecom domain is an added advantage.
Technical Skills
- Deep knowledge of AWS (EKS, EC2, RDS, IAM, VPC, Kafka, CloudWatch, API Gateway, Lambda, WAF, KMS).
- Strong Linux administration and networking fundamentals.
- Skilled in Terraform, Jenkins, Git, and scripting (Python, Bash).
- Solid understanding of observability tools (Grafana, Elastic Stack, Prometheus).
- Experience with container orchestration (Kubernetes) and microservices-based systems.
Certifications (Preferred)
- AWS Certified DevOps Engineer / Solutions Architect – Associate.
- Terraform Associate or Certified Kubernetes Administrator (CKA).
- SRE Foundation or Google SRE certification is desirable.
*Why Join Movius?*
- Work on a global-scale platform serving enterprise customers.
- Be part of a high-performing, innovation-driven engineering team.
- Competitive pay, benefits, and opportunities for professional growth.
Ready to build the future of reliable, secure, and intelligent communication?