Sr Staff SRE
Job Description
Job Title: Senior Staff Site Reliability Engineer
Created: 17-Oct-2025
Department: SRE
Revised:
Job Summary
We are seeking a Senior Staff Site Reliability Engineer (SRE) with deep technical expertise in distributed systems, cloud infrastructure, observability, and automation. The Senior Staff SRE will be responsible for ensuring reliability, scalability, and operational excellence across our production and pre-production environments. This role involves hands-on engineering, designing robust SRE frameworks, and driving large-scale improvements in reliability engineering practices. The candidate will collaborate across teams to define service-level objectives (SLOs), improve incident response processes, and automate end-to-end infrastructure and application reliability workflows.
Key Duties & Responsibilities (in decreasing order of critical emphasis)
1. Reliability Engineering & Architecture:
- Architect and maintain highly available, fault-tolerant systems on AWS.
- Implement service reliability models based on SLOs, SLIs, and error budgets.
- Drive improvements in system performance, scalability, and resilience.
2. Automation & Infrastructure-as-Code (IaC):
- Develop automation pipelines using Terraform, Ansible, Bitbucket, and Jenkins.
- Build reusable IaC modules for multi-account, multi-environment AWS deployments.
- Automate operational processes for scaling, monitoring, and recovery.
3. Observability & Monitoring:
- Define observability standards and implement dashboards using Elastic Stack, Grafana, or Prometheus.
- Integrate alerting with AIOps and anomaly detection tools for proactive issue identification.
- Partner with development teams to ensure telemetry and traceability coverage.
4. Incident Management & Root Cause Analysis:
- Lead high-severity incident response and drive postmortem processes.
- Conduct blameless post-incident reviews and implement reliability improvements.
- Contribute to runbooks, escalation matrices, and reliability playbooks.
5. Performance & Capacity Planning:
- Analyze system bottlenecks, throughput, and latency; propose optimization strategies.
- Lead capacity forecasting and ensure systems meet growth and scaling requirements.
6. Collaboration & Mentorship:
- Partner with development, QA, and DevOps teams to embed SRE principles.
- Mentor junior SREs on reliability best practices, automation, and cloud design.
7. Documentation:
- Maintain detailed design documents, architectural diagrams, and reliability procedures.
- Document automation workflows, SLO reports, and operational standards.
8. Technical Leadership & Mentorship:
- Provide deep technical mentorship to SRE and DevOps engineers.
- Lead technical discussions, reliability reviews, and performance retrospectives.
- Advocate an automation-first, code-driven reliability culture within the organization.
9. Documentation & Knowledge Management:
- Maintain architecture blueprints, reliability engineering standards, and automation design documents.
- Establish templates for SLO documentation, postmortems, and change management reports.
Qualifications/Skills/Abilities
Minimum Requirements
Formal Education
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
Experience (type & duration)
- 8+ years in DevOps/SRE roles, with experience in large-scale distributed systems.
- Proven background in cloud operations (AWS preferred), automation, and CI/CD.
- Telecom domain experience is a plus.
Skills
- Deep expertise in AWS (EKS, EC2, RDS, IAM, VPC, Kafka, CloudWatch, API Gateway, Lambda, WAF, KMS) and container orchestration (EKS).
- Strong skills in Linux administration and networking.
- Hands-on with Terraform, Jenkins, Git, and scripting (Python, Bash).
- In-depth understanding of observability frameworks (Grafana, Elastic Stack, Prometheus).
- Experience with container orchestration (Kubernetes) and microservices reliability.
Accreditation/certifications/licenses
- AWS Certified DevOps Engineer / Solutions Architect Associate (preferred).
- Terraform Associate or Certified Kubernetes Administrator (CKA) is a plus.
- SRE Foundation or Google SRE certification desirable.