Site Reliability Engineering Lead

1 month ago


Bengaluru, Karnataka, India Arcesium Full time
Job Description

The Reliability Engineering team at Arcesium is responsible for ensuring the stability and availability of our mission-critical production systems. We manage incidents to ensure quicker resolution and establish business-as-usual processes. The team also builds tools and infrastructure that all development teams use to monitor and troubleshoot.

Key Responsibilities
  • Lead reliability engineering projects to drive them to closure.
  • Write code and perform code reviews to ensure best practices and code quality.
  • Contribute to the design and architecture of the system.
  • Automate processes, improve the observability and availability of the platform, and reduce toil.
  • Supervise a team of SREs to ensure production applications are stable, reliable, and well-documented.
  • Own end-to-end availability and performance of mission-critical services.
  • Analyze and debug complex issues across tiers from frontend to mid-tier to infrastructure.
  • Practice sustainable incident response and blameless postmortems.
Requirements
  • 5 to 9 years of experience handling systems for large-scale production environments.
  • A self-starter who can build, drive, and advocate for SRE solutions.
  • Effective cross-functional collaboration skills to develop tools for secure, scalable, and reliable systems.
  • Solid understanding of SRE concepts like SLAs, SLOs, SLIs, error budgets, MTTR, MTTD, etc.
  • Experience with a variety of tools that help manage, understand, and debug large, complex distributed systems.
  • Good programming experience (Python/Go).
  • Hands-on experience with Kubernetes and Docker.
  • Working knowledge of at least one cloud platform (AWS, Azure, GCP).
  • Experience with monitoring and logging tools (e.g. Datadog, ELK, Prometheus, Grafana).
  • Good knowledge of Unix systems, networking, web technologies, and databases.
  • Expertise in troubleshooting issues and bugs.
  • Incident Management experience coupled with effective communication skills.
  • Experience in the financial domain (desirable).
  • Prior SRE/DevOps experience (desirable).

Arcesium and its affiliates are committed to equal employment opportunity and do not discriminate in employment matters on the basis of race, color, religion, gender, gender identity, pregnancy, national origin, age, military service eligibility, veteran status, sexual orientation, marital status, disability, or any other category protected by law.



  • Bengaluru, Karnataka, India Flipkart Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineering Manager to lead our Reliability and Productivity Engineering team at Flipkart. As a key member of our SRE organization, you will be responsible for overseeing the end-to-end development process, from ideation to deployment, ensuring the delivery of high-quality, scalable, and...


  • Bengaluru, Karnataka, India Flipkart Full time

    About the Role: As a Site Reliability Engineering Manager at Flipkart, you will be responsible for leading a team of skilled engineers in optimizing search functionalities and driving innovation. As a Site Reliability Engineering Manager, you will oversee the end-to-end development process, from ideation to deployment, ensuring the delivery of high-quality,...


  • Bengaluru, Karnataka, India Squareroot Consulting Pvt Ltd. Full time

    About the Role: At Squareroot Consulting Pvt Ltd., we are seeking a highly skilled Site Reliability Engineer to lead our infrastructure efforts in data privacy. As a key member of our team, you will be responsible for designing, implementing, and maintaining secure and scalable infrastructure as a service. Your expertise in DevOps and SRE practices will be...


  • Bengaluru, Karnataka, India myGwork Full time

    About the Role: We are seeking an experienced Site Reliability Engineering (SRE) lead to join our team at American Express, an inclusive employer and a member of myGwork. As an SRE lead, you will play a crucial role in driving the reliability, performance, and scalability of our GRC technology solutions. Key Responsibilities: Develop and implement a...


  • Bengaluru, Karnataka, India ITC Infotech Full time

    Job Title: Site Reliability Engineer. At ITC Infotech, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our systems and infrastructure. Key Responsibilities: Collaboration and Partnership: Partner with application developers and...


  • Bengaluru, Karnataka, India TEKsystems Global Services in India Full time

    Job Title: Site Reliability Engineer. TEKsystems Global Services in India is seeking a highly skilled Site Reliability Engineer to join our team. About the Role: We are looking for a talented individual with a strong background in monitoring and observability tools, automation, and scripting languages. The ideal candidate will have a proven track record in...


  • Bengaluru, Karnataka, India Accolite Full time

    Site Reliability Engineering Leadership Opportunity: Accolite is seeking a seasoned Site Reliability Engineering (SRE) expert to serve as the SRE Lead / Architect. This role is based in Bangalore and requires 10+ years of experience. Key Responsibilities: Lead the SRE team and guide them towards excellence in SRE. Collaborate with stakeholders to define KPIs,...


  • Bengaluru, Karnataka, India NexionPro Services Full time

    Job Title: Site Reliability Engineering Practice Head. Location: Pune & Bengaluru. Experience: 20+ Years. Employment Type: Full-time. We are seeking an experienced Site Reliability Engineering (SRE) Practice Head to lead and manage our SRE function. Based in Pune and Bengaluru, this senior leadership role will involve building and scaling the SRE teams, ensuring...


  • Bengaluru, Karnataka, India Cisco Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Cisco. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our cloud services. Key Responsibilities: Design and implement scalable and efficient cloud infrastructure solutions. Collaborate with cross-functional...


  • Bengaluru, Karnataka, India Synechron Full time

    Role Overview: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Synechron. As a key member of our technical team, you will be responsible for ensuring the reliability and scalability of our applications and infrastructure. Key Responsibilities: Conduct performance testing and optimization for applications and...


  • Bengaluru, Karnataka, India CAPCO Full time

    Job Opportunity for a Highly Skilled Site Reliability Engineer: We are seeking a highly skilled and experienced Site Reliability Engineer to join our team at CAPCO. As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Provide pro-active...


  • Bengaluru, Karnataka, India Microsoft Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Microsoft. As a Sr. SRE, you will be responsible for investigating and solving complex technical issues related to Windows systems and infrastructure. Key Responsibilities: Demonstrate strong interpersonal and communication skills to collaborate with Engineering...


  • Bengaluru, Karnataka, India Clear Ventures Full time

    About this Role: We're seeking a highly skilled Site Reliability Engineer to join our team at Toast. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our restaurant platform. Key Responsibilities: Design, implement, and evolve a world-class observability technology stack to detect issues and enable root cause...


  • Bengaluru, Karnataka, India Grizmo Labs 🌐 Full time

    Requirements: At Grizmo Labs, we're looking for an experienced Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain scalable and highly available systems on cloud platforms...


  • Bengaluru, Karnataka, India Neptune Retail Solutions Full time

    About Us: Neptune Retail Solutions is a leading digital media and promotions technology company that creates cohesive omnichannel brand-building and sales-driving opportunities for advertisers, retailers, and consumers. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be...


  • Bengaluru, Karnataka, India WELLS FARGO BANK Full time

    Lead Site Reliability Engineer Job Description: At Wells Fargo Bank, we are seeking a highly skilled Lead Site Reliability Engineer to join our team. This role will be responsible for leading the implementation of site reliability engineering capabilities, driving technology transformation, and adoption of SRE aligned enterprise capabilities and products. Key...


  • Bengaluru, Karnataka, India Thomson Reuters Full time

    About the Role: In this dynamic opportunity as Site Reliability Engineering Manager, you will be responsible for leading a team of experts in delivering reliable 24x7 infrastructure and application operations that meet business expectations across the application portfolio. Key Responsibilities: Develop and implement strategies to deliver operational readiness...


  • Bengaluru, Karnataka, India Tata Consultancy Services Full time

    Job Title: Senior Azure Site Reliability Engineer. About the Role: We are seeking a highly skilled Senior Azure Site Reliability Engineer to join our team at Tata Consultancy Services. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our cloud infrastructure. Key Responsibilities: Collaborate with application...


  • Bengaluru, Karnataka, India Okta, Inc. Full time

    Be Part of a High-Performing Team: Okta, Inc. is looking for a Site Reliability Engineering Leader to join our Technical Operations team. As a leader in this role, you will be responsible for mentoring, managing, and leading a team of SREs with a broad range of expertise and experience. About the Role: As a Site Reliability Engineering Leader, you will be an...


  • Bengaluru, Karnataka, India Okta, Inc. Full time

    About This Role: We are seeking a highly motivated and experienced Site Reliability Engineer to join our team at Okta. As a Site Reliability Engineer, you will be responsible for designing, building, running, and monitoring Okta's production infrastructure. Key Responsibilities: Designing and implementing scalable and reliable infrastructure...