Site Reliability Engineer

1 hour ago


Bangalore, India super Full time

Site Reliability Engineer (SRE) Level 3

Overview:
A Site Reliability Engineer (SRE) Level 3 is a senior technical leadership role focused on designing, implementing, and maintaining large-scale, complex, and highly reliable systems. This role emphasizes a blend of software and systems engineering to ensure the availability, latency, performance, and capacity of critical services. SREs at this level are passionate about quality, efficiency, and reliability, and they play a crucial role in accelerating innovation and driving continuous improvement.

Key Responsibilities:
- Reliability and System Design:
  - Design and implement reliability patterns for client applications, communication protocols, and back-office services.
  - Build, deploy, tune, and own distributed and resilient systems.
  - Drive the entire lifecycle of a service, from inception and design through deployment, operation, and refinement.
- Operations and Automation:
  - Operate and influence data collection, processing, and delivery systems that are scalable, resilient, and capable of operating at a global scale.
  - Leverage monitoring and observability tools (e.g., Prometheus, Grafana, Datadog) to ensure system health and reliability.
  - Lead automation efforts to reduce toil and maintain system efficiency.
  - Proactively identify opportunities to eliminate toil and automate issue triage to improve overall operational stability.
- Incident Management and Improvement:
  - Lead incident response efforts and participate in on-call rotations.
  - Ensure root cause analysis and drive continuous improvement after incidents.
  - Drive postmortem processes, focusing on identifying and remediating systemic issues to prevent recurrence.
- Collaboration and Leadership:
  - Collaborate with other software engineers, operations, product managers, and executives to design and implement deployment approaches using highly scalable, automated, continuous integration, and continuous delivery pipelines.
  - Work closely with embedded vehicle teams, data engineers, infrastructure engineers, developer experience, and application teams.
  - Proactively promote the adoption of site reliability engineering best practices within the team and organization.
  - Lead technical decision-making, balancing reliability, performance, and cost.

Required Experience and Skills:
- Typically 5-8+ years of combined experience in SRE, software development, or infrastructure engineering.
- Strong experience in building and operating enterprise cloud applications.
- Proficiency with cloud platforms such as AWS, Azure, or GCP, and container orchestration technologies like Kubernetes.
- Familiarity with security practices such as DevSecOps.
- Advanced programming skills in one or more languages (e.g., Python, Java, Go, Scala, C++).
- Experience with Continuous Integration / Continuous Delivery (CI/CD) tools (e.g., ArgoCD, Jenkins, GitLab CI/CD).
- Familiarity with Infrastructure as Code frameworks like Terraform.
- Advanced knowledge of networking (firewalls, DNS, Load Balancing, Proxies) and Linux/Windows operating systems.
- Excellent problem-solving, communication (written and verbal), and interpersonal skills.
- Ability to learn complex systems, identify and mitigate incidents, and work cross-functionally.



  • Bangalore, India Aqilea (formerly Soltia) Full time

    We are a consulting company with a bunch of technology-interested and happy people! We love technology, we love design and we love quality. Our diversity makes us unique and creates an inclusive and welcoming workplace where each individual is highly valued. With us, each individual is her/himself and respects others for who they are and we believe that when a...


  • Bangalore, India Aqilea (formerly Soltia) Full time

    We are a consulting company with a bunch of technology-interested and happy people! We love technology, we love design and we love quality. Our diversity makes us unique and creates an inclusive and welcoming workplace where each individual is highly valued. With us, each individual is her/himself and respects others for who they are and we believe that when...


  • Bangalore, India Progress Full time

    We are Progress (Nasdaq: PRGS) - the trusted provider of software that enables our customers to develop, deploy and manage responsible, AI-powered applications and experiences with agility and ease. We're proud to have a diverse, global team where we value the individual and enrich our culture by considering varied perspectives because we believe people power...


  • Bangalore, India CodeKarma Full time

    Site Reliability Engineer (Multi-Cloud Deployments) Location: Bangalore / Remote Experience: 4–10 years Type: Full-time (6-month probation) About CodeKarma CodeKarma is redefining how engineering teams understand and evolve complex systems — bringing production context directly into the developer’s workflow. Our platform runs both as SaaS and as sub-


  • Bangalore, India Flipkart Full time

    Hiring Site Reliability Engineers. Experience: 2.5+ years (excluding internships). Location: Bangalore. The engineer will work in the Reliability and Productivity Engineering team and is responsible for building industry-standard, large-scale platforms to be used across FK that help to significantly improve the reliability of systems and bring...


  • Bangalore, India Andor Tech Full time

    Hiring!! 🏢 About AndorTech: AndorTech is a global IT services and consulting firm founded in 2009, headquartered in Bangalore. The company specializes in software engineering, AI-enabled IT services, application support, analytics, and test automation. With a presence across India, the USA, Europe, and the UAE, AndorTech partners with Global Capability...


  • Bangalore, India Cyberhaven Full time

    About the role: We're looking for an experienced Site Reliability Engineer to make sure systems are reliable, scalable, and performing well, especially in production environments. Our technology is new and rapidly evolving. As an early member of the team, you'll play a key role in shaping the reliability architecture, building scalable infrastructure, and...


  • Bangalore, India Aqilea (formerly Soltia) Full time

    We are a consulting company with a bunch of technology-interested and happy people! We love technology, we love design and we love quality. Our diversity makes us unique and creates an inclusive and welcoming workplace where each individual is highly valued. With us, each individual is her/himself and respects others for who they are and we believe that when a...


  • Bangalore, India JRD Systems Full time

    Site Reliability Engineer (Windows / Cloud / Automation) Job Summary: We are seeking an experienced Site Reliability Engineer with a strong background in managing Windows infrastructure and cloud environments. The ideal candidate will be responsible for designing, implementing, automating, and maintaining scalable infrastructure solutions across AWS, Azure,...