Reliability Engineer

4 days ago


Bengaluru, Chennai, Hyderabad, India ti Steps Full time ₹ 15,00,000 - ₹ 25,00,000 per year

About the Role:

We are looking for a highly skilled Reliability Engineer to join our team and help ensure the availability, performance, and scalability of our systems and services. In this role, you will work closely with engineering, DevOps, and product teams to identify and mitigate reliability risks, improve observability, and build systems that are robust, resilient, and self-healing.

Key Responsibilities:

  • Reliability & Performance
      • Ensure high availability, performance, and uptime of critical systems and services.
      • Conduct performance tuning and capacity planning based on system metrics and forecasts.
      • Identify and remove single points of failure across systems and infrastructure.

  • Monitoring & Observability
      • Design, implement, and maintain monitoring, alerting, and telemetry solutions.
      • Define SLIs, SLOs, and SLAs, and establish metrics-driven reliability practices.
      • Conduct root cause analysis (RCA) and postmortems for production incidents.

  • Incident Management
      • Lead response efforts during outages or service degradations.
      • Improve incident management processes and automate alerting and response.
      • Participate in on-call rotations and ensure rapid recovery and minimal user impact.

  • Automation & Resilience Engineering
      • Build automation for repetitive operational tasks (infrastructure, deployments, maintenance).
      • Implement chaos engineering and fault injection to proactively uncover failure modes.
      • Collaborate with developers to design systems that degrade gracefully under stress.

Preferred Qualifications:

  • Experience with infrastructure-as-code tools (Terraform, Ansible, CloudFormation).
  • Familiarity with service meshes and with chaos engineering tools (e.g., Gremlin, Chaos Monkey).
  • Background in designing highly available, scalable, and fault-tolerant systems.
  • Experience working in regulated or high-compliance environments (e.g., fintech, healthcare).
  • Certifications: AWS/GCP/Azure Certified Engineer, CKA/CKAD, etc.


  • Bengaluru, Chennai, Hyderabad, India ti Steps Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    About the Role: We are seeking a detail-oriented and proactive Software Reliability Engineer (SRE) to join our engineering team. The SRE will be responsible for improving the reliability, scalability, and performance of our software systems by applying software engineering principles to infrastructure and operations. You will work closely with development,...


  • Bengaluru, Chennai, Hyderabad, India ti Steps Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    About the Role: We are seeking a highly motivated and experienced Platform Reliability Engineer (PRE) to ensure the performance, reliability, and scalability of our core platform infrastructure. In this role, you will work at the intersection of software engineering and systems engineering to build resilient systems, automate operational processes, and drive...

  • Reliability engineer

    4 weeks ago


    Chennai, India Supply Chain Resources Group, Inc. Full time

    Responsibilities: Translate product management reliability goals into appropriate testable goals. Perform statistical data analysis, Accelerated Life Testing (ALT) and modeling, and risk assessment. Develop reliability performance metrics and lead management reviews to review progress against those metrics. Drive the failure analysis process for all failures...

  • Reliability engineer

    2 weeks ago


    Chennai, India Supply Chain Resources Group, Inc. Full time

    Responsibilities: Translate product management reliability goals into appropriate testable goals. Perform statistical data analysis, Accelerated Life Testing (ALT) and modeling, and risk assessment. Develop reliability performance metrics and lead management reviews to review progress against those metrics. Drive the failure analysis process for all...


  • Chennai, Hyderabad, India Glomatriz Technologies Full time ₹ 2,50,000 - ₹ 7,50,000 per year

    We are seeking a Site Reliability Engineer with 3+ years of experience to ensure system reliability, monitor performance, and implement scalable solutions. The role involves automation, incident management, and collaboration with development teams.

  • Reliability Engineer

    3 weeks ago


    Chennai, India Supply Chain Resources Group, Inc. Full time

    Responsibilities: Translate product management reliability goals into appropriate testable goals. Perform statistical data analysis, Accelerated Life Testing (ALT) and modeling, and risk assessment. Develop reliability performance metrics and lead management reviews to review progress against those metrics. Drive the failure analysis process for all failures...



  • Reliability Engineer

    3 weeks ago


    Chennai, India Supply Chain Resources Group, Inc. Full time

    Responsibilities: Translate product management reliability goals into appropriate testable goals. Perform statistical data analysis, Accelerated Life Testing (ALT) and modeling, and risk assessment. Develop reliability performance metrics and lead management reviews to review progress against those metrics. Drive the failure analysis process for all...

