Senior Site Reliability Engineer

3 weeks ago


India, IN | Sapaad | Full time

WHO WE ARE

Sapaad is a global leader in unified commerce platforms, delivering world-class software solutions for the food and beverage industry. Our flagship product, also named Sapaad, has achieved remarkable success over the past decade, empowering thousands of F&B businesses across 40+ countries, with many more coming on board each day.


Driven by a passionate team of developers, designers, and product experts, Sapaad is constantly evolving—introducing innovative, industry-defining features that set the benchmark for F&B tech. Headquartered in Singapore, with offices across five countries, Sapaad is backed by seasoned technology veterans with deep expertise in web, mobility, and e-commerce.


JOB OVERVIEW

Sapaad Software Private Limited is seeking a Senior Site Reliability Engineer (SRE) to lead our infrastructure reliability efforts and mentor a growing SRE team.


This is a strategic, hands-on leadership position responsible for ensuring the reliability, scalability, and performance of our global cloud-based restaurant management platform serving thousands of customers worldwide.


As a senior member of our engineering organization, you will take ownership of system availability, drive automation initiatives, and establish SRE best practices across the company. You’ll work at the intersection of development and operations—embedding reliability into every layer of our technology stack while building and leading a team focused on operational excellence.


This role is ideal for an experienced SRE professional who is passionate about building resilient systems at scale, mentoring engineering talent, and shaping the reliability culture of a fast-growing SaaS organization.


WHAT YOU’LL DO

  • Own the reliability, availability, and performance of all production systems supporting our multi-tenant SaaS platform.
  • Define and manage SLIs, SLOs, and error budgets across critical services.
  • Architect and implement highly available, fault-tolerant systems on AWS and Heroku.
  • Proactively monitor and analyze performance to predict capacity needs and prevent issues.
  • Lead incident management and postmortem processes, driving root cause analysis and preventive actions.
  • Develop incident response playbooks, implement chaos engineering, and reduce MTTD and MTTR.
  • Design and implement comprehensive observability solutions—monitoring, logging, and alerting for microservices and distributed systems.
  • Enforce security and compliance standards, including access controls, vulnerability management, and patching.
  • Mentor and lead SRE and infrastructure engineers, driving team growth, knowledge sharing, and operational maturity.
  • Collaborate with development, DevOps, and product teams to embed reliability practices into every stage of the software lifecycle.


YOU’RE A STRONG FIT IF YOU HAVE

  • 5–8 years of experience in SRE, DevOps, or Systems Engineering roles within SaaS or cloud-based environments.
  • 2+ years in a technical leadership or mentoring capacity.
  • Proven experience maintaining large-scale, high-availability systems (99.9%+ uptime).
  • Expertise with AWS (EC2, RDS, S3, ECS/EKS, Lambda) and Heroku.
  • Proficiency in Infrastructure as Code (Terraform, CloudFormation) and containerization (Docker, Kubernetes).
  • Strong scripting and automation skills in Python, Bash, or PowerShell.
  • Experience with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions) and configuration management tools (Chef, Ansible, Puppet).
  • Deep understanding of SRE principles—SLIs, SLOs, toil reduction, blameless postmortems, and incident management frameworks.
  • Hands-on experience with monitoring tools (Prometheus, Grafana, Datadog, New Relic, CloudWatch, ELK).
  • Excellent leadership, analytical, and communication skills with a customer-first mindset.


PREFERRED QUALIFICATIONS

  • AWS Certified Solutions Architect – Associate or Professional certification.
  • Experience with SOC 2, ISO 27001, GDPR, or PCI DSS compliance frameworks.
  • Background in microservices architectures, disaster recovery planning, or cost optimization.
  • Experience in the restaurant, hospitality, or retail technology sectors.

