Senior Site Reliability Engineer

2 days ago


Hyderabad, Telangana, India Options Executive Search Private Limited Full time

Job Title : SRE Lead Engineer.

Location : Hyderabad, India.

We are seeking a DevOps / SRE Lead Engineer to architect and scale our client's multi-tenant SaaS platform with AI/ML at the core.

Our client, a fast-growing AI-powered SaaS company in the FinTech space, is looking for a Site Reliability Engineering (SRE) Lead Engineer to join their dynamic team.

This is an opportunity to design and operate large-scale SaaS systems that integrate cutting-edge AI/ML capabilities.

About the Role :

As the SRE Lead Engineer, you will be responsible for architecting, building, and maintaining infrastructure that powers a multi-tenant SaaS platform.

You'll drive reliability, scalability, and security while supporting AI/ML pipelines in production.

This is a hands-on role with significant ownership, requiring both technical depth and leadership in site reliability practices.

Key Responsibilities :

- Architect, design, and deploy end-to-end infrastructure for large-scale, microservices-based SaaS platforms.

- Ensure system reliability, scalability, and security for AI/ML model integrations and data pipelines.

- Automate environment provisioning and management using Terraform in AWS (EKS-focused).

- Implement full-stack observability across applications, networks, and operating systems.

- Lead incident management and participate in a 24/7 on-call rotation.

- Optimize SaaS reliability while enabling REST APIs, SSO integrations (Okta/Auth0), and cloud data services (RDS/MySQL, Elasticsearch).

- Define and maintain backup and disaster recovery for critical workloads.

Required Skills & Experience :

- 8+ years in SRE/DevOps roles, managing enterprise SaaS applications in production.

- Minimum 1 year of experience with AI/ML infrastructure or model-serving environments.

- Strong expertise in AWS cloud, particularly EKS, container orchestration, and Kubernetes.

- Hands-on experience with Infrastructure as Code (Terraform), Docker, and scripting (Python, Bash).

- Solid Linux OS and networking fundamentals.

- Experience in monitoring and observability with ELK, CloudWatch, or similar tools.

- Strong track record with microservices, REST APIs, SSO, and cloud databases.

Nice-to-Have Skills :

- Experience with MLOps and AI/ML pipeline observability.

- Cost optimization and security hardening in multi-tenant SaaS.

- Prior exposure to FinTech or enterprise finance solutions.

Qualifications :

- Bachelor's degree in Computer Science, Engineering, or a related discipline.

- AWS Certified Solutions Architect (strongly preferred).

- Experience in early-stage or high-growth startups is an advantage.

Why Join?

- Be at the forefront of AI/ML-powered SaaS innovation in FinTech.

- Work with a high-energy, entrepreneurial team building next-gen infrastructure.

- Take ownership of mission-critical reliability challenges.

- Grow your career in an environment that values impact, adaptability, and innovation.

(ref:hirist.tech)
