Principal Site Reliability Engineer

2 weeks ago


Hyderabad, India Vitech Systems Group Full time

About Vitech


V3locity, Vitech’s cloud-native administration, engagement, and analytics platform, is a transformative suite of complementary applications that offers full life cycle business functionality and robust enterprise capabilities. It marries core administration with superior digital experience and augmented analytics. Its modular design enables flexible, agile deployment strategies. V3locity employs an advanced, cloud-native architecture that leverages the unique capabilities of AWS to deliver a solution with unparalleled security, scalability, and resiliency.


Principal SRE – Join Our Global Engineering Team

We believe that excellence in production systems starts with engineering-driven solutions to operational challenges. Our Site Reliability Engineering (SRE) team is at the heart of ensuring seamless performance for our clients, preventing potential outages, and proactively identifying and resolving issues before they arise.

Our SRE team is a diverse group of talented engineers across India, the US, and Canada. We have T-shaped expertise spanning application development, database management, networking, and system administration across both on-premises environments and the AWS cloud. Together, we support mission-critical client environments and drive automation to reduce manual toil, freeing our team to focus on innovation.


About the Role: Principal SRE

As a Principal SRE, you’ll be a key player in revolutionizing how we operate production systems for single- and multi-tenant environments. You’ll lead technology initiatives, streamline production processes, and drive infrastructure automation. Working in an Agile team environment, you’ll have the opportunity to explore and implement the latest technologies, take part in on-call duties, and keep learning in an ever-evolving tech landscape.

If you’re passionate about staying ahead of the curve and inspiring others to lead the charge in SRE, this role is for you.


What You’ll Do:

  • Own and manage our AWS cloud-based technology stack, using native AWS services and top-tier SRE tools to support multiple client environments running Java-based applications on a microservices architecture.
  • Design, implement, and monitor HA/DR strategies, reviewing and testing live applications for reliability.
  • Introduce best practices from the industry to enhance our production support, resiliency, and automation.
  • Develop and refine SLIs and SLOs focused on availability, performance, and error budgeting.
  • Create meaningful alerts and dashboards to support SRE operations.
  • Enhance infrastructure as code (IaC) patterns using technologies like Terraform, CloudFormation, Python, and SDKs.
  • Lead incident management, drive blameless postmortems, and take ownership of follow-up actions.


The Skills and Experience You Bring:

  • Proven hands-on experience as an SRE for critical, client-facing applications, with the ability to dive deep into daily SRE tasks, manage incidents, and oversee operational tools.
  • 4+ years of experience developing and/or managing software in a public cloud environment.
  • 3+ years of experience hosting enterprise applications in AWS (EC2, EBS, ECS/EKS, Elastic Beanstalk, RDS, CloudWatch).
  • Strong understanding of AWS networking concepts (VPC, VPN/DX/Endpoints, Route53, CloudFront, Load Balancers, WAF).
  • Expertise in AWS security and IAM management (Security Groups, KMS Keys, SCPs).
  • In-depth experience with observability platforms (New Relic, Dynatrace, Honeycomb, Grafana) and OpenTelemetry-based monitoring.
  • Experience managing relational databases (Oracle and/or PostgreSQL) in both cloud and on-prem environments, including SRE tasks like backup/restore and replication.
  • Hands-on experience with web/application layers (Oracle WebLogic, Apache Tomcat, AWS Elastic Beanstalk, SSL certificates, S3 buckets).
  • Experience with containerized applications (Docker, Kubernetes, ECS).
  • Proficiency in analyzing application logs and GC behavior, and in conducting root cause analysis for production issues.
  • Automation experience with Infrastructure as Code (Terraform, CloudFormation, Python, Jenkins, GitHub/Actions).
  • Experience designing and implementing SLIs/SLOs.
  • Programming skills in Python, Bash, Java, JavaScript, Node.js.
  • Strong system administration skills in both Linux and Windows environments.
  • Excellent written/verbal communication, critical thinking, and leadership abilities.
  • Willingness to work in shifts and lead your team to resolve issues efficiently.


Why Join Us?

If you thrive in a dynamic environment and are eager to drive innovation in SRE practices, we want to hear from you.

  • At Vitech, you’ll be part of a forward-thinking team that values collaboration, innovation, and continuous improvement. We provide a supportive and inclusive environment where you can grow as a leader while helping shape the future of our organization.


Vitech Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an equitable and inclusive environment for all employees.


