Principal Site Reliability Engineer

1 month ago


Hyderabad, India Vitech Systems Group Full time

About Vitech


V3locity, Vitech’s cloud-native administration, engagement, and analytics platform, is a transformative suite of complementary applications that offers full life cycle business functionality and robust enterprise capabilities. It marries core administration with superior digital experience and augmented analytics. Its modular design enables flexible, agile deployment strategies. V3locity employs an advanced, cloud-native architecture that leverages the unique capabilities of AWS to deliver a solution with unparalleled security, scalability, and resiliency.


The Team


SRE is an engineering approach to building and running production systems: engineering solutions to operational problems. Its practices include limiting time spent on operational work, blameless postmortems, proactive issue identification, and prevention of potential outages. The Vitech SRE team is a global team of talented engineers with T-shaped skills across application, database, network, and system administration, covering both on-prem and public cloud (AWS). They support our critical V3locity client environments and automate routine tasks to reduce toil.


The Role


The Principal SRE will be part of a global team spread across India, the US, and Canada, engaged in transforming how Vitech operates production systems that are critical for our clients in single- and multi-tenant models. The role oversees technology development solutions through new and existing production support processes, infrastructure, and automation. You will be challenged with taking ownership, within an Agile team environment, of exploring, testing, and implementing recent technologies, and with sharing on-call duties. We continuously learn as modern technologies evolve, and the successful Principal SRE will have a growth mindset for acquiring and mastering new skills, paired with a passion to lead others to the forefront of innovation and to mature the SRE function.


You Will Be Responsible For:

  • Manage our technology stack in the cloud (AWS), supported primarily by native AWS services in combination with best-in-class SRE / observability tools.
  • Support multiple client environments running Java-based applications and microservices architectures built on open standards.
  • Implement, monitor, and participate in HA/DR design, reviews, and periodic testing for all live applications.
  • Bring in industry perspectives and best practices to improve production support, reliability, resiliency, and automation.
  • Design, refine, and implement SLIs and SLOs that cover the broad spectrum of SRE: availability, performance, and error budgeting.
  • Set up and refine alerts and dashboards that are meaningful to SRE operations.
  • Enhance current infrastructure as code (IaC) patterns so that they meet security and engineering standards, using one or more technologies (Terraform, CloudFormation, Python, SDK).
  • Contribute to blameless postmortems along with incident management, and own the call to action.


The Experience You Bring:

  • Hands-on experience in an SRE role for critical client-facing applications; able to roll up their sleeves to handle daily SRE tasks, manage incidents, and take responsibility for operational tools
  • 4+ years of experience developing and/or administering software in public cloud
  • 3+ years of experience hosting enterprise applications in AWS (EC2, EBS, ECS/EKS, Elastic Beanstalk, RDS, CloudWatch)
  • Knowledge and understanding of AWS networking concepts, including VPC, VPN/DX/Endpoints, NG Firewalls, Route 53, CloudFront, load balancers, and WAF
  • Knowledge of security and IAM within AWS, including the management and operation of Security Groups, KMS Keys, and SCPs.
  • Deep expertise with an observability platform such as New Relic, Dynatrace, Honeycomb, or Grafana
  • Understanding of OpenTelemetry-based instrumentation, monitoring, and fine-tuning (Refinery)
  • Working experience operating relational databases such as Oracle or PostgreSQL, on-prem or in the cloud
  • Experience with database SRE activities such as backup/restore, Global Database, GoldenGate/DMS replication, DBFS, ADDM reports, and other database troubleshooting
  • Experience administering web and application layers using Oracle WebLogic, Apache Tomcat, AWS Elastic Beanstalk, SSL certificates, and S3 buckets
  • Experience administering container-based apps using Docker, Kubernetes, or ECS
  • Experience with heap dumps/thread dumps to find root cause, analyze application logs/GC and enable Dev teams to identify root cause for production issues
  • Automation / Infrastructure as Code: demonstrated experience working with tools such as Terraform, CloudFormation, Python, Jenkins, and GitHub/Actions to automate operational tasks
  • Understanding of Service Level Objectives (SLOs) and Service Level Indicators (SLIs), along with experience in designing and implementing them.
  • Multiple years of experience with programming languages and runtimes such as Python, Bash, Java, JavaScript, and Node.js
  • Multiple years of system administration experience with Linux/Windows-based applications
  • Strong written/verbal communication, analytical and critical thinking and presentation skills.
  • Able to work in shifts and lead the team technically to manage the tasks/issues that arise in the shift


Vitech Inc. is an equal opportunity employer. We celebrate diversity and are committed to creating an equitable and inclusive environment for all employees.




