Site Reliability Engineer
2 days ago
5 - 7 Years
5 Openings
Trivandrum
Role description
UST Global is seeking a highly skilled Site Reliability Engineer (SRE) to work with one of the leading financial services organizations in the US. This role involves managing the end-to-end application and system stack, ensuring high reliability, scalability, and performance of distributed systems. As an SRE, you will combine software engineering and systems engineering to build and operate large-scale, fault-tolerant production environments.
Key Responsibilities
- Engage in and improve the software development lifecycle, from design and development to deployment, operations, and refinement.
- Design, develop, and maintain large-scale infrastructure, CI/CD automation pipelines, and build tools.
- Influence infrastructure architecture, standards, and methods for highly scalable systems.
- Support services prior to production through infrastructure design, platform development, load testing, capacity planning, and launch reviews.
- Maintain and monitor services in production by tracking key performance indicators (availability, latency, system health).
- Automate scalability, resiliency, and system performance improvements.
- Investigate and resolve performance and reliability issues across large-scale and high-throughput services.
- Collaborate with architects and engineers to ensure applications are scalable, maintainable, and follow DR/HA strategies.
- Create and maintain documentation, runbooks, and operational guides.
- Implement corrective action plans with a focus on sustainable, preventative, and automated solutions.
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
- 8+ years of experience as a Site Reliability Engineer or in a similar role.
- Strong hands-on expertise in Google Cloud Platform (GCP); experience with AWS is a plus.
- Proficiency in DevOps practices, CI/CD pipelines, and build tools (e.g., Jenkins).
- Solid understanding of container orchestration (Docker, Kubernetes).
- Familiarity with configuration management and deployment tools (Chef, Octopus, Puppet, Ansible, SaltStack, etc.).
- Strong cross-functional knowledge of systems, storage, networking, security, and databases.
- Experience operating production environments at scale with focus on availability and latency.
- Excellent communication, collaboration, and problem-solving skills.
- Strong system administration skills on Linux/Windows, with automation and orchestration experience.
- Hands-on with infrastructure as code (Terraform, CloudFormation).
- Proficiency in CI/CD tools and practices.
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
- Passion for automation and eliminating manual toil.
- Experience working in highly secure, regulated, or compliant industries.
- Knowledge of security and compliance best practices.
- Experience in a DevOps culture, thriving in collaborative and fast-paced environments.
GCP, AWS, Jenkins, Kubernetes
About UST
UST is a global digital transformation solutions provider. For more than 20 years, UST has worked side by side with the world's best companies to make a real impact through transformation. Powered by technology, inspired by people and led by purpose, UST partners with their clients from design to operation. With deep domain expertise and a future-proof philosophy, UST embeds innovation and agility into their clients' organizations. With over 30,000 employees in 30 countries, UST builds for boundless impact, touching billions of lives in the process.
-
Senior Site Reliability Engineer
4 weeks ago
Thiruvananthapuram, Kerala, India Zafin Full time
Job Summary
Zafin is seeking a Cloud Site Reliability Engineer II (CSRE II) to lead strategic initiatives in ensuring the reliability, scalability, and performance of our cloud infrastructure and applications. This advanced role requires mastery in cloud technologies, strategic planning, and incident management to drive innovative solutions and operational...
-
Senior Site Reliability Engineer
2 weeks ago
Thiruvananthapuram, Kerala, India Zafin Full time
Job Summary
Zafin is seeking a Cloud Site Reliability Engineer II (CSRE II) to lead strategic initiatives in ensuring the reliability, scalability, and performance of our cloud infrastructure and applications. This advanced role requires mastery in cloud technologies, strategic planning, and incident management to drive innovative solutions and operational...
-
Senior Site Reliability Engineer
3 days ago
Thiruvananthapuram, Kerala, India Zafin Full time US$ 1,50,000 - US$ 2,00,000 per year
Job Summary
Zafin is seeking a Cloud Site Reliability Engineer II (CSRE II) to lead strategic initiatives in ensuring the reliability, scalability, and performance of our cloud infrastructure and applications. This advanced role requires mastery in cloud technologies, strategic planning, and incident management to drive innovative solutions and operational...
-
Senior DevOps/Site Reliability Engineer
4 days ago
Thiruvananthapuram, Kerala, India Scoop Technologies Pvt Ltd Full time
Job Title: Senior DevOps Engineer / Site Reliability Engineer (SRE)
Experience: 5 to 8 Years
Location: Thiruvananthapuram (TVM), Kochi, Chennai
Job Overview: We are seeking a highly skilled Senior DevOps Engineer / Site Reliability Engineer (SRE) with 5 to 8 years of experience to join our fast-paced and technology-driven environment. The ideal candidate will...
-
Senior Site Reliability Expert
2 weeks ago
Thiruvananthapuram, Kerala, India beBeeTechnical Full time ₹ 20,00,000 - ₹ 25,00,000
Job Title
Site Reliability Engineer - Technical Leader and Problem Solver
Key Responsibilities:
- Investigate and resolve high-impact production issues across infrastructure and applications.
- Collaborate with development teams to improve performance, reliability, and architecture of systems.
- Participate in incident response efforts as a technical expert.
- Develop...
-
Senior Site Reliability Engineer
3 days ago
Thiruvananthapuram, Kerala, India Equifax Full time US$ 80,000 - US$ 1,50,000 per year
Trivandrum, India, Technology, Full time, 8/10/2025
Equifax is where you can power your possible. If you want to achieve your true potential, chart new paths, develop new skills, collaborate with bright minds, and make a meaningful impact, we want to hear from you.
Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems...
-
Site Reliability Engineer
6 days ago
Thiruvananthapuram, Kerala, India CareStack™ - Dental Practice Management Full time
Job Location - Trivandrum
Rotational Shifts
Responsibilities:
1. Manage and maintain day-to-day BAU operations, including monitoring system performance, troubleshooting issues, and ensuring high availability.
2. Build infrastructure as code (IAC) patterns that meet security and engineering standards.
3. Build CI/CD pipelines using Octopus, GitLab-CI and...
-
Site Reliability Engineer
1 week ago
Thiruvananthapuram, Kerala, India CareStack™ - Dental Practice Management Full time
Job Location - Trivandrum
Rotational Shifts
Responsibilities:
1. Manage and maintain day-to-day BAU operations, including monitoring system performance, troubleshooting issues, and ensuring high availability.
2. Build infrastructure as code (IAC) patterns that meet security and engineering standards.
3. Build CI/CD pipelines using Octopus, GitLab-CI and...
-
Site Reliability Engineer
3 days ago
Thiruvananthapuram, Kerala, India CareStack™ - Dental Practice Management Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Job Location - Trivandrum
Rotational Shifts
Responsibilities:
- Manage and maintain day-to-day BAU operations, including monitoring system performance, troubleshooting issues, and ensuring high availability.
- Build infrastructure as code (IAC) patterns that meet security and engineering standards.
- Build CI/CD pipelines using Octopus, GitLab-CI and cloud-native...