
Senior Site Reliability Engineer
4 weeks ago
Job Summary
Zafin is seeking a Cloud Site Reliability Engineer II (CSRE II) to lead strategic initiatives that ensure the reliability, scalability, and performance of our cloud infrastructure and applications. This advanced role requires mastery of cloud technologies, strategic planning, and incident management to drive innovative solutions and operational excellence.
As a CSRE II, you will influence the direction of cloud reliability strategies, mentor junior engineers, and lead significant projects that have a broad organizational impact. This position reports directly to the VP of Cloud Services and requires a proactive, collaborative mindset to achieve operational and strategic objectives.
Key Responsibilities
- Lead and manage the resolution of complex technical issues involving Zafin's products and Azure cloud environment.
- Design and implement strategic operational enhancements to improve resiliency and system reliability.
- Conduct in-depth Root Cause Analysis (RCA) for high-severity incidents and drive initiatives to reduce error recurrence.
- Represent the organization in external client escalation calls, providing expert guidance and solutions.
- Architect and optimize cloud infrastructure for high performance, scalability, and cost-effectiveness.
- Provide thought leadership in managing and scaling container orchestration platforms such as AKS and OpenShift.
- Oversee the implementation of advanced monitoring solutions and integrate predictive analytics for proactive issue resolution.
- Develop and execute automation strategies to streamline operational workflows and incident responses.
- Create and maintain comprehensive documentation of cloud architectures, processes, and incident management strategies.
- Mentor and coach junior engineers, fostering a culture of continuous learning and innovation.
- Drive strategic initiatives, collaborating with cross-functional teams to achieve organizational objectives.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field (Master's degree preferred).
- 7-12 years of experience in cloud support, operations, or a related role.
- Advanced expertise in Microsoft Azure (preferred) or equivalent cloud platforms.
- Demonstrated experience in designing and scaling container orchestration systems like AKS or OpenShift.
- Proven leadership in managing automated deployment pipelines, including Azure DevOps.
- Mastery of enterprise monitoring platforms (e.g., Azure Insights, Grafana) and predictive analytics tools.
- Advanced scripting skills with PowerShell, Python, or similar languages.
- Extensive experience in incident management and defining SLAs for global production environments.
- In-depth knowledge of database management, particularly Postgres.
Preferred Qualifications
- Advanced certifications in cloud platforms (e.g., Azure Solutions Architect Expert).
- Experience with ITSM tools and processes (e.g., ServiceNow).
- Comprehensive understanding of security and compliance in cloud environments.
Soft Skills
- Exceptional analytical and problem-solving abilities.
- Strong leadership and mentoring skills.
- Advanced communication and collaboration capabilities.
- Visionary approach to operational innovation and strategic planning.