Lead / Manager - Site Reliability Engineering (SRE)
1 hour ago
Job Summary
The Technical Manager for Site Reliability Engineering (SRE) will lead a remote team of Site Reliability Engineers, ensuring operational excellence and fostering a high-performing team culture. Reporting to the US-based Director of Systems and Security, this role is responsible for overseeing day-to-day operations, technical mentorship, and strategic alignment with the company's goals. The Technical Manager will act as a bridge between the team and senior leadership, ensuring clear communication, efficient issue resolution, and continuous improvement in service delivery.
Job Category
Technology Solutions
Responsibilities:
● Provide leadership and management to a remote team of Site Reliability Engineers, ensuring alignment with organizational priorities and goals.
● Oversee team operations, including incident management, technical support, and infrastructure maintenance.
● Act as the primary point of escalation for complex technical issues, collaborating with the Director of Systems and Security and with the Quality Assurance and Product teams as needed.
● Ensure the team adheres to established SLAs for issue resolution and maintains high customer satisfaction levels.
● Mentor and develop team members, fostering growth in technical skills, problem-solving abilities, and customer engagement.
● Lead initiatives to improve operational processes, tools, and workflows, driving greater efficiency and reliability.
● Collaborate with cross-functional teams, including Product, Engineering, and Operations, to address customer needs and improve platform performance.
● Facilitate regular team meetings, performance reviews, and one-on-one sessions to ensure clear communication and ongoing development.
● Maintain and report on key performance metrics, providing insights and recommendations to senior leadership.
● Stay informed on industry trends and best practices, ensuring the team is equipped with the latest tools and methodologies.
● Participate in strategic planning and contribute to the continuous improvement of the SRE function.
Qualifications:
● Proven experience managing technical teams, preferably in Site Reliability Engineering, DevOps, or a related field.
● Strong technical background in cloud computing and infrastructure management, particularly with AWS and Linux-based systems.
● Demonstrated ability to lead and mentor teams in remote and distributed environments.
● Excellent written and spoken English and strong interpersonal skills, with the ability to engage effectively with both technical and non-technical stakeholders.
● Strong problem-solving and decision-making abilities, with a focus on root cause analysis and long-term solutions.
● Experience with automation tools (Terraform, Ansible, CloudFormation) and CI/CD pipelines.
● Familiarity with incident management practices and tools, as well as ticketing systems.
● High attention to detail and a commitment to operational excellence.
● Bachelor's degree in a technical or quantitative science field, or equivalent work experience.
Preferred Qualifications:
● AWS certification (any level).
● 3+ years of experience leading customer-facing technical teams, with a focus on improving service delivery.
● Knowledge of security best practices and governance in cloud environments.
● Strong understanding of networking concepts and system architecture.
Key Attributes:
● Empathetic leader who values collaboration, transparency, and accountability.
● Proactive mindset with a focus on continuous improvement and innovation.
● Ability to prioritize and manage multiple initiatives in a fast-paced environment.
● Strategic thinker who can align team efforts with broader organizational objectives.
● Passion for enabling team growth and fostering a culture of learning and development.
Job Location: Kolkata