Lead Site Reliability Engineer
4 weeks ago
Company: Landmark Group
Job Title: SRE Lead (Engineering & Reliability)
Experience: 8-12 years
Job Summary:
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to
oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead,
you will play a pivotal role in establishing and implementing SRE practices, leading a team
of engineers, and driving automation, monitoring, and incident response strategies. This
position combines software engineering and systems engineering expertise to build and
maintain high-performing, reliable systems.
Key Responsibilities:
Reliability & Performance:
• Lead efforts to maintain high availability and reliability of critical services.
• Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.
• Proactively identify and resolve performance bottlenecks and system inefficiencies.
Incident Management & Response:
• Establish and improve incident management processes and on-call rotations.
• Lead incident response and root cause analysis for high-priority outages.
• Drive post-incident reviews and ensure actionable insights are implemented.
Automation & Tooling:
• Develop and implement automated solutions to reduce manual operational tasks.
• Enhance system observability through metrics, logging, and distributed tracing tools
(e.g., Prometheus, Grafana, Elastic APM).
• Optimize CI/CD pipelines for seamless deployments.
Collaboration:
• Partner with software engineering teams to improve the reliability of applications and
infrastructure.
• Work closely with product and engineering teams to design scalable and robust systems.
• Ensure seamless integration of monitoring and alerting systems across teams.
Leadership & Team Building:
• Manage, mentor, and grow a team of SREs.
• Promote SRE best practices and foster a culture of reliability and performance across
the organization.
• Drive performance reviews, skills development, and career progression for team
members.
Capacity Planning & Cost Optimization:
• Perform capacity planning and implement autoscaling solutions to handle traffic
spikes.
• Optimize infrastructure and cloud costs while maintaining reliability and
performance.
Skills & Qualifications:
• Technical Expertise:
o Experience with cloud platforms (AWS / Azure / GCP) and Kubernetes.
o Hands-on knowledge of infrastructure-as-code tools like Terraform, Helm, or Ansible.
o Proficiency in Java.
o Expertise in distributed systems, databases, and load balancing.
• Monitoring & Observability:
o Proficient with tools like Prometheus, Grafana, Elastic APM, or New Relic.
o Understanding of metrics-driven approaches for system monitoring and alerting.
• Automation & CI/CD:
o Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).
o Skilled in automation frameworks and tools for infrastructure and application deployments.
• Incident Management:
o Proven track record in handling incidents, post-mortems, and implementing
solutions to prevent recurrence.
Leadership & Communication Skills:
• Strong people management and leadership skills with the ability to inspire and motivate teams.
• Excellent problem-solving and decision-making skills.
• Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.
Preferred Qualifications:
• Experience with database optimization, Kafka, or other messaging systems.
• Knowledge of autoscaling techniques.
• Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.
• Understanding of compliance and security best practices in distributed systems.
Why Join Us?
• Be a key driver in building and scaling reliable systems in a fast-paced environment.
• Work with cutting-edge technologies and influence the evolution of the infrastructure.
• Lead a high-impact team and foster a culture of reliability and innovation.