
Site Reliability Engineer Leader
1 day ago
Job Title: SRE Lead
As a Site Reliability Engineering (SRE) expert, you will lead the design, implementation, and maintenance of high-performing and reliable systems. You will oversee the scalability and performance of our critical services, ensuring they meet business requirements.
The ideal candidate has 6+ years of experience in software engineering and systems engineering. A strong background in cloud platforms (AWS / Azure / GCP), Kubernetes, and infrastructure-as-code tools like Terraform / Helm / Ansible is required.
Key Responsibilities:
- Reliability & Performance:
  - Ensure high availability and reliability of critical services.
  - Monitor and enforce Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs).
  - Identify and resolve performance bottlenecks and system inefficiencies.
- Incident Management & Response:
  - Establish incident management processes and on-call rotations.
  - Lead root cause analysis for high-priority outages.
  - Implement post-incident reviews to ensure actionable insights.
- Automation & Tooling:
  - Develop automated solutions to reduce manual operational tasks.
  - Enhance system observability through metrics, logging, and distributed tracing tools.
  - Optimize Continuous Integration/Continuous Deployment (CI/CD) pipelines.
- Collaboration:
  - Partner with software engineering teams to improve application and infrastructure reliability.
  - Work closely with product/engineering teams to design scalable and robust systems.
  - Integrate monitoring and alerting systems across teams.
- Leadership & Team Building:
  - Manage, mentor, and grow a team of SREs.
  - Promote SRE best practices and foster a culture of reliability and innovation.
  - Drive skills development and career progression for team members.
- Capacity Planning & Cost Optimization:
  - Perform capacity planning and implement autoscaling solutions.
  - Optimize infrastructure costs while maintaining reliability and performance.
Skills & Qualifications:
A successful candidate must have the following skills:
- Technical Expertise:
  - Proficiency in Java and expertise in distributed systems, databases, and load balancing.
- Monitoring & Observability:
  - Experience with tools like Prometheus, Grafana, Elastic APM, or New Relic.
- Automation & CI/CD:
  - Hands-on experience with CI/CD pipelines and automation frameworks.
- Incident Management:
  - Proven track record in handling incidents and implementing solutions.
Why this role?
This is an excellent opportunity to lead a high-impact team and foster a culture of reliability and innovation. You will work with cutting-edge technologies and influence the evolution of the infrastructure. Join us to drive the growth of our organization and create reliable systems.
-
Site Reliability Leader
3 days ago
Bengaluru, Karnataka, India beBeeReliabilityEngineer Full time ₹ 15,00,000 - ₹ 20,00,000
Senior Site Reliability Engineer Position
Synopsis: We seek a highly skilled Senior Site Reliability Engineer to spearhead our platform's reliability, scalability, and performance.
Job Description: This role is instrumental in ensuring the seamless operation of our infrastructure and applications. Key responsibilities include designing, implementing, and...
-
Site Reliability Engineer
3 days ago
Bengaluru, Karnataka, India Enterprise Minds, Inc Full time
We're Hiring | Site Reliability Engineer | 8-10 years
-
Site Reliability Engineer
4 weeks ago
Bengaluru, Karnataka, India myGwork - LGBTQ+ Business Community Full time
Job Description
This job is with eBay, an inclusive employer and a member of myGwork, the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.
At eBay, we're more than a global ecommerce leader - we're changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in...
-
Site Reliability Engineer
1 day ago
Bengaluru, Karnataka, India Randstad Full time
Role: Site Reliability Engineer
Summary: The Network Engineer 2 provides technical design, planning, operation, maintenance, and advanced troubleshooting of the Bread Financials' network infrastructure. This position ensures continuity and alignment of the network administration/engineering direction. This position supports Bread Financials' strategies and...
-
Site Reliability Engineer
5 days ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high...
-
Site Reliability Engineer
1 day ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time
We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
-
Site Reliability Engineer
7 days ago
Bengaluru, Karnataka, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Role Overview: As a Site Reliability Engineer, you will play a pivotal role in driving innovation and modernizing complex systems by leveraging cutting-edge technologies and collaborating with cross-functional teams.
-
Site Reliability Engineer
3 days ago
Bengaluru, Karnataka, India Coforge Full time
Job Description
- Design, implement, and maintain scalable infrastructure to ensure high availability and performance of software applications.
- Collaborate with development teams to identify and resolve issues affecting application performance, stability, and reliability.
- Develop automated monitoring scripts using tools like Prometheus, Grafana, etc. to...
-
Site Reliability Engineering
3 days ago
Bengaluru, Karnataka, India Infrasoft Technologies Limited Full time
Job Description
Job Title: Developer
Work Location: Bangalore, Karnataka
Experience Range: 6-8 Years
Job Description: We are looking for a skilled Developer with strong hands-on experience in Site Reliability Engineering (SRE), Java, JavaScript, and Production Support. The ideal candidate should have a solid background in application monitoring and troubleshooting...
-
Site Reliability Engineer
2 weeks ago
Bengaluru, Karnataka, India Collabera Full time
Job Description: As a Principal/Chief Site Reliability Engineer, you will play a critical role in designing, developing, and maintaining scalable and highly reliable systems. You'll work closely with development teams to improve system reliability, monitor critical applications, and design fail-proof infrastructure.
Responsibilities: Design and implement...