Site Reliability Engineer

3 weeks ago


Bengaluru, Karnataka, India · Resource Algorithm · Full time

Senior SRE (Engineering & Reliability)

Job Summary:

We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As a Senior SRE, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Experience: 7+ years

Key Responsibilities:

Reliability & Performance:

• Lead efforts to maintain high availability and reliability of critical services.

• Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.

• Proactively identify and resolve performance bottlenecks and system inefficiencies.

Incident Management & Response:

• Establish and improve incident management processes and on-call rotations.

• Lead incident response and root cause analysis for high-priority outages.

• Drive post-incident reviews and ensure actionable insights are implemented.

Automation & Tooling:

• Develop and implement automated solutions to reduce manual operational tasks.

• Enhance system observability through metrics, logging, and distributed tracing tools (e.g., Prometheus, Grafana, Elastic APM).

• Optimize CI/CD pipelines for seamless deployments.

Collaboration:

• Partner with software engineering teams to improve the reliability of applications and infrastructure.

• Work closely with product and engineering teams to design scalable and robust systems.

• Ensure seamless integration of monitoring and alerting systems across teams.

Leadership & Team Building:

• Manage, mentor, and grow a team of SREs.

• Promote SRE best practices and foster a culture of reliability and performance across the organization.

• Drive performance reviews, skills development, and career progression for team members.

Capacity Planning & Cost Optimization:

• Perform capacity planning and implement autoscaling solutions to handle traffic spikes.

• Optimize infrastructure and cloud costs while maintaining reliability and performance.

Skills & Qualifications:

Required Skills:

• Technical Expertise:

o Experience with cloud platforms (AWS / Azure / GCP) and Kubernetes.

o Hands-on knowledge of infrastructure-as-code tools like Terraform, Helm, or Ansible.

o Proficiency in Java.

o Expertise in distributed systems, databases, and load balancing.

• Monitoring & Observability:

o Proficient with tools like Prometheus, Grafana, Elastic APM, or New Relic.

o Understanding of metrics-driven approaches for system monitoring and alerting.

• Automation & CI/CD:

o Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).

o Skilled in automation frameworks and tools for infrastructure and application deployments.

• Incident Management:

o Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence.

Leadership & Communication Skills:

• Strong people management and leadership skills with the ability to inspire and motivate teams.

• Excellent problem-solving and decision-making skills.

• Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.

Preferred Qualifications:

• Experience with database optimization, Kafka, or other messaging systems.

• Knowledge of autoscaling techniques.

• Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.

• Understanding of compliance and security best practices in distributed systems.


