Site reliability engineer

4 weeks ago


Bengaluru, India Resource Algorithm Full time

Senior SRE (Engineering & Reliability)

Job Summary:
We are seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As a Senior SRE, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving automation, monitoring, and incident response strategies. This position combines software engineering and systems engineering expertise to build and maintain high-performing, reliable systems.

Experience: 7+ years

Key Responsibilities:

Reliability & Performance:
• Lead efforts to maintain high availability and reliability of critical services.
• Define and monitor SLIs, SLOs, and SLAs to ensure business requirements are met.
• Proactively identify and resolve performance bottlenecks and system inefficiencies.

Incident Management & Response:
• Establish and improve incident management processes and on-call rotations.
• Lead incident response and root cause analysis for high-priority outages.
• Drive post-incident reviews and ensure actionable insights are implemented.

Automation & Tooling:
• Develop and implement automated solutions to reduce manual operational tasks.
• Enhance system observability through metrics, logging, and distributed tracing tools (e.g., Prometheus, Grafana, Elastic APM).
• Optimize CI/CD pipelines for seamless deployments.

Collaboration:
• Partner with software engineering teams to improve the reliability of applications and infrastructure.
• Work closely with product and engineering teams to design scalable and robust systems.
• Ensure seamless integration of monitoring and alerting systems across teams.

Leadership & Team Building:
• Manage, mentor, and grow a team of SREs.
• Promote SRE best practices and foster a culture of reliability and performance across the organization.
• Drive performance reviews, skills development, and career progression for team members.

Capacity Planning & Cost Optimization:
• Perform capacity planning and implement autoscaling solutions to handle traffic spikes.
• Optimize infrastructure and cloud costs while maintaining reliability and performance.

Skills & Qualifications:

Required Skills:
• Technical Expertise:
  o Experience with cloud platforms (AWS / Azure / GCP) and Kubernetes.
  o Hands-on knowledge of infrastructure-as-code tools like Terraform / Helm / Ansible.
  o Proficiency in Java.
  o Expertise in distributed systems, databases, and load balancing.
• Monitoring & Observability:
  o Proficient with tools like Prometheus, Grafana, Elastic APM, or New Relic.
  o Understanding of metrics-driven approaches for system monitoring and alerting.
• Automation & CI/CD:
  o Hands-on experience with CI/CD pipelines (e.g., Jenkins, Azure Pipelines).
  o Skilled in automation frameworks and tools for infrastructure and application deployments.
• Incident Management:
  o Proven track record in handling incidents, post-mortems, and implementing solutions to prevent recurrence.

Leadership & Communication Skills:
• Strong people management and leadership skills with the ability to inspire and motivate teams.
• Excellent problem-solving and decision-making skills.
• Clear and concise communication, with the ability to translate technical concepts for non-technical stakeholders.

Preferred Qualifications:
• Experience with database optimization, Kafka, or other messaging systems.
• Knowledge of autoscaling techniques.
• Previous experience in an SRE, DevOps, or infrastructure engineering leadership role.
• Understanding of compliance and security best practices in distributed systems.



  • Bengaluru, Karnataka, India Thakral One Full time US$ 60,000 - US$ 1,20,000 per year

    Company Description: Thakral One, headquartered in Singapore, is a technology consulting and services company with a strong presence across Asia. The company specializes in technology-driven consulting, custom solution development, data analytics, and leveraging cloud capabilities to deliver enhanced decision support and practical outcomes. Collaborating...


  • Bengaluru, Karnataka, India FIS Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    About the Role: Site Reliability Engineer (SRE) with deep expertise in Mainframe technologies like COBOL, JCL, etc. to support and enhance our Card Management & Payment processing functions. This role will be responsible for ensuring reliability, high availability, scalability, stability and performance of mission-critical mainframe software applications and...


  • Bengaluru, India Whatjobs IN C2 Full time

    Site Reliability Engineer (SRE) Level 3 Overview: A Site Reliability Engineer (SRE) Level 3 is a senior technical leadership role focused on designing, implementing, and maintaining large-scale, complex, and highly reliable systems. This role emphasizes a blend of software and systems engineering to ensure the availability, latency, performance, and capacity...


  • Bengaluru, India Relanto Full time

    Job Description Job Title: Site Reliability Engineer Summary We are looking for a Site Reliability Engineer to join our Digital & Transformation department. The ideal candidate will have 2-3 years of experience in this field and will be responsible for ensuring the reliability, availability, and performance of our systems and applications. Roles And...


  • Bengaluru, Karnataka, India Viraaj HR Solutions Private Limited Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Site Reliability Engineer (SRE) About The Opportunity: A fast-growing organization in the Enterprise Cloud Infrastructure & SaaS sector delivering highly available, mission-critical services to enterprise customers. We are hiring an on-site Site Reliability Engineer in India to own reliability, automation, and operational excellence across cloud-native...


  • Bengaluru, India ViewSonic Full time

    Job Requirements: Bachelor's degree in Computer Science, Engineering, or a related field. 3+ years of experience in a relevant role, such as Site Reliability Engineer, DevOps Engineer, or similar, is preferred but not mandatory. Basic understanding of AWS solutions including EC2, S3, CloudWatch, Lambda, and RDS. Interest and understanding of Platform Engineering...


  • Bengaluru, India ViewSonic Full time

    Job Requirements: 1. Bachelor's degree in Computer Science, Engineering, or a related field. 2. 3+ years of experience in a relevant role, such as Site Reliability Engineer, DevOps Engineer, or similar, is preferred but not mandatory. 3. Basic understanding of AWS solutions including EC2, S3, CloudWatch, Lambda, and RDS. 4. Interest and understanding of...


  • Bengaluru, India IntraEdge Full time

    Job Title: Site Reliability Engineer (SRE) – Production Support. Location: Bengaluru. Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong experience in production support, DevOps practices, and cloud infrastructure management. The ideal candidate will be responsible for maintaining the reliability, performance, and scalability...


  • Bengaluru, India eBay Full time

    This job is with eBay, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly. At eBay, we're more than a global ecommerce leader - we're changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190...