Site Reliability Engineer

4 days ago


Noida, Uttar Pradesh, India beBeeReliable Full time
Site Reliability Engineer Position

We are seeking an experienced Site Reliability Engineer to join our team. As a key member of our infrastructure group, you will be responsible for ensuring the reliability and scalability of our systems.

Key Responsibilities:
  • Monitoring & Alerting: Design and implement monitoring and alerting systems using Datadog to proactively identify and address potential issues, ensuring optimal system performance.
  • CI/CD Pipelines: Participate in the design and implementation of CI/CD pipelines using Azure DevOps, enabling automated and reliable software delivery.
  • Incident Response: Lead efforts in incident response and troubleshooting to quickly diagnose and resolve production incidents, minimizing downtime and impact on users.
  • Reliability Initiatives: Take ownership of reliability initiatives by identifying areas for improvement, conducting root cause analysis, and implementing solutions to prevent recurrence of incidents.
  • Collaboration: Collaborate with cross-functional teams to ensure security, compliance, and performance standards are met throughout the development lifecycle.
  • On-call Support: Participate in on-call rotations and provide 24/7 support for critical incidents, ensuring rapid response and resolution.
  • SLOs & SLIs: Work with development teams to define and establish Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to measure and maintain system reliability.
  • Documentation: Contribute to the documentation of processes, procedures, and best practices to enhance knowledge sharing within the team.
Qualifications:
  • Education: Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
  • Experience: Minimum of 4 years of experience in a Site Reliability Engineer or similar role, managing cloud-based infrastructure on AWS with EKS.
  • AWS Expertise: Strong expertise in AWS services, especially EKS, including cluster provisioning, scaling, and management.
  • Monitoring & Observability: Proficiency in using monitoring and observability tools, with hands-on experience in Datadog or similar tools for tracking system performance and generating meaningful alerts.
  • CI/CD Experience: Experience in implementing CI/CD pipelines using Azure DevOps or similar tools to automate software deployment and testing.
  • Containerization & Orchestration: Solid understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes) and their role in modern application architectures.
  • Troubleshooting: Excellent troubleshooting skills and the ability to analyze complex issues, determine root causes, and implement effective solutions.
  • Scripting & Automation: Strong scripting and automation skills (e.g., Python, Bash).
  • Infrastructure as Code (IaC): Familiarity with IaC tools such as Terraform or CloudFormation.
  • Incident Management: Experience with incident management, post-incident analysis, and implementing improvements based on lessons learned.
  • Security & Compliance: Good understanding of security best practices and compliance standards in cloud environments.
  • Communication: Exceptional communication skills and the ability to collaborate effectively with cross-functional teams.
  • On-call Rotations: Willingness to participate in on-call rotations and provide off-hours support when necessary.
Preferred Qualifications:
  • Certifications: Relevant certifications such as AWS Certified DevOps Engineer, AWS Certified SRE, or Kubernetes certifications.
  • Cloud Platforms: Experience with other cloud platforms (e.g., Azure, Google Cloud Platform).
  • Microservices Architecture: Familiarity with microservices architecture and service mesh technologies.
  • Application Performance Tuning: Prior experience with application performance tuning and optimization.


  • Noida, Uttar Pradesh, India HCLTech Full time

    Job Title: Site Reliability Engineer (SRE) - LEAD. Department: COE. Job Summary: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our engineering team. As an SRE, you will be responsible for ensuring the reliability, availability, and performance of our systems and services. You will work closely with development and...


  • Noida, Uttar Pradesh, India CorroHealth Full time

    Hiring Alert: We are looking for a highly skilled Lead Site Reliability Engineer (SRE) for our Product Development team based in Noida. Only immediate joiners preferred. Job Description: We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a deep understanding of both software engineering and...


  • Noida, Uttar Pradesh, India Microsoft Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Do you want to work on a product that is used by millions of people around the world daily, and growing rapidly? Do you care deeply about how software is designed with a focus on supporting global scale? Do you want to be part of a world-class team that continuously pushes the boundary of service and engineering excellence? The Web Experience and Services...


  • Noida, Uttar Pradesh, India Celsior Full time

    This individual will play a crucial, client-facing role in Application Performance Monitoring (APM), User Experience Monitoring (UEM), and Site Reliability Engineering (SRE) solutions, translating client requirements into scalable and effective implementations. Valid Dynatrace certification is mandatory. Take complete charge of the Dynatrace Architecture,...