Site Reliability Engineer

2 months ago


Gurgaon, India DotPe Full time
Role Summary: We are seeking a dedicated and skilled Site Reliability Engineer (SRE) with a minimum of 2 years of hands-on experience. The ideal candidate will have a strong foundation in Linux, networking, AWS, and Kubernetes, along with expertise in monitoring and alerting tools.

As an SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications. You must be willing to work in shifts, including nights and weekends, to support our 24/7 operations.

Key Responsibilities:
● System Administration: Maintain, monitor, and troubleshoot Linux-based servers and systems, ensuring their stability, performance, and security.
● Cloud Infrastructure: Manage and optimise AWS infrastructure, ensuring high availability, scalability, and cost-effectiveness.
● Kubernetes Management: Deploy, manage, and monitor containerized applications in Kubernetes clusters, ensuring efficient resource utilisation and uptime.
● Networking: Monitor and maintain network infrastructure, troubleshoot issues, and ensure secure, efficient data flow across systems.
● Monitoring and Alerting: Implement, configure, and maintain monitoring and alerting tools (e.g., Prometheus, Grafana, OpenSearch) to proactively identify and address system issues.
● Incident Response: Respond to system alerts, troubleshoot problems, and ensure timely resolution of incidents to minimise downtime.
● Automation: Develop and maintain scripts and tools to automate routine tasks and improve efficiency.
● Documentation: Create and maintain detailed documentation for system configurations, procedures, and troubleshooting steps.
● Collaboration: Work closely with development, operations, and other teams to ensure seamless integration and support of new and existing systems.
● Continuous Improvement: Identify areas for improvement in system reliability, performance, and efficiency, and implement solutions.

Required Skills:
● Basic understanding of AWS services and infrastructure
● Strong proficiency in Linux administration
● Fundamentals of networking
● Experience with Kubernetes at an Associate level
● Expertise in monitoring and alerting tools such as Prometheus, Grafana, and Alertmanager
● Familiarity with incident management tools like Squadcast
● Proficiency in Bash or Python scripting

Qualifications:
● Minimum of 2 years of hands-on experience in a similar role
● Ability to work in shifts, including nights and weekends
● Strong problem-solving skills and attention to detail
● Excellent communication and collaboration abilities

  • Gurgaon, Haryana, India upGrad Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team and play a pivotal role in the operation, support, and security of our Core production infrastructure. As a Site Reliability Engineer, you will be responsible for overseeing the monitoring of both internal and production environments,...


  • Gurgaon, Haryana, India Citadel Securities Full time

    Job Title: Site Reliability Engineer. Citadel Securities is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our distributed systems and applications. Responsibilities: Design and implement scalable and reliable systems and...


  • Gurgaon, India Bijak Full time

    As a Site Reliability Engineer I at Bijak, you will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure. You will collaborate with cross-functional teams to support and monitor applications in Production. This role offers an exciting opportunity to contribute to a cutting-edge technology environment and drive...


  • Gurgaon, Haryana, India Majid Al Futtaim Full time

    Job Title: Site Reliability Engineer. At Majid Al Futtaim, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a key role in ensuring the reliability and scalability of our cloud-based infrastructure. Key Responsibilities: Lead the implementation of critical platform enhancements. Mentor and...


  • Gurgaon/Gurugram/Bangalore, India Grizmo Labs Full time

    Job Title: Site Reliability Engineer. Grizmo Labs is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services. Key Responsibilities: Design and implement scalable and highly available...


  • Gurgaon, Haryana, India StatusNeo Full time

    About StatusNeo: We're a leader in digital transformation, leveraging cutting-edge technologies to empower organizations globally. Our commitment to continuous learning and improvement provides an unparalleled platform for professional growth. Job Description: We're seeking a talented Site Reliability Engineer to join our team. As a pioneer in Platform and...


  • Gurgaon, Haryana, India UnitedHealth Group Full time

    Optum Site Reliability Engineer. At Optum, we're dedicated to delivering care and improving health outcomes for millions of people worldwide. As a Site Reliability Engineer, you'll play a critical role in ensuring the smooth operation of our applications and infrastructure, leveraging your expertise in cloud computing, DevOps, and security to drive business...


  • Gurgaon, Haryana, India UnitedHealth Group Full time

    Cloud-Optimized Site Reliability Engineer Opportunity at UnitedHealth Group. We are seeking a skilled Cloud-Optimized Site Reliability Engineer to join our team at UnitedHealth Group. As a Cloud-Optimized Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based applications. Main...


  • Gurgaon, Haryana, India RELX India (Pvt) Ltd Risk div Company Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at RELX India (Pvt) Ltd Risk div Company. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining highly available and scalable container-based infrastructure using Docker and Kubernetes. Key Responsibilities: Design and...


  • Gurgaon, Haryana, India Callisto Talent Solutions Private limited Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Callisto Talent Solutions Private limited. The ideal candidate will have a strong background in reliability engineering, application performance monitoring, and experience with APM tools. Responsibilities: Design and implement reliable and performant systems within...


  • Gurgaon, Haryana, India UnitedHealth Group Full time

    Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion,...


  • Gurgaon, Haryana, India PayU Full time

    Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team in Gurgaon/Pune/Mumbai. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our systems, as well as providing exceptional support to our customers. Key Responsibilities: Design and implement scalable and reliable systems to...


  • Gurgaon, Haryana, India AMEX Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Global Risk and Compliance Technology team at American Express. As a key member of our team, you will be responsible for leading the development and implementation of a comprehensive SRE strategy aligned with our company's goals and objectives. Key...


  • Gurgaon, Haryana, India PayU Full time

    Job Title: Site Reliability Engineer. At PayU, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our systems and services, and for identifying and implementing improvements to our infrastructure and processes. Key Responsibilities: Provide...