Site Reliability Engineer

2 months ago


Gurugram, India DotPe Full time

Role Summary:

We are seeking a dedicated and skilled Site Reliability Engineer (SRE) with a minimum of 2 years of hands-on experience. The ideal candidate will have a strong foundation in Linux, networking, AWS, and Kubernetes, along with expertise in monitoring and alerting tools.


As an SRE, you will play a critical role in ensuring the reliability, scalability, and performance of our systems and applications. You must be flexible to work in shifts, including nights and weekends, to support our 24/7 operations.


Key Responsibilities:

● System Administration: Maintain, monitor, and troubleshoot Linux-based servers and systems, ensuring their stability, performance, and security.

● Cloud Infrastructure: Manage and optimise AWS infrastructure, ensuring high availability, scalability, and cost-effectiveness.

● Kubernetes Management: Deploy, manage, and monitor containerized applications in Kubernetes clusters, ensuring efficient resource utilisation and uptime.

● Networking: Monitor and maintain network infrastructure, troubleshoot issues, and ensure secure, efficient data flow across systems.

● Monitoring and Alerting: Implement, configure, and maintain monitoring and alerting tools (e.g., Prometheus, Grafana, OpenSearch) to proactively identify and address system issues.

● Incident Response: Respond to system alerts, troubleshoot problems, and ensure timely resolution of incidents to minimise downtime.

● Automation: Develop and maintain scripts and tools for automation of routine tasks, improving efficiency.

● Documentation: Create and maintain detailed documentation for system configurations, procedures, and troubleshooting steps.

● Collaboration: Work closely with development, operations, and other teams to ensure seamless integration and support of new and existing systems.

● Continuous Improvement: Identify areas for improvement in system reliability, performance, and efficiency, and implement solutions.


Required Skills:

● Basic understanding of AWS services and infrastructure

● Strong proficiency in Linux administration

● Fundamentals of networking

● Experience with Kubernetes at an Associate level

● Expertise in monitoring and alerting tools such as Prometheus, Grafana, and Alertmanager

● Familiarity with incident management tools like Squadcast

● Proficiency in Bash or Python scripting


Qualifications:

● Minimum of 2 years of hands-on experience in a similar role

● Ability to work in shifts, including nights and weekends

● Strong problem-solving skills and attention to detail

● Excellent communication and collaboration abilities



  • Gurgaon/Gurugram/Bangalore, India Grizmo Labs Full time

    Job Title: Site Reliability Engineer. Grizmo Labs is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based services. Key Responsibilities: Design and implement scalable and highly available...


  • Gurugram, India Bijak Full time

    As a Site Reliability Engineer I at Bijak, you will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure. You will collaborate with cross-functional teams to support and monitor applications in Production. This role offers an exciting opportunity to contribute to a cutting-edge technology environment and drive...


  • Gurugram, India AMEX Full time

    You Lead the Way. We've Got Your Back. With the right backing, people and businesses have the power to progress in incredible ways. When you join Team Amex, you become part of a global and diverse community of colleagues with an unwavering commitment to back our customers, communities and each other. Here, you'll learn and grow as we help you create a career...

  • Senior Engineer

    3 weeks ago


    Gurugram, India Callisto Talent Solutions Private limited Full time

    AM - Site Reliability Engineer - F2F Interviews only. Our client is a leading global investment banking firm specializing in financial services such as advisory and capital raising, financing, investing, leasing, research, trading and hedging, banking, advice and intermediary services, and funds management. They are setting up a new SRE team based in Gurugram...


  • Gurugram, India RELX India (Pvt) Ltd Risk div Company Full time

    About the role: We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to manage and optimize our AWS cloud resources. The ideal candidate will have a strong background in AWS, Terraform, Kubernetes, and scripting, with proficiency in monitoring and CI/CD tools. Experience with HashiCorp Vault is a plus. Responsibilities: ...


  • Gurugram, India upGrad Full time

    Location: Gurugram, Kolkata, Hyderabad, Jaipur, Bangalore. Experience: 4+. Job Description: Job Specification for Site Reliability Engineer (SRE) Futures First. Skills and Experience Required: - 4+ years of experience in a Site Reliability Engineering (SRE) or DevOps role. - Strong expertise in managing high-performance, low-latency distributed systems,...


  • Gurugram, India Cvent Full time

    Site Reliability is about combining development and operations knowledge and skills to help make the organization better. Whether you have a development background and are interested in learning more about operations or are a DevOps/Systems Engineer who is interested in developing internal tools – Cvent SRE can benefit from your skillsets. Ultimately, we...


