Coffeebeans.io - Site Reliability Engineer II

3 weeks ago


Hyderabad, India COFFEEBEANS CONSULTING LLP Full time

About the Job:


We're looking for a highly skilled and self-driven Site Reliability Engineer (SRE-2) to join our team in Hyderabad. This is a full-time, work-from-office role (5 days a week) perfect for someone with 8-12 years of experience who thrives on challenges and is passionate about building robust, scalable, and highly available systems.

You'll play a crucial role in ensuring the reliability, performance, and efficiency of our critical infrastructure and applications, with a particular focus on Kubernetes, DevOps, and observability. If you have hands-on experience with ML applications, GPU optimization, and Big Data systems, you'll be an ideal fit.

Key Responsibilities:


As a Site Reliability Engineer (SRE-2), you will:

- Design, deploy, and manage highly available and scalable Kubernetes clusters and robust DevOps pipelines.

- Troubleshoot and resolve complex infrastructure and application issues across various environments.

- Implement, maintain, and enhance comprehensive observability solutions, with a strong emphasis on Thanos and related monitoring and alerting tools.

- Provide expert support for machine learning (ML) workflows, leveraging tools like MLflow and Kubeflow.

- Optimize applications to maximize performance in GPU-accelerated environments.

- Work as an individual contributor on projects, and proactively learn and adopt new technologies to stay ahead of industry trends.

- Automate repetitive tasks and streamline operational processes using scripting and automation tools, including Python, Ansible, Groovy, and Shell scripting.

Qualifications:


To be successful in this role, you should have:

- Strong, hands-on experience with Kubernetes and a deep understanding of core DevOps principles and tools.

- Proven expertise in observability and monitoring solutions, with a strong preference for experience with Thanos.

- Demonstrable experience working with ML platforms and optimizing applications for GPU-based environments.

- CKS (Certified Kubernetes Security Specialist) certification is preferred.

- Experience with Big Data systems is a significant plus.

- Proficiency in multiple scripting and automation tools: Python, Ansible, Groovy, and Shell scripting.

- Hands-on experience with CI/CD tools such as Jenkins, Ansible, and ArgoCD.


(ref:hirist.tech)
