
Coffeebeans.io - Site Reliability Engineer II
3 weeks ago
About the Job:
We're looking for a highly skilled and self-driven Site Reliability Engineer (SRE-2) to join our team in Hyderabad. This is a full-time, work-from-office role (5 days a week), well suited to someone with 8-12 years of experience who thrives on challenges and is passionate about building robust, scalable, and highly available systems.
You'll play a crucial role in ensuring the reliability, performance, and efficiency of our critical infrastructure and applications, with a particular focus on Kubernetes, DevOps, and observability. If you have hands-on experience with ML applications, GPU optimization, and Big Data systems, you'll be an ideal fit.
Key Responsibilities:
As a Site Reliability Engineer (SRE-2), you will:
- Design, deploy, and manage highly available and scalable Kubernetes clusters and robust DevOps pipelines.
- Troubleshoot and resolve complex infrastructure and application issues across various environments.
- Implement, maintain, and enhance comprehensive observability solutions, with a strong emphasis on Thanos and related monitoring and alerting tools.
- Provide expert support for machine learning (ML) workflows, leveraging tools like MLflow and Kubeflow.
- Optimize applications to maximize performance in GPU-accelerated environments.
- Contribute individually to projects and proactively learn and adopt new technologies to stay ahead of industry trends.
- Automate repetitive tasks and streamline operational processes using scripting and automation tools, including Python, Ansible, Groovy, and shell scripting.
Qualifications:
To be successful in this role, you should have:
- Strong, hands-on experience with Kubernetes and a deep understanding of core DevOps principles and tools.
- Proven expertise in observability and monitoring solutions, with a strong preference for experience with Thanos.
- Demonstrable experience working with ML platforms and optimizing applications for GPU-based environments.
- Preferably, a CKS (Certified Kubernetes Security Specialist) certification.
- Experience with Big Data systems (a significant plus).
- Proficiency in multiple scripting and automation tools: Python, Ansible, Groovy, and shell scripting.
- Hands-on experience with CI/CD tools such as Jenkins, Ansible, and ArgoCD.
(ref:hirist.tech)