Senior Cloud Reliability Engineer

5 days ago


Bengaluru, Karnataka, India beBeeSiteReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

About this role

This is an exceptional opportunity to leverage your expertise in Site Reliability Engineering to drive scalability and reliability for our systems.

We are seeking a highly skilled professional with 8-12 years of experience in SRE, DevOps or a related field. As a key member of the team, you will be responsible for ensuring the smooth operation of our systems, identifying areas for improvement and implementing solutions to optimize their performance.

Key Responsibilities

  • Design and implement reliable and scalable systems using Kubernetes, Docker, and Istio.
  • Monitor system performance and respond to incidents as they arise, utilizing Datadog for observability.
  • Develop automation scripts for deployment and monitoring.
  • Use GitOps to ensure that software ships to production reliably and smoothly.
  • Collaborate with development teams to identify and resolve reliability issues.
  • Conduct load testing to verify that systems can handle expected loads for new products and updates to existing products.
  • Implement A/B deployments, canary deployments, and traffic mirroring strategies to ensure critical updates go smoothly and can be rolled back easily if necessary.
  • Utilize Helm charts for application deployment and management.
  • Understand AWS systems, including AWS Load Balancers, EKS and routing, to support systems handling millions of requests per hour.

Requirements

  • 8+ years of experience in Site Reliability Engineering, DevOps, or a related field.
  • Expertise with AWS.
  • Expertise with Kubernetes, Docker, and Istio.
  • Knowledge of monitoring and alerting tools such as Datadog, AppDynamics, ELK, Grafana, or Prometheus.
  • Experience implementing and tuning Horizontal Pod Autoscalers (HPAs) to optimize resource utilization.
  • Understanding of Argo CD for GitOps practices.
  • Familiarity with A/B, canary, and blue/green deployments, and with traffic mirroring techniques.
  • Understanding of infrastructure-as-code and configuration management tools such as Terraform, Ansible, or equivalent.
  • Awareness of cost management in cloud environments and the ability to balance cost with performance and reliability.
  • Advanced problem-solving, troubleshooting, and decision-making skills.
  • Ability to lead a team effectively.
  • Excellent verbal and written communication skills.
  • Expertise in Golang or Rust.


  • Bengaluru, Karnataka, India beBeeCloudReliability Full time US$ 1,25,000 - US$ 1,75,000

    Job Overview: We are seeking a seasoned professional to fill the role of Senior Cloud Reliability Engineer. This position will play a crucial part in our Shared Capabilities, Service Reliability and Operations group. About the Role: Be a Professional SRE: Implement site reliability engineering and DevOps best practices, ensuring seamless integration into the...


  • Bengaluru, Karnataka, India beBeeCloud Full time ₹ 1,04,000 - ₹ 1,30,878

    Job Overview: The role of Senior Cloud Engineer is a key position in our organization. We are looking for an experienced and skilled individual to join our team as a Senior Cloud Engineer. The successful candidate will be responsible for designing, implementing, and maintaining cloud-based infrastructure and systems. They will also be responsible for ensuring the...


  • Bengaluru, Karnataka, India beBeeCloud Full time ₹ 1,00,00,000 - ₹ 1,60,00,000

    About the Job: Our organization empowers employees to craft their own success stories. We challenge, listen, value and support them in their journey of growth. This is an ideal opportunity for experienced engineers who want to develop their expertise in cloud infrastructure and site reliability engineering. Key Responsibilities: Maintain the reliability,...


  • Bengaluru, Karnataka, India ZEN Cloud Systems Private Limited Full time US$ 90,000 - US$ 1,20,000 per year

    Job Title: Site Reliability Engineer (SRE)
    Duration: 12 months
    Location: Bangalore
    Timings: Full Time (As per company timings)
    Notice Period: (Immediate Joiner - Only)
    Experience: 7-8 Years
    Job Description: We are seeking a skilled and proactive engineer with expertise in Kubernetes, Java-based applications, and cloud platforms (AWS/Azure/GCP), along with...


  • Bengaluru, Karnataka, India Aerospike Full time

    Job Description. About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies....


  • Bengaluru, Karnataka, India Aerospike Full time US$ 1,50,000 - US$ 2,00,000 per year

    About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies. Aerospike companies have...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...


  • Bengaluru, Karnataka, India SolarWinds Full time

    About the Role: As a Senior Staff Site Reliability Engineer (SRE) at SolarWinds, you will drive the reliability, scalability, and performance of our Observability Platform. This role focuses on managing SaaS infrastructure at scale, improving system reliability through cloud-native architecture, advanced data platform operations, and automation. You will...


  • Bengaluru, Karnataka, India Josys Full time US$ 1,50,000 - US$ 2,00,000 per year

    Senior Site Reliability Engineer (SRE). About JOSYS: Josys, a dynamic B2B SaaS platform startup, has embarked on a mission to revolutionize IT operations globally, following an exceptional launch in Japan and securing $125 million in Series A and B funding. Our platform enables businesses to conquer the complexities of work-from-anywhere setups, rapid digital...