
Senior Cloud Reliability Engineer
5 days ago
About this role
This is an exceptional opportunity to leverage your expertise in Site Reliability Engineering to drive scalability and reliability for our systems.
We are seeking a highly skilled professional with 8-12 years of experience in SRE, DevOps or a related field. As a key member of the team, you will be responsible for ensuring the smooth operation of our systems, identifying areas for improvement and implementing solutions to optimize their performance.
Key Responsibilities
- Design and implement reliable and scalable systems using Kubernetes, Docker, and Istio.
- Monitor system performance and respond to incidents as they arise, utilizing Datadog for observability.
- Develop automation scripts for deployment and monitoring.
- Leverage GitOps to ensure that software can be shipped to production reliably and smoothly.
- Collaborate with development teams to identify and resolve reliability issues.
- Conduct load testing to verify that systems can handle expected loads for new products and updates to existing products.
- Implement A/B deployments, canary deployments, and traffic mirroring strategies to ensure critical updates go smoothly and can be rolled back easily if necessary.
- Utilize Helm charts for application deployment and management.
- Understand AWS systems, including AWS Load Balancers, EKS and routing, to support systems handling millions of requests per hour.
Requirements
- 8+ years of experience in Site Reliability Engineering, DevOps, or a related field.
- Expertise with AWS.
- Expertise with Kubernetes, Docker, and Istio.
- Knowledge of monitoring and alerting tools such as Datadog, AppDynamics, ELK, Grafana, or Prometheus.
- Experience implementing and tuning Horizontal Pod Autoscalers (HPAs) to optimize resource utilization.
- Understanding of Argo CD for GitOps practices.
- Familiarity with A/B, canary, and blue/green deployments, and with traffic mirroring techniques.
- Understanding of infrastructure automation and orchestration tools such as Terraform, Ansible, or equivalent.
- Awareness of cost management in cloud environments and the ability to balance cost with performance and reliability.
- Advanced problem-solving, troubleshooting, and decision-making skills.
- Ability to lead a team effectively.
- Excellent verbal and written communication skills.
- Expertise in Golang or Rust.
-
Senior Cloud Reliability Engineer
3 days ago
Bengaluru, Karnataka, India beBeeCloudReliability Full time US$ 1,25,000 - US$ 1,75,000 Job Overview: We are seeking a seasoned professional to fill the role of Senior Cloud Reliability Engineer. This position will play a crucial part in our Shared Capabilities, Service Reliability and Operations group. About the Role: Be a Professional SRE: Implement site reliability engineering and DevOps best practices, ensuring seamless integration into the...
-
Senior Cloud Engineer
4 days ago
Bengaluru, Karnataka, India beBeeCloud Full time ₹ 1,04,000 - ₹ 1,30,878 Job Overview: The role of Senior Cloud Engineer is a key position in our organization. We are looking for an experienced and skilled individual to join our team as a Senior Cloud Engineer. The successful candidate will be responsible for designing, implementing, and maintaining cloud-based infrastructure and systems. They will also be responsible for ensuring the...
-
Chief Cloud Reliability Engineer
1 day ago
Bengaluru, Karnataka, India beBeeCloud Full time ₹ 1,00,00,000 - ₹ 1,60,00,000 About the Job: Our organization empowers employees to craft their own success stories. We challenge, listen, value and support them in their journey of growth. This is an ideal opportunity for experienced engineers who want to develop their expertise in cloud infrastructure and site reliability engineering. Key Responsibilities: Maintain the reliability,...
-
Site Reliability Engineer
13 hours ago
Bengaluru, Karnataka, India ZEN Cloud Systems Private Limited Full time US$ 90,000 - US$ 1,20,000 per year Job Title: Site Reliability Engineer (SRE) Duration: 12 months Location: Bangalore Timings: Full Time (As per company timings) Notice Period: (Immediate Joiner - Only) Experience: 7-8 Years Job Description: We are seeking a skilled and proactive engineer with expertise in Kubernetes, Java-based applications, and cloud platforms (AWS/Azure/GCP), along with...
-
Senior Site Reliability Engineer
4 weeks ago
Bengaluru, Karnataka, India Aerospike Full time Job Description: About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies....
-
Senior Site Reliability Engineer
15 hours ago
Bengaluru, Karnataka, India Aerospike Full time US$ 1,50,000 - US$ 2,00,000 per year About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies. Aerospike companies have...
-
Site Reliability Engineer
1 week ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high...
-
Site Reliability Engineer
4 days ago
Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...
-
Senior Site Reliability Engineer
6 days ago
Bengaluru, Karnataka, India SolarWinds Full time About the Role: As a Senior Staff Site Reliability Engineer (SRE) at SolarWinds, you will drive the reliability, scalability, and performance of our Observability Platform. This role focuses on managing SaaS infrastructure at scale, improving system reliability through cloud-native architecture, advanced data platform operations, and automation. You will...
-
Senior Site Reliability Engineer
13 hours ago
Bengaluru, Karnataka, India Josys Full time US$ 1,50,000 - US$ 2,00,000 per year Senior Site Reliability Engineer (SRE) About JOSYS: Josys, a dynamic B2B SaaS platform startup, has embarked on a mission to revolutionize IT operations globally, following an exceptional launch in Japan and securing $125 million in Series A and B funding. Our platform enables businesses to conquer the complexities of work-from-anywhere setups, rapid digital...