Cloud Engineer for High-Availability Systems

7 days ago


Hyderabad, Telangana, India beBeeCloud Full time ₹ 1,50,00,000 - ₹ 2,25,00,000

Cloud Operations L2 Support Engineer

Job Overview:

This Cloud Engineer position is critical to ensuring the availability, reliability, and performance of our platform services and applications. The ideal candidate will have deep expertise in Kubernetes and cloud operations, and a passion for optimizing complex distributed systems.

Key Responsibilities:

  • Platform Reliability & Availability (SRE Focus):
    • Run the production environment by proactively monitoring availability and taking a holistic view of system health for our cloud-based platforms.
    • Improve the reliability and quality of the system through automation, process refinement, and best practices.
    • Measure and optimize system performance to ensure efficient resource utilization and optimal user experience.
    • Monitor critical applications and related services to ensure they remain available.
  • Cloud Operations & Kubernetes Management:
    • Design, deploy, and manage Kubernetes clusters and related cloud infrastructure for application deployments.
    • Implement and maintain containerization strategies and orchestration best practices for telecom workloads.
    • Manage and troubleshoot Robin storage solutions within the Kubernetes environment, supporting unique storage needs.
    • Implement and manage CI/CD pipelines for cloud-native applications.
    • Handle cloud resource provisioning, scaling, and cost optimization for deployed network functions.
  • Incident & Problem Management:
    • Collaborate on high-priority incident tickets, ensuring rapid system recovery for impacted services.
    • Provide immediate technical insights and support for cloud-native network functions.
    • Lead Problem Management efforts, including Root Cause Analysis, for complex incidents affecting cloud deployments.
    • Identify bugs and work with development teams to prioritize and implement fixes for cloud-native network elements.
  • Monitoring & Alerting:
    • Implement and maintain robust monitoring, logging, and alerting solutions for cloud infrastructure and applications.
    • Define and track Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for critical services running in the cloud.
  • Automation & Tooling:
    • Develop and implement automation scripts and tools to streamline operational tasks, deployments, and incident response.
    • Evaluate and integrate new tools and technologies to enhance operational efficiency.
  • Collaboration & Knowledge Sharing:
    • Support governance reports, providing technical data and insights on cloud platform performance.
    • Handle customer queries with technical expertise and provide timely resolutions related to cloud-deployed network services.
    • Provide training and mentorship to junior team members on cloud technologies and SRE practices.

Technical Requirements:

  • Deep expertise in Kubernetes:
    • Cluster deployment, management, and troubleshooting for high-performance telecom workloads.
    • Container orchestration, Pod lifecycle, Deployments, Services, Ingress.
    • Helm charts, Kustomize.
    • Advanced networking within Kubernetes.
    • Security best practices in Kubernetes.
  • Proficiency in Cloud Platforms: Experience with at least one major cloud provider.
  • Containerization Technologies: Docker, containerd.
  • Robin Storage: Hands-on experience with Robin.io or similar distributed persistent storage solutions.
  • Infrastructure as Code (IaC): Terraform, Ansible, or similar tools.
  • Scripting & Automation: Strong proficiency in Python, Go, Bash, or similar.
  • Monitoring & Logging Tools: Prometheus, Grafana, ELK Stack, Splunk, Datadog, or similar.
  • CI/CD Tools: Jenkins, GitLab CI/CD, Argo CD, or similar.
  • Operating Systems: Expert-level knowledge of Linux.
  • Networking Fundamentals: Deep understanding of TCP/IP, DNS, Load Balancing, Firewalls, VPNs, and advanced network concepts relevant to telecom.
  • Telecommunications Network Knowledge:
    • Strong understanding of Radio Access Network architecture, components, and interfaces.
    • Strong understanding of Core Network architecture, functions, and protocols.

Qualifications:

  • Education: Bachelor's degree in Computer Science, Engineering, or a related field.
  • Experience: 5-6 years of experience in a Cloud Engineering, DevOps, or SRE role.
  • Problem-Solving: Exceptional analytical and problem-solving skills.
  • Communication: Excellent verbal and written communication skills.
  • Proactive Mindset: Ability to anticipate issues and propose preventative solutions.
  • Incident Response: Proven experience in responding to and resolving critical production incidents.
  • Continuous Improvement: A strong desire to learn and adapt.


  • Hyderabad, Telangana, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000

    Job Overview: The role of Senior Site Reliability Engineer is pivotal in ensuring the stability, scalability, and performance of critical cloud and on-prem services that support millions of customers globally. This position involves overseeing incident management, driving automation efforts, and collaborating with cross-functional teams to align SRE strategy...


  • Hyderabad, Telangana, India beBeeSystems Full time ₹ 20,00,000 - ₹ 25,00,000

    Job Overview: This Senior Systems Engineer role requires expertise in cloud services, application architecture, and system reliability. The ideal candidate will oversee the management of high-severity incidents, drive automation of operational processes, and ensure systems can scale effectively to support growing user demand. Key Responsibilities:...


  • Hyderabad, Telangana, India beBeeCloudEngineer Full time ₹ 20,00,000 - ₹ 25,00,000

    Cloud Engineer Job Description: Are you a skilled cloud engineer looking for a challenging role? We are seeking an experienced engineer with expertise in Azure Kubernetes Service (AKS) and Terraform to join our team. The ideal candidate will be responsible for designing, implementing, and maintaining scalable, secure, and efficient cloud-based systems using...


  • Hyderabad, Telangana, India beBeeSiteReliabilityEngineer Full time ₹ 15,00,000 - ₹ 20,00,000

    Job Overview: We are seeking an experienced Site Reliability Engineer to ensure the availability, scalability and performance of critical systems and services. Key responsibilities include: • Designing, developing, and deploying reliable and scalable systems and services. • Collaborating with cross-functional teams to identify and prioritize technical...


  • Hyderabad, Telangana, India beBeeInfrastructure Full time ₹ 20,00,000 - ₹ 25,00,000

    Job Role: High Availability Infrastructure Specialist. Job Overview: We are seeking experienced professionals to fill multiple openings for data centre engineers, SQL/Mongo DBAs, and cloud engineers. As a key member of our team, you will play a crucial role in ensuring the smooth operation of our data centres. Responsibilities: Design, implement, and maintain...


  • Hyderabad, Telangana, India beBeeInfrastructure Full time ₹ 9,00,000 - ₹ 12,00,000

    Redis Infrastructure Engineer: We are seeking a skilled and proactive Redis Engineer to join our Data Movement team. The ideal candidate will have deep experience in installing, configuring, maintaining, and providing production support for Redis instances (both standalone and clustered setups). The successful candidate will be responsible for automating...


  • Hyderabad, Telangana, India beBeeExpertise Full time US$ 1,00,000 - US$ 1,50,000

    About Our Team: We are a group of passionate engineers dedicated to building and maintaining cutting-edge solutions that improve operations and elevate customer experiences. Our core values make us who we are. We believe success is never final, and our values remain the same as we grow: putting people first, pursuing excellence, embracing change, acting...

  • Cloud System Engineer

    2 weeks ago


    Hyderabad, Telangana, India Infor Full time US$ 80,000 - US$ 1,20,000 per year

    General information: Country: India; State: Telangana; City: Hyderabad; Job ID: 45728; Department: SaaS; Experience Level: MID_SENIOR_LEVEL; Employment Status: FULL_TIME; Workplace Type: Hybrid. Description & Requirements: The Cloud System Engineer Associate is responsible for delivering Cloud Operations services, including maintaining applications and technical environments, troubleshooting...


  • Hyderabad, Telangana, India beBeeCloud Full time US$ 1,25,000 - US$ 1,75,000

    Job Summary: A Cloud Infrastructure Architect. We are seeking a highly skilled and experienced Cloud Infrastructure Architect to join our team. The successful candidate will design, deploy, and manage cloud infrastructure solutions that meet the needs of our organization. The ideal candidate will have a deep understanding of cloud computing and be able to...


  • Hyderabad, Telangana, India beBeeHighAvailability Full time ₹ 8,00,000 - ₹ 12,00,000

    Job Title: High Availability Specialist. As a key member of our technical team, the High Availability Specialist will be responsible for ensuring the reliability and uptime of our systems. Key Responsibilities: Design and implement high availability solutions to minimize downtime and maximize system performance. Collaborate with cross-functional teams to identify...