Cloud System Debug Engineer

4 days ago


Bengaluru, Karnataka, India | Ola | Full time | ₹ 12,00,000 - ₹ 24,00,000 per year

Job Title: Cloud System Debug Engineer

Position Overview

We are seeking an experienced Cloud System Debug Engineer with deep expertise in cloud infrastructure, Kubernetes, OpenStack, Linux systems, and Ceph storage. This role focuses on diagnosing, analyzing, and resolving complex issues across large-scale cloud and distributed environments. You will work across multi-cloud, hybrid, and private cloud platforms to ensure high availability, performance, and reliability of mission-critical systems.

Key Responsibilities

  • Debug complex issues across large-scale public, private, and hybrid cloud environments.
  • Apply knowledge of microservices debugging and cloud-native application behavior.
  • Investigate failures in cloud infrastructure components such as networking, storage, virtualization, and orchestration layers.
  • Diagnose and resolve system issues in Kubernetes clusters, including nodes, pods, networking (CNI), and storage (CSI).
  • Troubleshoot problems with container runtimes such as Docker, containerd, and CRI-O.
  • Debug OpenStack components including Nova, Neutron, Cinder, Keystone, Glance, Horizon, and related APIs.
  • Debug and optimize Ceph storage clusters, including OSD issues, MON behavior, CRUSH map analysis, and performance bottlenecks.
  • Perform deep Linux system debugging, including kernel-level issues, network stack debugging, storage subsystem issues, and performance anomalies.
  • Conduct thorough Root Cause Analysis (RCA) and implement long-term corrective actions.
  • Improve system observability by enhancing monitoring, logging, and tracing using tools like Prometheus, Grafana, ELK/EFK, and Jaeger.
  • Develop and refine internal tools and automation for diagnostics, system debugging, and infrastructure monitoring.
  • Support production operations through an on-call rotation, addressing high-impact incidents quickly and effectively.
  • Optimize cloud and on-premise infrastructure for performance, scalability, and reliability.
  • Collaborate with DevOps, SRE, platform engineering, and development teams to resolve infrastructure and cloud platform issues.
  • Produce high-quality technical documentation, runbooks, and troubleshooting guides for system and cloud operations.

Required Skills & Qualifications

  • 4+ years of experience in cloud infrastructure, distributed systems, Linux administration, or systems engineering.
  • Good expertise with cloud platforms (AWS, GCP, Azure) or large-scale private cloud environments.
  • Strong proficiency with Kubernetes cluster debugging, scaling, and cloud-native architectures.
  • Hands-on experience with OpenStack cloud components and troubleshooting.
  • Good knowledge of Ceph distributed storage systems and cluster tuning.
  • In-depth understanding of Linux internals, including networking, kernel behavior, process management, and storage subsystems.
  • Strong scripting/automation experience (Bash, Python, Ansible, Terraform, Helm).
  • Experience analyzing system logs, traces, crashes, and performance metrics in distributed systems.
  • Proficiency with observability stacks such as Prometheus, Grafana, and OpenTelemetry.
  • Ability to debug complex interactions between cloud services, orchestration tools, and infrastructure layers.
  • Strong analytical, communication, and documentation skills.

Preferred Qualifications

  • Certifications in AWS/Azure/GCP, CKA/CKAD/CKS, OpenStack, or Ceph.
  • Experience with cloud networking (VXLAN, BGP, SDN, overlay networks).
  • Experience designing, analyzing, or operating high-availability, multi-region distributed architectures.

Education

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).

