Senior Site Reliability Developer

10 hours ago


Bengaluru, Karnataka, India Oracle Full time US$ 1,25,000 - US$ 1,75,000 per year

Oracle Health & Analytics is a rapidly growing organization that leverages Oracle's cloud technologies to modernize and automate healthcare. Our mission is to improve the quality of life by delivering better, more secure experiences and easier access to health and research data for patients and providers. As a new line of business, we foster a creative, entrepreneurial environment unencumbered by legacy systems and value expertise that helps us create a world-class engineering center focused on excellence.

Required Qualifications

  • BS or MS in Computer Science or equivalent domain experience.
  • 4–6 years of relevant SRE or cloud engineering experience, operating independently on senior projects.
  • Experience deploying and managing large-scale, customer-facing web services in a public cloud infrastructure (e.g., OCI, AWS, Azure).
  • Expertise in automated deployment and configuration management tools (Terraform, Kubernetes, Ansible, etc.).
  • Hands-on experience with CI/CD for data workflows, DataOps orchestration, and automated data pipeline management.
  • Familiarity with observability tools and methodologies: monitoring, alerting, logging, and performance tuning.
  • Proficiency with scripting and programming languages (Python, Bash, etc.) for automation and system integration.
  • Track record of incident management, troubleshooting, and root cause analysis in distributed systems.
  • Strong written and verbal communication skills, with the ability to clearly present complex technical information to diverse audiences.
  • US citizenship and eligibility for federal security clearance (if applicable).

Preferred Qualifications

  • Knowledge of healthcare data management, compliance, and governance.
  • Experience with data migration, modernization, and control plane architecture.

As a Site Reliability Engineer, you will play a critical role in building and operating the control plane for Oracle Health's modern cloud-based SI platform, with an emphasis on Observability & Scaling. You will design, implement, and automate processes and systems that ensure mission-critical data workflows are secure, reliable, resilient, and highly available. This role presents an opportunity to solve complex problems involving large-scale distributed systems, data pipeline management, and automation, all in a highly collaborative, agile environment.

Key Responsibilities

  • Design, implement, and operate the control plane that ensures observability & scaling for data-centric services.
  • Lead efforts in automated data pipeline management, including CI/CD for data workflows, data migration, and modernization.
  • Develop and maintain robust monitoring, alerting, and observability tooling to ensure system performance, reliability, and rapid incident response.
  • Partner with development teams to implement improvements in service architecture, focusing on automation, self-healing, and real-time monitoring.
  • Build and operate DataOps automation and orchestration platforms, including onboarding & bootstrapping automation for new services and tenants.
  • Participate in incident management, troubleshooting, and root cause analysis for issues impacting data pipelines, access, or system availability.
  • Support data access control and governance by designing solutions that meet strict security and compliance requirements.
  • Define and improve KPIs, SLOs, and metrics for data platforms and services.
  • Contribute to technology strategy, including data modernization, automation frameworks, and integration of new technologies.
  • Collaborate in cross-functional teams and communicate complex technical concepts to stakeholders in clear, concise ways.




  • Bengaluru, Karnataka, India Josys Full time US$ 1,50,000 - US$ 2,00,000 per year

    Senior Site Reliability Engineer (SRE). About JOSYS: Josys, a dynamic B2B SaaS platform startup, has embarked on a mission to revolutionize IT operations globally, following an exceptional launch in Japan and securing $125 million in Series A and B funding. Our platform enables businesses to conquer the complexities of work-from-anywhere setups, rapid digital...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high...


  • Bengaluru, Karnataka, India beBeeReliabilityEngineer Full time ₹ 15,00,000 - ₹ 20,00,000

    Senior Site Reliability Engineer Position. Synopsis: We seek a highly skilled Senior Site Reliability Engineer to spearhead our platform's reliability, scalability, and performance. Job Description: This role is instrumental in ensuring the seamless operation of our infrastructure and applications. Key responsibilities include designing, implementing, and...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...


  • Bengaluru, Karnataka, India HireAlpha Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    We're Hiring | Senior Site Reliability Engineer (SRE) | Bangalore | Hybrid | Permanent Role. Are you ready to help shape the future of cloud contact centers? We're building scalable, reliable, and cutting-edge infrastructure for world-class customer experiences, and we're looking for a Senior SRE to join our team. What you'll do: Lead efforts in building a seamless ...


  • Bengaluru, Karnataka, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000

    Senior Site Reliability Engineer. We are seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our engineering organization, you will be responsible for ensuring the reliability and performance of our applications. The ideal candidate will have strong hands-on experience with Azure and Kubernetes (AKS preferred) and...


  • Bengaluru, Karnataka, India Aerospike Full time

    Job Description: About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies....


  • Bengaluru, Karnataka, India Aerospike Full time US$ 1,50,000 - US$ 2,00,000 per year

    About Aerospike: At Aerospike, we dream big. Our focus is helping companies tackle seemingly insurmountable problems and doing what's never been done before. That is why we developed the world's leading real-time data platform that powers mission-critical applications at the world's most innovative, category-disrupting companies. Aerospike companies have...


  • Bengaluru, Karnataka, India WhiteLotus Talent Partners Full time

    Our client is looking for an L0 and L1 Site Reliability Engineer (SRE) Support to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system...


  • Bengaluru, Karnataka, India Aerospike Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    About Aerospike Aerospike is the real-time database for mission-critical use cases and workloads, including machine learning, generative, and agentic AI. Aerospike powers millions of transactions per second with millisecond latency, at a fraction of the total cost of ownership compared to other databases. Global leaders, including Adobe, Airtel, Barclays,...