Senior Site Reliability Developer

5 days ago


Bengaluru India Oracle Full time

Job Description

OCI is Oracle's next-generation cloud platform, built for the most demanding enterprise workloads. We deliver high-performance computing, storage, networking, and platform services at global scale. The AI Platform, Services & Solutions organization within OCI is building the foundation for enterprise AI, spanning GPU infrastructure, training pipelines, orchestration systems, and model deployment services.

As part of this mission, we are looking for a Senior Site Reliability Engineer (SRE) to join our team and take ownership of managing and evolving our OKE (Oracle Kubernetes Engine) infrastructure. This is a hands-on, high-impact role where you will be responsible for ensuring the reliability, scalability, and security of cloud-scale services that power AI workloads across Oracle Cloud.

Qualifications
- 4-10 years of experience in site reliability, DevOps, or systems engineering.
- Strong background in operating large-scale, distributed, and highly available systems.
- Proficient with Linux, Python, and shell scripting.
- Hands-on experience with Kubernetes (OKE, EKS, GKE, or similar) and Docker.
- Experience with Infrastructure as Code (Terraform, Ansible, etc.) on a major cloud provider.
- Knowledge of cloud networking, security, and routing (VPC, CIDR, security groups).
- Familiarity with observability tools (Prometheus, Elasticsearch, Fluentd, Grafana).
- Experience with CI/CD pipelines, git workflows, and agile development.
- Understanding of disaster recovery, redundancy, and operational uptime planning.
- Strong troubleshooting, problem-solving, and communication skills.
- BS/MS in Computer Science or equivalent experience.

Desired Attributes
- Resourceful and pragmatic in solving operational challenges.
- Strong focus on automating repetitive tasks and reducing toil.
- Committed to shared responsibility and improving the on-call experience.
- Detail-oriented with strong critical-thinking skills.
- Eager to learn and to mentor others in a collaborative environment.

Responsibilities
- Design, automate, and operate infrastructure resources in OCI (compute, storage, networking, load balancing).
- Manage large-scale OKE clusters and containerized workloads.
- Build automation for service provisioning, monitoring, and lifecycle management.
- Develop dashboards, alerts, runbooks, and tooling to improve observability and reliability.
- Troubleshoot and resolve complex production issues with a focus on resilience and uptime.
- Contribute to service authentication, authorization, and security best practices.
- Collaborate with software and ML engineers to deliver highly available AI infrastructure.
- Participate in on-call rotations and improve incident response processes.

Qualifications
Career Level - IC3

About Us

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [Confidential Information] or by calling +1 888 404 2494 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.



  • India Sapaad Full time

    WHO WE ARE Sapaad is a global leader in unified commerce platforms, delivering world-class software solutions for the food and beverage industry. Our flagship product, also named Sapaad, has achieved remarkable success over the past decade, empowering thousands of F&B businesses across 40+ countries—with many more coming onboard each day. Driven by a...


  • Bengaluru, India Truecaller Full time

    Job Description Hello, Truecaller is calling you from Bangalore, India! Ready to pick up? Our goal is to make communication smarter, safer, and more efficient, while building trust across the world. With our roots in Sweden and a global reach, we deliver smart services that create meaningful social impact. We are committed to protecting you from fraud,...


  • Bengaluru, India Groww Full time

    Job Description About Groww We are a passionate group of people focused on making financial services accessible to every Indian through a multi-product platform. Each day, we help millions of customers take charge of their financial journey. Customer obsession is in our DNA. Every product, every design, every algorithm down to the tiniest detail is...


  • Noida, India Thales Full time

    Job Description Location: Noida, India Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become...


  • Bengaluru, Karnataka, India Oracle Financial Services Software Ltd Full time ₹ 8,00,000 - ₹ 25,00,000 per year

    Senior Site Reliability Developer Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale...


  • India Akamai Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Would you enjoy improving stability and safety of one of the largest global networks? Would you enjoy hands-on network operations work on a global scale to improve our operational efficiency? Join the Platform Cloud Services Engineering Team. The Platform Cloud Services SRE team supports globally distributed hosting and database systems for Akamai. These systems...


  • India Sonata Software Full time

    We're Hiring: Senior Site Reliability Engineer Location: Onsite (Office: Hyderabad – Mandatory from Day 1) Employment Type: Full-time Notice Period: Immediate to 15 Days Only Experience: 8+ Years About the Role: We’re looking for a Senior Site Reliability Engineer (SRE) to lead reliability initiatives across our production systems. This is a high-impact...


  • India Akamai Technologies Full time

    Job Description Do you have the passion to architect and lead the next generation of public cloud infrastructure? Would you like to lead modernization initiatives while building a public cloud platform from scratch? Join our IaaS Site Reliability Engineering (SRE) team. We design, develop, and operate infrastructure and services that power...


  • Pune, India Barclays Full time

    Job Description Step into the role of Senior Site Reliability Engineer. At Barclays, we are more than a bank; we are a force for progress. You will be part of the central SRE (Site Reliability Engineer) core team within our wider Infrastructure team. You will act as a centre of excellence providing hands-on consultancy to our different infrastructure...


  • India Akamai Technologies Full time

    Job Description Do you like collaborating across teams to solve complex problems? Do you enjoy solving large-scale distributed systems problems? Join the Mapping SRE team. The Mapping SRE team manages availability, reliability, performance, and change processes for Akamai's mapping system. This system routes trillions of daily client...