AI Data Platform Reliability

5 days ago


India Oracle Full time

Job Description

Job Summary: Oracle's AI Data Platform is accelerating enterprise AI and redefining how AI applications are built. The AI Data Platform team is seeking an experienced engineer to help drive AI platform reliability. This role is vital to ensuring our enterprise-scale, AI-powered data platform is robust, performant, and reliable. You will design and execute end-to-end scenario tests across distributed systems and partner with engineering and architecture teams to develop tooling that improves and maintains the platform. You will also embed operational excellence by applying modern SRE practices.

- Design, develop, and execute end-to-end (E2E) scenario validations that simulate real-world usage of complex AI data platform workflows (data ingestion, transformation, ML pipeline orchestration, etc.).
- Collaborate closely with product, engineering, and field teams to identify gaps in coverage and propose test automation strategies.
- Develop and maintain automated test frameworks supporting E2E, integration, performance, and regression testing for distributed data/AI services.
- Monitor system health across the stack (infrastructure, data pipelines, AI/ML workloads) and proactively detect failures or SLA breaches.
- Champion SRE best practices, including observability, incident management, blameless postmortems, and runbook automation.
- Analyze logs, traces, and metrics to identify reliability, latency, and scalability issues, and drive root cause analysis and corrective actions.
- Partner with engineering to drive high-availability, fault tolerance, and continuous delivery (CI/CD) improvements.
- Participate in on-call rotation to support critical services, ensuring rapid resolution and minimizing customer impact.

Desired Qualifications:

- Bachelor's or master's degree in Computer Science, Engineering, or a related field (or demonstrated equivalent experience).
- 3+ years of experience in software QA/validation, SRE, or DevOps roles, ideally in data platforms, cloud, or AI/ML environments.
- Proficient with DevOps automation and tools for continuous integration, deployment, and monitoring (e.g., Terraform, Jenkins, GitLab CI/CD, Prometheus).
- Working knowledge of distributed systems, data engineering pipelines, and cloud-native architectures (OCI, AWS, Azure, GCP, etc.).
- Strong proficiency in Java, Python, and related technologies.
- Hands-on experience with test automation frameworks (e.g., Selenium, pytest, JUnit) and scripting (Python, Bash, etc.).
- Familiarity with SRE practices: service-level objectives and agreements (SLOs/SLAs), incident response, and observability (Prometheus, Grafana, ELK, etc.).
- Strong troubleshooting and analytical skills with a passion for reliability engineering and process automation.
- Excellent communication and cross-team collaboration abilities.

Career Level - IC2



  • India Oracle Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Description Job Summary: Oracle's AI Data Platform is accelerating enterprise AI and redefining how AI applications are built. The AI Data Platform team is seeking an experienced engineer to help drive AI platform reliability. This role is vital to ensuring our enterprise-scale, AI-powered data platform is robust, performant, and reliable. You will develop...


  • India Oracle Full time

    Job Description Oracle's AI Data Platform is accelerating enterprise AI and redefining how AI applications are built. The AI Data Platform team is seeking an experienced engineer to help drive AI platform reliability. This role is vital to ensuring our enterprise-scale, AI-powered data platform is robust, performant, and reliable. You will develop and execute...


  • India Quantara AI Full time

    Data & AI Engineer – Cyber Risk Intelligence Platform – India Location: India (Remote) About Quantara AI & the Role Quantara AI is a next-generation Cyber Risk Intelligence and Governance platform that helps CISOs, Boards, and executive teams quantify, prioritize, and communicate cyber risk in business terms. Our AI-powered solution combines Cyber Risk...


  • AI Platform Engineer

    4 weeks ago


    Bengaluru, India NTT Data Full time

    Job Description NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now. We are currently seeking an AI Platform Engineer to join our team in Bangalore, Karnataka (IN-KA), India (IN). Job Duties: Exercise expertise in...


  • India Weekday AI Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    This role is for one of Weekday's clients. Min Experience: 4 years. Job Type: full-time. We are looking for an experienced and motivated Site Reliability Engineer (SRE) – Platform Engineering to join our growing technology team. In this role, you will be responsible for designing, building, and maintaining scalable, resilient, and secure infrastructure platforms...


  • India GTM Intelli Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    About GTM Intelli: At GTM Intelli, we're redefining how enterprises manage their Go-to-Market operations. Our platform leverages AI-powered agents to optimize sales, marketing, and customer success workflows at scale. We are building next-generation AI solutions that enable enterprise teams to execute smarter, faster, and with greater precision: GTMCommandCenter...


  • India BayOne Solutions Full time

    Job Description We are seeking a highly skilled AI Platform Engineer to design, build, and operate our next-generation AI application platform. In this role, you will work on advanced AI systems including Retrieval-Augmented Generation (RAG) pipelines, multi-model gateways, Model Context Protocol (MCP) tools, agentic workflow automations (e.g., n8n), and...

